A tool to make any musician's practice more productive.
Project by Jon Tsai (jt765) & Alexander Wood-Thomas (adw75)
The Raspberry Pi Music Assistant is a device that uses a Raspberry Pi to implement three useful musical practice tools: a metronome, a tuner, and a recorder, each created entirely in software. The RPi Music Assistant is controlled through its touchscreen interface, and it uses a microphone and speakers for audio input and output.
Our project can be divided into four major parts: menu display and navigation, the metronome, the tuner, and the recorder. In terms of hardware, Figure 1 shows the components used and connections among them for the device, and Figure 2 shows the actual physical setup of the device. We used a Raspberry Pi, a piTFT display, a USB microphone, and external speakers/headphones.
Figure 1. An initial diagram for planning the required hardware. Other than the Pi and PiTFT screen, the project only requires a USB microphone and speakers.
Figure 2. Physical setup of device.
The pygame library was used to create the various menu screens and the touchscreen buttons that let the user control the device. The initial designs for these screens are shown in Figure 3. A separate file handles the functionality of each screen, and all of them are invoked by an overall controlling file that manages navigation between menus. On each menu, the program checks for taps on the touchscreen and determines which button, if any, was pressed. If a button was pressed, the corresponding action is carried out. When the RPi Music Assistant boots up, it starts at the main menu, which contains touchscreen buttons for each of the musical tools and for quitting the application. The main menu screen on the device is shown in Figure 4.
Figure 3. Initial concept sketches for the different screens of the music assistant. Concept sketches helped guide GUI development.
Figure 4. Main menu screen. Contains buttons for each of the music tools.
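The per-screen button handling described above can be sketched roughly as follows. The button names, sizes, and positions here are illustrative, not taken from the project's code, and the real implementation uses pygame Rect objects and their collidepoint method rather than plain tuples:

```python
# Hypothetical button layout for a 320x240 piTFT: name -> (x, y, width, height).
BUTTONS = {
    "metronome": (40, 40, 240, 40),
    "tuner":     (40, 100, 240, 40),
    "recorder":  (40, 160, 240, 40),
}

def hit_test(pos):
    """Return the name of the button containing the touch position, if any."""
    px, py = pos
    for name, (x, y, w, h) in BUTTONS.items():
        if x <= px < x + w and y <= py < y + h:
            return name
    return None
```

On a tap event, the controlling loop would call `hit_test` with the touch coordinates and dispatch to the matching screen's handler.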
The metronome provides visual and audio cues at a constant pace, which helps a musician maintain a consistent tempo when practicing. Tempo is measured in beats per minute (bpm), which the user can adjust through onscreen buttons. The metronome menu contains three buttons: one for returning to the main menu, one for increasing the bpm, and one for decreasing it. Other visual elements include the current bpm setting, a blinking red circle, and a line that bounces left and right. The completed metronome tool is shown in Figure 5.
Figure 5. Metronome menu screen. The line bounces left and right and the red dot blinks according to the bpm rate.
The bpm is maintained as a global variable in the metronome file, so when either bpm button is pressed, this value is updated and redisplayed on the screen. We decided to change the bpm in increments of 5, both to reach a desired bpm more quickly and because most music uses a bpm that is a multiple of 5. To make the blinking circle and bouncing line follow the given bpm, we used the time library and set display behavior based on the bpm value. First, we determined how many seconds correspond to one beat based on the bpm. For the blinking red circle, we set it so that at the beginning of each beat, it appears on the display for 0.1 seconds and then disappears for the remainder of the beat. Most music stays well under 300 bpm, which corresponds to a beat duration of 0.2 seconds, so we did not worry about cases where the red circle would display for longer than the beat itself. Additionally, at the start of every beat, we play an audio file containing a simple percussive wood-block sound: at the same time as the red circle is displayed, a subprocess call to aplay plays the sound file, giving the audio cue for every beat.
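The beat timing described above can be sketched as a pure helper plus a loop; this is an illustration of the approach rather than the project's actual code, and the tick file name is hypothetical:

```python
import subprocess
import time

def beat_state(elapsed, bpm, flash_time=0.1):
    """Given elapsed seconds since start, return (beat_index, circle_visible):
    which beat we are on, and whether the red circle should be shown."""
    beat = 60.0 / bpm                       # seconds per beat
    return int(elapsed // beat), (elapsed % beat) < flash_time

def run_metronome(bpm=120, tick_file="tick.wav"):
    """Timing loop sketch: play the tick once at the start of each beat and
    expose the circle state for the display code."""
    start = time.time()
    last_beat = -1
    while True:
        beat_index, circle_visible = beat_state(time.time() - start, bpm)
        if beat_index != last_beat:         # start of a new beat
            last_beat = beat_index
            subprocess.Popen(["aplay", "-q", tick_file])   # audio cue
        # ... redraw the screen using circle_visible ...
```

Deriving both cues from one elapsed-time value keeps the audio and visual beats aligned.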
Displaying the line that bounces left and right was more challenging. The pendulum is drawn as a pygame line and redrawn at 60 frames per second to produce the animation. The end positions of the pendulum relative to the base point are defined according to equations 1 and 2.
We first defined the angle of the line as a sine function of time (equation 3). A sine function for the angle results in a smooth motion of the ticker.
We decided that the smooth motion was not as visually useful, and instead used a constant angular rate, which corresponds to a triangular wave function for the angle. A triangular wave is less intuitive to implement analytically than a sine, but we did so successfully with equation 4.
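A sketch of the constant-rate (triangular wave) angle and the resulting line endpoint follows; the swing amplitude, base point, and pendulum length are illustrative choices, not the values used in equations 1-4:

```python
import math

def triangle_angle(t, bpm, max_angle=math.pi / 6):
    """Triangular-wave pendulum angle: sweeps at a constant angular rate
    from -max_angle to +max_angle and back, reaching an extreme on each
    beat (one full cycle every two beats).  max_angle is illustrative."""
    beat = 60.0 / bpm
    phase = (t / beat) % 2.0               # 0..2 over one full cycle
    if phase < 1.0:                        # sweeping right
        return -max_angle + 2 * max_angle * phase
    else:                                  # sweeping back left
        return max_angle - 2 * max_angle * (phase - 1.0)

def pendulum_end(t, bpm, base=(160, 200), length=120):
    """Endpoint of the line relative to the base point (cf. equations 1-2)."""
    a = triangle_angle(t, bpm)
    return (base[0] + length * math.sin(a), base[1] - length * math.cos(a))
```

Redrawing the line from `pendulum_end` each frame at 60 fps produces the bouncing animation.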
Once we implemented the movement of the ticker, we noticed that the ticker and the blinking circle were not synced, even though they both followed the bpm correctly. This was because the ticker and circle used different time variables that were reset at different times; to fix the problem, we set both of them to use the same time counter. To test the metronome, we observed the ticker and circle to confirm that they follow the bpm and repeated this for different bpm values. Other than having to synchronize the two visual elements, we progressed through this portion of the project smoothly.
The tuner can play a specified note, or listen to the sound received through the microphone and indicate whether it is in tune. The tuner screen contains buttons to change the note to be played, to play the note, to listen for the user's sound, and to return to the main menu. The completed tuner screen is given in Figure 6, and Figure 7 shows the tuner while listening for input.
Figure 6. Tuner screen. Can play a selected note and indicate how in tune the note being received is.
Figure 7. Tuner listening screen. The symbol underneath the note in the top left indicates if the input is flat or sharp.
In order to play different notes, we store a list of notes and their frequencies in the tuner file and index into these lists to select the note to be played. For the listening functionality, we need to determine which note is being received in order to decide whether its intonation is flat or sharp. To do this, we set thresholds for each note so that if the received frequency fell within a certain range, we would know which note it was; each threshold is simply the average of the frequencies of two consecutive notes. To play a given note, we output a sine wave with frequency equal to the specified note's frequency.
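The note tables and threshold logic might be sketched as follows; the three-note list and the 1 Hz in-tune tolerance are illustrative assumptions, not the project's actual values:

```python
# Equal-tempered note frequencies (rounded); the actual note set may differ.
NOTES = ["G4", "A4", "B4"]
FREQS = [392.00, 440.00, 493.88]

# Threshold between consecutive notes: the average of their frequencies.
THRESHOLDS = [(FREQS[i] + FREQS[i + 1]) / 2 for i in range(len(FREQS) - 1)]

def classify(freq):
    """Return (note, status) where status is 'flat', 'sharp', or 'in tune'."""
    i = 0
    while i < len(THRESHOLDS) and freq > THRESHOLDS[i]:
        i += 1                      # walk the boundaries to the nearest note
    diff = freq - FREQS[i]
    if abs(diff) < 1.0:             # in-tune tolerance in Hz (illustrative)
        return NOTES[i], "in tune"
    return NOTES[i], ("sharp" if diff > 0 else "flat")
```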
It took multiple attempts with different approaches to create a clean-sounding tone. Initial attempts used the PyAudio library to play a fixed frequency by feeding sine data in chunks to an output stream. This resulted in extremely choppy sound that did not produce any recognizable frequency, and continued research and attempts at implementing example code were fruitless for the first couple of days. Our final approach used pygame's audio functionality. The Note class extends a pygame Sound object, which already has functions to start and stop playing sound. On instantiation, the Sound object is created with a buffer of sound data for one period of the tone. Once the Sound object is created, we can start and stop playback on command with built-in pygame functions.
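Building a one-period buffer can be sketched with the standard library; the sample rate, amplitude, and helper name below are illustrative, and the pygame usage shown in the comments mirrors the approach described above:

```python
import array
import math

def sine_period_buffer(freq, sample_rate=44100, amplitude=0.5):
    """One period of a sine tone as signed 16-bit samples.  Looping this
    buffer produces a continuous tone at approximately freq Hz (the period
    length is rounded to a whole number of samples)."""
    n = int(round(sample_rate / freq))      # samples per period
    peak = int(amplitude * 32767)
    return array.array(
        "h", (int(peak * math.sin(2 * math.pi * i / n)) for i in range(n))
    )

# With pygame (as in the Note class described above), the buffer can be
# wrapped in a Sound object whose playback loops indefinitely:
#
#   pygame.mixer.pre_init(44100, -16, 1)
#   pygame.init()
#   tone = pygame.mixer.Sound(buffer=sine_period_buffer(440.0))
#   tone.play(loops=-1)     # start; tone.stop() to stop
```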
To use the tuner to detect frequency, it has to be placed in "listening mode." Listening mode continually pulls from an 8000-sample-per-second PyAudio stream in chunks of 4096 samples, effectively analyzing roughly half-second windows of sound at a time. It first applies a numpy Hanning window to smooth the data, then performs a Fast Fourier Transform (FFT) on it (also with numpy). It considers only the lower half of the computed frequency amplitudes (for a real signal, the upper half of the FFT mirrors the lower half), finds the maximum amplitude, and returns the corresponding frequency bin.
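The windowing and FFT peak search can be sketched as follows, assuming the rates described above (the function name is ours, not the project's):

```python
import numpy as np

RATE = 8000      # samples per second
CHUNK = 4096     # samples per analysis window (~0.5 s)

def peak_frequency(samples, rate=RATE):
    """Return the frequency (Hz) of the strongest bin, after applying a
    Hanning window and keeping only the lower half of the spectrum."""
    windowed = samples * np.hanning(len(samples))
    spectrum = np.abs(np.fft.fft(windowed))
    half = spectrum[: len(spectrum) // 2]   # mirror half discarded
    peak_bin = int(np.argmax(half))
    return peak_bin * rate / len(samples)   # bin index -> Hz
```

With a 4096-sample window at 8000 samples/second, each bin is about 1.95 Hz wide, which is consistent with the roughly 1 Hz accuracy reported for pure tones.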
The recorder consists of two parts: displaying the list of recordings, and making a new recording. The recording list screen displays recordings stored on the device three at a time and contains buttons for scrolling between pages of recordings, moving to a new screen to make a recording, and returning to the main menu. Additionally, when one of the recordings is touched, the recording will play, and three additional buttons for deleting, stopping, and pausing the recording will show. The completed recording list screen is shown in Figure 8, with the playback buttons being shown in Figure 9.
Figure 8. Recording list screen. Shows all recordings on the device and can play a selected one.
Figure 9. Recording list playback screen. Can delete, stop, or pause the file that is playing.
The recordings are stored in a local directory on the device and are shown three at a time on the screen. After obtaining the list of files in the directory through the function "os.listdir()", we used a global variable to keep track of which page of recordings to display. For the three recordings to be shown, we created a pygame screen button for each and flipped them onto the display. Then, if a recording's button was tapped by the user, we would call a function to create and display the delete, stop, and pause buttons, and then play the file.
The recordings are all in .wav format, so we open each file with "wave.open()" and use the PyAudio library to handle playback. This works by opening a stream based on the channels, frame rate, chunk size, and format of the opened file, and then reading the file's data chunk by chunk. To produce the sound, the data is continuously written to the stream until the file has finished playing. To control playback, we also check for button presses while writing data to the stream. If "Stop" is pressed, we stop writing to the stream; if "Delete" is pressed, we stop writing and then delete the file; and if "Pause" is pressed, we enter a while loop that suspends writing. While paused, the pause button changes to a "Play" button, and pressing it exits the loop and resumes writing; the "Stop" and "Delete" buttons can still be pressed while the stream is paused. Once writing to the stream has finished, we redisplay the recording list and remove the extra playback buttons.
To test the recording list screen, we created several recordings on the device and also added a song to test audio quality. When we first completed this part, playback was so choppy that even the tempo was inaccurate; the test song almost sounded slowed down. Upon examining our code, we realized that we had not set the chunk size for the playback stream, which caused the poor audio quality. Setting it to 8192 improved playback significantly, although it was still not perfectly seamless. This, however, introduced a new problem: the ends of some recordings were cut off during playback. This too was related to the chunk size. The chunk size is the number of frames in each chunk that the audio signal is split into, and the ends of files were cut off when the remaining frames could not fill a complete chunk. So, we set the chunk size to the minimum of 8192 and 10% of the total number of frames in the file: for small files this ensures the entire file is played, and for large files the chunk size is not set too large. Fixing these problems significantly improved the audio playback quality of our recorder.
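The chunk-size rule and playback loop can be sketched as below; the helper name is ours, and the PyAudio portion is only an outline in comments since the stream setup details are not given in the text:

```python
import wave

def playback_chunk_size(n_frames, cap=8192):
    """Chunk size for playback: at most cap frames, but no more than 10%
    of the file, so the ends of short files are not cut off."""
    return min(cap, max(1, n_frames // 10))

# Playback loop sketch with a PyAudio output stream:
#
#   wf = wave.open(path, "rb")
#   chunk = playback_chunk_size(wf.getnframes())
#   data = wf.readframes(chunk)
#   while data:
#       stream.write(data)          # also poll Stop/Pause/Delete buttons here
#       data = wf.readframes(chunk)
```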
The recorder screen is shown in Figure 10. When making a new recording, there are only two buttons: one for starting the recording and one for returning to the recording list menu. Once the "Start" button is pressed, audio received from the microphone begins to be recorded, and the screen shows a "Stop" button, the duration of the recording, and a blinking red dot, as shown in Figure 11. Once the "Stop" button is pressed, recording stops, and a file is written and saved to the directory containing all of the recordings. The file is named with the date and time the recording was made.
The PyAudio library was used again to record the audio. We create an input stream at a fixed sample rate (8000 samples/second), which saves data to an array in memory in chunks of 1024 samples. Once recording ends, the Python wave library saves the data to a WAV file as mono-channel audio at the same sample rate it was recorded with.
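Saving the captured chunks with the wave library might look like this sketch; the helper name is ours, and the PyAudio capture loop is outlined in comments since the stream setup is not fully specified in the text:

```python
import wave

RATE = 8000      # input sample rate (samples/second)
CHUNK = 1024     # frames read from the stream per iteration

def save_recording(path, frames, rate=RATE):
    """Write recorded chunks (bytes of 16-bit samples) to a mono WAV file."""
    wf = wave.open(path, "wb")
    wf.setnchannels(1)       # mono
    wf.setsampwidth(2)       # 16-bit samples (PyAudio's paInt16 format)
    wf.setframerate(rate)
    wf.writeframes(b"".join(frames))
    wf.close()

# Capture loop sketch with PyAudio:
#
#   stream = pa.open(format=pyaudio.paInt16, channels=1, rate=RATE,
#                    input=True, frames_per_buffer=CHUNK)
#   frames = []
#   while recording:                     # until "Stop" is pressed
#       frames.append(stream.read(CHUNK))
```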
Playback in the recording list works conversely: the wave library opens a WAV file and loads its data into a buffer, which is then written to a PyAudio output stream in chunks until no data remains or the user presses a button.
Figure 10. Recorder screen. Allows user to make a recording that is then saved to the device.
Figure 11. Recorder screen while making a new recording.
We were able to successfully implement all of the originally intended functionality of the music assistant, including the metronome, tuner, and recording device. In particular, the metronome and certain aspects of the tuner worked exceptionally well. The metronome provided an easily adjustable, accurate tempo for long periods of time, with clear audio and visual feedback from the simulated pendulum, blinking dot, and crisp wooden tick sound. The tuner accurately produced fixed-frequency notes with no distortion or noise, and it could detect the peak frequency of a computer-generated pure tone to within 1 Hz.
This high accuracy in detecting frequencies was a great result, but it only held in ideal scenarios. The microphone we used had both low sensitivity and high noise, so tones had to be played loudly and very close to the microphone. It was also difficult to pinpoint the frequency of a sung note because of the lower volume and the overtones of the human voice.
The recording list display also worked well. It clearly displayed all previous recordings in a simple page-based format, and these recordings could be easily played, paused, stopped, or deleted. The main issue with the recorder was in making the recordings: just as with the tuner, the low-quality microphone made recordings quiet and noisy. Keeping track of recordings could also be difficult: they were automatically named with the system date and time, to give unique names and order the files. This worked well in general, but caused problems when the RPi was disconnected from WiFi, because the Pi lacks an onboard real-time clock.
For our project, we were able to successfully make the device that we had designed and include all of the functionality that we intended for it. The metronome can provide audio and visual cues to help the user maintain a given tempo, and the tuner can tell the user how accurate their intonation is. Although the audio input for the tuner does seem to pick up a lot of background noise, the tuner has very high accuracy when determining the frequency of the input. The recorder tool allows the user to make new recordings and also listen to any of them that are stored on the device. The quality of the audio playback of the recordings is good, although it is not completely smooth. Overall, despite some features not working as perfectly as we would have liked, the result of our project is a useful and helpful tool to assist musicians in their practice.
If we had more time to work on the project, we would focus on expanding the functionality of each music tool to improve the value of the RPi Music Assistant.

For the metronome, we could add a time signature setting that dictates what type of note corresponds to a beat and how many beats there are per measure. We would need two different sound files: one to play on the first beat of each measure, and one for the remaining beats. This feature would be especially useful for novice musicians who are still getting used to performing in different time signatures.

The tuner currently only indicates whether the received note is sharp or flat, without giving a good sense of how far out of tune it is. There is a logarithmic unit called the cent that describes pitch differences on a linear scale; it is defined such that two adjacent semitones are 100 cents apart, so it would be a useful measure of how in tune a note is. It would also be helpful to have a visual indicator, similar to the metronome's pendulum, that shifts further left or right depending on how flat or sharp the received note is.

Finally, for the recorder tool, we could provide sample background music files for the user to practice with. This would be especially useful in genres like jazz where improvisation is prominent: with background music, a user could freely practice whatever melodies they think of and hear how they would fit in with an actual band. We would also adjust the recorder so that one of these background files could play while a recording is being made, giving the user a much better idea of how their own playing fits with others. These are just a few ideas for extending the project, and we think a device like this has the potential to encompass even more features.
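For reference, the cent offset described above is straightforward to compute; a minimal sketch:

```python
import math

def cents(freq, target):
    """Offset of freq from target in cents: 100 cents per equal-tempered
    semitone, 1200 per octave (positive = sharp, negative = flat)."""
    return 1200.0 * math.log2(freq / target)
```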
Wrote audio input/output code including: metronome tick, tone generation and frequency detection, and recording and playback. Animated the metronome. In the report, described the design and implementation of these elements, in addition to setting up the website and writing the results section, parts list, references, and code appendix.
Designed GUI, user interface, and feature list. Created general code architecture and created the recording list. Provided all relevant musical knowledge. In the report, wrote the introduction, parts of design and testing, conclusion, and future work.