Activity 2: Audio Recognition
This is the second activity in the AI with Finch series and assumes you have completed Activity 1: Image Recognition before beginning this lesson. An audio recognition model works similarly to an image recognition model, but it is trained with audio clips instead of images. For a review of how machine learning models work, see Activity 1: Image Recognition. This activity uses the Snap! programming language. If you are not familiar with Snap!, we recommend you complete our Finch with Snap! tutorials before starting this activity.
*You will need access to a computer with a microphone to complete this activity.
In this activity, you will train your first Audio Recognition Model using Google’s Teachable Machine. To learn more about Google’s Teachable Machine and how prediction models work, see Activity 1: Image Recognition.
Find a quiet room to complete this activity. Because your audio recognition model will make its predictions using sound, too much background noise will impact its accuracy.
Open Google Teachable Machine in a new tab or window. This tab will need to stay open and visible on your screen while you work. This button will take you directly to start a new Audio Project.
For background information about AI terms like “class”, visit the first activity in this series, Image Recognition (Step 3).
This is very similar to the process that you followed to create your Image Recognition model. For this Audio Recognition model, you will need three different classes, including a specific class for background sound.
Create three different classes in your audio recognition model. Name them Background Noise, Stop, and Go.
Next, we will collect audio samples to train the model. Your model will use these samples to learn how to predict which class it hears.
Select the Background Noise class. Before recording your Background Noise audio samples, make sure that the room is relatively quiet. When you’re ready, click the Mic button, select the correct microphone if necessary, and click Record 20 Seconds. Google Teachable Machine will record 20 seconds of audio.
Next, add audio samples for your Go class. Click the Mic button and, when you are ready to record, click Record 1 Seconds. Say the word “Go” loudly and clearly.
After recording, you can use the play button to hear your recording. Listen and be sure that the sound is clear. If you are happy with your recorded sound, click Extract Sample. The new audio sample will show up in the Audio Samples column on the right.
Repeat this step at least 8 times for the Go class. Then do the same for the Stop class, saying the word “Stop” instead.
Once you have collected enough data, click Train Model. Training the model will take a few minutes. It may take longer depending on your device and internet speed.
WARNING
Your browser may pop up a warning that the window is unresponsive – ignore this and allow the machine to keep working. The tab MUST stay open and visible while the model is training.
You may need to close other tabs on your computer.
When your model is finished training, your screen will say Model Trained, and a Preview panel will appear on the right side of the screen. Now, you can test your model.
Say the words “Stop” and “Go”, and watch how your model reacts. It should recognize each word in the Preview panel. When you are not speaking, it should classify what it hears as Background Noise. If it doesn’t, you may need to add more audio samples for each class and train again.
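The Preview panel shows one confidence value per class, and the model's prediction is simply the class with the highest value. Here is a minimal Python sketch of that idea, assuming the values arrive as a dictionary. The `top_class` helper and the sample numbers are made up for illustration and are not part of Teachable Machine or Snap!.

```python
def top_class(probabilities):
    """Return (label, probability) for the most likely class."""
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]


# Hypothetical readings, similar to the bars in the Preview panel.
sample = {"Background Noise": 0.03, "Stop": 0.05, "Go": 0.92}
label, confidence = top_class(sample)
print(f"{label}: {confidence:.0%}")
```

If the top class is often wrong when you test, that is a sign the class needs more or clearer audio samples.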
After your model has uploaded, you will see a shareable link highlighted in blue. You will need this link in the following steps to create a Snap! project with your model and complete this activity. Copy the shareable link. After you copy the link, you can close the export window. (You can always open the window and access the link again later.)
Open snap.birdbraintechnologies.com in a new tab. Connect your Finch Robot to Snap!. Click here for instructions on how to connect your Finch Robot to Snap!.
Using the audio classifier in Snap! is very similar to using the image classifier in Activity 1.
If you are using snap.birdbraintechnologies.com, download and import this project into Snap!:
Snap! Audio Recognition (Browser)
If you are using the BlueBird Connector instead of snap.birdbraintechnologies.com, open the project in Snap! with the following button and save a copy for yourself.
Once your URL is pasted into the variable space in the Set block, click the block to run it. The blocks will appear to glow when they are clicked. You will only need to run this script once.
Watch the screen. If the model loaded correctly, you will see a message flash that reads “Model loaded successfully”. The URL will also update in the Snap! stage on the right. If you do not see this message, check that you pasted the URL correctly and try clicking the stack again.
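If the model does not load, a mistyped or partially copied link is a common cause. The Python sketch below does a rough sanity check on a pasted link. It assumes shareable links look like `https://teachablemachine.withgoogle.com/models/<id>/`; that shape, the `looks_like_model_url` helper, and the `EXAMPLE_ID` placeholder are assumptions for illustration, not something Snap! or Teachable Machine provides.

```python
from urllib.parse import urlparse


def looks_like_model_url(url):
    """Rough check that a pasted link has the expected shape.

    Assumption: shareable links use https, the
    teachablemachine.withgoogle.com host, and a /models/<id>/ path.
    """
    parts = urlparse(url.strip())
    segments = [s for s in parts.path.split("/") if s]
    return (
        parts.scheme == "https"
        and parts.netloc == "teachablemachine.withgoogle.com"
        and len(segments) == 2
        and segments[0] == "models"
    )


# "EXAMPLE_ID" is a placeholder, not a real model.
print(looks_like_model_url("https://teachablemachine.withgoogle.com/models/EXAMPLE_ID/"))
print(looks_like_model_url("teachablemachine.withgoogle.com/models/"))
```

The check only looks at the shape of the link. It cannot tell whether the model behind it actually exists.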
Now that you know the probability that corresponds to each class, you can use those values to make the Finch perform a specific action when each class is detected.
When the probability that the model recognizes the audio as “Go” is greater than 90%, the Finch will move forward at 20% speed. When the probability that the model recognizes the audio as “Stop” is greater than 90%, the Finch’s wheels will stop moving.
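The same decision rule can be written out as a short Python sketch. This is not the Snap! project itself, only an illustration of the logic. The `choose_wheel_speeds` helper is hypothetical, and keeping the previous wheel speeds when neither command passes the threshold is an assumption about how the program behaves.

```python
THRESHOLD = 0.90   # act only when the model is more than 90% sure
GO_SPEED = 20      # forward speed, as a percentage


def choose_wheel_speeds(probabilities, current=(0, 0)):
    """Return (left, right) wheel speeds for one set of predictions."""
    if probabilities.get("Go", 0.0) > THRESHOLD:
        return (GO_SPEED, GO_SPEED)
    if probabilities.get("Stop", 0.0) > THRESHOLD:
        return (0, 0)
    # Assumption: if neither command is confident enough, keep the
    # wheels doing whatever they were already doing.
    return current


# Hypothetical sequence of predictions: "Go", then silence, then "Stop".
speeds = (0, 0)
for reading in [{"Go": 0.95}, {"Background Noise": 0.99}, {"Stop": 0.97}]:
    speeds = choose_wheel_speeds(reading, speeds)
    print(speeds)
```

A high threshold such as 90% makes the Finch less likely to react to sounds the model is unsure about, at the cost of sometimes missing a quietly spoken command.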
Now that your model can predict what command it is hearing, we can use that data to control the Finch Robot’s actions. How will you have your Finch Robot respond to your different classes? Use the model that you just created as your beginning code – then modify and create!
Try these challenges:
- Modify the program to add new commands for turning left and right. (You will need to retrain your model to add new classes.)
- Modify the program to make the Finch move at different speeds according to the volume or speed of your voice.
- Modify the program to light up or make sounds as it moves.
- Combine movement, sound, and lights to create a different reaction for each command.
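As one possible starting point for the turning challenge, the Python sketch below stores each command's wheel speeds in a table, so adding a command means adding one row. The class names "Left" and "Right", the speeds, and the `command_for` helper are all hypothetical; your retrained model and Snap! program may use different names and values.

```python
# Wheel speeds (left, right) as percentages for each command class.
# "Left" and "Right" are hypothetical class names for a retrained model.
COMMANDS = {
    "Go": (20, 20),
    "Stop": (0, 0),
    "Left": (-20, 20),    # left wheel backward, right wheel forward
    "Right": (20, -20),   # left wheel forward, right wheel backward
}


def command_for(probabilities, threshold=0.90):
    """Return wheel speeds for the command above the threshold, else None."""
    for label, speeds in COMMANDS.items():
        if probabilities.get(label, 0.0) > threshold:
            return speeds
    return None


print(command_for({"Left": 0.93, "Go": 0.04}))
print(command_for({"Background Noise": 0.99}))
```

In Snap!, the same idea becomes one condition per class, each setting the wheels to different speeds.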
Here are some tips to make editing your program quick and easy:
- Right click and select Duplicate to make a copy of any of our sample code, and pick it up. Then you can edit that code!
- To edit which item your code is responding to, highlight the item number, delete it, and type in your new number. Do this instead of attempting to select from the drop-down menu.
As you test your program, notice what happens if the computer hears audio that is not similar to something in your training data.
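One reason unfamiliar sounds can give surprising results is that a classifier of this kind typically spreads 100% of its confidence across the classes it was trained on, with no built-in "none of the above". The Python sketch below illustrates this with the softmax function, a common way to turn raw scores into probabilities. Whether your model uses exactly this step is an assumption, and the scores are invented.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    biggest = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["Background Noise", "Stop", "Go"]
# Invented scores for a sound the model was never trained on, such as a clap.
unfamiliar = softmax([0.4, 1.1, 0.9])
for label, p in zip(labels, unfamiliar):
    print(f"{label}: {p:.0%}")
```

No class stands out here, yet the three values still add up to 100%. This is why a threshold such as 90% helps the Finch ignore sounds the model is unsure about.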












