Summary

Congratulations on building a binary audio classification model for speech!

You now have a better understanding of how analog audio is converted to digital sound, and how to create spectrogram images from your wave files. You used the PyTorch Speech Commands dataset, narrowed the classes down to yes and no, and explored ways to understand and visualize audio data patterns. From there, you generated images from the spectrograms and used a convolutional neural network to build your model. A short sketch of that pipeline follows.
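As a quick recap, here's a minimal sketch of the data-loading and spectrogram steps using torchaudio. The dataset root path is illustrative, and filtering by iterating the whole dataset is kept deliberately simple for clarity:

```python
import torchaudio

# Download the Speech Commands dataset (the root path is illustrative).
dataset = torchaudio.datasets.SPEECHCOMMANDS(root="./data", download=True)

# Each item is (waveform, sample_rate, label, speaker_id, utterance_number).
# Keep only the "yes" and "no" classes for binary classification.
yes_no = [item for item in dataset if item[2] in ("yes", "no")]

# Turn one waveform into a spectrogram tensor, ready to treat as an image.
waveform, sample_rate, label, *_ = yes_no[0]
spectrogram = torchaudio.transforms.Spectrogram()(waveform)
print(label, spectrogram.shape)  # e.g. torch.Size([1, 201, 81]) for a one-second clip
```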

You can expand on this knowledge by exploring other datasets and sounds, and by looking at the MFCC transform as an alternative feature representation. Then you can build a model on those features.
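If you want to experiment with MFCC features, torchaudio exposes them as torchaudio.transforms.MFCC. Here's a small sketch, using a synthetic tone in place of a real recording:

```python
import math
import torch
import torchaudio

# A one-second synthetic tone stands in for a real recording here.
sample_rate = 16000
t = torch.linspace(0, 1, sample_rate)
waveform = torch.sin(2 * math.pi * 440 * t).unsqueeze(0)  # shape [1, 16000]

# MFCC condenses the spectral envelope into a few coefficients per frame,
# giving a more compact input than a full spectrogram.
mfcc = torchaudio.transforms.MFCC(sample_rate=sample_rate, n_mfcc=40)(waveform)
print(mfcc.shape)  # torch.Size([1, 40, 81])
```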

Be sure to check out these other modules, too:

Tip

To open a hyperlink, right-click it and choose to open it in a new tab or window. That way, you can view the resource and easily return to the module.