
Automatically splitting an Audio File into smaller chunks based on silence

Learn how you can split an Audio File into smaller chunks to create an AI dataset

The most important part of Audio Classification is being able to create a dataset of sounds. To do so, we can go to a video provider and download the source as a .WAV file, but how do we turn this lengthy source file (e.g., 200MB+) into smaller chunks?

For the project I am working on, I did just that! So let's go through the process I used to create smaller chunks of audio from one big audio file.

Downloading the Source File

First, I started by finding a video that was of interest for my dataset.

💡 Interesting files are "Compilation" files as they contain a lot of the audio that we require.

Once we have found such a file, we can download it directly as a .WAV file. For this, I used https://youtubeto.org/en/youtube-wav.html, which resulted in a ~250MB .WAV file.
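Before splitting, it can help to check what was actually downloaded. Below is a minimal sketch using Python's built-in wave module. It assumes the file is uncompressed PCM WAV, and the function name describe_wav is my own, purely for illustration.

```python
import wave


def describe_wav(path):
    """Return basic facts about a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample, per channel
        rate = wav.getframerate()          # frames (samples per channel) per second
        frames = wav.getnframes()          # total number of frames
    return {
        "channels": channels,
        "sample_width_bytes": sample_width,
        "sample_rate_hz": rate,
        "duration_s": frames / rate,
    }
```

Calling describe_wav on the downloaded file (the path is whatever you saved it as) gives the duration in seconds, which is a quick sanity check that the download is complete.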

Splitting the Audio File on the Silent Parts

Now the most interesting part is splitting this file into smaller chunks. Go and install Audacity (https://www.audacityteam.org/), an audio editor that is going to help us tremendously!

💡 This works best when there is a small "gap" of silence in between audio fragments.
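To see why the gap matters, here is a rough sketch of the general idea behind silence-based splitting. This is not how Audacity implements it, just an illustration: a run of quiet samples only counts as a split point once it is long enough. The function name, the amplitude threshold, and min_silence_len are all assumptions I picked for the example.

```python
def find_sound_segments(samples, threshold, min_silence_len):
    """Return (start, end) index pairs of the non-silent regions.

    samples:         sequence of numeric amplitudes
    threshold:       a sample is "silent" if abs(sample) <= threshold
    min_silence_len: consecutive silent samples required to split
    """
    segments = []
    start = None     # start index of the segment being built, if any
    silent_run = 0   # length of the current run of silent samples

    for i, sample in enumerate(samples):
        if abs(sample) > threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence_len:
                # End the segment at the first sample of the silent run.
                segments.append((start, i - silent_run + 1))
                start = None
                silent_run = 0

    if start is not None:
        # Close the last segment, dropping any trailing silent samples.
        segments.append((start, len(samples) - silent_run))
    return segments


samples = [0, 0, 5, 6, 0, 7, 0, 0, 0, 8, 9, 0]
print(find_sound_segments(samples, threshold=1, min_silence_len=3))
# [(2, 6), (9, 11)]
```

Notice that the single silent sample between 6 and 7 does not cause a split, while the run of three does. If the fragments in your file have no real gap between them, there is nothing for this kind of detection to latch onto.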

Once installed, open your file in Audacity.

This will open the waveform in a normalized linear view. Now, to make our lives easier, we can switch this to a logarithmic dB view by right-clicking on the vertical scale and selecting "dB".
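The dB view helps because quiet passages that all look like a flat line near zero on a linear scale get spread out on a logarithmic one. As a small sketch of the relationship, assuming amplitudes normalized so that full scale is 1.0 (the function name is hypothetical):

```python
import math


def amplitude_to_db(amplitude, full_scale=1.0):
    """Convert a linear amplitude to decibels relative to full scale."""
    ratio = abs(amplitude) / full_scale
    if ratio == 0:
        return float("-inf")  # digital silence has no finite dB value
    return 20 * math.log10(ratio)


for amp in (1.0, 0.5, 0.1, 0.01, 0.001):
    print(f"{amp:>6} -> {amplitude_to_db(amp):7.1f} dB")
```

Amplitudes of 0.01 and 0.001 are almost indistinguishable on the linear view, but they sit 20 dB apart on the dB scale, which makes it much easier to see where the silence really is.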