Speech Recognition Audacity Plugin
A speech recognition plugin for Audacity for automatic transcription.

Installation

I am using Ubuntu Desktop 19.04 with the Anaconda Python environment manager. Python 3.6 is used throughout.

Start by installing the Google Speech-to-Text pip package.

pip install --upgrade google-cloud-speech

A service account and access key have to be created on the Create Service Account Key page. Download the credentials file and note its location.

Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the location of the credentials file.

export GOOGLE_APPLICATION_CREDENTIALS="[PATH]"
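Before calling the API, it can help to confirm that the variable is visible from Python. The following is a minimal sketch, not part of the plugin: the helper name check_credentials is hypothetical, and it only checks that the variable is set and that the file it points to parses as JSON. It does not contact Google or validate the key itself.

```python
import json
import os


def check_credentials(env=os.environ):
    """Return the credentials path if it is set and points to a JSON file.

    check_credentials is a hypothetical helper for illustration only.
    """
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise RuntimeError("No credentials file at " + path)
    with open(path) as handle:
        json.load(handle)  # raises ValueError if the file is not valid JSON
    return path
```

Calling check_credentials() in the same shell session where the export was run should return the path that was set. A RuntimeError usually means the export was done in a different terminal.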

Enable API access from your API Services Console.

More instructions are available on the general Google Cloud Speech-to-Text Docs website and Google Cloud Speech Python-specific documentation.

Compile Audacity with mod-script-pipe

Audacity does not ship with scripting support on Linux, so it must be compiled from source with mod-script-pipe enabled. I am following the instructions here, but look for the most current build instructions here.

NOTE: There are a lot of caveats to this setup. Consult the Linux build wiki for more info.

If you already have Audacity installed via apt, remove it first.

sudo apt remove audacity

Install wxWidgets and other packages:

sudo apt install cmake build-essential gcc libwxgtk3.0 libavformat-dev libsndfile1 libasound2-dev libgtk2.0-dev gettext libid3tag0-dev libmad0-dev libsoundtouch-dev libogg-dev libvorbis-dev libflac-dev libmp3lame0 -y

Get the Audacity source code from FossHub and extract it. Then we build:

cd audacity-*
mkdir build
cd build
../configure --with-lib-preference="local system" --with-ffmpeg="system" --disable-dynamic-loading --with-mod-script-pipe
make -j4

We also need to build mod-script-pipe:

cd lib-src/mod-script-pipe
make -j4

Let’s test it.

cd ../../
mkdir "Portable Settings"
./audacity

If everything works, install it.

sudo make install
sudo mkdir -p /usr/local/share/audacity/modules
sudo cp lib-src/mod-script-pipe/.libs/mod-script-pipe.so /usr/local/share/audacity/modules/mod-script-pipe.so
sudo cp lib-src/mod-script-pipe/.libs/mod-script-pipe.so.0.0.0 /usr/local/share/audacity/modules/mod-script-pipe.so.0.0.0

Run Audacity and enable mod-script-pipe from the Edit -> Preferences -> Modules menu.


Make sure to restart Audacity to use the pipe.

Download and run pipe_test.py to make sure everything is connected.

wget https://raw.githubusercontent.com/audacity/audacity/master/scripts/piped-work/pipe_test.py
python pipe_test.py

You should get lots of OKs.
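Once the test passes, the same pipes can be driven from your own script. The sketch below follows the approach used in pipe_test.py on Linux: write a command line to the "to" pipe, then read from the "from" pipe until a blank line ends the response. The function names pipe_names, do_command and run are my own for illustration, and the pipe paths are assumed to match those in pipe_test.py.

```python
import os


def pipe_names():
    """Return the (to, from) pipe paths that pipe_test.py uses on Linux."""
    uid = str(os.getuid())
    return ("/tmp/audacity_script_pipe.to." + uid,
            "/tmp/audacity_script_pipe.from." + uid)


def do_command(command, to_file, from_file):
    """Send one scripting command and return the response text."""
    to_file.write(command + "\n")
    to_file.flush()
    lines = []
    while True:
        line = from_file.readline()
        if line == "":
            break  # the pipe was closed
        if line == "\n" and lines:
            break  # a blank line after some content ends the response
        lines.append(line)
    return "".join(lines)


def run(command):
    """Open both pipes, send a single command, and return the reply."""
    to_name, from_name = pipe_names()
    with open(to_name, "w") as to_file, open(from_name, "r") as from_file:
        return do_command(command, to_file, from_file)
```

With Audacity running and mod-script-pipe enabled, run("Help: Command=Help") sends the same command that pipe_test.py uses. Keeping do_command separate from the file handling makes it possible to try the read loop with in-memory file objects before touching the real pipes.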

Enable the Extras menu by clicking View -> Extra Menus (on/off) so that scripting will work correctly.

Work Log

27 Apr 2020

I have an idea.

Lately I have been starting some voice-over work for how-to videos. It seems that sound (and perhaps video) editing will be a major consumer of my time. Since the work is mainly voice-over, i.e., spoken English, perhaps I can use speech recognition to tag sections for easy editing.

The process that I am envisioning consists of two parts: a Python API call to Google Speech and an Audacity Python script to insert tags.
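As a rough illustration of the second part, here is one way the recognized words could become tags. This is a sketch under assumptions, not the plugin's actual code: it takes hypothetical (start, end, text) tuples in seconds, standing in for whatever timing data the speech API returns, and formats them as a tab-separated Audacity label file. It uses a label file rather than pipe commands because such a file can be loaded through File -> Import -> Labels. The function name to_label_track is made up for this example.

```python
def to_label_track(segments):
    """Format (start, end, text) tuples as Audacity label-file text.

    Each output line is "start<TAB>end<TAB>text", with times in seconds.
    The segments argument is a hypothetical stand-in for speech API output.
    """
    lines = []
    for start, end, text in segments:
        if end < start:
            raise ValueError("segment ends before it starts: " + repr(text))
        # Collapse tabs and newlines so the text cannot break the format.
        clean = " ".join(text.split())
        lines.append("{:.6f}\t{:.6f}\t{}\n".format(start, end, clean))
    return "".join(lines)
```

For example, to_label_track([(0.0, 0.5, "hello"), (0.5, 1.25, "world")]) yields two lines, one label per word, which can be written to a .txt file and imported into the project.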

Well it’s 2am and I finally got the thing to work.

Bibliography