### Extracting Audio from Video with FFmpeg

First, extract the audio from your video file into a `.wav` file suitable for speech recognition:

1. **Open your terminal.**
2. **Run the FFmpeg command to extract audio:**

   ```bash
   ffmpeg -i input_video.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 output_audio.wav
   ```

   - Replace `input_video.mp4` with the path to your video file.
   - The output is a `.wav` file named `output_audio.wav`.
   - `-vn` drops the video stream, `-acodec pcm_s16le` writes 16-bit PCM, `-ar 16000` resamples to 16 kHz, and `-ac 1` downmixes to mono. This is the input format the pre-trained DeepSpeech models expect.

### Setting Up the Python Virtual Environment and DeepSpeech

Next, prepare your environment for running DeepSpeech:

1. **Update your package list (optional but recommended):**

   ```bash
   sudo apt update
   ```

2. **Install `python3-venv` if you haven't already:**

   ```bash
   sudo apt install python3-venv
   ```

3. **Create a Python virtual environment:**

   ```bash
   python3 -m venv deepspeech-venv
   ```

4. **Activate the virtual environment:**

   ```bash
   source deepspeech-venv/bin/activate
   ```

### Installing DeepSpeech

With your virtual environment active, install DeepSpeech:

1. **Install DeepSpeech within the virtual environment:**

   ```bash
   pip install deepspeech
   ```

### Downloading DeepSpeech Pre-trained Models

Before transcribing, you need the pre-trained model files:

1. **Download the pre-trained DeepSpeech model and scorer files from the [DeepSpeech GitHub releases page](https://github.com/mozilla/DeepSpeech/releases).** Look for files with names like `deepspeech-0.9.3-models.pbmm` and `deepspeech-0.9.3-models.scorer`.
2. **Place the downloaded files in the directory where you plan to run the transcription, or note their paths for use in the transcription command.**

### Transcribing Audio to Text

Finally, you're ready to transcribe the audio file to text:

1. **Make sure you're in the directory containing both the audio file (`output_audio.wav`) and the DeepSpeech model files, or have their paths noted.**
2. **Run DeepSpeech with the following command:**

   ```bash
   deepspeech --model deepspeech-0.9.3-models.pbmm --scorer deepspeech-0.9.3-models.scorer --audio output_audio.wav
   ```

   - Replace `deepspeech-0.9.3-models.pbmm` and `deepspeech-0.9.3-models.scorer` with the paths to your downloaded model and scorer files if they're not in the current directory.
   - Replace `output_audio.wav` with the path to your `.wav` audio file if necessary.

This command prints the transcription of your audio file directly in the terminal. Transcription can take some time, depending on the length of the audio file and the speed of your machine.

### Deactivating the Virtual Environment

When you're done, deactivate the virtual environment:

```bash
deactivate
```

This guide covers extracting audio from a video file and transcribing it to text with DeepSpeech on Debian-based Linux systems.
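
The extraction and transcription steps above can be combined into a pair of shell functions. This is a minimal sketch, not part of DeepSpeech or FFmpeg: the function names `wav_name_for` and `transcribe_video` are hypothetical, the `MODEL` and `SCORER` defaults assume the 0.9.3 files sit in the current directory, and the added `-y` and `-loglevel error` FFmpeg flags (overwrite the output, print errors only) are a convenience that the original command does not use.

```bash
#!/usr/bin/env bash
# Sketch: extract audio from a video and transcribe it with DeepSpeech.
# Assumes ffmpeg is installed and the deepspeech virtual environment is active.

# Assumed model locations; override by setting MODEL / SCORER beforehand.
MODEL="${MODEL:-deepspeech-0.9.3-models.pbmm}"
SCORER="${SCORER:-deepspeech-0.9.3-models.scorer}"

# Hypothetical helper: derive the .wav path from the video path by
# replacing the final extension, e.g. clips/talk.mp4 -> clips/talk.wav.
# Expects a file name that has an extension.
wav_name_for() {
    printf '%s.wav\n' "${1%.*}"
}

# Hypothetical helper: write 16 kHz mono 16-bit PCM audio next to the
# video, then run the deepspeech CLI on it.
transcribe_video() {
    local video="$1"
    local wav
    wav="$(wav_name_for "$video")"

    # -y overwrites an existing .wav; -loglevel error keeps ffmpeg quiet.
    ffmpeg -y -loglevel error -i "$video" \
        -vn -acodec pcm_s16le -ar 16000 -ac 1 "$wav" || return 1

    deepspeech --model "$MODEL" --scorer "$SCORER" --audio "$wav"
}
```

If the block is saved as, say, `transcribe.sh`, running `source transcribe.sh` followed by `transcribe_video input_video.mp4 > transcript.txt` should save the transcript to a file, assuming the `deepspeech` command writes the transcript to standard output and its progress messages to standard error.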