
Use Whisper to transcribe your voice to multiple formats verbatim for free!

2025-01-06



Do you often need to transcribe meeting recordings or interviews into text? Writing verbatim transcripts by hand is time-consuming and laborious, so it makes sense to let AI tools convert the recordings into text. This article introduces Whisper, an automatic speech recognition (ASR) system from OpenAI. According to OpenAI's description on GitHub, Whisper is an open-source speech recognition model that currently recognizes about 96 languages and converts them into text, and its recognition accuracy for Chinese is high. Because Whisper is open source, all you need to set it up is a Google account and a few commands. Once it is installed in a Google Colab notebook, you can use Whisper for speech recognition and transcription free of charge and without developer restrictions.

Whisper installation code: !pip install git+https://github.com/openai/whisper.git

ffmpeg installation code: !sudo apt update && sudo apt install -y ffmpeg

Speech-to-text execution code: !whisper "your-file-name.mp3" --model medium (replace your-file-name.mp3 with the name of your uploaded file)
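
File names that contain spaces or non-ASCII characters are a common source of errors in the execution command above. As a minimal sketch, the helper below assembles the same command as an argument list so that no manual quoting is needed. The name `build_whisper_command` is hypothetical and not part of Whisper; only the `--model` flag from the command above is used.

```python
import shlex


def build_whisper_command(audio_path, model="medium"):
    """Return the whisper CLI call as an argument list (hypothetical helper)."""
    # Same shape as the tutorial's command: whisper "<file>" --model <size>
    return ["whisper", audio_path, "--model", model]


cmd = build_whisper_command("team meeting 01.mp3")
# shlex.join adds quotes only around arguments that need them, for display.
print(shlex.join(cmd))
```

In a notebook, the list could be passed to `subprocess.run(cmd, check=True)` instead of typing the `!whisper` line by hand.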

Step 1: Sign in to your Google account, open Google Drive, click "+ New" in the upper left corner, go to "More" at the bottom of the menu, and then click "Connect more apps".


Step 2: The first time you do this, the Google Workspace Marketplace opens. Type "Google Colaboratory" in the search bar and select it.

Step 3: Click "Install", then select "Continue". You will be asked to sign in with your Google account; follow the instructions to complete the installation.

Step 4: Go back to the Google Drive home page, click "+ New" in the upper left corner again, and select "Google Colaboratory" under "More".


Step 5: Once the notebook opens, you can rename it so that you can quickly find and reuse it later.


Step 6: Click "Runtime" in the top menu bar and select "Change runtime type".


Step 7: Here you can choose the runtime type and the hardware accelerator. Select "Python 3" and "T4 GPU", then click "Save".


Step 8: Click "Connect" in the upper right corner of the window and wait for the connection to succeed.


Step 9: Once connected, you can view the resources allocated to the runtime, including GPU, memory (RAM), and disk information.
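
The GPU selected in Step 7 can also be confirmed from a code cell. The sketch below is one possible check, assuming the runtime ships the standard `nvidia-smi` tool; the function name `gpu_summary` is made up for this example.

```python
import shutil
import subprocess


def gpu_summary():
    """Return the GPU list from `nvidia-smi -L`, or None if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # tool not installed, so no NVIDIA GPU is visible
    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    return result.stdout.strip() if result.returncode == 0 else None


summary = gpu_summary()
print(summary or "No GPU detected; check the runtime type chosen in Step 7.")
```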


Step 10: To install Whisper, enter the Whisper installation code and the ffmpeg installation code on the first and second lines of the code cell in the middle of the page, then click Run.
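
After the install cell finishes, a quick check can confirm that both pieces are in place before any audio is uploaded. This is a minimal sketch using only the Python standard library; `check_setup` is a hypothetical helper name, and it assumes the package installed in this step is importable as `whisper`.

```python
import importlib.util
import shutil


def check_setup():
    """Report whether the whisper package and the ffmpeg binary can be found."""
    return {
        # find_spec returns None when the module is not installed
        "whisper": importlib.util.find_spec("whisper") is not None,
        # shutil.which returns None when the program is not on PATH
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


status = check_setup()
for name, ok in status.items():
    print(f"{name}: {'found' if ok else 'missing'}")
```

If either entry is reported as missing, rerunning the installation cell from this step is the first thing to try.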


Step 11: After the installation finishes, click the folder icon on the left side and select "Upload Files" to upload the MP3 file you want to transcribe.
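
Because the file name in the execution command has to match the uploaded file exactly, it can help to list the audio files the notebook can actually see. The sketch below is illustrative: `list_audio_files` and the set of extensions are assumptions made for this example, and the temporary folder only stands in for the notebook's working directory.

```python
import tempfile
from pathlib import Path

# Assumed set of extensions for this example; extend it as needed.
AUDIO_SUFFIXES = {".mp3", ".wav", ".m4a"}


def list_audio_files(folder="."):
    """Return the sorted names of audio files directly inside `folder`."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_SUFFIXES
    )


# Demo on a throwaway folder so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("interview.mp3", "notes.txt", "Meeting.WAV"):
        (Path(tmp) / name).touch()
    found = list_audio_files(tmp)

print(found)
```

In Colab, calling `list_audio_files()` with no argument checks the current working directory, which is typically where uploaded files land.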


Step 12: Click "+ Code" to add a new cell and enter the speech-to-text execution code. Make sure the file name and extension match the uploaded file exactly, then click Run.
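
When the command finishes, Whisper by default writes the transcript next to the notebook in several formats (.txt, .srt, .vtt, .tsv, and .json). To show how the timed formats relate to the recognized text, here is a minimal sketch that turns a list of segments into SRT subtitles. It assumes segments shaped like those in Whisper's JSON output (`start` and `end` in seconds, plus `text`); the function names and sample data are invented for illustration and this is not Whisper's own writer.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Build SRT text: a counter, a time range, then the text of each segment."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{time_range}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


# Invented sample segments, not real Whisper output.
sample = [
    {"start": 0.0, "end": 2.5, "text": " Hello everyone."},
    {"start": 2.5, "end": 65.25, "text": " Let's begin."},
]
print(segments_to_srt(sample))
```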


May not be reproduced without permission: Chief AI Sharing Circle » Use Whisper to transcribe your voice to multiple formats verbatim for free!

