Using Amazon Transcribe for transcription and speaker separation

There is a transcription service called Amazon Transcribe.

Its features include the following, among others:

  • Speaker separation (available as of 2021/05/17)
  • Japanese language support
  • Support for both video and audio formats

Below is how to use it from the GUI.

Procedure

There is nothing difficult about the procedure; just follow the instructions on the screen.

  1. Log in to the AWS console.
    1. You need to create an AWS account first.
  2. Upload the audio file to S3.
    1. You don’t need to tweak any options.
    2. Choose a region that is close to your location.
  3. Transcribe with Amazon Transcribe.
    1. Click “Create job”.
    2. Name can be anything as long as it’s distinguishable.
    3. For Language, select Japanese.
    4. For Input data, select the file you uploaded from Browse S3.
    5. Click Next.
    6. Audio settings
      1. Turn on Audio identification (speaker separation).
      2. Select Speaker identification to distinguish between speakers by the characteristics of each speaker's voice.
      3. Turn on Alternative results.
        1. Set Maximum alternatives to 1.
        2. The number of alternatives you set here will be output. If you want to examine the variations, set the number to 2 or more.
    7. Click Create.
    8. Confirm that the job you entered in Name has been added to Transcription jobs.
    9. Wait for the Status to become Complete.
    10. Click the job to get the output JSON.
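
For reference, the same job can also be started from a script instead of the GUI. The following is a minimal sketch using boto3, assuming AWS credentials are already configured and the audio file is already in S3. The job name and S3 URI in the usage comment are placeholders, and Alternative results are left out to keep it short.

```python
import time


def build_job_request(job_name, media_uri, max_speakers=2):
    """Build the arguments for start_transcription_job (Japanese, speaker labels on)."""
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": "ja-JP",
        "Media": {"MediaFileUri": media_uri},
        "Settings": {
            # Corresponds to "Speaker identification" in the console.
            "ShowSpeakerLabels": True,
            "MaxSpeakerLabels": max_speakers,
        },
    }


def wait_for_job(client, job_name, poll_seconds=10):
    """Poll until the job has finished, then return its description."""
    while True:
        response = client.get_transcription_job(TranscriptionJobName=job_name)
        job = response["TranscriptionJob"]
        if job["TranscriptionJobStatus"] in ("COMPLETED", "FAILED"):
            return job
        time.sleep(poll_seconds)


def run_job(job_name, media_uri, max_speakers=2):
    """Start the job and wait for it; requires boto3 and AWS credentials."""
    import boto3  # imported here so the helpers above work without boto3

    client = boto3.client("transcribe")
    client.start_transcription_job(**build_job_request(job_name, media_uri, max_speakers))
    return wait_for_job(client, job_name)


# Usage (placeholder names):
#   job = run_job("meeting-01", "s3://example-bucket/meeting.mp3")
#   print(job["Transcript"]["TranscriptFileUri"])
```

When the job completes, the returned description contains the location of the output JSON, which is the same file the console offers for download.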

Extract speech by speaker

Example script, written in Python
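
A minimal sketch of such a script follows. It assumes the output JSON uses the usual Transcribe layout, with word-level entries under results.items and speaker assignments under results.speaker_labels.segments. The layout can differ when Alternative results are enabled, so check your own file first. The function name and the "unknown" fallback label are illustrative choices, not part of the service.

```python
import json


def speech_by_speaker(transcribe_output, joiner=""):
    """Return a list of (speaker_label, text) pairs in spoken order.

    joiner is placed between words; the empty default suits Japanese,
    while " " suits languages written with spaces.
    """
    results = transcribe_output["results"]

    # Map each word's start_time to the speaker assigned to it.
    speaker_at = {}
    for segment in results["speaker_labels"]["segments"]:
        for item in segment["items"]:
            speaker_at[item["start_time"]] = item["speaker_label"]

    utterances = []  # each entry is [speaker, text]
    for item in results["items"]:
        content = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            # Punctuation carries no timestamp; attach it to the current utterance.
            if utterances:
                utterances[-1][1] += content
            continue
        speaker = speaker_at.get(item["start_time"], "unknown")
        if utterances and utterances[-1][0] == speaker:
            # Same speaker is still talking: extend the utterance.
            utterances[-1][1] += joiner + content
        else:
            utterances.append([speaker, content])
    return [(speaker, text) for speaker, text in utterances]


def load_and_split(path, joiner=""):
    """Read a downloaded Transcribe output file and split it by speaker."""
    with open(path, encoding="utf-8") as f:
        return speech_by_speaker(json.load(f), joiner)


# Usage (placeholder file name):
#   for speaker, text in load_and_split("asrOutput.json"):
#       print(f"{speaker}: {text}")
```

Consecutive words from the same speaker are merged into one utterance, so the result reads like a dialogue rather than a word list.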
