boyangzhang1993/swift-transcription

swift-transcription

Description: An easy implementation of audio transcription in Swift

Input: An audio file, such as an .mp3 file

Output: Transcription

First step: set up view

The four essential components are a title, a text space, a button, and an activity spinner.

Title and Text space

The title describes the purpose of the app. It is a very simple component: use a label and center it horizontally.

The text space displays the final transcription. Add a text field for it under the title.

Button and Activity spinner

The button drives the main function of this app: when a user presses it, the app transcribes the audio. The activity spinner plays an animation to indicate that transcription is in progress. The basic setup is again simple: add a button and an activity spinner and align them however you like.

Next, connect the text space, activity spinner, and button to the view controller. Right-click each element and connect it to the ViewController: outlets for the text space and the spinner, and an action for the button.
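After these connections are made, the view controller might look like the minimal sketch below. The names `transcriptionTextField`, `activitySpinner`, `audioPlay`, and `playButtonPressed` come from the code under "Button functions". Their exact types (`UITextField`, `UIActivityIndicatorView`, `AVAudioPlayer!`) and the choice to hide the spinner in `viewDidLoad` are assumptions, so check the ViewController file in the repository for the actual declarations.

```swift
import UIKit
import AVFoundation
import Speech  // needed later for SFSpeechRecognizer

// Sketch of the view controller skeleton; property types are assumed.
class ViewController: UIViewController, AVAudioPlayerDelegate {

    // Connected from the storyboard.
    @IBOutlet weak var transcriptionTextField: UITextField!
    @IBOutlet weak var activitySpinner: UIActivityIndicatorView!

    // Keeps the audio player alive while the file is playing.
    var audioPlay: AVAudioPlayer!

    override func viewDidLoad() {
        super.viewDidLoad()
        // Assumption: the spinner starts hidden until the button is pressed.
        activitySpinner.isHidden = true
    }

    // Connected to the button's "Touch Up Inside" event.
    @IBAction func playButtonPressed(_ sender: Any) {
        // Filled in under "Button functions".
    }
}
```

The class adopts `AVAudioPlayerDelegate` because the transcription code assigns `self` as the player's delegate.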

Button functions

You can use the ViewController, AppDelegate, and SceneDelegate files I posted in this GitHub repository.

The following explains some parts of them.

Inside the button's action, we first make three calls.

    @IBAction func playButtonPressed(_ sender: Any) {
        activitySpinner.isHidden = false // Show the activity spinner
        activitySpinner.startAnimating() // Start the spinner animation
        requestSpeechAuth()              // Start transcription
    }

The first call shows the hidden activitySpinner. The second call starts the spinner's animation.

Both are already provided by UIKit, so we don't need to implement anything.

The third call starts transcription. requestSpeechAuth() loads an mp3 file from the app bundle, plays it, and transcribes it. SFSpeechRecognizer is the main class: we give it a request built from the file's URL, and it does the task.

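    // Requires `import Speech` and `import AVFoundation` at the top of the file,
    // and an NSSpeechRecognitionUsageDescription entry in Info.plist.
    func requestSpeechAuth() {
        SFSpeechRecognizer.requestAuthorization { authStatus in
            // The callback may arrive off the main thread, so hop back before touching the UI.
            DispatchQueue.main.async {
                guard authStatus == .authorized,
                      let path = Bundle.main.url(forResource: "test", withExtension: "mp3") else {
                    print("Speech recognition not authorized, or test.mp3 is missing")
                    return
                }

                // Play the audio file while it is being transcribed.
                do {
                    let sound = try AVAudioPlayer(contentsOf: path)
                    self.audioPlay = sound
                    self.audioPlay.delegate = self
                    sound.play()
                } catch {
                    print("Could not play audio: \(error)")
                }

                // Transcribe the same file.
                let recognizer = SFSpeechRecognizer()
                let request = SFSpeechURLRecognitionRequest(url: path)
                recognizer?.recognitionTask(with: request) { (result, error) in
                    if let error = error {
                        print("There was an error: \(error)")
                    } else if let result = result {
                        // Show the recognizer's current best guess.
                        let text = result.bestTranscription.formattedString
                        print(text)
                        self.transcriptionTextField.text = text
                    }
                }
            }
        }
    }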
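If `test.mp3` is not in the app bundle, the lookup in the function above returns `nil` and no transcription starts. As a small sketch, the lookup can be pulled into a helper that reports the missing file explicitly. The helper name `bundledAudioURL` and the `AudioLookupError` type are hypothetical. Only `Bundle.main.url(forResource:withExtension:)` comes from the original code.

```swift
import Foundation

// Hypothetical error type for this sketch.
enum AudioLookupError: Error, Equatable {
    case notFound(name: String, ext: String)
}

// Looks up an audio file in a bundle and throws if it is not there.
func bundledAudioURL(named name: String,
                     withExtension ext: String,
                     in bundle: Bundle = .main) throws -> URL {
    guard let url = bundle.url(forResource: name, withExtension: ext) else {
        throw AudioLookupError.notFound(name: name, ext: ext)
    }
    return url
}
```

Inside `requestSpeechAuth()` this could be called as `let path = try bundledAudioURL(named: "test", withExtension: "mp3")` within a `do`/`catch`, so a missing file produces a specific error message.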

Results

Finally, when we press the button, the activity spinner starts spinning and the app calls SFSpeechRecognizer to produce the most likely transcription, which is shown in the text field.
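The recognizer may call its result handler several times with partial transcriptions before the final one, and the code in this README never stops the spinner. One possible way to handle both, sketched below, is to reduce each callback to a small value describing what the UI should do. The `TranscriptionUpdate` type and the `makeUpdate` function are hypothetical names introduced for illustration. The `isFinal` parameter is meant to be fed from the `isFinal` property of `SFSpeechRecognitionResult`.

```swift
// Hypothetical value type describing one UI update.
struct TranscriptionUpdate: Equatable {
    let text: String?      // new text to display, or nil to leave the field unchanged
    let stopSpinner: Bool  // true once recognition has finished or failed
}

// Pure function: maps one recognizer callback to a UI update.
func makeUpdate(transcript: String?, isFinal: Bool, failed: Bool) -> TranscriptionUpdate {
    if failed {
        // On error, keep whatever text is shown and stop the spinner.
        return TranscriptionUpdate(text: nil, stopSpinner: true)
    }
    return TranscriptionUpdate(text: transcript, stopSpinner: isFinal)
}

// Intended use inside the recognitionTask handler (not run here):
//
//     let update = makeUpdate(transcript: result?.bestTranscription.formattedString,
//                             isFinal: result?.isFinal ?? false,
//                             failed: error != nil)
//     if let text = update.text { self.transcriptionTextField.text = text }
//     if update.stopSpinner { self.activitySpinner.stopAnimating() }
```

Keeping the decision in a pure function makes it easy to check without running the speech recognizer. `stopAnimating()` also hides the spinner as long as `hidesWhenStopped` is left at its default of `true`.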
