How to convert speech to text using JavaScript?
Overview
To convert spoken words to text in the browser, we generally use the “SpeechRecognition” component of the Web Speech API. SpeechRecognition captures audio from the microphone and converts the spoken words to text. The recognized text is returned in a list of results, which we then display inside an HTML element on the page.
Syntax
The basic syntax is −
let recognization = new webkitSpeechRecognition();
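SpeechRecognition is the standard name of the constructor. Chrome and Apple Safari expose it under the prefixed name webkitSpeechRecognition, so the examples here use webkitSpeechRecognition(). In a browser that supports the unprefixed name, SpeechRecognition() can be used instead.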
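Because the constructor name differs between browsers, it can help to look it up once before creating the object. The following is a minimal sketch of that check; getSpeechRecognition is a hypothetical helper name chosen for illustration and is not part of the Web Speech API.

```javascript
// Hypothetical helper (not part of the Web Speech API): returns the
// recognition constructor found on the given global object, preferring the
// standard name over the prefixed one, or null when neither exists.
function getSpeechRecognition(globalObject = globalThis) {
  return globalObject.SpeechRecognition
    || globalObject.webkitSpeechRecognition
    || null;
}

const Recognition = getSpeechRecognition();
if (Recognition) {
  // Same object as in the syntax above, created through whichever name exists.
  const recognization = new Recognition();
  console.log("Speech recognition is available.", recognization);
} else {
  console.log("Speech recognition is not supported in this environment.");
}
```

With this check in place, a page can show a fallback message instead of throwing an error in browsers that do not provide either name.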
Algorithm
Step 1 − Create an HTML page as shown in the example below. Add a button using the <button> tag and give it an onclick attribute that calls “runSpeechRecog()”. Also create a <p> tag with the id “action” for status messages and an <h3> tag with the id “output” for the recognized text.
Step 2 − Create the runSpeechRecog() arrow function inside a <script> tag, as we are using internal JavaScript.
Step 3 − Select the “action” and “output” elements through the Document Object Model (DOM) using document.getElementById(). Store them in the variables action and output.
Step 4 − Create an object with the webkitSpeechRecognition() constructor and store it in a variable, so that all the properties and methods of the recognition object are available through that variable.
let recognization = new webkitSpeechRecognition();
Step 5 − Assign a function to “recognization.onstart”. This event handler runs when recognition starts, and here it shows a status message.
recognization.onstart = () => { action.innerHTML = "Listening..."; }
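Step 6 − Now assign a function to recognization.onresult to display the spoken words on the screen. This handler runs when a result is available.
recognization.onresult = (e) => {
   // First alternative of the first result
   var transcript = e.results[0][0].transcript;
   // Score between 0 and 1 for that alternative (not used below)
   var confidence = e.results[0][0].confidence;
   output.innerHTML = transcript;
   output.classList.remove("hide");
   action.innerHTML = "";
}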
Step 7 − Call the recognization.start() method to start the speech recognition.
recognization.start();
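The steps above handle only the start and result events. As a minimal sketch, the same object could also report errors and the end of a session through its onerror and onend handlers. The helper name createRecognizer, the callback names onText and onStatus, and the language "en-US" are assumptions made for this illustration and do not come from the example in this article.

```javascript
// Hypothetical helper: creates a recognition object from the given
// constructor and wires its events to two caller-supplied callbacks.
function createRecognizer(Recognition, { onText, onStatus }) {
  const recognization = new Recognition();
  recognization.lang = "en-US";          // assumed language for this sketch
  recognization.interimResults = false;  // deliver final results only
  recognization.maxAlternatives = 1;     // one alternative per result

  let failed = false;                    // remembers whether an error occurred

  recognization.onstart = () => onStatus("Listening...");
  recognization.onresult = (e) => onText(e.results[0][0].transcript);
  recognization.onerror = (e) => {
    failed = true;
    onStatus("Error: " + e.error);       // e.g. "no-speech" or "not-allowed"
  };
  // onend also fires after an error, so keep the error message visible.
  recognization.onend = () => {
    if (!failed) onStatus("");
  };
  return recognization;
}

// Possible use in the page from this article (browser only):
// const recognization = createRecognizer(webkitSpeechRecognition, {
//   onText: (text) => { output.innerHTML = text; },
//   onStatus: (status) => { action.innerHTML = status; },
// });
// recognization.start();
```

Passing the constructor in as a parameter keeps the helper independent of the browser-specific name.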
Example
<html>
<head>
   <title>Speech to text</title>
</head>
<body>
   <div class="speaker" style="display: flex;justify-content: space-between;width: 13rem;box-shadow: 0 0 13px #0000003d;border-radius: 5px;">
      <p id="action" style="color: grey;font-weight: 800; padding: 0; padding-left: 2rem;"></p>
      <button onclick="runSpeechRecog()" style="border: transparent;padding: 0 0.5rem;">
         Speech
      </button>
   </div>
   <h3 id="output" class="hide"></h3>
   <script>
      // Runs when the Speech button is clicked
      const runSpeechRecog = () => {
         var output = document.getElementById('output');
         var action = document.getElementById('action');
         output.innerHTML = "Loading text...";

         let recognization = new webkitSpeechRecognition();

         // Recognition has started: show a status message
         recognization.onstart = () => {
            action.innerHTML = "Listening...";
         }

         // A result is available: show the first alternative of the first result
         recognization.onresult = (e) => {
            var transcript = e.results[0][0].transcript;
            output.innerHTML = transcript;
            output.classList.remove("hide");
            action.innerHTML = "";
         }

         recognization.start();
      }
   </script>
</body>
</html>
Description
When the “runSpeechRecog()” function is triggered, a webkitSpeechRecognition() object is created and stored in the recognization variable, and recognition is started. The page then shows “Listening...” next to the button, indicating that the browser is ready to listen to the user's spoken words.
When the user stops speaking, recognition ends and the result is delivered to onresult as a list of results, each holding one or more alternatives. The code reads the transcript of the first alternative of the first result, e.results[0][0].transcript, and displays it on the page. For example, if a user opens this page in a browser, presses the Speech button and says “tutorialpoint.com”, recognition stops when the user stops speaking and the transcript “tutorialpoint.com” is displayed in the browser.
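To make the shape of e.results more concrete, here is a small sketch that reads the first result and returns the transcript of its highest-confidence alternative. The helper name bestTranscript is hypothetical. Scanning by confidence is an illustrative choice, since with the default settings a result typically carries a single alternative, which makes this equivalent to e.results[0][0].transcript.

```javascript
// Hypothetical helper: picks the alternative with the highest confidence
// from the first result and returns its transcript.
// `results` is expected to be indexable like e.results: results[i][j] is
// alternative j of result i, with `transcript` and `confidence` properties.
function bestTranscript(results) {
  const alternatives = results[0];
  let best = alternatives[0];
  for (let i = 1; i < alternatives.length; i++) {
    if (alternatives[i].confidence > best.confidence) {
      best = alternatives[i];
    }
  }
  return best.transcript;
}

// Plain arrays stand in for the browser's result list in this demonstration.
const sampleResults = [[
  { transcript: "hollow world", confidence: 0.4 },
  { transcript: "hello world", confidence: 0.9 },
]];
console.log(bestTranscript(sampleResults)); // prints "hello world"
```

Inside the onresult handler, the call would be bestTranscript(e.results).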
Conclusion
The Web Speech API is used in many types of applications. It has two components: the SpeechRecognition API, which is used for speech-to-text conversion, and the SpeechSynthesis API, which is used for text-to-speech conversion. The SpeechRecognition example above is supported in the Chrome, Apple Safari and Opera browsers.