Real-Time Speech-to-Text Converter Using JavaScript
Building a Real-Time Speech-to-Text Web Application with the Web Speech API in JavaScript
Understanding the Web Speech API
The Web Speech API is a browser API that lets developers add speech recognition and speech synthesis to web applications. It has two parts: SpeechRecognition, which turns spoken language into text, and SpeechSynthesis, which reads text aloud. This article uses only the recognition part.
Prerequisites
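Before we dive into the implementation, make sure you have a modern web browser that supports the recognition part of the Web Speech API. Google Chrome is a common choice; it exposes the interface under the prefixed name webkitSpeechRecognition and relies on a server-side recognition service, so you also need an internet connection and a working microphone.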
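A quick way to check for support is to look for the recognition constructor on the global object. The snippet below is a minimal sketch: the helper name getSpeechRecognition is made up for this article, and it takes the global object as a parameter so it can be tried outside a browser as well.

```javascript
// Hypothetical helper (not part of the Web Speech API): returns the
// recognition constructor exposed by the given global object, or null
// when none is available.
function getSpeechRecognition(globalObject) {
  return globalObject.SpeechRecognition
    || globalObject.webkitSpeechRecognition
    || null;
}

// In a browser this is `window`; elsewhere (for example Node.js) fall back
// to globalThis, where the constructor will simply be missing.
const root = typeof window !== 'undefined' ? window : globalThis;
const Recognition = getSpeechRecognition(root);

if (Recognition === null) {
  console.log('Speech recognition is not supported in this environment.');
} else {
  console.log('Speech recognition is available.');
}
```

Checking the standard name first and the webkit-prefixed name second means the same code keeps working if a browser drops the prefix.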
Setting Up the HTML Structure
Let's start by creating the basic HTML structure for our web application. We'll include buttons for starting and stopping speech recognition, as well as a container to display the transcribed text.
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Real-time Speech-to-Text</title>
</head>
<body>
  <h1>Real-time Speech-to-Text</h1>
  <button id="startSpeech">Start Speaking</button>
  <button id="stopSpeech" disabled>Stop Speaking</button>
  <div id="output"></div>
  <div id="stopDiv"></div>
</body>
</html>
Implementing Speech Recognition with JavaScript
Now, let's create a script to handle the speech recognition logic. Place it just before the closing </body> tag of the page above. The script uses the Web Speech API to capture the user's voice and appends each phrase to the page as soon as the browser returns it.
<script>
  document.addEventListener('DOMContentLoaded', () => {
    const startSpeechButton = document.getElementById('startSpeech');
    const stopSpeechButton = document.getElementById('stopSpeech');
    const outputDiv = document.getElementById('output');
    const stopDiv = document.getElementById('stopDiv');

    // Chrome exposes the constructor with a webkit prefix; prefer the standard name when present.
    const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
    if (!SpeechRecognition) {
      outputDiv.textContent = 'Speech recognition is not supported in this browser.';
      startSpeechButton.disabled = true;
      return;
    }

    const recognition = new SpeechRecognition();
    recognition.continuous = true; // keep listening across pauses until stopped
    recognition.lang = 'en-US';

    recognition.onstart = () => {
      outputDiv.textContent = 'Listening...';
      stopDiv.textContent = '';
      startSpeechButton.disabled = true;
      stopSpeechButton.disabled = false;
    };

    recognition.onresult = (event) => {
      // Clear the placeholder before showing the first transcript.
      if (outputDiv.textContent === 'Listening...') {
        outputDiv.textContent = '';
      }
      // Append every result delivered with this event, as plain text.
      for (let i = event.resultIndex; i < event.results.length; i++) {
        outputDiv.textContent += ' ' + event.results[i][0].transcript;
      }
    };

    recognition.onerror = (event) => {
      outputDiv.textContent = 'Error occurred: ' + event.error;
      stopSpeech();
    };

    recognition.onend = () => {
      stopDiv.textContent = 'Speech recognition stopped.';
      startSpeechButton.disabled = false;
      stopSpeechButton.disabled = true;
    };

    startSpeechButton.addEventListener('click', startSpeech);
    stopSpeechButton.addEventListener('click', stopSpeech);

    function startSpeech() {
      recognition.start();
    }

    function stopSpeech() {
      recognition.stop();
    }
  });
</script>
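The script above leaves interimResults at its default, so text shows up only once the browser has settled on a phrase. If you want words to appear while the user is still talking, one option is to set recognition.interimResults = true and rebuild the display from event.results on every event. The sketch below shows only the text-handling part; splitTranscript is a hypothetical helper name, and the sample data is a hand-made stand-in for the result list a browser would deliver.

```javascript
// Hypothetical helper: separates finalized text from text that the
// recognizer may still revise. `results` is any array-like list whose
// items have an `isFinal` flag and a best alternative at index 0,
// which is the shape of `event.results` in the Web Speech API.
function splitTranscript(results) {
  const finalParts = [];
  const interimParts = [];
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    const text = result[0].transcript.trim();
    if (text === '') continue;
    (result.isFinal ? finalParts : interimParts).push(text);
  }
  return {
    finalText: finalParts.join(' '),
    interimText: interimParts.join(' '),
  };
}

// Hand-made sample standing in for `event.results`.
const sampleResults = [
  { 0: { transcript: 'hello world' }, isFinal: true },
  { 0: { transcript: ' how are' }, isFinal: false },
];
console.log(splitTranscript(sampleResults));

// Possible use inside the page script (assumes interimResults is true):
// recognition.onresult = (event) => {
//   const { finalText, interimText } = splitTranscript(event.results);
//   outputDiv.textContent = (finalText + ' ' + interimText).trim();
// };
```

Rebuilding the whole string on each event, instead of appending, keeps interim guesses from piling up in the output when the recognizer revises them.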
Testing the Application
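Open the HTML file in a supported browser, click the "Start Speaking" button, allow microphone access when the browser asks for it, and start speaking. Each recognized phrase should appear on the page shortly after you say it. If recognition fails when the file is opened directly from disk, serving the page from a local web server (for example http://localhost) is a common workaround.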
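When something goes wrong during testing, the error event carries a short code in its error property, such as not-allowed or no-speech. The sketch below maps a few of the codes defined by the Web Speech API to friendlier hints. The helper name describeSpeechError and the wording of the hints are choices made for this article, not part of the API.

```javascript
// A few of the error codes defined by the Web Speech API, paired with
// hints written for this example.
const ERROR_HINTS = new Map([
  ['no-speech', 'No speech was detected. Check that the microphone is picking up sound.'],
  ['audio-capture', 'No microphone was found, or it could not be opened.'],
  ['not-allowed', 'Microphone access was denied. Allow it in the browser settings.'],
  ['network', 'A network error occurred while contacting the recognition service.'],
  ['language-not-supported', 'The requested language is not supported.'],
]);

// Hypothetical helper: returns a hint for known codes and a generic
// message that includes the raw code for anything else.
function describeSpeechError(code) {
  return ERROR_HINTS.get(code) || 'Speech recognition error: ' + code;
}

console.log(describeSpeechError('not-allowed'));
console.log(describeSpeechError('aborted'));
```

In the page script, the handler could then use outputDiv.textContent = describeSpeechError(event.error) instead of printing the raw code.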
Conclusion
Integrating real-time speech-to-text functionality into web applications can enhance user interaction and accessibility. The Web Speech API provides a convenient way to implement this feature, and with the provided HTML and JavaScript code, you can get started building your own real-time speech-to-text web application. Experiment with different browsers and settings to ensure a seamless experience for your users.
That is all it takes to build a basic real-time speech-to-text converter in JavaScript.
Happy coding!