Real-Time Speech-to-Text Converter Using JavaScript

Date: Dec 16, 2023

Building a Real-Time Speech-to-Text Web Application with the Web Speech API in JavaScript

Understanding the Web Speech API

The Web Speech API is a browser-based API that enables developers to integrate speech recognition and synthesis capabilities into web applications. It provides a straightforward way to capture spoken language and convert it into text.

Prerequisites

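Before we dive into the implementation, ensure you have a modern web browser that supports the speech recognition part of the Web Speech API. Chromium-based browsers such as Google Chrome and Microsoft Edge expose it under the prefixed name webkitSpeechRecognition, as does Safari; Firefox does not enable it by default. You will also need a working microphone, and in Chrome an internet connection, because recognition is performed by a remote service.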
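Because support varies between browsers, it can help to check for the API before using it. The following is a minimal sketch of such a check; the helper name getSpeechRecognitionCtor is an assumption made for illustration and is not part of the Web Speech API.

```javascript
// Hypothetical helper: returns the recognition constructor exposed by the
// given global object, or null when neither the standard nor the
// WebKit-prefixed name is present.
function getSpeechRecognitionCtor(globalObj) {
    return globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition || null;
}

// In a browser, pass `window`. The guard keeps the snippet harmless elsewhere.
if (typeof window !== 'undefined') {
    const Ctor = getSpeechRecognitionCtor(window);
    console.log(Ctor ? 'Speech recognition is available.' : 'Speech recognition is not supported.');
}
```

Taking the global object as a parameter, rather than reading window directly, keeps the check easy to try outside a browser.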

Setting Up the HTML Structure

Let's start by creating the basic HTML structure for our web application. We'll include buttons for starting and stopping speech recognition, as well as a container to display the transcribed text.

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Real-time Speech-to-Text</title>
</head>
<body>
    <h1>Real-time Speech-to-Text</h1>
    <button id="startSpeech">Start Speaking</button>
    <button id="stopSpeech" disabled>Stop Speaking</button>
    <div id="output"></div>
    <div id="stopDiv"></div>
</body>
</html>

Implementing Speech Recognition with JavaScript

Now, let's create a script to handle the speech recognition logic. Add it to the same HTML file, for example just before the closing </body> tag. The script uses the Web Speech API to capture the user's voice input and appends the transcribed text to the page as each phrase is recognized.

<script>
    document.addEventListener('DOMContentLoaded', (event) => {
        const startSpeechButton = document.getElementById('startSpeech');
        const stopSpeechButton = document.getElementById('stopSpeech');
        const outputDiv = document.getElementById('output');
        const stopDiv = document.getElementById('stopDiv');

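        // Prefer the standard constructor; fall back to the WebKit-prefixed one (Chrome, Edge, Safari)
        const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
        if (!SpeechRecognition) {
            outputDiv.textContent = 'Speech recognition is not supported in this browser.';
            startSpeechButton.disabled = true;
            return;
        }
        const recognition = new SpeechRecognition();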

        recognition.continuous = true;
        recognition.lang = 'en-US';

        recognition.onstart = () => {
            outputDiv.textContent = 'Listening...';
            stopDiv.textContent = ''; // Clear the "stopped" notice left by a previous session
            startSpeechButton.disabled = true;
            stopSpeechButton.disabled = false;
        };

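        recognition.onresult = (event) => {
            // Remove the "Listening..." placeholder before showing the first transcript
            if (outputDiv.textContent === 'Listening...') {
                outputDiv.textContent = '';
            }
            // Append every result delivered with this event, inserted as plain text rather than HTML
            for (let i = event.resultIndex; i < event.results.length; i++) {
                outputDiv.textContent += ' ' + event.results[i][0].transcript;
            }
        };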

        recognition.onerror = (event) => {
            outputDiv.innerHTML = 'Error occurred: ' + event.error;
            stopSpeech();
        };

        recognition.onend = () => {
            stopDiv.innerHTML = 'Speech recognition stopped.';
            startSpeechButton.disabled = false;
            stopSpeechButton.disabled = true;
        };

        startSpeechButton.addEventListener('click', startSpeech);
        stopSpeechButton.addEventListener('click', stopSpeech);

        function startSpeech() {
            recognition.start();
        }

        function stopSpeech() {
            recognition.stop();
        }
    });
</script>
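The script leaves interimResults at its default value of false, so text appears only once the browser has finalized a phrase. For word-by-word feedback, one option is to enable interim results and keep finalized and provisional text separate. The sketch below assumes a helper named splitTranscripts, which is hypothetical; the isFinal, resultIndex, and interimResults properties come from the Web Speech API.

```javascript
// Hypothetical helper: splits the results delivered by an `onresult` event
// into finalized text and provisional (interim) text that may still change.
function splitTranscripts(results, resultIndex) {
    let finalText = '';
    let interimText = '';
    for (let i = resultIndex; i < results.length; i++) {
        const transcript = results[i][0].transcript; // best alternative
        if (results[i].isFinal) {
            finalText += transcript;
        } else {
            interimText += transcript;
        }
    }
    return { finalText, interimText };
}

// Possible browser wiring, assuming the `recognition` and `outputDiv`
// variables from the script in this article:
//
// recognition.interimResults = true;
// let confirmed = '';
// recognition.onresult = (event) => {
//     const { finalText, interimText } = splitTranscripts(event.results, event.resultIndex);
//     confirmed += finalText;
//     outputDiv.textContent = confirmed + interimText;
// };
```

Only finalized text is accumulated in confirmed; interim text is redrawn on every event, since the browser may revise it until the phrase is final.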

Testing the Application

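Open the HTML file in a supported browser, click the "Start Speaking" button, allow microphone access when the browser asks for it, and start speaking. The recognized speech should appear on the web page as each phrase is completed. If the permission prompt reappears on every attempt or recognition does not start, try serving the file from http://localhost or over HTTPS instead of opening it directly from disk.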
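If nothing is transcribed during testing, the error value passed to onerror usually indicates why. The following sketch maps a few of the error codes defined by the Web Speech API to friendlier messages; the names RECOGNITION_ERROR_MESSAGES and describeRecognitionError, and the message wording, are assumptions made for illustration.

```javascript
// Hypothetical lookup table: a few error codes defined by the Web Speech API,
// paired with user-facing explanations.
const RECOGNITION_ERROR_MESSAGES = new Map([
    ['no-speech', 'No speech was detected. Try speaking again.'],
    ['audio-capture', 'No microphone was found, or it could not be used.'],
    ['not-allowed', 'Microphone access was blocked. Check the site permissions.'],
    ['network', 'A network error interrupted recognition.'],
]);

// Returns a readable message for a known code, or a generic one otherwise.
function describeRecognitionError(code) {
    return RECOGNITION_ERROR_MESSAGES.get(code) || 'Error occurred: ' + code;
}

// Possible use inside the handler from this article:
// recognition.onerror = (event) => {
//     outputDiv.textContent = describeRecognitionError(event.error);
// };
```

Codes that are not in the table fall back to the same "Error occurred" wording the original handler uses.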

Conclusion

Integrating real-time speech-to-text functionality into web applications can enhance user interaction and accessibility. The Web Speech API provides a convenient way to implement this feature, and with the provided HTML and JavaScript code, you can get started building your own real-time speech-to-text web application. Experiment with different browsers and settings to ensure a seamless experience for your users.

That is how to build a real-time speech-to-text converter with JavaScript and the Web Speech API.

Happy coding!