Speech Recognition Microservice (Node.js)

A microservice that transcribes audio to text with the Google Cloud Speech-to-Text API and generates notes from the transcript with OpenAI.

Features

Audio file upload and processing
Speech-to-text conversion using Google Cloud Speech API
Text processing with OpenAI
CORS support for cross-origin requests
File validation middleware
Streaming audio processing

Prerequisites

Node.js (v14 or higher)
Google Cloud Platform account with Speech-to-Text API enabled
OpenAI API key

Installation

Clone the repository:

git clone <repository-url>
cd speachRecognition

Install dependencies:

npm install

Create a .env file in the root directory with the following variables:

PORT=8080
GOOGLE_APPLICATION_CREDENTIALS=path/to/your/SpeechClient.json
OPENAI_API_KEY=your_openai_api_key
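As a rough illustration of how these variables might be read at startup, the sketch below validates them and falls back to port 8080. The helper name loadConfig and the returned field names are hypothetical and not taken from this repository; it assumes dotenv (or the shell) has already populated process.env.

```javascript
// Hypothetical helper, not part of this repository: validates the variables
// listed above. Assumes process.env has already been populated, for example
// by dotenv reading the .env file.
function loadConfig(env = process.env) {
  const required = ["GOOGLE_APPLICATION_CREDENTIALS", "OPENAI_API_KEY"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  // Fall back to 8080 when PORT is unset or empty.
  const port = Number.parseInt(env.PORT || "8080", 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  return {
    port,
    googleCredentialsPath: env.GOOGLE_APPLICATION_CREDENTIALS,
    openaiApiKey: env.OPENAI_API_KEY,
  };
}
```

Failing fast on a missing key at startup tends to be easier to diagnose than an authentication error surfacing on the first request.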

Project Structure

src/
├── CORS/
│   └── config.js         # CORS configuration
├── config/
│   └── SpeechClient.json # Google Cloud credentials
├── handlers/
│   ├── streamHandler.js  # Audio stream processing
│   ├── speechRecognize.js # Speech recognition logic
│   └── gptProcess.js     # OpenAI text processing
├── middleware/
│   ├── isFile.js        # File validation middleware
│   └── isAuth.js        # Authentication middleware
└── index.js             # Main application entry point

API Endpoints

POST /upload-audio

Upload and process audio files.

Request:

Method: POST
Content-Type: multipart/form-data
Body:
- voice: Audio file

Response:

Success: 200 OK with processed text
Error: 400 Bad Request or 500 Internal Server Error
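A client call to this endpoint could look roughly like the sketch below. It assumes Node.js 18 or later on the client side (for the global fetch, FormData, and Blob) and reads the response as plain text, since the exact response format is not specified here. The function names and the default filename are hypothetical.

```javascript
// Hypothetical client helpers; requires Node.js 18+ for the global
// fetch, FormData, and Blob.
function buildUploadForm(audioBytes, filename = "recording.wav") {
  const form = new FormData();
  // "voice" is the multipart field name the endpoint expects.
  form.append("voice", new Blob([audioBytes]), filename);
  return form;
}

async function uploadAudio(baseUrl, audioBytes, filename) {
  const response = await fetch(`${baseUrl}/upload-audio`, {
    method: "POST",
    // fetch sets the multipart Content-Type and boundary for FormData bodies.
    body: buildUploadForm(audioBytes, filename),
  });
  // Assumption: the body is read as text; the service may return JSON instead.
  const body = await response.text();
  if (!response.ok) {
    throw new Error(`Upload failed with ${response.status}: ${body}`);
  }
  return body;
}
```

With the server running locally on the default port, a call might be `await uploadAudio("http://localhost:8080", fs.readFileSync("note.wav"), "note.wav")`, where note.wav is a placeholder file.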

Usage

Start the development server:

npm run dev

Start the production server:

npm start

The server will start on port 8080 (or the port specified in your .env file).

Dependencies

@google-cloud/speech: ^6.5.0
dotenv: ^16.4.5
express: ^4.19.2
multer: ^1.4.5-lts.1
openai: ^4.40.0

Security Notes

Keep your .env file secure and never commit it to version control
Ensure your Google Cloud credentials are properly secured
The service includes CORS configuration for secure cross-origin requests
File validation middleware is implemented to ensure proper file uploads
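To make the file validation note concrete, here is a minimal sketch of what an Express-style validation middleware might check. It is not the contents of middleware/isFile.js: it assumes multer has parsed the upload into req.file, and the audio MIME type check and error messages are illustrative assumptions.

```javascript
// Hypothetical stand-in for a file validation middleware; the checks in the
// real middleware/isFile.js may differ.
function isFile(req, res, next) {
  // Assumption: multer (e.g. upload.single("voice")) has populated req.file.
  const file = req.file;
  if (!file) {
    return res.status(400).json({ error: "No audio file uploaded in field 'voice'" });
  }
  // Reject anything the client did not label as audio.
  if (typeof file.mimetype !== "string" || !file.mimetype.startsWith("audio/")) {
    return res.status(400).json({ error: `Unsupported file type: ${file.mimetype}` });
  }
  return next();
}
```

The MIME type reported by the client can be spoofed, so a check like this is a first filter rather than a guarantee about the file's contents.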

License

ISC

Author

Bohdan Kukhar
