Sync with whisper.cpp v1.4.3 🚀

ErcinDedeoglu · Nov 9, 2023 · 74dc0df
1 parent 9dd19cb
Showing 4 changed files with 18 additions and 21 deletions.
diff --git a/.github/workflows/publish-docker.yml b/.github/workflows/publish-docker.yml
@@ -47,12 +47,12 @@ jobs:
         run: |
           source $GITHUB_ENV
           # Always set the default tags
-          DEFAULT_TAGS="dublok/stt:${GITHUB_REF_NAME},dublok/stt:${SHA_SHORT}"
+          DEFAULT_TAGS="dublok/whisperdock:${GITHUB_REF_NAME},dublok/whisperdock:${SHA_SHORT}"
           echo "TAGS=${DEFAULT_TAGS}" >> $GITHUB_ENV
           
           # Check if GITHUB_REF_NAME starts with 'v' and if so, append 'latest'
           if [[ $GITHUB_REF_NAME == v* ]]; then
-            LATEST_TAGS="${DEFAULT_TAGS},dublok/stt:latest"
+            LATEST_TAGS="${DEFAULT_TAGS},dublok/whisperdock:latest"
             echo "TAGS=${LATEST_TAGS}" >> $GITHUB_ENV
           fi
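
Reviewer note: the tag logic in this step can be exercised locally by pulling it into a function. The sketch below is a rough equivalent, not the workflow's actual code: `compute_tags` is a hypothetical helper, its two arguments stand in for `GITHUB_REF_NAME` and `SHA_SHORT`, and it uses a portable `case` pattern in place of the workflow's `[[ ... == v* ]]` test.

```bash
# compute_tags is a hypothetical helper for illustration only.
# $1 stands in for GITHUB_REF_NAME, $2 for SHA_SHORT.
compute_tags() {
  local ref_name="$1"
  local sha_short="$2"
  local image="dublok/whisperdock"

  # Always include the ref-name tag and the short-SHA tag
  local tags="${image}:${ref_name},${image}:${sha_short}"

  # Any ref name starting with 'v' also gets the 'latest' tag
  case "$ref_name" in
    v*) tags="${tags},${image}:latest" ;;
  esac

  printf '%s\n' "$tags"
}

# Example calls: a branch build and a version-tag build
compute_tags main 74dc0df
compute_tags v1.4.3 74dc0df
```

Note that the `v*` pattern matches any ref beginning with `v`, so a branch named, say, `vnext` would also be tagged `latest` under this logic.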
 

diff --git a/.github/workflows/sync-whisper.yml b/.github/workflows/sync-whisper.yml
@@ -111,7 +111,7 @@ jobs:
     steps:
       - name: 🛒 Checkout repository for sync
         run: |
-          git clone https://x-access-token:${{ secrets.PAT }}@github.com/ErcinDedeoglu/stt.git .
+          git clone https://x-access-token:${{ secrets.PAT }}@github.com/ErcinDedeoglu/whisperdock.git .
 
       - name: 🔄 Sync whisper.cpp repository to src/whisper
         run: |

diff --git a/README.md b/README.md
@@ -1,14 +1,14 @@
-# [Speech-to-Text (STT) Transcription Service 🎤](https://github.com/ErcinDedeoglu/stt)
+# [WhisperDock - Speech-to-Text Service 🎤](https://github.com/ErcinDedeoglu/WhisperDock)
 
 This repository hosts the Dockerized speech-to-text transcription service, which utilizes Whisper C++ alongside Python to provide an API for audio file transcription.
 
-<div align="center"><img src="https://github.com/ErcinDedeoglu/stt/assets/6512072/0f63bc19-7662-4a03-b7bb-dea8bfbb44d7" width="400"></div>
+<div align="center"><img src="https://github.com/ErcinDedeoglu/WhisperDock/raw/main/assets/logo.png" width="400"></div>
 
 ## Background and Motivation
 
-In the rapidly advancing field of machine learning, access to efficient and robust tools for everyday applications is essential. Speech-to-text transcription is one of the areas that has seen significant improvements, but deploying these models quickly and efficiently remains a challenge. Whisper C++, a high-performance transcription tool, has emerged as a powerful option, yet it still requires a streamlined pathway to deployment.
+Access to efficient and robust tools for everyday applications is essential in the rapidly advancing field of machine learning. Speech-to-text transcription is one of the areas that has seen significant improvements, but deploying these models quickly and efficiently remains a challenge. Whisper C++, a high-performance transcription tool, has emerged as a powerful option, yet it still requires a streamlined pathway to deployment.
 
-This repository was created out of a necessity to bridge the gap between the development of speech-to-text models and their deployment in real-world applications. Many existing solutions require extensive setup, intricate knowledge of systems, and can be time-consuming to deploy, creating a barrier for developers, researchers, and businesses who want to integrate transcription capabilities into their services.
+This repository was created out of a necessity to bridge the gap between the development of speech-to-text models and their deployment in real-world applications. Many existing solutions require extensive setup and intricate knowledge of systems and can be time-consuming to deploy, creating a barrier for developers, researchers, and businesses who want to integrate transcription capabilities into their services.
 
 The Speech-to-Text Transcription Service aims to provide a fast, reliable, and easy-to-use solution for deploying Whisper C++ models. By containerizing the service with Docker, we significantly reduce the complexity of deployment and make it possible to launch a transcription service that is both scalable and accessible.
 
@@ -17,7 +17,7 @@ Here are some of the key motivations behind this project:
 - **Speed of Deployment**: By providing a Dockerized solution, we enable rapid deployment of the transcription service, allowing users to go from zero to a fully functioning service in minutes.
 - **Ease of Use**: The provided APIs and Docker setup are designed to be as simple as possible, requiring minimal configuration and allowing for easy integration into existing workflows.
 - **Accessibility**: Making Whisper C++ easily deployable opens up more opportunities for developers and organizations of all sizes to utilize state-of-the-art transcription technology.
-- **Continuous Integration and Delivery**: With GitHub Actions, updates and improvements are integrated seamlessly, ensuring that the service remains up-to-date with the latest advancements from the Whisper C++ repository.
+- **Continuous Integration and Delivery**: With GitHub Actions, updates, and improvements are integrated seamlessly, ensuring the service remains up-to-date with the latest advancements from the Whisper C++ repository.
 
 In contributing to this repository, I hope to empower individuals and organizations to harness the capabilities of Whisper C++ without the overhead of complex deployment processes, thus fostering innovation and development in the field of speech recognition.
 
@@ -31,14 +31,14 @@ For quick deployment, use the Docker images provided in the Docker registry.
 
 For the latest stable version:
 ```bash
-docker pull dublok/stt:latest
-docker run -p 5000:5000 dublok/stt:latest
+docker pull dublok/whisperdock:latest
+docker run -p 5000:5000 dublok/whisperdock:latest
 ```
 
 For the nightly build (unstable but with early access to new features):
 ```bash
-docker pull dublok/stt:main
-docker run -p 5000:5000 dublok/stt:main
+docker pull dublok/whisperdock:main
+docker run -p 5000:5000 dublok/whisperdock:main
 ```
 
 The service should now be accessible at `http://localhost:5000`.
@@ -47,21 +47,19 @@ The service should now be accessible at `http://localhost:5000`.
 
 1. Clone the repository:
 ```bash
-git clone https://github.com/ErcinDedeoglu/stt
+git clone https://github.com/ErcinDedeoglu/WhisperDock
 ```
 
 2. Build the Docker image:
 ```bash
-docker build -t stt-service .
+docker build -t whisperdock .
 ```
 
 3. Run the container:
 ```bash
-docker run -p 5000:5000 stt-service
+docker run -p 5000:5000 whisperdock
 ```
 
-Certainly! Below is an updated README section that includes an example response from the API after submitting an audio file for transcription.
-
 ---
 
 ## API Usage
@@ -72,7 +70,7 @@ To transcribe audio, make a POST request to the `/transcribe` endpoint with the
 curl -X POST -F 'file=@/path/to/your/audio.wav' http://localhost:5000/transcribe
 ```
 
-Make sure your audio file is in WAV format with a sample rate of 16kHz.
+Ensure your audio file is in WAV format with a sample rate of 16kHz.
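
Reviewer note: the 16kHz requirement could be checked on the client side before uploading. The following is a minimal sketch under stated assumptions: `wav_sample_rate` and `is_16khz_wav` are hypothetical helpers, not part of WhisperDock, and they assume a canonical RIFF/WAVE file whose `fmt ` chunk comes first, so that the sample rate is a 32-bit little-endian integer at byte offset 24.

```bash
# Hypothetical helpers for illustration; not part of the repository.
# Assumes a canonical WAV header with the sample rate stored as a
# 32-bit little-endian integer at byte offset 24.
wav_sample_rate() {
  # Read the four bytes at offset 24 as unsigned decimal values
  set -- $(od -An -t u1 -j 24 -N 4 "$1")
  # Fail if the file is too short to contain the field
  [ $# -eq 4 ] || return 1
  # Assemble the little-endian bytes into one integer
  echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

is_16khz_wav() {
  [ "$(wav_sample_rate "$1")" = "16000" ]
}

# Usage (path is a placeholder):
#   is_16khz_wav /path/to/your/audio.wav && \
#     curl -X POST -F 'file=@/path/to/your/audio.wav' http://localhost:5000/transcribe
```

Files with extra chunks before `fmt ` would need a proper parser; this check only covers the common layout.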
 
 ### Example Response
 
@@ -84,7 +82,7 @@ Upon successful transcription, the service will return a JSON response containin
     {
       "start_time": "00:00:00.000",
       "end_time": "00:00:03.000",
-      "text": "Welcome to our speech to text service."
+      "text": "Welcome to our speech-to-text service."
     },
     {
       "start_time": "00:00:03.500",
@@ -135,5 +133,4 @@ This project uses GitHub Actions for continuous integration, which automates the
 
 ## License
 
-This Speech-to-Text Transcription Service is made available under the [CC0 1.0 Universal](LICENSE) public domain dedication.
-
+This Speech-to-Text Transcription Service is available under the [CC0 1.0 Universal](https://github.com/ErcinDedeoglu/WhisperDock/blob/main/LICENSE) public domain dedication.
diff --git a/assets/logo.png b/assets/logo.png