mirror of https://github.com/shirayu/whispering.git synced 2024-11-22 08:51:01 +00:00

Streaming transcriber with whisper

Yuta Hayashibe 6ac27504df Updated README		2022-09-25 00:46:37 +09:00

whisper_streaming (beta version)

Streaming transcriber with whisper. Real-time transcription requires sufficient machine power.

Setup

git clone https://github.com/shirayu/whisper_streaming.git
cd whisper_streaming
poetry install --only main

# If you use a GPU, install the matching torch and torchaudio builds with "poetry run pip install -U"
# Example: torch for CUDA 11.6
poetry run pip install -U torch torchaudio --extra-index-url https://download.pytorch.org/whl/cu116

Microphone example

# Run in English
poetry run whisper_streaming --language en --model base

--help shows all options
--language sets the language to transcribe. The list of languages is shown with poetry run whisper_streaming -h
-t sets the decoding temperatures. You can pass several (e.g. -t 0.0 -t 0.1 -t 0.5), but too many temperatures increase decoding time
--debug outputs debug logs
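
To illustrate why extra temperatures cost time, here is a minimal sketch of temperature fallback in the style of the original whisper: each temperature is tried in order until the result looks acceptable, so every additional value can add one more decoding pass. It is an assumption that whisper_streaming applies the same logic. `decode_once`, `toy_decoder`, and the threshold handling are hypothetical stand-ins, not this project's API.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical decoder signature: takes a temperature, returns (text, avg_logprob).
Decoder = Callable[[float], Tuple[str, float]]


def decode_with_fallback(
    decode_once: Decoder,
    temperatures: Sequence[float],
    logprob_threshold: float = -1.0,
) -> Tuple[str, float, int]:
    """Try each temperature in order until a result looks acceptable.

    Returns (text, temperature_used, number_of_decoding_passes).
    """
    if not temperatures:
        raise ValueError("at least one temperature is required")
    text, used, passes = "", temperatures[0], 0
    for t in temperatures:
        passes += 1
        text, avg_logprob = decode_once(t)
        used = t
        if avg_logprob >= logprob_threshold:
            break  # good enough, so the remaining temperatures are skipped
    return text, used, passes


# Toy decoder for illustration only: it pretends low temperatures fail.
def toy_decoder(temperature: float) -> Tuple[str, float]:
    return ("hello", -0.3) if temperature >= 0.5 else ("???", -2.0)


result = decode_with_fallback(toy_decoder, [0.0, 0.1, 0.5])
print(result)  # the third element is the number of decoding passes spent
```

In this toy run all three temperatures are needed, so the audio is decoded three times. A shorter list bounds the worst case.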

Parse interval

For a quicker response, set a small -n and add --allow-padding. However, this may sacrifice accuracy.

poetry run whisper_streaming --language en --model base -n 20 --allow-padding
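
The accuracy trade-off can be pictured with a small sketch. Whisper models take a fixed 30-second window of 16 kHz audio, so a short buffer has to be filled out before decoding. The code below assumes that --allow-padding does this with silence. That is a guess at the behaviour, not a description of this project's implementation, and `pad_to_window` is a hypothetical helper.

```python
from typing import List

SAMPLE_RATE = 16_000   # sample rate whisper expects, in Hz
WINDOW_SECONDS = 30    # whisper's fixed input window
WINDOW_SAMPLES = SAMPLE_RATE * WINDOW_SECONDS


def pad_to_window(samples: List[float]) -> List[float]:
    """Zero-pad (or trim) a buffer so it is exactly one window long."""
    if len(samples) >= WINDOW_SAMPLES:
        return samples[:WINDOW_SAMPLES]
    return samples + [0.0] * (WINDOW_SAMPLES - len(samples))


short = [0.1] * (SAMPLE_RATE * 2)   # two seconds of audio
padded = pad_to_window(short)
silence_ratio = 1 - len(short) / len(padded)
print(len(padded), round(silence_ratio, 3))
```

With only two seconds of speech, most of the window is silence. Under this assumption, the model sees less context per decode, which is one plausible reason shorter intervals reduce accuracy.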

WebSocket example

⚠ There is no security mechanism. Securing the connection is your responsibility.

# Host
poetry run whisper_streaming --language en --model base --host 0.0.0.0 --port 8000

# Client
poetry run python -m whisper_streaming.websocket_client --host ADDRESS_OF_HOST --port 8000

Tips

PortAudio Error

If you get "OSError: PortAudio library not found", install PortAudio.

# Ubuntu
sudo apt-get install portaudio19-dev
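
To check whether the library is visible before starting the transcriber, a minimal sketch using only the Python standard library is shown here. It assumes the audio backend locates PortAudio through the system's shared-library search, which may not match every setup. `find_portaudio` is a hypothetical helper name.

```python
import ctypes.util
from typing import Optional


def find_portaudio() -> Optional[str]:
    """Return the name or path of the PortAudio shared library, or None."""
    return ctypes.util.find_library("portaudio")


path = find_portaudio()
if path is None:
    print("PortAudio not found; install it (e.g. portaudio19-dev on Ubuntu)")
else:
    print(f"PortAudio found: {path}")
```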

License

MIT License
Some code is ported from the original whisper, which is also licensed under the MIT License.