whispering/README.md
2022-09-25 03:59:09 +09:00

whisper_streaming (beta version)

A streaming transcriber built on Whisper. Transcribing in real time requires sufficient machine power.

Setup

git clone https://github.com/shirayu/whisper_streaming.git
cd whisper_streaming
poetry install --only main

# If you use a GPU, install the matching torch and torchaudio builds with "poetry run pip install -U"
# Example: torch for CUDA 11.6
poetry run pip install -U torch torchaudio --extra-index-url https://download.pytorch.org/whl/cu116

Microphone example

# Run in English
poetry run whisper_streaming --language en --model tiny
  • --help shows full options
  • --language sets the language to transcribe. The list of languages is shown with poetry run whisper_streaming -h
  • -t sets the decoding temperatures. You can pass several (for example -t 0.0 -t 0.1 -t 0.5), but too many temperatures increase decoding time
  • --debug outputs debug logs
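
The cost of passing several -t values can be illustrated with a minimal sketch. It assumes whisper_streaming follows the temperature fallback used by Whisper's reference transcription code, where a segment is decoded again at the next temperature whenever the result looks unreliable. The `decode` callable, the function name `decode_with_fallback`, and the threshold default are hypothetical and only serve the illustration.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical decoder: takes a temperature, returns (text, average log probability).
Decoder = Callable[[float], Tuple[str, float]]


def decode_with_fallback(
    decode: Decoder,
    temperatures: Sequence[float],
    logprob_threshold: float = -1.0,  # assumed acceptance threshold
) -> Tuple[str, int]:
    """Try each temperature in order and stop at the first acceptable result.

    Returns the chosen text and the number of decoding passes performed.
    """
    if not temperatures:
        raise ValueError("at least one temperature is required")
    text = ""
    passes = 0
    for temperature in temperatures:
        text, avg_logprob = decode(temperature)
        passes += 1
        if avg_logprob >= logprob_threshold:
            break  # result is good enough, no further passes needed
    return text, passes


def fake_decode(temperature: float) -> Tuple[str, float]:
    # Stand-in for a real model: only succeeds at temperature 0.5 or higher.
    if temperature >= 0.5:
        return "hello", -0.5
    return "???", -2.0


text, passes = decode_with_fallback(fake_decode, [0.0, 0.1, 0.5])
print(text, passes)
```

In the worst case every listed temperature costs one full decoding pass per audio segment, which is why a long list slows down transcription.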

Parse interval

For a quicker response, set a small -n and add --allow-padding. However, this may reduce accuracy.

poetry run whisper_streaming --language en --model tiny -n 20 --allow-padding
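
As background for the accuracy trade-off: Whisper models work on fixed 30-second windows of 16 kHz audio, so a shorter buffer has to be filled out before decoding. The sketch below shows zero-padding as one plausible reading of --allow-padding. This README does not confirm that the option works this way, and `pad_to_window` is a hypothetical helper, not part of whisper_streaming.

```python
from typing import List

SAMPLE_RATE = 16_000   # Whisper models expect 16 kHz mono audio
WINDOW_SECONDS = 30    # Whisper's fixed input window length
WINDOW_SAMPLES = SAMPLE_RATE * WINDOW_SECONDS


def pad_to_window(samples: List[float]) -> List[float]:
    """Zero-pad (or trim) audio so it is exactly one 30-second window long."""
    if len(samples) >= WINDOW_SAMPLES:
        return samples[:WINDOW_SAMPLES]
    return samples + [0.0] * (WINDOW_SAMPLES - len(samples))


one_second = [0.1] * SAMPLE_RATE
window = pad_to_window(one_second)
print(len(one_second), len(window))
```

With a short buffer most of the window is silence, so the model has little context to work with. This is consistent with the accuracy loss mentioned above.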

WebSocket example

There is no security mechanism. Securing the connection is your own responsibility.

Run with --host and --port.

Host

poetry run whisper_streaming --language en --model tiny --host 0.0.0.0 --port 8000

You can also set --allow-padding and other options. (-n is not meaningful on the host side.)

Client

poetry run python -m whisper_streaming.websocket_client --host ADDRESS_OF_HOST --port 8000 -n 20

You can also set -n and other options. (--allow-padding is not meaningful on the client side.)

Tips

PortAudio Error

If you get OSError: PortAudio library not found, install PortAudio.

# Ubuntu
sudo apt-get install portaudio19-dev
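
To check whether the shared library is visible to Python before starting the transcriber, a quick sketch like the following may help. It uses the standard library's `ctypes.util.find_library`. The name `portaudio_available` is only illustrative, and it is an assumption that the audio backend looks the library up in a comparable way.

```python
import ctypes.util


def portaudio_available() -> bool:
    """Return True if a PortAudio shared library can be located on this system."""
    return ctypes.util.find_library("portaudio") is not None


if portaudio_available():
    print("PortAudio found")
else:
    print("PortAudio not found; install it (for example portaudio19-dev on Ubuntu)")
```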

License