mirror of
https://github.com/shirayu/whispering.git
synced 2025-01-22 14:48:09 +00:00
Streaming transcriber with whisper
whisper_streaming (beta version)
Streaming transcriber with whisper. Real-time transcription requires sufficient machine power.
Setup
git clone https://github.com/shirayu/whisper_streaming.git
cd whisper_streaming
poetry install --only main
# If you use a GPU, install the appropriate torch and torchaudio builds with "poetry run pip install -U"
# Example: torch for CUDA 11.6
poetry run pip install -U torch torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
Microphone example
# Run in English
poetry run whisper_streaming --language en --model tiny
- --help shows all options
- --language sets the language to transcribe. The list of languages is shown with poetry run whisper_streaming -h
- -t sets the decoding temperatures. You can set several (for example, -t 0.0 -t 0.1 -t 0.5), but too many temperatures increase decoding time
- --debug outputs debug logs
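To illustrate why many temperatures can lengthen decoding, here is a minimal sketch of temperature fallback in the style of the original whisper: temperatures are tried in order and decoding stops at the first result that passes a quality check. It assumes whisper_streaming follows the same fallback scheme. The `decode` callable, `fake_decode`, and the single log-probability check are hypothetical simplifications, not the project's actual code.

```python
from typing import Callable, Sequence, Tuple


def decode_with_fallback(
    decode: Callable[[float], Tuple[str, float]],
    temperatures: Sequence[float],
    logprob_threshold: float = -1.0,
) -> Tuple[str, int]:
    """Try each temperature in order.

    `decode` is a hypothetical stand-in that returns (text, avg_logprob).
    Returns the accepted text and the number of decode calls that were made.
    """
    if not temperatures:
        raise ValueError("at least one temperature is required")
    text = ""
    calls = 0
    for temperature in temperatures:
        text, avg_logprob = decode(temperature)
        calls += 1
        # Accept the first result that looks confident enough.
        if avg_logprob >= logprob_threshold:
            break
    return text, calls


def fake_decode(temperature: float) -> Tuple[str, float]:
    """Hypothetical decoder: only confident at temperature >= 0.5."""
    if temperature >= 0.5:
        return "hello world", -0.3
    return "hello hello hello", -1.8


text, calls = decode_with_fallback(fake_decode, [0.0, 0.1, 0.5])
print(text, calls)
```

In the worst case every listed temperature triggers a full decode of the same audio, so the cost grows with the number of -t values.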
Parse interval
For a quicker response, set a small -n and add --allow-padding.
However, this may reduce accuracy.
poetry run whisper_streaming --language en --model tiny -n 20 --allow-padding
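The trade-off can be sketched as follows, assuming that -n is the number of fixed-size audio blocks buffered before each decode and that --allow-padding zero-pads a short buffer up to whisper's 30-second input window (480,000 samples at 16 kHz). `BLOCK_SAMPLES` and `next_segment` are hypothetical names chosen for illustration; this is not taken from whisper_streaming's source.

```python
from typing import List, Optional

SAMPLE_RATE = 16_000                # whisper's input sample rate
WINDOW_SAMPLES = 30 * SAMPLE_RATE   # whisper's 30-second input window
BLOCK_SAMPLES = 160                 # hypothetical size of one audio block


def next_segment(
    buffer: List[float], num_block: int, allow_padding: bool
) -> Optional[List[float]]:
    """Return a window-sized segment ready for decoding, or None to keep waiting."""
    if len(buffer) < num_block * BLOCK_SAMPLES:
        return None  # fewer than n blocks have arrived
    if len(buffer) >= WINDOW_SAMPLES:
        return buffer[:WINDOW_SAMPLES]  # a full window is available
    if allow_padding:
        # Fill the rest of the window with silence so decoding can start early.
        return buffer + [0.0] * (WINDOW_SAMPLES - len(buffer))
    return None  # without padding, wait for a full window


one_second = [0.1] * SAMPLE_RATE
print(next_segment(one_second, 20, allow_padding=False) is None)
print(len(next_segment(one_second, 20, allow_padding=True)))
```

With padding, the model sees mostly silence and little context, which is one plausible reason accuracy can drop.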
WebSocket example
⚠ No security mechanism is provided. Securing the connection is your responsibility.
Run with --host and --port.
Host
poetry run whisper_streaming --language en --model tiny --host 0.0.0.0 --port 8000
You can also set --allow-padding and other options.
(-n is not meaningful for hosts.)
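Because there is no security mechanism, the value passed to --host determines who can reach the server: 0.0.0.0 listens on all network interfaces, while 127.0.0.1 accepts only local connections. The helper below is a hypothetical illustration built on Python's standard `ipaddress` module; it is not part of whisper_streaming.

```python
import ipaddress


def describe_bind_address(host: str) -> str:
    """Classify an IP literal by how widely a server bound to it is reachable."""
    addr = ipaddress.ip_address(host)  # raises ValueError for names like "localhost"
    if addr.is_loopback:
        return "local only"
    if addr.is_unspecified:
        return "all interfaces"
    return "single interface"


for host in ("127.0.0.1", "0.0.0.0", "192.168.1.10"):
    print(host, "->", describe_bind_address(host))
```

If the host must be reachable from other machines, consider restricting access with a firewall or a tunnel, since the server itself does not authenticate clients.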
Client
poetry run python -m whisper_streaming.websocket_client --host ADDRESS_OF_HOST --port 8000 -n 20
You can also set -n and other options.
(--allow-padding is not meaningful for clients.)
Tips
PortAudio Error
If you get OSError: PortAudio library not found, install PortAudio:
# Ubuntu
sudo apt-get install portaudio19-dev
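To check whether a PortAudio shared library is discoverable before starting, a quick sketch using Python's standard `ctypes.util.find_library` is shown below. It assumes the audio backend looks the library up under the name portaudio, which is an assumption about the dependency and not something this README states.

```python
from ctypes.util import find_library
from typing import Optional


def find_portaudio() -> Optional[str]:
    """Return the name or path of the PortAudio library, or None if it is not found."""
    return find_library("portaudio")


path = find_portaudio()
if path is None:
    print("PortAudio not found; install it with your system package manager")
else:
    print(f"PortAudio found: {path}")
```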
License
- MIT License
- Some code is ported from the original whisper, which is also released under the MIT License