Mirror of https://github.com/shirayu/whispering.git (synced 2024-11-22 00:41:02 +00:00)
Whispering
Streaming transcriber with whisper. Transcribing in real time requires sufficient machine power.
Setup
pip install -U git+https://github.com/shirayu/whispering.git@v0.5.0
# If you use a GPU, install the torch and torchaudio builds that match your CUDA version
# Check https://pytorch.org/get-started/locally/
# Example: torch for CUDA 11.6
pip install -U torch torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
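After installing, it can be useful to confirm that torch actually sees the GPU. The following is a minimal sketch, not part of whispering; the helper name cuda_status is made up for illustration, and it only relies on torch.cuda.is_available() and torch.cuda.get_device_name().

```python
def cuda_status() -> str:
    """Return a short description of what the installed torch can see."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if torch.cuda.is_available():
        # Index 0 is the first CUDA device visible to this process
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return "CUDA not available; torch will run on CPU"


print(cuda_status())
```

If this reports that CUDA is not available on a machine with a GPU, the installed torch wheel probably does not match the local CUDA version.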
If you get "OSError: PortAudio library not found" on Linux, install PortAudio.
sudo apt -y install portaudio19-dev
Example of microphone
# Run in English
whispering --language en --model tiny
- --help shows all options
- --model sets the name of the model to use. Larger models are more accurate but may not be able to transcribe in real time
- --language sets the language to transcribe. The list of languages is shown with whispering -h
- --no-progress disables the progress message
- -t sets the decoding temperatures. You can pass several, like -t 0.0 -t 0.1 -t 0.5, but too many temperatures slow down decoding
- --debug outputs debug logs
- --no-vad disables VAD (Voice Activity Detection). This forces whisper to also analyze periods with no voice activity
- --output sets the output file (default: standard output)
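If you launch whispering from a script, the options above can be assembled programmatically. This is only a sketch: build_whispering_command is a hypothetical helper, not something whispering provides, and transcript.txt is a placeholder file name. It uses only flags listed above.

```python
import shlex
from typing import List, Optional, Sequence


def build_whispering_command(
    language: str,
    model: str,
    temperatures: Sequence[float] = (),
    no_progress: bool = False,
    debug: bool = False,
    no_vad: bool = False,
    output: Optional[str] = None,
) -> List[str]:
    """Build an argument list for the whispering CLI (hypothetical helper)."""
    cmd = ["whispering", "--language", language, "--model", model]
    for temperature in temperatures:
        # -t may be repeated, once per temperature
        cmd += ["-t", str(temperature)]
    if no_progress:
        cmd.append("--no-progress")
    if debug:
        cmd.append("--debug")
    if no_vad:
        cmd.append("--no-vad")
    if output is not None:
        cmd += ["--output", output]
    return cmd


cmd = build_whispering_command(
    "en", "tiny", temperatures=[0.0, 0.1, 0.5], no_progress=True, output="transcript.txt"
)
# Print the command as a shell-quoted string; pass the list to subprocess.run to execute it
print(shlex.join(cmd))
```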
Parse interval
Without --allow-padding, whispering only performs VAD on each period, and a period predicted to be "silence" is not passed to whisper.
To change the VAD interval, change -n.
For a quicker response, set a small -n and add --allow-padding.
However, this may sacrifice accuracy.
whispering --language en --model tiny -n 20 --allow-padding
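The idea of dropping "silent" periods before transcription can be illustrated with a toy example. This is not whispering's implementation: the energy threshold below is a simple stand-in for its real VAD, and the names rms, energy_vad and gate_chunks are invented for this sketch.

```python
import math
from typing import Callable, Iterable, Iterator, List

Chunk = List[float]  # one period of audio samples in the range [-1.0, 1.0]


def rms(chunk: Chunk) -> float:
    """Root-mean-square level of a chunk (0.0 for an empty chunk)."""
    if not chunk:
        return 0.0
    return math.sqrt(sum(x * x for x in chunk) / len(chunk))


def energy_vad(chunk: Chunk, threshold: float = 0.01) -> bool:
    """Stand-in VAD: treat a chunk as speech if it is loud enough."""
    return rms(chunk) >= threshold


def gate_chunks(
    chunks: Iterable[Chunk],
    is_speech: Callable[[Chunk], bool] = energy_vad,
) -> Iterator[Chunk]:
    """Yield only the chunks predicted as speech; silent ones are skipped."""
    for chunk in chunks:
        if is_speech(chunk):
            yield chunk


silence = [0.0] * 160
speech = [0.5, -0.5] * 80
kept = list(gate_chunks([silence, speech, silence]))
print(len(kept))  # 1: only the speech chunk would reach the transcriber
```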
Example of web socket
⚠ There is no security mechanism. Securing the connection is your responsibility.
Run with --host and --port.
Host
whispering --language en --model tiny --host 0.0.0.0 --port 8000
Client
whispering --host ADDRESS_OF_HOST --port 8000 --mode client
You can set -n, --allow-padding and other options.
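Before starting the client, you may want to check that the host's port is reachable at all. The sketch below is an assumption-laden convenience, not part of whispering: port_is_open is a hypothetical helper that only opens and closes a plain TCP connection, and it does not speak the web socket protocol or verify that whispering is the process listening.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False


# 127.0.0.1 and 8000 are placeholders; use the address and port of your host
print(port_is_open("127.0.0.1", 8000))
```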
License
- MIT License
- Some code is ported from the original whisper, which is also under the MIT License