# Whispering

[![MIT License](https://img.shields.io/apm/l/atomic-design-ui.svg?)](LICENSE)
[![Python Versions](https://img.shields.io/badge/Python-3.8%20--%203.10-blue)](https://pypi.org/project/bunkai/)

[![CI](https://github.com/shirayu/whispering/actions/workflows/ci.yml/badge.svg)](https://github.com/shirayu/whispering/actions/workflows/ci.yml)
[![CodeQL](https://github.com/shirayu/whispering/actions/workflows/codeql-analysis.yml/badge.svg)](https://github.com/shirayu/whispering/actions/workflows/codeql-analysis.yml)
[![Typos](https://github.com/shirayu/whispering/actions/workflows/typos.yml/badge.svg)](https://github.com/shirayu/whispering/actions/workflows/typos.yml)

Streaming transcriber built on [whisper](https://github.com/openai/whisper).
Real-time transcription requires a sufficiently powerful machine.

## Setup

```bash
pip install -U git+https://github.com/shirayu/whispering.git

# If you use a GPU, install the matching builds of torch and torchaudio
# Example: torch for CUDA 11.6
pip install -U torch torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
```
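
After installing a CUDA build, it can help to confirm that PyTorch actually sees the GPU. The following is a minimal sketch and not part of whispering itself: the `check_cuda` helper name is hypothetical, and it assumes `python3` is on your `PATH`.

```bash
# Hypothetical helper: report whether the installed torch can use CUDA
check_cuda() {
  python3 - <<'EOF'
try:
    import torch
except ImportError:
    print("torch is not installed")
else:
    # torch.cuda.is_available() returns True when a usable CUDA device is found
    print("CUDA available:", torch.cuda.is_available())
EOF
}

check_cuda
```

If this reports `False`, the installed wheel may be a CPU-only build or may not match your CUDA driver.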

## Example of microphone

```bash
# Run in English
whispering --language en --model tiny
```

- ``--help`` shows all options
- ``--language`` sets the language to transcribe. The list of supported languages is shown by ``whispering -h``
- ``-t`` sets the decoding temperatures. You can pass several (for example ``-t 0.0 -t 0.1 -t 0.5``), but too many temperatures increase decoding time
- ``--debug`` outputs debug logs

### Parse interval

For a quicker response, set a small ``-n`` and add ``--allow-padding``.
However, this may reduce accuracy.

```bash
whispering --language en --model tiny -n 20 --allow-padding
```

## Example of WebSocket

⚠ **There is no security mechanism. Securing the connection is your own responsibility.**

Run with ``--host`` and ``--port``.

### Host

```bash
whispering --language en --model tiny --host 0.0.0.0 --port 8000
```

You can also set ``--allow-padding`` and other options.

### Client

```bash
whispering --model tiny --host ADDRESS_OF_HOST --port 8000 --mode client
```

You can set ``-n`` and other options.
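
Because the server has no security mechanism, one possible way to avoid exposing the port is to listen on the loopback interface only and reach it through an SSH tunnel. This is only a sketch under assumptions: the account `user@server.example.com` and the helper names are hypothetical, and it assumes that ``--host 127.0.0.1`` makes the server listen on loopback only.

```bash
# Hypothetical values: replace with your own server account and port
REMOTE="user@server.example.com"
PORT=8000

# On the server: listen on the loopback interface only
run_host() {
  whispering --language en --model tiny --host 127.0.0.1 --port "$PORT"
}

# On the client machine: forward local $PORT to the server's loopback $PORT
open_tunnel() {
  ssh -N -L "${PORT}:127.0.0.1:${PORT}" "$REMOTE"
}

# On the client machine, in another terminal: connect through the tunnel
run_client() {
  whispering --model tiny --host 127.0.0.1 --port "$PORT" --mode client
}
```

Run `run_host` on the server, then `open_tunnel` and `run_client` on the client machine. The functions are only defined here, so nothing starts until you call them.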

## Tips

### PortAudio Error

If you get ``OSError: PortAudio library not found``, install ``portaudio``.

```bash
# Ubuntu
sudo apt-get install portaudio19-dev
```
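
To check whether the library is visible to Python after installing it, a quick sketch using the standard `ctypes.util.find_library` lookup may help. The `check_portaudio` helper name is hypothetical, and this assumes the audio backend locates PortAudio in a similar way, which may not hold on every platform.

```bash
# Hypothetical helper: ask Python's ctypes whether it can locate PortAudio
check_portaudio() {
  python3 -c "import ctypes.util; print(ctypes.util.find_library('portaudio') or 'portaudio not found')"
}

check_portaudio
```

If it prints `portaudio not found`, the package is probably missing or installed somewhere the dynamic loader does not search.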

## License

- [MIT License](LICENSE)
- Some code is ported from the original whisper, which is also released under the [MIT License](LICENSE.whisper)