Add a `tuning` feature which adds counters that help with performance
evaluation. The only counter added so far accumulates the duration a
Scheduler has been parked, which is pretty accurate an indication of
CPU usage of the Scheduler.
- Reworked buffer push.
- Reworked stats.
- Make first elements logs stand out. This make it possible to
follow what's going on with pipelines containing 1000s of
elements.
- Actually handle EOS.
- Use more significant defaults.
- Allow building without `clap` feature.
Implement a test that initializes pipelines with minimalistic
theadshare src and sink. This can help with the evaluation of
changes to the threadshare runtime or with element
implementation details. It makes it easy to run flamegraph or
callgrind and to focus on the threadshare runtime overhead.