.. _bibliography:

##############
Bibliography
##############

.. [1] Radford, A., Kim, J. W., Xu, T., Brockman, G., McLeavey, C., & Sutskever, I. (2022). *Robust Speech Recognition via Large-Scale Weak Supervision*. `arXiv:2212.04356 <https://arxiv.org/abs/2212.04356>`_.

.. [2] Bredin, H., Yin, R., Coria, J. M., Gelly, G., Korshunov, P., Lavechin, M., Fustes, D., Titeux, H., Bouaziz, W., & Gill, M. (2020). *pyannote.audio: neural building blocks for speaker diarization*. `arXiv:1911.01255 <https://arxiv.org/abs/1911.01255>`_.

.. [3] Desplanques, B., Thienpondt, J., & Demuynck, K. (2020). *ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification*. `arXiv:2005.07143 <https://arxiv.org/abs/2005.07143>`_.

.. [4] Silero Team. (2024). *Silero VAD: pre-trained enterprise-grade Voice Activity Detector (VAD), Number Detector and Language Classifier*. `GitHub repository <https://github.com/snakers4/silero-vad>`_.

.. [5] Défossez, A., Usunier, N., Bottou, L., & Bach, F. (2019). *Music Source Separation in the Waveform Domain*. `arXiv:1911.13254 <https://arxiv.org/abs/1911.13254>`_.

.. [6] Schröter, H., Escalante-B., A. N., Rosenkranz, T., & Maier, A. (2022). *DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering*. `arXiv:2110.05588 <https://arxiv.org/abs/2110.05588>`_.