Audio operations#
This page lists all components related to audio processing.
Note
For more details about all sub-packages, refer to
medkit.audio.
Pre-processing operations#
This section lists the audio preprocessing operations.
Note
For more details about public APIs, refer to medkit.audio.preprocessing.
Downmixer#
For more details, refer to medkit.audio.preprocessing.downmixer.
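As a rough illustration of what downmixing means, the sketch below averages the channels of a multi-channel signal into a single mono channel. It does not use medkit: the `downmix_to_mono` helper and the list-of-lists signal layout are assumptions made for this example, and the actual Downmixer operation may work differently.

```python
from typing import List


def downmix_to_mono(channels: List[List[float]]) -> List[float]:
    """Average equally long channels, sample by sample, into one mono channel.

    Hypothetical helper for illustration only, not part of medkit.
    """
    if not channels:
        raise ValueError("at least one channel is required")
    nb_samples = len(channels[0])
    if any(len(channel) != nb_samples for channel in channels):
        raise ValueError("all channels must have the same number of samples")
    nb_channels = len(channels)
    # zip(*channels) yields, for each time step, one sample per channel
    return [sum(samples) / nb_channels for samples in zip(*channels)]


# Two channels (stereo) of three samples each
stereo = [[0.5, 1.0, -1.0], [0.5, 0.0, 1.0]]
mono = downmix_to_mono(stereo)
```

Averaging, rather than summing, keeps the mono signal within the amplitude range of the input channels.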
Power normalizer#
For more details, refer to medkit.audio.preprocessing.power_normalizer.
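The general idea of power normalization can be sketched as rescaling a signal so that its root mean square (RMS) amplitude matches a target value. The code below is a minimal sketch that does not use medkit: `rms`, `normalize_power`, and the `target_rms` and `eps` parameters are hypothetical names chosen for this example, and the actual PowerNormalizer operation may define power and its parameters differently.

```python
import math
from typing import List


def rms(signal: List[float]) -> float:
    """Root mean square amplitude of a non-empty signal."""
    if not signal:
        raise ValueError("signal must not be empty")
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def normalize_power(
    signal: List[float], target_rms: float = 1.0, eps: float = 1e-12
) -> List[float]:
    """Scale the signal so that its RMS amplitude equals target_rms.

    Hypothetical helper for illustration only, not part of medkit.
    """
    current = rms(signal)
    if current < eps:
        # A (near-)silent signal cannot be scaled up meaningfully: return a copy
        return list(signal)
    gain = target_rms / current
    return [s * gain for s in signal]


signal = [0.1, -0.1, 0.1, -0.1]
normalized = normalize_power(signal, target_rms=0.5)
```

The guard on near-silent input avoids a division by zero and prevents noise from being amplified to full scale.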
Resampler#
Important
Resampler needs additional dependencies
that can be installed with pip install medkit-lib[resampler]
For more details, refer to medkit.audio.preprocessing.resampler.
Segmentation operations#
This section lists audio segmentation operations. They are part of the
medkit.audio.segmentation module.
WebRTC voice detector#
For more details, refer to
medkit.audio.segmentation.webrtc_voice_detector.
Pyannote speaker detector#
Important
PASpeakerDetector is an experimental feature.
It depends on a version of pyannote-audio that is not yet released on PyPI.
To install it, you may use the JSALT2023 tag:
pip install https://github.com/pyannote/pyannote-audio/archive/refs/tags/JSALT2023.tar.gz
For more details, refer to medkit.audio.segmentation.pa_speaker_detector.
Audio transcription#
This section lists the operations and other components used to perform audio
transcription.
They are part of the medkit.audio.transcription module.
DocTranscriber is the operation that transforms
AudioDocument instances into
TranscribedDocument instances (a subclass of
TextDocument).
The actual conversion from audio to text is delegated to components complying
with the TranscriberFunction protocol.
HFTranscriberFunction and
SBTranscriberFunction are implementations of
TranscriberFunction that make it possible to use HuggingFace
transformer models and SpeechBrain models, respectively.
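The delegation pattern described above can be sketched in plain Python with `typing.Protocol`. Nothing below is medkit's API: `AudioClip`, `Transcriber`, `DurationTranscriber`, and `transcribe_clips` are hypothetical stand-ins invented for this example, and the real TranscriberFunction protocol may declare a different method signature.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class AudioClip:
    """Hypothetical stand-in for an audio input (not medkit's AudioDocument)."""

    samples: List[float]
    sample_rate: int


class Transcriber(Protocol):
    """Hypothetical protocol: any object with a matching transcribe() complies."""

    def transcribe(self, clips: List[AudioClip]) -> List[str]:
        ...


class DurationTranscriber:
    """Toy backend that 'transcribes' each clip as a description of its length.

    A real backend would run a speech-to-text model here.
    """

    def transcribe(self, clips: List[AudioClip]) -> List[str]:
        return [
            f"<{len(clip.samples) / clip.sample_rate:.1f}s of audio>"
            for clip in clips
        ]


def transcribe_clips(clips: List[AudioClip], transcriber: Transcriber) -> List[str]:
    """Delegate the audio-to-text step to whichever backend was provided."""
    return transcriber.transcribe(clips)


clips = [
    AudioClip(samples=[0.0] * 16000, sample_rate=16000),
    AudioClip(samples=[0.0] * 8000, sample_rate=16000),
]
texts = transcribe_clips(clips, DurationTranscriber())
```

Because the calling code depends only on the protocol, a backend built on a different model library can be swapped in without changing it.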
DocTranscriber#
For more details, refer to medkit.audio.transcription.doc_transcriber.
TranscribedDocument#
For more details, refer to medkit.audio.transcription.transcribed_document.
HFTranscriberFunction#
Important
HFTranscriberFunction needs additional
dependencies that can be installed with
pip install medkit-lib[hf-transcriber-function]
For more details, refer to
medkit.audio.transcription.hf_transcriber_function.
SBTranscriberFunction#
Important
SBTranscriberFunction needs additional
dependencies that can be installed with
pip install medkit-lib[sb-transcriber-function]
For more details, refer to
medkit.audio.transcription.sb_transcriber_function.