medkit.audio.transcription.hf_transcriber_function

This module requires extra dependencies that are not installed with the core dependencies of medkit. To install them, run pip install medkit-lib[hf-transcriber_function].

Classes:

HFTranscriberFunction([model, ...]) – Transcriber function based on a Hugging Face transformers model.

class HFTranscriberFunction(model='facebook/s2t-large-librispeech-asr', add_trailing_dot=True, capitalize=True, device=-1, batch_size=1, cache_dir=None)

Transcriber function based on a Hugging Face transformers model.

To be used within a DocTranscriber.

Parameters
  • model (str) – Name of the ASR model on the Hugging Face models hub. Must be a model compatible with the AutomaticSpeechRecognitionPipeline transformers class.

  • add_trailing_dot (bool) – If True, a dot will be added at the end of each transcription text.

  • capitalize (bool) – If True, the first letter of each transcription text will be uppercased and the rest lowercased.

  • device (int) – Device to use for PyTorch models. Follows the Hugging Face convention (-1 for CPU and the device number for GPU, for instance 0 for “cuda:0”).

  • batch_size (int) – Size of the batches processed by the ASR pipeline.

  • cache_dir (Union[str, Path, None]) – Directory in which to store downloaded models. If not set, the default Hugging Face cache directory is used.
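
The effect of add_trailing_dot, capitalize and device can be illustrated with a small standalone sketch. The helpers below (postprocess_transcription and device_label) are hypothetical names introduced only for this illustration. They are not part of medkit and do not reproduce its implementation; they only mirror the behaviour described in the parameter list above, using the standard library.

```python
def postprocess_transcription(
    text: str, add_trailing_dot: bool = True, capitalize: bool = True
) -> str:
    """Hypothetical helper mirroring the documented options (not medkit code)."""
    if capitalize:
        # str.capitalize() uppercases the first character and lowercases the rest
        text = text.capitalize()
    if add_trailing_dot:
        # Append a dot at the end of the transcription text
        text += "."
    return text


def device_label(device: int) -> str:
    """Hypothetical helper: map a Hugging Face-style device index to a device name."""
    # Assumption based on the description above: -1 means CPU,
    # a non-negative number N means the GPU named "cuda:N"
    return "cpu" if device < 0 else f"cuda:{device}"


print(postprocess_transcription("hello WORLD"))  # prints "Hello world."
print(postprocess_transcription("hello WORLD", add_trailing_dot=False))  # prints "Hello world"
print(device_label(-1))  # prints "cpu"
print(device_label(0))  # prints "cuda:0"
```

Note that with capitalize=True any uppercase letters after the first character, such as acronyms or proper nouns, are lowercased as well. Passing capitalize=False leaves the casing of the model output unchanged.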