medkit.audio.transcription.sb_transcriber
=========================================

.. py:module:: medkit.audio.transcription.sb_transcriber


Classes
-------

.. autoapisummary::

   medkit.audio.transcription.sb_transcriber.SBTranscriber


Module Contents
---------------

.. py:class:: SBTranscriber(model: str | pathlib.Path, needs_decoder: bool, output_label: str = 'transcribed_text', add_trailing_dot: bool = True, capitalize: bool = True, cache_dir: str | pathlib.Path | None = None, device: int = -1, batch_size: int = 1, uid: str | None = None)

   Bases: :py:obj:`medkit.core.Operation`


   Transcriber operation based on a SpeechBrain model.

   For each segment given as input, a transcription attribute will be created
   with the transcribed text as value. If needed, a text document can later be
   created from all the transcriptions of an audio document using
   :func:`~medkit.audio.transcription.TranscribedTextDocument.from_audio_doc
   <TranscribedTextDocument.from_audio_doc>`.

   :Parameters:

       **model** : str or Path
           Name of the model on the Hugging Face models hub, or local path.

       **needs_decoder** : bool
           Whether the model should be used with the SpeechBrain
           `EncoderDecoderASR` class (`True`) or the `EncoderASR` class
           (`False`). If unsure, check the code snippets on the model card on
           the hub.

       **output_label** : str, default="transcribed_text"
           Label of the attribute containing the transcribed text that will be
           attached to the input segments.

       **add_trailing_dot** : bool, default=True
           If `True`, a dot will be added at the end of each transcription text.

       **capitalize** : bool, default=True
           If `True`, the first letter of each transcription text will be
           uppercased and the rest lowercased.

       **cache_dir** : str or Path, optional
           Directory in which to store the downloaded model. If `None`,
           SpeechBrain will use the "pretrained_models/" and "model_checkpoints/"
           directories in the current working directory.

       **device** : int, default=-1
           Device to use for PyTorch models. Follows the Hugging Face convention
           (`-1` for CPU and the device number for GPU, for instance `0` for "cuda:0").

       **batch_size** : int, default=1
           Number of segments per batch processed by the model.

       **uid** : str, optional
           Identifier of the transcriber.



   .. py:attribute:: model_name


   .. py:attribute:: output_label
      :value: 'transcribed_text'


   .. py:attribute:: add_trailing_dot
      :value: True


   .. py:attribute:: capitalize
      :value: True


   .. py:attribute:: cache_dir
      :value: None


   .. py:attribute:: device
      :value: -1


   .. py:attribute:: batch_size
      :value: 1


   .. py:attribute:: _torch_device
      :value: 'cpu'


   .. py:attribute:: _asr


   .. py:attribute:: _sample_rate


   .. py:method:: run(segments: list[medkit.core.audio.Segment])

      
      Run the transcription operation.

      Add a transcription attribute to each segment with a text value containing
      the transcribed text.

      :Parameters:

          **segments** : list of Segment
              List of segments to transcribe.




   .. py:method:: _transcribe_audios(audios: list[medkit.core.audio.AudioBuffer]) -> list[str]
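
The parameters documented above (`device`, `batch_size`, `capitalize`, `add_trailing_dot`) can be illustrated with a small standalone sketch. It is not medkit's implementation: the helper names `device_to_torch_name`, `iter_batches`, `postprocess` and `transcribe_all` are hypothetical, and the `transcribe_batch` callable is a stand-in for the SpeechBrain model call. The sketch only assumes the behaviour described in the parameter list.

```python
from typing import Callable, Iterator, List, Sequence


def device_to_torch_name(device: int) -> str:
    """Map a Hugging Face style device index to a torch device string.

    Assumption: -1 (or any negative value) means CPU, N >= 0 means "cuda:N".
    """
    return "cpu" if device < 0 else f"cuda:{device}"


def iter_batches(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(items), batch_size):
        yield items[start : start + batch_size]


def postprocess(text: str, capitalize: bool = True, add_trailing_dot: bool = True) -> str:
    """Apply the documented text clean-up to one transcription."""
    if capitalize:
        # str.capitalize() uppercases the first letter and lowercases the rest
        text = text.capitalize()
    if add_trailing_dot:
        text += "."
    return text


def transcribe_all(
    audios: Sequence,
    transcribe_batch: Callable[[Sequence], List[str]],  # hypothetical model call
    batch_size: int = 1,
    capitalize: bool = True,
    add_trailing_dot: bool = True,
) -> List[str]:
    """Transcribe `audios` batch by batch and post-process each text."""
    texts: List[str] = []
    for batch in iter_batches(audios, batch_size):
        for raw_text in transcribe_batch(batch):
            texts.append(postprocess(raw_text, capitalize, add_trailing_dot))
    return texts
```

With the default `device=-1`, `device_to_torch_name` returns `"cpu"`, which matches the documented default value of `_torch_device`. A larger `batch_size` only changes how many inputs are handed to the model call at once; the returned list keeps one text per input, in input order.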