medkit.audio.metrics.transcription
==================================

.. py:module:: medkit.audio.metrics.transcription


Classes
-------

.. autoapisummary::

   medkit.audio.metrics.transcription.TranscriptionEvaluatorResult
   medkit.audio.metrics.transcription.TranscriptionEvaluator


Module Contents
---------------

.. py:class:: TranscriptionEvaluatorResult

   
   Results returned by :class:`~.TranscriptionEvaluator`.


   :Attributes:

       **wer** : float
           Word Error Rate, combination of word insertions, deletions and
           substitutions

       **word_insertions** : float
           Ratio of extra words in prediction (over `word_support`)

       **word_deletions** : float
           Ratio of missing words in prediction (over `word_support`)

       **word_substitutions** : float
           Ratio of replaced words in prediction (over `word_support`)

       **word_support** : int
           Total number of words

       **cer** : float
           Character Error Rate, same as `wer` but at character level

       **char_insertions** : float
           Identical to `word_insertions` but at character level

       **char_deletions** : float
           Identical to `word_deletions` but at character level

       **char_substitutions** : float
           Identical to `word_substitutions` but at character level

       **char_support** : int
           Total number of characters, excluding whitespace, counted after
           punctuation removal and unicode replacement


   ..
       !! processed by numpydoc !!

   .. py:attribute:: wer
      :type:  float


   .. py:attribute:: word_insertions
      :type:  float


   .. py:attribute:: word_deletions
      :type:  float


   .. py:attribute:: word_substitutions
      :type:  float


   .. py:attribute:: word_support
      :type:  int


   .. py:attribute:: cer
      :type:  float


   .. py:attribute:: char_insertions
      :type:  float


   .. py:attribute:: char_deletions
      :type:  float


   .. py:attribute:: char_substitutions
      :type:  float


   .. py:attribute:: char_support
      :type:  int
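
The attributes of ``TranscriptionEvaluatorResult`` can be illustrated with a
small, self-contained sketch. It assumes the usual edit-distance definition of
the error rate, in which the rate is the sum of the insertion, deletion and
substitution ratios over the number of reference tokens. The helpers
``edit_counts`` and ``error_rates`` are hypothetical names used for
illustration only; they are not part of medkit, which delegates the actual
computation to `speechbrain`.

```python
def edit_counts(ref, hyp):
    """Return (insertions, deletions, substitutions) of a minimal edit
    script turning the sequence `ref` into the sequence `hyp`."""
    n, m = len(ref), len(hyp)
    # table[i][j] = (total, insertions, deletions, substitutions)
    # for the best alignment of ref[:i] with hyp[:j]
    table = [[(0, 0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        table[i][0] = (i, 0, i, 0)  # only deletions
    for j in range(1, m + 1):
        table[0][j] = (j, j, 0, 0)  # only insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if ref[i - 1] == hyp[j - 1]:
                # matching tokens cost nothing
                table[i][j] = table[i - 1][j - 1]
                continue
            t, a, d, s = table[i - 1][j - 1]
            substitution = (t + 1, a, d, s + 1)
            t, a, d, s = table[i][j - 1]
            insertion = (t + 1, a + 1, d, s)
            t, a, d, s = table[i - 1][j]
            deletion = (t + 1, a, d + 1, s)
            # tuples compare on the total cost first
            table[i][j] = min(substitution, insertion, deletion)
    return table[n][m][1:]


def error_rates(ref, hyp):
    """Ratios of each error type over the number of reference tokens."""
    if not ref:
        raise ValueError("reference must contain at least one token")
    ins, dels, subs = edit_counts(ref, hyp)
    support = len(ref)
    return {
        "rate": (ins + dels + subs) / support,
        "insertions": ins / support,
        "deletions": dels / support,
        "substitutions": subs / support,
        "support": support,
    }


ref_text = "the patient has a high fever"
hyp_text = "the patient had high fever today"

# Word level: tokens are whitespace-separated words
word_metrics = error_rates(ref_text.split(), hyp_text.split())

# Character level: tokens are characters, whitespace excluded
char_metrics = error_rates(
    list(ref_text.replace(" ", "")), list(hyp_text.replace(" ", ""))
)
```

In this sketch, ``word_metrics`` plays the role of ``wer``, ``word_insertions``,
``word_deletions``, ``word_substitutions`` and ``word_support``, and
``char_metrics`` the role of their character-level counterparts. When several
minimal alignments exist, the split between error types depends on the
tie-breaking rule, so individual ratios may differ from those reported by
`speechbrain` even when the overall rate agrees.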


.. py:class:: TranscriptionEvaluator(speech_label: str = 'speech', transcription_label: str = 'transcription', case_sensitive: bool = False, remove_punctuation: bool = True, replace_unicode: bool = False)

   
   Word Error Rate (WER) and Character Error Rate (CER) computation based on `speechbrain`.

   The WER is the ratio of prediction errors at the word level, taking into
   account:

   - words present in the reference transcription but missing from the
     prediction;

   - extra predicted words not present in the reference;

   - reference words mistakenly replaced by other words in the prediction.

   The CER is identical to the WER but computed at the character level rather
   than at the word level.

   This component expects as input reference documents containing speech
   segments with reference transcription attributes, as well as corresponding
   speech segments with predicted transcription attributes.

   :Parameters:

       **speech_label** : str, default="speech"
           Label of the speech segments on the reference documents

       **transcription_label** : str, default="transcription"
           Label of the transcription attributes on the reference and predicted
           speech segments

       **case_sensitive** : bool, default=False
           Whether to take case into consideration when comparing reference and
           prediction

       **remove_punctuation** : bool, default=True
           If True, punctuation in reference and predictions is removed before
           comparing (based on `string.punctuation`)

       **replace_unicode** : bool, default=False
           If True, special unicode characters in reference and predictions are
           replaced by their closest ASCII characters (when possible) before
           comparing


   ..
       !! processed by numpydoc !!

   .. py:method:: compute(reference: Sequence[medkit.core.audio.AudioDocument], predicted: Sequence[Sequence[medkit.core.audio.Segment]]) -> TranscriptionEvaluatorResult

      
      Compute the WER and CER for predicted transcription attributes against reference annotated documents.


      :Parameters:

          **reference** : sequence of AudioDocument
              Reference documents containing speech segments with `speech_label`
              as label, each of them containing a transcription attribute with
              `transcription_label` as label.

          **predicted** : sequence of sequence of Segment
              Predicted segments, each containing a transcription attribute with
              `transcription_label` as label. This is a list of lists that must
              have the same length and ordering as `reference`.

      :Returns:

          TranscriptionEvaluatorResult
              Computed metrics


      ..
          !! processed by numpydoc !!


   .. py:method:: _convert_speech_segs_to_words(segments: Sequence[medkit.core.audio.Segment]) -> list[str]

      
      Convert speech segments with transcription attributes to speechbrain words.


      ..
          !! processed by numpydoc !!
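
The effect of the ``case_sensitive``, ``remove_punctuation`` and
``replace_unicode`` parameters of ``TranscriptionEvaluator`` can be sketched
with the standard library. The ``normalize`` function below is a hypothetical
helper, not medkit's implementation: the order of the steps and the use of
``unicodedata`` for the ASCII replacement are assumptions, and medkit may rely
on a different library that yields other results for some characters.

```python
import string
import unicodedata


def normalize(text, case_sensitive=False, remove_punctuation=True,
              replace_unicode=False):
    """Normalize a transcription before comparing it to another one."""
    if not case_sensitive:
        # ignore case differences
        text = text.lower()
    if remove_punctuation:
        # drop every character listed in string.punctuation
        text = text.translate(str.maketrans("", "", string.punctuation))
    if replace_unicode:
        # decompose accented characters, then drop whatever is not ASCII
        text = unicodedata.normalize("NFKD", text)
        text = text.encode("ascii", "ignore").decode("ascii")
    return text


default_result = normalize("Déjà vu!")
ascii_result = normalize("Déjà vu!", replace_unicode=True)
```

With the default settings only case and punctuation are normalized, so accented
characters still count as differences unless ``replace_unicode`` is enabled.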