:py:mod:`medkit.io.rttm`
========================

.. py:module:: medkit.io.rttm


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   medkit.io.rttm.RTTMInputConverter
   medkit.io.rttm.RTTMOutputConverter


.. py:class:: RTTMInputConverter(turn_label: str = 'turn', speaker_label: str = 'speaker', converter_id: str | None = None)


   Bases: :py:obj:`medkit.core.InputConverter`

   
   Class for conversions from Rich Transcription Time Marked (.rttm) into turn segments.

   Convert Rich Transcription Time Marked (.rttm) files containing diarization
   information into turn segments.

   For each turn in a .rttm file containing diarization information, a
   :class:`~medkit.core.audio.annotation.Segment` will be created, with an
   associated :class:`~medkit.core.Attribute` holding the name of the turn
   speaker as value. The segments can be retrieved directly or as part of an
   :class:`~medkit.core.audio.document.AudioDocument` instance.

   If a :class:`~medkit.core.ProvTracer` is set, provenance information will be
   added for each segment and each attribute (referencing the input converter
   as the operation).
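
   As a rough illustration of the file format (not medkit's actual parsing
   code), each ``SPEAKER`` line of a .rttm file holds ten space-separated
   fields, including the file id (2nd field), the turn onset and duration in
   seconds (4th and 5th fields) and the speaker name (8th field). The sketch
   below uses only the standard library and extracts one turn per line. The
   ``Turn`` tuple and the ``parse_rttm_turns`` helper are hypothetical names
   introduced for this example.

   .. code-block:: python

      from typing import List, NamedTuple


      class Turn(NamedTuple):
          """Hypothetical container for one diarization turn."""

          file_id: str
          start: float
          end: float
          speaker: str


      def parse_rttm_turns(text: str) -> List[Turn]:
          """Extract one Turn per SPEAKER line, skipping any other line."""
          turns = []
          for line in text.splitlines():
              fields = line.split()
              if len(fields) < 8 or fields[0] != "SPEAKER":
                  continue
              onset = float(fields[3])
              duration = float(fields[4])
              # The end time is not stored in the file: it is onset + duration
              turns.append(Turn(fields[1], onset, onset + duration, fields[7]))
          return turns


      example = (
          "SPEAKER rec1 1 0.50 2.00 <NA> <NA> alice <NA> <NA>\n"
          "SPEAKER rec1 1 2.75 1.25 <NA> <NA> bob <NA> <NA>\n"
      )
      turns = parse_rttm_turns(example)

   In medkit, each such turn would correspond to one segment labelled with
   ``turn_label`` and carrying a ``speaker_label`` attribute.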

   :Parameters:

       **turn_label** : str, default="turn"
           Label of segments representing turns in the .rttm file.

       **speaker_label** : str, default="speaker"
           Label of speaker attributes to add to each segment.

       **converter_id** : str, optional
           Identifier of the converter.


   :Attributes:

       **description** : OperationDescription
           Description for the operation.


   ..
       !! processed by numpydoc !!
   .. py:property:: description
      :type: medkit.core.OperationDescription

      
      Contains all the input converter init parameters.


      ..
          !! processed by numpydoc !!

   .. py:method:: set_prov_tracer(prov_tracer: medkit.core.ProvTracer)

      
      Enable provenance tracing.


      :Parameters:

          **prov_tracer** : ProvTracer
              Tracer used to record provenance information.


      ..
          !! processed by numpydoc !!

   .. py:method:: load(rttm_dir: str | pathlib.Path, audio_dir: str | pathlib.Path | None = None, audio_ext: str = '.wav') -> list[medkit.core.audio.AudioDocument]

      
      Load all .rttm files in a directory into a list of audio documents.

      For each .rttm file, there must be a corresponding audio file with the
      same basename, either in the same directory or in a separate audio
      directory.

      :Parameters:

          **rttm_dir** : str or Path
              Directory containing the .rttm files.

          **audio_dir** : str or Path, optional
              Directory containing the audio files corresponding to the .rttm files,
              if they are not in `rttm_dir`.

          **audio_ext** : str, default=".wav"
              File extension to use for audio files.

      :Returns:

          list of AudioDocument
              List of generated documents.
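
      The basename matching between .rttm and audio files can be pictured with
      the following standard-library sketch. It is not medkit's
      implementation: ``pair_rttm_with_audio`` is a hypothetical helper, and
      raising ``FileNotFoundError`` when an audio file is missing is an
      assumption of this example.

      .. code-block:: python

         from pathlib import Path
         from typing import List, Optional, Tuple, Union

         PathLike = Union[str, Path]


         def pair_rttm_with_audio(
             rttm_dir: PathLike,
             audio_dir: Optional[PathLike] = None,
             audio_ext: str = ".wav",
         ) -> List[Tuple[Path, Path]]:
             """Pair each .rttm file with the audio file sharing its basename."""
             rttm_dir = Path(rttm_dir)
             # Fall back to the .rttm directory when no audio directory is given
             audio_dir = Path(audio_dir) if audio_dir is not None else rttm_dir
             pairs = []
             for rttm_file in sorted(rttm_dir.glob("*.rttm")):
                 audio_file = audio_dir / (rttm_file.stem + audio_ext)
                 if not audio_file.exists():
                     # Assumption of this sketch: a missing audio file is an error
                     raise FileNotFoundError(f"No audio file found for {rttm_file.name}")
                 pairs.append((rttm_file, audio_file))
             return pairs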


      ..
          !! processed by numpydoc !!

   .. py:method:: load_doc(rttm_file: str | pathlib.Path, audio_file: str | pathlib.Path) -> medkit.core.audio.AudioDocument

      
      Load a single .rttm file into an audio document.


      :Parameters:

          **rttm_file** : str or Path
              Path to the .rttm file.

          **audio_file** : str or Path
              Path to the corresponding audio file.

      :Returns:

          AudioDocument
              Generated document.


      ..
          !! processed by numpydoc !!

   .. py:method:: load_turns(rttm_file: str | pathlib.Path, audio_file: str | pathlib.Path) -> list[medkit.core.audio.Segment]

      
      Load a .rttm file as a list of segments.


      :Parameters:

          **rttm_file** : str or Path
              Path to the .rttm file.

          **audio_file** : str or Path
              Path to the corresponding audio file.

      :Returns:

          list of Segment
              Turn segments as found in the .rttm file.


      ..
          !! processed by numpydoc !!

   .. py:method:: _load_rows(rttm_file: pathlib.Path)
      :staticmethod:


   .. py:method:: _build_turn_segment(row: dict[str, Any], full_audio: medkit.core.audio.FileAudioBuffer) -> medkit.core.audio.Segment


.. py:class:: RTTMOutputConverter(turn_label: str = 'turn', speaker_label: str = 'speaker')


   Bases: :py:obj:`medkit.core.OutputConverter`

   
   Class for conversions to Rich Transcription Time Marked (.rttm).

   Build Rich Transcription Time Marked (.rttm) files containing diarization
   information from :class:`~medkit.core.audio.annotation.Segment` objects.

   There must be a segment for each turn, with an associated
   :class:`~medkit.core.Attribute` holding the name of the turn speaker as
   value. The segments can be passed directly or as part of
   :class:`~medkit.core.audio.document.AudioDocument` instances.
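
   For reference, a turn is conventionally written to a .rttm file as a
   ``SPEAKER`` line with ten space-separated fields, the unused ones being set
   to ``<NA>``. The sketch below formats such a line with the standard library
   only. ``format_rttm_line`` is a hypothetical helper, and the channel value
   of ``1`` and the three-decimal precision are assumptions of this example
   rather than documented medkit behavior.

   .. code-block:: python

      def format_rttm_line(file_id: str, start: float, end: float, speaker: str) -> str:
          """Format one turn as a SPEAKER line of a .rttm file."""
          fields = [
              "SPEAKER",             # 1st field: line type
              file_id,               # 2nd field: file id
              "1",                   # 3rd field: channel (assumed to be 1 here)
              f"{start:.3f}",        # 4th field: turn onset in seconds
              f"{end - start:.3f}",  # 5th field: turn duration in seconds
              "<NA>",
              "<NA>",
              speaker,               # 8th field: speaker name
              "<NA>",
              "<NA>",
          ]
          return " ".join(fields)


      line = format_rttm_line("rec1", 0.5, 2.5, "alice")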

   :Parameters:

       **turn_label** : str, default="turn"
           Label of segments representing turns in the audio documents.

       **speaker_label** : str, default="speaker"
           Label of speaker attributes attached to each turn segment.


   ..
       !! processed by numpydoc !!
   .. py:method:: save(docs: list[medkit.core.audio.AudioDocument], rttm_dir: str | pathlib.Path, doc_names: list[str] | None = None)

      
      Save a collection of audio documents to RTTM files in a directory.


      :Parameters:

          **docs** : list of AudioDocument
              List of audio documents to save.

          **rttm_dir** : str or Path
              Directory into which the generated .rttm files will be stored.

          **doc_names** : list of str, optional
              Names to use as basenames and file ids for the generated .rttm
              files (2nd column). If not provided, the document ids will be
              used.
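
      The naming rule for the generated files can be sketched as follows,
      using plain strings in place of audio documents.
      ``resolve_rttm_targets`` is a hypothetical helper, and requiring
      ``doc_names`` to hold exactly one entry per document is an assumption of
      this example.

      .. code-block:: python

         from pathlib import Path
         from typing import List, Optional, Sequence, Tuple, Union


         def resolve_rttm_targets(
             doc_ids: Sequence[str],
             rttm_dir: Union[str, Path],
             doc_names: Optional[Sequence[str]] = None,
         ) -> List[Tuple[str, Path]]:
             """Return (file id, output path) pairs, one per document."""
             # Fall back to the document ids when no names are given
             names = list(doc_names) if doc_names is not None else list(doc_ids)
             if len(names) != len(doc_ids):
                 # Assumption of this sketch: a length mismatch is an error
                 raise ValueError("doc_names must contain one name per document")
             rttm_dir = Path(rttm_dir)
             # Each name serves both as file basename and as file id
             return [(name, rttm_dir / f"{name}.rttm") for name in names]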


      ..
          !! processed by numpydoc !!

   .. py:method:: save_doc(doc: medkit.core.audio.AudioDocument, rttm_file: str | pathlib.Path, rttm_doc_id: str | None = None)

      
      Save a single audio document to a .rttm file.


      :Parameters:

          **doc** : AudioDocument
              Audio document to save.

          **rttm_file** : str or Path
              Path of the generated .rttm file.

          **rttm_doc_id** : str, optional
              File uid to use for the generated .rttm file (2nd column). If not
              provided, the document uid will be used.


      ..
          !! processed by numpydoc !!

   .. py:method:: save_turn_segments(turn_segments: list[medkit.core.audio.Segment], rttm_file: str | pathlib.Path, rttm_doc_id: str | None)

      
      Save :class:`~medkit.core.audio.annotation.Segment` objects into a .rttm file.


      :Parameters:

          **turn_segments** : list of Segment
              Turn segments to save.

          **rttm_file** : str or Path
              Path of the generated .rttm file.

          **rttm_doc_id** : str, optional
              File uid to use for the generated .rttm file (2nd column).


      ..
          !! processed by numpydoc !!

   .. py:method:: _build_rttm_row(turn_segment: medkit.core.audio.Segment, rttm_doc_id: str | None) -> dict[str, Any]