:py:mod:`medkit.io.srt`
=======================

.. py:module:: medkit.io.srt


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   medkit.io.srt.SRTInputConverter
   medkit.io.srt.SRTOutputConverter
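
A minimal round-trip sketch, using only the constructors and method
signatures documented on this page. The function name and its directory
arguments are hypothetical, and the example assumes that every .srt file has
a matching ``.wav`` file with the same basename.

.. code-block:: python

   from pathlib import Path


   def convert_directory(srt_dir: Path, audio_dir: Path, out_dir: Path) -> int:
       """Load every .srt file in srt_dir and write it back into out_dir."""
       # Imported lazily so that defining this function does not require medkit.
       from medkit.io.srt import SRTInputConverter, SRTOutputConverter

       input_converter = SRTInputConverter(
           turn_segment_label="turn",
           transcription_attr_label="transcribed_text",
       )
       docs = input_converter.load(srt_dir, audio_dir=audio_dir, audio_ext=".wav")

       # Use the same labels as above so the output converter can find the
       # turn segments and their transcription attributes.
       output_converter = SRTOutputConverter(
           segment_turn_label="turn",
           transcription_attr_label="transcribed_text",
       )
       # Creating the directory up front is a precaution, not a documented need.
       out_dir.mkdir(parents=True, exist_ok=True)
       output_converter.save(docs, srt_dir=out_dir)
       return len(docs)

Note that the two constructors name the turn label parameter differently:
``turn_segment_label`` for the input converter and ``segment_turn_label`` for
the output converter.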


.. py:class:: SRTInputConverter(turn_segment_label: str = 'turn', transcription_attr_label: str = 'transcribed_text', converter_id: str | None = None)


   Bases: :py:obj:`medkit.core.InputConverter`

   
   Convert .srt files containing transcription information into turn segments with transcription attributes.

   For each turn in a .srt file, a
   :class:`~medkit.core.audio.annotation.Segment` will be created, with an
   associated :class:`~medkit.core.Attribute` holding the transcribed text as
   value. The segments can be retrieved directly or as part of an
   :class:`~medkit.core.audio.document.AudioDocument` instance.

   If a :class:`~medkit.core.ProvTracer` is set, provenance information will be
   added for each segment and each attribute (referencing the input converter
   as the operation).

   :Parameters:

       **turn_segment_label** : str, default="turn"
           Label to use for segments representing turns in the .srt file.

       **transcription_attr_label** : str, default="transcribed_text"
           Label to use for segment attributes containing the transcribed text.

       **converter_id** : str, optional
           Identifier of the converter.
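
   To illustrate how turns in a .srt file map to time spans and text, here is
   a sketch that relies only on the standard library. It is not medkit's
   implementation (the ``_build_segment`` signature on this page references
   ``pysrt.SubRipItem``); ``Turn`` and ``parse_srt`` are hypothetical names
   introduced for the example.

   .. code-block:: python

      import re
      from dataclasses import dataclass

      # Matches a timing line such as "00:00:02,500 --> 00:00:04,000".
      _TIMING = re.compile(
          r"(\d+):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2}),(\d{3})"
      )


      @dataclass
      class Turn:
          start: float  # in seconds
          end: float  # in seconds
          text: str


      def _to_seconds(h: str, m: str, s: str, ms: str) -> float:
          return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


      def parse_srt(content: str) -> list[Turn]:
          """Parse .srt content into turns; index lines are ignored."""
          turns = []
          # Turns are separated by one or more blank lines.
          for block in re.split(r"\n\s*\n", content.strip()):
              lines = block.splitlines()
              for i, line in enumerate(lines):
                  match = _TIMING.search(line)
                  if match:
                      groups = match.groups()
                      start = _to_seconds(*groups[:4])
                      end = _to_seconds(*groups[4:])
                      # Everything after the timing line is the transcribed text.
                      text = "\n".join(lines[i + 1:]).strip()
                      turns.append(Turn(start, end, text))
                      break
          return turns


      example = """1
      00:00:00,000 --> 00:00:02,500
      Hello, how are you?

      2
      00:00:02,500 --> 00:00:04,000
      Fine, thank you.
      """
      turns = parse_srt(example)

   In medkit, each such turn becomes a segment labelled with
   ``turn_segment_label``, and its text is stored in an attribute labelled
   with ``transcription_attr_label``.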


   ..
       !! processed by numpydoc !!
   .. py:property:: description
      :type: medkit.core.OperationDescription

      
      Contains all the input converter init parameters.


      ..
          !! processed by numpydoc !!

   .. py:method:: set_prov_tracer(prov_tracer: medkit.core.ProvTracer)

      
      Enable provenance tracing.


      :Parameters:

          **prov_tracer** : ProvTracer
              Provenance tracer that will record provenance information.


      ..
          !! processed by numpydoc !!

   .. py:method:: load(srt_dir: str | pathlib.Path, audio_dir: str | pathlib.Path | None = None, audio_ext: str = '.wav') -> list[medkit.core.audio.AudioDocument]

      
      Load all .srt files in a directory into a list of audio documents.

      For each .srt file, there must be a corresponding audio file with the
      same basename, either in the same directory or in a separate audio
      directory.

      :Parameters:

          **srt_dir** : str or Path
              Directory containing the .srt files.

          **audio_dir** : str or Path, optional
              Directory containing the audio files corresponding to the .srt files,
              if they are not in `srt_dir`.

          **audio_ext** : str, default=".wav"
              File extension to use for audio files.

      :Returns:

          list of AudioDocument
              List of generated documents.
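
      The basename matching described above can be sketched with ``pathlib``.
      This is an illustration, not medkit's code: ``pair_srt_with_audio`` is a
      hypothetical helper, and raising ``FileNotFoundError`` for a missing
      audio file is an assumption, since this page does not say how that case
      is handled.

      .. code-block:: python

         from pathlib import Path


         def pair_srt_with_audio(srt_dir, audio_dir=None, audio_ext=".wav"):
             """Return (srt_file, audio_file) pairs matched by basename."""
             srt_dir = Path(srt_dir)
             # Look next to the .srt files when no audio directory is given.
             audio_dir = Path(audio_dir) if audio_dir is not None else srt_dir

             pairs = []
             for srt_file in sorted(srt_dir.glob("*.srt")):
                 # Same basename, with the audio extension instead of ".srt".
                 audio_file = audio_dir / (srt_file.stem + audio_ext)
                 if not audio_file.exists():
                     # Assumption: a missing audio file is treated as an error.
                     raise FileNotFoundError(f"No audio file for {srt_file}")
                 pairs.append((srt_file, audio_file))
             return pairs

      Each resulting pair has the shape expected by ``load_doc`` and
      ``load_segments``, which both take an .srt path and an audio path.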


      ..
          !! processed by numpydoc !!

   .. py:method:: load_doc(srt_file: str | pathlib.Path, audio_file: str | pathlib.Path) -> medkit.core.audio.AudioDocument

      
      Load a single .srt file into an audio document containing turn segments with transcription attributes.


      :Parameters:

          **srt_file** : str or Path
              Path to the .srt file.

          **audio_file** : str or Path
              Path to the corresponding audio file.

      :Returns:

          AudioDocument
              Generated document.


      ..
          !! processed by numpydoc !!

   .. py:method:: load_segments(srt_file: str | pathlib.Path, audio_file: str | pathlib.Path) -> list[medkit.core.audio.Segment]

      
      Load a .srt file and return a list of segments corresponding to turns with transcription attributes.


      :Parameters:

          **srt_file** : str or Path
              Path to the .srt file.

          **audio_file** : str or Path
              Path to the corresponding audio file.

      :Returns:

          list of Segment
              Turn segments as found in the .srt file, with transcription
              attributes attached.


      ..
          !! processed by numpydoc !!

   .. py:method:: _build_segment(srt_item: pysrt.SubRipItem, full_audio: medkit.core.audio.FileAudioBuffer) -> medkit.core.audio.Segment


.. py:class:: SRTOutputConverter(segment_turn_label: str = 'turn', transcription_attr_label: str = 'transcribed_text')


   Bases: :py:obj:`medkit.core.OutputConverter`

   
   Build .srt files containing transcription information from segments.

   There must be a segment for each turn, with an associated
   :class:`~medkit.core.Attribute` holding the transcribed text as
   value. The segments can be passed directly or as part of
   :class:`~medkit.core.audio.document.AudioDocument` instances.

   :Parameters:

       **segment_turn_label** : str, default="turn"
           Label of segments representing turns in the audio documents.

       **transcription_attr_label** : str, default="transcribed_text"
           Label of segment attributes containing the transcribed text.
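
   As an illustration of the target format, the following standard-library
   sketch serializes turns as .srt content. It is not medkit's
   implementation; ``format_srt_timestamp`` and ``turns_to_srt`` are
   hypothetical helpers, and turns are represented here as plain
   ``(start, end, text)`` tuples with times in seconds.

   .. code-block:: python

      def format_srt_timestamp(seconds: float) -> str:
          """Format a time in seconds as HH:MM:SS,mmm."""
          total_ms = round(seconds * 1000)
          hours, rest = divmod(total_ms, 3_600_000)
          minutes, rest = divmod(rest, 60_000)
          secs, ms = divmod(rest, 1000)
          return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


      def turns_to_srt(turns) -> str:
          """Serialize (start, end, text) tuples, numbering turns from 1."""
          blocks = []
          for index, (start, end, text) in enumerate(turns, start=1):
              timing = (
                  f"{format_srt_timestamp(start)} --> {format_srt_timestamp(end)}"
              )
              blocks.append(f"{index}\n{timing}\n{text}\n")
          # A blank line separates consecutive turns.
          return "\n".join(blocks)


      srt_content = turns_to_srt(
          [(0.0, 2.5, "Hello, how are you?"), (2.5, 4.0, "Fine, thank you.")]
      )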


   ..
       !! processed by numpydoc !!
   .. py:method:: save(docs: list[medkit.core.audio.AudioDocument], srt_dir: str | pathlib.Path, doc_names: list[str] | None = None)

      
      Save multiple audio documents as .srt files in a directory.


      :Parameters:

          **docs** : list of AudioDocument
              List of audio documents to save.

          **srt_dir** : str or Path
              Directory into which the generated .srt files will be stored.

          **doc_names** : list of str, optional
              Optional list of names to use as basenames for the generated .srt
              files.
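
      A sketch of how explicit ``doc_names`` could map to output paths.
      ``build_srt_paths`` is a hypothetical helper; this page does not state
      how names are chosen when ``doc_names`` is omitted, nor whether a length
      mismatch is checked, so the sketch covers only the explicit case and the
      check is an assumption.

      .. code-block:: python

         from pathlib import Path


         def build_srt_paths(n_docs: int, srt_dir, doc_names):
             """Return one output path per document, named after doc_names."""
             # Assumption: exactly one name is expected per document.
             if len(doc_names) != n_docs:
                 raise ValueError("doc_names must contain one name per document")
             return [Path(srt_dir) / f"{name}.srt" for name in doc_names]


         paths = build_srt_paths(2, "out", ["interview_01", "interview_02"])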


      ..
          !! processed by numpydoc !!

   .. py:method:: save_doc(doc: medkit.core.audio.AudioDocument, srt_file: str | pathlib.Path)

      
      Save a single audio document as a .srt file.


      :Parameters:

          **doc** : AudioDocument
              Audio document to save.

          **srt_file** : str or Path
              Path of the generated .srt file.


      ..
          !! processed by numpydoc !!

   .. py:method:: save_segments(segments: list[medkit.core.audio.Segment], srt_file: str | pathlib.Path)

      
      Save segments representing turns into a .srt file.


      :Parameters:

          **segments** : list of Segment
              Turn segments to save.

          **srt_file** : str or Path
              Path of the generated .srt file.


      ..
          !! processed by numpydoc !!