medkit.core.audio
=================

.. py:module:: medkit.core.audio


Submodules
----------

.. toctree::
   :maxdepth: 1

   /reference/api/medkit/core/audio/annotation/index
   /reference/api/medkit/core/audio/annotation_container/index
   /reference/api/medkit/core/audio/audio_buffer/index
   /reference/api/medkit/core/audio/document/index
   /reference/api/medkit/core/audio/operation/index
   /reference/api/medkit/core/audio/span/index


Classes
-------

.. autoapisummary::

   medkit.core.audio.Segment
   medkit.core.audio.AudioAnnotationContainer
   medkit.core.audio.AudioBuffer
   medkit.core.audio.FileAudioBuffer
   medkit.core.audio.MemoryAudioBuffer
   medkit.core.audio.AudioDocument
   medkit.core.audio.PreprocessingOperation
   medkit.core.audio.SegmentationOperation
   medkit.core.audio.Span


Package Contents
----------------

.. py:class:: Segment(label: str, audio: medkit.core.audio.audio_buffer.AudioBuffer, span: medkit.core.audio.span.Span, attrs: list[medkit.core.attribute.Attribute] | None = None, metadata: dict[str, Any] | None = None, uid: str | None = None)

   Bases: :py:obj:`medkit.core.dict_conv.SubclassMapping`


   
   Audio segment referencing part of an :class:`~.core.audio.AudioDocument`.



   :Attributes:

       **uid: str**
           Unique identifier of the segment.

       **label: str**
           Label of the segment.

       **audio: AudioBuffer**
           The audio signal of the segment. It must be consistent with the span:
           it must correspond to the audio signal of the document between the span
           boundaries, although it can be a modified, processed version of that
           signal.

       **span: Span**
           Span (in seconds) indicating the part of the document's full signal that
           this segment references.

       **attrs: AttributeContainer**
           Attributes of the segment. Stored in an
           :class:`~medkit.core.AttributeContainer` but can be passed as a list at
           init.

       **metadata: dict of str to Any**
           Metadata of the segment.

       **keys: set of str**
           Pipeline output keys to which the annotation belongs.













   ..
       !! processed by numpydoc !!

   .. py:attribute:: uid
      :type:  str


   .. py:attribute:: label
      :type:  str


   .. py:attribute:: audio
      :type:  medkit.core.audio.audio_buffer.AudioBuffer


   .. py:attribute:: span
      :type:  medkit.core.audio.span.Span


   .. py:attribute:: attrs
      :type:  medkit.core.attribute_container.AttributeContainer


   .. py:attribute:: metadata
      :type:  dict[str, Any]


   .. py:attribute:: keys
      :type:  set[str]


   .. py:method:: __init_subclass__()
      :classmethod:



   .. py:method:: to_dict() -> dict[str, Any]


   .. py:method:: from_dict(data: dict[str, Any]) -> Segment
      :classmethod:


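The consistency requirement between ``audio`` and ``span`` can be illustrated with a small standalone check. This is only a sketch and does not use medkit: ``is_consistent`` and its ``tolerance`` parameter are hypothetical names, and the default tolerance of one sample period is an assumption, not something the library is known to enforce.

```python
from typing import Optional


def is_consistent(nb_samples: int, sample_rate: int,
                  span_start: float, span_end: float,
                  tolerance: Optional[float] = None) -> bool:
    """Check that a buffer's duration matches the length of a span (in seconds)."""
    if tolerance is None:
        # Assumption: allow a mismatch of up to one sample period
        tolerance = 1.0 / sample_rate
    audio_duration = nb_samples / sample_rate
    span_length = span_end - span_start
    return abs(audio_duration - span_length) <= tolerance


# 8000 samples at 16 kHz last 0.5 s, matching a span from 1.0 s to 1.5 s
print(is_consistent(8000, 16000, 1.0, 1.5))
# The same buffer does not match a 2-second span
print(is_consistent(8000, 16000, 1.0, 3.0))
```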

.. py:class:: AudioAnnotationContainer(doc_id: str, raw_segment: medkit.core.audio.annotation.Segment)

   Bases: :py:obj:`medkit.core.annotation_container.AnnotationContainer`\ [\ :py:obj:`medkit.core.audio.annotation.Segment`\ ]


   
   Manage a list of audio annotations belonging to an audio document.

   This behaves more or less like a list: calling `len()` and iterating are
   supported. Additional filtering is available through the `get()` method.

   Also handles the document's raw segment.















   ..
       !! processed by numpydoc !!

   .. py:attribute:: raw_segment


   .. py:method:: add(ann: medkit.core.audio.annotation.Segment)

      
      Attach an annotation to the document.


      :Parameters:

          **ann** : AnnotationType
              Annotation to add.







      :Raises:

          ValueError
              If the annotation is already attached to the document
              (based on `annotation.uid`).







      ..
          !! processed by numpydoc !!


   .. py:method:: get(*, label: str | None = None, key: str | None = None) -> list[medkit.core.audio.annotation.Segment]

      
      Return a list of the annotations of the document.


      :Parameters:

          **label** : str, optional
              Label to use to filter annotations.

          **key** : str, optional
              Key to use to filter annotations.














      ..
          !! processed by numpydoc !!


   .. py:method:: get_by_id(uid) -> medkit.core.audio.annotation.Segment

      
      Return the annotation corresponding to a specific identifier.


      :Parameters:

          **uid** : str
              Identifier of the annotation to return.














      ..
          !! processed by numpydoc !!

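The list-like behaviour and the ``label``/``key`` filtering described above can be pictured with a minimal standalone container. This sketch does not reproduce medkit's implementation: ``ToySegment`` and ``ToyContainer`` are hypothetical stand-ins, and combining both filters with a logical AND is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ToySegment:
    """Hypothetical stand-in for a segment: only uid, label and keys."""
    uid: str
    label: str
    keys: Set[str] = field(default_factory=set)


class ToyContainer:
    """Hypothetical list-like container mimicking the documented interface."""

    def __init__(self) -> None:
        self._anns_by_id: Dict[str, ToySegment] = {}

    def add(self, ann: ToySegment) -> None:
        # Refuse an annotation whose uid is already present
        if ann.uid in self._anns_by_id:
            raise ValueError(f"Annotation {ann.uid} already attached")
        self._anns_by_id[ann.uid] = ann

    def get(self, *, label: Optional[str] = None,
            key: Optional[str] = None) -> List[ToySegment]:
        # Keep annotations matching every filter that was provided
        # (assumption: filters are combined with a logical AND)
        return [
            ann for ann in self._anns_by_id.values()
            if (label is None or ann.label == label)
            and (key is None or key in ann.keys)
        ]

    def get_by_id(self, uid: str) -> ToySegment:
        return self._anns_by_id[uid]

    def __len__(self) -> int:
        return len(self._anns_by_id)

    def __iter__(self):
        return iter(self._anns_by_id.values())


anns = ToyContainer()
anns.add(ToySegment("s1", "speech", {"vad_output"}))
anns.add(ToySegment("s2", "music"))
print(len(anns), [a.uid for a in anns.get(label="speech")])
```

Indexing annotations by ``uid`` keeps the duplicate check in ``add()`` and the lookup in ``get_by_id()`` cheap, which is one plausible reason to store them in a dict rather than a plain list.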

.. py:class:: AudioBuffer(sample_rate: int, nb_samples: int, nb_channels: int)

   Bases: :py:obj:`abc.ABC`, :py:obj:`medkit.core.dict_conv.SubclassMapping`


   
   Audio buffer base class. Gives access to raw audio samples.


   :Parameters:

       **sample_rate:**
           Sample rate of the signal, in samples per second.

       **nb_samples:**
           Duration of the signal in samples.

       **nb_channels:**
           Number of channels in the signal.














   ..
       !! processed by numpydoc !!

   .. py:attribute:: sample_rate


   .. py:attribute:: nb_samples


   .. py:attribute:: nb_channels


   .. py:property:: duration
      :type: float


      
      Duration of the signal in seconds.
















      ..
          !! processed by numpydoc !!


   .. py:method:: read(copy: bool = False) -> numpy.ndarray
      :abstractmethod:


      
      Return the signal in the audio buffer.


      :Parameters:

          **copy:**
              If `True`, the returned array will be a copy that can be safely mutated.



      :Returns:

          np.ndarray:
              Raw audio samples











      ..
          !! processed by numpydoc !!


   .. py:method:: trim(start: int | None, end: int | None) -> AudioBuffer
      :abstractmethod:


      
      Return the signal from the original buffer trimmed by start and end indexes.


      :Parameters:

          **start: int, optional**
              Start sample of the new buffer (defaults to `0`).

          **end: int, optional**
              End sample of the new buffer, excluded (defaults to full duration).



      :Returns:

          AudioBuffer:
              Trimmed audio buffer with new start and end samples,
              of same type as original audio buffer.











      ..
          !! processed by numpydoc !!


   .. py:method:: trim_duration(start_time: float | None = None, end_time: float | None = None) -> AudioBuffer

      
      Return the signal from the original buffer trimmed by start and end times.

      Return a new audio buffer pointing to a portion of the signal in the original buffer,
      using boundaries in seconds. Since `start_time` and `end_time` are in seconds, the exact
      trim boundaries will be rounded to the nearest sample and will therefore depend on the
      sampling rate.

      :Parameters:

          **start_time: float, optional**
              Start time of the new buffer (defaults to `0.0`).

          **end_time: float, optional**
              End time of the new buffer, excluded (defaults to full duration).



      :Returns:

          AudioBuffer:
              Trimmed audio buffer with new start and end samples,
              of same type as original audio buffer.











      ..
          !! processed by numpydoc !!


   .. py:method:: __init_subclass__()
      :classmethod:



   .. py:method:: from_dict(data_dict: dict[str, Any]) -> typing_extensions.Self
      :classmethod:



   .. py:method:: to_dict() -> dict[str, Any]
      :abstractmethod:



   .. py:method:: __eq__(other: object) -> bool
      :abstractmethod:


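The relationship between time boundaries and sample indexes used by ``trim_duration()`` can be sketched without medkit. The helper below is hypothetical (``time_to_sample_bounds`` is not a medkit function). It uses Python's built-in ``round()``, which rounds exact halves to the nearest even integer, and it clamps out-of-range values; both choices are assumptions that may differ from the library's behaviour.

```python
from typing import Optional, Tuple


def time_to_sample_bounds(sample_rate: int, nb_samples: int,
                          start_time: Optional[float] = None,
                          end_time: Optional[float] = None) -> Tuple[int, int]:
    """Convert boundaries in seconds to a half-open sample range [start, end)."""
    # Missing boundaries fall back to the beginning and the full duration
    start = 0 if start_time is None else round(start_time * sample_rate)
    end = nb_samples if end_time is None else round(end_time * sample_rate)
    # Assumption: clamp to the valid range instead of raising an error
    start = max(0, min(start, nb_samples))
    end = max(start, min(end, nb_samples))
    return start, end


# The same time boundaries map to different samples at different rates
print(time_to_sample_bounds(16000, 48000, 0.25, 1.0))
print(time_to_sample_bounds(8000, 24000, 0.25, 1.0))
# Duration in seconds is the number of samples divided by the sample rate
print(48000 / 16000)
```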

.. py:class:: FileAudioBuffer(path: str | pathlib.Path, trim_start: int | None = None, trim_end: int | None = None, sf_info: Any | None = None)

   Bases: :py:obj:`AudioBuffer`


   
   Audio buffer giving access to audio files stored on the filesystem.

   To be used when manipulating unmodified raw audio.

   Supports all file formats handled by `libsndfile`_.

   .. _libsndfile: http://www.mega-nerd.com/libsndfile/#Features

   :Parameters:

       **path: str or Path**
           Path to the audio file.

       **trim_start: int, optional**
           First sample of audio file to consider.

       **trim_end: int, optional**
           First sample of audio file to exclude.

       **sf_info: Any, optional**
           Optional metadata dict returned by soundfile.














   ..
       !! processed by numpydoc !!

   .. py:attribute:: path


   .. py:attribute:: _trim_end


   .. py:attribute:: _trim_start
      :value: 0



   .. py:attribute:: _sf_info
      :value: None



   .. py:method:: read(copy: bool = False) -> numpy.ndarray

      
      Return the signal in the audio buffer.


      :Parameters:

          **copy:**
              If `True`, the returned array will be a copy that can be safely mutated.



      :Returns:

          np.ndarray:
              Raw audio samples











      ..
          !! processed by numpydoc !!


   .. py:method:: trim(start: int | None = None, end: int | None = None) -> AudioBuffer

      
      Return the signal from the original buffer trimmed by start and end indexes.


      :Parameters:

          **start: int, optional**
              Start sample of the new buffer (defaults to `0`).

          **end: int, optional**
              End sample of the new buffer, excluded (defaults to full duration).



      :Returns:

          AudioBuffer:
              Trimmed audio buffer with new start and end samples,
              of same type as original audio buffer.











      ..
          !! processed by numpydoc !!


   .. py:method:: to_dict() -> dict[str, Any]


   .. py:method:: from_dict(data: dict[str, Any]) -> typing_extensions.Self
      :classmethod:



   .. py:method:: __eq__(other: object) -> bool


.. py:class:: MemoryAudioBuffer(signal: numpy.ndarray, sample_rate: int)

   Bases: :py:obj:`AudioBuffer`


   
   Audio buffer giving access to signals stored in memory.

   To be used for reading or writing a modified audio signal.

   :Parameters:

       **signal: ndarray**
           Samples constituting the audio signal, with shape `(nb_channels, nb_samples)`.

       **sample_rate: int**
           Sample rate of the signal, in samples per second.














   ..
       !! processed by numpydoc !!

   .. py:attribute:: _signal


   .. py:method:: read(copy: bool = False) -> numpy.ndarray

      
      Return the signal in the audio buffer.


      :Parameters:

          **copy:**
              If `True`, the returned array will be a copy that can be safely mutated.



      :Returns:

          np.ndarray:
              Raw audio samples











      ..
          !! processed by numpydoc !!


   .. py:method:: trim(start: int | None = None, end: int | None = None) -> AudioBuffer

      
      Return the signal from the original buffer trimmed by start and end indexes.


      :Parameters:

          **start: int, optional**
              Start sample of the new buffer (defaults to `0`).

          **end: int, optional**
              End sample of the new buffer, excluded (defaults to full duration).



      :Returns:

          AudioBuffer:
              Trimmed audio buffer with new start and end samples,
              of same type as original audio buffer.











      ..
          !! processed by numpydoc !!


   .. py:method:: to_dict() -> dict[str, Any]


   .. py:method:: from_dict(data: dict[str, Any]) -> typing_extensions.Self
      :classmethod:



   .. py:method:: __eq__(other: object) -> bool

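The ``(nb_channels, nb_samples)`` layout means that trimming acts on the sample axis of every channel. The sketch below illustrates this with nested lists so that it has no dependencies. medkit itself works with ``numpy.ndarray``, and ``trim_signal`` is a hypothetical helper, not part of the library.

```python
from typing import List, Optional

Signal = List[List[float]]  # one inner list of samples per channel


def trim_signal(signal: Signal, start: Optional[int] = None,
                end: Optional[int] = None) -> Signal:
    """Return the samples in [start, end) for every channel."""
    # Slicing with None keeps the corresponding boundary untouched
    return [channel[start:end] for channel in signal]


# Stereo signal: 2 channels of 6 samples each
stereo = [
    [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
    [1.0, 1.1, 1.2, 1.3, 1.4, 1.5],
]
trimmed = trim_signal(stereo, start=2, end=5)
print(len(trimmed), len(trimmed[0]))  # number of channels, number of samples
```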

.. py:class:: AudioDocument(audio: medkit.core.audio.audio_buffer.AudioBuffer, anns: Sequence[medkit.core.audio.annotation.Segment] | None = None, attrs: Sequence[medkit.core.Attribute] | None = None, metadata: dict[str, Any] | None = None, uid: str | None = None)

   Bases: :py:obj:`medkit.core.dict_conv.SubclassMapping`


   
   Document holding audio annotations.



   :Attributes:

       **uid: str**
           Unique identifier of the document.

       **audio: AudioBuffer**
           Audio buffer containing the entire signal of the document.

       **anns: :class:`~.audio.AudioAnnotationContainer`**
           Annotations of the document. Stored in an
           :class:`~.audio.AudioAnnotationContainer` but can be passed as a list at init.

       **attrs: :class:`~.core.AttributeContainer`**
           Attributes of the document. Stored in an
           :class:`~.core.AttributeContainer` but can be passed as a list at init.

       **metadata: dict of str to Any**
           Document metadata.

       **raw_segment: :class:`~.audio.Segment`**
           Auto-generated segment containing the full unprocessed document audio.













   ..
       !! processed by numpydoc !!

   .. py:attribute:: RAW_LABEL
      :type:  ClassVar[str]
      :value: 'RAW_AUDIO'


      
      Label used for the raw segment.
















      ..
          !! processed by numpydoc !!


   .. py:attribute:: uid
      :type:  str


   .. py:attribute:: anns
      :type:  medkit.core.audio.annotation_container.AudioAnnotationContainer


   .. py:attribute:: attrs
      :type:  medkit.core.AttributeContainer


   .. py:attribute:: metadata
      :type:  dict[str, Any]


   .. py:attribute:: raw_segment
      :type:  medkit.core.audio.annotation.Segment


   .. py:method:: _generate_raw_segment(audio: medkit.core.audio.audio_buffer.AudioBuffer, doc_id: str) -> medkit.core.audio.annotation.Segment
      :classmethod:



   .. py:property:: audio
      :type: medkit.core.audio.audio_buffer.AudioBuffer



   .. py:method:: __init_subclass__()
      :classmethod:



   .. py:method:: to_dict(with_anns: bool = True) -> dict[str, Any]


   .. py:method:: from_dict(data: dict[str, Any]) -> typing_extensions.Self
      :classmethod:



   .. py:method:: from_file(path: os.PathLike) -> typing_extensions.Self
      :classmethod:


      
      Create document from an audio file.


      :Parameters:

          **path: path-like**
              Path to the audio file. Supports all file formats handled by
              `libsndfile` (http://www.mega-nerd.com/libsndfile/#Features).



      :Returns:

          AudioDocument
              Audio document with signal of `path` as audio. The file path is
              included in the document metadata.











      ..
          !! processed by numpydoc !!


   .. py:method:: from_dir(path: os.PathLike, pattern: str = '*.wav') -> list[typing_extensions.Self]
      :classmethod:


      
      Create documents from audio files in a directory.


      :Parameters:

          **path: path-like**
              Path of the directory containing audio files.

          **pattern: str, default="*.wav"**
              Glob pattern to match audio files in `path`. Supports all file
              formats handled by `libsndfile`
              (http://www.mega-nerd.com/libsndfile/#Features).



      :Returns:

          List[AudioDocument]
              Audio documents with signal of each file as audio.











      ..
          !! processed by numpydoc !!

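The effect of the ``pattern`` argument of ``from_dir()`` can be previewed with the standard library alone. This is a sketch of glob matching in general, not of medkit's code: ``list_audio_files`` is a hypothetical helper, and sorting the matches is an assumption made only to keep the result deterministic.

```python
import tempfile
from pathlib import Path
from typing import List


def list_audio_files(path: Path, pattern: str = "*.wav") -> List[Path]:
    """Return the files in `path` matching the glob pattern, sorted by name."""
    # Assumption: sorting is only there to make the result deterministic
    return sorted(p for p in Path(path).glob(pattern) if p.is_file())


# Demonstrate on a temporary directory with empty placeholder files
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("a.wav", "b.wav", "notes.txt", "c.flac"):
        (Path(tmp_dir) / name).touch()
    wav_names = [p.name for p in list_audio_files(Path(tmp_dir))]
    flac_names = [p.name for p in list_audio_files(Path(tmp_dir), "*.flac")]

print(wav_names, flac_names)
```

With the default pattern only the ``.wav`` files are selected, so a directory holding other formats needs an explicit pattern such as ``"*.flac"``.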

.. py:class:: PreprocessingOperation(uid: str | None = None, name: str | None = None, **kwargs)

   Bases: :py:obj:`medkit.core.operation.Operation`


   
   Abstract operation for pre-processing segments.

   It uses a list of segments as input and produces a list of pre-processed
   segments. Each input segment will have a corresponding output segment.















   ..
       !! processed by numpydoc !!

   .. py:method:: run(segments: list[medkit.core.audio.annotation.Segment]) -> list[medkit.core.audio.annotation.Segment]
      :abstractmethod:


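The one-to-one contract of a pre-processing operation can be illustrated with a toy example. Everything below is hypothetical and independent of medkit: ``ToySegment``, ``ToyPreprocessing`` and ``Gain`` only mimic the shape of the documented ``run()`` method, and samples are plain lists rather than audio buffers.

```python
from abc import ABC, abstractmethod
from typing import List, NamedTuple


class ToySegment(NamedTuple):
    """Hypothetical stand-in for a segment: a label and a list of samples."""
    label: str
    samples: List[float]


class ToyPreprocessing(ABC):
    """Hypothetical base class: one output segment per input segment."""

    @abstractmethod
    def run(self, segments: List[ToySegment]) -> List[ToySegment]:
        ...


class Gain(ToyPreprocessing):
    """Multiply every sample by a constant factor."""

    def __init__(self, factor: float, output_label: str) -> None:
        self.factor = factor
        self.output_label = output_label

    def run(self, segments: List[ToySegment]) -> List[ToySegment]:
        # Build exactly one processed segment for each input segment
        return [
            ToySegment(self.output_label, [s * self.factor for s in seg.samples])
            for seg in segments
        ]


inputs = [ToySegment("raw", [0.5, -0.25]), ToySegment("raw", [1.0])]
outputs = Gain(factor=2.0, output_label="amplified").run(inputs)
print(len(outputs) == len(inputs), outputs[0].samples)
```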

.. py:class:: SegmentationOperation(uid: str | None = None, name: str | None = None, **kwargs)

   Bases: :py:obj:`medkit.core.operation.Operation`


   
   Abstract operation for segmenting audio.

   It uses a list of segments as input and produces a list of new segments.
   Each input segment will have zero, one or more corresponding output
   segments.















   ..
       !! processed by numpydoc !!

   .. py:method:: run(segments: list[medkit.core.audio.annotation.Segment]) -> list[medkit.core.audio.annotation.Segment]
      :abstractmethod:


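The "zero, one or more" contract of a segmentation operation can be sketched by working on spans only. The code below is a hypothetical illustration that does not use medkit: ``ToySpan`` and ``split_fixed_length`` are invented names, and fixed-length chunking with a minimum length is just one possible segmentation strategy.

```python
from typing import List, NamedTuple


class ToySpan(NamedTuple):
    """Hypothetical stand-in for a span, in seconds."""
    start: float
    end: float


def split_fixed_length(spans: List[ToySpan], chunk: float,
                       min_length: float) -> List[ToySpan]:
    """Cut each span into chunks of at most `chunk` seconds, dropping short ones."""
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    outputs = []
    for span in spans:
        start = span.start
        while start < span.end:
            end = min(start + chunk, span.end)
            # Chunks shorter than min_length are discarded, so an input
            # span can yield zero output spans
            if end - start >= min_length:
                outputs.append(ToySpan(start, end))
            start = end
    return outputs


spans = [ToySpan(0.0, 5.0), ToySpan(10.0, 10.5), ToySpan(20.0, 22.0)]
chunks = split_fixed_length(spans, chunk=2.0, min_length=1.0)
print(chunks)
```

Here the first span yields three chunks, the second yields none because it is shorter than ``min_length``, and the third yields exactly one.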

.. py:class:: Span

   Bases: :py:obj:`NamedTuple`


   
   Boundaries of a slice of audio.



   :Attributes:

       **start: float**
           Starting point in the original audio, in seconds.

       **end: float**
           Ending point in the original audio, in seconds.













   ..
       !! processed by numpydoc !!

   .. py:attribute:: start
      :type:  float


   .. py:attribute:: end
      :type:  float


   .. py:property:: length

      
      Length of the span, in seconds.
















      ..
          !! processed by numpydoc !!


   .. py:method:: to_dict() -> dict[str, Any]


   .. py:method:: from_dict(data: dict[str, Any]) -> Span
      :classmethod:

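Because ``Span`` is a named tuple, it can be unpacked and compared like a plain tuple. The standalone sketch below re-creates the documented interface under a hypothetical name, ``ToySpan``. The layout of the dict returned by ``to_dict()`` is an assumption and may not match medkit's serialization format.

```python
from typing import Any, Dict, NamedTuple


class ToySpan(NamedTuple):
    """Hypothetical re-creation of the documented interface, in seconds."""
    start: float
    end: float

    @property
    def length(self) -> float:
        return self.end - self.start

    def to_dict(self) -> Dict[str, Any]:
        # Assumption: the dict simply mirrors the two fields
        return {"start": self.start, "end": self.end}

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "ToySpan":
        return cls(start=data["start"], end=data["end"])


span = ToySpan(start=1.0, end=2.5)
start, end = span  # tuple unpacking works as with any named tuple
print(span.length)
print(ToySpan.from_dict(span.to_dict()) == span)
```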


