medkit.io.srt
Classes:
SRTInputConverter | Convert .srt files containing transcription information into turn segments with transcription attributes.
SRTOutputConverter | Build .srt files containing transcription information from Segment objects.
- class SRTInputConverter(turn_segment_label='turn', transcription_attr_label='transcribed_text', converter_id=None)
Convert .srt files containing transcription information into turn segments with transcription attributes.
For each turn in a .srt file, a Segment will be created, with an associated Attribute holding the transcribed text as value. The segments can be retrieved directly or as part of an AudioDocument instance.
If a ProvTracer is set, provenance information will be added for each segment and each attribute (referencing the input converter as the operation).
- Parameters
turn_segment_label (str) – Label to use for segments representing turns in the .srt file.
transcription_attr_label (str) – Label to use for segment attributes containing the transcribed text.
converter_id (Optional[str]) – Identifier of the converter.
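To make the notion of a "turn" concrete, the following standard-library sketch shows how .srt content breaks down into timed entries. It is not medkit's implementation: `Turn` and `parse_srt` are hypothetical names, and the sketch assumes the common SubRip layout (an index line, a timing line, one or more text lines, then a blank line).

```python
import re
from typing import List, NamedTuple


class Turn(NamedTuple):
    """Hypothetical container for one .srt entry (not a medkit class)."""

    start: float  # start time in seconds
    end: float  # end time in seconds
    text: str  # transcribed text of the turn


# Matches a timing line such as "00:00:01,500 --> 00:00:04,000"
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _to_seconds(h: str, m: str, s: str, ms: str) -> float:
    """Convert the captured timestamp fields to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(content: str) -> List[Turn]:
    """Split .srt text into turns; entries are separated by blank lines."""
    turns = []
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.strip().splitlines()
        # Locate the timing line; everything after it is the turn's text
        for i, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                groups = match.groups()
                start = _to_seconds(*groups[:4])
                end = _to_seconds(*groups[4:])
                text = "\n".join(lines[i + 1:]).strip()
                turns.append(Turn(start, end, text))
                break
    return turns
```

Each `Turn` here plays the role that a Segment with a transcription Attribute plays in the converter: a time span plus the text spoken during it.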
Attributes:
description | Contains all the input converter init parameters.
Methods:
load(srt_dir[, audio_dir, audio_ext]) | Load all .srt files in a directory into a list of AudioDocument objects.
load_doc(srt_file, audio_file) | Load a single .srt file into an AudioDocument containing turn segments with transcription attributes.
load_segments(srt_file, audio_file) | Load a .srt file and return a list of Segment objects corresponding to turns, with transcription attributes.
set_prov_tracer(prov_tracer) | Enable provenance tracing.
- property description: medkit.core.operation_desc.OperationDescription
Contains all the input converter init parameters.
- Return type
OperationDescription
- set_prov_tracer(prov_tracer)
Enable provenance tracing.
- Parameters
prov_tracer (ProvTracer) – The provenance tracer used to trace the provenance.
- load(srt_dir, audio_dir=None, audio_ext='.wav')
Load all .srt files in a directory into a list of AudioDocument objects.
For each .srt file, there must be a corresponding audio file with the same basename, either in the same directory or in a separate audio directory.
- Parameters
srt_dir (Union[str, Path]) – Directory containing the .srt files.
audio_dir (Union[str, Path, None]) – Directory containing the audio files corresponding to the .srt files, if they are not in srt_dir.
audio_ext (str) – File extension to use for audio files.
- Return type
List[AudioDocument]
- Returns
List[AudioDocument] – List of generated documents.
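The basename rule described for load can be illustrated with a short pathlib sketch. This is only an illustration of the matching convention, not medkit's code: `pair_srt_with_audio` is a hypothetical helper, and raising `FileNotFoundError` for a missing audio file is an assumption made here (the documentation does not say how the converter reacts).

```python
from pathlib import Path
from typing import List, Optional, Tuple, Union


def pair_srt_with_audio(
    srt_dir: Union[str, Path],
    audio_dir: Optional[Union[str, Path]] = None,
    audio_ext: str = ".wav",
) -> List[Tuple[Path, Path]]:
    """Hypothetical helper: pair each .srt file with the audio file sharing its basename."""
    srt_dir = Path(srt_dir)
    # Look in the .srt directory itself when no separate audio directory is given
    audio_dir = Path(audio_dir) if audio_dir is not None else srt_dir
    pairs = []
    for srt_file in sorted(srt_dir.glob("*.srt")):
        # "interview.srt" is expected to match "interview.wav" (same basename)
        audio_file = audio_dir / (srt_file.stem + audio_ext)
        if not audio_file.exists():
            # Assumption: treat a missing audio file as an error
            raise FileNotFoundError(f"No audio file found for {srt_file.name}")
        pairs.append((srt_file, audio_file))
    return pairs
```

Running a check like this before calling load can surface mismatched basenames or a wrong audio_ext early.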
- load_doc(srt_file, audio_file)
Load a single .srt file into an AudioDocument containing turn segments with transcription attributes.
- Parameters
srt_file (Union[str, Path]) – Path to the .srt file.
audio_file (Union[str, Path]) – Path to the corresponding audio file.
- Return type
AudioDocument
- Returns
AudioDocument – Generated document.
- class SRTOutputConverter(segment_turn_label='turn', transcription_attr_label='transcribed_text')
Build .srt files containing transcription information from Segment objects.
There must be a segment for each turn, with an associated Attribute holding the transcribed text as value. The segments can be passed directly or as part of AudioDocument instances.
- Parameters
segment_turn_label (str) – Label of segments representing turns in the audio documents.
transcription_attr_label (str) – Label of segment attributes containing the transcribed text.
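As a rough picture of the output side, the sketch below renders a list of timed turns as .srt text using only the standard library. It is not medkit's implementation: `format_timestamp` and `build_srt` are hypothetical names, and turns are represented as plain `(start, end, text)` tuples in seconds rather than Segment objects.

```python
from typing import Iterable, Tuple


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as the .srt timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(turns: Iterable[Tuple[float, float, str]]) -> str:
    """Hypothetical helper: render (start, end, text) turns as .srt text."""
    blocks = []
    # .srt entries are numbered from 1 and separated by a blank line
    for index, (start, end, text) in enumerate(turns, start=1):
        blocks.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)
```

The one-entry-per-turn structure mirrors the requirement above that there be one segment per turn, each carrying its transcribed text.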
Methods:
save(docs, srt_dir[, doc_names]) | Save AudioDocument instances as .srt files in a directory.
save_doc(doc, srt_file) | Save a single AudioDocument as a .srt file.
save_segments(segments, srt_file) | Save Segment objects representing turns into a .srt file.
- save(docs, srt_dir, doc_names=None)
Save AudioDocument instances as .srt files in a directory.
- Parameters
docs (List[AudioDocument]) – List of audio documents to save.
srt_dir – Directory into which the generated .srt files will be stored.
doc_names (Optional[List[str]]) – Optional list of names to use as basenames for the generated .srt files.
- save_doc(doc, srt_file)
Save a single AudioDocument as a .srt file.
- Parameters
doc (AudioDocument) – Audio document to save.
srt_file (Union[str, Path]) – Path of the generated .srt file.