medkit.text.preprocessing.char_replacer

Classes:

CharReplacer(output_label[, rules, name, uid])

Generic character replacer to be used as a pre-processing module

class CharReplacer(output_label, rules=None, name=None, uid=None)

Generic character replacer to be used as a pre-processing module

This module is non-destructive: it replaces selected single-character strings with the desired n-character strings. It keeps track of span modifications by creating a new text-bound annotation that records how its spans relate to the input text.

Parameters
  • output_label (str) – The output label of the created annotations

  • rules (Optional[List[Tuple[str, str]]]) – The list of replacement rules. Default: ALL_CHAR_RULES

  • name (Optional[str]) – Name describing the pre-processing module (defaults to the class name)

  • uid (Optional[str]) – Identifier of the pre-processing module
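
The replacement behaviour described above can be illustrated with a minimal plain-Python sketch. It does not use medkit and is not the library's implementation: `EXAMPLE_RULES` and `replace_chars` are hypothetical names chosen for illustration, and the rules shown are not the contents of ALL_CHAR_RULES. It assumes each rule is a (single character, replacement string) pair, as the `List[Tuple[str, str]]` type suggests, and records where each replaced character sat in the original text so that the input can be left untouched.

```python
from typing import List, Tuple

# Hypothetical rules for illustration only (not medkit's ALL_CHAR_RULES).
EXAMPLE_RULES: List[Tuple[str, str]] = [
    ("\u0153", "oe"),      # "oe" ligature
    ("\u2116", "numero"),  # numero sign
]


def replace_chars(
    text: str, rules: List[Tuple[str, str]]
) -> Tuple[str, List[Tuple[int, int, int, int]]]:
    """Apply 1-char -> n-char rules without modifying the input string.

    Returns the new text and, for each replaced character, a tuple
    (new_start, new_end, orig_start, orig_end) linking the replacement
    back to its position in the original text.
    """
    for old, _ in rules:
        if len(old) != 1:
            raise ValueError(f"Rule source must be a single character: {old!r}")
    mapping = dict(rules)

    pieces = []
    replaced = []
    new_pos = 0
    for orig_pos, char in enumerate(text):
        out = mapping.get(char, char)
        if char in mapping:
            replaced.append((new_pos, new_pos + len(out), orig_pos, orig_pos + 1))
        pieces.append(out)
        new_pos += len(out)
    return "".join(pieces), replaced


original = "c\u0153ur \u2116 3"
new_text, replaced = replace_chars(original, EXAMPLE_RULES)
```

In this sketch the bookkeeping list plays the role that the span information plays in the description above: the original string is never mutated, and every replaced region in the new text can be traced back to a one-character range in the input.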

Methods:

run(segments)

Run the module on a list of input segments and return a new list of segments

run(segments)

Run the module on a list of input segments and return a new list of segments

Parameters

segments (List[Segment]) – List of segments to process

Return type

List[Segment]

Returns

List[Segment] – List of new segments