medkit.text.preprocessing.normalizer
Classes:
Normalizer | Generic normalizer to be used as pre-processing module
NormalizerRule | Create new instance of NormalizerRule(pattern_to_replace, new_text)
- class Normalizer(output_label, rules=None, name=None, uid=None)
Generic normalizer to be used as pre-processing module
This module is non-destructive: it replaces selected characters with the desired characters without altering the input text. It keeps track of span modifications by creating a new text-bound annotation that carries the span modification information relative to the input text.
- Parameters
output_label (str) – The output label of the created annotations
rules (Optional[List[Tuple[str, str]]]) – The list of replacement rules
name (Optional[str]) – Name describing the pre-processing module (defaults to the class name)
uid (str) – Identifier of the pre-processing module
Methods:
run(segments) | Run the module on a list of segments provided as input and return a new list of segments
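The shape of this interface (an output label, a list of `(pattern, new_text)` rules, and a `run` method that maps a list of segments to a new list) can be sketched with stand-in classes. `SketchSegment` and `SketchNormalizer` below are hypothetical and are not medkit classes. The sketch assumes rules are regular expressions applied in list order, and it omits `uid` and span tracking.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SketchSegment:
    """Hypothetical stand-in for a segment: a label plus its text."""
    label: str
    text: str


class SketchNormalizer:
    """Illustrative only: mirrors the documented parameters, not the real class."""

    def __init__(
        self,
        output_label: str,
        rules: Optional[List[Tuple[str, str]]] = None,
        name: Optional[str] = None,
    ):
        self.output_label = output_label
        self.rules = list(rules) if rules is not None else []
        # As documented, the name defaults to the class name.
        self.name = name if name is not None else type(self).__name__

    def run(self, segments: List[SketchSegment]) -> List[SketchSegment]:
        """Return new segments labelled `output_label`; inputs are left untouched."""
        normalized = []
        for segment in segments:
            text = segment.text
            for pattern, new_text in self.rules:
                # A function replacement keeps `new_text` literal (no backslash escapes).
                text = re.sub(pattern, lambda _match, repl=new_text: repl, text)
            normalized.append(SketchSegment(label=self.output_label, text=text))
        return normalized


raw = [SketchSegment(label="raw", text="dose:  5mg"), SketchSegment(label="raw", text="ok")]
module = SketchNormalizer(output_label="clean", rules=[(r" {2,}", " ")])
print(module.run(raw))
```

Returning fresh segments rather than editing the inputs is what lets later modules work on the normalized text while the raw segments remain available.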