medkit.text.segmentation.tokenizer_utils

Functions:

`lstrip`(text[, start, chars])	Returns a copy of the string with leading characters removed and its corresponding new start index.
`rstrip`(text[, end, chars])	Returns a copy of the string with trailing characters removed and its corresponding new end index.
`strip`(text[, start, chars])	Returns a copy of the string with leading and trailing characters removed and its corresponding new start and end indexes.

lstrip(text, start=0, chars=None)

Returns a copy of the string with leading characters removed and its corresponding new start index.

Parameters

text (str) – The text to strip.
start (int) – The start index of text within the original text, if text is a span of a larger text.
chars (Optional[str]) – A string containing the characters to strip. Default behaviour is like str.lstrip([chars]).

Return type

Tuple[str, int]
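
The behaviour described above can be illustrated with a minimal sketch. This is not medkit's implementation: the name `lstrip_with_index` is hypothetical, and the sketch assumes the new start index is the given start plus the number of characters removed from the left.

```python
from typing import Optional, Tuple


def lstrip_with_index(
    text: str, start: int = 0, chars: Optional[str] = None
) -> Tuple[str, int]:
    """Hypothetical stand-in for lstrip, built on str.lstrip."""
    stripped = text.lstrip(chars)
    # Assumption: the start index shifts right by the number of removed characters.
    new_start = start + (len(text) - len(stripped))
    return stripped, new_start


# A span "  hello" that begins at offset 10 of a larger text.
result = lstrip_with_index("  hello", start=10)
print(result)  # ('hello', 12)
```

Returning the shifted index alongside the stripped text lets a caller keep character offsets aligned with the original text after whitespace is removed.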

rstrip(text, end=None, chars=None)

Returns a copy of the string with trailing characters removed and its corresponding new end index.

Parameters

text (str) – The text to strip.
end (Optional[int]) – The end index of text within the original text, if text is a span of a larger text.
chars (Optional[str]) – A string containing the characters to strip. Default behaviour is like str.rstrip([chars]).

Return type

Tuple[str, int]
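
A minimal sketch of the trailing-side behaviour follows. This is not medkit's implementation: the name `rstrip_with_index` is hypothetical, and the sketch assumes that when end is None it falls back to the length of text, which the documentation above does not state.

```python
from typing import Optional, Tuple


def rstrip_with_index(
    text: str, end: Optional[int] = None, chars: Optional[str] = None
) -> Tuple[str, int]:
    """Hypothetical stand-in for rstrip, built on str.rstrip."""
    if end is None:
        # Assumption: with no end given, treat text as the whole original text.
        end = len(text)
    stripped = text.rstrip(chars)
    # Assumption: the end index shifts left by the number of removed characters.
    new_end = end - (len(text) - len(stripped))
    return stripped, new_end


# A span "hello  " that ends at offset 17 of a larger text.
result = rstrip_with_index("hello  ", end=17)
print(result)  # ('hello', 15)
```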

strip(text, start=0, chars=None)

Returns a copy of the string with leading and trailing characters removed and its corresponding new start and end indexes.

Parameters

text (str) – The text to strip.
start (int) – The start index of text within the original text, if text is a span of a larger text.
chars (Optional[str]) – A string containing the characters to strip. Default behaviour is like str.strip([chars]).

Return type

Tuple[str, int, int]
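
A minimal sketch of the two-sided case follows. This is not medkit's implementation: the name `strip_with_indexes` is hypothetical, and the sketch assumes the returned tuple is ordered (stripped text, new start, new end), with the new end computed as the new start plus the length of the stripped text.

```python
from typing import Optional, Tuple


def strip_with_indexes(
    text: str, start: int = 0, chars: Optional[str] = None
) -> Tuple[str, int, int]:
    """Hypothetical stand-in for strip, built on str.lstrip and str.rstrip."""
    left_stripped = text.lstrip(chars)
    # Assumption: the start index shifts right by the number of leading characters removed.
    new_start = start + (len(text) - len(left_stripped))
    stripped = left_stripped.rstrip(chars)
    # Assumption: the end index is exclusive and follows from the remaining length.
    new_end = new_start + len(stripped)
    return stripped, new_start, new_end


# A span "  hello  " that begins at offset 10 of a larger text.
result = strip_with_indexes("  hello  ", start=10)
print(result)  # ('hello', 12, 17)
```

Under these assumptions, slicing the original text with the returned indexes would yield the stripped string again, which is the property a tokenizer needs to keep spans consistent.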