medkit.text.spacy.edsnlp

This package needs extra dependencies that are not installed with the core dependencies of medkit. To install them, use pip install medkit[edsnlp].

Classes:

EDSNLPDocPipeline(nlp[, medkit_labels_anns, ...])

DocPipeline to obtain annotations created using EDS-NLP

EDSNLPPipeline(nlp[, spacy_entities, ...])

Segment annotator relying on an EDS-NLP pipeline

Functions:

build_adicap_attribute(spacy_span, spacy_label)

Build a medkit ADICAP normalization attribute from an EDS-NLP attribute with an ADICAP object as value.

build_date_attribute(spacy_span, spacy_label)

Build a medkit date attribute from an EDS-NLP attribute with a date object as value.

build_duration_attribute(spacy_span, spacy_label)

Build a medkit duration attribute from an EDS-NLP attribute with a duration object as value.

build_measurement_attribute(spacy_span, ...)

Build a medkit attribute from an EDS-NLP attribute with a measurement object as value.

build_tnm_attribute(spacy_span, spacy_label)

Build a medkit TNM attribute from an EDS-NLP attribute with a TNM object as value.

Data:

DEFAULT_ATTRIBUTE_FACTORIES

Pre-defined attribute factories to handle EDS-NLP attributes

class EDSNLPPipeline(nlp, spacy_entities=None, spacy_span_groups=None, spacy_attrs=None, medkit_attribute_factories=None, name=None, uid=None)

Segment annotator relying on an EDS-NLP pipeline

Initialize the segment annotator

Parameters:
  • nlp (Language) – spacy Language object with the loaded pipeline

  • spacy_entities (list of str, optional) – Labels of new spacy entities (doc.ents) to convert into medkit entities. If None (default), all the new spacy entities will be converted.

  • spacy_span_groups (list of str, optional) – Names of new spacy span groups (doc.spans) to convert into medkit segments. If None (default), all the new spacy span groups will be converted.

  • spacy_attrs (list of str, optional) – Names of span extensions to convert into medkit attributes. If None (default), all non-redundant EDS-NLP attributes will be handled.

  • medkit_attribute_factories (dict of str to Callable, optional) – Mapping of factories in charge of converting spacy attributes to medkit attributes. Factories receive a spacy span and an attribute label when called. The key in the mapping is the attribute label. Pre-defined default factories are listed in DEFAULT_ATTRIBUTE_FACTORIES.

  • name (str, optional) – Name describing the pipeline (defaults to the class name).

  • uid (str, optional) – Identifier of the pipeline
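The factory contract described above (a callable that receives a spacy span and an attribute label, registered under that label) can be sketched without medkit or spacy installed. In this minimal illustration, a SimpleNamespace stands in for a real spacy span, and build_severity_attribute and the "severity" label are hypothetical. The factory returns a plain dict where a real factory would return a medkit Attribute. The convert_attributes helper only illustrates label-based dispatch; it is an assumption, not medkit's internal code.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict, List


def build_severity_attribute(spacy_span: Any, spacy_label: str) -> Dict[str, Any]:
    """Hypothetical factory: read the extension named `spacy_label` on the span."""
    # On a real spacy span, custom extensions are exposed under `span._`
    value = getattr(spacy_span._, spacy_label)
    # A real factory would return a medkit Attribute; a dict keeps this sketch standalone
    return {"label": spacy_label, "value": str(value).lower()}


def convert_attributes(
    spacy_span: Any,
    labels: List[str],
    factories: Dict[str, Callable[[Any, str], Any]],
) -> List[Any]:
    """Illustrative dispatch: call the factory registered for each label, if any."""
    return [factories[label](spacy_span, label) for label in labels if label in factories]


# The key in the mapping is the attribute label
factories = {"severity": build_severity_attribute}

# Stand-in for a spacy span carrying a "severity" extension
fake_span = SimpleNamespace(_=SimpleNamespace(severity="HIGH"))
attributes = convert_attributes(fake_span, ["severity", "unknown"], factories)
```

With medkit installed, a mapping of this shape would be passed as medkit_attribute_factories, with each factory returning a medkit Attribute instead of a dict.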

Attributes:

description

Contains all the operation init parameters.

Methods:

run(segments)

Run a spacy pipeline on a list of input segments and return a new list of segments.

set_prov_tracer(prov_tracer)

Enable provenance tracing.

property description: OperationDescription

Contains all the operation init parameters.

Return type:

OperationDescription

run(segments)

Run a spacy pipeline on a list of input segments and return a new list of segments. Each segment is converted to a spacy document (Doc object). The spacy pipeline is then executed and, finally, the new annotations and attributes are converted into medkit annotations.

Parameters:

segments (list of Segment) – List of segments on which to run the spacy pipeline

Return type:

list[Segment]

Returns:

list of Segment – List of new annotations
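A minimal sketch of how run might be called on the raw segment of a document. It assumes the edsnlp extras are installed, that the installed EDS-NLP version registers the "eds" language and the "eds.dates" component with spacy, and that TextDocument is importable from medkit.core.text. The function name annotate_with_edsnlp is hypothetical.

```python
def annotate_with_edsnlp(text: str):
    """Run a small EDS-NLP pipeline on `text` and return the new medkit segments."""
    # Imports stay inside the function so this module loads even without the extras
    import spacy
    from medkit.core.text import TextDocument
    from medkit.text.spacy.edsnlp import EDSNLPPipeline

    # Assumption: "eds" and "eds.dates" are available in the installed EDS-NLP version
    nlp = spacy.blank("eds")
    nlp.add_pipe("eds.dates")

    # The raw segment covers the full text of the document
    doc = TextDocument(text=text)
    pipeline = EDSNLPPipeline(nlp)

    # run() takes a list of segments and returns a list of new segments
    return pipeline.run([doc.raw_segment])
```

Calling annotate_with_edsnlp on a short clinical sentence containing a date should return the segments created from the EDS-NLP output, each carrying the attributes built by the registered factories.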

set_prov_tracer(prov_tracer)

Enable provenance tracing.

Parameters:

prov_tracer (ProvTracer) – The provenance tracer used to trace the provenance.

class EDSNLPDocPipeline(nlp, medkit_labels_anns=None, medkit_attrs=None, spacy_entities=None, spacy_span_groups=None, spacy_attrs=None, medkit_attribute_factories=None, name=None, uid=None)

DocPipeline to obtain annotations created using EDS-NLP

Initialize the pipeline

Parameters:
  • nlp (Language) – spacy Language object with the loaded pipeline

  • medkit_labels_anns (list of str, optional) – Labels of medkit annotations to include in the spacy document. If None (default), all the annotations will be included.

  • medkit_attrs (list of str, optional) – Labels of medkit attributes to add to the included annotations. If None (default), all the attributes will be added as custom attributes on each included annotation.

  • spacy_entities (list of str, optional) – Labels of new spacy entities (doc.ents) to convert into medkit entities. If None (default), all the new spacy entities will be converted and added to their original medkit document.

  • spacy_span_groups (list of str, optional) – Names of new spacy span groups (doc.spans) to convert into medkit segments. If None (default), all the new spacy span groups will be converted and added to their original medkit document.

  • spacy_attrs (list of str, optional) – Names of span extensions to convert into medkit attributes. If None (default), all non-redundant EDS-NLP attributes will be handled.

  • medkit_attribute_factories (dict of str to Callable, optional) – Mapping of factories in charge of converting spacy attributes to medkit attributes. Factories receive a spacy span and an attribute label when called. The key in the mapping is the attribute label. Pre-defined default factories are listed in DEFAULT_ATTRIBUTE_FACTORIES.

  • name (str, optional) – Name describing the pipeline (defaults to the class name).

  • uid (str, optional) – Identifier of the pipeline

Attributes:

description

Contains all the operation init parameters.

Methods:

run(medkit_docs)

Run a spacy pipeline on a list of medkit documents.

set_prov_tracer(prov_tracer)

Enable provenance tracing.

property description: OperationDescription

Contains all the operation init parameters.

Return type:

OperationDescription

run(medkit_docs)

Run a spacy pipeline on a list of medkit documents. Each medkit document is converted to a spacy document (Doc object) with the selected annotations and attributes. The spacy pipeline is then executed and, finally, the new annotations and attributes are converted into medkit annotations.

Parameters:

medkit_docs (list of TextDocument) – List of TextDocuments on which to run the pipeline

Return type:

None
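Because run returns None, the results have to be read back from the documents themselves. The following is a minimal sketch under the same assumptions as for any EDS-NLP setup (extras installed, "eds" language and "eds.dates" component registered by the installed EDS-NLP version). The function name annotate_documents is hypothetical.

```python
from typing import List


def annotate_documents(texts: List[str]):
    """Build medkit documents from `texts` and annotate them in place with EDS-NLP."""
    # Imports stay inside the function so this module loads even without the extras
    import spacy
    from medkit.core.text import TextDocument
    from medkit.text.spacy.edsnlp import EDSNLPDocPipeline

    # Assumption: "eds" and "eds.dates" are available in the installed EDS-NLP version
    nlp = spacy.blank("eds")
    nlp.add_pipe("eds.dates")

    docs = [TextDocument(text=text) for text in texts]

    # run() returns None: new annotations are attached to each input document
    EDSNLPDocPipeline(nlp).run(docs)
    return docs
```

After the call, the new annotations should be reachable through each document's annotation container (doc.anns) rather than through a return value.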

set_prov_tracer(prov_tracer)

Enable provenance tracing.

Parameters:

prov_tracer (ProvTracer) – The provenance tracer used to trace the provenance.

build_date_attribute(spacy_span, spacy_label)

Build a medkit date attribute from an EDS-NLP attribute with a date object as value.

Parameters:
  • spacy_span (SpacySpan) – Spacy span having an EDS-NLP date attribute

  • spacy_label (str) – Label of the date attribute on spacy_span. Ex: “date”, “consultation_date”

Return type:

Attribute

Returns:

Attribute – DateAttribute or RelativeDateAttribute instance, depending on the EDS-NLP attribute

build_duration_attribute(spacy_span, spacy_label)

Build a medkit duration attribute from an EDS-NLP attribute with a duration object as value.

Parameters:
  • spacy_span (SpacySpan) – Spacy span having an EDS-NLP duration attribute

  • spacy_label (str) – Label of the duration attribute on spacy_span. Ex: “duration”

Return type:

DurationAttribute

Returns:

DurationAttribute – Medkit duration attribute

build_adicap_attribute(spacy_span, spacy_label)

Build a medkit ADICAP normalization attribute from an EDS-NLP attribute with an ADICAP object as value.

Parameters:
  • spacy_span (SpacySpan) – Spacy span having an ADICAP object as value

  • spacy_label (str) – Label of the attribute on spacy_span. Ex: “adicap”

Return type:

ADICAPNormAttribute

Returns:

ADICAPNormAttribute – Medkit ADICAP normalization attribute

build_tnm_attribute(spacy_span, spacy_label)

Build a medkit TNM attribute from an EDS-NLP attribute with a TNM object as value.

Parameters:
  • spacy_span (SpacySpan) – Spacy span having a TNM object as value

  • spacy_label (str) – Label of the attribute on spacy_span. Ex: “tnm”

Return type:

TNMAttribute

Returns:

TNMAttribute – Medkit TNM attribute

build_measurement_attribute(spacy_span, spacy_label)

Build a medkit attribute from an EDS-NLP attribute with a measurement object as value.

Parameters:
  • spacy_span (SpacySpan) – Spacy span having a measurement object as value

  • spacy_label (str) – Label of the attribute on spacy_span. Ex: “size”, “weight”, “bmi”

Return type:

Attribute

Returns:

Attribute – Medkit attribute with normalized measurement value and “unit” metadata

DEFAULT_ATTRIBUTE_FACTORIES = {'adicap': <function build_adicap_attribute>, 'bmi': <function build_measurement_attribute>, 'consultation_date': <function build_date_attribute>, 'date': <function build_date_attribute>, 'duration': <function build_duration_attribute>, 'size': <function build_measurement_attribute>, 'tnm': <function build_tnm_attribute>, 'volume': <function build_measurement_attribute>, 'weight': <function build_measurement_attribute>}

Pre-defined attribute factories to handle EDS-NLP attributes
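To add a factory for a new label while keeping the pre-defined ones, one option is to build a merged copy of this mapping. This page does not say whether a custom medkit_attribute_factories argument is combined with the defaults or replaces them, so the sketch below merges explicitly. The helper name extend_default_factories and its defaults parameter are hypothetical; the parameter exists only so the helper can be tried without medkit installed.

```python
from typing import Any, Callable, Dict, Optional

# A factory receives a spacy span and an attribute label
Factory = Callable[[Any, str], Any]


def extend_default_factories(
    custom: Dict[str, Factory],
    defaults: Optional[Dict[str, Factory]] = None,
) -> Dict[str, Factory]:
    """Return a new mapping with `custom` factories layered over the defaults."""
    if defaults is None:
        # Imported lazily so the helper can be defined without the extras installed
        from medkit.text.spacy.edsnlp import DEFAULT_ATTRIBUTE_FACTORIES

        defaults = DEFAULT_ATTRIBUTE_FACTORIES
    # Build a new dict so the module-level defaults are never mutated;
    # entries in `custom` win when a label appears in both mappings
    return {**defaults, **custom}
```

The returned mapping could then be passed as medkit_attribute_factories when creating EDSNLPPipeline or EDSNLPDocPipeline.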