medkit.text.spacy.edsnlp

This module requires extra dependencies that are not installed with the core dependencies of medkit. To install them, use pip install medkit[edsnlp].

Classes:

EDSNLPDocPipeline(nlp[, medkit_labels_anns, ...])

DocPipeline to obtain annotations created using EDS-NLP

EDSNLPPipeline(nlp[, spacy_entities, ...])

Segment annotator relying on an EDS-NLP pipeline

Functions:

build_context_attribute(spacy_span, spacy_label)

Build a medkit attribute from an EDS-NLP context/qualifying attribute, adding the cues as metadata

build_date_attribute(spacy_span, spacy_label)

Build a medkit date attribute from an EDS-NLP attribute with a date object as value.

build_history_attribute(spacy_span, spacy_label)

Build a medkit attribute from an EDS-NLP "history" attribute, adding the cues as metadata

build_score_attribute(spacy_span, spacy_label)

Build a medkit attribute from an EDS-NLP "score_name" and corresponding "score_value" attribute.

build_value_attribute(spacy_span, spacy_label)

Build a medkit attribute from an EDS-NLP "value" attribute with a custom object as value.

Data:

DEFAULT_ATTRIBUTE_FACTORIES

Pre-defined attribute factories to handle EDS-NLP attributes

class EDSNLPPipeline(nlp, spacy_entities=None, spacy_span_groups=None, spacy_attrs=None, medkit_attribute_factories=None, name=None, uid=None)

Segment annotator relying on an EDS-NLP pipeline

Initialize the segment annotator

Parameters
  • nlp (Language) – Language object with the loaded pipeline from Spacy

  • spacy_entities (Optional[List[str]]) – Labels of new spacy entities (doc.ents) to convert into medkit entities. If None (default), all the new spacy entities will be converted.

  • spacy_span_groups (Optional[List[str]]) – Names of new spacy span groups (doc.spans) to convert into medkit segments. If None (default), all the new spacy span groups will be converted.

  • spacy_attrs (Optional[List[str]]) – Names of span extensions to convert into medkit attributes. If None (default), all non-redundant EDS-NLP attributes will be handled.

  • medkit_attribute_factories (Optional[Dict[str, Callable[[Span, str], Attribute]]]) – Mapping of factories in charge of converting spacy attributes to medkit attributes. Factories will receive a spacy span and an attribute label when called. The key in the mapping is the attribute label. Pre-defined default factories are listed in DEFAULT_ATTRIBUTE_FACTORIES.

  • name (Optional[str]) – Name describing the pipeline (defaults to the class name).

  • uid (str) – Identifier of the pipeline
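The factory contract described for medkit_attribute_factories (a callable that receives a spacy span and an attribute label, stored under that label) can be sketched with stand-ins, so that it runs without medkit or spaCy installed. FakeAttribute, build_flag_attribute, convert_attributes and the SimpleNamespace span are hypothetical names invented for this illustration; a real factory would receive a genuine spaCy Span and return a medkit Attribute, and medkit's internal dispatch is only assumed to look similar.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Callable, Dict, List


@dataclass
class FakeAttribute:
    """Stand-in for a medkit Attribute (hypothetical, for illustration only)."""

    label: str
    value: Any


def build_flag_attribute(spacy_span, spacy_label: str) -> FakeAttribute:
    """Hypothetical factory: read the span extension named spacy_label as a boolean."""
    raw_value = getattr(spacy_span._, spacy_label)
    return FakeAttribute(label=spacy_label, value=bool(raw_value))


def convert_attributes(
    spacy_span, labels: List[str], factories: Dict[str, Callable]
) -> List[FakeAttribute]:
    """Look up the factory registered for each label and call it with (span, label)."""
    return [factories[label](spacy_span, label) for label in labels]


# Stand-in for a spaCy span: extensions are exposed under the `_` attribute
span = SimpleNamespace(_=SimpleNamespace(negation=1, my_custom_flag=0))

# The key of the mapping is the attribute label
factories = {"negation": build_flag_attribute, "my_custom_flag": build_flag_attribute}

attrs = convert_attributes(span, ["negation", "my_custom_flag"], factories)
```

With the real classes, a mapping of this shape is what would be passed as medkit_attribute_factories when creating the pipeline.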

Attributes:

description

Contains all the operation init parameters.

Methods:

run(segments)

Run a spacy pipeline on a list of segments provided as input and return a new list of segments.

set_prov_tracer(prov_tracer)

Enable provenance tracing.

property description: medkit.core.operation_desc.OperationDescription

Contains all the operation init parameters.

Return type

OperationDescription

run(segments)

Run a spacy pipeline on a list of segments provided as input and return a new list of segments. Each segment is converted to a spacy document (Doc object). The spacy pipeline is then executed, and finally the new annotations and attributes are converted into medkit annotations.

Parameters

segments (List[Segment]) – List of segments on which to run the spacy pipeline

Return type

List[Segment]

Returns

List[Segment] – List of new annotations
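A minimal usage sketch for run, assuming medkit was installed with the edsnlp extra. The blank "eds" language, the components added to it (eds.sentences, eds.dates) and the use of a document's raw segment as input are illustrative choices that depend on the installed EDS-NLP and medkit versions; this page does not prescribe them. The helper name annotate_raw_text is hypothetical, and imports are kept inside the function so that the definition loads even when the optional dependencies are missing.

```python
def annotate_raw_text(text: str):
    """Run an EDS-NLP pipeline on the raw segment of a document and return the new segments."""
    # Optional dependencies, imported lazily
    import spacy
    from medkit.core.text import TextDocument
    from medkit.text.spacy.edsnlp import EDSNLPPipeline

    # Assumed EDS-NLP setup: a blank "eds" pipeline with sentence and date detection
    nlp = spacy.blank("eds")
    nlp.add_pipe("eds.sentences")
    nlp.add_pipe("eds.dates")

    eds_pipeline = EDSNLPPipeline(nlp)

    doc = TextDocument(text=text)
    # run() takes a list of segments and returns a list of new segments
    return eds_pipeline.run([doc.raw_segment])
```

Which segments come back depends entirely on the components added to nlp and on the spacy_entities, spacy_span_groups and spacy_attrs filters.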

set_prov_tracer(prov_tracer)

Enable provenance tracing.

Parameters

prov_tracer (ProvTracer) – The provenance tracer used to trace the provenance.

class EDSNLPDocPipeline(nlp, medkit_labels_anns=None, medkit_attrs=None, spacy_entities=None, spacy_span_groups=None, spacy_attrs=None, medkit_attribute_factories=None, name=None, uid=None)

DocPipeline to obtain annotations created using EDS-NLP

Initialize the pipeline

Parameters
  • nlp (Language) – Language object with the loaded pipeline from Spacy

  • medkit_labels_anns (Optional[List[str]]) – Labels of medkit annotations to include in the spacy document. If None (default), all the annotations will be included.

  • medkit_attrs (Optional[List[str]]) – Labels of medkit attributes to add to the included annotations. If None (default), all the attributes will be added as custom attributes on each included annotation.

  • spacy_entities (Optional[List[str]]) – Labels of new spacy entities (doc.ents) to convert into medkit entities. If None (default), all the new spacy entities will be converted and added to their original medkit document.

  • spacy_span_groups (Optional[List[str]]) – Names of new spacy span groups (doc.spans) to convert into medkit segments. If None (default), all the new spacy span groups will be converted and added to their original medkit document.

  • spacy_attrs (Optional[List[str]]) – Names of span extensions to convert into medkit attributes. If None (default), all non-redundant EDS-NLP attributes will be handled.

  • medkit_attribute_factories (Optional[Dict[str, Callable[[Span, str], Attribute]]]) – Mapping of factories in charge of converting spacy attributes to medkit attributes. Factories will receive a spacy span and an attribute label when called. The key in the mapping is the attribute label. Pre-defined default factories are listed in DEFAULT_ATTRIBUTE_FACTORIES.

  • name (Optional[str]) – Name describing the pipeline (defaults to the class name).

  • uid (str) – Identifier of the pipeline

Attributes:

description

Contains all the operation init parameters.

Methods:

run(medkit_docs)

Run a spacy pipeline on a list of medkit documents.

set_prov_tracer(prov_tracer)

Enable provenance tracing.

property description: medkit.core.operation_desc.OperationDescription

Contains all the operation init parameters.

Return type

OperationDescription

run(medkit_docs)

Run a spacy pipeline on a list of medkit documents. Each medkit document is converted to a spacy document (Doc object) with the selected annotations and attributes. The spacy pipeline is then executed, and finally the new annotations and attributes are converted into medkit annotations.

Parameters

medkit_docs (List[TextDocument]) – List of TextDocuments on which to run the pipeline

Return type

None
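A minimal usage sketch for the document-level pipeline, assuming medkit was installed with the edsnlp extra. As with any EDS-NLP setup, the blank "eds" language and the components (eds.sentences, eds.dates) are illustrative and depend on the installed EDS-NLP version; annotate_documents is a hypothetical helper name. Because run returns None, the sketch returns the documents themselves, which are expected to carry the new annotations afterwards.

```python
def annotate_documents(texts):
    """Run an EDS-NLP pipeline on medkit documents; annotations are added in place."""
    # Optional dependencies, imported lazily
    import spacy
    from medkit.core.text import TextDocument
    from medkit.text.spacy.edsnlp import EDSNLPDocPipeline

    # Assumed EDS-NLP setup: a blank "eds" pipeline with sentence and date detection
    nlp = spacy.blank("eds")
    nlp.add_pipe("eds.sentences")
    nlp.add_pipe("eds.dates")

    docs = [TextDocument(text=text) for text in texts]

    doc_pipeline = EDSNLPDocPipeline(nlp)
    # run() returns None: new annotations are attached to each input document
    doc_pipeline.run(docs)
    return docs
```

Compared with EDSNLPPipeline, this variant fits when annotations should be stored directly on the documents, with no returned list of segments to handle.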

set_prov_tracer(prov_tracer)

Enable provenance tracing.

Parameters

prov_tracer (ProvTracer) – The provenance tracer used to trace the provenance.

build_date_attribute(spacy_span, spacy_label)

Build a medkit date attribute from an EDS-NLP attribute with a date object as value.

Parameters
  • spacy_span (Span) – Spacy span having an EDS-NLP date attribute

  • spacy_label (str) – Label of the date attribute on spacy_span. Ex: “date”, “consultation_date”

Return type

Attribute

Returns

Attribute – DateAttribute, RelativeDateAttribute or DurationAttribute instance, depending on the EDS-NLP attribute

build_value_attribute(spacy_span, spacy_label)

Build a medkit attribute from an EDS-NLP “value” attribute with a custom object as value:

  • if the value is an EDS-NLP Adicap object, an ADICAPNormAttribute instance is returned;

  • if the value is an EDS-NLP TNM object, a TNMAttribute instance is returned;

  • if the value is an EDS-NLP SimpleMeasurement object, an Attribute instance is returned.

Otherwise an error is raised.

Parameters
  • spacy_span (Span) – Spacy span having an attribute custom object as value

  • spacy_label (str) – Label of the attribute on spacy_span. Ex: “value”

Return type

Attribute

Returns

Attribute – Medkit attribute corresponding to the spacy attribute value

build_score_attribute(spacy_span, spacy_label)

Build a medkit attribute from an EDS-NLP “score_name” and corresponding “score_value” attribute.

Parameters
  • spacy_span (Span) – Spacy span having “score_name” and “score_value” attributes

  • spacy_label (str) – Must be “score_name”

Return type

Attribute

Returns

Attribute – Medkit attribute with “score_name” value as label and “score_value” value as value
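The mapping described here, where the “score_name” value becomes the label and the “score_value” value becomes the value, can be illustrated with a stand-in that needs neither medkit nor spaCy. sketch_score_attribute and the SimpleNamespace span are hypothetical; the sketch returns a plain (label, value) tuple rather than a medkit Attribute, and its check on spacy_label is this sketch's own choice, not confirmed behaviour of build_score_attribute.

```python
from types import SimpleNamespace


def sketch_score_attribute(spacy_span, spacy_label):
    """Stand-in for build_score_attribute: returns (label, value) instead of a medkit Attribute."""
    # Assumption of this sketch: reject any label other than "score_name"
    if spacy_label != "score_name":
        raise ValueError('spacy_label must be "score_name"')
    # The score name becomes the label, the score value becomes the value
    return (spacy_span._.score_name, spacy_span._.score_value)


# Stand-in for a spaCy span carrying the two EDS-NLP score extensions
span = SimpleNamespace(_=SimpleNamespace(score_name="charlson", score_value=2))

label, value = sketch_score_attribute(span, "score_name")
```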

build_context_attribute(spacy_span, spacy_label)

Build a medkit attribute from an EDS-NLP context/qualifying attribute, adding the cues as metadata

Parameters
  • spacy_span (Span) – Spacy span having a context/qualifying attribute

  • spacy_label (str) – Label of the attribute on spacy_span. Ex: “negation”, “hypothesis”, etc.

Return type

Attribute

Returns

Attribute – Medkit attribute corresponding to the spacy attribute

build_history_attribute(spacy_span, spacy_label)

Build a medkit attribute from an EDS-NLP “history” attribute, adding the cues as metadata

Parameters
  • spacy_span (Span) – Spacy span having a “history” attribute

  • spacy_label (str) – Must be “history”

Return type

Attribute

Returns

Attribute – Medkit attribute corresponding to the spacy attribute

DEFAULT_ATTRIBUTE_FACTORIES = {'consultation_date': <function build_date_attribute>, 'date': <function build_date_attribute>, 'family': <function build_context_attribute>, 'history': <function build_history_attribute>, 'hypothesis': <function build_context_attribute>, 'negation': <function build_context_attribute>, 'reported_speech': <function build_context_attribute>, 'score_name': <function build_score_attribute>, 'value': <function build_value_attribute>}

Pre-defined attribute factories to handle EDS-NLP attributes
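To see at a glance which labels share a factory, the default mapping can be inverted. The sketch below transcribes the mapping using factory names as plain strings (FACTORY_NAMES is a hypothetical stand-in, so no medkit import is needed); with medkit installed, the same grouping could presumably be built from DEFAULT_ATTRIBUTE_FACTORIES using each function's __name__.

```python
from collections import defaultdict

# Label -> factory name, transcribed from the default mapping (strings, not the real functions)
FACTORY_NAMES = {
    "consultation_date": "build_date_attribute",
    "date": "build_date_attribute",
    "family": "build_context_attribute",
    "history": "build_history_attribute",
    "hypothesis": "build_context_attribute",
    "negation": "build_context_attribute",
    "reported_speech": "build_context_attribute",
    "score_name": "build_score_attribute",
    "value": "build_value_attribute",
}

# Group the attribute labels handled by each factory
labels_by_factory = defaultdict(list)
for label, factory_name in FACTORY_NAMES.items():
    labels_by_factory[factory_name].append(label)
```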