:py:mod:`medkit.io.spacy`
=========================

.. py:module:: medkit.io.spacy


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   medkit.io.spacy.SpacyInputConverter
   medkit.io.spacy.SpacyOutputConverter




.. py:class:: SpacyInputConverter(entities: list[str] | None = None, span_groups: list[str] | None = None, attrs: list[str] | None = None, uid: str | None = None)


   
   Class for converting spaCy documents into a collection of TextDocuments.


   :Parameters:

       **entities** : list of str, optional
           Labels of the spaCy entities (`doc.ents`) to convert into medkit entities.
           If `None` (default), all spaCy entities are converted and added to
           the medkit document.

       **span_groups** : list of str, optional
           Names of the spaCy span groups (`doc.spans`) to convert into medkit segments.
           If `None` (default), all span groups are converted and added to
           the medkit document.

       **attrs** : list of str, optional
           Names of the span extensions to convert into medkit attributes.
           If `None` (default), all extensions with a non-None value are added to each annotation.

       **uid** : str, optional
           Identifier of the converter.












   :Attributes:

       **description** : OperationDescription
           Description for the operation.


   ..
       !! processed by numpydoc !!
   .. py:property:: description
      :type: medkit.core.OperationDescription


   .. py:method:: set_prov_tracer(prov_tracer: medkit.core.ProvTracer)


   .. py:method:: load(spacy_docs: list[spacy.tokens.Doc]) -> list[medkit.core.text.TextDocument]

      
      Create a list of TextDocuments from a list of spacy Doc objects.

      Depending on the configuration of the converter, the selected annotations
      and attributes are included in the documents.

      :Parameters:

          **spacy_docs** : list of Doc
              A list of spaCy documents to convert.

      :Returns:

          list of TextDocument
              A list of TextDocuments
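
      A minimal sketch of how `load` might be used, assuming spaCy and medkit's
      spaCy support are installed. The helper name `spacy_to_medkit`, the sample
      text, the `DISEASE` label and the `negated` span extension are illustrative
      assumptions and not part of medkit.

      .. code-block:: python

         def spacy_to_medkit(text="Patient has asthma."):
             """Build a small spaCy doc by hand and convert it to medkit."""
             # Imported inside the function so that this module can be loaded
             # even when the optional spaCy dependencies are not installed.
             import spacy
             from spacy.tokens import Span
             from medkit.io.spacy import SpacyInputConverter

             # A blank pipeline only tokenizes, which is enough for this sketch.
             nlp = spacy.blank("en")
             spacy_doc = nlp(text)

             # Hypothetical custom extension, to be converted into an attribute.
             Span.set_extension("negated", default=None, force=True)

             # Create one entity manually from character offsets.
             start = text.index("asthma")
             span = spacy_doc.char_span(start, start + len("asthma"), label="DISEASE")
             span._.negated = False
             spacy_doc.ents = [span]

             # Keep only DISEASE entities and the "negated" extension.
             converter = SpacyInputConverter(entities=["DISEASE"], attrs=["negated"])
             return converter.load([spacy_doc])

      Calling `spacy_to_medkit()` is expected to return a list with one
      TextDocument. Passing `None` for `entities` or `attrs` would convert
      everything, as described in the class parameters.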













      ..
          !! processed by numpydoc !!

   .. py:method:: _load_anns(spacy_doc: spacy.tokens.Doc)



.. py:class:: SpacyOutputConverter(nlp: spacy.Language, apply_nlp_spacy: bool = False, labels_anns: list[str] | None = None, attrs: list[str] | None = None, uid: str | None = None)


   
   Class for converting TextDocuments into a list of spaCy documents.


   :Parameters:

       **nlp** : Language
           spaCy `Language` object holding the loaded pipeline.

       **apply_nlp_spacy** : bool, default=False
           If True, each component of the `nlp` pipeline is applied to the new spaCy document.
           Set this to True to obtain features that are added by pipeline components,
           such as part-of-speech tags.
           If False, the `nlp` pipeline is not applied to the spaCy document, so the document
           contains only the annotations and attributes transferred from medkit.

       **labels_anns** : list of str, optional
           Labels of the medkit annotations to include in the spaCy document.
           If `None` (default), all annotations are included.

       **attrs** : list of str, optional
           Labels of the medkit attributes to add to the included annotations.
           If `None` (default), all attributes are added as custom attributes
           of each included annotation.

       **uid** : str, optional
           Identifier of the converter.












   :Attributes:

       **description** : OperationDescription
           Description for the operation.


   ..
       !! processed by numpydoc !!
   .. py:property:: description
      :type: medkit.core.OperationDescription


   .. py:method:: convert(medkit_docs: list[medkit.core.text.TextDocument]) -> list[spacy.tokens.Doc]

      
      Convert a list of TextDocuments into a list of spacy Doc objects.

      Depending on the configuration of the converter, the selected annotations
      and attributes are included in the documents.

      :Parameters:

          **medkit_docs** : list of TextDocument
              A list of TextDocuments to convert.

      :Returns:

          list of Doc
              A list of spacy Doc objects
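
      A minimal sketch of the reverse direction, assuming spaCy and medkit's
      spaCy support are installed. The helper name `medkit_to_spacy`, the sample
      text and the `disease` label are illustrative assumptions. The
      `Entity`, `Span` and `TextDocument` imports from `medkit.core.text` are not
      documented on this page and are assumed here.

      .. code-block:: python

         def medkit_to_spacy(text="Patient has asthma."):
             """Build a small medkit document and convert it to a spaCy Doc."""
             # Imported inside the function so that this module can be loaded
             # even when the optional spaCy dependencies are not installed.
             import spacy
             from medkit.core.text import Entity, Span, TextDocument
             from medkit.io.spacy import SpacyOutputConverter

             # Create a medkit document holding one entity.
             medkit_doc = TextDocument(text=text)
             start = text.index("asthma")
             end = start + len("asthma")
             entity = Entity(label="disease", spans=[Span(start, end)], text="asthma")
             medkit_doc.anns.add(entity)

             # A blank pipeline is enough because apply_nlp_spacy is False:
             # only the annotations transferred from medkit end up in the Doc.
             nlp = spacy.blank("en")
             converter = SpacyOutputConverter(
                 nlp,
                 apply_nlp_spacy=False,
                 labels_anns=["disease"],
             )
             return converter.convert([medkit_doc])

      With a pipeline that has trained components (a tagger, for instance),
      setting `apply_nlp_spacy=True` would also run those components on each
      new document.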













      ..
          !! processed by numpydoc !!


