:py:mod:`medkit.text.ner.quick_umls_matcher`
============================================

.. py:module:: medkit.text.ner.quick_umls_matcher


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   medkit.text.ner.quick_umls_matcher.QuickUMLSMatcher




.. py:class:: QuickUMLSMatcher(version: str, language: str, lowercase: bool = False, normalize_unicode: bool = False, overlapping: typing_extensions.Literal['length', 'score'] = 'length', threshold: float = 0.9, window: int = 5, similarity: typing_extensions.Literal['dice', 'jaccard', 'cosine', 'overlap'] = 'jaccard', accepted_semtypes: list[str] = quickumls.constants.ACCEPTED_SEMTYPES, attrs_to_copy: list[str] | None = None, output_label: str | dict[str, str] | None = None, name: str | None = None, uid: str | None = None)


   Bases: :py:obj:`medkit.core.text.NEROperation`

   
   Entity annotator relying on QuickUMLS.

   This annotator requires a QuickUMLS installation created
   with `python -m quickumls.install`, using flags that match
   the `language`, `lowercase` and `normalize_unicode` params passed at init,
   on UMLS data of the given `version`. Each QuickUMLS installation must be
   registered with the `add_install` class method before use.

   For instance, if we want to use `QuickUMLSMatcher` with a French
   lowercase QuickUMLS install based on UMLS version 2021AB,
   we must first create this installation from a shell with::

       python -m quickumls.install --language FRE --lowercase /path/to/umls/2021AB/data /path/to/quick/umls/install

   then register this install with:

   >>> QuickUMLSMatcher.add_install(
   ...     "/path/to/quick/umls/install",
   ...     version="2021AB",
   ...     language="FRE",
   ...     lowercase=True,
   ... )

   and finally instantiate the matcher with:

   >>> matcher = QuickUMLSMatcher(
   ...     version="2021AB",
   ...     language="FRE",
   ...     lowercase=True,
   ... )

   This mechanism records in the OperationDescription how the QuickUMLS
   install was created. The same matcher can then be re-instantiated
   in a different environment, provided an install with matching settings
   is registered there.
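
   As a rough illustration of this mechanism, the registry can be pictured as
   a class-level dict keyed by the install settings. The sketch below is
   self-contained and is not medkit's implementation: `InstallKey` and
   `InstallRegistry` are hypothetical names, and raising `LookupError` when no
   install matches is an assumption.

   .. code-block:: python

      from __future__ import annotations

      from pathlib import Path
      from typing import ClassVar, NamedTuple


      class InstallKey(NamedTuple):
          """Hypothetical stand-in for the private `_QuickUMLSInstall` key."""

          version: str
          language: str
          lowercase: bool
          normalize_unicode: bool


      class InstallRegistry:
          """Toy registry mapping install settings to an install path."""

          _install_paths: ClassVar[dict[InstallKey, str]] = {}

          @classmethod
          def add_install(cls, path: str | Path, version: str, language: str,
                          lowercase: bool = False,
                          normalize_unicode: bool = False) -> None:
              # The settings tuple is hashable, so it can serve as dict key
              key = InstallKey(version, language, lowercase, normalize_unicode)
              cls._install_paths[key] = str(path)

          @classmethod
          def clear_installs(cls) -> None:
              cls._install_paths.clear()

          @classmethod
          def get_path_to_install(cls, version: str, language: str,
                                  lowercase: bool = False,
                                  normalize_unicode: bool = False) -> str:
              key = InstallKey(version, language, lowercase, normalize_unicode)
              try:
                  return cls._install_paths[key]
              except KeyError:
                  # Assumption: an unregistered combination is an error
                  raise LookupError(f"No install registered for {key}") from None

   Because every setting is part of the key, an install registered with
   `lowercase=True` is not returned for a lookup with `lowercase=False`.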















   ..
       !! processed by numpydoc !!
   .. py:attribute:: _install_paths
      :type: ClassVar[dict[_QuickUMLSInstall, str]]

      

   .. py:method:: add_install(path: str | pathlib.Path, version: str, language: str, lowercase: bool = False, normalize_unicode: bool = False)
      :classmethod:

      
      Register path and settings of a QuickUMLS installation.


      :Parameters:

          **path** : str or Path
              The path to the destination folder passed to the install command

          **version** : str
              The version of the UMLS database, for instance "2021AB"

          **language** : str
              The language flag passed to the install command, for instance "ENG"

          **lowercase** : bool, default=False
              Whether the --lowercase flag was passed to the install command
              (concepts are lowercased to increase recall)

          **normalize_unicode** : bool, default=False
              Whether the --normalize-unicode flag was passed to the install command
              (non-ASCII chars in concepts are converted to the closest ASCII chars)
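
      To give an idea of what the two flags do to concept terms, here is a
      minimal sketch using only the standard library. `normalize_term` and
      `to_ascii` are hypothetical helpers, and the `unicodedata`-based
      transliteration is only an approximation; QuickUMLS itself may use a
      different routine.

      .. code-block:: python

         import unicodedata


         def to_ascii(term: str) -> str:
             # Decompose accented chars, then drop the non-ASCII combining marks
             decomposed = unicodedata.normalize("NFKD", term)
             return decomposed.encode("ascii", "ignore").decode("ascii")


         def normalize_term(term: str, lowercase: bool = False,
                            normalize_unicode: bool = False) -> str:
             if normalize_unicode:
                 term = to_ascii(term)
             if lowercase:
                 term = term.lower()
             return term

      With both flags enabled, `normalize_term("Hépatite", True, True)` yields
      `"hepatite"`, so differently cased or accented spellings map to the
      same form.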














      ..
          !! processed by numpydoc !!

   .. py:method:: clear_installs()
      :classmethod:

      
      Remove all QuickUMLS installations registered with `add_install`.
















      ..
          !! processed by numpydoc !!

   .. py:method:: _get_path_to_install(version: str, language: str, lowercase: bool = False, normalize_unicode: bool = False) -> str
      :classmethod:

      
      Find a QuickUMLS install with corresponding settings.

      The QuickUMLS install must have been previously registered with `add_install`.















      ..
          !! processed by numpydoc !!

   .. py:method:: _get_label_mapping(output_label: None | str | dict[str, str]) -> dict[str, str]
      :staticmethod:

      
      Return label mapping according to `output_label`.
















      ..
          !! processed by numpydoc !!

   .. py:method:: run(segments: list[medkit.core.text.Segment]) -> list[medkit.core.text.Entity]

      
      Return entities (with UMLS normalization attributes) for each match in `segments`.


      :Parameters:

          **segments** : list of Segment
              List of segments in which to look for matches

      :Returns:

          list of Entity
              Entities found in `segments`, with :class:`~UMLSNormAttribute` attributes.













      ..
          !! processed by numpydoc !!

   .. py:method:: _find_matches_in_segment(segment: medkit.core.text.Segment) -> Iterator[medkit.core.text.Entity]



