medkit.text.context.negation_detector
=====================================

.. py:module:: medkit.text.context.negation_detector


Classes
-------

.. autoapisummary::

   medkit.text.context.negation_detector.NegationDetectorRule
   medkit.text.context.negation_detector.NegationMetadata
   medkit.text.context.negation_detector.NegationDetector


Module Contents
---------------

.. py:class:: NegationDetectorRule

   
   Regexp-based rule to use with `NegationDetector`.

   Input text may be converted to ASCII before the rule is applied
   (see `unicode_sensitive`).

   :Parameters:

       **regexp** : str
           The regexp pattern used to match a negation

       **exclusion_regexps** : list of str, optional
           Optional exclusion patterns

       **id** : str, optional
           Unique identifier of the rule to store in the metadata of the entities

       **case_sensitive** : bool, default=False
           Whether to take case into account when matching `regexp` and
           `exclusion_regexps`

       **unicode_sensitive** : bool, default=False
           If True, rule matches are searched on the unicode text.
           If False, rule matches are searched on the closest ASCII text when
           possible, so `regexp` and `exclusion_regexps` shall not contain
           non-ASCII chars because they would never be matched
           (cf. `NegationDetector`).














   ..
       !! processed by numpydoc !!

   .. py:attribute:: regexp
      :type:  str


   .. py:attribute:: exclusion_regexps
      :type:  list[str]
      :value: []



   .. py:attribute:: id
      :type:  str | None
      :value: None



   .. py:attribute:: case_sensitive
      :type:  bool
      :value: False



   .. py:attribute:: unicode_sensitive
      :type:  bool
      :value: False



   .. py:method:: __post_init__()
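
The way the fields of `NegationDetectorRule` interact can be illustrated with a small standard-library sketch. `SketchRule` and `rule_matches` are hypothetical names used only for illustration; they are not part of medkit, and the assumption that a rule is discarded as soon as one exclusion pattern matches the same text is ours, not a statement about the library's implementation.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SketchRule:
    """Hypothetical stand-in with the same fields as NegationDetectorRule."""

    regexp: str
    exclusion_regexps: List[str] = field(default_factory=list)
    id: Optional[str] = None
    case_sensitive: bool = False
    unicode_sensitive: bool = False


def rule_matches(rule: SketchRule, text: str) -> bool:
    """Return True if `regexp` matches and no exclusion pattern does (assumed semantics)."""
    flags = 0 if rule.case_sensitive else re.IGNORECASE
    if re.search(rule.regexp, text, flags) is None:
        return False
    # Assumption: any matching exclusion pattern cancels the rule for this text
    return not any(re.search(p, text, flags) for p in rule.exclusion_regexps)


rule = SketchRule(
    regexp=r"\bno\b",
    exclusion_regexps=[r"\bno\s+doubt\b"],
    id="id_neg_no",
)
```

With this sketch, `rule_matches(rule, "No sign of covid")` is true because matching ignores case by default, while a text containing "no doubt" is rejected by the exclusion pattern.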


.. py:class:: NegationMetadata

   Bases: :py:obj:`typing_extensions.TypedDict`


   
   Metadata dict added to negation attributes with `True` value.


   :Parameters:

       **rule_id** : str or int
           Identifier of the rule used to detect a negation.
           If the rule has no id, the index of the rule in
           the list of rules is used instead.














   ..
       !! processed by numpydoc !!

   .. py:attribute:: rule_id
      :type:  str | int
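
The fallback described for `rule_id` (the rule's identifier when it has one, otherwise its position in the list of rules) can be sketched as follows. `resolve_rule_id` is a hypothetical helper written for this illustration, not a medkit function.

```python
from typing import List, Optional, Union


def resolve_rule_id(rule_ids: List[Optional[str]], matched_index: int) -> Union[str, int]:
    """Return the id of the matched rule, or its index when the rule has no id."""
    rule_id = rule_ids[matched_index]
    return rule_id if rule_id is not None else matched_index


# Ids of two rules: the first has an explicit id, the second has none
rule_ids = ["id_neg_no", None]

# Metadata dicts shaped like NegationMetadata for each possible match
metadata_first = {"rule_id": resolve_rule_id(rule_ids, 0)}
metadata_second = {"rule_id": resolve_rule_id(rule_ids, 1)}
```

This is why `rule_id` is typed `str | int`: consumers of the metadata should be prepared to handle both cases.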


.. py:class:: NegationDetector(output_label: str, rules: list[NegationDetectorRule] | None = None, uid: str | None = None)

   Bases: :py:obj:`medkit.core.text.ContextOperation`


   
   Annotator creating negation attributes.

   Because negation attributes are attached to whole annotations,
   each input annotation should be "local" enough (i.e. a sentence or a
   syntagma) rather than a big chunk of text.

   To detect negation, the module uses rules that may or may not be sensitive to
   unicode. When a rule is not unicode-sensitive, we try to convert unicode chars to
   the closest ASCII chars. However, some characters need to be pre-processed
   beforehand (e.g., `n°` -> `number`). So, if the converted text and the original
   text have different lengths, we fall back on the initial unicode text for
   detection even if the rule is not unicode-sensitive.
   In this case, a warning is logged recommending to pre-process the data.















   ..
       !! processed by numpydoc !!

   .. py:attribute:: output_label


   .. py:attribute:: rules
      :value: None



   .. py:attribute:: _non_empty_text_pattern


   .. py:attribute:: _patterns


   .. py:attribute:: _exclusion_patterns


   .. py:attribute:: _has_non_unicode_sensitive_rule


   .. py:method:: run(segments: list[medkit.core.text.Segment])

      
      Run the operation.

      Add a negation attribute to each segment with a boolean value
      indicating if a negation has been found.

      Negation attributes with a `True` value have a metadata dict with
      fields described in :class:`.NegationMetadata`.

      :Parameters:

          **segments** : list of Segment
              List of segments in which to detect negation














      ..
          !! processed by numpydoc !!


   .. py:method:: _detect_negation_in_segment(segment: medkit.core.text.Segment) -> medkit.core.Attribute | None


   .. py:method:: _find_matching_rule(text: str) -> str | int | None


   .. py:method:: load_rules(path_to_rules: pathlib.Path, encoding: str | None = None) -> list[NegationDetectorRule]
      :staticmethod:


      
      Load all rules stored in a yml file.


      :Parameters:

          **path_to_rules** : Path
              Path to a yml file containing a list of mappings
              with the same structure as `NegationDetectorRule`

          **encoding** : str, optional
              Encoding of the file to open



      :Returns:

          list of NegationDetectorRule
              List of all the rules in `path_to_rules`,
              can be used to init a `NegationDetector`











      ..
          !! processed by numpydoc !!


   .. py:method:: check_rules_sanity(rules: list[NegationDetectorRule])
      :staticmethod:


      
      Check consistency of a set of rules.
















      ..
          !! processed by numpydoc !!


   .. py:method:: save_rules(rules: list[NegationDetectorRule], path_to_rules: pathlib.Path, encoding: str | None = None)
      :staticmethod:


      
      Store rules in a yml file.


      :Parameters:

          **rules** : list of NegationDetectorRule
              The rules to save

          **path_to_rules** : Path
              Path to a .yml file that will contain the rules

          **encoding** : str, optional
              Encoding of the .yml file














      ..
          !! processed by numpydoc !!
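
The ASCII fallback described for `NegationDetector` (convert to the closest ASCII text, but keep the original unicode text when the lengths differ) can be sketched with the standard library. `to_ascii` and `text_for_matching` are hypothetical helpers, and the NFKD-based conversion is an assumption made for this illustration; the conversion actually used by the library may differ.

```python
import logging
import unicodedata

logger = logging.getLogger(__name__)


def to_ascii(text: str) -> str:
    """Approximate `text` in ASCII by decomposing chars and dropping the rest."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def text_for_matching(text: str, unicode_sensitive: bool) -> str:
    """Pick the text on which a rule would be searched (assumed logic)."""
    if unicode_sensitive:
        return text
    ascii_text = to_ascii(text)
    if len(ascii_text) != len(text):
        # Lengths differ, so keep the original unicode text and warn
        logger.warning("Text could not be converted to ASCII, consider pre-processing it")
        return text
    return ascii_text


converted = text_for_matching("pas de fi\u00e8vre", unicode_sensitive=False)
fallback = text_for_matching("n\u00b0 12", unicode_sensitive=False)
```

Here the accented text converts cleanly because its length is preserved, whereas the text containing `n°` loses a character during conversion and is therefore searched in its original unicode form.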


