:py:mod:`medkit.text.metrics.ner`
=================================

.. py:module:: medkit.text.metrics.ner


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   medkit.text.metrics.ner.SeqEvalEvaluator
   medkit.text.metrics.ner.SeqEvalMetricsComputer




.. py:class:: SeqEvalEvaluator(tagging_scheme: typing_extensions.Literal[bilou, iob2] = 'bilou', return_metrics_by_label: bool = True, average: typing_extensions.Literal[macro, weighted] = 'macro', tokenizer: Any | None = None, labels_remapping: dict[str, str] | None = None)


   
   Evaluator to compute the performance of labeling tasks such as named entity recognition.

   This evaluator compares the reference entities of each TextDocument with the
   predicted entities and returns a dictionary of metrics.

   The evaluator converts the documents and their entities to tags before computing the metrics.
   It supports two schemes, IOB2 (a BIO scheme) and BILOU. The IOB2 scheme tags the Beginning,
   the Inside and the Outside of an entity. The BILOU scheme tags the Beginning,
   the Inside and the Last tokens of a multi-token entity, as well as Unit-length entities
   and Outside tokens.

   For more information about IOB schemes, refer to the `Wikipedia page <https://en.wikipedia.org/wiki/Inside%E2%80%93outside%E2%80%93beginning_(tagging)>`_.

   .. hint::
       If **tokenizer** is not defined, the evaluator tokenizes the text by character.
       This may generate a lot of tokens for large documents and may increase execution time.
       You can use a fast tokenizer from HuggingFace instead, e.g. the BERT tokenizer:

       >>> from transformers import AutoTokenizer
       >>> tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", use_fast=True)

   :Parameters:

       **tagging_scheme** : str, default="bilou"
           Scheme for tagging the tokens, either `bilou` or `iob2`.

       **return_metrics_by_label** : bool, default=True
           If `True`, return the metrics by label in the output dictionary.
           If `False`, only global metrics are returned

       **average** : str, default="macro"
           Type of average to be performed in metrics:

           - `macro`, unweighted mean (default)
           - `weighted`, weighted average by support (number of true instances by label)

       **tokenizer** : Any, optional
           Fast Tokenizer to convert text into tokens.
           If not provided, the text is tokenized by character.

       **labels_remapping** : dict of str to str, optional
           Remapping of labels, useful when there is a mismatch
           between the predicted labels and the reference labels to evaluate
           against. If a label (of a reference or predicted entity) is found in
           this dict, the corresponding value will be used as label instead.














   .. py:method:: compute(documents: list[medkit.core.text.TextDocument], predicted_entities: list[list[medkit.core.text.Entity]]) -> dict[str, float]

      
      Compute entity matching metrics given the predicted entities.


      :Parameters:

          **documents** : list of TextDocuments
              Text documents containing the reference entities

          **predicted_entities** : list of list of Entity
              List of predicted entities for each document

      :Returns:

          dict of str to float
              A dictionary with average metrics and, if `return_metrics_by_label` is `True`,
              per-label metrics. The metrics included are accuracy, precision, recall and F1 score.














   .. py:method:: _tag_text_with_entities(text: str, entities: list[medkit.core.text.Entity])


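The tagging step described for ``SeqEvalEvaluator`` can be illustrated with a small,
self-contained sketch. It is not medkit's implementation: the helper ``tag_tokens`` is
hypothetical, and entities are given as ``(start, end, label)`` token index ranges rather
than character spans. It only shows how the same entities are tagged under IOB2 and BILOU.

```python
def tag_tokens(n_tokens, entities, scheme="bilou"):
    """Return one tag per token for entities given as (start, end, label).

    `start` is inclusive and `end` is exclusive, both in token indices.
    Hypothetical helper for illustration only, not part of medkit.
    """
    if scheme not in ("bilou", "iob2"):
        raise ValueError(f"Unknown tagging scheme: {scheme}")

    tags = ["O"] * n_tokens  # every token starts Outside any entity
    for start, end, label in entities:
        length = end - start
        if length <= 0:
            continue  # ignore empty ranges
        if scheme == "bilou" and length == 1:
            tags[start] = f"U-{label}"  # Unit-length entity
            continue
        tags[start] = f"B-{label}"  # Beginning
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"  # Inside
        if scheme == "bilou":
            tags[end - 1] = f"L-{label}"  # Last token of a multi-token entity
    return tags


tokens = ["The", "patient", "has", "type", "2", "diabetes", "and", "asthma"]
entities = [(3, 6, "problem"), (7, 8, "problem")]

iob2_tags = tag_tokens(len(tokens), entities, scheme="iob2")
bilou_tags = tag_tokens(len(tokens), entities, scheme="bilou")

for token, iob2, bilou in zip(tokens, iob2_tags, bilou_tags):
    print(f"{token:10} {iob2:12} {bilou}")
```

The two schemes only differ on the last token of a multi-token entity and on
single-token entities, which BILOU marks explicitly with ``L-`` and ``U-``.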

.. py:class:: SeqEvalMetricsComputer(id_to_label: dict[int, str], tagging_scheme: typing_extensions.Literal[bilou, iob2] = 'bilou', return_metrics_by_label: bool = True, average: typing_extensions.Literal[macro, weighted] = 'macro')


   
   Compute evaluation metrics using seqeval.

   An implementation of :class:`~medkit.training.MetricsComputer` using seqeval
   to compute metrics in the training of named-entity recognition components.

   The metrics computer can be used with a :class:`~medkit.training.Trainer`.

   :Parameters:

       **id_to_label** : dict of int to str
           Mapping from integer ids to labels. It should be the same mapping as the one used during preprocessing.

       **tagging_scheme** : str, default="bilou"
           Scheme used for tagging the tokens, either `bilou` or `iob2`.

       **return_metrics_by_label** : bool, default=True
           If `True`, return the metrics by label in the output dictionary.
           If `False`, only return average metrics

       **average** : str, default="macro"
           Type of average to be performed in metrics:

           - `macro`, unweighted mean (default)
           - `weighted`, weighted average by support (number of true instances by label)














   .. py:method:: prepare_batch(model_output: medkit.training.utils.BatchData, input_batch: medkit.training.utils.BatchData) -> dict[str, list[list[str]]]

      
      Prepare a batch of tensors to compute the metric.


      :Parameters:

          **model_output** : BatchData
              Batch data including the `logits` predicted by the model

          **input_batch** : BatchData
              Batch data including the reference `labels`

      :Returns:

          dict of str to list of list of str
              A dictionary with the true and predicted tag representations of the batch














   .. py:method:: compute(all_data: dict[str, list[Any]]) -> dict[str, float]

      
      Compute the metrics.

      Compute metrics using the tag representation collected by batches
      during the training/evaluation loop.

      :Parameters:

          **all_data** : dict of str to list of Any
              A dictionary with the true and predicted tags collected by batches

      :Returns:

          dict of str to float
              A dictionary with average metrics and, if `return_metrics_by_label` is `True`,
              per-label metrics. The metrics included are accuracy, precision, recall and F1 score.














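What ``prepare_batch`` of ``SeqEvalMetricsComputer`` produces can be pictured with a
plain-Python sketch. This is not medkit's code: it uses nested lists in place of tensors,
the helper ``logits_to_tags`` and the output keys ``y_true`` and ``y_pred`` are hypothetical,
and skipping positions labelled ``-100`` is an assumption borrowed from the usual
HuggingFace convention for special tokens.

```python
IGNORED_LABEL_ID = -100  # assumed label id for padding and special tokens


def logits_to_tags(logits, labels, id_to_label):
    """Convert one batch of logits and reference label ids to tag sequences.

    Hypothetical helper for illustration only, not part of medkit.
    """
    y_true, y_pred = [], []
    for seq_logits, seq_labels in zip(logits, labels):
        true_tags, pred_tags = [], []
        for token_logits, label_id in zip(seq_logits, seq_labels):
            if label_id == IGNORED_LABEL_ID:
                continue  # skip positions that carry no reference tag
            # argmax over the label dimension gives the predicted label id
            pred_id = max(range(len(token_logits)), key=token_logits.__getitem__)
            true_tags.append(id_to_label[label_id])
            pred_tags.append(id_to_label[pred_id])
        y_true.append(true_tags)
        y_pred.append(pred_tags)
    return {"y_true": y_true, "y_pred": y_pred}


id_to_label = {0: "O", 1: "B-problem", 2: "I-problem"}
# One sequence of four positions, three scores per position (one per label id)
logits = [[[2.0, 0.1, 0.3], [0.2, 1.5, 0.1], [0.1, 0.4, 0.2], [0.9, 0.0, 0.0]]]
labels = [[0, 1, 2, -100]]

batch_tags = logits_to_tags(logits, labels, id_to_label)
print(batch_tags)
```

Tag sequences in this string form are what seqeval-style metrics expect, which is why
``id_to_label`` has to match the mapping used when the labels were encoded.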

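The difference between the two ``average`` options accepted by both classes can be checked
by hand. The sketch below uses made-up per-label F1 scores and supports and a hypothetical
helper ``average_scores``; it only illustrates the arithmetic and does not call seqeval.

```python
def average_scores(scores, supports, average="macro"):
    """Average per-label scores, either unweighted or weighted by support.

    Hypothetical helper for illustration only, not part of medkit or seqeval.
    """
    labels = list(scores)
    if average == "macro":
        # unweighted mean: every label counts the same
        return sum(scores[label] for label in labels) / len(labels)
    if average == "weighted":
        # each label is weighted by its number of true instances
        total = sum(supports[label] for label in labels)
        return sum(scores[label] * supports[label] for label in labels) / total
    raise ValueError(f"Unknown average: {average}")


# Made-up values: a frequent label with a high score, a rare one with a low score
f1_by_label = {"problem": 0.9, "treatment": 0.5}
support_by_label = {"problem": 90, "treatment": 10}

macro_f1 = average_scores(f1_by_label, support_by_label, average="macro")
weighted_f1 = average_scores(f1_by_label, support_by_label, average="weighted")
print(f"macro={macro_f1:.2f} weighted={weighted_f1:.2f}")
```

With imbalanced labels the weighted average is pulled towards the frequent label, while
the macro average gives the rare label the same influence as the frequent one.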
