medkit.text.metrics.classification
This module requires extra dependencies that are not installed with the core dependencies of medkit. To install them, use pip install medkit-lib[metrics-text-classification].
Classes:
TextClassificationEvaluator(attr_label): An evaluator for attributes of TextDocuments
- class TextClassificationEvaluator(attr_label)
An evaluator for attributes of TextDocuments
Initialize the text classification evaluator
- Parameters:
attr_label (str) – Label of the attribute to evaluate.
Methods:
compute_classification_report(true_docs, ...): Compute classification metrics of document attributes given annotated documents.
compute_cohen_kappa(docs_annotator_1, ...): Compute Cohen's kappa score, an inter-rater agreement score between two annotators.
compute_krippendorff_alpha(docs_annotators): Compute the Krippendorff alpha score, an inter-rater agreement score between multiple annotators.
- compute_classification_report(true_docs, predicted_docs, metrics_by_attr_value=True, average='macro')
Compute classification metrics of document attributes given annotated documents. This method uses sklearn.metrics.classification_report to compute precision, recall and F1-score for each value of the attribute.
Warning
The true and predicted documents must be sorted in the same order, so that documents at the same index correspond to each other; otherwise the metrics are not meaningful.
- Parameters:
true_docs (list of TextDocument) – Text documents containing attributes of reference
predicted_docs (list of TextDocument) – Text documents containing predicted attributes
metrics_by_attr_value (bool, default=True) – Whether to return metrics for each attribute value. If False, only global metrics are returned
average (str, default="macro") – Type of averaging applied to the metrics:
- macro: unweighted mean of the per-value metrics (default)
- weighted: mean of the per-value metrics weighted by support (number of true instances for each attribute value)
- Return type:
dict[str, float | int]
- Returns:
dict of str to float or int – A dictionary with the computed metrics
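The difference between the two average options can be illustrated without the library. The following is a minimal pure-Python sketch, not medkit's or scikit-learn's implementation; the helper names per_value_f1 and average_f1 and the example labels are hypothetical and chosen only for illustration.

```python
def per_value_f1(y_true, y_pred):
    """Return {value: (f1, support)} for every value seen in either list.

    Hypothetical helper: the two lists are assumed to be aligned,
    i.e. item i of y_true and item i of y_pred describe the same document.
    """
    scores = {}
    for value in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == value and p == value)
        fp = sum(1 for t, p in pairs if t != value and p == value)
        fn = sum(1 for t, p in pairs if t == value and p != value)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[value] = (f1, tp + fn)  # support = number of true instances
    return scores


def average_f1(scores, average="macro"):
    """Combine per-value F1 scores with a macro or weighted average."""
    if average == "macro":
        # Every attribute value counts equally, whatever its frequency.
        return sum(f1 for f1, _ in scores.values()) / len(scores)
    if average == "weighted":
        # Every attribute value counts in proportion to its support.
        total = sum(support for _, support in scores.values())
        return sum(f1 * support for f1, support in scores.values()) / total
    raise ValueError(f"Unsupported average: {average!r}")


# Illustrative attribute values for four documents (made-up data)
y_true = ["pos", "pos", "pos", "neg"]
y_pred = ["pos", "pos", "neg", "neg"]

scores = per_value_f1(y_true, y_pred)
macro_f1 = average_f1(scores, average="macro")
weighted_f1 = average_f1(scores, average="weighted")
print(scores, macro_f1, weighted_f1)
```

With imbalanced attribute values, as here, the weighted average is pulled toward the score of the most frequent value, while the macro average treats both values equally.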
- compute_cohen_kappa(docs_annotator_1, docs_annotator_2)
Compute Cohen's kappa score, an inter-rater agreement score between two annotators. This method uses sklearn as a backend to compute the level of agreement.
Warning
Both lists of documents must be sorted in the same order, so that documents at the same index correspond to each other; otherwise the score is not meaningful.
- Parameters:
docs_annotator_1 (list of TextDocument) – Text documents containing attributes annotated by the first annotator
docs_annotator_2 (list of TextDocument) – Text documents to compare, containing attributes annotated by the second annotator
- Return type:
dict[str, float | int]
- Returns:
dict of str to float or int – A dictionary with the Cohen's kappa score and the support (number of annotated docs). The score is a number between -1 and 1, where 1 indicates perfect agreement, zero indicates agreement no better than chance, and negative values indicate agreement worse than chance.
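To make the interpretation of the score concrete, here is a minimal pure-Python sketch of the textbook Cohen's kappa formula, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. It is not the code used by this method (which relies on sklearn); the function name cohen_kappa and the labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_1, labels_2):
    """Cohen's kappa for two aligned lists of labels (illustrative sketch)."""
    if len(labels_1) != len(labels_2) or not labels_1:
        raise ValueError("Both annotators must label the same, non-empty set of items")
    n = len(labels_1)
    # Observed agreement: fraction of items given the same label by both annotators
    observed = sum(a == b for a, b in zip(labels_1, labels_2)) / n
    # Chance agreement: probability that both pick the same label independently,
    # estimated from each annotator's own label frequencies
    counts_1, counts_2 = Counter(labels_1), Counter(labels_2)
    expected = sum(counts_1[label] * counts_2[label] for label in counts_1) / n**2
    if expected == 1.0:
        # Both annotators used a single identical label: the score is undefined
        return float("nan")
    return (observed - expected) / (1 - expected)


# Illustrative attribute values from two annotators (made-up data)
annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(kappa)
```

In this example the annotators agree on three of four documents (p_o = 0.75), but half of that agreement is expected by chance (p_e = 0.5), which is why the score is well below the raw agreement rate.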
- compute_krippendorff_alpha(docs_annotators)
Compute the Krippendorff alpha score, an inter-rater agreement score between multiple annotators.
Warning
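The documents of each annotator must be sorted in the same order, so that documents at the same index correspond to each other; otherwise the score is not meaningful.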
Note
See medkit.text.metrics.irr_utils.krippendorff_alpha for more information about the score.
- Parameters:
docs_annotators (list of list of TextDocument) – A list of lists of text documents containing attributes, one inner list per annotator. The length of the outer list is the number of annotators to compare.
- Return type:
dict[str, float | int]
- Returns:
dict of str to float or int – A dictionary with the Krippendorff alpha score, the number of annotators and the support (number of documents). A value of 1 indicates perfect reliability between annotators; zero or lower indicates absence of reliability.
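As an illustration of how such a score can be obtained, the sketch below computes Krippendorff's alpha from a coincidence matrix, as alpha = 1 - D_o / D_e (observed over expected disagreement). It makes several assumptions: nominal labels, no missing annotations, and at least two annotators. It is not the implementation in medkit.text.metrics.irr_utils; the function name krippendorff_alpha_nominal and the labels are hypothetical.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(ratings):
    """Krippendorff's alpha for nominal labels (illustrative sketch).

    ratings: one list of labels per annotator, all aligned on the same items.
    Assumes no missing values and at least two annotators.
    """
    n_items = len(ratings[0])
    if len(ratings) < 2 or any(len(r) != n_items for r in ratings):
        raise ValueError("Need at least two annotators with aligned, equal-length lists")

    # Coincidence matrix: every ordered pair of labels given to the same item
    # by two different annotators contributes 1 / (m - 1), m = labels per item
    coincidences = Counter()
    for unit in zip(*ratings):
        weight = 1 / (len(unit) - 1)
        for a, b in permutations(unit, 2):
            coincidences[(a, b)] += weight

    # Marginal totals per label and overall number of pairable labels
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    # Disagreements (both share a common 1 / n factor, omitted here)
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        # Only one label was ever used: the score is undefined
        return float("nan")
    return 1 - observed / expected


# Illustrative attribute values from three annotators on four documents (made-up data)
ratings = [
    ["pos", "pos", "neg", "neg"],
    ["pos", "pos", "neg", "pos"],
    ["pos", "pos", "neg", "neg"],
]

alpha = krippendorff_alpha_nominal(ratings)
print(alpha)
```

The single disagreement on the last document lowers the score from 1 to about 0.69 here; with identical labels from every annotator the function returns 1.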