medkit.tools.mtsamples
This module provides helpers for accessing sample mtsamples files available in the neurazlab/mtsamplesFR repository.
Refer to the repository for more information.
This repository contains:
- a version of mtsamples.csv
Source: https://www.kaggle.com/datasets/tboyle10/medicaltranscriptions
License: CC0: Public Domain
- a mtsamples_translation.json file, which is a translation into French
Date: 08/04/2022
Functions:
- convert_mtsamples_to_medkit: Convert mtsamples data into a medkit file
- load_mtsamples: Load mtsamples data into medkit text documents
- load_mtsamples(cache_dir='.cache', translated=True, nb_max=None)
Function loading mtsamples data into medkit text documents
- Parameters:
cache_dir (str or Path, default=".cache") – Directory where the mtsamples file is stored
translated (bool, default=True) – If True (default), mtsamples_translated.json file is used (FR). If False, mtsamples.csv is used (EN)
nb_max (int, optional) – Maximum number of documents to load
- Return type:
list[TextDocument]
- Returns:
list of TextDocument – The medkit text documents corresponding to mtsamples data
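As an illustration of the signature above, here is a minimal sketch that loads a few documents and returns a short preview of each. The helper name `preview_mtsamples` and its `loader` parameter are hypothetical and exist only to make the example easy to try without medkit. The sketch also assumes that each returned TextDocument exposes its content through a `text` attribute.

```python
from pathlib import Path


def preview_mtsamples(nb_max=3, translated=True, cache_dir=".cache", loader=None):
    """Return the first 80 characters of each loaded document.

    `loader` is a hypothetical hook for this sketch: when it is None,
    medkit's load_mtsamples is used.
    """
    if loader is None:
        # Imported lazily so the helper can be defined without medkit installed
        from medkit.tools.mtsamples import load_mtsamples as loader

    docs = loader(cache_dir=Path(cache_dir), translated=translated, nb_max=nb_max)
    # Assumption: each document stores its raw content in a `text` attribute
    return [doc.text[:80] for doc in docs]


# Example call (the data file is presumably fetched into cache_dir on first use):
# for snippet in preview_mtsamples(nb_max=3, translated=False):
#     print(snippet)
```

Keeping `nb_max` small is convenient while experimenting, since it limits how many documents are built.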
- convert_mtsamples_to_medkit(output_file, encoding='utf-8', cache_dir='.cache', translated=True)
Convert mtsamples data into a medkit file
- Parameters:
output_file (str or Path) – Path to the medkit jsonl file to generate
encoding (str, default="utf-8") – Encoding of the medkit file to generate
cache_dir (str or Path, default=".cache") – Directory where the mtsamples file is cached
translated (bool, default=True) – If True (default), mtsamples_translated.json file is used (FR). If False, mtsamples.csv is used (EN)
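The parameters above might be used as in the following sketch, which writes the jsonl file and then reads it back with the standard library. The helper name `export_and_read` and its `converter` parameter are hypothetical. The read-back step assumes only that the output is a regular JSON Lines file (one JSON value per non-empty line); the exact fields stored for each document are not described here.

```python
import json
from pathlib import Path


def export_and_read(output_file, translated=True, cache_dir=".cache", converter=None):
    """Generate the medkit jsonl file, then return its parsed lines.

    `converter` is a hypothetical hook for this sketch: when it is None,
    medkit's convert_mtsamples_to_medkit is used.
    """
    if converter is None:
        # Imported lazily so the helper can be defined without medkit installed
        from medkit.tools.mtsamples import convert_mtsamples_to_medkit as converter

    output_file = Path(output_file)
    converter(
        output_file=output_file,
        encoding="utf-8",
        cache_dir=cache_dir,
        translated=translated,
    )

    # Assumption: standard JSON Lines layout, one JSON value per non-empty line
    with output_file.open(encoding="utf-8") as fp:
        return [json.loads(line) for line in fp if line.strip()]


# Example call:
# records = export_and_read("mtsamples_fr.jsonl", translated=True)
# print(len(records))
```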