medkit.tools.mtsamples
======================

.. py:module:: medkit.tools.mtsamples

.. autoapi-nested-parse::

   Tools for accessing examples of mtsamples files.

   Refer to the `mtsamplesFR`_ repository for more information.

   This repository contains:

   * the original dataset from Kaggle (data/mtsamples.csv);

   * a French translation for the dataset (data/mtsamples_translation.json).

   Both are made available under the CC0-1.0 license.

   .. _mtsamplesFR: https://github.com/medkit-lib/mtsamplesFR


Functions
---------

.. autoapisummary::

   medkit.tools.mtsamples.load_mtsamples
   medkit.tools.mtsamples.convert_mtsamples_to_medkit


Module Contents
---------------

.. py:function:: load_mtsamples(cache_dir: pathlib.Path | str = '.cache', translated: bool = True, nb_max: int | None = None) -> list[medkit.core.text.TextDocument]

   
   Load mtsamples data as medkit text documents.


   :Parameters:

       **cache_dir** : str or Path, default=".cache"
           Directory in which to store the mtsamples file.

       **translated** : bool, default=True
           If True (default), the French translation (`mtsamples_translated.json`) is used.
           If False, the original English file (`mtsamples.csv`) is used.

       **nb_max** : int, optional
           Maximum number of documents to load. If omitted, no limit is applied.

   :Returns:

       list of TextDocument
           The medkit text documents corresponding to the mtsamples data.
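
   A minimal usage sketch, assuming medkit is installed and that the first call may need network access to fetch the data into ``cache_dir``. The helper names ``load_french_sample`` and ``preview`` and the ``mtsamples_cache`` directory are hypothetical, chosen for illustration only.

   .. code-block:: python

      from pathlib import Path


      def load_french_sample(cache_dir="mtsamples_cache", nb_max=5):
          """Load at most `nb_max` translated (FR) mtsamples documents."""
          # Imported lazily so the helpers can be defined without medkit installed.
          from medkit.tools.mtsamples import load_mtsamples

          return load_mtsamples(cache_dir=Path(cache_dir), translated=True, nb_max=nb_max)


      def preview(docs, width=60):
          """Return the first `width` characters of each document's text."""
          # Only relies on each document exposing a `text` attribute.
          return [doc.text[:width] for doc in docs]

   Calling ``preview(load_french_sample())`` would then give a quick look at the first few documents. Keeping ``nb_max`` small is a convenient way to limit how many documents are built while experimenting.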


.. py:function:: convert_mtsamples_to_medkit(output_file: pathlib.Path | str, encoding: str | None = 'utf-8', cache_dir: pathlib.Path | str = '.cache', translated: bool = True)

   
   Save mtsamples data as a medkit JSONL file of text documents.


   :Parameters:

       **output_file** : str or Path
           Path to the medkit JSONL file to generate.

       **encoding** : str or None, default="utf-8"
           Encoding of the medkit file to generate.

       **cache_dir** : str or Path, default=".cache"
           Directory in which the mtsamples file is cached.

       **translated** : bool, default=True
           If True (default), the French translation (`mtsamples_translated.json`) is used.
           If False, the original English file (`mtsamples.csv`) is used.
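
   A minimal sketch of exporting the data and reading it back, assuming medkit is installed and that the generated file follows the usual JSON Lines layout (one JSON object per line). The helper names ``parse_jsonl`` and ``export_and_read_back`` and the ``mtsamples_fr.jsonl`` path are hypothetical, and no assumption is made about the keys of each record.

   .. code-block:: python

      import json
      from pathlib import Path


      def parse_jsonl(lines):
          """Decode an iterable of JSON Lines strings, skipping blank lines."""
          return [json.loads(line) for line in lines if line.strip()]


      def export_and_read_back(output_file="mtsamples_fr.jsonl", encoding="utf-8"):
          """Write the FR documents to `output_file`, then decode the file."""
          # Imported lazily so parse_jsonl stays usable without medkit installed.
          from medkit.tools.mtsamples import convert_mtsamples_to_medkit

          output_file = Path(output_file)
          convert_mtsamples_to_medkit(output_file, encoding=encoding, translated=True)
          # Read with the same encoding that was used for writing.
          with output_file.open(encoding=encoding) as f:
              return parse_jsonl(f)

   Reading the file with the same ``encoding`` that was passed to ``convert_mtsamples_to_medkit`` avoids decoding errors on accented French text.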


