# I/O components

This page lists all components for converting and loading/saving data.

:::{note}
For more details about public APIs, refer to
{mod}`medkit.io`.
:::

(api:io:brat)=
## Brat

Brat is a web-based tool for text annotation. Medkit supports the **input** and **output** conversion of text documents. 

:::{seealso}
For more details, refer to {mod}`medkit.io.brat`.
You may refer to this [example](../examples/brat_io.md) for more information.
:::


## Doccano

[Doccano](https://github.com/doccano/doccano) is a text annotation tool from multiple tasks. Medkit supports the **input** and **output** conversion of doccano files (.JSONL format). 

You can load annotations from a .jsonl file or a zip directory.

### Supported tasks
| Doccano Project                  	| Task for io converter                                                                                                     	|
|----------------------------------	|---------------------------------------------------------------------------------------------------------------------------	|
| Sequence labeling                	| {class}`medkit.io.doccano.DoccanoTask.SEQUENCE_LABELING` <br> i.e : `{'text':...,'label':[(int,int,label)]}`              	|
| Sequence labeling with relations 	| {class}`medkit.io.doccano.DoccanoTask.RELATION_EXTRACTION` <br>i.e : `{'text':...,'entities':[{...}],'relations':[{...}]}` 	|
| Text Classification              	| {class}`medkit.io.doccano.DoccanoTask.TEXT_CLASSIFICATION`<br>i.e : `{'text':...,'label':[str]}`                          	|

### Client configuration

The doccano user interface allows custom configuration over certain annotation parameters. The {class}`medkit.io.doccano.DoccanoClientConfig` class contains the configuration to be used by the input converter. 

You can modify the settings depending on the configuration of your project. If you don't provide a config, the converter will use the default doccano configuration.


:::{note}
**Metadata**

- Doccano to medkit: All the extra fields are imported as a dictionary in `TextDocument.metadata`
- Medkit to Doccano: The `TextDocument.metadata` is exported as extra fields in the output data. You can set `include_metadata` to False to remove the extra fields.
:::

For more details, refer to {mod}`medkit.io.doccano`.

(api:io:spacy)=
## Spacy

Medkit supports the **input** and **output** conversion of spacy documents.

:::{important}
For using spacy converters, you need to install [spacy](https://spacy.io/).
These dependencies may be installed with `pip install medkit-lib[spacy]`
:::

:::{seealso}
You may refer to this [example](../examples/spacy_io.md) for more information.
:::

For more details, refer to {mod}`medkit.io.spacy`.

## medkit-json

medkit has some utilities to export and import medkit documents to json format.

You can use {mod}`medkit.io.medkit_json.save_text_documents` to save a list of documents, and then {mod}`medkit.io.medkit_json.load_text_documents` to load them in medkit.

For more details, refer to {mod}`medkit.io.medkit_json`.

## RTTM

Rich Transcription Time Marked (.rttm) files contains diarization information. 
Medkit supports input and output conversion of audio documents.

For more details, refer to {mod}`medkit.io.rttm`.
