Version 1.0
A Python module for converting the GreynirCorpus treebank to the Universal Dependencies framework. The module has been adapted from UDConverter.
The resulting UD treebank will be included in UD version 2.11.
Install all requirements by running:
pip install -r requirements.txt
The scripts to run are in the scripts folder.
In all examples below, the --output flag writes the results to files in the /CoNLLU/ output folder. Without the flag, the output is printed to standard output.
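The files written to the output folder are in the CoNLL-U format used by Universal Dependencies: one token per line with ten tab-separated columns, comment lines starting with #, and a blank line after each sentence. As a minimal sketch of how such output could be read back in Python (the function name read_conllu and the sample sentence are illustrative assumptions, not part of this module or actual converter output):

```python
from typing import Dict, Iterator, List

# The ten CoNLL-U columns, in the order defined by the UD format.
CONLLU_FIELDS = [
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
]


def read_conllu(text: str) -> Iterator[List[Dict[str, str]]]:
    """Yield each sentence as a list of token dictionaries.

    Hypothetical helper for illustration; it is not part of the converter.
    """
    sentence: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
        elif line.startswith("#"):
            # Sentence-level comments (e.g. "# sent_id = ...") are skipped.
            continue
        else:
            columns = line.split("\t")
            if len(columns) != len(CONLLU_FIELDS):
                raise ValueError(f"expected 10 columns, got {len(columns)}: {line!r}")
            sentence.append(dict(zip(CONLLU_FIELDS, columns)))
    # Emit a trailing sentence if the text does not end with a blank line.
    if sentence:
        yield sentence


# Made-up sample in CoNLL-U layout, used only to exercise the reader.
SAMPLE = (
    "# sent_id = example-1\n"
    "1\tHundurinn\thundur\tNOUN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tsefur\tsofa\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
)

for tokens in read_conllu(SAMPLE):
    for token in tokens:
        print(token["id"], token["form"], token["upos"], token["head"], token["deprel"])
```

To read a converted file, pass its contents (opened with encoding="utf-8") to read_conllu. Comment lines are dropped in this sketch; keep them if you need the sentence IDs.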
Convert a single file or a directory of files:
convert.py -N -i path/to/corpus/file.psd --output --post_process
convert.py -N -i path/to/corpus/* --output --post_process
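If the conversion needs to be driven from another Python program, the command above can be assembled and launched with the standard subprocess module. The following is a rough sketch only: it assumes the script is at scripts/convert.py relative to the working directory and is started with the current Python interpreter, and the helper names build_convert_command and run_convert are hypothetical. The flags are exactly those shown in the example above.

```python
import subprocess
import sys
from typing import List


def build_convert_command(input_path: str, script: str = "scripts/convert.py") -> List[str]:
    """Build the argument list for converting one input file.

    The script location is an assumption; adjust it to your checkout.
    """
    return [
        sys.executable, script,
        "-N", "-i", input_path,
        "--output", "--post_process",
    ]


def run_convert(input_path: str) -> int:
    """Run the converter on one file and return its exit code."""
    completed = subprocess.run(build_convert_command(input_path), check=False)
    return completed.returncode
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces. Note that a wildcard such as path/to/corpus/* is expanded by the shell, not by subprocess, so a list-based call needs explicit file paths.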
For the remaining commands, input files must be placed in a folder within the corpora folder:
Convert a single tree in the treebank using its sentence ID (only prints to standard output):
convert.py -C FOLDER_NAME -id SENTENCE_ID
Convert a single file in the treebank:
convert.py -C FOLDER_NAME -f FILE_NAME --output --post_process
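The -C commands expect the input files to sit in a folder within the corpora folder. One way to set that up is sketched below; the helper name stage_input_files is hypothetical, and the location of the corpora folder is passed in as a parameter because this document does not state where it lives.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def stage_input_files(files: Iterable[str], corpora_dir: str, folder_name: str) -> List[Path]:
    """Copy input files into <corpora_dir>/<folder_name> and return the new paths.

    Hypothetical helper; corpora_dir must point at the corpora folder
    of your own checkout.
    """
    target = Path(corpora_dir) / folder_name
    # Create the folder (and any missing parents) if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    staged: List[Path] = []
    for file in files:
        source = Path(file)
        # copy2 keeps file metadata and returns the destination path.
        staged.append(Path(shutil.copy2(source, target / source.name)))
    return staged
```

The folder_name used here is the value later passed as FOLDER_NAME to the -C option.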
This converter was adapted as part of the Language Technology Programme for Icelandic 2019-2023. The programme, which is managed and coordinated by Almannarómur (https://almannaromur.is/), is funded by the Icelandic Ministry of Education, Science and Culture.