Tb2UD

Convert Ancient Greek treebanks in the main formats supported by Arethusa and Perseids to UD. At the moment, it is designed to work with the treebanks compatible with:

Requirements

  • Python 3.6+
  • the package udapi-python, plus some additional scripts for the Perseus AGLDT formats; you can get them from my Udapy_AGLDT, which is installed automatically if you use requirements.txt

Important: don't forget to add two folders to your $PYTHONPATH: tb2UD and tb2UD/tb2ud. For instance:

cd /path/to/tb2UD/
export PYTHONPATH="$(pwd):$(pwd)/tb2ud/"
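To confirm that both folders actually ended up on $PYTHONPATH in the current shell, a check along these lines may help. It is only a sketch: `in_pythonpath` is a hypothetical helper, not part of tb2UD, and it compares entries literally (apart from a trailing slash).

```shell
# in_pythonpath DIR
# Succeeds if DIR (with or without a trailing slash) is an entry
# of the colon-separated $PYTHONPATH.
in_pythonpath() {
  ipp_dir="${1%/}"   # normalise: drop one trailing slash
  case ":${PYTHONPATH:-}:" in
    *":$ipp_dir:"* | *":$ipp_dir/:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (run from the tb2UD folder, after the export):
#   in_pythonpath "$(pwd)" && in_pythonpath "$(pwd)/tb2ud" && echo "PYTHONPATH looks fine"
```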

Or, even better, create and configure a virtual environment (see the next section). In that case, you simply have to add a .pth file (e.g. env.pth), listing the two folders, to the <ENV>/lib/<PYTHON-VERSION>/site-packages folder.

How to set up a virtualenv

If you don't know what a virtual environment is, you'll find a lot of good tutorials online, starting with this one. You may also want to consider virtualenvwrapper, which makes a lot of things easier to manage.

Follow these three steps:

  1. create and activate a virtual environment (Python 3.6+); see the link above if you don't know how to do it.

  2. install the required packages:

pip install -r requirements.txt

  3. create a .pth file and enter the full paths to the tb2UD and tb2UD/tb2ud folders; see here.
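The last step can be sketched as a small shell function. This is a minimal sketch, not something shipped with tb2UD: `write_pth` is a hypothetical helper, `env.pth` is just the example file name used in this README, and the paths in the usage comment are placeholders.

```shell
# write_pth SITE_PACKAGES_DIR REPO_ROOT [PTH_NAME]
# Writes a .pth file that lists REPO_ROOT and REPO_ROOT/tb2ud, one path per line.
write_pth() {
  wp_site_dir="$1"
  wp_repo_root="${2%/}"        # drop a trailing slash, if any
  wp_pth_name="${3:-env.pth}"  # default to the example name from this README
  if [ ! -d "$wp_site_dir" ]; then
    echo "not a directory: $wp_site_dir" >&2
    return 1
  fi
  printf '%s\n%s\n' "$wp_repo_root" "$wp_repo_root/tb2ud" > "$wp_site_dir/$wp_pth_name"
}

# Example, with the virtual environment activated:
#   site_dir="$(python -c 'import sysconfig; print(sysconfig.get_paths()["purelib"])')"
#   write_pth "$site_dir" /path/to/tb2UD
```

Python reads every .pth file in site-packages at startup and appends each listed directory that exists to sys.path, so the two folders become importable without touching $PYTHONPATH.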

If you have virtualenvwrapper, you also have an add2virtualenv script, which takes care of step 3 for you:

add2virtualenv directory1 directory2 ...

How to use it

In the scripts folder, you'll find a few bash scripts for some of the most frequently used commands.

You can test that everything is working fine by running the following script:

# test.sh <input-file.xml>
cd test # go to the tb2ud/test folder
./test.sh data/hdt-1-20-39-bu2.xml

(note that the script attempts to read an AGLDT XML file; it fails if the appropriate udapi blocks are not found)

If all goes well, you'll see a series of log entries, followed by the good old Hello, World! string.
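If the test fails instead, one thing worth ruling out is that udapi is not importable from the active environment. The following is a minimal sketch of such a check, assuming `python3` is the environment's interpreter. `have_py_module` is a hypothetical helper, not part of tb2UD, and it only looks for the top-level `udapi` package, not for the AGLDT-specific blocks.

```shell
# have_py_module NAME
# Succeeds if the Python module NAME can be located by the interpreter.
have_py_module() {
  python3 -c 'import importlib.util, sys
sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$1"
}

# Example:
#   have_py_module udapi || echo "udapi not found: check requirements.txt and your paths" >&2
```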