Convert Ancient Greek treebanks in the main formats supported by Arethusa and Perseids to UD. At the moment, it is designed to work with the treebanks compatible with:
- Python 3.6+
- the package udapi-python, with some additional scripts for working with the Perseus AGLDT formats; you can get them from my Udapy_AGLDT, which is installed automatically if you use the requirements.txt
Important: don't forget to add two folders to your $PYTHONPATH: tb2UD and tb2UD/tb2ud; for instance, you can do that with:
cd /path/to/tb2UD/
# prepend both folders, keeping anything already on PYTHONPATH
export PYTHONPATH="$(pwd):$(pwd)/tb2ud/${PYTHONPATH:+:$PYTHONPATH}"
Or, even better, create and configure a virtual environment (see next paragraph). Once it is set up, you simply have to add a .pth file (e.g. env.pth) to the <ENV>/lib/<PYTHON-VERSION>/site-packages folder.
If you don't know what a virtual environment is, you'll find a lot of good tutorials online, starting with this one. You may also want to consider virtualenvwrapper, which makes a lot of things easier to manage.
Follow these three steps:
- create and activate a virtual environment (Python 3.6+); see the link above if you don't know how to do it.
- install the required packages:
pip install -r requirements.txt
- create a pth file and enter the full paths to the tb2UD and tb2UD/tb2ud folders; see here.
If you have virtualenvwrapper, you also have an add2virtualenv script, which takes care of step 3 for you:
add2virtualenv directory1 directory2 ...
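If you would rather create the pth file of step 3 by hand, the following is a minimal sketch of what it amounts to. The helper name write_pth and the default file name env.pth are only illustrative, not part of tb2UD; the sketch relies on the fact that Python adds every existing path listed in a .pth file, one per line, to sys.path at startup.

```python
from pathlib import Path


def write_pth(site_packages, repo_root, name="env.pth"):
    """Write a .pth file listing the repo root and its tb2ud subfolder.

    write_pth is a hypothetical helper, not something shipped with tb2UD.
    """
    repo_root = Path(repo_root).resolve()
    # One absolute path per line: the tb2UD folder and tb2UD/tb2ud.
    entries = [repo_root, repo_root / "tb2ud"]
    pth_file = Path(site_packages) / name
    pth_file.write_text("".join(f"{entry}\n" for entry in entries), encoding="utf-8")
    return pth_file
```

With the virtual environment activated, sysconfig.get_paths()["purelib"] from the standard library gives the site-packages folder to pass as the first argument, and /path/to/tb2UD is the second.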
In the scripts folder, you'll find a few bash scripts that run some of the most frequently used commands.
You can test that everything is working fine by running the following script:
# test.sh <input-file.xml>
cd test # go to the tb2ud/test folder
./test.sh data/hdt-1-20-39-bu2.xml
(Note that the script attempts to read an AGLDT XML file; it fails if the appropriate udapi blocks are not found.)
If all goes well, you'll see a series of log entries, followed by the good old Hello, World! string.
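If the test fails instead, the usual suspects are a missing path entry or a missing udapi installation. The sketch below is one way to check both from the Python interpreter of your environment; on_sys_path and is_importable are hypothetical helper names, not part of tb2UD or udapi. It only checks that the top-level udapi package (the import name of udapi-python) can be found, not that the AGLDT-specific blocks are present.

```python
import importlib.util
import sys
from pathlib import Path


def on_sys_path(folder, search_path=None):
    """Return True if folder is one of the entries Python searches for modules."""
    search_path = sys.path if search_path is None else search_path
    target = Path(folder).resolve()
    # An empty entry in sys.path stands for the current directory.
    return any(Path(entry or ".").resolve() == target for entry in search_path)


def is_importable(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None
```

For example, on_sys_path("/path/to/tb2UD"), on_sys_path("/path/to/tb2UD/tb2ud") and is_importable("udapi") should all return True in a correctly configured environment.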