Requirements
============
* cmake
* Boost library
* PCRE library
Optional
--------
* RE2 - for faster regular expressions
* SWIG - for integration with scripting languages (e.g. Perl, Python)
* Perl library: development files - (Ubuntu package: libperl-dev) - for Perl bindings
* GraphViz library: development files - (Ubuntu package: libgraphviz-dev) - for gv-writer
* The following libraries (development files) for pdf-reader:
    * Poppler (Ubuntu package: libpoppler-dev)
    * Poppler-glib (Ubuntu package: libpoppler-glib-dev)
    * GTK2 (Ubuntu package: libgtk2.0-dev)
* LibMagic library: development files - (Ubuntu package: libmagic-dev)
* Antiword (Ubuntu package: antiword) - for doc-reader
* DjVuLibre library: development files - (Ubuntu package: libdjvulibre-dev) - for djvu-reader
* Link Grammar library: development files - (Ubuntu package: liblink-grammar4-dev) - for link-parser
* Bison and Flex tools - (Ubuntu packages: bison flex) - for gobio
* CMPH library: development files - (Ubuntu package: libcmph-dev)
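Before installing anything, it can help to see which of the command-line tools named above are already on the PATH. The following is a minimal sketch: the `missing_tools` helper is hypothetical (it is not part of PSI-Toolkit), and it checks only executables, not the development libraries.

```shell
#!/bin/sh
# Report which of the given command-line tools are missing from PATH.
# Prints one missing tool per line; returns 0 only if all were found.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Tools named in the lists above; libraries are not checked here.
if missing=$(missing_tools cmake bison flex swig); then
    echo "all listed tools found"
else
    echo "missing tools:" $missing
fi
```

Only cmake is strictly required; bison, flex and swig matter only for the optional components listed above.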
Compilation procedure on Ubuntu 16.04 LTS
=========================================
sudo apt-get install bison flex cmake libboost-all-dev libpcre3-dev libcmph-dev libz-dev libmagic-dev
sudo apt-get install swig libperl-dev python python-all-dev openjdk-6-jdk libgraphviz-dev libpoppler-dev libpoppler-glib-dev libgtk2.0-dev liblog4cpp5-dev libaspell-dev antiword libdjvulibre-dev liblink-grammar4-dev #optionally
mkdir build
cd build
cmake -DUSE_JAVA=OFF -DDOWNLOAD_DATA=OFF ..
make
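On a multi-core machine the build step can usually be sped up with parallel make jobs. The sketch below repeats the configure step above and then runs a parallel build; it assumes the generated makefiles tolerate parallel jobs, which this file does not confirm. The `make_jobs` helper is hypothetical.

```shell
#!/bin/sh
# Number of parallel make jobs: CPU count if nproc is available, else 1.
make_jobs() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        echo 1
    fi
}

# Same configure step as above, then a parallel build.
# Guarded so nothing runs outside a source checkout or without cmake.
if [ -f CMakeLists.txt ] && command -v cmake >/dev/null 2>&1; then
    mkdir -p build && cd build &&
    cmake -DUSE_JAVA=OFF -DDOWNLOAD_DATA=OFF .. &&
    make -j"$(make_jobs)"
fi
```

If the parallel build fails in a way the plain `make` does not, fall back to the sequential commands above.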
Running – example
=================
E.g. for simple tokenization use:
framework/psi-pipe tp-tokenizer --lang pl
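A minimal sketch of feeding text to the tokenizer follows. It assumes that psi-pipe reads its input from standard input, which this file does not state explicitly. The `PSI_PIPE` variable and the `tokenize_pl` helper are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Path to the freshly built binary, relative to the build directory.
PSI_PIPE=${PSI_PIPE:-framework/psi-pipe}

# Tokenize one line of Polish text, if the binary has been built.
tokenize_pl() {
    if [ -x "$PSI_PIPE" ]; then
        # Assumption: psi-pipe takes its input text on standard input.
        printf '%s\n' "$1" | "$PSI_PIPE" tp-tokenizer --lang pl
    else
        echo "psi-pipe not found at $PSI_PIPE; build it first" >&2
        return 1
    fi
}

tokenize_pl "Ala ma kota." || true
```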
Running tests
=============
There are two types of tests that can be run after compiling
psi-toolkit: unit tests and "mass" tests.
Before pushing any changes to the repository, always check whether
PSI-Toolkit passes ALL the tests:
./tests/test_runner
./tests/mass-tests ..
DO NOT PUSH A VERSION FAILING THE TESTS!!!
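The two commands above can be wrapped in a small script that stops at the first failure. This is only a sketch: the `run_checks` helper is hypothetical, and it assumes each command exits with a non-zero status when tests fail. For mass tests, the "Mass" tests section recommends also inspecting `git status`.

```shell
#!/bin/sh
# Run each check in turn and stop at the first one that fails.
# Each argument is a full command line, run from the build directory.
run_checks() {
    for check in "$@"; do
        echo "running: $check"
        if ! sh -c "$check"; then
            echo "FAILED: $check" >&2
            return 1
        fi
    done
    echo "all checks passed"
}

# The two commands from the text above, guarded so the sketch
# does nothing outside a build directory.
if [ -x ./tests/test_runner ] && [ -x ./tests/mass-tests ]; then
    run_checks "./tests/test_runner" "./tests/mass-tests .."
fi
```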
Unit tests
----------
To run unit tests, run:
./tests/test_runner
The unit tests are built with the Boost Test library, so `test_runner` accepts the options that library provides.
To see the options, run:
./tests/test_runner --help
In particular, you can limit the set of tests to be run by typing:
./tests/test_runner --run_test=TEST_SUITE_NAME
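As a sketch of combining options, the helper below runs a single suite with more verbose logging. `--run_test` and `--log_level` are standard Boost Test runtime parameters. The `run_suite` helper, the `TEST_RUNNER` variable and the suite name `tokenizer` are hypothetical placeholders, not names confirmed by this file.

```shell
#!/bin/sh
TEST_RUNNER=${TEST_RUNNER:-./tests/test_runner}

# Run one test suite (or SUITE/CASE) and log each test unit as it runs.
run_suite() {
    "$TEST_RUNNER" --run_test="$1" --log_level=test_suite
}

# "tokenizer" is a placeholder suite name, not one confirmed to exist.
if [ -x "$TEST_RUNNER" ]; then
    run_suite tokenizer
fi
```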
To add new unit tests, create a `t/` subdirectory and put the test
source files there; see `tools/parsers/gobio/t/` for an example.
"Mass" tests
------------
Mass tests consist of running a given PSI-pipe on some input and
checking whether the output is the same as expected. To run mass tests, type:
./tests/mass-tests ..
You can limit mass tests to some particular directory, e.g. in order to run tests of
tokenizers, use the command:
./tests/mass-tests ../tools/tokenizers/
Note that the output generated by mass-tests will overwrite the
original expected output! (That is why there will be no failures if you
simply run `mass-tests` a second time.) The failed tests can, however,
be easily identified by running `git status`. Please commit the
new test output if it is actually OK.
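Because failures show up as modified files, the list of affected tests can be pulled out of `git status`. The following is a minimal sketch: the `changed_paths` helper is hypothetical, and it does not handle renamed entries or paths that git prints in quotes.

```shell
#!/bin/sh
# Strip the two status characters and the separating space from
# `git status --porcelain` lines, leaving one path per line.
changed_paths() {
    cut -c4-
}

# List mass-test files that differ from the committed expected output.
# Assumes a git checkout with tests under ../tools/tokenizers/.
if [ -d ../tools/tokenizers ] &&
   git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    git status --porcelain -- ../tools/tokenizers/ | changed_paths
fi
```

Review each listed file with `git diff` before committing it as the new expected output.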
Installation procedure
======================
cd psi-toolkit
mkdir -p build
cd build
cmake -DCMAKE_INSTALL_PREFIX=/usr/local -DUSE_JAVA=ON -DIS_INSTALLABLE=ON -DCMAKE_BUILD_TYPE=Release ..
make
And as root:
make install
In general, however, it is better to use the pre-packaged PSI-Toolkit; see:
http://psi-toolkit.wmi.amu.edu.pl/download.html