spheroscope is a web app designed for argumentation mining. The backend is based on cwb-ccc, which runs multiply anchored CQP queries.
You will need a working installation of the IMS Open Corpus Workbench (CWB), a CWB-indexed corpus, as well as word embeddings for most of what this app offers. The Python3 dependencies will be installed automatically if you follow the setup guide below.
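Before installing, it can help to confirm that the CWB command-line tools are reachable. The following is a minimal sketch, not part of spheroscope: the helper name find_cwb_tools is hypothetical, and it assumes the standard CWB binaries cqp and cwb-describe-corpus should be on your PATH.

```python
import shutil


def find_cwb_tools(tools=("cqp", "cwb-describe-corpus")):
    """Map each tool name to its resolved path, or None if it is not on PATH.

    Hypothetical helper for illustration; spheroscope does not ship it.
    """
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in find_cwb_tools().items():
        print(f"{tool}: {path or 'NOT FOUND on PATH'}")
```

A missing entry suggests that CWB is either not installed or installed outside the directories listed in PATH.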
The recommended way is to use pipenv:
python -m pip install pipenv
pipenv install --dev
which creates a virtual environment and installs all required packages specified in the Pipfile.
Alternatively, you can use setup.py or the complete requirements file.
Configure the app via cfg.py in the app folder. You can find an example config file in the repository.
- set REGISTRY_PATH to your CWB registry
- set CACHE_PATH to some directory where you have write access
- set DB_NAME (sqlite3 file relative to instance), DB_USERNAME and DB_PASSWORD for your local database
- set REMOTE_USERNAME and REMOTE_PASSWORD if you have access to the remote database
- set EMBEDDINGS to a dictionary of CWB_HANDLE:EMBEDDINGS_PATH
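As a rough illustration of the settings listed above, a cfg.py could look like the sketch below. Only the setting names come from this README; every value (paths, file names, credentials, the corpus handle MY_CORPUS) is a placeholder assumption, and the example config file in the repository remains the authoritative template.

```python
# cfg.py -- illustrative sketch; all values below are placeholders, not defaults
import os

# Directory containing the CWB registry files (assumed location)
REGISTRY_PATH = "/usr/local/share/cwb/registry"

# Any directory you can write to (hypothetical choice)
CACHE_PATH = os.path.join(os.path.expanduser("~"), ".cache", "spheroscope")

# Local database: sqlite3 file name relative to the instance folder
DB_NAME = "spheroscope.sqlite"
DB_USERNAME = "admin"        # placeholder
DB_PASSWORD = "change-me"    # placeholder

# Only relevant if you have access to the remote database
# REMOTE_USERNAME = "..."
# REMOTE_PASSWORD = "..."

# Mapping of CWB corpus handle -> path to embeddings (hypothetical entry)
EMBEDDINGS = {
    "MY_CORPUS": "/data/embeddings/my_corpus.magnitude",
}
```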
You can then
make init
to create the database and
make library
to populate your local instance with all queries, macros, and wordlists from the library.
If you have access to the remote database, you can also
make patterns
make gold
to get the latest versions of patterns and adjudicated annotation (the "gold standard").
You can now start the Flask development server via
make run
and navigate to http://127.0.0.1:5000/ to access the interface.
Requirements:
sudo apt install apache2-dev
Clone the repository and cd into the folder. Create and activate a virtual environment, then install the app and mod_wsgi:
python3 -m venv venv
. venv/bin/activate
pip install .
pip install mod_wsgi
Create and modify cfg.py. Initialise the database:
export FLASK_APP=spheroscope
flask init-db
flask import-lib
Run the WSGI server:
mod_wsgi-express start-server wsgi.py --processes 4
After starting the app, you will find default corpus settings in your instance folder. You can change the defaults to your liking, taking into consideration the most common p- and s-attributes of your system corpora.
When selecting one of your corpora for the first time through the interface, a new folder and config file will be created for this corpus in your instance folder; the config file will be populated with the corpus defaults. Most settings, such as the query and display parameters, can be changed through the interface (http://127.0.0.1:5000/corpora/).
If you want to use similarity-based recommendations for wordlists, you should point the embeddings parameter to appropriate embeddings stored as pymagnitude files.
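To give an idea of what a similarity-based recommendation means, the sketch below ranks candidate words by cosine similarity to the centroid of an existing wordlist. It is a generic illustration on a toy, hand-written vector table, not spheroscope's actual implementation; the function names and the data are assumptions, and real embeddings would be loaded from the configured files instead.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(wordlist, vectors, topn=3):
    """Return up to topn words outside the wordlist, most similar to its centroid first."""
    known = [vectors[w] for w in wordlist if w in vectors]
    if not known:
        return []
    dim = len(known[0])
    centroid = [sum(v[i] for v in known) / len(known) for i in range(dim)]
    candidates = [
        (word, cosine(centroid, vec))
        for word, vec in vectors.items()
        if word not in wordlist
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in candidates[:topn]]


# Toy two-dimensional "embeddings", invented for this example
toy_vectors = {
    "claim": [1.0, 0.1],
    "assert": [0.9, 0.2],
    "argue": [0.8, 0.3],
    "banana": [0.0, 1.0],
}

print(recommend(["claim", "assert"], toy_vectors, topn=1))
```

With this toy table, "argue" lies much closer to the centroid of "claim" and "assert" than "banana" does, so it is ranked first.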
The gold standard is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC-BY-SA 4.0).
The code and all other assets are licensed under the GNU General Public License version 3 (GPL-3).