spheroscope

spheroscope is a web app designed for argumentation mining. The backend is based on cwb-ccc, which runs multiply anchored CQP queries.

Setup

Prerequisites

For most of what this app offers, you will need a working installation of the IMS Open Corpus Workbench (CWB), a CWB-indexed corpus, and word embeddings. The Python 3 dependencies are installed automatically if you follow the setup guide below.

Installation

The recommended way is to use pipenv:

python -m pip install pipenv
pipenv install --dev

which creates a virtual environment and installs all required packages specified in the Pipfile.

Alternatively, you can use setup.py or the complete requirements file.

Configuration

Configure the app via cfg.py in the app folder. You can find an example config file in the repository.

  • set REGISTRY_PATH to your CWB registry
  • set CACHE_PATH to some directory where you have write access
  • set DB_NAME (sqlite3 file relative to instance), DB_USERNAME and DB_PASSWORD for your local database
  • set REMOTE_USERNAME and REMOTE_PASSWORD if you have access to the remote database
  • set EMBEDDINGS to a dictionary of CWB_HANDLE:EMBEDDINGS_PATH
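As a rough illustration of the settings listed above, a cfg.py might look like the sketch below. Only the variable names come from this README; every path, file name, and credential is a placeholder, and the example config file in the repository remains the authoritative reference.

```python
# cfg.py: illustrative sketch only. All values below are placeholders
# (assumptions), not defaults shipped with spheroscope.
import os

# Directory containing the CWB registry files of your corpora (assumed location)
REGISTRY_PATH = "/usr/local/share/cwb/registry"

# Any directory you can write to (assumed location)
CACHE_PATH = os.path.join(os.path.expanduser("~"), ".cache", "spheroscope")

# Local database: sqlite3 file name relative to the instance folder, plus credentials
DB_NAME = "spheroscope.sqlite"
DB_USERNAME = "admin"
DB_PASSWORD = "change-me"

# Only needed if you have access to the remote database
# REMOTE_USERNAME = "your-remote-user"
# REMOTE_PASSWORD = "your-remote-password"

# Maps each CWB corpus handle to the path of its embeddings file
EMBEDDINGS = {
    "MY_CORPUS": "/path/to/embeddings.magnitude",
}
```

MY_CORPUS stands in for the CWB handle of one of your own corpora; add one entry per corpus that should offer similarity-based recommendations.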

Loading Resources

You can then

make init

to create the database and

make library

to populate your local instance with all queries, macros, and wordlists from the library.

If you have access to the remote database, you can also

make patterns
make gold

to get the latest versions of patterns and adjudicated annotation (the "gold standard").

Running in Development Mode

You can now start the Flask development server via

make run

and navigate to http://127.0.0.1:5000/ to access the interface.

Running in Production

Requirements:

sudo apt install apache2-dev

Clone the repository and cd into the folder. Then create a virtual environment and install the package together with mod_wsgi:

python3 -m venv venv
. venv/bin/activate
pip install .
pip install mod_wsgi

Create cfg.py and adjust it to your setup. Then initialise the database:

export FLASK_APP=spheroscope
flask init-db
flask import-lib

Run the WSGI server:

mod_wsgi-express start-server wsgi.py --processes 4

Corpus Settings

After starting the app, you will find default corpus settings in your instance folder. You can change the defaults to your liking, taking into consideration the most common p- and s-attributes of your system corpora.

When selecting one of your corpora for the first time through the interface, a new folder and config file will be created for this corpus in your instance folder; the config file will be populated with the corpus defaults. Most settings, such as the query and display parameters, can be changed through the interface (http://127.0.0.1:5000/corpora/).
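The "create on first use, fill with defaults" behaviour described above can be pictured with the following minimal sketch. It is not spheroscope's actual code: the function name, the config.json file name, the JSON format, and the example settings are all assumptions made for illustration.

```python
import json
from pathlib import Path


def ensure_corpus_config(instance_dir, cwb_id, defaults):
    """Return the settings for a corpus, creating them from defaults if missing.

    Hypothetical helper: folder layout and file format are assumptions.
    """
    corpus_dir = Path(instance_dir) / cwb_id
    corpus_dir.mkdir(parents=True, exist_ok=True)  # new folder on first selection

    cfg_file = corpus_dir / "config.json"
    if not cfg_file.exists():
        # First selection: populate the corpus config with the defaults
        cfg_file.write_text(json.dumps(defaults, indent=2), encoding="utf-8")

    # Later selections: the stored (possibly edited) settings win over defaults
    return json.loads(cfg_file.read_text(encoding="utf-8"))
```

Under these assumptions, the first call for a corpus writes the defaults to disk, and every later call returns whatever is stored there, so changes made to a corpus config are not overwritten when the defaults change.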

If you want to use similarity-based recommendations for wordlists, you should point the embeddings parameter to appropriate embeddings stored as pymagnitude files.
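To give an idea of what a similarity-based recommendation involves, the toy sketch below ranks candidate words by cosine similarity to the centroid of an existing wordlist. It deliberately uses a small hand-made dictionary of vectors instead of pymagnitude so that it runs without any embeddings file; the function names and vectors are hypothetical, and the app's own ranking method may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(wordlist, vectors, topn=3):
    """Suggest words outside the wordlist that are closest to its centroid."""
    seeds = [vectors[w] for w in wordlist if w in vectors]
    if not seeds:
        return []  # nothing to compare against
    dim = len(seeds[0])
    centroid = [sum(vec[i] for vec in seeds) / len(seeds) for i in range(dim)]
    candidates = [w for w in vectors if w not in wordlist]
    ranked = sorted(candidates, key=lambda w: cosine(vectors[w], centroid), reverse=True)
    return ranked[:topn]


# Hand-made three-dimensional toy vectors, for illustration only
TOY_VECTORS = {
    "should": [0.9, 0.1, 0.0],
    "must": [0.8, 0.2, 0.0],
    "ought": [0.85, 0.15, 0.05],
    "banana": [0.0, 0.1, 0.9],
}
```

With these toy vectors, a wordlist containing "should" and "must" ranks "ought" ahead of "banana", which is the kind of suggestion real embeddings would provide at scale.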

License

The gold standard is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC-BY-SA 4.0). http://creativecommons.org/licenses/by-sa/4.0/

The code and all other assets are licensed under the GNU General Public License version 3 (GPL-3).