Docker Setup of the OpenWordBible Web Application
-
Download the repository
-
Build the Docker image
docker build --no-cache -t openwordbible_1.0.0 .
-
Run the docker image
docker run -p 8000:8000 openwordbible_1.0.0
-
Note: Rebuilding overwrites the image, but any container created from the old image still has to be removed. You can delete the container in Docker Desktop or run:
docker ps
docker stop <container_id>
docker rm <container_id>
Running Application Locally
-
Requirements
Python 3.11.2 is needed to allow TensorFlow to install
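Because the version requirement is easy to miss, a small check can be run before creating the environment. This is a minimal sketch: it assumes any 3.11.x patch release is acceptable, which this README does not confirm (only 3.11.2 is stated), and `is_supported_python` is a hypothetical helper name.

```python
import sys

# Version named in this README; accepting any 3.11.x patch release is an assumption.
REQUIRED_MAJOR_MINOR = (3, 11)


def is_supported_python(version_info=None):
    """Return True when the interpreter's major.minor version is 3.11."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == REQUIRED_MAJOR_MINOR


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported_python():
        print(f"Python {found} matches the 3.11 requirement")
    else:
        print(f"Python {found} found; this project expects 3.11.2")
```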
-
Create a Python Environment
python -m venv env
-
Start Python Environment (Windows; on Linux or macOS use: source env/bin/activate)
env\Scripts\activate
-
Install dependencies
pip install -r requirements.txt
python -m spacy download en_core_web_sm
-
Update Django Settings
Warning ❗ Open openwordbible/settings
- Add the IP of the host inside the brackets: ALLOWED_HOSTS = []
- Change the domain to the IP of the host: SESSION_COOKIE_DOMAIN = '.localhost'
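After both edits, the two settings might look like the fragment below. This is only a sketch: `HOST_IP` is a hypothetical helper variable, and 192.168.56.101 is borrowed from the chatbot example later in this README, so replace it with the address of your own host.

```python
# Fragment for openwordbible/settings; every value here is illustrative.
# 192.168.56.101 is the example address used for the chatbot server in
# this README; replace it with the IP of your own host.
HOST_IP = "192.168.56.101"

# Hosts that Django will accept in the HTTP Host header.
ALLOWED_HOSTS = [HOST_IP]

# Domain written into the session cookie, following the step above.
SESSION_COOKIE_DOMAIN = HOST_IP
```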
-
Run Django Application
Running locally
python manage.py runserver
Running over the network
python manage.py runserver 0.0.0.0:8000
-
Test Application
Go to http://localhost:8000/ in the browser when running the application locally.
Go to http://serveraddress:8000/ in the browser when running the application over the network.
❗ serveraddress is the IP address of the host server
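If the page does not load, it can help to confirm first that something is listening on the port at all. The following is a minimal sketch using only the Python standard library; `is_port_open` is a hypothetical helper, and it only checks that a TCP connection succeeds, not that Django answers correctly.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False


if __name__ == "__main__":
    # 8000 is the port used by runserver in this README.
    print("Django reachable:", is_port_open("localhost", 8000))
```

For the network case, pass the host server's IP address instead of "localhost".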
Running Chatbot server
Server edits
Edit the IP addresses in the app.py file to the server host IP address
origins = [
    "http://localhost:8000",
    "http://127.0.0.1:8000",
    "http://192.168.56.101:8000",
]
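The list repeats the same port for every host, so it can be derived from the host IP alone. This is a minimal sketch, assuming app.py needs only these three entries (presumably the allowed CORS origins); `build_origins` is a hypothetical helper and not part of the repository.

```python
def build_origins(host_ip, port=8000):
    """Return the front-end origins for localhost, 127.0.0.1, and host_ip."""
    hosts = ["localhost", "127.0.0.1", host_ip]
    return [f"http://{host}:{port}" for host in hosts]


# 192.168.56.101 is the example address from the list above.
origins = build_origins("192.168.56.101")
print(origins)
```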
Front end edits
Edit the IP address in the aiChatbot.js file to the host IP address
const response = await fetch("http://192.168.56.101:9000/query",
Start the RAG server
uvicorn app:app --host 0.0.0.0 --port 9000
Guidelines for the NER dataset
Labels
BIO Format Summary
B-XXX: Beginning of entity type XXX
I-XXX: Inside of the same entity
O: Outside any named entity
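To make the scheme concrete, the sketch below groups BIO-tagged tokens into entities and rejects an I-XXX tag that has no matching B-XXX. It is an illustration under stated assumptions, not the project's own tooling: `extract_entities` is a hypothetical helper. Because the TIM examples in these guidelines tag punctuation inside an entity as O (for example B-TIM I-TIM O I-TIM), the sketch lets I-XXX resume the most recent entity across O tags.

```python
def extract_entities(tokens, tags):
    """Group BIO-tagged tokens into (label, text) pairs."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    entities = []  # each item is [label, [token, ...]]
    for token, tag in zip(tokens, tags):
        if tag == "O":
            continue  # outside any entity, or punctuation inside one
        prefix, _, label = tag.partition("-")
        if prefix == "B" and label:
            entities.append([label, [token]])
        elif prefix == "I" and label:
            # I-XXX must continue the most recent entity of the same type.
            if not entities or entities[-1][0] != label:
                raise ValueError(f"{tag} on {token!r} has no matching B-{label}")
            entities[-1][1].append(token)
        else:
            raise ValueError(f"unknown tag {tag!r}")
    return [(label, " ".join(words)) for label, words in entities]


print(extract_entities(
    ["Abraham", "Lincoln", "visited", "New", "York"],
    ["B-PER", "I-PER", "O", "B-GPE", "I-GPE"],
))
# → [('PER', 'Abraham Lincoln'), ('GPE', 'New York')]
```

A sequence such as B-PER I-ORG raises ValueError, since an I-XXX tag may only follow an entity of the same type.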
Entity Labels and Examples
PER: Person names (e.g., 'Abraham Lincoln' -> B-PER I-PER)
ORG: Organizations (e.g., 'United Nations' -> B-ORG I-ORG)
GPE: Countries, cities, states (e.g., 'New York' -> B-GPE I-GPE)
LOC: Other locations (e.g., 'Grand Canyon' -> B-LOC I-LOC)
GEO: Natural features (e.g., 'Nile River' -> B-GEO I-GEO)
TIM: Time expressions (e.g., 'August 15, 1947' -> B-TIM I-TIM O I-TIM; 'A.D.' -> B-TIM O I-TIM O)
NAT: Natural entities (e.g., 'COVID-19' -> B-NAT; 'Hurricane Ian' -> B-NAT I-NAT)
EVE: Named events (e.g., 'World War II' -> B-EVE I-EVE I-EVE)
ART: Artworks (e.g., 'Starry Night' -> B-ART I-ART)
MISC: Miscellaneous named items (e.g., 'Western culture' -> B-MISC I-MISC)
Common Mistakes
- Do not mix entity types in one phrase (e.g., B-PER I-ORG is incorrect)
- Always start an entity with B-XXX
- Label punctuation as O unless it is part of the named entity
- Use B-XXX for each new entity mention, even if it has the same type as the previous one
LOC vs GEO
| Entity | Tag | Why |
|---|---|---|
| Grand Canyon | LOC | Often treated as a tourist destination or landmark |
| Mississippi River | GEO | A natural river, not man-made |
| 5th Avenue | LOC | A street: man-made, not a natural feature |
| Sahara Desert | GEO | A natural desert |
| Central Park | LOC | A city park designed by humans |
| Himalayas | GEO | Mountain range, a natural formation |

B-TIM, I-TIM
3 → B-TIM
: → O
15 → I-TIM
A → B-TIM
. → O
D → I-TIM
. → O
Explicit dates
January 5th → B-TIM I-TIM
03/17/2022 → B-TIM
the 14th of July → B-TIM I-TIM I-TIM I-TIM
Specific Times of Day
3:45 PM → B-TIM O I-TIM I-TIM
at noon → B-TIM I-TIM
12 o'clock → B-TIM I-TIM
Date + Time Combinations
January 1st, 2023 at 4 PM → B-TIM I-TIM O I-TIM I-TIM I-TIM I-TIM
Relative Time Expressions
yesterday → B-TIM
last week → B-TIM I-TIM
two days ago → B-TIM I-TIM I-TIM
next year → B-TIM I-TIM
in 5 minutes → B-TIM I-TIM I-TIM
Durations
for three hours → B-TIM I-TIM I-TIM
about 10 minutes → B-TIM I-TIM I-TIM
Time Ranges
from 10am to 2pm → B-TIM I-TIM I-TIM I-TIM
between Monday and Friday → B-TIM I-TIM I-TIM I-TIM
NER Label Set for Prodigy / Doccano
Prodigy:
{
"labels": ["PER", "ORG", "GPE", "LOC", "GEO", "TIM", "NAT", "EVE", "ART", "MISC"]
}
Doccano YAML:
- [PER, Person]
- [ORG, Organization]
- [GPE, Geo-political Entity]
- [LOC, Other Location]
- [GEO, Geographical Feature]
- [TIM, Time Expression]
- [NAT, Natural Entity]
- [EVE, Event]
- [ART, Artwork]
- [MISC, Miscellaneous]
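Both configurations list only the bare label names, while a token classifier trained on the exported annotations typically works with the full BIO tag inventory. The following is a minimal sketch of that expansion; `bio_tags` and `TAG_TO_ID` are hypothetical names, and placing "O" at index 0 and keeping the label order above are assumptions.

```python
# Label set copied from the Prodigy configuration above.
LABELS = ["PER", "ORG", "GPE", "LOC", "GEO", "TIM", "NAT", "EVE", "ART", "MISC"]


def bio_tags(labels):
    """Return "O" followed by B-XXX and I-XXX for every label."""
    tags = ["O"]
    for label in labels:
        tags.extend([f"B-{label}", f"I-{label}"])
    return tags


# Map each tag to an integer id, in the order produced by bio_tags.
TAG_TO_ID = {tag: index for index, tag in enumerate(bio_tags(LABELS))}
print(len(TAG_TO_ID))  # → 21
```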