ONET taxonomy classification

This repository collects tools and resources for classifying occupations with the Occupational Information Network (O*NET, written ONET here) taxonomy. The taxonomy provides a standardized framework for categorizing and organizing occupational information, which makes it useful for workforce development, career guidance, and labor market analysis.

Install dependencies

Install all the project dependencies by running the following command in the project's root directory:

poetry install

Then activate the virtual environment with:

poetry shell

Note: Make sure you have Poetry installed on your system. If not, you can install it using:

pip install poetry

Other prerequisites

Before running the code, ensure the following API keys and settings are stored in the .env file:

  • PINECONE_ENVIRONMENT
  • PINECONE_API_KEY
  • OPEN_AI_API_KEY
  • OPEN_AI_API_TYPE
  • OPEN_AI_API_VERSION
  • OPEN_AI_ENDPOINT
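
A missing setting tends to surface only as a failure partway through a run, so it can help to check for all six up front. The sketch below is an illustration and not part of this repository: `missing_keys` is a hypothetical helper, and it only inspects variables already present in the process environment. It does not parse the .env file itself, so it assumes the file has already been loaded by whatever mechanism the project uses.

```python
import os

# Names copied from the list above.
REQUIRED_KEYS = [
    "PINECONE_ENVIRONMENT",
    "PINECONE_API_KEY",
    "OPEN_AI_API_KEY",
    "OPEN_AI_API_TYPE",
    "OPEN_AI_API_VERSION",
    "OPEN_AI_ENDPOINT",
]


def missing_keys(env=None):
    """Return the required keys that are unset or empty in `env`.

    `env` defaults to the current process environment (hypothetical helper).
    """
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present.")
```

Passing a plain dictionary instead of the real environment makes the check easy to try without touching any actual credentials.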

Ensure you have the following folders created: checkpoints, data.

Ensure you have test_data.csv and train_data.csv files in the data folder.
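
The folder and file requirements above can be verified before starting a run. This is a minimal sketch under stated assumptions: `missing_paths` is a hypothetical helper, not something this repository provides, and it assumes the folders sit directly under the project root.

```python
from pathlib import Path

# Paths taken from the prerequisites above, relative to the project root.
REQUIRED_DIRS = ["checkpoints", "data"]
REQUIRED_FILES = ["data/test_data.csv", "data/train_data.csv"]


def missing_paths(root="."):
    """Return the required folders and files that do not exist under `root`."""
    root = Path(root)
    missing = [d for d in REQUIRED_DIRS if not (root / d).is_dir()]
    missing += [f for f in REQUIRED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required folders and files are present.")
```

The check only confirms that the paths exist; it does not validate the contents or column layout of the CSV files.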

Run the code

Run the following to create the model checkpoint, the embeddings files, and the label encoder checkpoint:

python src/main.py

If you already have the artifacts mentioned above, run this to obtain predictions:

python src/predict.py

MEMO / Documentation

For more details on the implementation, see the MEMO.md file.

Issues

OpenAI issue

If you encounter errors related to the installed openai package version, pin it to 0.28:

pip install openai==0.28

Path issue

If you encounter import or relative-path errors, run this from the project's root directory:

export PYTHONPATH=$PWD