UpLabel

UpLabel is a lightweight, modular Python tool that supports machine learning tasks by making the data labeling process more efficient, automated, and structured. The current version focuses mainly on text classification tasks.

Software Component Flow


User Flow


Setup

  1. Create a conda environment from environment.yml (conda env create -f environment.yml)
  2. Open and run the 'Test - Pipeline' notebook

Authors

Timm Walz (@nonstoptimm)
Martin Kayser (@maknotavailable)

Glossary

Word        Description
label       target category used by the model and to be labeled
pred(_id)   predicted label and its respective numeric identifier
split       a subset of the data, to be distributed for manual labeling
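
To make the glossary terms concrete, here is a minimal sketch of how they could relate to each other. It is not UpLabel's actual implementation: the record layout, the LABEL_IDS mapping, and the make_splits helper are all hypothetical names chosen for illustration, and the round-robin distribution is just one simple way to form splits.

```python
import random

# Hypothetical mapping from label names to numeric identifiers (pred_id).
LABEL_IDS = {"complaint": 0, "question": 1}

# Hypothetical records: "label" is still empty (to be filled in manually),
# "pred" is the label predicted by a model.
records = [
    {"text": "My order never arrived", "label": None, "pred": "complaint"},
    {"text": "How do I reset my password?", "label": None, "pred": "question"},
    {"text": "The app crashes on startup", "label": None, "pred": "complaint"},
    {"text": "Where can I find my invoice?", "label": None, "pred": "question"},
]

# Attach the numeric identifier that corresponds to each predicted label.
for record in records:
    record["pred_id"] = LABEL_IDS[record["pred"]]


def make_splits(items, n_splits, seed=0):
    """Shuffle a copy of items and deal them round-robin into n_splits subsets."""
    rng = random.Random(seed)
    shuffled = items[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_splits] for i in range(n_splits)]


# Each split could then be handed to a different person for manual labeling.
splits = make_splits(records, n_splits=2)
for index, split in enumerate(splits):
    print(f"split {index}: {len(split)} records")
```

Fixing the seed keeps the assignment reproducible, so the same record always lands in the same split when the helper is rerun on unchanged data.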

Open Tasks

In Progress

  • Host as a service in Azure (via FA)
  • Improve complexity calculation

TODO

  • Integrate with neanno frontend (https://github.com/timoklimmer/neanno)
  • Support for Named Entity Recognition tasks
  • Support for Multi-Class Classification tasks
  • Active learning: targeted false positives
  • Smart join: label quality score
  • Smart load: data integrity validation
  • Auto-create labeling documentation