dell-research-harvard

Popular repositories

  1. linktransformer (Public)

    A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning

    Python · 127 stars · 11 forks

  2. AmericanStories (Public)

    The official GitHub repository for the American Stories dataset as in {link}

    Python · 125 stars · 9 forks

  3. effocr (Public)

    A model(ing framework) for sample-efficient OCR

    Python · 62 stars · 7 forks

  4. HJDataset (Public)

    A Large Dataset of Historical Japanese Documents with Complex Layouts

    Jupyter Notebook · 34 stars · 4 forks

  5. NEWS-COPY (Public)

    Noise-robust de-duplication at scale

    Python · 20 stars · 2 forks

  6. efficient_ocr (Public)

    Efficient OCR for Building a Diverse Digital History

    Python · 10 stars · 1 fork

Repositories

Showing 10 of 29 repositories
