Top2Vec learns jointly embedded topic, document and word vectors.
Updated Nov 14, 2024 - Python
Find parts of long text or data, allowing for some changes/typos.
hotpdf is a fast PDF parsing library, built on top of pdfminer.six, for extracting and finding text within PDF documents.
⚡ A Telegram bot for searching all the stickers (just like @gif).
Expose a Top2Vec model with a REST API.
Simple full text search demo for Google App Engine
A static site generator for Zettelkasten notes
Turn PostgreSQL into your search engine in a Pythonic way.
Text preprocessing, representation, similarity calculation, text search and classification. Let's go and play with text!
State-of-the-art embedding models fine-tuned for the ecommerce domain. +67% increase in evaluation metrics vs ViT-B-16-SigLIP.
semantic-sh is a SimHash implementation that detects and groups similar texts by harnessing the power of word vectors and transformer-based language models (BERT).
Fast fuzzy text search
(MIRROR) Text search engine that runs on a local service. Includes a pipeline for preprocessing a user-defined image dataset.
kawadi is my collection of tools that I need frequently.
A desktop file search tool that uses ripgrep. Implemented using PySide6.
Read a Mastodon archive and create a SQLite database of archived post content.
Basic and Full-text Search in Django
An advanced, cross-platform file content search tool with a Tkinter GUI. Features advanced filtering, multi-format support (PDF, DOCX, ZIP), content preview, and search analytics.
High-quality congressional bill search with hybrid BM25+vector similarity using DuckDB, TEI embeddings, and GovInfo API. Local deployment with Docker.
This is a Python script that searches for a specific text within all the PDF files in a folder.