Added article recommendation app
singhayush7 committed Dec 22, 2024
1 parent 64bd1bb commit 6f47601
Showing 20 changed files with 687 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -191,6 +191,7 @@ These are ready to use applications built using LanceDB serverless vector databa
|-----------------------------------------------------|----------------------------------------------------------------------------------------------------------------------|-------------------------------------------|
| [Writing assistant](https://github.com/lancedb/vectordb-recipes/tree/main/applications/node/lanchain_writing_assistant) | Writing assistant app using LangChain.js with LanceDB. It gives you real-time, relevant suggestions and facts based on your written text to help you with your writing. | ![Writing assistant](https://github.com/user-attachments/assets/87354e93-df4d-40ad-922b-abcbb62d667c) |
| [Sentence auto complete](https://github.com/lancedb/vectordb-recipes/tree/main/applications/node/sentance_auto_complete) | Sentence auto-complete app using LangChain.js with LanceDB. It gives you real-time, relevant auto-complete suggestions and facts based on your written text to help you with your writing. You can also upload your data source as a PDF file and switch between GPT models to get faster results. | ![Sentence auto complete](https://github.com/lancedb/assets/blob/main/recipes/sentance_Auto_complete.gif) |
| [Article Recommendation](https://github.com/lancedb/vectordb-recipes/tree/main/applications/node/article_recommender) | Article Recommender: explore a vast dataset of articles with instant, context-aware suggestions. Using NLP, vector search, and customizable datasets, the app delivers real-time article recommendations for research, content curation, and staying informed. | ![Article Recommendation](https://github.com/lancedb/assets/blob/main/recipes/article_recommendation_engine.gif) |
||||

| Project Name | Description | Screenshot |
24 changes: 24 additions & 0 deletions applications/node/article_recommender/.gitignore
@@ -0,0 +1,24 @@
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*
lerna-debug.log*

node_modules
dist
dist-ssr
*.local

# Editor directories and files
.vscode/*
!.vscode/extensions.json
.idea
.DS_Store
*.suo
*.ntvs*
*.njsproj
*.sln
*.sw?
161 changes: 161 additions & 0 deletions applications/node/article_recommender/README.md
@@ -0,0 +1,161 @@
**AI-Powered Article Recommendation System**
============================================

An advanced **AI-driven article recommendation engine** designed to process and retrieve **relevant articles** from a vast dataset of over **2 million articles**. This tool provides real-time, **context-aware article suggestions** by leveraging advanced **vector search** and **natural language processing (NLP)** technologies.

**Demo**
--------

![Article Recommendation Engine Demo](https://github.com/lancedb/assets/blob/main/recipes/article_recommendation_engine.gif)


* * * * *

**Features**
------------

- 🔍 **Keyword-Based Search**: Input any keyword or phrase, and get the **top 10 relevant articles** instantly.
- 🌐 **Massive Dataset Support**: Efficiently processes and retrieves results from a **dataset of over 2 million articles**.
- 📈 **High Precision Recommendations**: Articles are ranked based on semantic similarity and relevance using state-of-the-art embeddings.
- 🧠 **AI-Powered Relevance**: Built with **LangChain.js** and **LanceDB** for robust NLP and vector search capabilities.

* * * * *

**How It Works**
----------------

1. **Data Preprocessing**: Articles are divided into smaller, context-preserving chunks using **RecursiveCharacterTextSplitter**.\
Example configuration:

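`import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 25000, // Maximum characters per chunk; adjust for optimal performance
  chunkOverlap: 1, // Characters shared between consecutive chunks; raise this for more context continuity
});`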

2. **Vector Embedding**: The preprocessed data is embedded using **OpenAIEmbeddings**.
3. **Efficient Storage**: Embedded vectors are stored in **LanceDB**, optimized for high-speed similarity search.
4. **Query and Retrieval**: User input is embedded and matched against the stored vectors to retrieve the **top 10 semantically similar articles**.
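
To make the role of `chunkSize` and `chunkOverlap` in step 1 concrete, here is a minimal sketch of fixed-size chunking with overlap. It is a simplified stand-in, not the app's code: `chunkText` is a hypothetical helper, and unlike **RecursiveCharacterTextSplitter** it does not try to break on paragraph or sentence boundaries.

```javascript
// Simplified stand-in for a text splitter: fixed-size character windows
// with overlap. `chunkText` is a hypothetical name used for illustration.
function chunkText(text, chunkSize, chunkOverlap) {
  if (chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be smaller than chunkSize");
  }
  const chunks = [];
  const step = chunkSize - chunkOverlap; // how far each window advances
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached the end
  }
  return chunks;
}

// Each chunk repeats the last character of the previous one (overlap of 1).
const chunks = chunkText("abcdefghij", 4, 1);
console.log(chunks);
```

A larger overlap repeats more text across neighbouring chunks, which costs storage and embedding calls but keeps more context at chunk boundaries.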

* * * * *

**Technical Highlights**
------------------------

- **Advanced Vector Search**: Uses LanceDB to enable fast and scalable similarity searches across millions of articles.
- **Real-Time Results**: The system retrieves and ranks articles within milliseconds.
- **Customizable Dataset**: Easily replace the default dataset or upload custom datasets in `.csv` or `.txt` formats.
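
As a rough illustration of what a similarity search computes, the sketch below ranks toy vectors by cosine similarity using brute force. This is an assumption-laden example: the app delegates search to LanceDB, the similarity metric it uses is not stated here, and `cosineSimilarity` and `topK` are hypothetical helper names.

```javascript
// Illustrative only: brute-force cosine-similarity ranking over toy vectors.
// `cosineSimilarity` and `topK` are hypothetical helpers, not library calls.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function topK(queryVector, records, k = 10) {
  return records
    .map((record) => ({ ...record, score: cosineSimilarity(queryVector, record.vector) }))
    .sort((x, y) => y.score - x.score) // highest similarity first
    .slice(0, k);
}

// Toy 3-dimensional "embeddings"; real embedding vectors have many more dimensions.
const records = [
  { title: "Markets rally", vector: [0.9, 0.1, 0.0] },
  { title: "Election recap", vector: [0.1, 0.9, 0.1] },
  { title: "Stocks and bonds", vector: [0.8, 0.2, 0.1] },
];

const results = topK([1, 0, 0], records, 2);
console.log(results.map((r) => r.title));
```

Brute force compares the query with every record, which is why a vector database with an index is preferable at the scale of millions of articles.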

* * * * *

**Use Cases**
-------------

- **Research and Academic Work**: Find articles that are most relevant to your research topic.
- **Content Curation**: Discover the best content for blogs, newsletters, or social media.
- **Media Monitoring**: Track trends and news articles efficiently.
- **Educational Insights**: Access curated learning material on any subject.

* * * * *

**Getting Started**
-------------------

### **1\. Prerequisites**

- **Node.js** version **20+**
- A valid [OpenAI API Key](https://platform.openai.com/signup)

### **2\. Installation**

Clone the repository and install dependencies:


`git clone <repository-url>
cd <repository-folder>
npm install`

### **3\. Configure API Key**

Add your OpenAI API key in `.env`:

`OPENAI_API_KEY=your_openai_key`
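
A missing key tends to surface later as an opaque authentication error, so it can help to check for it at startup. The following is a minimal sketch under stated assumptions: `requireEnv` is a hypothetical helper, and how the app loads `.env` is not shown in this README (Node.js 20.6 and later can load it with the `--env-file=.env` flag).

```javascript
// Hypothetical helper: read a required environment variable or fail fast.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (!value || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Example with an explicit object so the snippet runs without a real key.
// In the app you would call requireEnv("OPENAI_API_KEY") with no second argument.
const apiKey = requireEnv("OPENAI_API_KEY", { OPENAI_API_KEY: "your_openai_key" });
console.log(apiKey.length > 0);
```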

* * * * *


### **4\. Add your data source**

Place your data source in `src/Backend/dataSourceFiles` as `news.csv`.
If you use a different file name, update the data source path in `langChainProcessor.mjs`.

* * * * *

### **5\. Running the System**

Use Node.js version 20 or later, and install dependencies if you have not already:

`npm install`

#### Run Backend Server:

`npm run server`

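#### Run Frontend (Vite dev server):

`npm run dev`

To start the backend and the frontend together, use the `start` script defined in `package.json`:

`npm start`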

Access the app at:

`http://localhost:5173`

* * * * *

**Customizing the Dataset**
---------------------------

You can upload or replace the dataset for customized recommendations:

1. Navigate to `src/Backend/dataSourceFiles`.
2. Replace the existing `.csv` or `.txt` file with your dataset.
3. Restart the backend server to process the new dataset.

Example datasets:\
[A dataset of about 180 MB, used for creating this app](https://components.one/datasets/above-the-fold)\
[All the News 2 Dataset](https://components.one/datasets/all-the-news-2-news-articles-dataset)
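
When swapping in a custom dataset, it can help to validate each parsed row before embedding it. The sketch below is hypothetical: `rowToDocument` is not part of the app, and the column names `title`, `author`, and `content` are assumptions taken from the metadata fields in the API response example, so adjust them to match your file.

```javascript
// Hypothetical mapping from a parsed CSV row (a plain object) to a
// document record with `pageContent` and `metadata` fields.
// Column names are assumptions; change them to match your dataset.
function rowToDocument(row) {
  const required = ["title", "author", "content"];
  const missing = required.filter((key) => !(key in row));
  if (missing.length > 0) {
    throw new Error(`Row is missing columns: ${missing.join(", ")}`);
  }
  return {
    pageContent: row.content, // the text that gets split and embedded
    metadata: { title: row.title, author: row.author, content: row.content },
  };
}

const doc = rowToDocument({
  title: "Sample Title",
  author: "Author Name",
  content: "Snippet of the article...",
});
console.log(doc.metadata.title);
```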

* * * * *

**API Overview**
----------------

**Endpoint**: `/api/articles`\
**Method**: `POST`\
**Request Body**:

`{
  "text": "Your keyword here"
}`

**Response**:

`{
  "result": [
    {
      "metadata": {
        "title": "Sample Title",
        "author": "Author Name",
        "content": "Snippet of the article..."
      }
    }
  ]
}`
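
A client call to this endpoint might look like the sketch below. It rests on assumptions: `fetchArticles` is a hypothetical helper, and `baseUrl` must be supplied by the caller because this README does not state which port the backend listens on. The `fetchImpl` parameter exists only so the function can be exercised without a running server.

```javascript
// Hypothetical client helper for POST /api/articles.
// `baseUrl` is an assumption: pass the address your backend actually uses.
async function fetchArticles(text, baseUrl, fetchImpl = fetch) {
  const response = await fetchImpl(`${baseUrl}/api/articles`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const data = await response.json();
  // Flatten to the metadata objects shown in the response example.
  return data.result.map((item) => item.metadata);
}

// Usage (requires the backend to be running at the given address):
// const articles = await fetchArticles("climate policy", "http://localhost:<port>");
```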

* * * * *

**Future Enhancements**
-----------------------

- **Support for Multi-Modal Datasets**: Images, PDFs, and multimedia support.
- **Interactive Filters**: Filter results by date, author, or publication.
- **Deployable Cloud Versions**: Ready-to-deploy solutions for AWS, Vercel, and Netlify.
38 changes: 38 additions & 0 deletions applications/node/article_recommender/eslint.config.js
@@ -0,0 +1,38 @@
import js from '@eslint/js'
import globals from 'globals'
import react from 'eslint-plugin-react'
import reactHooks from 'eslint-plugin-react-hooks'
import reactRefresh from 'eslint-plugin-react-refresh'

export default [
  { ignores: ['dist'] },
  {
    files: ['**/*.{js,jsx}'],
    languageOptions: {
      ecmaVersion: 2020,
      globals: globals.browser,
      parserOptions: {
        ecmaVersion: 'latest',
        ecmaFeatures: { jsx: true },
        sourceType: 'module',
      },
    },
    settings: { react: { version: '18.3' } },
    plugins: {
      react,
      'react-hooks': reactHooks,
      'react-refresh': reactRefresh,
    },
    rules: {
      ...js.configs.recommended.rules,
      ...react.configs.recommended.rules,
      ...react.configs['jsx-runtime'].rules,
      ...reactHooks.configs.recommended.rules,
      'react/jsx-no-target-blank': 'off',
      'react-refresh/only-export-components': [
        'warn',
        { allowConstantExport: true },
      ],
    },
  },
]
12 changes: 12 additions & 0 deletions applications/node/article_recommender/index.html
@@ -0,0 +1,12 @@
<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Article</title>
  </head>
  <body>
    <div id="root"></div>
    <script type="module" src="/src/main.jsx"></script>
  </body>
</html>
54 changes: 54 additions & 0 deletions applications/node/article_recommender/package.json
@@ -0,0 +1,54 @@
{
  "name": "article-recommender",
  "private": true,
  "version": "0.0.0",
  "type": "module",
  "scripts": {
    "dev": "vite",
    "build": "vite build",
    "lint": "eslint .",
    "preview": "vite preview",
    "server": "node src/backend/server.mjs",
    "start": "npm-run-all --parallel server dev"
  },
  "dependencies": {
    "@heroicons/react": "^2.2.0",
    "@lancedb/lancedb": "^0.12.0",
    "@langchain/community": "^0.3.1",
    "@langchain/openai": "^0.3.14",
    "@phosphor-icons/react": "^2.1.7",
    "@testing-library/jest-dom": "^5.17.0",
    "@testing-library/react": "^13.4.0",
    "@testing-library/user-event": "^13.5.0",
    "body-parser": "^1.20.3",
    "cors": "^2.8.5",
    "csv-parser": "^3.0.0",
    "express": "^4.21.2",
    "fs": "^0.0.1-security",
    "langchain": "^0.3.7",
    "multer": "^1.4.5-lts.1",
    "phosphor-react": "^1.4.1",
    "react": "^18.3.1",
    "react-dom": "^18.3.1",
    "react-quill": "^2.0.0",
    "react-scripts": "5.0.1",
    "vectordb": "^0.1.19",
    "web-vitals": "^2.1.4"
  },
  "devDependencies": {
    "@eslint/js": "^9.15.0",
    "@types/react": "^18.3.12",
    "@types/react-dom": "^18.3.1",
    "@vitejs/plugin-react": "^4.3.4",
    "autoprefixer": "^10.4.20",
    "eslint": "^9.15.0",
    "eslint-plugin-react": "^7.37.2",
    "eslint-plugin-react-hooks": "^5.0.0",
    "eslint-plugin-react-refresh": "^0.4.14",
    "globals": "^15.12.0",
    "npm-run-all": "^4.1.5",
    "postcss": "^8.4.49",
    "tailwindcss": "^3.4.16",
    "vite": "^6.0.1"
  }
}
6 changes: 6 additions & 0 deletions applications/node/article_recommender/postcss.config.js
@@ -0,0 +1,6 @@
export default {
  plugins: {
    tailwindcss: {},
    autoprefixer: {},
  },
}
3 changes: 3 additions & 0 deletions applications/node/article_recommender/public/assets/logo.svg