Sylint

RAG-Based Static Vulnerability Scanner with Semantic Code Analysis

Live Demo • Features • Installation • Usage • Architecture

Overview

Sylint is a browser-based static vulnerability scanner that combines code embeddings with retrieval-augmented generation (RAG) to detect and explain security vulnerabilities and to suggest fixes. Unlike traditional pattern-matching tools, Sylint compares code by its semantic embedding rather than its surface syntax, so it can flag vulnerabilities even in obfuscated or stylistically varied code.

Key Differentiators

Semantic Analysis: Uses Microsoft's UniXcoder to understand code logic, not just syntax
RAG-Powered Explanations: Grounds vulnerability analysis in real-world CVE/CWE patterns
Multi-Language Support: Analyzes 9 programming languages with a single model
Developer-Friendly: Provides plain-English explanations and automated fix suggestions

Supported Languages

Python • JavaScript • TypeScript • Java • C • C++ • PHP • Ruby • Go

Features

Core Functionality

Semantic Vulnerability Detection - Identifies security issues through code understanding rather than pattern matching
Deep Code Embeddings - 768-dimensional vector representations using UniXcoder
CVE/CWE Mapping - Automatic classification based on the NVD vulnerability database
Automated Fix Suggestions - LLM-generated patches for detected vulnerabilities
Compliance Reporting - Maps findings to PCI DSS, HIPAA, NIST SP 800-53, and OWASP ASVS
Scan History - Persistent storage and retrieval of previous analyses
Export Reports - Generate PDF and Markdown vulnerability reports

Technical Features

Monaco Editor Integration - Professional code editing with syntax highlighting
Real-time Analysis - Serverless backend for fast vulnerability scanning
Vector Similarity Search - Pinecone-powered retrieval of similar vulnerable code
Authentication & Authorization - Clerk-based user management with subscription tiers
Secure Communication - Full HTTPS encryption for all client-server interactions

Demo

Live Application: https://sylint.app/

Try Sylint with your own code or use the sample vulnerabilities provided in the interface.

Example Analysis

# Input: User authentication function with SQL injection vulnerability
# Output: 
# - Detected: CWE-89 (SQL Injection)
# - Explanation: Unsanitized user input concatenated into SQL query
# - Suggested Fix: Use parameterized queries or prepared statements
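
To make the example concrete, here is a minimal sketch of the kind of input that would fall under CWE-89, next to the parameterized form that the suggested fix describes. The function names, the table schema, and the in-memory SQLite database are assumptions for illustration; they are not taken from the samples bundled with Sylint.

```python
import sqlite3


def find_user_vulnerable(conn, username):
    # CWE-89: user input is concatenated straight into the SQL text.
    query = "SELECT id FROM users WHERE name = '" + username + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, username):
    # Fix: a parameterized query keeps the input out of the SQL text.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (username,)
    ).fetchall()


def make_demo_db():
    # Hypothetical two-row table used only for this demonstration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")]
    )
    return conn
```

Passing a crafted value such as `x' OR '1'='1` to the first function matches every row, while the second treats the same value as an ordinary string.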

Architecture

System Overview

┌─────────────┐
│   Browser   │
│  (Next.js)  │
└──────┬──────┘
       │
       ├──────────────┐
       │              │
┌──────▼──────┐ ┌─────▼──────┐
│   Convex    │ │  FastAPI   │
│  (Backend)  │ │ (AI Layer) │
└─────────────┘ └─────┬──────┘
                      │
        ┌─────────────┼────────────┐
        │             │            │
┌───────▼──────┐ ┌────▼────┐ ┌─────▼────┐
│  UniXcoder   │ │  Groq   │ │ Pinecone │
│ (Embeddings) │ │  (LLM)  │ │ (Vector) │
└──────────────┘ └─────────┘ └──────────┘

RAG Pipeline

Code Submission - User submits source code via Monaco editor
Embedding Generation - UniXcoder creates a 768-dimensional vector representation of the submitted code
Similarity Search - Query Pinecone for the top-k most similar vulnerable code samples from the CVEfixes dataset
Context Augmentation - Retrieved examples are added to the LLM prompt as context
Vulnerability Analysis - Groq's Mixtral model generates explanation, CWE tags, and fixes
Result Presentation - Findings displayed with compliance mappings and export options
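
The retrieval and augmentation steps above can be sketched in plain Python. This is an illustration under stated assumptions, not Sylint's implementation: an in-memory list stands in for Pinecone, short toy vectors stand in for the 768-dimensional UniXcoder embeddings, and the function names, record fields, and prompt wording are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    # Cosine of the angle between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query_vec, index, k=2):
    # `index` is a list of (vector, record) pairs; a stand-in for the vector store.
    scored = [(cosine_similarity(query_vec, vec), rec) for vec, rec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [rec for _, rec in scored[:k]]


def build_prompt(code, language, retrieved):
    # Context augmentation: retrieved samples are placed ahead of the submitted code.
    lines = ["Known vulnerable " + language + " samples:"]
    for rec in retrieved:
        lines.append("- [" + rec["cwe"] + "] " + rec["code"])
    lines.append("Code to analyze:")
    lines.append(code)
    lines.append("Explain any vulnerability, give CWE tags, and suggest a fix.")
    return "\n".join(lines)
```

In a real deployment the similarity search would be delegated to the vector database and the prompt would be sent to the LLM; the sketch only shows how the pieces fit together.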

Installation

Prerequisites

Node.js 18+ and npm
Python 3.9+
Convex account
Clerk account
Groq API key
Pinecone account

Setup

Clone the repository

git clone https://github.com/butlerem/vulnerability-scanner-UniXcoder-RAG.git
cd vulnerability-scanner-UniXcoder-RAG

Install dependencies

npm install

Configure environment variables

Create a .env.local file in the root directory:

# Clerk Authentication
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=your_clerk_publishable_key
CLERK_SECRET_KEY=your_clerk_secret_key

# Convex
CONVEX_DEPLOYMENT=your_convex_deployment
NEXT_PUBLIC_CONVEX_URL=your_convex_url

# Groq API
GROQ_API_KEY=your_groq_api_key

# Pinecone
PINECONE_API_KEY=your_pinecone_api_key
PINECONE_ENVIRONMENT=your_pinecone_environment

Set up the AI service

cd ai-service
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

Start the development servers

# Terminal 1: Next.js frontend
npm run dev

# Terminal 2: Convex backend
npx convex dev

# Terminal 3: FastAPI AI service
cd ai-service
uvicorn main:app --reload --port 8000

Access the application

Open http://localhost:3000 in your browser.
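
If one of the servers fails to start, a quick check that every variable from the .env.local example is set can save time. The helper below is a minimal sketch and not part of the repository: the variable names are the ones listed in the configuration step, while the function names and the simple KEY=VALUE parser are assumptions.

```python
import os

# Names copied from the .env.local example in the configuration step.
REQUIRED_VARS = [
    "NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY",
    "CLERK_SECRET_KEY",
    "CONVEX_DEPLOYMENT",
    "NEXT_PUBLIC_CONVEX_URL",
    "GROQ_API_KEY",
    "PINECONE_API_KEY",
    "PINECONE_ENVIRONMENT",
]


def parse_env_text(text):
    # Minimal KEY=VALUE parser; skips blank lines and '#' comments.
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_env_vars(environ=None, required=REQUIRED_VARS):
    # Returns the required names that are absent or empty.
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]
```

For example, `missing_env_vars(parse_env_text(open(".env.local").read()))` returns the names that still need a value.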

Usage

Basic Vulnerability Scan

Select your programming language from the dropdown
Paste or type your code in the Monaco editor
Click "Scan for Vulnerabilities"
Review detected issues with:
- Vulnerability explanation
- CWE classification
- Affected code location
- Suggested fix
- Compliance implications
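
As a rough illustration of how a single result with the fields above could be represented and turned into a Markdown report entry, consider the sketch below. The class name, field names, and Markdown layout are hypothetical; the actual result schema and export format are not documented here.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    # Field names are hypothetical; they mirror the review items listed above.
    explanation: str
    cwe: str
    location: str
    suggested_fix: str
    compliance: list = field(default_factory=list)


def to_markdown(finding):
    # Render one finding as a small Markdown section.
    lines = [
        "## " + finding.cwe,
        "",
        "- Explanation: " + finding.explanation,
        "- Location: " + finding.location,
        "- Suggested fix: " + finding.suggested_fix,
    ]
    if finding.compliance:
        lines.append("- Compliance: " + ", ".join(finding.compliance))
    return "\n".join(lines)
```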

API Endpoints

Generate Code Embedding

POST /embed
Content-Type: application/json

{
  "code": "string",
  "language": "python"
}
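
A client call to this endpoint could look like the following sketch, which uses only the Python standard library. It assumes the AI service is running locally on port 8000 as in the Setup section; the helper names are hypothetical, and because the response schema is not documented here, the parsed JSON is returned unchanged.

```python
import json
import urllib.request

# Assumption: the AI service is running locally on the port used in Setup.
BASE_URL = "http://localhost:8000"


def build_embed_request(code, language, base_url=BASE_URL):
    # Builds the POST /embed request without sending it.
    body = json.dumps({"code": code, "language": language}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    # Sends the request and returns the decoded JSON response as-is.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Usage, with the service running: `send(build_embed_request("print(1)", "python"))`.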

Explain Vulnerability

POST /explain
Content-Type: application/json

{
  "code": "string",
  "similar_vulnerabilities": ["array of similar code samples"],
  "language": "python"
}
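
Before calling this endpoint, a client may want to validate its input and assemble the body from retrieved samples. The sketch below is one way to do that under stated assumptions: the helper name is hypothetical, and the lowercase language identifiers other than "python" are guesses derived from the Supported Languages list, not values confirmed by the API.

```python
import json

# Assumption: lowercase identifiers; only "python" appears in the request examples.
SUPPORTED_LANGUAGES = {
    "python", "javascript", "typescript", "java",
    "c", "c++", "php", "ruby", "go",
}


def build_explain_payload(code, language, similar_vulnerabilities):
    # Validates the inputs and returns the JSON body for POST /explain.
    if not code.strip():
        raise ValueError("code must not be empty")
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError("unsupported language: " + language)
    return json.dumps({
        "code": code,
        "similar_vulnerabilities": list(similar_vulnerabilities),
        "language": language,
    })
```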

Tech Stack

Frontend

Next.js 14 - React framework with App Router
TypeScript - Type-safe development
Tailwind CSS - Utility-first styling
Monaco Editor - VS Code-based code editor

Backend

Convex - Real-time serverless database
FastAPI - High-performance Python API framework
Clerk - Authentication and user management

AI/ML

UniXcoder - Microsoft's code understanding model (768-dim embeddings)
Groq API - LLM inference (Mixtral, Llama 3.3 70B)
Pinecone - Vector database for similarity search

Dataset

CVEfixes - Curated vulnerable code samples from NVD with CVE/CWE mappings

Roadmap

Compliance Mode Selection - Filter analysis by specific frameworks (PCI DSS, HIPAA, NIST)
Multi-file Project Scanning - Analyze entire codebases with dependency tracking
CI/CD Integration - GitHub Actions and GitLab CI plugins
Custom Rule Creation - User-defined vulnerability patterns
IDE Extensions - VS Code and JetBrains plugin support
Real-time Collaboration - Multi-user code review sessions
Enhanced Compliance Database - Expanded regulatory framework coverage

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Fork the repository
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Microsoft for UniXcoder
CVEfixes dataset contributors
Groq for LLM API access
Pinecone for vector database infrastructure

butlerem/vulnerability-scanner-UniXcoder-RAG
