A sophisticated AI-powered legal contract analysis platform built with Next.js and advanced machine learning models. This platform leverages the Contract Understanding Atticus Dataset (CUAD) to provide intelligent contract review and clause extraction capabilities.
- Overview
- Features
- Architecture
- Installation
- Usage
- AI Model Training
- Project Structure
- Technology Stack
- Development
- API Documentation
- Contributing
- License
LawBotics v2 is an advanced legal technology platform that combines artificial intelligence with legal expertise to streamline contract review processes. The platform utilizes fine-tuned language models trained on the CUAD dataset to identify and extract key contract clauses, helping legal professionals save time and reduce errors in contract analysis.
- AI-Powered Contract Analysis: Automated clause identification and extraction from legal contracts
- Multi-Format Support: Process contracts in PDF and text formats
- Real-time Analysis: Instant contract review with immediate results
- Clause Categorization: Identify the 41 clause categories annotated in CUAD
- Interactive UI: Modern, responsive web interface built with Next.js 15
- Fine-tuned LLaMA Models: Custom-trained models on legal contract data
- CUAD Dataset Integration: Leverages 13,000+ labeled contract examples
- LangChain Integration: Advanced AI orchestration and processing
- Google GenAI Support: Integration with Google's generative AI models
- Authentication: Secure user management with Clerk
- Dark/Light Mode: Customizable theme support
- Responsive Design: Optimized for desktop and mobile devices
- Real-time Notifications: Toast notifications and progress tracking
- PDF Viewer: Built-in PDF document viewer and processor
    ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
    │  Web Frontend   │    │  AI Processing  │    │  Data Storage   │
    │  (Next.js 15)   │◄──►│   (Python/ML)   │◄──►│   (Convex DB)   │
    └─────────────────┘    └─────────────────┘    └─────────────────┘
             │                      │                      │
             ▼                      ▼                      ▼
    ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
    │ Authentication  │    │ Model Training  │    │ Contract Storage│
    │     (Clerk)     │    │    (Jupyter)    │    │     (Files)     │
    └─────────────────┘    └─────────────────┘    └─────────────────┘
- Node.js (v18 or higher)
- Python (v3.8 or higher)
- Git
- npm or yarn
- Clone the repository:

      git clone https://github.com/hasnaintypes/lawbotics-v2.git
      cd lawbotics-v2

- Install dependencies:

      # Install web UI dependencies
      cd apps/web-ui
      npm install
      # Return to root
      cd ../..

- Set up environment variables:

      # Copy environment template
      cp apps/web-ui/.env.example apps/web-ui/.env.local

- Configure environment variables. Edit apps/web-ui/.env.local with your keys:

      # Clerk Authentication
      NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=your_clerk_publishable_key
      CLERK_SECRET_KEY=your_clerk_secret_key
      # Convex Database
      NEXT_PUBLIC_CONVEX_URL=your_convex_url
      # Google AI
      GOOGLE_GENERATIVE_AI_API_KEY=your_google_ai_key
      # Other services
      SVIX_SECRET=your_svix_secret

- Start the development server:

      cd apps/web-ui
      npm run dev

- Access the application: open http://localhost:3000 in your browser
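A missing or placeholder key in .env.local tends to surface only as a runtime failure, so a small startup check can help. The following is a minimal sketch, not part of the repository: the key names come from the template above, while `findMissingEnvKeys` and the "your_" placeholder convention are assumptions made for illustration.

```typescript
// Key names taken from the .env.local template; the helper itself is hypothetical.
const REQUIRED_ENV_KEYS = [
  "NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY",
  "CLERK_SECRET_KEY",
  "NEXT_PUBLIC_CONVEX_URL",
  "GOOGLE_GENERATIVE_AI_API_KEY",
  "SVIX_SECRET",
] as const;

type EnvSource = Record<string, string | undefined>;

// Returns the keys that are unset, empty, or still hold a template
// placeholder (assumed here to start with "your_").
function findMissingEnvKeys(env: EnvSource): string[] {
  return REQUIRED_ENV_KEYS.filter((key) => {
    const value = env[key]?.trim();
    return !value || value.startsWith("your_");
  });
}
```

In a Node.js context this could be called as `findMissingEnvKeys(process.env)` before the server starts, logging or throwing if the returned list is not empty.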
- Sign Up/Login: Create an account or log in using the Clerk authentication system
- Upload Contract: Upload a PDF or text file containing the legal contract
- AI Analysis: The system will automatically process the contract and identify key clauses
- Review Results: Examine the extracted clauses and their categorizations
- Export/Save: Save results or export analysis reports
- Clause Detection: Automatically identifies the 41 CUAD clause categories
- Risk Assessment: Highlights potentially problematic clauses
- Comparison: Compare multiple contracts side by side
- Search: Full-text search within contracts and extracted clauses
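To make the categorization and search features above more concrete, here is a minimal sketch of how extracted clauses might be grouped and searched on the client. The `ExtractedClause` shape and both helpers are hypothetical, and a plain case-insensitive substring match stands in for real full-text search.

```typescript
// Hypothetical shape of one extracted clause; the platform's real schema may differ.
interface ExtractedClause {
  category: string; // e.g. a CUAD category such as "Governing Law"
  text: string;
}

// Groups clauses by category so a UI could render one section per clause type.
function groupByCategory(clauses: ExtractedClause[]): Map<string, ExtractedClause[]> {
  const groups = new Map<string, ExtractedClause[]>();
  for (const clause of clauses) {
    const bucket = groups.get(clause.category) ?? [];
    bucket.push(clause);
    groups.set(clause.category, bucket);
  }
  return groups;
}

// Case-insensitive substring search over clause text and category names.
// An empty or whitespace-only query returns no results.
function searchClauses(clauses: ExtractedClause[], query: string): ExtractedClause[] {
  const needle = query.trim().toLowerCase();
  if (needle === "") return [];
  return clauses.filter(
    (clause) =>
      clause.text.toLowerCase().includes(needle) ||
      clause.category.toLowerCase().includes(needle),
  );
}
```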
The ai-model directory contains a Jupyter notebook for fine-tuning models on the CUAD dataset.
- CUAD v1: 13,000+ labeled examples across 510 commercial contracts
- 41 Clause Categories: Comprehensive coverage of legal contract elements
- Multiple Formats: CSV, JSON, Excel, PDF, and TXT formats available
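Before training, it can be useful to check how many annotations each clause category has. The sketch below assumes that CUAD_v1.json follows a SQuAD-style layout (data, paragraphs, qas, answers) and that each question id ends with `__<category>`; both are assumptions to verify against the actual file, and `countAnnotationsByCategory` is a hypothetical helper.

```typescript
// Assumed SQuAD-style structure of CUAD_v1.json; verify against the real file.
interface CuadAnswer { text: string; answer_start: number }
interface CuadQuestion { id: string; question: string; answers: CuadAnswer[]; is_impossible?: boolean }
interface CuadParagraph { context: string; qas: CuadQuestion[] }
interface CuadContract { title: string; paragraphs: CuadParagraph[] }
interface CuadFile { version?: string; data: CuadContract[] }

// Counts annotated answer spans per clause category across all contracts.
// The category is taken from the part of the question id after the last "__"
// (assumed convention); ids without "__" are used as-is.
function countAnnotationsByCategory(file: CuadFile): Map<string, number> {
  const counts = new Map<string, number>();
  for (const contract of file.data) {
    for (const paragraph of contract.paragraphs) {
      for (const qa of paragraph.qas) {
        const separatorIndex = qa.id.lastIndexOf("__");
        const category = separatorIndex >= 0 ? qa.id.slice(separatorIndex + 2) : qa.id;
        counts.set(category, (counts.get(category) ?? 0) + qa.answers.length);
      }
    }
  }
  return counts;
}
```

With Node.js, the file could be loaded via `JSON.parse(readFileSync("ai-model/data-set/CUAD_v1.json", "utf8"))` from `node:fs` and passed to the function; categories with very few annotations are likely to be harder for a fine-tuned model to learn.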
- Navigate to the AI model directory:

      cd ai-model

- Open the Jupyter notebook:

      jupyter notebook Fine_tuning_code.ipynb

- Follow the training steps:
  - Data preparation and preprocessing
  - Model fine-tuning with LLaMA architecture
  - Evaluation and validation
  - Model export and deployment
- LLaMA 3.2: Primary model for instruction tuning
- Google GenAI: Integration for additional AI capabilities
- Custom Fine-tuned Models: Specialized legal contract models
    lawbotics-v2/
    ├── ai-model/                       # AI/ML model training and data
    │   ├── Fine_tuning_code.ipynb      # Jupyter notebook for model training
    │   └── data-set/                   # CUAD dataset and training data
    │       ├── CUAD_v1.json            # Main dataset file
    │       ├── master_clauses.csv      # Clause categorization data
    │       ├── full_contract_pdf/      # Original contract PDFs
    │       ├── full_contract_txt/      # Text versions of contracts
    │       └── label_group_xlsx/       # Excel files with labeled data
    ├── apps/                           # Application modules
    │   ├── dashboard/                  # Admin dashboard (planned)
    │   └── web-ui/                     # Main web application
    │       ├── package.json            # Dependencies and scripts
    │       ├── next.config.ts          # Next.js configuration
    │       ├── tailwind.config.js      # Tailwind CSS configuration
    │       ├── middleware.ts           # Authentication middleware
    │       ├── src/                    # Source code
    │       │   ├── app/                # Next.js app router pages
    │       │   ├── components/         # Reusable UI components
    │       │   ├── lib/                # Utility functions and configs
    │       │   ├── hooks/              # Custom React hooks
    │       │   ├── services/           # API and external service integrations
    │       │   ├── store/              # State management (Zustand)
    │       │   └── convex/             # Convex database functions
    │       └── public/                 # Static assets
    └── docs/                           # Documentation (planned)
- Next.js 15: React framework with App Router
- React 19: Latest React with concurrent features
- TypeScript: Type-safe development
- Tailwind CSS 4: Modern utility-first CSS framework
- Radix UI: Accessible component primitives
- Lucide React: Modern icon library
- Convex: Real-time database and backend
- Clerk: Authentication and user management
- SVIX: Webhook management
- Axios: HTTP client for API requests
- LangChain: AI application framework
- Google Generative AI: AI model integration
- Python: Model training and processing
- Jupyter: Interactive development environment
- Zustand: Lightweight state management
- React PDF: PDF processing and viewing
- Recharts: Data visualization
- Sonner: Toast notifications
# Development server
npm run dev
# Production build
npm run build
# Start production server
npm run start
# Lint code
npm run lint
- ESLint: Code linting with Next.js configuration
- TypeScript: Strict type checking enabled
- Prettier: Code formatting (recommended)
- Component Structure: Use functional components with TypeScript
- State Management: Utilize Zustand for global state
- Styling: Implement Tailwind CSS classes with component variants
- API Integration: Use services directory for external API calls
- Error Handling: Implement comprehensive error boundaries
- POST /api/auth/signin: User sign in
- POST /api/auth/signup: User registration
- POST /api/auth/signout: User sign out
- POST /api/contracts/upload: Upload contract for analysis
- GET /api/contracts/:id: Retrieve contract analysis
- POST /api/contracts/analyze: Perform AI analysis
- GET /api/contracts/history: User's contract history
- POST /api/ai/extract-clauses: Extract clauses from contract
- POST /api/ai/classify: Classify contract clauses
- GET /api/ai/models: Available AI models
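The endpoints above could be wrapped in a small typed client, for example inside the services directory. This is a minimal sketch under stated assumptions: the paths come from the list above, but the request payloads (such as the `text` field), the `requests` builders, the `Fetcher` type, and `send` are all hypothetical, since the README does not document request or response shapes.

```typescript
// Endpoint paths come from the endpoint list; payload shapes are assumptions.
interface ApiRequest {
  method: "GET" | "POST";
  path: string;
  body?: string; // JSON-encoded payload for POST requests
}

const requests = {
  getContract: (id: string): ApiRequest => ({
    method: "GET",
    path: `/api/contracts/${encodeURIComponent(id)}`,
  }),
  extractClauses: (contractText: string): ApiRequest => ({
    method: "POST",
    path: "/api/ai/extract-clauses",
    body: JSON.stringify({ text: contractText }), // "text" is an assumed field name
  }),
  listModels: (): ApiRequest => ({ method: "GET", path: "/api/ai/models" }),
};

// Minimal structural type so the sketch does not depend on DOM or Node typings.
type Fetcher = (
  url: string,
  init: { method: string; headers: Record<string, string>; body?: string | undefined },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Sends a request and rejects on any non-2xx status instead of returning it silently.
async function send(baseUrl: string, request: ApiRequest, fetcher: Fetcher): Promise<unknown> {
  const response = await fetcher(baseUrl + request.path, {
    method: request.method,
    headers: { "Content-Type": "application/json" },
    body: request.body,
  });
  if (!response.ok) {
    throw new Error(`${request.method} ${request.path} failed with status ${response.status}`);
  }
  return response.json();
}
```

Separating the request builders from `send` keeps the builders synchronous and easy to unit test, and the injected fetcher can be the global `fetch` in the browser or a stub in tests, for example `send("http://localhost:3000", requests.listModels(), fetch)`.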
We welcome contributions to LawBotics v2! Please follow these guidelines:
- Fork the repository
- Create a feature branch: git checkout -b feature/your-feature-name
- Make your changes and add tests
- Commit your changes: git commit -m 'Add some feature'
- Push to the branch: git push origin feature/your-feature-name
- Submit a pull request
- Follow the existing code style and conventions
- Add tests for new features
- Update documentation as needed
- Ensure all tests pass before submitting PR
- Provide clear commit messages and PR descriptions
Please use the GitHub issue tracker to report bugs or request features. Include:
- Detailed description of the issue
- Steps to reproduce
- Expected vs actual behavior
- Screenshots (if applicable)
- Environment details
This project is licensed under the MIT License - see the LICENSE file for details.
- The Atticus Project: For providing the CUAD dataset
- Next.js Team: For the excellent React framework
- Clerk: For seamless authentication solutions
- Radix UI: For accessible component primitives
- Tailwind CSS: For the utility-first CSS framework
For support and questions:
- Email: support@lawbotics.com
- Discord: LawBotics Community
- Documentation: docs.lawbotics.com
- Issues: GitHub Issues
Built with ❤️ by the LawBotics Team
Empowering legal professionals with AI-driven contract analysis