AI-Powered Predictive Vulnerability Discovery Engine
Predict vulnerabilities before they're exploited. Oracle combines program analysis with machine learning to flag likely vulnerabilities, including candidate 0-days, through pattern analysis and anomaly detection.
- Deep Code Embeddings - Transform code into ML-ready vectors
- Vulnerability Prediction - Predict security issues with trained models
- Anomaly Detection - Identify unknown 0-day patterns using isolation forests
- Pattern Classification - Multi-class vulnerability classification
- Static Analysis - AST-based pattern matching and dangerous function detection
- Semantic Analysis - Symbol tables, call graphs, and type inference
- Data Flow Analysis - Reaching definitions, live variables, def-use chains
- Control Flow Analysis - CFG construction, dominators, loop detection
- Taint Tracking - Full source-to-sink taint propagation
- SQL/Command Injection (CWE-89, CWE-78)
- Cross-Site Scripting (CWE-79)
- Buffer Overflow (CWE-120)
- Use After Free (CWE-416)
- Path Traversal (CWE-22)
- Insecure Deserialization (CWE-502)
- Authentication Bypass (CWE-287)
- Cryptographic Weaknesses (CWE-327)
- SSRF (CWE-918)
- And 6 more vulnerability classes...
- CVSS Calculation - Automatic severity scoring
- Risk Prioritization - Smart finding prioritization
- CVE Correlation - Link findings to known vulnerabilities
- NVD Integration - Real-time vulnerability database
- HTML Reports - Beautiful, interactive dashboards
- SARIF Export - CI/CD integration ready
- JSON/Markdown - Developer-friendly formats
- Trend Analysis - Track security posture over time
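
The README does not say how the CVSS Calculation feature listed above is implemented. As a rough illustration only, the sketch below computes a CVSS v3.1 base score from the metric weights in the public specification. `cvss_base_score`, `pr_weight`, and `roundup` are hypothetical names chosen for this example, not Oracle's API.

```julia
# Sketch of the CVSS v3.1 base-score formula (weights from the public spec).
# All names here are hypothetical; they are not part of Oracle's API.

const AV_W  = Dict('N' => 0.85, 'A' => 0.62, 'L' => 0.55, 'P' => 0.20)  # attack vector
const AC_W  = Dict('L' => 0.77, 'H' => 0.44)                            # attack complexity
const UI_W  = Dict('N' => 0.85, 'R' => 0.62)                            # user interaction
const CIA_W = Dict('H' => 0.56, 'L' => 0.22, 'N' => 0.0)                # C/I/A impact

# The privileges-required weight depends on whether scope changes.
function pr_weight(pr::Char, scope_changed::Bool)
    pr == 'N' && return 0.85
    pr == 'L' && return scope_changed ? 0.68 : 0.62
    pr == 'H' && return scope_changed ? 0.50 : 0.27
    throw(ArgumentError("unknown PR value: $pr"))
end

# Spec-defined rounding: smallest one-decimal value >= x, computed on
# integers to avoid floating-point artifacts.
function roundup(x::Real)
    n = round(Int, x * 100_000)
    return n % 10_000 == 0 ? n / 100_000 : (fld(n, 10_000) + 1) / 10
end

function cvss_base_score(; av::Char, ac::Char, pr::Char, ui::Char,
                         scope_changed::Bool, c::Char, i::Char, a::Char)
    iss = 1 - (1 - CIA_W[c]) * (1 - CIA_W[i]) * (1 - CIA_W[a])
    impact = if scope_changed
        7.52 * (iss - 0.029) - 3.25 * (iss - 0.02)^15
    else
        6.42 * iss
    end
    exploitability = 8.22 * AV_W[av] * AC_W[ac] * pr_weight(pr, scope_changed) * UI_W[ui]
    impact <= 0 && return 0.0
    total = scope_changed ? 1.08 * (impact + exploitability) : impact + exploitability
    return roundup(min(total, 10.0))
end

# Vector AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
critical = cvss_base_score(av='N', ac='L', pr='N', ui='N',
                           scope_changed=false, c='H', i='H', a='H')
println(critical)  # 9.8
```

Mapping a finding to the eight base metrics is the hard part in practice; this sketch assumes that mapping has already been done.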
using Pkg
Pkg.add(url="https://github.com/yourusername/oracle")

using Oracle
# Scan a single file
result = analyze("vulnerable.c")
# Scan entire codebase
result = scan_codebase("./src")
# Generate report
generate_report(result, format="html")

using Oracle
# Configure scanner
config = ScanConfig(
    enable_ml=true,
    enable_anomaly=true,
    min_confidence=0.5,
    parallel=true
)
# Initialize scanner with custom config
scanner = Scanner(config=config)
# Scan with full analysis
result = scan(scanner, "./project")
# Prioritize findings
prioritizer = RiskPrioritizer()
prioritized = prioritize(prioritizer, result.findings)
# Correlate with CVEs
client = NVDClient(api_key=ENV["NVD_API_KEY"])
correlated = correlate_findings(client, result.findings)
# Generate comprehensive report
generator = ReportGenerator(output_dir="./reports")
generate_report(generator, result, format="html", target="MyProject")

┌─────────────────────────────────────────────────────────────────┐
│                         Oracle Pipeline                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────┐     ┌─────────────┐             ┌──────────────┐   │
│  │  Code   │────▶│  Tokenizer  │────────────▶│  AST Parser  │   │
│  └─────────┘     └─────────────┘             └──────────────┘   │
│                                                      │          │
│          ┌─────────────────────┬─────────────────────┤          │
│          ▼                     ▼                     ▼          │
│  ┌──────────────┐      ┌──────────────┐      ┌──────────────┐   │
│  │    Static    │      │   Semantic   │      │     Data     │   │
│  │   Analysis   │      │   Analysis   │      │     Flow     │   │
│  └──────────────┘      └──────────────┘      └──────────────┘   │
│          │                     │                     │          │
│          └─────────────────────┼─────────────────────┘          │
│                                ▼                                │
│                      ┌──────────────────┐                       │
│                      │  Feature Vector  │                       │
│                      └──────────────────┘                       │
│                                │                                │
│          ┌─────────────────────┼─────────────────────┐          │
│          ▼                     ▼                     ▼          │
│  ┌──────────────┐      ┌──────────────┐      ┌──────────────┐   │
│  │  Predictor   │      │  Classifier  │      │   Anomaly    │   │
│  │     (ML)     │      │  (Ensemble)  │      │  Detection   │   │
│  └──────────────┘      └──────────────┘      └──────────────┘   │
│          │                     │                     │          │
│          └─────────────────────┼─────────────────────┘          │
│                                ▼                                │
│                      ┌──────────────────┐                       │
│                      │     Findings     │                       │
│                      │  & Risk Scores   │                       │
│                      └──────────────────┘                       │
│                                │                                │
│                                ▼                                │
│                      ┌──────────────────┐                       │
│                      │      Report      │                       │
│                      │    Generation    │                       │
│                      └──────────────────┘                       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
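
To make the first half of the pipeline concrete, here is a deliberately small sketch of the path from source text to a feature vector. It skips the AST stage entirely and works on a flat token stream, so it is far cruder than the analyzers described in this README. Every name in it (`tokenize`, `dangerous_calls`, `feature_vector`, `DANGEROUS_CALLS`) and the choice of three features are assumptions made for illustration.

```julia
# Hypothetical sketch: source text -> tokens -> static check -> feature vector.
# None of these names come from Oracle's API.

# A tiny deny-list of C functions commonly flagged by static analyzers.
const DANGEROUS_CALLS = Set(["strcpy", "gets", "sprintf", "system"])

# Split code into identifiers, integer literals, and single punctuation characters.
tokenize(code::AbstractString) =
    [String(m.match) for m in eachmatch(r"[A-Za-z_]\w*|\d+|\S", code)]

# Report deny-listed identifiers that are immediately followed by "(".
function dangerous_calls(tokens::Vector{String})
    hits = String[]
    for k in 1:length(tokens)-1
        if tokens[k] in DANGEROUS_CALLS && tokens[k+1] == "("
            push!(hits, tokens[k])
        end
    end
    return hits
end

# Three toy features: token count, dangerous-call count, open-paren count.
function feature_vector(tokens::Vector{String})
    return Float64[length(tokens),
                   length(dangerous_calls(tokens)),
                   count(==("("), tokens)]
end

code = "void f(char *s) { char buf[8]; strcpy(buf, s); }"
tokens = tokenize(code)
println(dangerous_calls(tokens))
println(feature_vector(tokens))
```

A vector like this is what the Predictor, Classifier, and Anomaly Detection stages in the diagram would consume; a real implementation would derive its features from the AST and flow analyses instead of raw tokens.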
- Code Embeddings (CodeEmbedder)
  - Tokenizes code into semantic units
  - Generates 128-dimensional embeddings
  - Supports similarity search for pattern matching
- Vulnerability Predictor (VulnerabilityPredictor)
  - Multi-label classification across 15 vulnerability types
  - Trained on historical vulnerability data
  - Heuristic initialization for zero-shot prediction
- Pattern Classifier (PatternClassifier)
  - Random forest ensemble with 10 estimators
  - Feature importance tracking
  - Probability distribution output
- Anomaly Detector (AnomalyDetector)
  - Isolation Forest algorithm
  - Detects code that deviates from normal patterns
  - Zero-day candidate identification
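
For readers unfamiliar with the Isolation Forest algorithm named under the Anomaly Detector, the following is a minimal from-scratch sketch of the standard technique: random axis-aligned splits isolate unusual points in fewer steps, and the average isolation depth is turned into a score between 0 and 1. It is not Oracle's `AnomalyDetector`; the type and function names, the 2-dimensional toy data, and the parameter values are all assumptions.

```julia
using Random

# Hypothetical types for illustration; not Oracle's AnomalyDetector.
struct IsoLeaf
    size::Int                      # number of training points that ended up here
end

struct IsoNode
    feature::Int                   # column used for the split
    threshold::Float64             # split value
    left::Union{IsoNode, IsoLeaf}
    right::Union{IsoNode, IsoLeaf}
end

# Expected path length of an unsuccessful binary-search-tree lookup over n points.
function avg_path_length(n::Integer)
    n <= 1 && return 0.0
    n == 2 && return 1.0
    return 2 * (log(n - 1) + 0.5772156649) - 2 * (n - 1) / n
end

function build_tree(X::AbstractMatrix, rng::AbstractRNG, depth::Int, max_depth::Int)
    n = size(X, 1)
    (n <= 1 || depth >= max_depth) && return IsoLeaf(n)
    feature = rand(rng, 1:size(X, 2))
    lo, hi = extrema(view(X, :, feature))
    lo == hi && return IsoLeaf(n)  # cannot split a constant column
    threshold = lo + rand(rng) * (hi - lo)
    mask = X[:, feature] .< threshold
    return IsoNode(feature, threshold,
                   build_tree(X[mask, :], rng, depth + 1, max_depth),
                   build_tree(X[.!mask, :], rng, depth + 1, max_depth))
end

path_length(leaf::IsoLeaf, x, depth) = depth + avg_path_length(leaf.size)
path_length(node::IsoNode, x, depth) =
    path_length(x[node.feature] < node.threshold ? node.left : node.right, x, depth + 1)

struct IsolationForest
    trees::Vector{Union{IsoNode, IsoLeaf}}
    sample_size::Int
end

# Each tree is grown on a random subsample; needs at least 2 rows.
function fit_forest(X::AbstractMatrix; n_trees=100, sample_size=256,
                    rng=Random.default_rng())
    m = min(sample_size, size(X, 1))
    max_depth = ceil(Int, log2(m))
    trees = Union{IsoNode, IsoLeaf}[
        build_tree(X[randperm(rng, size(X, 1))[1:m], :], rng, 0, max_depth)
        for _ in 1:n_trees]
    return IsolationForest(trees, m)
end

# Scores near 1 mean "isolated quickly" (anomalous); well below 0.5 means ordinary.
function anomaly_score(forest::IsolationForest, x::AbstractVector)
    mean_depth = sum(path_length(t, x, 0) for t in forest.trees) / length(forest.trees)
    return 2.0 ^ (-mean_depth / avg_path_length(forest.sample_size))
end

rng = MersenneTwister(42)
normal = 0.1 .* randn(rng, 200, 2)   # stand-in for feature vectors of ordinary code
forest = fit_forest(normal; n_trees=100, sample_size=64, rng=rng)
inlier_score = anomaly_score(forest, [0.0, 0.0])
outlier_score = anomaly_score(forest, [5.0, 5.0])
println((inlier_score, outlier_score))
```

The point far from the training cluster should receive the higher score. Note that the forest is trained without labels, which is why this family of methods suits the "unknown pattern" case the list describes.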
oracle/
├── src/
│   ├── Oracle.jl              # Main module & exports
│   ├── analyzers/
│   │   ├── static.jl          # Static analysis engine
│   │   ├── semantic.jl        # Semantic analysis
│   │   ├── dataflow.jl        # Data flow analysis
│   │   ├── controlflow.jl     # Control flow analysis
│   │   └── taint.jl           # Taint tracking
│   ├── ml/
│   │   ├── embeddings.jl      # Code embeddings
│   │   ├── predictor.jl       # Vulnerability prediction
│   │   ├── classifier.jl      # Pattern classification
│   │   └── anomaly.jl         # Anomaly detection
│   ├── patterns/
│   │   ├── database.jl        # Pattern database
│   │   └── matcher.jl         # Pattern matching
│   ├── engine/
│   │   ├── scanner.jl         # Main scanner
│   │   └── risk.jl            # Risk calculation
│   ├── reporting/
│   │   └── generator.jl       # Report generation
│   ├── integrations/
│   │   ├── nvd.jl             # NVD API integration
│   │   └── cve.jl             # CVE tracking
│   └── utils/
│       ├── helpers.jl         # Utility functions
│       └── languages.jl       # Language support
├── test/
├── docs/
├── Project.toml
└── README.md
| Language | Static | Semantic | Data Flow | Taint |
|---|---|---|---|---|
| C/C++ | โ | โ | โ | โ |
| Java | โ | โ | โ | โ |
| Python | โ | โ | โ | โ |
| JavaScript/TypeScript | โ | โ | โ | โ |
| PHP | โ | โ | โ | โ |
| Go | โ | โ | โ | โ |
| Rust | โ | โ | โ | โ |
| Ruby | โ | โ | โ | โ |
config = ScanConfig(
    # Scope
    include_patterns = ["*.c", "*.py", "*.js"],
    exclude_patterns = ["*test*", "*vendor*"],
    max_file_size = 1_000_000,
    # Analysis modules
    enable_static = true,
    enable_semantic = true,
    enable_dataflow = true,
    enable_taint = true,
    enable_ml = true,
    enable_anomaly = true,
    # Thresholds
    min_confidence = 0.5,
    max_findings_per_file = 50,
    # Performance
    parallel = true,
    max_workers = 8,
    timeout_seconds = 60,
    # Output
    verbose = false,
    generate_report = true,
    report_format = "html"
)

export NVD_API_KEY="your-api-key"          # For NVD integration
export ORACLE_CACHE_DIR="$HOME/.oracle"    # Cache directory ("~" is not expanded inside quotes)
export ORACLE_LOG_LEVEL="info"             # Logging level

| Metric | Value |
|---|---|
| Files/second | ~100 (parallel) |
| Memory usage | ~500MB baseline |
| Prediction latency | <50ms |
| Accuracy (F1) | 0.87 (on benchmark) |
using Oracle
using CSV, DataFrames  # required for CSV.read and DataFrame below
# Load training data
df = CSV.read("vulnerability_dataset.csv", DataFrame)
# Extract features and labels
features = extract_training_features(df)
labels = df.vuln_class
# Train predictor
predictor = VulnerabilityPredictor()
train!(predictor, features, labels, epochs=100)
# Save model
save_predictor(predictor, "custom_model.jls")
# Train classifier
classifier = PatternClassifier(n_estimators=50)
train!(classifier, features, labels)
save_classifier(classifier, "custom_classifier.jls")
# Train anomaly detector
detector = AnomalyDetector(contamination=0.05)
train!(detector, features)
save_detector(detector, "custom_detector.jls")

name: Security Scan
on: [push, pull_request]
permissions:
  contents: read
  security-events: write  # required by upload-sarif
jobs:
  oracle-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: julia-actions/setup-julia@v1
      - run: julia -e 'using Pkg; Pkg.add(url="https://github.com/yourusername/oracle")'
      - run: |
          julia -e '
          using Oracle
          result = scan_codebase(".")
          generate_report(result, format="sarif", output_file="results.sarif")
          exit(result.stats.findings_by_severity[CRITICAL] > 0 ? 1 : 0)
          '
      # Upload the report even when the scan step fails on critical findings
      - uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: results.sarif

{
  "oracle.enable": true,
  "oracle.onSave": true,
  "oracle.minSeverity": "medium",
  "oracle.enableML": true
}

# Analyze a single file
analyze(filepath::String; language=nothing) -> AnalysisResult
# Scan entire codebase
scan_codebase(path::String; config=DEFAULT_SCAN_CONFIG) -> ScanResult
# Predict vulnerabilities
predict_vulnerabilities(code::String, language::String) -> Vector{PredictionResult}
# Generate report
generate_report(result::ScanResult; format="html") -> String

# Create custom scanner
Scanner(; config::ScanConfig) -> Scanner
# Risk calculation
calculate_risk(calc::RiskCalculator, finding::Finding) -> Float64
calculate_cvss(finding::Finding) -> CVSSScore
# CVE correlation
correlate_findings(client::NVDClient, findings::Vector{Finding}) -> Vector{CorrelatedFinding}
# Anomaly analysis
analyze_anomaly(detector::AnomalyDetector, x::Vector, ref::Matrix) -> AnomalyAnalysis

Oracle is designed with security in mind:
- No code execution during analysis
- Sandboxed pattern matching
- Rate-limited external API calls
- Secure credential handling
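
The README does not describe how external API calls are rate-limited. One common approach is a sliding-window limiter, sketched below under that assumption. `SlidingWindowLimiter` and `try_acquire!` are hypothetical names, and the limit of 5 requests per 30 seconds is an example value that should be checked against the current terms of whichever API is being called.

```julia
# Hypothetical sliding-window rate limiter; not taken from Oracle's source.
mutable struct SlidingWindowLimiter
    max_requests::Int
    window_seconds::Float64
    timestamps::Vector{Float64}    # times of requests still inside the window
end

SlidingWindowLimiter(max_requests, window_seconds) =
    SlidingWindowLimiter(max_requests, window_seconds, Float64[])

# Returns true and records the request if it fits in the window, false otherwise.
# `now` is injectable so the behavior can be checked without sleeping.
function try_acquire!(limiter::SlidingWindowLimiter, now::Float64 = time())
    # Drop requests that have aged out of the window.
    filter!(t -> now - t < limiter.window_seconds, limiter.timestamps)
    length(limiter.timestamps) >= limiter.max_requests && return false
    push!(limiter.timestamps, now)
    return true
end

limiter = SlidingWindowLimiter(5, 30.0)
burst = [try_acquire!(limiter, Float64(t)) for t in 0:5]   # six requests, one per second
later = try_acquire!(limiter, 30.5)                        # the request at t=0 has expired
println(burst)
println(later)
```

A caller that gets `false` back would typically sleep until the oldest timestamp leaves the window and then retry.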
MIT License - See LICENSE for details.
- NVD/NIST for vulnerability data
- CWE/MITRE for weakness enumeration
- The Julia community for excellent packages
Made by the NullSec Team