| 1 | +# Architecture |
| 2 | + |
| 3 | +## System Overview |
| 4 | + |
| 5 | +Oxc (The Oxidation Compiler) is a collection of high-performance JavaScript and TypeScript tools written in Rust. The system is designed as a modular, composable set of compiler components that can be used independently or together to build complete toolchains for JavaScript/TypeScript development. |
| 6 | + |
| 7 | +### Core Mission |
| 8 | + |
| 9 | +- **Performance**: Deliver faster performance than existing JavaScript tools |
| 10 | +- **Correctness**: Maintain compatibility with JavaScript/TypeScript standards |
| 11 | +- **Modularity**: Enable users to compose tools according to their specific needs |
| 12 | +- **Developer Experience**: Provide excellent error messages and tooling integration |
| 13 | + |
| 14 | +## High-Level Architecture |
| 15 | + |
| 16 | +``` |
| 17 | +┌─────────────────────────────────────────────────────────────────┐ |
| 18 | +│ Applications │ |
| 19 | +├─────────────────────────────────────────────────────────────────┤ |
| 20 | +│ oxlint │ Language Server │ NAPI Bindings │ Future Tools │ |
| 21 | +├─────────────────────────────────────────────────────────────────┤ |
| 22 | +│ Core Libraries │ |
| 23 | +├─────────────────────────────────────────────────────────────────┤ |
| 24 | +│ Parser │ Semantic │ Linter │ Transformer │ Minifier │ Codegen │ |
| 25 | +├─────────────────────────────────────────────────────────────────┤ |
| 26 | +│ Foundation Libraries │ |
| 27 | +├─────────────────────────────────────────────────────────────────┤ |
| 28 | +│ AST │ Allocator │ Diagnostics │ Span │ Syntax │ |
| 29 | +└─────────────────────────────────────────────────────────────────┘ |
| 30 | +``` |
| 31 | + |
| 32 | +## Architecture Principles |
| 33 | + |
| 34 | +### 1. Zero-Copy Architecture |
| 35 | + |
| 36 | +The system is built around an arena allocator (`oxc_allocator`) that enables zero-copy operations throughout the compilation pipeline. All AST nodes are allocated in a single arena, eliminating the need for reference counting or garbage collection. |
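The arena idea above can be illustrated with a minimal, standard-library-only sketch. It is not the `oxc_allocator` API: the names `Arena`, `NodeId`, `Expr`, and `build_increment` are hypothetical, and it uses index handles into a `Vec` rather than bump-allocated references. It only aims to show nodes living in one buffer, identifiers borrowing from the source text without copying, and everything being released together when the arena is dropped.

```rust
/// Handle to a node stored in the arena (an index, not a pointer).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NodeId(usize);

/// Toy expression node. Children are referenced by `NodeId`,
/// so no `Rc` or `Box` is needed to link nodes together.
#[derive(Debug)]
pub enum Expr<'src> {
    /// Borrows its text directly from the source string (no copy).
    Ident(&'src str),
    Number(f64),
    Add(NodeId, NodeId),
}

/// All nodes for one compilation unit live in a single growable buffer.
/// Dropping the arena frees every node at once.
#[derive(Default)]
pub struct Arena<'src> {
    nodes: Vec<Expr<'src>>,
}

impl<'src> Arena<'src> {
    pub fn new() -> Self {
        Self::default()
    }

    /// Stores a node and returns a cheap, copyable handle to it.
    pub fn alloc(&mut self, expr: Expr<'src>) -> NodeId {
        self.nodes.push(expr);
        NodeId(self.nodes.len() - 1)
    }

    pub fn get(&self, id: NodeId) -> &Expr<'src> {
        &self.nodes[id.0]
    }

    pub fn len(&self) -> usize {
        self.nodes.len()
    }
}

/// Builds the tree for `<name> + 1`, where `name` is a slice of the source text.
pub fn build_increment<'src>(arena: &mut Arena<'src>, name: &'src str) -> NodeId {
    let lhs = arena.alloc(Expr::Ident(name));
    let rhs = arena.alloc(Expr::Number(1.0));
    arena.alloc(Expr::Add(lhs, rhs))
}
```

A real bump allocator hands out references tied to the arena's lifetime instead of indices, but the ownership story is the same: nodes never own each other, and the arena owns everything.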
| 37 | + |
| 38 | +### 2. Visitor Pattern |
| 39 | + |
| 40 | +AST traversal is implemented using the visitor pattern (`oxc_ast_visit`) with automatic visitor generation through procedural macros. This ensures type safety and performance while maintaining code clarity. |
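As a rough illustration of the pattern, the sketch below writes by hand the kind of trait that a generator would normally produce. The `Expr`, `Visit`, `walk_expr`, and `IdentCollector` names are assumptions made for this example and are not taken from `oxc_ast_visit`; the toy AST also uses `Box` for brevity instead of arena allocation.

```rust
/// Toy AST, owned with `Box` for brevity.
pub enum Expr {
    Ident(String),
    Number(f64),
    Binary(Box<Expr>, Box<Expr>),
}

/// Visitor trait: every method has a default body, so an implementor
/// overrides only the node kinds it cares about.
pub trait Visit {
    fn visit_expr(&mut self, expr: &Expr) {
        walk_expr(self, expr);
    }
    fn visit_ident(&mut self, _name: &str) {}
    fn visit_number(&mut self, _value: f64) {}
}

/// Default traversal: dispatches on the node kind and recurses into children.
pub fn walk_expr<V: Visit + ?Sized>(visitor: &mut V, expr: &Expr) {
    match expr {
        Expr::Ident(name) => visitor.visit_ident(name),
        Expr::Number(value) => visitor.visit_number(*value),
        Expr::Binary(lhs, rhs) => {
            visitor.visit_expr(lhs);
            visitor.visit_expr(rhs);
        }
    }
}

/// Example visitor: collects identifier names in source order.
#[derive(Default)]
pub struct IdentCollector {
    pub names: Vec<String>,
}

impl Visit for IdentCollector {
    fn visit_ident(&mut self, name: &str) {
        self.names.push(name.to_string());
    }
}
```

Because dispatch is a plain `match` plus statically resolved method calls, the compiler can inline the traversal, which is one reason this pattern suits hot paths.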
| 41 | + |
| 42 | +### 3. Shared Infrastructure |
| 43 | + |
| 44 | +Common functionality such as error reporting (`oxc_diagnostics`), source positions (`oxc_span`), and syntax definitions (`oxc_syntax`) is shared across all components to ensure consistency. |
| 45 | + |
| 46 | +## Core Components |
| 47 | + |
| 48 | +### Foundation Layer |
| 49 | + |
| 50 | +#### oxc_allocator |
| 51 | + |
| 52 | +- **Purpose**: Arena-based memory allocator for zero-copy operations |
| 53 | +- **Key Features**: |
| 54 | + - Single allocation arena for entire compilation unit |
| 55 | + - Eliminates need for Rc/Arc in hot paths |
| 56 | + - Enables structural sharing of AST nodes |
| 57 | +- **Dependencies**: None (foundational) |
| 58 | + |
| 59 | +#### oxc_span |
| 60 | + |
| 61 | +- **Purpose**: Source position tracking and text manipulation |
| 62 | +- **Key Features**: |
| 63 | + - Byte-based indexing for UTF-8 correctness |
| 64 | + - Efficient span operations for source maps |
| 65 | + - Integration with diagnostic reporting |
| 66 | +- **Dependencies**: None (foundational) |
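To make the byte-based indexing point concrete, here is a small sketch of a span type. It is a simplified stand-in written for this document, not the `oxc_span` implementation; the method names `size`, `source_text`, and `merge` are assumptions. The key property is that offsets count UTF-8 bytes, so slicing the source never needs a character scan.

```rust
/// Half-open byte range `[start, end)` into the UTF-8 source text.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Span {
    pub start: u32,
    pub end: u32,
}

impl Span {
    pub fn new(start: u32, end: u32) -> Self {
        assert!(start <= end, "span start must not exceed span end");
        Self { start, end }
    }

    /// Length in bytes (not in characters).
    pub fn size(self) -> u32 {
        self.end - self.start
    }

    /// Slices the source directly; O(1) because offsets are byte offsets.
    /// Panics if the offsets do not fall on UTF-8 character boundaries.
    pub fn source_text(self, source: &str) -> &str {
        &source[self.start as usize..self.end as usize]
    }

    /// Smallest span covering both `self` and `other`.
    pub fn merge(self, other: Span) -> Span {
        Span::new(self.start.min(other.start), self.end.max(other.end))
    }
}
```

For the source `const café = 1;`, the identifier `café` occupies bytes 6 to 11: five bytes for four characters, since `é` takes two bytes in UTF-8.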
| 67 | + |
| 68 | +#### oxc_syntax |
| 69 | + |
| 70 | +- **Purpose**: JavaScript/TypeScript language definitions |
| 71 | +- **Key Features**: |
| 72 | + - Token definitions and keyword mappings |
| 73 | + - Language feature flags and compatibility |
| 74 | + - Shared syntax validation logic |
| 75 | +- **Dependencies**: oxc_span |
| 76 | + |
| 77 | +#### oxc_diagnostics |
| 78 | + |
| 79 | +- **Purpose**: Error reporting and diagnostic infrastructure |
| 80 | +- **Key Features**: |
| 81 | + - Rich error messages with source context |
| 82 | + - Multiple output formats (JSON, pretty-printed) |
| 83 | + - Integration with language server protocol |
| 84 | +- **Dependencies**: oxc_span |
| 85 | + |
| 86 | +#### oxc_ast |
| 87 | + |
| 88 | +- **Purpose**: Abstract Syntax Tree definitions and utilities |
| 89 | +- **Key Features**: |
| 90 | + - Complete JavaScript/TypeScript AST coverage |
| 91 | + - Generated visitor traits for type safety |
| 92 | + - Serialization support for caching |
| 93 | +- **Dependencies**: oxc_allocator, oxc_span, oxc_syntax |
| 94 | + |
| 95 | +##### AST Design Principles |
| 96 | + |
| 97 | +The Oxc AST differs significantly from the [estree](https://github.com/estree/estree) AST specification by removing ambiguous nodes and introducing distinct types. While many existing JavaScript tools rely on estree as their AST specification, a notable drawback is its abundance of ambiguous nodes that often leads to confusion during development. |
| 98 | + |
| 99 | +For example, instead of using a generic estree `Identifier`, the Oxc AST provides specific types such as: |
| 100 | + |
| 101 | +- `BindingIdentifier` - for variable declarations and bindings |
| 102 | +- `IdentifierReference` - for variable references |
| 103 | +- `IdentifierName` - for property names and labels |
| 104 | + |
| 105 | +This clear distinction greatly enhances the development experience by aligning more closely with the ECMAScript specification and providing better type safety. |
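The benefit of distinct identifier types can be sketched as follows. The three struct names come from the list above, but their fields and the `declare` and `resolve` helpers are simplified assumptions for illustration. The point is that a function expecting a binding cannot be handed a property name by mistake, because the compiler rejects the call.

```rust
/// Introduces a name, e.g. `x` in `const x = ...`.
pub struct BindingIdentifier<'a> {
    pub name: &'a str,
}

/// Refers to a previously declared name, e.g. `obj` in `obj.x`.
pub struct IdentifierReference<'a> {
    pub name: &'a str,
}

/// A plain name that is never resolved, e.g. the property `x` in `obj.x`.
pub struct IdentifierName<'a> {
    pub name: &'a str,
}

/// Only bindings can add names to a scope.
pub fn declare<'a>(scope: &mut Vec<&'a str>, id: &BindingIdentifier<'a>) {
    scope.push(id.name);
}

/// Only references are looked up in a scope.
pub fn resolve(scope: &[&str], id: &IdentifierReference<'_>) -> bool {
    scope.iter().any(|declared| *declared == id.name)
}
```

In `const x = obj.x;` the first `x` is a `BindingIdentifier`, `obj` is an `IdentifierReference`, and the trailing `x` is an `IdentifierName`. Passing the property name to `resolve` would be a type error rather than a silent bug.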
| 106 | + |
| 107 | +### Core Processing Layer |
| 108 | + |
| 109 | +#### oxc_parser |
| 110 | + |
| 111 | +- **Purpose**: JavaScript/TypeScript parsing |
| 112 | +- **Key Features**: |
| 113 | + - Hand-written recursive descent parser |
| 114 | + - Full ES2024+ and TypeScript support |
| 115 | + - Preservation of comments and trivia |
| 116 | +- **Dependencies**: oxc_allocator, oxc_ast, oxc_diagnostics, oxc_span, oxc_syntax |
| 117 | + |
| 118 | +#### oxc_semantic |
| 119 | + |
| 120 | +- **Purpose**: Semantic analysis and symbol resolution |
| 121 | +- **Key Features**: |
| 122 | + - Scope chain construction |
| 123 | + - Symbol table generation |
| 124 | + - Dead code detection |
| 125 | +- **Dependencies**: oxc_ast, oxc_cfg, oxc_diagnostics, oxc_span, oxc_syntax |
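Scope chain construction and symbol resolution can be sketched with a small tree of scopes. This is an illustrative model only; `ScopeTree`, `ScopeId`, and `SymbolId` are hypothetical names here, and the real `oxc_semantic` data structures may be organised differently. Resolution walks from the innermost scope outward until a binding is found.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ScopeId(usize);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SymbolId(usize);

struct Scope {
    parent: Option<ScopeId>,
    bindings: HashMap<String, SymbolId>,
}

/// Flat storage of scopes; each scope points at its parent.
pub struct ScopeTree {
    scopes: Vec<Scope>,
    symbol_count: usize,
}

impl ScopeTree {
    /// Creates a tree containing only the root (global) scope.
    pub fn new() -> Self {
        let root = Scope { parent: None, bindings: HashMap::new() };
        Self { scopes: vec![root], symbol_count: 0 }
    }

    pub fn root(&self) -> ScopeId {
        ScopeId(0)
    }

    pub fn add_scope(&mut self, parent: ScopeId) -> ScopeId {
        self.scopes.push(Scope { parent: Some(parent), bindings: HashMap::new() });
        ScopeId(self.scopes.len() - 1)
    }

    /// Declares `name` in `scope`, shadowing any outer binding of the same name.
    pub fn declare(&mut self, scope: ScopeId, name: &str) -> SymbolId {
        let symbol = SymbolId(self.symbol_count);
        self.symbol_count += 1;
        self.scopes[scope.0].bindings.insert(name.to_string(), symbol);
        symbol
    }

    /// Walks the scope chain outward until a binding for `name` is found.
    pub fn resolve(&self, scope: ScopeId, name: &str) -> Option<SymbolId> {
        let mut current = Some(scope);
        while let Some(id) = current {
            let scope = &self.scopes[id.0];
            if let Some(&symbol) = scope.bindings.get(name) {
                return Some(symbol);
            }
            current = scope.parent;
        }
        None
    }
}
```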
| 126 | + |
| 127 | +#### oxc_linter |
| 128 | + |
| 129 | +- **Purpose**: ESLint-compatible linting engine |
| 130 | +- **Key Features**: |
| 131 | + - 200+ built-in rules |
| 132 | + - Plugin architecture for custom rules |
| 133 | + - Automatic fixing for many rules |
| 134 | + - Configuration compatibility with ESLint |
| 135 | +- **Dependencies**: oxc_ast, oxc_semantic, oxc_diagnostics, oxc_cfg |
| 136 | + |
| 137 | +#### oxc_transformer |
| 138 | + |
| 139 | +- **Purpose**: Code transformation and transpilation |
| 140 | +- **Key Features**: |
| 141 | + - TypeScript to JavaScript transformation |
| 142 | + - Modern JavaScript feature transpilation |
| 143 | + - React JSX transformation |
| 144 | + - Babel plugin compatibility layer |
| 145 | +- **Dependencies**: oxc_ast, oxc_semantic, oxc_allocator |
| 146 | + |
| 147 | +#### oxc_minifier |
| 148 | + |
| 149 | +- **Purpose**: Code size optimization |
| 150 | +- **Key Features**: |
| 151 | + - Dead code elimination |
| 152 | + - Constant folding and propagation |
| 153 | + - Identifier mangling integration |
| 154 | + - Statement and expression optimization |
| 155 | +- **Dependencies**: oxc_ast, oxc_semantic, oxc_mangler |
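Constant folding, one of the optimizations listed above, can be shown on a toy expression tree. The sketch below is a generic bottom-up folder and does not reflect how `oxc_minifier` is actually implemented; the `Expr` type and `fold` function are assumptions for this example. It folds children first, then collapses a node when both operands are numeric literals.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Number(f64),
    Ident(String),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Folds constant sub-expressions bottom-up, leaving anything that
/// involves an identifier untouched.
pub fn fold(expr: Expr) -> Expr {
    match expr {
        Expr::Add(lhs, rhs) => match (fold(*lhs), fold(*rhs)) {
            (Expr::Number(a), Expr::Number(b)) => Expr::Number(a + b),
            (lhs, rhs) => Expr::Add(Box::new(lhs), Box::new(rhs)),
        },
        Expr::Mul(lhs, rhs) => match (fold(*lhs), fold(*rhs)) {
            (Expr::Number(a), Expr::Number(b)) => Expr::Number(a * b),
            (lhs, rhs) => Expr::Mul(Box::new(lhs), Box::new(rhs)),
        },
        // Literals and identifiers are already in their simplest form.
        leaf => leaf,
    }
}
```

With this sketch, `1 + 2 * 3` folds to the literal `7`, while `2 * 3 + x` becomes `6 + x` because the identifier blocks further folding.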
| 156 | + |
| 157 | +#### oxc_codegen |
| 158 | + |
| 159 | +- **Purpose**: AST to source code generation |
| 160 | +- **Key Features**: |
| 161 | + - Configurable output formatting |
| 162 | + - Source map generation |
| 163 | + - Comment preservation options |
| 164 | + - Minified and pretty-printed output modes |
| 165 | +- **Dependencies**: oxc_ast, oxc_span |
| 166 | + |
| 167 | +### Application Layer |
| 168 | + |
| 169 | +#### oxlint (apps/oxlint) |
| 170 | + |
| 171 | +- **Purpose**: Command-line linter application |
| 172 | +- **Key Features**: |
| 173 | + - File discovery and parallel processing |
| 174 | + - Configuration file support |
| 175 | + - Multiple output formats |
| 176 | + - Integration with CI/CD systems |
| 177 | +- **Dependencies**: oxc_linter, oxc_parser, oxc_semantic |
| 178 | + |
| 179 | +#### Language Server (oxc_language_server) |
| 180 | + |
| 181 | +- **Purpose**: LSP implementation for editor integration |
| 182 | +- **Key Features**: |
| 183 | + - Real-time diagnostics |
| 184 | + - Go-to-definition and references |
| 185 | + - Symbol search and completion |
| 186 | +- **Dependencies**: All core components |
| 187 | + |
| 188 | +#### NAPI Bindings (napi/*) |
| 189 | + |
| 190 | +- **Purpose**: Node.js integration layer |
| 191 | +- **Key Features**: |
| 192 | + - Parser bindings for JavaScript tooling |
| 193 | + - Linter integration for build tools |
| 194 | + - Transform pipeline for bundlers |
| 195 | + - Async processing support |
| 196 | +- **Dependencies**: Core components + Node.js FFI |
| 197 | + |
| 198 | +## Data Flow |
| 199 | + |
| 200 | +### Compilation Pipeline |
| 201 | + |
| 202 | +1. **Input**: Source text + configuration |
| 203 | +2. **Lexing/Parsing**: `oxc_parser` → AST + comments |
| 204 | +3. **Semantic Analysis**: `oxc_semantic` → Symbol table + scope info |
| 205 | +4. **Processing**: Tool-specific analysis (linting, transformation, etc.) |
| 206 | +5. **Output**: Results (diagnostics, transformed code, etc.) |
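The five steps above can be sketched end to end on a deliberately tiny language in which statements are either `let NAME` or `use NAME`, separated by semicolons. Everything here (`Stmt`, `parse`, `analyze`, `lint`, `run`) is a hypothetical stand-in for the real crates; the sketch only mirrors the shape of the pipeline: parse into a tree, derive symbol information, then run a tool-specific pass that produces diagnostics.

```rust
use std::collections::HashSet;

/// Toy AST: statements borrow their names from the source text.
#[derive(Debug, PartialEq)]
pub enum Stmt<'a> {
    Let(&'a str),
    Use(&'a str),
}

/// Step 2: source text -> AST (or a syntax error).
pub fn parse(source: &str) -> Result<Vec<Stmt<'_>>, String> {
    let mut program = Vec::new();
    for raw in source.split(';') {
        let mut words = raw.split_whitespace();
        match (words.next(), words.next(), words.next()) {
            (None, _, _) => {} // empty statement, e.g. after the last `;`
            (Some("let"), Some(name), None) => program.push(Stmt::Let(name)),
            (Some("use"), Some(name), None) => program.push(Stmt::Use(name)),
            _ => return Err(format!("syntax error in `{}`", raw.trim())),
        }
    }
    Ok(program)
}

/// Step 3: AST -> symbol table (here: the set of declared names).
pub fn analyze<'a>(program: &[Stmt<'a>]) -> HashSet<&'a str> {
    program
        .iter()
        .filter_map(|stmt| match stmt {
            Stmt::Let(name) => Some(*name),
            Stmt::Use(_) => None,
        })
        .collect()
}

/// Step 4: tool-specific processing (here: a single "undefined name" lint).
pub fn lint(program: &[Stmt<'_>], symbols: &HashSet<&str>) -> Vec<String> {
    program
        .iter()
        .filter_map(|stmt| match stmt {
            Stmt::Use(name) if !symbols.contains(*name) => {
                Some(format!("`{}` is not defined", name))
            }
            _ => None,
        })
        .collect()
}

/// Steps 1 to 5 chained together: source in, owned diagnostics out.
pub fn run(source: &str) -> Result<Vec<String>, String> {
    let program = parse(source)?;
    let symbols = analyze(&program);
    Ok(lint(&program, &symbols))
}
```

Note that the AST and symbol table only borrow from the source text, while the diagnostics returned by `run` are owned strings, matching the borrowed-then-owned progression in the memory management flow.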
| 207 | + |
| 208 | +### Memory Management Flow |
| 209 | + |
| 210 | +``` |
| 211 | +Source Text → Arena Allocator → AST Nodes → Visitors → Results |
| 212 | + ↓ ↓ ↓ ↓ ↓ |
| 213 | + UTF-8 Arena Borrowed Zero-copy Owned |
| 214 | + String Memory References Processing Output |
| 215 | +``` |
| 216 | + |
| 217 | +## Quality Attributes |
| 218 | + |
| 219 | +### Performance |
| 220 | + |
| 221 | +- **Target**: 10-100x faster than comparable tools |
| 222 | +- **Strategies**: |
| 223 | + - Arena allocation for memory efficiency |
| 224 | + - Zero-copy data structures |
| 225 | + - Parallel processing where possible |
| 226 | + - Minimal allocations in hot paths |
| 227 | + |
| 228 | +#### Parser Performance Implementation |
| 229 | + |
| 230 | +- AST is allocated in a memory arena ([bumpalo](https://crates.io/crates/bumpalo)) for fast AST memory allocation and deallocation |
| 231 | +- Short strings are inlined by [CompactString](https://crates.io/crates/compact_str) |
| 232 | +- Apart from these two, the parser performs no other heap allocations |
| 233 | +- Scope binding, symbol resolution, and some syntax errors are not handled in the parser; they are delegated to the semantic analyzer |
| 234 | + |
| 235 | +#### Linter Performance Implementation |
| 236 | + |
| 237 | +- Oxc parser is used for optimal performance |
| 238 | +- AST traversal is fast because nodes are scanned linearly within the memory arena |
| 239 | +- Files are linted in a multi-threaded environment, so linting scales with the total number of CPU cores |
| 240 | +- Every single lint rule is tuned for performance |
| 241 | + |
| 242 | +### Correctness |
| 243 | + |
| 244 | +- **Target**: 100% compatibility with language standards |
| 245 | +- **Strategies**: |
| 246 | + - Comprehensive test suites |
| 247 | + - Real-world codebase testing |
| 248 | + - Conformance testing against official specs |
| 249 | + - Conservative error handling |
| 250 | + |
| 251 | +### Maintainability |
| 252 | + |
| 253 | +- **Target**: Clear, reviewable, extensible codebase |
| 254 | +- **Strategies**: |
| 255 | + - Strong type system usage |
| 256 | + - Procedural macro code generation |
| 257 | + - Clear separation of concerns |
| 258 | + - Comprehensive documentation |
| 259 | + |
| 260 | +### Usability |
| 261 | + |
| 262 | +- **Target**: Drop-in replacement for existing tools |
| 263 | +- **Strategies**: |
| 264 | + - Configuration compatibility |
| 265 | + - Familiar CLI interfaces |
| 266 | + - Rich error messages |
| 267 | + - Editor integration |
| 268 | + |
| 269 | +## Technical Constraints |
| 270 | + |
| 271 | +### Language Choice |
| 272 | + |
| 273 | +- **Rust**: Chosen for memory safety, performance, and zero-cost abstractions |
| 274 | +- **MSRV**: N-2 policy for stability |
| 275 | + |
| 276 | +### Memory Model |
| 277 | + |
| 278 | +- **Arena Allocation**: Single arena per compilation unit |
| 279 | +- **Lifetime Management**: Explicit lifetimes tied to arena |
| 280 | +- **No Garbage Collection**: Memory is released deterministically when the arena is dropped, for predictable performance |
| 281 | + |
| 282 | +### Threading Model |
| 283 | + |
| 284 | +- **File-level Parallelism**: Multiple files processed in parallel |
| 285 | +- **Single-threaded Pipeline**: Each file processed by single thread |
| 286 | +- **Shared State**: Minimal shared state to avoid synchronization overhead |
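The threading model above can be sketched with scoped threads from the standard library. This is a simplification made for illustration: it spawns one thread per file, whereas a production tool would more likely use a thread pool, and the `lint_one` check (counting `debugger` occurrences) is a hypothetical placeholder for a real single-threaded per-file pipeline.

```rust
use std::thread;

/// Placeholder per-file pipeline: runs entirely on one thread and
/// shares no mutable state with other files.
pub fn lint_one(source: &str) -> usize {
    source.matches("debugger").count()
}

/// Processes every `(path, source)` pair on its own thread and
/// returns the results in input order.
pub fn lint_files(files: &[(&str, &str)]) -> Vec<(String, usize)> {
    thread::scope(|scope| {
        // Spawn all workers first so they run concurrently.
        let handles: Vec<_> = files
            .iter()
            .map(|&(path, source)| {
                scope.spawn(move || (path.to_string(), lint_one(source)))
            })
            .collect();

        // Joining in spawn order keeps the output deterministic.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("lint worker panicked"))
            .collect::<Vec<_>>()
    })
}
```

Each worker only reads its own source text and returns an owned result, so no locks or atomics are needed.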
| 287 | + |
| 288 | +### Compatibility Requirements |
| 289 | + |
| 290 | +- **JavaScript**: ES2024+ compatibility |
| 291 | +- **TypeScript**: Latest TypeScript syntax support |
| 292 | +- **Node.js**: LTS versions through NAPI bindings |
| 293 | +- **Editors**: LSP compatibility for all major editors |
| 294 | + |
| 295 | +## Design Decisions |
| 296 | + |
| 297 | +### Arena Allocator Choice |
| 298 | + |
| 299 | +**Decision**: Use custom arena allocator instead of Rc/Arc |
| 300 | +**Rationale**: |
| 301 | + |
| 302 | +- Eliminates reference counting overhead |
| 303 | +- Enables zero-copy string operations |
| 304 | +- Simplifies memory management |
| 305 | +- Improves cache locality |
| 306 | + |
| 307 | +**Trade-offs**: |
| 308 | + |
| 309 | +- ✅ 10-50% performance improvement |
| 310 | +- ✅ Simplified ownership model |
| 311 | +- ❌ Requires lifetime management |
| 312 | +- ❌ Less flexible memory patterns |
| 313 | + |
| 314 | +### Hand-written Parser |
| 315 | + |
| 316 | +**Decision**: Implement recursive descent parser instead of parser generator |
| 317 | +**Rationale**: |
| 318 | + |
| 319 | +- Easier debugging and maintenance |
| 320 | +- More efficient generated code |
| 321 | +- Faster compilation times |
| 322 | + |
| 323 | +**Trade-offs**: |
| 324 | + |
| 325 | +- ✅ Better performance and error messages |
| 326 | +- ✅ More maintainable code |
| 327 | +- ❌ More manual implementation work |
| 328 | +- ❌ Higher risk of parser bugs |
| 329 | + |
| 330 | +### Visitor Pattern |
| 331 | + |
| 332 | +**Decision**: Use visitor pattern with procedural macros |
| 333 | +**Rationale**: |
| 334 | + |
| 335 | +- Type-safe AST traversal |
| 336 | +- Automatic visitor generation |
| 337 | +- Consistent patterns across tools |
| 338 | +- Efficient dispatch |
| 339 | + |
| 340 | +**Trade-offs**: |
| 341 | + |
| 342 | +- ✅ Type safety and performance |
| 343 | +- ✅ Reduced boilerplate code |
| 344 | +- ❌ Compile-time complexity |
| 345 | +- ❌ Learning curve for contributors |
| 346 | + |
| 347 | +## Future Considerations |
| 348 | + |
| 349 | +### Planned Extensions |
| 350 | + |
| 351 | +- **Formatter**: Complete code formatting tool |
| 352 | +- **Bundler**: Integration with bundling workflows |
| 353 | +- **Type Checker**: Full TypeScript type checking |
| 354 | +- **Plugin System**: User-defined transformations |
| 355 | + |
| 356 | +### Scalability Concerns |
| 357 | + |
| 358 | +- **Large Codebases**: Processing optimization improvements |
| 359 | +- **Memory Usage**: Streaming processing for huge files |
| 360 | +- **Parallel Processing**: Fine-grained parallelization |
| 361 | + |
| 362 | +### Technology Evolution |
| 363 | + |
| 364 | +- **Rust Evolution**: Leveraging new language features |
| 365 | +- **JavaScript Standards**: Keeping pace with TC39 proposals |
| 366 | +- **Editor Integration**: Advanced IDE features |
| 367 | + |
| 368 | +## Development Infrastructure |
| 369 | + |
| 370 | +### Test Infrastructure |
| 371 | + |
| 372 | +Correctness and reliability are taken extremely seriously in Oxc. We spend significant effort on strengthening the test infrastructure to prevent problems from propagating to downstream tools: |
| 373 | + |
| 374 | +- **Conformance Testing**: Test262, Babel, and TypeScript conformance suites |
| 375 | +- **Fuzzing**: Extensive fuzzing to discover edge cases |
| 376 | +- **Snapshot Testing**: Linter diagnostic snapshots for regression prevention |
| 377 | +- **Ecosystem CI**: Testing against real-world codebases |
| 378 | +- **Idempotency Testing**: Ensuring transformations are stable |
| 379 | +- **Code Coverage**: Comprehensive coverage tracking |
| 380 | +- **End-to-End Testing**: Testing against top 3000 npm packages |
| 381 | + |
| 382 | +### Build and Development Tools |
| 383 | + |
| 384 | +- **Rust**: MSRV 1.86.0+ with clippy and rustfmt integration |
| 385 | +- **Just**: Command runner for development tasks (`just --list` for available commands) |
| 386 | +- **Performance Monitoring**: Continuous benchmarking and performance regression detection |
| 387 | +- **Cross-platform**: Support for Linux, macOS, and Windows |
| 388 | +- **CI/CD**: Automated testing, building, and publishing pipelines |
| 389 | + |
| 390 | +For detailed development guidelines, see [CONTRIBUTING.md](./CONTRIBUTING.md) and [AGENTS.md](./AGENTS.md). |
| 391 | + |
| 392 | +--- |
| 393 | + |
| 394 | +This architecture document follows the [architecture.md](https://architecture.md/) format for documenting software architecture decisions and system design. |