|
| 1 | +# AGENTS.md |
| 2 | + |
| 3 | +This file provides guidance to AI agents when working with code in this repository. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +UnityDataTools is a .NET 9.0 command-line tool for analyzing Unity build output (AssetBundles, Player builds, Addressables). It extracts data from Unity's proprietary binary formats into SQLite databases and human-readable text files. The tool showcases the UnityFileSystemApi native library and serves as both a production tool and a reference implementation. |
| 8 | + |
| 9 | +## Common Commands |
| 10 | + |
| 11 | +### Building |
| 12 | +```bash |
| 13 | +# Build entire solution in Release mode |
| 14 | +dotnet build -c Release |
| 15 | + |
| 16 | +# Build from solution file |
| 17 | +dotnet build UnityDataTools.sln -c Release |
| 18 | + |
| 19 | +# Build specific project |
| 20 | +dotnet build UnityDataTool/UnityDataTool.csproj -c Release |
| 21 | +``` |
| 22 | + |
| 23 | +Output location (Windows): `UnityDataTool\bin\Release\net9.0\UnityDataTool.exe` |
| 24 | + |
| 25 | +### Publishing (Mac-specific) |
| 26 | +```bash |
| 27 | +# Intel Mac |
| 28 | +dotnet publish UnityDataTool -c Release -r osx-x64 -p:PublishSingleFile=true -p:UseAppHost=true |
| 29 | + |
| 30 | +# Apple Silicon Mac |
| 31 | +dotnet publish UnityDataTool -c Release -r osx-arm64 -p:PublishSingleFile=true -p:UseAppHost=true |
| 32 | +``` |
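
The two publish commands differ only in the runtime identifier, so a script can pick it from the machine architecture. A minimal sketch, assuming `uname -m` reports `arm64` on Apple Silicon and `x86_64` on Intel Macs; `pick_rid` is a hypothetical helper name, not something the repository provides.

```shell
# Hypothetical helper: map a machine architecture string (as printed by
# `uname -m`) to the macOS runtime identifier used by `dotnet publish`.
pick_rid() {
  case "$1" in
    arm64)  echo "osx-arm64" ;;
    x86_64) echo "osx-x64" ;;
    *)      echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}
```

Usage on a Mac would then look like `dotnet publish UnityDataTool -c Release -r "$(pick_rid "$(uname -m)")" -p:PublishSingleFile=true -p:UseAppHost=true`.
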
| 33 | + |
| 34 | +### Testing |
| 35 | +```bash |
| 36 | +# Run all tests |
| 37 | +dotnet test |
| 38 | + |
| 39 | +# Run tests for specific project |
| 40 | +dotnet test UnityFileSystem.Tests/UnityFileSystem.Tests.csproj |
| 41 | +dotnet test Analyzer.Tests/Analyzer.Tests.csproj |
| 42 | +dotnet test UnityDataTool.Tests/UnityDataTool.Tests.csproj |
| 43 | + |
| 44 | +# Run tests with filter |
| 45 | +dotnet test --filter "FullyQualifiedName~SerializedFile" |
| 46 | +``` |
| 47 | + |
| 48 | +Test projects: UnityFileSystem.Tests, Analyzer.Tests, UnityDataTool.Tests, TestCommon (helper library) |
| 49 | + |
| 50 | +### Running the Tool |
| 51 | +```bash |
| 52 | +# Show all commands |
| 53 | +UnityDataTool --help |
| 54 | + |
| 55 | +# Analyze AssetBundles into SQLite database |
| 56 | +UnityDataTool analyze /path/to/bundles -o database.db |
| 57 | + |
| 58 | +# Dump binary file to text format |
| 59 | +UnityDataTool dump /path/to/file.bundle -o /output/path |
| 60 | + |
| 61 | +# Extract archive contents |
| 62 | +UnityDataTool archive extract file.bundle -o contents/ |
| 63 | + |
| 64 | +# Find reference chains to an object |
| 65 | +UnityDataTool find-refs database.db -n "ObjectName" -t "Texture2D" |
| 66 | +``` |
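
To dump every bundle under a directory rather than a single file, the `dump` command above can be wrapped in a loop. A rough sketch, assuming bundles use the `.bundle` extension, paths contain no newlines, and `UnityDataTool` is on `PATH`; `list_bundles` and `dump_all` are hypothetical helper names.

```shell
# Hypothetical helper: print every *.bundle file under a directory, sorted.
list_bundles() {
  find "$1" -type f -name '*.bundle' | sort
}

# Hypothetical helper: dump each bundle found under $1 into the folder $2.
# Assumes UnityDataTool is available on PATH.
dump_all() {
  mkdir -p "$2"
  list_bundles "$1" | while IFS= read -r bundle; do
    UnityDataTool dump "$bundle" -o "$2"
  done
}
```

For example, `dump_all /path/to/bundles /output/path` would write one text dump per bundle into `/output/path`.
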
| 67 | + |
| 68 | +## Architecture |
| 69 | + |
| 70 | +### Component Hierarchy |
| 71 | +``` |
| 72 | +UnityDataTool (CLI executable) |
| 73 | +├── Analyzer → SQLite database generation |
| 74 | +├── TextDumper → Human-readable text output |
| 75 | +├── ReferenceFinder → Object reference chain tracing |
| 76 | +└── UnityFileSystem → C# wrapper for native library |
| 77 | + └── UnityFileSystemApi (native .dll/.dylib/.so) |
| 78 | +``` |
| 79 | + |
| 80 | +### Key Architectural Patterns |
| 81 | + |
| 82 | +**Native Interop**: UnityFileSystem wraps UnityFileSystemApi (native library from Unity Editor) via P/Invoke in `DllWrapper.cs`. The native library reads Unity Archive and SerializedFile formats. |
| 83 | + |
| 84 | +**TypeTree Navigation**: Unity binary files contain TypeTrees that describe object serialization. The `RandomAccessReader` class navigates these trees like property accessors: `reader["m_Name"].GetValue<string>()`. This enables the tool to interpret objects without hardcoded type knowledge. |
| 85 | + |
| 86 | +**Parser Pattern**: The `ISQLiteFileParser` interface lets multiple parsers handle different file formats: |
| 87 | +- `SerializedFileParser` - Unity binary files (AssetBundles, Player data) |
| 88 | +- `AddressablesBuildLayoutParser` - JSON build reports |
| 89 | + |
| 90 | +**Handler Registry**: Type-specific handlers extract specialized properties for Unity object types. Handlers implement `ISQLiteHandler` and are registered in `SerializedFileSQLiteWriter.m_Handlers`: |
| 91 | +- `MeshHandler` - vertices, indices, bones, blend shapes |
| 92 | +- `Texture2DHandler` - width, height, format, mipmaps |
| 93 | +- `ShaderHandler` - variants, keywords, subprograms |
| 94 | +- `AudioClipHandler` - compression, channels, frequency |
| 95 | +- `AnimationClipHandler` - legacy flag, events |
| 96 | +- `AssetBundleHandler` - dependencies, preload data |
| 97 | +- `PreloadDataHandler` - preloaded assets |
| 98 | + |
| 99 | +**SQL Schema Resources**: Each handler has an embedded `.sql` resource file defining its tables and views (e.g., `Analyzer/SQLite/Resources/Mesh.sql`). Views join type-specific tables with the base `objects` table. |
| 100 | + |
| 101 | +**Command Pattern**: SQL operations are encapsulated in classes derived from `AbstractCommand` that provide `CreateCommand()`, `SetValue()`, and `ExecuteNonQuery()` methods. |
| 102 | + |
| 103 | +### Data Flow (Analyze Command) |
| 104 | + |
| 105 | +1. `Program.cs` → `HandleAnalyze()` → `AnalyzerTool.Analyze()` |
| 106 | +2. `AnalyzerTool` finds files matching the search pattern |
| 107 | +3. For each file, parsers are tried in order (JSON first, then SerializedFile) |
| 108 | +4. `SerializedFileParser.ProcessFile()`: |
| 109 | + - Checks for Unity Archive signature → calls `MountArchive()` |
| 110 | + - Otherwise treats as SerializedFile → calls `OpenSerializedFile()` |
| 111 | +5. `SerializedFileSQLiteWriter.WriteSerializedFile()`: |
| 112 | + - Iterates through `sf.Objects` |
| 113 | + - Gets TypeTree via `sf.GetTypeTreeRoot(objectId)` |
| 114 | + - Creates `RandomAccessReader` to navigate properties |
| 115 | + - Looks up type-specific handler in `m_Handlers` dictionary |
| 116 | + - Handler extracts specialized properties (e.g., MeshHandler reads vertex count) |
| 117 | + - Writes to `objects` table + type-specific table (e.g., `meshes`) |
| 118 | + - Optionally processes PPtrs (references) and calculates CRC32 |
| 119 | +6. SQLiteWriter finalizes database with indexes and views |
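
The archive-versus-SerializedFile branch in step 4 can be approximated from the shell when triaging files before a run. A minimal sketch, assuming the archive starts with the ASCII signature `UnityFS` (the header used by current AssetBundles; the tool's own check may recognize other signatures); `is_unity_archive` is a hypothetical helper, not part of UnityDataTool.

```shell
# Hypothetical helper: succeed if the file starts with the assumed
# Unity Archive signature "UnityFS".
is_unity_archive() {
  # Read the first 7 bytes; strip NUL bytes so the shell comparison is clean.
  [ "$(head -c 7 "$1" 2>/dev/null | tr -d '\0')" = "UnityFS" ]
}
```

A file that fails this check is not necessarily invalid input: per the flow above, it may still be a bare SerializedFile.
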
| 120 | + |
| 121 | +### Important Files |
| 122 | + |
| 123 | +**Entry Points**: |
| 124 | +- `UnityDataTool/Program.cs` - CLI using System.CommandLine |
| 125 | +- `UnityDataTool/Commands/` - Command handlers (Analyze.cs, Dump.cs, Archive.cs, FindReferences.cs) |
| 126 | + |
| 127 | +**Core Libraries**: |
| 128 | +- `UnityFileSystem/UnityFileSystem.cs` - Init(), MountArchive(), OpenSerializedFile() |
| 129 | +- `UnityFileSystem/DllWrapper.cs` - P/Invoke bindings to native library |
| 130 | +- `UnityFileSystem/SerializedFile.cs` - Represents binary data files |
| 131 | +- `UnityFileSystem/RandomAccessReader.cs` - TypeTree property navigation |
| 132 | + |
| 133 | +**Analyzer**: |
| 134 | +- `Analyzer/AnalyzerTool.cs` - Main API entry point |
| 135 | +- `Analyzer/SQLite/SQLiteWriter.cs` - Base class for database writers |
| 136 | +- `Analyzer/SQLite/Writers/SerializedFileSQLiteWriter.cs` - Handler registration |
| 137 | +- `Analyzer/SQLite/Writers/AddressablesBuildLayoutSQLWriter.cs` - JSON report processing |
| 138 | +- `Analyzer/SQLite/Handlers/` - Type-specific extractors |
| 139 | +- `Analyzer/SerializedObjects/` - RandomAccessReader-based property readers |
| 140 | +- `Analyzer/SQLite/Resources/` - SQL DDL schema files |
| 141 | + |
| 142 | +**TextDumper**: |
| 143 | +- `TextDumper/TextDumperTool.cs` - Converts binary to YAML-like text |
| 144 | + |
| 145 | +**ReferenceFinder**: |
| 146 | +- `ReferenceFinder/ReferenceFinderTool.cs` - Traces object dependency chains |
| 147 | + |
| 148 | +## Extending the Tool |
| 149 | + |
| 150 | +### Adding New Unity Type Support |
| 151 | + |
| 152 | +1. Create handler class implementing `ISQLiteHandler`: |
| 153 | + ``` |
| 154 | + Analyzer/SQLite/Handlers/FooHandler.cs |
| 155 | + ``` |
| 156 | + |
| 157 | +2. Create reader class using RandomAccessReader: |
| 158 | + ``` |
| 159 | + Analyzer/SerializedObjects/Foo.cs |
| 160 | + ``` |
| 161 | + |
| 162 | +3. Register handler in `SerializedFileSQLiteWriter.cs`: |
| 163 | + ```csharp |
| 164 | + m_Handlers["Foo"] = new FooHandler(); |
| 165 | + ``` |
| 166 | + |
| 167 | +4. Create SQL schema resource: |
| 168 | + ``` |
| 169 | + Analyzer/SQLite/Resources/Foo.sql |
| 170 | + ``` |
| 171 | + Define tables (e.g., `foos`) and views (e.g., `foo_view` joining `objects` and `foos`) |
| 172 | + |
| 173 | +5. Reference the schema in the handler's `GetResourceName()` method |
| 174 | + |
| 175 | +### Adding New File Format Support |
| 176 | + |
| 177 | +1. Create parser implementing `ISQLiteFileParser` |
| 178 | +2. Create writer derived from `SQLiteWriter` |
| 179 | +3. Add parser to `AnalyzerTool.parsers` list |
| 180 | +4. Create SQL schema and Command classes as needed |
| 181 | + |
| 182 | +Example: Addressables support uses `AddressablesBuildLayoutParser` + `AddressablesBuildLayoutSQLWriter` to parse JSON build reports. |
| 183 | + |
| 184 | +## Important Concepts |
| 185 | + |
| 186 | +### TypeTrees |
| 187 | +TypeTrees describe how Unity objects are serialized (property names, types, offsets). They enable: |
| 188 | +- Backward compatibility - reading files from different Unity versions |
| 189 | +- Generic parsing without hardcoded type definitions |
| 190 | +- Support for custom MonoBehaviours/ScriptableObjects |
| 191 | + |
| 192 | +**Critical**: Player builds exclude TypeTrees by default for performance. To analyze Player data, enable the "ForceAlwaysWriteTypeTrees" diagnostic switch during build. |
| 193 | + |
| 194 | +### File Formats |
| 195 | +- **Unity Archive** - Container format (AssetBundles, .data files). Can be mounted as virtual filesystem. |
| 196 | +- **SerializedFile** - Binary format storing Unity objects with TypeTree metadata. |
| 197 | +- **Addressables BuildLayout** - JSON build report (buildlogreport.json, AddressablesReport.json) |
| 198 | + |
| 199 | +### Database Views |
| 200 | +The SQLite output uses views extensively to join the base `objects` table with type-specific tables: |
| 201 | +- `object_view` - All objects with basic properties |
| 202 | +- `mesh_view` - Objects + mesh-specific columns |
| 203 | +- `texture_view` - Objects + texture-specific columns |
| 204 | +- `shader_view` - Objects + shader-specific columns |
| 205 | +- `view_breakdown_by_type` - Aggregated size by type |
| 206 | +- `view_potential_duplicates` - Assets included multiple times |
| 207 | +- `asset_view` - Explicitly assigned assets only |
| 208 | +- `shader_keyword_ratios` - Keyword variant analysis |
| 209 | + |
| 210 | +See `Analyzer/README.md` and `Documentation/addressables-build-reports.md` for complete database schema documentation. |
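
Once `analyze` has produced a database, the views listed above can be previewed with the `sqlite3` command-line shell. A small sketch that avoids assuming any column names by selecting everything; `view_query` is a hypothetical helper, and `database.db` is the file name used in the earlier `analyze` example.

```shell
# Hypothetical helper: build a query that previews the first N rows
# (default 10) of a view.
view_query() {
  printf 'SELECT * FROM %s LIMIT %d;' "$1" "${2:-10}"
}
```

With `sqlite3` installed, `sqlite3 database.db "$(view_query view_breakdown_by_type 20)"` would print the first 20 rows of that view.
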
| 211 | + |
| 212 | +### Common Issues |
| 213 | + |
| 214 | +**TypeTree Errors**: "Invalid object id" during analyze means the SerializedFile lacks TypeTrees. Enable ForceAlwaysWriteTypeTrees or use files built with TypeTrees. |
| 215 | + |
| 216 | +**File Loading Warnings**: "Failed to load... File may be corrupted" is normal for non-Unity files in analyzed directories. Use `-p` search pattern to filter (e.g., `-p "*.bundle"`). |
| 217 | + |
| 218 | +**SQL UNIQUE Constraint Errors**: These occur when the same SerializedFile name appears in multiple archives, for example when analyzing multiple builds in the same directory or using AssetBundle variants. See `Documentation/comparing-builds.md` for solutions. |
| 219 | + |
| 220 | +**Mac Security**: "UnityFileSystemApi.dylib cannot be opened" - Open System Preferences → Security & Privacy (System Settings → Privacy & Security on macOS 13 and later) and allow the library. |
| 221 | + |
| 222 | +## Native Library (UnityFileSystemApi) |
| 223 | + |
| 224 | +The native library is included for Windows, Mac, and Linux in the `UnityFileSystem/` directory. It is backward compatible and reads data files from most Unity versions. |
| 225 | + |
| 226 | +To use a specific Unity version's library: |
| 227 | +1. Find the library in the Unity Editor installation: `{UnityEditor}/Data/Tools/` |
| 228 | +2. Copy to `UnityDataTool/UnityFileSystem/`: |
| 229 | + - Windows: `UnityFileSystemApi.dll` |
| 230 | + - Mac: `UnityFileSystemApi.dylib` |
| 231 | + - Linux: `UnityFileSystemApi.so` |
| 232 | +3. Rebuild the tool |
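
Steps 1 and 2 can be scripted. A minimal sketch, assuming `uname -s` prints `Linux`, `Darwin`, or a `MINGW`/`MSYS`/`CYGWIN` prefix in Windows POSIX shells; `native_lib_name` and the `UNITY_EDITOR` variable are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: map a platform name (as printed by `uname -s`)
# to the native library file name listed in step 2.
native_lib_name() {
  case "$1" in
    Linux)                echo "UnityFileSystemApi.so" ;;
    Darwin)               echo "UnityFileSystemApi.dylib" ;;
    MINGW*|MSYS*|CYGWIN*) echo "UnityFileSystemApi.dll" ;;
    *)                    echo "unknown platform: $1" >&2; return 1 ;;
  esac
}
```

With `UNITY_EDITOR` pointing at the editor installation, the copy would look like `cp "$UNITY_EDITOR/Data/Tools/$(native_lib_name "$(uname -s)")" UnityDataTool/UnityFileSystem/`, followed by a rebuild.
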
| 233 | + |
| 234 | +## Testing Data |
| 235 | + |
| 236 | +UnityFileSystemTestData is a Unity project that generates test data for the test suites. TestCommon provides shared test utilities. |