rfcs/0017-incremental-build.md
35 additions & 81 deletions
@@ -46,11 +46,11 @@ This shall enable the following workflow:
46
46
47
47
1. **Action:** Build is started
48
48
1. Task A, Task B and Task C are executed in sequence, writing their results into individual writer stages.
49
-
1. *The writer stages and `cache-info.json` are serialized onto disk.*
49
+
1. *Task outputs are written to a content-addressable store and the `cache-info.json` metadata is serialized to disk.*
50
50
1. Build finishes and the resources of all writer stages and the source reader are combined and written into the target output directory.
51
51
* Resources present in later writer stages (and higher versions) are preferred over competing resources with the same path.
52
52
1. **Action:** A source file is modified and a new build is triggered
53
-
1. *The `cache-info.json` is read from disk and the writer stages are imported into the project instance.*
53
+
1. *The `cache-info.json` is read from disk, allowing the build to access cached content from the content-addressable store.*
54
54
1. The build determines which tasks need to be executed using the imported cache and information about the modified source file.
55
55
* In this example, it is determined that Task A and Task C need to be executed since they requested the modified resource in their previous execution.
56
56
1. Task A is executed. The output is written into a new **version** (v2) of the associated writer stage.
@@ -59,18 +59,20 @@ This shall enable the following workflow:
59
59
* Task A can't access v1 of its writer stage. It can only access the combined resources of all previous writer stages.
60
60
1. The `Project Build Cache` determines whether the resources produced in this latest execution of Task A are relevant for Task B. If yes, the content of those resources is compared to the cached content of the resources Task B has received during its last execution. In this example, the output of Task A is not relevant for Task B and it is skipped.
61
61
1. Task C is called and has access to both versions (v1 and v2) of the writer stage of Task A, allowing it to access all resources produced in all previous executions of Task A.
62
-
1. *Writer stages and `cache-info.json` are serialized onto disk.*
62
+
1. *Task outputs are written to the content-addressable store and the `cache-info.json` is updated.*
63
63
1. The build finishes. The combined resources of all writer stages and the source reader are written to the target output directory.
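
The precedence rule used when combining resources (later writer stages and higher versions win over competing resources with the same path) can be illustrated with a small sketch. This is not UI5 Tooling code: `combine_resources`, `source` and `stages` are hypothetical names, and resources are modeled as plain path-to-content dictionaries.

```python
def combine_resources(source, stages):
    """Merge the source reader with all writer stages.

    `source` maps resource paths to content. `stages` is a list of writer
    stages in task order; each stage is a list of versions (v1, v2, ...),
    and each version maps paths to content. Later stages and higher
    versions overwrite earlier entries with the same path.
    """
    combined = dict(source)
    for versions in stages:
        for version in versions:
            combined.update(version)
    return combined


source = {"/a.js": "src a", "/b.js": "src b", "/c.js": "src c"}
stages = [
    [{"/a.js": "A v1"}, {"/a.js": "A v2"}],  # Task A: v1 and v2
    [{"/b.js": "B v1"}],                     # Task B: v1 only
]
result = combine_resources(source, stages)
```

In this sketch, `/a.js` ends up with the content of Task A's v2, `/b.js` with the output of Task B, and `/c.js` falls through to the source reader because no task wrote it.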
64
64
65
65

66
66
67
67
### Cache Creation
68
68
69
-
The build cache shall be serialized onto disk in order to use it in successive UI5 Tooling executions. A standardized directory should be used for this, so that UI5 Tooling can automatically find and use the cache.
69
+
The build cache shall be serialized onto disk in order to use it in successive UI5 Tooling executions. This will be done using a **Content-Addressable Store (CAS)** model, which separates file content from metadata. Each unique piece of content is stored only once on disk, which reduces disk space usage and avoids writing identical content repeatedly.
70
70
71
-
Every project has its own cache. This allows for reuse of a project's cache across multiple consuming projects. For example, the `sap.ui.core` library could be built once and the build cache can then be reused in the build of multiple applications.
71
+
Every project has its own cache metadata. This allows for reuse of a project's cache across multiple consuming projects. For example, the `sap.ui.core` library could be built once and the build cache can then be reused in the build of multiple applications.
72
72
73
-
The cache consists of a `cache-info.json` file with the below data structure and multiple directories with the serialized writer stages.
73
+
The cache consists of two main parts:
74
+
1. A global **object store (the CAS)** where all file contents are stored, named by a hash of their content.
75
+
2. A per-project `cache-info.json` file which acts as a lightweight **metadata index**, mapping logical file paths to their content hashes in the object store.
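
The split between the object store and the metadata index can be sketched as follows. This is a minimal illustration in Python rather than the actual implementation: the `ObjectStore` class, its `put`/`get` methods and the example paths are assumptions made for this sketch.

```python
import hashlib
import tempfile
from pathlib import Path


class ObjectStore:
    """Stores each unique content once, in a file named by its SHA-256 hash."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        target = self.root / digest
        if not target.exists():  # identical content is never written twice
            target.write_bytes(content)
        return digest

    def get(self, digest: str) -> bytes:
        return (self.root / digest).read_bytes()


cas = ObjectStore(Path(tempfile.mkdtemp()) / "cas")

# Per-project metadata index: logical path -> content hash
body = b"sap.ui.define([], function () {});"
index = {
    "/resources/app/Component.js": cas.put(body),
    "/resources/app/Copy.js": cas.put(body),  # same content, same object
}
stored_objects = len(list(cas.root.iterdir()))
```

Both logical paths point to the same hash, so only one object exists on disk even though the index lists two resources.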
74
76
75
77
#### cache-info.json
76
78
@@ -91,25 +93,19 @@ The cache consists of a `cache-info.json` file with the below data structure and
91
93
"pathsRead": [],
92
94
"patterns": []
93
95
},
94
-
"resourcesRead": {
95
-
"/resources/project/namespace/Component.js": {
96
-
"sha256": "d41d8cd98f00b204e9899998ecf8427e",
97
-
"lastModified": 1734005532120
98
-
}
96
+
"inputs": {
97
+
// Map of logical paths read to their content hashes
@@ -121,83 +117,42 @@ The cache key can be used to identify the cache. It shall be based on the projec
121
117
122
118
**taskCache**
123
119
124
-
An array of objects, each representing a task that was executed during the build. The object contains the name of the task, the project resources that were read and written by the task, and the resources that were read from the project's dependencies. If the task used glob patterns to read resources, those patterns are stored instead of the resolved paths so that the pattern can later be matched against newly created resources that might invalidate the task.
125
-
126
-
For each resource that has been read or written, the SHA256 hash of the content and the timestamp of last modification are stored. This allows the UI5 Tooling to determine whether the resource has changed since the last build and whether the task cache is still valid.
120
+
An array of objects, each representing a task that was executed during the build. The object contains the name of the task and its resource requests. `inputs` maps the logical path of resources read by the task to their content hash, and `outputs` does the same for resources written by the task. This hash acts as a pointer to the actual file content in the shared CAS object store. If the task used glob patterns to read resources, those patterns are stored so that they can be matched against newly created resources.
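
How the stored hashes and patterns could be used to decide whether a task must run again is sketched below. The field names `inputs` and `patterns` come from the excerpt above, but the flat layout of the task object and the function `task_needs_rerun` are assumptions. Python's `fnmatchcase` only approximates real glob semantics (its `*` also matches `/`).

```python
from fnmatch import fnmatchcase


def task_needs_rerun(task_cache, current_hashes):
    """Return True if the cached result of a task can no longer be reused.

    `task_cache["inputs"]` maps paths to the hashes seen at the last
    execution, `task_cache["patterns"]` lists the glob patterns the task
    used. `current_hashes` maps every available resource path to its
    current content hash.
    """
    inputs = task_cache["inputs"]
    # 1. A resource the task read before has changed or disappeared.
    for path, old_hash in inputs.items():
        if current_hashes.get(path) != old_hash:
            return True
    # 2. A newly created resource matches a pattern the task used.
    for path in current_hashes:
        if path not in inputs and any(
            fnmatchcase(path, pattern) for pattern in task_cache["patterns"]
        ):
            return True
    return False


task_a = {"inputs": {"/src/a.js": "h1"}, "patterns": ["/src/*.js"]}

unchanged = task_needs_rerun(task_a, {"/src/a.js": "h1", "/README.md": "h9"})
modified = task_needs_rerun(task_a, {"/src/a.js": "h2"})
added = task_needs_rerun(task_a, {"/src/a.js": "h1", "/src/b.js": "h3"})
```

Storing the patterns alongside the resolved paths is what makes the third case detectable: `/src/b.js` did not exist during the previous execution, yet it matches a pattern the task requested.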
127
121
128
122
**sourceMetadata**
129
123
130
-
For each *source* file of the project, the SHA256 hash of the content and the timestamp of last modification are stored. This allows the UI5 Tooling to determine whether the source files have changed since the last build.
124
+
For each *source* file of the project, this object maps the logical path to the SHA256 hash of its content. This allows the UI5 Tooling to quickly determine whether source files have changed since the last build.
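
Detecting modified sources then reduces to comparing two path-to-hash maps. The following sketch assumes that logical paths are the file paths relative to the source directory; `hash_sources` and `changed_sources` are hypothetical helper names.

```python
import hashlib
import tempfile
from pathlib import Path


def hash_sources(src_dir):
    """Map each file's logical path to the SHA-256 hash of its content."""
    src_dir = Path(src_dir)
    return {
        "/" + p.relative_to(src_dir).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(src_dir.rglob("*"))
        if p.is_file()
    }


def changed_sources(previous, current):
    """Paths that were added, removed, or whose content hash differs."""
    return sorted(
        path
        for path in previous.keys() | current.keys()
        if previous.get(path) != current.get(path)
    )


src = Path(tempfile.mkdtemp())
(src / "Component.js").write_text("v1")
(src / "manifest.json").write_text("{}")
source_metadata = hash_sources(src)      # stored after the first build

(src / "Component.js").write_text("v2")  # a source file is modified
changed = changed_sources(source_metadata, hash_sources(src))
```

Only `/Component.js` is reported as changed. Because the comparison relies on content hashes alone, a file that is rewritten with identical content does not count as modified.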
131
125
132
126
#### Cache directory structure
133
127
128
+
The directory structure is flat: a global `cas/` directory stores all unique file contents from all builds, while project-specific directories contain only their lightweight metadata.
129
+
134
130
```
135
131
.ui5-cache
132
+
├── cas/ <-- Global Content-Addressable Store (shared across all projects)
133
+
│ ├── c1c77edc5c689a471b12fe8ba79c51d1 (Content of one file)
134
+
│ ├── d41d8cd98f00b204e9899998ecf8427e (Content of another file)
```
The directories inside `taskCache/` shall each represent a writer stage, prefixed by an integer number reflecting the order of creation in the build. The directories contain all resources that have been *written* by the task associated with that stage.
145
+
All unique file contents from all projects and their builds are stored **once** in the global `cas` directory, named by their content hash. This automatic deduplication leads to significant disk space savings.
191
146
192
147

193
148
194
149
### Cache Import
195
150
196
-
Before building a project, UI5 Tooling shall scan for a cache directory with the respective cache key and import the cache if one is found.
151
+
Before building a project, UI5 Tooling shall scan for a cache directory with the respective cache key and import the cache if one is found.
197
152
198
-
The import process mainly populates the `Build Task Cache` instances with the information from the `cache-info.json` file and creates readers for the individual `taskCache` directories (representing the writers of each task's previous execution). Those readers are then set as the initial version (v1) writer stages in the corresponding `Project` instance.
153
+
The import process only involves reading the lightweight `cache-info.json` file to populate the `Build Task Cache` instances with their metadata, so no resource content needs to be read up front. When the build process needs to access a cached resource, it uses the metadata map to find the content hash and reads the corresponding file directly from the global `cas` store.
199
154
200
-
This allows executing individual tasks and provide them with the results of all tasks that would normally have been executed before them. Also, the task can decide to only process a few changed resources while the build result will still contain all resources that were written by any of the the task's previous executions.
155
+
This allows executing individual tasks and providing them with the results of all preceding tasks without the overhead of creating numerous file system readers or managing physical copies of files for each build stage.
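
The import and the on-demand resource access could look roughly like the sketch below. The layout of `cache-info.json` is simplified and partly assumed: the `taskName` key, the per-project directory `my-project` and the helper functions are hypothetical and only serve the illustration.

```python
import hashlib
import json
import tempfile
from pathlib import Path

cache_dir = Path(tempfile.mkdtemp())
cas_dir = cache_dir / "cas"
cas_dir.mkdir()

# Simulate a previous build: one object in the CAS plus the metadata index.
content = b"body { color: red; }"
digest = hashlib.sha256(content).hexdigest()
(cas_dir / digest).write_bytes(content)
project_dir = cache_dir / "my-project"  # hypothetical per-project directory
project_dir.mkdir()
(project_dir / "cache-info.json").write_text(json.dumps({
    "taskCache": [
        {"taskName": "buildThemes",
         "outputs": {"/resources/themes/library.css": digest}}
    ]
}))


def import_cache(project_dir):
    """Read only the metadata; no resource content is touched here."""
    info = json.loads((Path(project_dir) / "cache-info.json").read_text())
    return {task["taskName"]: task["outputs"] for task in info["taskCache"]}


def read_cached_resource(outputs, path):
    """Resolve a logical path to its hash and read the content from the CAS."""
    return (cas_dir / outputs[path]).read_bytes()


task_outputs = import_cache(project_dir)
css = read_cached_resource(task_outputs["buildThemes"],
                           "/resources/themes/library.css")
```

The import step touches a single JSON file per project. Content is only read from the `cas` directory at the moment a task actually requests a resource.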
201
156
202
157

203
158
@@ -229,7 +184,7 @@ The UI5 Tooling server shall integrate the incremental build as a mean to pre-pr
229
184
230
185
Middleware like `serveThemes` (used for compiling LESS resources to CSS) would become obsolete with this, since the `buildThemes` task will be executed instead.
231
186
232
-
If any project (root or dependency) configures custom tasks, those tasks are executed in the server as well. This makes it possible to easily integrate projects with custom tasks as dependencies.
187
+
If any project (root or dependency) defines custom tasks, those tasks are executed in the server as well. This makes it possible to easily integrate projects with custom tasks as dependencies.
233
188
234
189
Since executing a full build requires more time than the on-the-fly processing of resources currently implemented in the UI5 Tooling server, users shall be able to disable individual tasks that are not necessarily needed during development. This can be done using CLI parameters as well as ui5.yaml configuration.
235
190
@@ -256,7 +211,7 @@ All of this should be communicated in the UI5 Tooling documentation and in blog
256
211
* Projects might have to adapt their configurations
257
212
* Custom tasks might need to be adapted. Previously, they could only access the sources of a project. With this change, they will access the build result instead. Access to the sources is still possible but requires the use of a dedicated API.
258
213
* UI5 Tooling standard tasks need to be adapted to use the new cache API. Especially the bundling tasks currently have no concept for partially re-creating bundles. However, this is an essential requirement to achieve fast incremental builds.
259
-
* The project build cache might become very large and consume a lot of disk space. On systems with restricted disk space or slow I/O operations, this could lead to a worse performance.
214
+
* While the content-addressable cache is highly efficient at deduplication, the central cache can still grow very large over time. A robust purging mechanism is critical for managing disk space.
260
215
261
216
## Alternatives
262
217
@@ -267,10 +222,9 @@ An alternative to using the incremental build in the UI5 Tooling server would be
267
222
* Measure performance in BAS. Find out whether this approach results in acceptable performance.
268
223
* How to distinguish projects with build cache from pre-built projects (with project manifest)
269
224
* Cache related topics
270
-
* Clarify cache key
225
+
* Clarify cache key
271
226
* Current POC: project version + dependency versions + build config + UI5 Tooling module versions
272
227
* Include resource tags in cache
273
-
* Compress cache to reduce memory pressure
274
228
* Allow tasks to store additional information in the cache
275
229
* Cache Purging
276
230
* Some tasks might be relevant for the server only (e.g. code coverage), come up with a way to configure that
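
For the open question on the cache key, the ingredients listed for the current POC (project version, dependency versions, build config, UI5 Tooling module versions) could be combined into a single key as sketched here. The function `cache_key`, the field names in the payload and the example values are assumptions for illustration only, not the POC's actual format.

```python
import hashlib
import json


def cache_key(project_version, dependency_versions, build_config, tooling_versions):
    """Derive a stable key from everything that should invalidate the cache."""
    payload = json.dumps(
        {
            "project": project_version,
            "dependencies": dependency_versions,
            "buildConfig": build_config,
            "tooling": tooling_versions,
        },
        sort_keys=True,          # key order must not influence the result
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_1 = cache_key(
    "1.0.0",
    {"sap.ui.core": "1.120.0", "sap.m": "1.120.0"},
    {"excludedTasks": []},
    {"@ui5/builder": "4.0.0"},
)
key_2 = cache_key(  # same input, different dictionary order
    "1.0.0",
    {"sap.m": "1.120.0", "sap.ui.core": "1.120.0"},
    {"excludedTasks": []},
    {"@ui5/builder": "4.0.0"},
)
key_3 = cache_key(  # one dependency version changed
    "1.0.0",
    {"sap.ui.core": "1.121.0", "sap.m": "1.120.0"},
    {"excludedTasks": []},
    {"@ui5/builder": "4.0.0"},
)
```

Serializing with sorted keys keeps the key independent of property order, while any change to one of the listed ingredients yields a different key and therefore a separate cache.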