Virtual Datasets #8708
Conversation
📝 Walkthrough
This change introduces "virtual datasets" by shifting dataset identification and management from organization/directory-based keys to a unified dataset ID.
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~90+ minutes
Actionable comments posted: 0
♻️ Duplicate comments (2)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteWebknossosClient.scala (1)
235-247: Caching implementation follows best practices
The caching implementation with a 5-minute TTL is appropriate for dataset ID lookups.
webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/DataSourceController.scala (1)
694-716: Cache invalidation properly implemented
The method correctly invalidates the cache after updating the remote data source (line 711), addressing the previous review comment about stale cache issues.
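As a rough illustration of the pattern praised in these two comments, here is a minimal, self-contained TTL-cache sketch. The real code uses the codebase's own caching utility; `TtlCache`, `getOrInsert`, and `invalidate` are hypothetical names for this sketch only:

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical stand-in for the datastore's cache utility; not the actual API.
final class TtlCache[K, V](ttlMillis: Long) {
  private val entries = TrieMap.empty[K, (V, Long)] // value + insertion timestamp

  def getOrInsert(key: K)(compute: => V): V = {
    val now = System.currentTimeMillis()
    entries.get(key) match {
      case Some((value, insertedAt)) if now - insertedAt < ttlMillis => value // fresh hit
      case _ =>
        val value = compute // miss or expired entry: recompute and store
        entries.put(key, (value, now))
        value
    }
  }

  // Explicit invalidation after updates, as the second comment above describes.
  def invalidate(key: K): Unit = entries.remove(key)
}

// Usage: cache dataset-id lookups for 5 minutes.
val datasetIdCache = new TtlCache[String, String](5 * 60 * 1000L)
```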
🧹 Nitpick comments (3)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteWebknossosClient.scala (1)
144-156: Consider adding response validation
The method extracts `response.body` directly without validating whether it is a valid dataset ID format. Consider parsing it as an `ObjectId` to ensure validity.

```diff
- datasetId = response.body
+ datasetId <- ObjectId.fromString(response.body).toFox ?~> "Invalid dataset ID in response"
```

webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala (1)
457-499: Consider adding cache invalidation after disk load
In the `loadFromDisk` function, after successfully updating the data source repository, the dataset cache should be invalidated to ensure subsequent operations get the fresh data.

```diff
 case GenericDataSource(_, _, _, _) =>
   for {
     _ <- dataSourceRepository.updateDataSource(dataSource)
+    _ = datasetCache.invalidateCache(datasetId)
   } yield Ok(Json.toJson(dataSource))
```
webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/DataSourceController.scala (1)
371-381: Consider more specific error messages
The generic error message "dataset.add.failed" could be more specific to help users understand what went wrong during registration.

```diff
- datasetId <- dsRemoteWebknossosClient.registerDataSource(dataSource, dataSourceId, folderId) ?~> "dataset.add.failed"
+ datasetId <- dsRemoteWebknossosClient.registerDataSource(dataSource, dataSourceId, folderId) ?~> "dataset.add.failed.registration"
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/DataSourceController.scala (18 hunks)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala (1 hunk)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteWebknossosClient.scala (5 hunks)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DatasetCache.scala (1 hunk)
🚧 Files skipped from review as they are similar to previous changes (1)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DatasetCache.scala
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: in the `updateMags` method of DatasetMagsDAO (Scala), the code handles different dataset types distinctly
Learnt from: frcroth
PR: scalableminds/webknossos#8609
File: app/models/dataset/Dataset.scala:753-775
Timestamp: 2025-05-12T13:07:29.637Z
Learning: In the `updateMags` method of DatasetMagsDAO (Scala), the code handles different dataset types distinctly:
1. Non-WKW datasets have `magsOpt` populated and use the first branch which includes axisOrder, channelIndex, and credentialId.
2. WKW datasets will have `wkwResolutionsOpt` populated and use the second branch which includes cubeLength.
3. The final branch is a fallback for legacy data.
This ensures appropriate fields are populated for each dataset type.
Applied to files:
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteWebknossosClient.scala
webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/DataSourceController.scala
webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala
📚 Learning: in Scala's for-comprehension with Fox (Future-like type), the `<-` operator ensures sequential execution
Learnt from: MichaelBuessemeyer
PR: scalableminds/webknossos#8352
File: app/models/organization/CreditTransactionService.scala:0-0
Timestamp: 2025-01-27T12:06:42.865Z
Learning: In Scala's for-comprehension with Fox (Future-like type), the `<-` operator ensures sequential execution. If any step fails, the entire chain short-circuits and returns early, preventing subsequent operations from executing. This makes it safe to perform validation checks before database operations.
Applied to files:
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteWebknossosClient.scala
webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/DataSourceController.scala
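To make the short-circuiting described in this learning concrete, here is a minimal stand-in using `Either` instead of the internal `Fox` type; the sequencing behaviour in a for-comprehension is analogous, and the function names are made up:

```scala
// Each step either succeeds (Right) or fails (Left); `<-` sequences them.
def validateAmount(amount: Long): Either[String, Long] =
  if (amount > 0) Right(amount) else Left("credit.amount.invalid")

def persistTransaction(amount: Long): Either[String, Long] =
  Right(amount) // imagine a database write here

val result: Either[String, Long] = for {
  checked <- validateAmount(-5)          // fails here ...
  saved   <- persistTransaction(checked) // ... so this step never runs
} yield saved
// result == Left("credit.amount.invalid")
```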
📚 Learning: in the webknossos codebase, classes extending `FoxImplicits` have access to an implicit conversion from `Option[A]` to `Fox[A]`
Learnt from: frcroth
PR: scalableminds/webknossos#8236
File: webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mesh/MeshFileService.scala:170-173
Timestamp: 2025-04-23T08:51:57.756Z
Learning: In the webknossos codebase, classes extending `FoxImplicits` have access to an implicit conversion from `Option[A]` to `Fox[A]`, where `None` is converted to an empty Fox that fails gracefully in for-comprehensions.
Applied to files:
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteWebknossosClient.scala
webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/DataSourceController.scala
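Again with `Either` standing in for `Fox`, the `Option`-to-failable conversion this learning describes looks roughly like the following; `findChunk`, `MeshChunk`, and the error key are invented for the sketch:

```scala
final case class MeshChunk(data: Vector[Int])

// None is converted into a graceful failure, mirroring Option[A] -> Fox[A].
def findChunk(chunks: Map[String, MeshChunk], key: String): Either[String, MeshChunk] =
  chunks.get(key).toRight(s"mesh.chunk.notFound: $key")

val chunks  = Map("a" -> MeshChunk(Vector(1, 2)))
val found   = findChunk(chunks, "a") // Right(MeshChunk(Vector(1, 2)))
val missing = findChunk(chunks, "b") // Left("mesh.chunk.notFound: b")
```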
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-smoketest-push
- GitHub Check: backend-tests
- GitHub Check: frontend-tests
🔇 Additional comments (6)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteWebknossosClient.scala (2)
53-57: LGTM!
The `DataSourceRegistrationInfo` case class is well-structured and properly implements JSON serialization.
130-133: LGTM!
The migration to use `ObjectId` for dataset identification is correctly implemented.

webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala (1)
66-78: Legacy route implementation pattern is consistent and secure
The implementation correctly validates access using the legacy DataSourceId before converting to the new ObjectId format. The error handling with Fox ensures proper propagation of failures.
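A self-contained sketch of the validate-then-convert pattern this comment describes; all names, signatures, and stub bodies here are illustrative, not the actual webknossos API:

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// Illustrative stand-ins for the real types.
final case class DataSourceId(directoryName: String, organizationId: String)
final case class ObjectId(id: String)

def validateAccess(id: DataSourceId): Future[Unit] = Future.unit // stubbed auth check
def lookupDatasetId(id: DataSourceId): Future[ObjectId] =
  Future.successful(ObjectId("64f0a9b1c2d3e4f5a6b7c8d9"))         // stubbed wk lookup
def handleById(datasetId: ObjectId): Future[String] =
  Future.successful(s"ok: ${datasetId.id}")                       // stubbed id-based route

// The pattern: authorize against the legacy key first, then resolve the
// unified dataset id and delegate to the id-based implementation.
def legacyRoute(organizationId: String, directoryName: String): Future[String] = {
  val legacyId = DataSourceId(directoryName, organizationId)
  for {
    _      <- validateAccess(legacyId)
    newId  <- lookupDatasetId(legacyId)
    result <- handleById(newId)
  } yield result
}
```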
webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/DataSourceController.scala (3)
353-367: Update method correctly handles both disk and virtual datasets
The implementation appropriately checks whether the dataset exists on disk before deciding whether to update locally or remotely. Preserving the original datasource ID ensures consistency.
412-430: Cache clearing refactored efficiently
The method now accepts an `InboxDataSource` parameter, avoiding redundant disk reads. All relevant caches are properly cleared.
443-463: Delete operation correctly handles virtual datasets
The method appropriately distinguishes between disk-based and virtual datasets, using the new `deleteVirtualDataset` API for datasets that don't exist on disk.
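The disk-vs-virtual branching that both the update (353-367) and delete (443-463) comments refer to boils down to something like the following sketch; `existsOnDisk` and the per-operation handlers are hypothetical placeholders, not the real method names:

```scala
import scala.concurrent.Future

final case class DatasetId(id: String)

// Stubs: a virtual dataset has no directory on disk.
def existsOnDisk(id: DatasetId): Boolean = false
def updateOnDisk(id: DatasetId): Future[Unit] = Future.unit   // rewrite datasource-properties.json
def updateRemotely(id: DatasetId): Future[Unit] = Future.unit // persist via webknossos
def deleteOnDisk(id: DatasetId): Future[Unit] = Future.unit   // move the directory to trash
def deleteVirtual(id: DatasetId): Future[Unit] = Future.unit  // e.g. the new deleteVirtualDataset call

def update(id: DatasetId): Future[Unit] =
  if (existsOnDisk(id)) updateOnDisk(id) else updateRemotely(id)

def delete(id: DatasetId): Future[Unit] =
  if (existsOnDisk(id)) deleteOnDisk(id) else deleteVirtual(id)
```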
Let’s go!
#8708 failed to update the frontend part of the `positionForSegment` route; I checked this in the changes of #8708, and this PR should fix it. Furthermore, I investigated some other potential routes and did not find any others that were broken. However, I did not do a full search, just some other related routes in `datastore.latest.routes`.

### URL of deployed dev instance (used for testing):
- https://___.webknossos.xyz

### Steps to test:
- Not sure how to test this reliably. I triggered the bug locally by splitting two segments that had no connecting surface in the agglomerate mesh. This led to a crash of wk, and when I investigated, it showed that the route fixed here was causing a 404 on the current master.

### Issues:
- self-noticed bug

------

(Please delete unneeded items, merge only when none are left open)

- [x] Added changelog entry (create a `$PR_NUMBER.md` file in `unreleased_changes` or use `./tools/create-changelog-entry.py`)
- The “scan disk for new datasets” button now only scans for the current orga. This is mostly a performance optimization for multi-orga setups
- Removed the now obsolete stateful dataSourceRepository (the wk-side database is the source of truth; the datastore side should use only the data from there, with its cache)
  - Remaining usages now either talk to wk directly or, in the case of STL download in LegacyController, use the original controller’s implementation.
- The dashboard search now supports searching for dataset ids (only active when a full ObjectId is entered in the search field)
- Fixed a bug where datasets with AdditionalAxes would be assumed changed on every re-report due to the hashCode being non-deterministic. This is fixed by using Seq instead of Array for the bounds of AdditionalAxis (see the sketch below).

### Steps to test:
- Create setup with multiple orgas (e.g. with isWkOrgInstance=true)
- Put datasets in multiple orgas, hit the refresh button. It should scan only those of the user’s current orga.
- The regular once-a-minute scan should still scan everything.
- Try searching for a dataset id in the dashboard

### TODOs:
- [x] Backend
  - [x] Take optional orga id parameter, scan only that directory
  - [x] Don’t unreport datasets of other orgas
  - [x] adapt to the changes of #8708
- [x] Frontend
  - [x] Adapt function in rest_api.ts
  - [x] Pass current organizationId when hitting the button

### Issues:
- fixes #8784

------

- [x] Added changelog entry (create a `$PR_NUMBER.md` file in `unreleased_changes` or use `./tools/create-changelog-entry.py`)
- [x] Removed dev-only changes like prints and application.conf edits
- [x] Considered [common edge cases](../blob/master/.github/common_edge_cases.md)
- [x] Needs datastore update after deployment

---------

Co-authored-by: Charlie Meister <charlie.meister@student.hpi.de>
Co-authored-by: frcroth <frcroth@users.noreply.github.com>
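The Array-vs-Seq fix mentioned in the list above comes down to a JVM detail: Scala `Array`s are Java arrays, whose `hashCode` is identity-based, so a case class holding one hashes differently on every construction, while `Seq` hashes structurally. A runnable sketch (the class names are made up for illustration):

```scala
final case class BoundsWithArray(bounds: Array[Int])
final case class BoundsWithSeq(bounds: Seq[Int])

object HashDemo extends App {
  val a1 = BoundsWithArray(Array(0, 10))
  val a2 = BoundsWithArray(Array(0, 10))
  println(a1.hashCode == a2.hashCode) // almost always false: identity hash of the arrays

  val s1 = BoundsWithSeq(Seq(0, 10))
  val s2 = BoundsWithSeq(Seq(0, 10))
  println(s1.hashCode == s2.hashCode) // true: structural hash, stable across re-reports
}
```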
With #8708 the datastore uses the datasource properties as stored in postgres. Since wk already scans for attachments not registered in the on-disk datasource-properties.json and writes those into postgres, we don’t need to re-scan when answering the attachment list routes. There we can now rely on the attachment being already registered.

### Steps to test:
- Open a local dataset with hdf5 attachments that are not registered in the local datasource-properties.json
- They should still be usable.

### Issues:
- contributes to #8567

------

- [x] Considered [common edge cases](../blob/master/.github/common_edge_cases.md)
- [x] Needs datastore update after deployment
This change was accidentally omitted in #8708.
Follow-up fix for #8708 (request must be by id now)

### URL of deployed dev instance (used for testing):
- https://fixadhocviewmode.webknossos.xyz

### Steps to test:
- Open a dataset with a static segmentation layer, request an ad-hoc mesh for a segment; it should load.

### Issues:
- fixes https://scm.slack.com/archives/C02H5T8Q08P/p1757351922621309?thread_ts=1757338710.530969&cid=C02H5T8Q08P

------

- [x] Added changelog entry (create a `$PR_NUMBER.md` file in `unreleased_changes` or use `./tools/create-changelog-entry.py`)
- [x] Considered [common edge cases](../blob/master/.github/common_edge_cases.md)
URL of deployed dev instance (used for testing):
Steps to test:
Implementation notes
- DataSource Id is still used internally in the datastore for various caches, and also in the binary data handling.
- Handling of real disk data sources is still done via orgId and datasetDirectoryName (e.g., uploading, storage size).
- Everything else should use dataset ids (see the sketch below).
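A sketch of the identifier split described in these notes; the field names and path layout are assumptions for illustration, not the exact webknossos definitions:

```scala
import java.nio.file.{Path, Paths}

// Legacy, disk-oriented key: identifies a directory on the datastore.
final case class DataSourceId(organizationId: String, directoryName: String)
// Unified dataset id used everywhere else.
final case class DatasetId(id: String)

// Disk-bound operations (uploading, storage size) still need the physical location:
def datasetPath(baseDir: Path, id: DataSourceId): Path =
  baseDir.resolve(id.organizationId).resolve(id.directoryName)

// Everything else addresses the dataset by its id only, e.g. cache keys:
def cacheKey(id: DatasetId): String = s"dataset/${id.id}"

// Usage:
val p = datasetPath(Paths.get("/srv/binaryData"), DataSourceId("sample_organization", "l4_sample"))
```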
TODOs:
Issues:
Followups:
(Please delete unneeded items, merge only when none are left open)
Added changelog entry (create a `$PR_NUMBER.md` file in `unreleased_changes` or use `./tools/create-changelog-entry.py`)