Support for binary blob data source and openapi document updates. #44

Open
jsaarimaa wants to merge 29 commits into dev from feature/blob-binary-db

@jsaarimaa
Collaborator

Added support for binary-converted PX files as a data source.
Refactored the data source interfaces to enable the binary data source.
Changed caching behaviour: faulted tasks are now evicted from the cache.
Updated the OpenAPI documentation.
Added unit tests.
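
For reviewers who want the caching change in miniature, here is a rough Python sketch of evicting faulted or canceled tasks from a task cache. It is not the PR's code (the actual `DatabaseCache` is C#); `TaskCache`, `get_or_add`, and `_evict_if_failed` are hypothetical names chosen for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Dict


class TaskCache:
    """Caches tasks by key; hypothetical stand-in for the PR's DatabaseCache."""

    def __init__(self) -> None:
        self._tasks: Dict[str, asyncio.Task] = {}

    async def get_or_add(self, key: str, factory: Callable[[], Awaitable[object]]) -> object:
        task = self._tasks.get(key)
        if task is None:
            # Cache the task itself so concurrent callers share one load.
            task = asyncio.ensure_future(factory())
            self._tasks[key] = task
            task.add_done_callback(lambda t: self._evict_if_failed(key, t))
        return await task

    def _evict_if_failed(self, key: str, task: asyncio.Task) -> None:
        # Reading task.exception() also marks the exception as observed.
        failed = task.cancelled() or task.exception() is not None
        if failed and self._tasks.get(key) is task:
            del self._tasks[key]
```

With this shape, a failed load does not poison the key: the next call for the same key runs the factory again, while a successful result stays cached.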

jsaarimaa and others added 19 commits January 30, 2026 12:53
…ector and cancellation token support to all data connectors.

Refactored all data source and connector interfaces and implementations to accept and propagate CancellationToken for async methods, including file listing, metadata, data, and auxiliary file access. Improved cache eviction for failed/canceled tasks and added related unit tests. Added BinaryBlobSynchronizationException. Updated file/stream reading and StreamReader usage to support cancellation. General cleanup and documentation improvements.
- Refactored BinaryBlobDataBaseConnector to read content dimension values in parallel using SemaphoreSlim, improving performance for large datasets.
- Improved streaming and chunked read logic for binary blobs, with better error handling and resource management.
- Use connector-specific ILogger<T> injection for accurate logging context.
- Introduce PxBlobPrefix constant and use it to prefix auxiliary file paths in BlobDataBaseConnector.
- Replace manual path manipulation with Path.Combine for blob names.
- Inject ILogger into CachedDataSource; add debug logging for cache hits/misses
- Update CachedDataSource unit tests to verify logging and cache behavior
- Refactor BinaryBlobDataBaseConnector: extract blob path helpers, clarify read logic, add unit tests for helpers
- Make BlobReadModeSelector static and document read strategy selection
- Standardize CancellationToken usage across all data source connectors
- Unify and document PX file stream opening logic; improve error handling
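
The bounded parallel read mentioned in the first bullet above (a semaphore limiting concurrent reads of content dimension values) can be sketched roughly as follows. This is a Python illustration only, with `asyncio.Semaphore` standing in for `SemaphoreSlim`; `read_all_bounded`, `read_one`, and the default limit of 4 are assumptions, not values from the PR.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def read_all_bounded(
    keys: Sequence[str],
    read_one: Callable[[str], Awaitable[bytes]],
    max_concurrency: int = 4,  # assumed limit, not taken from the PR
) -> List[bytes]:
    """Read every key concurrently, with at most max_concurrency reads in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(key: str) -> bytes:
        # The semaphore is released even if read_one raises.
        async with semaphore:
            return await read_one(key)

    # gather preserves the order of the input keys in its result list.
    return await asyncio.gather(*(guarded(k) for k in keys))
```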
Explicitly pass GlobalJsonConverterOptions.Default to JsonSerializer.DeserializeAsync when reading GroupingFileModel, ensuring custom converters and settings are applied instead of relying on default options.
- Standardize error response types (string) and codes (400, 404, 500, 503, etc.) across controllers
- DataController now returns 503 if data is unavailable (BinaryBlobSynchronizationException); update OpenAPI docs accordingly
- Add X-Max-Cells header to data responses; add tests for header presence
- Enhance OpenAPI filters: remove text/csv from error responses, ensure 406 has no content, add 400/406/500 responses
- Add ProblemDetailsDocumentFilter to remove orphaned schema from OpenAPI; add unit tests
- Update error responses in Cache, Tables, Metadata, and Databases controllers to return plain strings; improve log messages
- Update TablesController to return NotFoundObjectResult with message for 404s
- Update NUnit/test packages
- Expand and add unit tests for OpenAPI filters, 503 handling, and X-Max-Cells header
@jsaarimaa requested a review from Copilot on February 11, 2026 14:12

Copilot AI left a comment

Pull request overview

Adds a new binary-blob-backed database connector and refactors datasource/caching interfaces to support binary PX data while updating OpenAPI output and expanding unit tests.

Changes:

  • Introduces BinaryBlobDataBaseConnector (+ read-strategy heuristics) and refactors connectors behind a shared DataBaseConnector/BlobDataBaseConnector base.
  • Updates DI registrations to use Azure client factories + DefaultAzureCredential and extends DB types/config keys.
  • Updates OpenAPI document filters and controller response metadata; expands NUnit test coverage for new behaviors.

Reviewed changes

Copilot reviewed 45 out of 45 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| PxApi/Utilities/WebApplicationExtensions.cs | Passes cancellation token into DB validation file enumeration. |
| PxApi/Utilities/ServiceCollectionExtensions.cs | Adds BinaryBlobStorage registration + Azure client factory setup for blob/fileshare connectors. |
| PxApi/Utilities/LoggerConsts.cs | Adds logging scope keys for container/blob-related fields. |
| PxApi/Utilities/BlobReadModeSelector.cs | New heuristic to pick streaming vs windowed reads for multidimensional blobs. |
| PxApi/PxApi.csproj | Updates Azure.Identity/Px.Utils and adds Microsoft.Extensions.Azure dependency. |
| PxApi/Program.cs | Adds ProblemDetails cleanup filter to OpenAPI generation. |
| PxApi/OpenApi/DocumentFilters/ProblemDetailsDocumentFilter.cs | New filter removing orphaned ProblemDetails schema. |
| PxApi/OpenApi/DocumentFilters/DataControllerPostEndpointDocumentFilter.cs | Cleans error response media types in OpenAPI (no CSV for errors, 406 has no body). |
| PxApi/OpenApi/DocumentFilters/DataControllerGetEndpointDocumentFilter.cs | Same OpenAPI error-response cleanup for GET endpoints. |
| PxApi/Exceptions/BinaryBlobSynchronizationException.cs | New exception for temporarily unavailable binary blob data. |
| PxApi/DataSources/PxBlobDataBaseConnector.cs | New blob connector implementation based on common blob base class. |
| PxApi/DataSources/MountedDataBaseConnector.cs | Migrates to DataBaseConnector base + adds cancellation token usage. |
| PxApi/DataSources/IDataBaseConnector.cs | Refactors connector API: adds metadata/data read APIs + cancellation tokens. |
| PxApi/DataSources/FileShareDataBaseConnector.cs | Moves to Azure client factory-based ShareServiceClient + stream-based reads. |
| PxApi/DataSources/DataBaseConnector.cs | New base class providing shared PX metadata/data reading from streams. |
| PxApi/DataSources/BlobStorageDataBaseConnector.cs | Removes old blob connector (replaced by new base + PxBlobDataBaseConnector). |
| PxApi/DataSources/BlobDataBaseConnector.cs | New base for blob-backed connectors (shared stream + auxiliary file logic). |
| PxApi/DataSources/BinaryBlobDataBaseConnector.cs | New connector for .pxb data + .meta.json metadata in Azure Blob Storage. |
| PxApi/Controllers/TablesController.cs | Updates OpenAPI response types and adjusts error handling/NotFound message bodies. |
| PxApi/Controllers/MetadataController.cs | Updates OpenAPI response types and adds 500 response metadata on HEAD/OPTIONS. |
| PxApi/Controllers/DatabasesController.cs | Updates OpenAPI response types and adds 500 response metadata on HEAD/OPTIONS. |
| PxApi/Controllers/DataController.cs | Adds 503 response for binary sync gaps + emits X-Max-Cells header. |
| PxApi/Controllers/CacheController.cs | Simplifies response bodies to strings and updates OpenAPI response types. |
| PxApi/Configuration/DataBaseType.cs | Adds BinaryBlobStorage enum value. |
| PxApi/Caching/PxfileReader.cs | Removes old reader (logic moved behind DataBaseConnector APIs). |
| PxApi/Caching/MetaCacheContainer.cs | Removes cached data-section offset logic (no longer needed). |
| PxApi/Caching/ICachedDataSource.cs | Adds cancellation tokens; removes single-string metadata API. |
| PxApi/Caching/DatabaseCache.cs | Evicts faulted/canceled tasks from cache to avoid poison entries. |
| PxApi/Caching/CachedDataSource.cs | Uses new connector APIs + adds caching debug logs + cancellation tokens. |
| PxApi.UnitTests/Utils/TestConfigFactory.cs | Updates test configuration keys for new connector settings. |
| PxApi.UnitTests/UtilitiesTests/ServiceCollectionExtensionsTests.cs | Updates tests for new DI validation behavior/config keys. |
| PxApi.UnitTests/UtilitiesTests/BlobReadModeSelectorTests.cs | Adds tests for new read-mode heuristic and gap calculations. |
| PxApi.UnitTests/PxApi.UnitTests.csproj | Updates NUnit packages and adds Px.Utils dependency. |
| PxApi.UnitTests/DocumentFilters/ProblemDetailsDocumentFilterTests.cs | Adds coverage for new OpenAPI document filter behavior. |
| PxApi.UnitTests/DocumentFilters/DataControllerPostEndpointDocumentFilterTests.cs | Expands tests for new error response cleanup behavior. |
| PxApi.UnitTests/DocumentFilters/DataControllerGetEndpointDocumentFilterTests.cs | Expands tests for new error response cleanup behavior. |
| PxApi.UnitTests/DataSources/BinaryBlobDataBaseConnector_InternalHelpersTests.cs | Adds coverage for internal helper methods in binary blob connector. |
| PxApi.UnitTests/ControllerTests/TablesControllerTests.cs | Updates tests for cancellation-token APIs and NotFound message body. |
| PxApi.UnitTests/ControllerTests/MetadataControllerTest.cs | Updates tests for cancellation-token APIs. |
| PxApi.UnitTests/ControllerTests/DatabasesControllerTests.cs | Updates tests for cancellation-token APIs. |
| PxApi.UnitTests/ControllerTests/DataControllerTests.cs | Adds tests for 503 handling and X-Max-Cells header. |
| PxApi.UnitTests/ControllerTests/DataControllerStreamTests.cs | Updates stream tests to use new connector APIs and CachedDataSource ctor changes. |
| PxApi.UnitTests/ControllerTests/CacheControllerTests.cs | Updates tests for cancellation-token APIs and new string responses. |
| PxApi.UnitTests/Caching/PxFileReaderTests.cs | Updates tests to validate DataBaseConnector-based metadata/data reads. |
| PxApi.UnitTests/Caching/CachedDataBaseConnectorTests.cs | Updates + expands tests for cache logging and task-eviction-on-failure behavior. |

Added InternalsVisibleTo attribute for "PxApi.UnitTests" to allow unit testing of internal members.
Replaced GetTimestamp with logic that derives the timestamp from the maximum LastUpdated value among content dimension values, formatted as "yyyyMMddHHmm". Removed redundant DateTime.Now assignment in logging context.
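
As a rough illustration of the timestamp rule described above (take the newest LastUpdated value and format it as "yyyyMMddHHmm"), a Python sketch might look like this. `derive_timestamp` is a hypothetical name, and the handling of an empty input is an assumption: here it simply raises `ValueError`.

```python
from datetime import datetime
from typing import Iterable


def derive_timestamp(last_updated_values: Iterable[datetime]) -> str:
    """Format the newest LastUpdated value as yyyyMMddHHmm."""
    # max() raises ValueError on an empty input; the PR's behaviour
    # for that case is not stated, so this sketch does not guess.
    newest = max(last_updated_values)
    return newest.strftime("%Y%m%d%H%M")


values = [datetime(2026, 1, 30, 12, 53), datetime(2026, 2, 11, 14, 12)]
print(derive_timestamp(values))  # prints 202602111412
```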
Added NormalizeDirectoryPath and IsWithinDirectory methods to MountedDataBaseConnector for robust path handling and security checks. Updated file access logic to prevent directory traversal and unauthorized access. Introduced unit tests covering edge cases for path validation.
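
The kind of containment check this commit describes can be sketched in Python as below. The function names mirror the commit message, but the bodies are assumptions rather than the PR's C# implementation; in particular, this sketch does not resolve symlinks or handle case-insensitive file systems.

```python
import os


def normalize_directory_path(path: str) -> str:
    """Return an absolute path that always ends with a separator."""
    # The trailing separator stops "/data/px" from matching "/data/px-secret".
    return os.path.join(os.path.abspath(path), "")


def is_within_directory(root: str, candidate: str) -> bool:
    """True if candidate, resolved against root, stays below root."""
    root_norm = normalize_directory_path(root)
    # abspath collapses ".." segments, so traversal attempts escape the prefix.
    full = os.path.abspath(os.path.join(root, candidate))
    return full.startswith(root_norm)
```

A request for `../px-secret/table.px` against a root of `/srv/px` resolves to `/srv/px-secret/table.px`, which fails the prefix test only because the normalized root ends with a separator.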
Introduced BlobPathHelper utility for normalizing and combining Azure Blob Storage paths, ensuring consistent use of forward slashes. Updated BlobDataBaseConnector and PxBlobDataBaseConnector to use this helper for all blob path operations. Added comprehensive unit tests for BlobPathHelper.
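
Blob names use forward slashes regardless of the host OS, which is presumably why a dedicated helper is used here. A minimal Python sketch of that normalization follows; `normalize_blob_path` and `combine_blob_path` are hypothetical names, and collapsing repeated or leading/trailing separators is an assumption about the desired behaviour, not something the commit message states.

```python
def normalize_blob_path(path: str) -> str:
    """Convert backslashes to forward slashes and drop empty segments."""
    parts = [p for p in path.replace("\\", "/").split("/") if p]
    return "/".join(parts)


def combine_blob_path(*segments: str) -> str:
    """Join segments with single forward slashes, skipping empty ones."""
    normalized = (normalize_blob_path(s) for s in segments)
    return "/".join(s for s in normalized if s)


print(combine_blob_path("db\\sub", "/table.px"))  # prints db/sub/table.px
```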
- Eagerly materialize blob item lists for metadata files, simplifying existence and selection logic in BinaryBlobDataBaseConnector.
- Replace stackalloc/ReadOnlySpan with char[] in BlobPathHelper for improved compatibility and clarity.
- Fix typo in BlobReadModeSelector comment.
Refactored BlobReadModeSelector and related tests to use long[] for reverse cumulative products (rcsp) instead of int[], preventing integer overflow with large arrays. Removed redundant int[] rcsp helpers and updated all relevant calculations and method signatures. Also improved exception handling in DatabaseCache to prevent unobserved task exceptions.
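
To show why 32-bit integers are too small here, the following Python sketch computes reverse cumulative products for a set of dimension sizes. The exact definition used in the PR is not shown in this conversation, so the version below (each entry is the product of its own size and all sizes after it) is an assumption, as is the name `reverse_cumulative_products`. Python integers do not overflow, so the sketch only compares the result against the 32-bit limit.

```python
from typing import List, Sequence

INT32_MAX = 2**31 - 1  # largest value a C# int can hold


def reverse_cumulative_products(sizes: Sequence[int]) -> List[int]:
    """products[i] is the product of sizes[i], sizes[i + 1], ..., sizes[-1]."""
    products = [1] * len(sizes)
    running = 1
    for i in range(len(sizes) - 1, -1, -1):
        running *= sizes[i]
        products[i] = running
    return products


rcsp = reverse_cumulative_products([2000, 1500, 1000])
print(rcsp)                 # prints [3000000000, 1500000, 1000]
print(rcsp[0] > INT32_MAX)  # prints True
```

Three modest dimensions already produce a total of 3,000,000,000 cells, which exceeds the 32-bit limit of 2,147,483,647.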
Introduce BlobReadModeConfig for configurable thresholds (SmallThreshold, MaxWindowedReadSize, ReadWindowGap) used in blob read strategy selection. Refactor BlobReadModeSelector to use these settings from AppSettings. In BinaryBlobDataBaseConnector, abstract all Azure Blob SDK calls into virtual methods for improved testability. Add comprehensive unit tests for binary blob reading, metadata, and file listing, including large-blob and edge-case scenarios. Update existing tests to use the new configurable thresholds.
Added DataBaseConnectorTests covering metadata and data reading for both seekable and non-seekable streams, including key handling and data mapping. Introduced a custom TestableConnector for controlled testing. Also marked ValidateDatabaseConnectionsAsync with [ExcludeFromCodeCoverage].
Replaces all usages of Assert.Multiple with using (Assert.EnterMultipleScope()) {...} across the test suite for improved clarity and compatibility with newer NUnit practices. Also includes minor assertion cleanups (e.g., Is.EqualTo(0) → Is.Zero) and consolidates some import statements.