fhnaumann (Collaborator) commented Sep 10, 2025

Renamed the branch to reflect the planned changes more accurately.

Relates to #2467, #1987 and #1986.
In the future, this PR may also include #2476 and #1988.

@github-project-automation (bot) moved this to Backlog in Pathling Sep 15, 2025
@johngrimes moved this from Backlog to In progress in Pathling Sep 15, 2025
@johngrimes added the fhirpath and new feature labels Sep 15, 2025
fhnaumann (Collaborator, Author) commented Sep 16, 2025

Demonstration:
An AidBox server is deployed to Kubernetes. A small script pulls data from it (using bulk export?) and converts the NDJSON files to Parquet files (Delta tables). The pathling-server being developed uses that data as input and can be asked to perform a bulk export on it. Requests are made through a web client. Every component is dockerized, packaged using Helm charts, and then deployed to Kubernetes.

In the future, additional technologies such as Databricks may be used. Authentication and authorization may also be performed through the web client to the pathling-server.

The auth should be used in the createTag method, but I am unsure what the effects are. Is there something "in" the auth object that stays the same across requests, so that caching still works? In any case, some auth information should probably be part of the tag (if parts of it stay the same across requests).
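One way to picture the question: a tag built from a stable attribute of the caller (for example the token subject) rather than from the raw token would stay constant across requests made by the same principal. The following is only a minimal sketch under that assumption. `CacheTags`, `tagFor` and the choice of the subject as the stable attribute are hypothetical and are not taken from the Pathling code base.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Hypothetical helper: derives a cache tag from the request URL plus a stable
// identifier of the caller, so that the tag does not change when the token is renewed.
final class CacheTags {

  private CacheTags() {
  }

  static String tagFor(final String requestUrl, final String subject) {
    try {
      final MessageDigest digest = MessageDigest.getInstance("SHA-256");
      digest.update(requestUrl.getBytes(StandardCharsets.UTF_8));
      // Separator byte so that ("ab", "c") and ("a", "bc") produce different tags.
      digest.update((byte) 0);
      // Unauthenticated requests (null subject) fall back to an empty identifier.
      digest.update((subject == null ? "" : subject).getBytes(StandardCharsets.UTF_8));
      return HexFormat.of().formatHex(digest.digest());
    } catch (final NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-256 is not available", e);
    }
  }
}
```

With this shape, two callers issuing the same request get different tags, while repeated requests from one caller share a tag. Whether that is the desired caching behaviour is exactly the open question.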

@johngrimes johngrimes self-requested a review September 18, 2025 01:39
johngrimes (Member) commented

@fhnaumann Could you please merge main into this branch?

johngrimes (Member) commented

Why are there some deleted files from the test data directory in the library API? There are other tests that rely upon this data.

johngrimes (Member) left a comment

This doesn't compile for me yet, and it is not passing the tests on CI.

I've added some preliminary comments anyway; I can take another look once we have a green build.

As a general comment, please also take a look at the CONTRIBUTING.md file and make sure that everything is ticked off there.

@Component
@Profile("server")
@Slf4j
public class ConformanceProvider implements IServerConformanceProvider<CapabilityStatement>,
Member

This will need to be updated to accurately reflect the capabilities of the server now.

Add comprehensive Javadoc comments to all fields in the Job class,
explaining the purpose of each field. Also add missing @param tag for
the id parameter in the constructor and add final modifiers to method
parameters.

Add missing ExportConfiguration parameter to ExportProvider constructor
call in SecurityTestForOperations test.

Change ExportProvider to inject ServerConfiguration instead of
ExportConfiguration directly, since nested configurations are not
automatically available as Spring beans. Access the export configuration
via serverConfiguration.getExport().
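
The injection change above can be pictured with the following minimal sketch. The three classes are simplified stand-ins written without Spring so that the snippet compiles on its own. The `resultExpirySeconds` setting is a hypothetical example field, not a confirmed Pathling property.

```java
// Nested settings object. In a Spring application this kind of object is usually
// populated through its parent and is not registered as a bean in its own right.
final class ExportConfiguration {

  private final long resultExpirySeconds; // hypothetical setting

  ExportConfiguration(final long resultExpirySeconds) {
    this.resultExpirySeconds = resultExpirySeconds;
  }

  long getResultExpirySeconds() {
    return resultExpirySeconds;
  }
}

// Top-level configuration, which is the object that can be injected.
final class ServerConfiguration {

  private final ExportConfiguration export;

  ServerConfiguration(final ExportConfiguration export) {
    this.export = export;
  }

  ExportConfiguration getExport() {
    return export;
  }
}

// The provider asks for the parent configuration and navigates to the nested part.
final class ExportProvider {

  private final ExportConfiguration exportConfiguration;

  ExportProvider(final ServerConfiguration serverConfiguration) {
    this.exportConfiguration = serverConfiguration.getExport();
  }

  long resultExpirySeconds() {
    return exportConfiguration.getResultExpirySeconds();
  }
}
```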
Log all response headers in the assertCompleteResult method to aid in
debugging and verification of the Expires header configuration.

The FHIR Bulk Data Export manifest now correctly sets requiresAccessToken
to true when server authorisation is enabled. This ensures that bulk data
clients include the access token when downloading exported files.

Also fixes Dependencies.java to use PathlingContext.Builder pattern
instead of the non-existent create(SparkSession, EncodingConfiguration,
TerminologyConfiguration) method.
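
As a rough illustration of the requiresAccessToken change, the sketch below builds the top of a manifest as a plain map. `ManifestSketch` and `header` are hypothetical names, and the real manifest carries further fields that are omitted here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that mirrors two manifest fields from the Bulk Data specification.
final class ManifestSketch {

  private ManifestSketch() {
  }

  static Map<String, Object> header(final String requestUrl, final boolean authEnabled) {
    final Map<String, Object> manifest = new LinkedHashMap<>();
    manifest.put("request", requestUrl);
    // Clients only attach their access token to file downloads when this flag is true.
    manifest.put("requiresAccessToken", authEnabled);
    return manifest;
  }
}
```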
…ent scan

Prevents the deltaLake() bean from being created during test data import,
which was failing because it tried to read Delta tables that hadn't been
generated yet.

Implement patient-level and group-level bulk export per FHIR Bulk Data
Access specification. Key changes:

- Add PatientExportProvider for /Patient/$export and /Patient/[id]/$export
- Add GroupExportProvider for /Group/[id]/$export
- Add PatientCompartmentService to filter resources by patient compartment
- Add ExportOperationHelper to deduplicate export execution logic
- Extend ExportRequest with exportLevel and patientIds fields
- Update ExportOperationValidator with patient-level validation
- Register new providers in FhirServer and ConformanceProvider
- Fix code style issues and add missing Javadoc across modified files
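
The endpoints listed above map onto distinct export levels. The sketch below shows one way such a mapping could look. `ExportLevelSketch`, `ExportTarget` and `fromPath` are hypothetical names. The actual values of exportLevel and the way the providers are routed are not shown in this thread.

```java
import java.util.Optional;

// Hypothetical export levels, one per endpoint shape listed in the commit message.
enum ExportLevelSketch { SYSTEM, PATIENT_TYPE, PATIENT_INSTANCE, GROUP }

// Pairs the level with the Patient or Group id taken from the path, if there is one.
record ExportTarget(ExportLevelSketch level, Optional<String> id) {

  static ExportTarget fromPath(final String path) {
    // Trim leading and trailing slashes, then split into path segments.
    final String[] segments = path.replaceAll("^/+|/+$", "").split("/");
    final boolean isExport = "$export".equals(segments[segments.length - 1]);
    if (isExport && segments.length == 1) {
      return new ExportTarget(ExportLevelSketch.SYSTEM, Optional.empty());
    }
    if (isExport && segments.length == 2 && "Patient".equals(segments[0])) {
      return new ExportTarget(ExportLevelSketch.PATIENT_TYPE, Optional.empty());
    }
    if (isExport && segments.length == 3 && "Patient".equals(segments[0])) {
      return new ExportTarget(ExportLevelSketch.PATIENT_INSTANCE, Optional.of(segments[1]));
    }
    if (isExport && segments.length == 3 && "Group".equals(segments[0])) {
      return new ExportTarget(ExportLevelSketch.GROUP, Optional.of(segments[1]));
    }
    throw new IllegalArgumentException("Not a supported export path: " + path);
  }
}
```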
The bulk export manifest was generating incorrect result URLs for
patient-level exports (e.g. /Patient/$result instead of /$result).
This was caused by parsing the request URL to derive the server base,
which included the resource type path segment.

Changes:
- Add serverBaseUrl field to ExportRequest record
- Pass requestDetails.getFhirServerBase() from validator to request
- Update ExportResponse to use serverBaseUrl directly
- Remove backwards-compatible constructors from ExportRequest
- Update test utilities to use canonical constructor
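
The difference between the two approaches described above can be sketched as follows. `ResultUrls` and both method names are hypothetical, and query parameters on the result URL are left out. The sketch only contrasts deriving the base from the request URL with using the server base directly.

```java
// Hypothetical helper contrasting the two ways of building the result URL.
final class ResultUrls {

  private ResultUrls() {
  }

  // Problematic approach: everything before "/$export" is treated as the server base,
  // so a patient-level request keeps its "/Patient" path segment.
  static String fromRequestUrl(final String requestUrl) {
    final int index = requestUrl.lastIndexOf("/$export");
    if (index < 0) {
      throw new IllegalArgumentException("Not an export request: " + requestUrl);
    }
    return requestUrl.substring(0, index) + "/$result";
  }

  // Fixed approach: the server base is passed in explicitly and used as-is.
  static String fromServerBase(final String serverBaseUrl) {
    final String base = serverBaseUrl.endsWith("/")
        ? serverBaseUrl.substring(0, serverBaseUrl.length() - 1)
        : serverBaseUrl;
    return base + "/$result";
  }
}
```

For a request to `https://example.org/fhir/Patient/$export`, the first method yields a URL under `/Patient/$result`, whereas the second yields `/$result` directly under the server base.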
Add Javadoc documentation, nullability annotations, and rename ND_JSON
constant to NDJSON to follow Java naming conventions.

Add Javadoc documentation, nullability annotations, and final modifiers.
Fix redundant registry lookup and improve code formatting.

Clarify that this provider handles system-level bulk exports, consistent
with PatientExportProvider and GroupExportProvider naming.

Add Javadoc documentation, nullability annotations, and final modifiers.
Remove commented-out code and use Lombok @Getter for requiresAccessToken.

Improve readability by extracting URL conversion logic into a dedicated
private method.

Rename to OperationValidation, add Javadoc documentation, nullability
annotations, and final modifiers. Update all references.

Implements the Argonaut $bulk-submit specification for receiving bulk data
from external systems. The operation supports a multi-phase submission
lifecycle (in-progress, complete, aborted) and delegates actual data
processing to the existing ImportExecutor.

Key components:
- BulkSubmitProvider: Main operation endpoint with async support
- BulkSubmitStatusProvider: Status checking endpoint
- BulkSubmitValidator: Request validation with submitter authorisation
- BulkSubmitExecutor: Manifest fetching and file download orchestration
- SubmissionRegistry: In-memory state with Hadoop FileSystem persistence
- BulkSubmitResultBuilder: Export-style manifest generation

Configuration via pathling.bulk-submit.* properties including enabled flag,
allowed submitters list, staging location, and allowable source prefixes.

Conditionally include $bulk-submit and $bulk-submit-status operations
in the CapabilityStatement when bulk-submit is enabled in configuration.
Adds OperationDefinition resources for both operations.

Per Argonaut spec, URL parameters (manifestUrl, fhirBaseUrl,
replacesManifestUrl) are string (url) not FHIR url type. This allows
clients to send valueString instead of valueUrl.

The Argonaut spec uses headerName/headerValue for the fileRequestHeader
parts, not name/value. Also makes header extraction optional to
gracefully skip empty or incomplete headers.
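
The optional header extraction could look roughly like the sketch below. Plain maps stand in for the fileRequestHeader parts of the FHIR Parameters resource, and `FileRequestHeaders` and `extract` are hypothetical names rather than the classes used in this PR.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: each map represents one fileRequestHeader part, keyed by the
// names of its nested parts (headerName and headerValue).
final class FileRequestHeaders {

  private FileRequestHeaders() {
  }

  static Map<String, String> extract(final List<Map<String, String>> parts) {
    final Map<String, String> headers = new LinkedHashMap<>();
    for (final Map<String, String> part : parts) {
      final String name = part.get("headerName");
      final String value = part.get("headerValue");
      // Skip empty or incomplete headers instead of rejecting the whole request.
      if (name == null || name.isBlank() || value == null || value.isBlank()) {
        continue;
      }
      headers.put(name, value);
    }
    return headers;
  }
}
```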
Clients like bulk-submit-provider send Accept: application/json rather
than application/fhir+json. Using lenient validation allows these
requests to proceed by automatically adding the required header.

Add @AsyncSupported annotation so the operation returns 202 Accepted
with Content-Location pointing to $job endpoint for polling. The method
blocks until the submission completes, enabling clients to poll via GET.

Per Argonaut spec, manifestUrl may be omitted when setting
submissionStatus to complete. The manifest details can come from a
previous in-progress request. This change:

- Removes validator requirement for manifestUrl on complete status
- Stores manifest details in handleInProgressSubmission when provided
- Uses stored manifest details in handleCompleteSubmission if not
  provided in the request
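
The fallback described in the list above amounts to preferring the value from the current request and otherwise using what an earlier in-progress request stored. A minimal sketch, with `ManifestResolution` and `resolveManifestUrl` as hypothetical names:

```java
import java.util.Optional;

// Hypothetical helper for choosing the manifest URL when a submission is completed.
final class ManifestResolution {

  private ManifestResolution() {
  }

  static String resolveManifestUrl(final Optional<String> fromRequest,
      final Optional<String> stored) {
    // Prefer the request value, then the stored one; fail only if neither is present.
    return fromRequest
        .or(() -> stored)
        .orElseThrow(() -> new IllegalStateException(
            "No manifestUrl in the request and none stored from an earlier request"));
  }
}
```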
Add submissionId and submissionStatus to request URL to ensure different
bulk-submit requests get unique async job cache tags. Also fix state
management so withManifestDetails() preserves current state rather than
automatically transitioning to PROCESSING.
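
The state management fix can be illustrated with an immutable record whose withManifestDetails() copies the current state instead of changing it. `SubmissionSketch` and `SubmissionStateSketch` are hypothetical, and every state name other than PROCESSING is an assumption made for the example.

```java
// Hypothetical states; only PROCESSING is named in the commit message.
enum SubmissionStateSketch { PENDING, PROCESSING, COMPLETED, ABORTED }

// Immutable snapshot of a submission. Each "with" method returns a modified copy.
record SubmissionSketch(String submissionId, SubmissionStateSketch state, String manifestUrl) {

  // Attaches manifest details while preserving the current state.
  SubmissionSketch withManifestDetails(final String newManifestUrl) {
    return new SubmissionSketch(submissionId, state, newManifestUrl);
  }

  // State transitions are explicit and separate from manifest updates.
  SubmissionSketch withState(final SubmissionStateSketch newState) {
    return new SubmissionSketch(submissionId, newState, manifestUrl);
  }
}
```

Keeping the transition in its own method means that storing manifest details during an in-progress request cannot start processing as a side effect.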
Add WireMock-based integration test that exercises the bulk-submit
workflow with stubbed manifest and NDJSON data endpoints. Fix test
helper in BulkSubmitResultBuilderTest to correctly set PROCESSING
state. Update CONTRIBUTING.md with correct command for running
specific integration tests without unit tests.

sonarqubecloud commented

Quality Gate failed

Failed conditions
D Reliability Rating on New Code (required ≥ A)


Labels

fhirpath (Related to fhirpath reference implementation), new feature (New feature or request)

Projects

Status: In progress

3 participants