Conversation
* Added spacing between clarin & dspace logo * Spacing is prettier and when clarin logo is removed, dspace logo is centered
* Fix of unwanted Seznam Dataset License request while using other licenses
…ace#851) * Added an URL serializer to fix encoding of the special characters from the URL e.g., `[`, `(` because the filename wasn't properly parsed * Added some unit tests for encoding the bitstream filename url
* Do not use the clarin item view box for the bulk access * Removed unused import * Clarify the `showClarinViewBox` is boolean * Use the constant for the hardcoded bulk access list id
* Create Acknowledgment-ReadMe.md Acnkowledgment of NRP project (cherry picked from commit ad889b2) * Video files previews This uses the thumbnail as poster (if available) and correctly sets the source of the video currently only works for anonymously accessible files. (cherry picked from commit 4832c2f) * Handle video previews for restricted items append a shortlived token at the right time (error, seeking, stalled) (cherry picked from commit 2c12d7d) * Display only ORIGINAL bitstreams Thumbnails, when available, should be shown istead of the generic MIME_TYPE_IMAGE. Content of the TEXT bundle should not be shown at all this is usually automatically extracted "text layer" of a PDF, useful for indexing, but don't want people downloading it. (cherry picked from commit 3862442) * fix linter and test errors (cherry picked from commit f852096) * Code review follow up the listOfFiles should really not contain files from "TEXT" or "THUMBNAIL" bundles. * code review unsubscribe error$, seeking$ and $stalled * code review - thumbnail might be undefined * code review - consistent formatting
this contains the changes from lindat-2025.05.15113644568
There was a problem hiding this comment.
Pull Request Overview
Sync various updates from lindat-2025.05 including weekly import, filename‐encoding improvements, bulk access tweaks, Solr config copy changes, and UI/logo layout adjustments.
- Add a custom
UrlSerializerandencodeRFC3986URIComponentutil to properly handle special characters in bitstream URLs - Introduce bulk‐access list ID and clarify view logic for CLARIN vs default list display
- Update Docker Compose to copy Solr configs on each start and schedule a weekly import
Reviewed Changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/app/shared/object-list/object-list.component.ts | Define BULK_ACCESS_LIST_ID and showClarinViewBox() for bulk‐access logic |
| src/app/shared/object-list/object-list.component.html | Swap hardcoded context === 'search' checks for showClarinViewBox() |
| src/app/shared/clarin-shared-util.ts | Add encodeRFC3986URIComponent to safely encode filenames |
| src/app/login-page/login-page.component.html | Align logos, add target="_blank" and rel attrs |
| src/app/item-page/simple/field-components/preview-section/file-description/file-description.component.ts | Implement OnDestroy to manage subscriptions and handle missing thumbnails |
| src/app/item-page/simple/field-components/preview-section/file-description/file-description.component.spec.ts | Minor spec formatting update |
| src/app/item-page/clarin-files-section/clarin-files-section.component.ts | Restrict metadata bitstream query to ORIGINAL only |
| src/app/core/url-serializer/bitstream-url-serializer.ts | New BitstreamUrlSerializer to encode filenames in router URLs |
| src/app/core/url-serializer/bitstream-url-serializer.spec.ts | Tests for filename encoding in /bitstream/ URLs |
| src/app/bitstream-page/legacy-bitstream-url.resolver.ts | Apply encodeRFC3986URIComponent to legacy download filenames |
| src/app/bitstream-page/clarin-zip-download-page/clarin-zip-download-page.component.ts | Conditionally append dtoken query param only if present |
| src/app/bitstream-page/clarin-license-agreement-page/clarin-license-agreement-page.component.ts | Subscribe to item$ and load license content only when appropriate |
| src/app/app.module.ts | Provide custom BitstreamUrlSerializer for the router |
| docker/docker-compose-rest.yml | Remove Solr volume mount; copy configsets at startup |
| .github/workflows/import-weekly.yml | Schedule weekly import job every Sunday |
| .github/disabled-workflows/deploy.yml | Remove legacy import schedule from disabled workflow |
Comments suppressed due to low confidence (4)
src/app/item-page/simple/field-components/preview-section/file-description/file-description.component.ts:38
- [nitpick] Property
handlers_addeduses snake_case; align with TypeScript style by renaming tohandlersAdded.
handlers_added = false;
src/app/shared/clarin-shared-util.ts:96
- This new utility lacks unit tests. Add tests covering spaces, brackets, parentheses, and edge cases to ensure correct encoding.
export function encodeRFC3986URIComponent(uriPart: string) {
src/app/core/url-serializer/bitstream-url-serializer.spec.ts:26
- Consider adding a test for filenames containing multiple
/segments to verify path reconstruction (and catch unintended comma insertion).
it('should encode special characters in the filename in /bitstream/ URLs', () => {
src/app/shared/clarin-shared-util.ts:101
- The implementation only replaces parentheses. To match the docstring and cover brackets (
[]) and spaces, extend the replacement or use a more comprehensive regex/pipeline.
.replace(/[()]/g, c => '%' + c.charCodeAt(0).toString(16).toUpperCase());
| loadLicenseContentSeznam() { | ||
| this.htmlContentService.getHmtlContentByPathAndLocale(this.LICENSE_PATH_SEZNAM_CZ).then(content => { | ||
| this.licenseContentSeznam.next(content); | ||
| this.item$.subscribe((item) => { |
There was a problem hiding this comment.
Subscribing directly without cleanup can cause memory leaks. Consider using take(1) or unsubscribing in ngOnDestroy, or refactor to async pipe.
Suggested change
| this.item$.subscribe((item) => { | |
| this.item$.pipe(take(1)).subscribe((item) => { |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This contains the changes from lindat-2025.05.15113644568
Run Python import weekly
ZCU-PUB/Fixed encoding of the filename from the URL (dataquest-dev#838) (dataquest-dev#851)
UFAL/Do not mount the Solr configs; copy them each time instead. (dataquest-dev#850)
UFAL/Fix the bulk access (dataquest-dev#852)
UFAL/Do not add
dtokeninto the URL if it is null (dataquest-dev#860) (dataquest-dev#861)UFAL/Added spacing between clarin & dspace logo (dataquest-dev#848)
Ufal/seznam license request (dataquest-dev#844)
Merge remote-tracking branch 'dataquest-dev/dtq-dev' into HEAD
Ufal dtq sync 2025 05 14 (dataquest-dev#855)