@pinglin pinglin commented Sep 29, 2025

Because

  • The Content-Disposition header parsing logic had an incorrect condition that prevented filename extraction from HTTP responses
  • The original code checked disposition == "" but then tried to access the disposition string, causing filenames to never be extracted from HTTP headers
  • This affected file uploads where the filename should be preserved from the original HTTP response
  • The base64 encoding issue for uploaded filenames was related to this parsing problem
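
To make the failure mode concrete, here is a minimal Go sketch of an inverted guard of this kind. It is a reconstruction from the description above, not the code in pkg/external/external.go: the names `filenameParam`, `extractBuggy`, and `extractFixed` are hypothetical, and the parameter parsing is deliberately naive.

```go
package main

import "strings"

// filenameParam pulls a plain filename= parameter out of a
// Content-Disposition value. Naive on purpose; illustration only.
func filenameParam(disposition string) string {
	for _, part := range strings.Split(disposition, ";") {
		part = strings.TrimSpace(part)
		if strings.HasPrefix(part, "filename=") {
			return strings.Trim(strings.TrimPrefix(part, "filename="), `"`)
		}
	}
	return ""
}

// extractBuggy mirrors the inverted guard: parsing only runs when the
// header is empty, so there is never anything to extract.
func extractBuggy(disposition string) string {
	if disposition == "" {
		return filenameParam(disposition)
	}
	return ""
}

// extractFixed runs the same parsing only when a header value is present.
func extractFixed(disposition string) string {
	if disposition != "" {
		return filenameParam(disposition)
	}
	return ""
}
```

With the inverted guard, any response that actually carries a Content-Disposition header falls through to the empty result, which matches the "never extracted" symptom described above.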

This commit

  • Fixes the Content-Disposition header parsing logic by changing the condition from disposition == "" to disposition != ""
  • Removes the unnecessary check for the "attachment" prefix, so the filename is extracted from both "attachment" and "inline" dispositions, as RFC 6266 allows the filename parameter on either disposition type
  • Adds comprehensive test coverage for filename extraction from various Content-Disposition header formats including:
    • Basic attachment and inline dispositions
    • RFC 5987 encoded filenames with special characters
    • Complex filenames with quotes and special characters
    • Data URI parsing with filename parameters
    • URL-encoded filename handling
  • Adds tests for both binaryFetcher and artifactBinaryFetcher implementations
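
One way to get this behaviour in Go is to hand the header value to the standard library's `mime.ParseMediaType`, which does not care whether the disposition type is `attachment` or `inline` and also decodes RFC 5987 style `filename*` parameters. The sketch below assumes that approach; the PR text does not say which parser the real code uses, and `filenameFromDisposition` is a hypothetical name.

```go
package main

import "mime"

// filenameFromDisposition returns the filename parameter of a
// Content-Disposition header value, or "" when the header is empty,
// malformed, or carries no filename. Hypothetical helper name.
func filenameFromDisposition(disposition string) string {
	if disposition == "" {
		return ""
	}
	// ParseMediaType handles quoted values and decodes extended
	// parameters such as filename*=UTF-8''na%C3%AFve.txt.
	_, params, err := mime.ParseMediaType(disposition)
	if err != nil {
		return ""
	}
	return params["filename"]
}
```

Returning an empty string on a parse error keeps the caller free to fall back to another naming strategy; whether the real code does that is not stated in this PR.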

Note

Fixes filename extraction from HTTP responses and adds extensive tests for URL/data URI handling and MinIO presigned patterns.

  • Bug Fix
    • Correct Content-Disposition parsing in pkg/external/external.go to extract filename when header is present (handles both attachment and inline).
  • Tests (pkg/external/external_test.go)
    • Add coverage for BinaryFetcher and ArtifactBinaryFetcher:
      • Filename extraction from Content-Disposition (attachment, inline, RFC 5987, complex names, absent header).
      • Data URI parsing with optional filename parameter and error cases.
      • Presigned URL decoding via v1alpha/blob-urls/{base64} and deprecated pattern matching.
      • Regular URL fallback and network/error handling.
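
The data URI cases can be pictured with a sketch like the following. It is an illustration under stated assumptions, not the repository's parser: the `filename=` parameter is assumed to sit between the media type and `;base64`, the filename is assumed to be percent-encoded, only base64 payloads are handled, and `parseDataURI` is a hypothetical name.

```go
package main

import (
	"encoding/base64"
	"errors"
	"net/url"
	"strings"
)

// parseDataURI splits a base64 data URI into media type, optional
// filename parameter, and decoded payload. Hypothetical helper.
func parseDataURI(uri string) (mediaType, filename string, data []byte, err error) {
	if !strings.HasPrefix(uri, "data:") {
		return "", "", nil, errors.New("not a data URI")
	}
	meta, payload, found := strings.Cut(strings.TrimPrefix(uri, "data:"), ",")
	if !found {
		return "", "", nil, errors.New("data URI has no comma separator")
	}
	parts := strings.Split(meta, ";")
	last := len(parts) - 1
	if parts[last] != "base64" {
		return "", "", nil, errors.New("only base64 data URIs are handled in this sketch")
	}
	// Everything before the trailing "base64" token: media type first,
	// then optional parameters.
	params := parts[:last]
	if len(params) > 0 {
		mediaType, params = params[0], params[1:]
	}
	for _, p := range params {
		if strings.HasPrefix(p, "filename=") {
			filename = strings.TrimPrefix(p, "filename=")
			// Assumption: the filename may be percent-encoded.
			if decoded, uerr := url.PathUnescape(filename); uerr == nil {
				filename = decoded
			}
		}
	}
	data, err = base64.StdEncoding.DecodeString(payload)
	if err != nil {
		return "", "", nil, err
	}
	return mediaType, filename, data, nil
}
```

A caller would treat an empty filename as "no name supplied" and a non-nil error as one of the error cases listed above.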

Written by Cursor Bugbot for commit 35fbcfc.

linear bot commented Sep 29, 2025

@pinglin pinglin merged commit 04ae176 into main Sep 29, 2025
8 checks passed
@pinglin pinglin deleted the pinglin/ins-8668-bug-fix-base64-encoding-issue-for-uploaded-filename-in branch September 29, 2025 19:40
@pinglin pinglin changed the title from "fix(external): fix Content-Disposition header parsing for filename ex…" to "fix(external): fix Content-Disposition header parsing for filename extraction" Sep 29, 2025
pinglin added a commit that referenced this pull request Sep 29, 2025
…traction (#1132)

jvallesm pushed a commit that referenced this pull request Oct 7, 2025
🤖 I have created a release *beep* *boop*
---


## [0.61.0](v0.60.0...v0.61.0) (2025-10-06)


### Features

* **component,ai,gemini:** add image generation support
([#1122](#1122))
([d986614](d986614))
* **component,ai,gemini:** add multimedia support with unified format…
([#1114](#1114))
([291b379](291b379))
* **component,ai,gemini:** add text embeddings task support
([#1129](#1129))
([d7ca6cf](d7ca6cf))
* **component,ai,gemini:** enhance streaming to output all fields
([#1106](#1106))
([dfb6b24](dfb6b24))
* **component,ai,gemini:** implement automatic format conversion for
unsupported media types
([#1128](#1128))
([f767b8a](f767b8a))
* **component,ai,gemini:** implement File API support for large files…
([#1118](#1118))
([b51c8f4](b51c8f4))
* **data:** add comprehensive AVIF image format support
([#1135](#1135))
([76d6941](76d6941))
* **data:** add HEIC/HEIF image support and normalize MIME types
([#1127](#1127))
([2dfa254](2dfa254))
* **data:** enhance unmarshaler with JSON string to struct conversion
([#1116](#1116))
([9e06b7c](9e06b7c))
* **data:** implement time types support with pattern validation
([#1115](#1115))
([79630c0](79630c0))


### Bug Fixes

* **compogen:** escape curly braces for readme.com compatibility
([#1124](#1124))
([904992d](904992d))
* **component,ai,gemini:** add operation validation for cache task
([#1130](#1130))
([9e19255](9e19255))
* **component,ai,gemini:** correct text-based documents logic
([#1103](#1103))
([ed5a111](ed5a111))
* **component,ai,gemini:** unify InlineData processing and enable images
in streaming responses
([#1125](#1125))
([3117046](3117046))
* **data:** remove duplicate dot in generated filenames
([#1136](#1136))
([0a74a00](0a74a00))
* **external:** fix Content-Disposition header parsing for filename
extraction
([#1132](#1132))
([869b081](869b081))
* **service:** handle null JSON metadata in pipeline conversion
([#1134](#1134))
([b244784](b244784))
* **text:** correct positions on duplicate markdown chunks
([#1120](#1120))
([1b4cd1f](1b4cd1f))
* **usage:** add missing error filtering for users/admin
([#1119](#1119))
([cd1bd55](cd1bd55))


### Refactor

* **component,ai,gemini:** merge usage and usage-metadata fields into
single usage field
([#1126](#1126))
([a6046cd](a6046cd))
* **component,ai.gemini:** standardize file api timeout and use native
embedding type
([#1133](#1133))
([174f7d6](174f7d6))
* **component,generic,http:** move test functions to test files and
improve code legibility
([#1131](#1131))
([1153a09](1153a09))
* **component,generic,http:** replace env-based URL validation with
constructor injection
([#1121](#1121))
([f1f7d2f](f1f7d2f))


### Tests

* **component,generic,http:** replace external httpbin.org dependency
with local test server
([#1101](#1101))
([a82d155](a82d155))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
donch1989 pushed a commit that referenced this pull request Oct 7, 2025
🤖 I have created a release *beep* *boop*
---


## [0.61.0](v0.60.0...v0.61.0) (2025-10-07)


### Features

* **component,ai,gemini:** add image generation support
([#1122](#1122))
([d986614](d986614))
* **component,ai,gemini:** add multimedia support with unified format…
([#1114](#1114))
([291b379](291b379))
* **component,ai,gemini:** add text embeddings task support
([#1129](#1129))
([d7ca6cf](d7ca6cf))
* **component,ai,gemini:** enhance streaming to output all fields
([#1106](#1106))
([dfb6b24](dfb6b24))
* **component,ai,gemini:** implement automatic format conversion for
unsupported media types
([#1128](#1128))
([f767b8a](f767b8a))
* **component,ai,gemini:** implement File API support for large files…
([#1118](#1118))
([b51c8f4](b51c8f4))
* **data:** add comprehensive AVIF image format support
([#1135](#1135))
([76d6941](76d6941))
* **data:** add HEIC/HEIF image support and normalize MIME types
([#1127](#1127))
([2dfa254](2dfa254))
* **data:** enhance unmarshaler with JSON string to struct conversion
([#1116](#1116))
([9e06b7c](9e06b7c))
* **data:** implement time types support with pattern validation
([#1115](#1115))
([79630c0](79630c0))


### Bug Fixes

* **compogen:** escape curly braces for readme.com compatibility
([#1124](#1124))
([904992d](904992d))
* **component,ai,gemini:** add operation validation for cache task
([#1130](#1130))
([9e19255](9e19255))
* **component,ai,gemini:** correct text-based documents logic
([#1103](#1103))
([ed5a111](ed5a111))
* **component,ai,gemini:** unify InlineData processing and enable images
in streaming responses
([#1125](#1125))
([3117046](3117046))
* **component,document:** fix incorrect expected value in the unit test
([#1138](#1138))
([189dbd6](189dbd6))
* **data:** remove duplicate dot in generated filenames
([#1136](#1136))
([0a74a00](0a74a00))
* **external:** fix Content-Disposition header parsing for filename
extraction
([#1132](#1132))
([869b081](869b081))
* **service:** handle null JSON metadata in pipeline conversion
([#1134](#1134))
([b244784](b244784))
* **text:** correct positions on duplicate markdown chunks
([#1120](#1120))
([1b4cd1f](1b4cd1f))
* **usage:** add missing error filtering for users/admin
([#1119](#1119))
([cd1bd55](cd1bd55))


### Miscellaneous

* release v0.61.0
([e1db93c](e1db93c))


### Refactor

* **component,ai,gemini:** merge usage and usage-metadata fields into
single usage field
([#1126](#1126))
([a6046cd](a6046cd))
* **component,ai.gemini:** standardize file api timeout and use native
embedding type
([#1133](#1133))
([174f7d6](174f7d6))
* **component,generic,http:** move test functions to test files and
improve code legibility
([#1131](#1131))
([1153a09](1153a09))
* **component,generic,http:** replace env-based URL validation with
constructor injection
([#1121](#1121))
([f1f7d2f](f1f7d2f))


### Tests

* **component,generic,http:** replace external httpbin.org dependency
with local test server
([#1101](#1101))
([a82d155](a82d155))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
donch1989 added a commit to instill-ai/instill-core that referenced this pull request Oct 8, 2025
Because
- The version of the pipeline-backend service is not updated in the
instill-core repository.

This commit
- updates the `PIPELINE_BACKEND_VERSION` in the `.env` file to `0.61.0`.
- updates the `pipelineBackend.image.tag` in the helm chart values.yaml
file to `0.61.0`.

## Changes in pipeline-backend
- chore(main): release 0.61.0 (instill-ai/pipeline-backend#1137)
- chore: release v0.61.0
- fix(component,document): fix incorrect expected value in the unit test
(instill-ai/pipeline-backend#1138)
- fix(data): remove duplicate dot in generated filenames
(instill-ai/pipeline-backend#1136)
- feat(data): add comprehensive AVIF image format support
(instill-ai/pipeline-backend#1135)
- fix(service): handle null JSON metadata in pipeline conversion
(instill-ai/pipeline-backend#1134)
- refactor(component,ai.gemini): standardize file api timeout and use
native embedding type (instill-ai/pipeline-backend#1133)
- fix(external): fix Content-Disposition header parsing for filename
extraction (instill-ai/pipeline-backend#1132)
- refactor(component,generic,http): move test functions to test files
and improve code legibility (instill-ai/pipeline-backend#1131)
- fix(component,ai,gemini): add operation validation for cache task
(instill-ai/pipeline-backend#1130)
- feat(component,ai,gemini): add text embeddings task support
(instill-ai/pipeline-backend#1129)
- feat(component,ai,gemini): implement automatic format conversion for
unsupported media types (instill-ai/pipeline-backend#1128)
- feat(data): add HEIC/HEIF image support and normalize MIME types
(instill-ai/pipeline-backend#1127)
- refactor(component,ai,gemini): merge usage and usage-metadata fields
into single usage field (instill-ai/pipeline-backend#1126)
- fix(component,ai,gemini): unify InlineData processing and enable
images in streaming responses (instill-ai/pipeline-backend#1125)
- fix(compogen): escape curly braces for readme.com compatibility
(instill-ai/pipeline-backend#1124)
- ci(workflows): merge sync-component-docs workflows into single push-t…
(instill-ai/pipeline-backend#1123)
- feat(component,ai,gemini): add image generation support
(instill-ai/pipeline-backend#1122)

Co-authored-by: donch1989 <441005+donch1989@users.noreply.github.com>