Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix invalid URI Encoding in API request handling #465

Merged
merged 14 commits into from
Jul 13, 2024

Conversation

delano
Copy link
Collaborator

@delano delano commented Jul 13, 2024

User description

The changes introduce a new middleware Rack::HandleInvalidUTF8 to handle invalid UTF-8 characters in HTTP requests. This middleware checks various parts of the request environment for invalid UTF-8 sequences and raises an exception if any are found, returning a 400 Bad Request response. Additionally, the existing Rack::HandleInvalidPercentEncoding middleware is enhanced to handle exceptions more gracefully and log errors.

Key changes include:

  • Addition of lib/middleware/handle_invalid_utf8.rb to define the new middleware.
  • Extensive tests in try/07_handle_invalid_utf8_try.rb to validate the new middleware's functionality.
  • Enhancements to lib/middleware/handle_invalid_percent_encoding.rb to improve error handling and logging.
  • Introduction of lib/onetime/app/base.rb to provide configuration options for UTF-8 and URI encoding checks.
  • Updates to config.ru to include the new middleware in the middleware stack.
  • Addition of inline descriptions in tryout files.
  • Modifications to lib/onetime/app/api.rb and lib/onetime/app/api/base.rb to include the new settings and improve relative path requires.
  • Enabling Performance/Size cop in .rubocop.yml for better code quality.

PR Type

Enhancement, Tests, Documentation


Description

  • Added Rack::HandleInvalidUTF8 middleware to handle invalid UTF-8 characters in HTTP requests.
  • Enhanced Rack::HandleInvalidPercentEncoding middleware for better error handling and logging.
  • Introduced AppSettings module for configuring UTF-8 and URI encoding checks.
  • Updated tests to cover new middleware functionalities and configurations.
  • Added detailed descriptions to various tryout files for better documentation.
  • Enabled Performance/Size cop in .rubocop.yml for improved code quality.

Changes walkthrough 📝

Relevant files
Enhancement
7 files
config.ru
Add UTF-8 handling middleware and improve logging               

config.ru

  • Added Rack::HandleInvalidUTF8 middleware to handle invalid UTF-8
    characters.
  • Ensured immediate flushing of stdout for real-time logging.
  • +7/-0     
    handle_invalid_percent_encoding.rb
    Improve error handling and logging in percent encoding middleware

    lib/middleware/handle_invalid_percent_encoding.rb

  • Enhanced to handle exceptions more gracefully.
  • Added logging improvements.
  • Introduced check_enabled? method for route-specific checks.
  • +48/-18 
    handle_invalid_utf8.rb
    Add middleware to handle invalid UTF-8 characters               

    lib/middleware/handle_invalid_utf8.rb

  • Introduced new middleware to handle invalid UTF-8 characters.
  • Added logging and error handling for invalid UTF-8 sequences.
  • Implemented route-specific checks for UTF-8 validation.
  • +143/-0 
    api.rb
    Integrate AppSettings module for encoding checks                 

    lib/onetime/app/api.rb

  • Included AppSettings module for UTF-8 and URI encoding checks.
  • Set default values for check_utf8 and check_uri_encoding.
  • +5/-0     
    base.rb
    Update relative path requires in API base                               

    lib/onetime/app/api/base.rb

    • Updated relative path requires for better maintainability.
    +3/-1     
    base.rb
    Add AppSettings module for encoding middleware configuration

    lib/onetime/app/base.rb

  • Added AppSettings module for configuration of encoding checks.
  • Provided class-level accessors for check_utf8 and check_uri_encoding.
  • +33/-0   
    base.rb
    Update relative path requires in web base                               

    lib/onetime/app/web/base.rb

    • Updated relative path requires for better maintainability.
    +1/-1     
    Tests
    3 files
    06_handle_invalid_percent_encoding_try.rb
    Update tests for percent encoding middleware                         

    try/06_handle_invalid_percent_encoding_try.rb

  • Updated tests to reflect changes in percent encoding middleware.
  • Ensured middleware sets correct content type in error responses.
  • +9/-9     
    07_handle_invalid_utf8_try.rb
    Add tests for UTF-8 handling middleware                                   

    try/07_handle_invalid_utf8_try.rb

  • Added tests for the new UTF-8 handling middleware.
  • Covered various scenarios including invalid UTF-8 in headers and body.

  • +117/-0 
    23_app_settings_try.rb
    Add tests for AppSettings module                                                 

    try/23_app_settings_try.rb

  • Added tests for AppSettings module.
  • Covered default values and setting/getting configurations.
  • +73/-0   
    Documentation
    16 files
    10_utils_try.rb
    Document utility function tests                                                   

    try/10_utils_try.rb

    • Added detailed descriptions for utility function tests.
    +16/-0   
    11_entropy_try.rb
    Document entropy module tests                                                       

    try/11_entropy_try.rb

    • Added detailed descriptions for entropy module tests.
    +13/-0   
    15_config_try.rb
    Document configuration tests                                                         

    try/15_config_try.rb

    • Added detailed descriptions for configuration tests.
    +12/-0   
    17_mail_validation.rb
    Document email validation tests                                                   

    try/17_mail_validation.rb

    • Added detailed descriptions for email validation tests.
    +14/-1   
    20_metadata_try.rb
    Document metadata class tests                                                       

    try/20_metadata_try.rb

    • Added detailed descriptions for metadata class tests.
    +14/-0   
    21_secret_try.rb
    Document secret class tests                                                           

    try/21_secret_try.rb

    • Added detailed descriptions for secret class tests.
    +13/-0   
    22_value_encryption_try.rb
    Document value encryption tests                                                   

    try/22_value_encryption_try.rb

    • Added detailed descriptions for value encryption tests.
    +18/-0   
    23_passphrase_try.rb
    Document passphrase handling tests                                             

    try/23_passphrase_try.rb

    • Added detailed descriptions for passphrase handling tests.
    +12/-0   
    25_customer_try.rb
    Document customer model tests                                                       

    try/25_customer_try.rb

    • Added detailed descriptions for customer model tests.
    +17/-0   
    30_session_try.rb
    Document session management tests                                               

    try/30_session_try.rb

    • Added detailed descriptions for session management tests.
    +17/-0   
    35_ratelimit_try.rb
    Document rate limiting tests                                                         

    try/35_ratelimit_try.rb

    • Added detailed descriptions for rate limiting tests.
    +16/-0   
    40_email_template_try.rb
    Document email template tests                                                       

    try/40_email_template_try.rb

    • Added detailed descriptions for email template tests.
    +13/-0   
    50_subdomain_try.rb
    Document subdomain functionality tests                                     

    try/50_subdomain_try.rb

    • Added detailed descriptions for subdomain functionality tests.
    +16/-0   
    60_logic_base_try.rb
    Document base logic tests                                                               

    try/60_logic_base_try.rb

    • Added detailed descriptions for base logic tests.
    +16/-0   
    61_logic_destroy_account_try.rb
    Document account destruction tests                                             

    try/61_logic_destroy_account_try.rb

    • Added detailed descriptions for account destruction tests.
    +15/-0   
    68_receive_feedback_try.rb
    Document feedback receiving tests                                               

    try/68_receive_feedback_try.rb

    • Added detailed descriptions for feedback receiving tests.
    +15/-0   
    Configuration changes
    1 files
    .rubocop.yml
    Enable Performance/Size cop in RuboCop configuration         

    .rubocop.yml

    • Enabled Performance/Size cop for better code quality.
    +1/-0     

    💡 PR-Agent usage:
    Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions

    delano added 12 commits July 13, 2024 06:26
    Separated the exception handling into a separate method for better
    readability.
    Adds Rack middleware to sanitize requests, logging and returning 400
    errors for invalid UTF-8 in headers or bodies. Tests cover a variety of
    encoding issues and ensure valid requests pass through unchanged. Now
    requests with malformed encoding will be rejected consistently according
    to standards while properly encoded content continues to be served.
    The middleware has been updated check request parameters, headers, and
    body content for invalid UTF-8 sequences. Each potential source is now
    validated and an error is raised early if invalid encoding is detected.
    This avoids silently handling requests with bad data and ensures only
    valid encoded content is passed to the application.
    
    - The middleware no longer extends `Rack::UTF8Sanitizer` and instead
    implements custom validation logic
    - An array of all relevant request sources is defined to iterate over
    for validation
    - A new method checks each value and raises an error if invalid UTF-8 is
    found
    - On failure, a descriptive error message identifies the failed
    parameter
    - This change hopefully avoids potential encoding issues down the
    application stack. And unnecessary processing of invalid data user
    headaches.
    We only want to check uri and utf encodings for API routes, not for web
    routes. This change adds a new AppSettings module to define the
    middleware flags for each route. The flags are then used to determine
    whether to check for valid encodings.
    
    The middleware for handling invalid percent encoding and UTF-8
    validation has been refactored to:
    
    - Add specific check for Onetime app routes to determine configuration
    - Improve naming (URI encoding rather than percent encoding)
    - Check request parameters for valid encodings only when configured at
    the app/route level
    - Move shared AppSettings module to a common location
    - Improve logging with more contextual messages
    
    These changes:
    
    - Use more precise terminology for encoding validation
    - Avoid unnecessary checks when not configured
    - Centralize reusable settings definition
    - Enhance debuggability with improved logging
    
    Tests have been updated to demonstrate the more nuanced behavior.
    Need to explicitly enable the check in unit tests. There's no app or
    route to hang logic off of.
    • Remove rack-utf8_sanitizer from Gemfile
    • Update Gemfile.lock
    @delano delano self-assigned this Jul 13, 2024
    @delano delano added improvement Issues or pull requests that involve improvements to the project. documentation Issues or pull requests related to documentation. tech debt Addressing future rework costs due to quick, suboptimal solutions taken previously. maintenance Issues or tasks related to maintenance of the project. tests Issues or pull requests that involve testing code. dx Developer experience working with the project. labels Jul 13, 2024
    @delano delano changed the title Fix invalid %-encoding in POST request handling Fix invalid URI Encoding in API request handling Jul 13, 2024
    @codiumai-pr-agent-pro codiumai-pr-agent-pro bot added the Review effort [1-5]: 4 Pull requests that require a very high level of review effort. label Jul 13, 2024
    Copy link
    Contributor

    PR Reviewer Guide 🔍

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 PR contains tests
    🔒 No security concerns identified
    ⚡ Key issues to review

    Code Duplication
    The method handle_exception in handle_invalid_percent_encoding.rb and handle_invalid_utf8.rb seems to have duplicated logic for handling exceptions. Consider refactoring this into a shared module or superclass to reduce duplication and improve maintainability.

    Debugging Code
    The file contains a require 'pry-byebug' statement, which is typically used for debugging. This should not be included in production code as it can lead to performance issues and potentially expose sensitive information.

    Exception Handling
    The method check_and_raise_for_invalid_utf8 raises a generic StandardError for invalid UTF-8 input. It would be better to define a custom exception class for clarity and to allow more precise rescue statements elsewhere in the application.

    Middleware Configuration
    The middleware Rack::HandleInvalidUTF8 is added globally in the config.ru file without any environment-specific checks. This might not be necessary for all environments (e.g., testing) and could lead to unexpected behavior or performance issues in those contexts.

    Copy link
    Contributor

    codiumai-pr-agent-pro bot commented Jul 13, 2024

    PR Code Suggestions ✨

    CategorySuggestion                                                                                                                                    Score
    Thread safety
    Use thread-local storage for variables that might be modified in a multi-threaded context

    It's recommended to use instance variables with caution in a multi-threaded
    environment like a web server. Consider using thread-local variables or other
    thread-safe mechanisms if these variables (@check_utf8, @check_uri_encoding) are
    meant to be modified at runtime.

    lib/onetime/app/api.rb [10-11]

    -@check_utf8 = true
    -@check_uri_encoding = true
    +Thread.current[:check_utf8] = true
    +Thread.current[:check_uri_encoding] = true
     
    • Apply this suggestion
    Suggestion importance[1-10]: 9

    Why: Using thread-local variables enhances thread safety, which is crucial in a multi-threaded environment like a web server. This suggestion addresses a significant potential issue.

    9
    Performance
    Optimize the environment handling to avoid unnecessary duplication

    Instead of duplicating the entire environment, consider using a more efficient
    method to check for invalid UTF-8 without altering the original env. This can
    improve performance, especially for large environments.

    lib/middleware/handle_invalid_utf8.rb [61-65]

    -shallow_copy = env.dup
     self.class.input_sources.each do |key|
    -  next unless shallow_copy.key?(key)
    -  check_and_raise_for_invalid_utf8(shallow_copy[key], key)
    +  next unless env.key?(key)
    +  check_and_raise_for_invalid_utf8(env[key], key)
     
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: The suggestion correctly identifies an opportunity to improve performance by avoiding unnecessary duplication of the environment. This change is efficient and maintains the functionality.

    8
    Best practice
    ✅ Change from require_relative to require for external gems to avoid path issues
    Suggestion Impact:The suggestion to use `require` instead of `require_relative` for the 'pry-byebug' gem was implemented, as the line with `require 'pry-byebug'` was removed.

    code diff:

     require_relative '../lib/onetime'
    -require 'pry-byebug';

    Consider using require instead of require_relative for loading the 'pry-byebug' gem.
    This is because require is typically used for external libraries and gems, and it
    can help avoid issues related to relative paths in different environments.

    try/30_session_try.rb [21]

    -require 'pry-byebug';
    +require 'pry-byebug'
     
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: Using require for external gems is a best practice to avoid path issues, especially in different environments. This suggestion improves code reliability.

    8
    Enhancement
    ✅ Enhance logging details for better issue tracking

    Improve the logging detail by including more context about the error and the
    affected part of the request. This can help in quicker identification and resolution
    of issues.

    lib/middleware/handle_invalid_utf8.rb [130]

    -logger.info "[handle-invalid-utf8] #{message}"
    +logger.info "[handle-invalid-utf8] Error occurred: #{message}. Affected URI: #{env['REQUEST_URI']}"
     

    [Suggestion has been applied]

    Suggestion importance[1-10]: 7

    Why: The suggestion improves the logging detail by including more context about the error and the affected part of the request, which can aid in quicker identification and resolution of issues. This is a useful enhancement for debugging.

    7
    Maintainability
    Add comments for clarity and maintainability in RuboCop configuration

    It's good practice to keep the RuboCop configuration organized and maintain a
    consistent structure. Consider adding a comment above the new Performance/Size
    setting to describe its purpose or the reason for enabling it, similar to other
    settings in the file.

    .rubocop.yml [66]

    +# Enable Performance/Size to ensure efficient code
     Enabled: true
     
    • Apply this suggestion
    Suggestion importance[1-10]: 7

    Why: Adding comments improves the maintainability and clarity of the configuration file, making it easier for future developers to understand the purpose of settings.

    7
    ✅ Check for potential method conflicts after including helpers in API base

    Since the require_relative '../helpers' is added, ensure that any methods or modules
    included from Onetime::App::Helpers are not causing namespace conflicts or
    overriding methods unintentionally in Onetime::App::API::Base.

    lib/onetime/app/api/base.rb [1]

    -require_relative '../helpers'
    +require_relative '../helpers'  # Ensure no conflicts with Onetime::App::API::Base methods
     

    [Suggestion has been applied]

    Suggestion importance[1-10]: 5

    Why: While the suggestion to check for method conflicts is valid, it doesn't provide a concrete improvement to the code. It is more of a reminder for the developer.

    5

    Co-authored-by: codiumai-pr-agent-pro[bot] <151058649+codiumai-pr-agent-pro[bot]@users.noreply.github.com>
    Signed-off-by: Delano <1206+delano@users.noreply.github.com>
    Copy link
    Collaborator Author

    @delano delano left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    👌

    @delano delano merged commit 8592e94 into develop Jul 13, 2024
    9 checks passed
    @delano delano deleted the 463-utf-8-encoding-error branch July 13, 2024 17:47
    delano added a commit that referenced this pull request Jul 13, 2024
    commit b00226a
    Merge: 8592e94 1603016
    Author: Delano <1206+delano@users.noreply.github.com>
    Date:   Sat Jul 13 14:13:55 2024 -0700
    
        Merge pull request #466 from onetimesecret/delano/20240705-logging
    
    commit 1603016
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 11:28:46 2024 -0700
    
        [#20240705] Add common logger to production apps
    
    commit dced9d1
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 11:26:33 2024 -0700
    
        [#20240705] Fix for tryout checking updated log content
    
    commit 7b67bc7
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 11:26:04 2024 -0700
    
        [#20240705] Rewind rack input stream in handle_invalid_percent_encoding middleware
    
    commit 692e6da
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 11:15:06 2024 -0700
    
        [#20240705] Update logging to include REQUEST_URI
    
    commit 940470b
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 11:12:13 2024 -0700
    
        [#20240705] Simplify middleware setup in config.ru
    
        • Move HeaderLoggerMiddleware to Rack namespace
        • Update logging to use OT.info
        • Remove redundant middleware in dev setup
    
    commit 65305f4
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 11:07:57 2024 -0700
    
        [#20240705] Improve header logging middleware for debugging
    
        The HeaderLoggerMiddleware was renamed to Rack::HeaderLoggerMiddleware
        and modified to log at debug level instead of info.
    
    commit 18a7837
    Merge: dcbc98c 8592e94
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 11:02:53 2024 -0700
    
        [#20240705] Merge branch 'develop' into delano/20240705-logging
    
    commit 8592e94
    Merge: 1028fa8 4c9485b
    Author: Delano <1206+delano@users.noreply.github.com>
    Date:   Sat Jul 13 10:47:26 2024 -0700
    
        Merge pull request #465 from onetimesecret/463-utf-8-encoding-error
    
        Fix invalid URI Encoding in API request handling
    
    commit 4c9485b
    Author: Delano <1206+delano@users.noreply.github.com>
    Date:   Sat Jul 13 10:38:50 2024 -0700
    
        Update lib/onetime/app/api/base.rb
    
        Co-authored-by: codiumai-pr-agent-pro[bot] <151058649+codiumai-pr-agent-pro[bot]@users.noreply.github.com>
        Signed-off-by: Delano <1206+delano@users.noreply.github.com>
    
    commit 118046d
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 10:34:58 2024 -0700
    
        [#463] Remove errant pry-byebug requires
    
    commit 610c039
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 10:27:32 2024 -0700
    
        [#463] Remove rack-utf8_sanitizer gem
    
        • Remove rack-utf8_sanitizer from Gemfile
        • Update Gemfile.lock
    
    commit c74e25c
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 10:21:49 2024 -0700
    
        [#463] Grab some snacks and enjoy the docs!
    
    commit c8fe5df
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 10:20:46 2024 -0700
    
        [#463] Add tryouts for AppSettings
    
    commit 8c094d8
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 10:09:35 2024 -0700
    
        [#463] Fix for uri encoding tryout (after name change)
    
    commit 77d8813
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 10:09:08 2024 -0700
    
        [#463] Fix for utf8 encoding tryout
    
        Need to explicitly enable the check in unit tests. There's no app or
        route to hang logic off of.
    
    commit 2a7e446
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 10:05:56 2024 -0700
    
        [#463] Run utf8 check before uri encoding
    
    commit 302985f
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 10:04:52 2024 -0700
    
        [#463] Add AppSettings module for middleware flags
    
        We only want to check uri and utf encodings for API routes, not for web
        routes. This change adds a new AppSettings module to define the
        middleware flags for each route. The flags are then used to determine
        whether to check for valid encodings.
    
        The middleware for handling invalid percent encoding and UTF-8
        validation has been refactored to:
    
        - Add specific check for Onetime app routes to determine configuration
        - Improve naming (URI encoding rather than percent encoding)
        - Check request parameters for valid encodings only when configured at
        the app/route level
        - Move shared AppSettings module to a common location
        - Improve logging with more contextual messages
    
        These changes:
    
        - Use more precise terminology for encoding validation
        - Avoid unnecessary checks when not configured
        - Centralize reusable settings definition
        - Enhance debuggability with improved logging
    
        Tests have been updated to demonstrate the more nuanced behavior.
    
    commit 125c4ba
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 08:26:25 2024 -0700
    
        [#463] Tidy
    
    commit 958be41
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 08:04:17 2024 -0700
    
        [#463] Validate UTF-8 encoding in request parameters too
    
        The middleware has been updated check request parameters, headers, and
        body content for invalid UTF-8 sequences. Each potential source is now
        validated and an error is raised early if invalid encoding is detected.
        This avoids silently handling requests with bad data and ensures only
        valid encoded content is passed to the application.
    
        - The middleware no longer extends `Rack::UTF8Sanitizer` and instead
        implements custom validation logic
        - An array of all relevant request sources is defined to iterate over
        for validation
        - A new method checks each value and raises an error if invalid UTF-8 is
        found
        - On failure, a descriptive error message identifies the failed
        parameter
        - This change hopefully avoids potential encoding issues down the
        application stack. And unnecessary processing of invalid data user
        headaches.
    
    commit 4676c3c
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 07:08:08 2024 -0700
    
        [#463] Add UTF-8 sanitization middleware to handle invalid encodings
    
    commit d7d4b86
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 07:07:27 2024 -0700
    
        [#463] Handle invalid UTF-8 in requests
    
        Adds Rack middleware to sanitize requests, logging and returning 400
        errors for invalid UTF-8 in headers or bodies. Tests cover a variety of
        encoding issues and ensure valid requests pass through unchanged. Now
        requests with malformed encoding will be rejected consistently according
        to standards while properly encoded content continues to be served.
    
    commit 9c2efb4
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 06:26:03 2024 -0700
    
        [#463] Cleanup middleware implementation
    
        Separated the exception handling into a separate method for better
        readability.
    
    commit 1028fa8
    Merge: d2aa9c6 2ecb142
    Author: Delano <1206+delano@users.noreply.github.com>
    Date:   Fri Jul 12 15:49:06 2024 -0700
    
        Merge pull request #464 from onetimesecret/463-fix-invalid-encoding-in-post-request-handling
    
        Fix invalid %-encoding in POST request handling
    
    commit 2ecb142
    Author: delano <delano@onetimesecret.com>
    Date:   Fri Jul 12 15:09:01 2024 -0700
    
        [#463] Fix invalid percent encoding
    
        i.e. "invalid %-encoding" error.
    
        Integrated a middleware to handle invalid percent encodings by
        normalizing characters. This avoids potential errors from corrupted
        query parameters and ensures consistency across different parsers. Added
        to development environment for debugging and added a logging middleware
        for additional visibility.
    
    commit b4459cb
    Author: delano <delano@onetimesecret.com>
    Date:   Fri Jul 12 15:05:03 2024 -0700
    
        [#463] Add middleware to handle percent-encoding error
    
        Includes tryouts.
    
        ---
    
        The `Rack::HandleInvalidPercentEncoding` middleware addresses a complex issue in web applications: handling malformed percent-encoded data in HTTP requests. Percent-encoding is used to represent special characters in URLs and form data, but when it's implemented incorrectly by clients, it can lead to parsing errors and potential security vulnerabilities.
    
        This middleware tackles several challenges:
    
        1. Detecting invalid percent-encoding early in the request processing pipeline.
        2. Preventing these errors from causing application crashes or unpredictable behavior.
        3. Providing meaningful feedback to API consumers or end-users.
        4. Logging the issue for monitoring and debugging purposes.
    
        Rather than attempting to guess the correct encoding or silently fixing the input, which could lead to data integrity issues or mask potential attacks, this middleware takes a straightforward and secure approach:
    
        1. It intercepts the request before it reaches the main application.
        2. It attempts to parse the request parameters, which will trigger an error if invalid percent-encoding is present.
        3. If an error is detected, it immediately returns a 400 Bad Request response with a clear error message.
        4. It logs the error for server-side visibility.
    
        ---
    
        Prior art:
    
        Invalid % Encoding #2159
        ruby-grape/grape#2159
    
        invalid %-encoding error in application for malformed uri #337
        rack/rack#337
    
        invalid %-encoding Rack backtrace
        https://gist.github.com/matiaskorhonen/4f67425a065fd12b43b8
    
        Invalid or incomplete POST parameters
        https://thomasleecopeland.com/2018/08/12/invalid-or-incomplete-post-parameters.html
    
        Rails/Rack: "ArgumentError: invalid %-encoding" for POST data
        https://stackoverflow.com/questions/15769681/rails-rack-argumenterror-invalid-encoding-for-post-data/24615414#24615414
    
        invalid %-encoding error in application for malformed uri
        https://gist.github.com/bf4/d26259acfa29f3b9882b#file-exception_app-rb
    
    commit 52146f4
    Author: delano <delano@onetimesecret.com>
    Date:   Fri Jul 12 14:31:14 2024 -0700
    
        [#463] Undo Rubocop autostyling for new tryouts
    
    commit 1047cfd
    Author: delano <delano@onetimesecret.com>
    Date:   Fri Jul 12 14:30:26 2024 -0700
    
        [#463] Exclude migrations, tryouts from RuboCop analysis
    
    commit 129d964
    Author: delano <delano@onetimesecret.com>
    Date:   Fri Jul 12 14:29:09 2024 -0700
    
        [#463] Add tryouts to replicate behaviour
    
        e.g. Raising `invalid %-encoding` errors in request.params.
    
        Signed-off-by: delano <delano@onetimesecret.com>
    
    commit c85720c
    Author: delano <delano@onetimesecret.com>
    Date:   Fri Jul 12 06:52:20 2024 -0700
    
        [#463] Add powershell example
    
    commit dcbc98c
    Author: delano <delano@onetimesecret.com>
    Date:   Thu Jul 11 15:27:10 2024 -0700
    
        [#20240705] Improve logging middleware for debugging
    
        Moved the HeaderLoggerMiddleware code into its own file to clean up
        config.ru. Also added documentation comments to the middleware class to
        explain its purpose and recommended usage.
    
        The HeaderLoggerMiddleware is intended for development environments to
        log HTTP request headers, which can help debug proxy and load balancer
        configs. It was previously defined inline in config.ru, making that file
        more cluttered. Extracting it to its own file in the lib/middleware
        folder results in cleaner code organization without impacting
        functionality.
    
    commit 5fa285e
    Author: delano <delano@onetimesecret.com>
    Date:   Thu Jul 11 15:01:28 2024 -0700
    
        [#20240705] Add prompts folder to ignored files
    
        The .gitignore file was updated to include ignoring new folders containing dialogue prompts and configuration for the Caddy web server. This prevents unrelated files from being tracked in source control and keeps the repository focused on code.
    
    commit c3b1e71
    Author: delano <delano@onetimesecret.com>
    Date:   Thu Jul 11 14:53:19 2024 -0700
    
        [#20240705] Improve debugging with logging headers
    
        Add middleware to log request headers to syslog for all requests. This
        provides valuable context when debugging issues, helping to understand
        origin of requests and behaviour of upstream services. The middleware is
        added to both development and production configurations so header data
        is consistently captured.
    
        Header logging provides:
    
        - Origin information like client IP, forwarded headers
        - Request metadata like HTTP method and URL
        - Insight into headers from upstream services
    
        This focuses logging on important requests attributes without cluttering
        logs with request body content. With just the header info, issues can
        often be resolved more quickly.
    delano added a commit that referenced this pull request Jul 16, 2024
    commit 073c16d
    Merge: 297379e 9348097
    Author: Delano <1206+delano@users.noreply.github.com>
    Date:   Sun Jul 14 12:05:58 2024 -0700
    
        Merge pull request #475 from onetimesecret/473-make-bundle-install-optional-in-entrypointsh
    
        Make bundle install optional in entrypoint.sh
    
    commit 9348097
    Author: delano <delano@onetimesecret.com>
    Date:   Sun Jul 14 11:58:32 2024 -0700
    
        [#473] Update bundle config for docker
    
        - Set Bundler's `without` config to skip development and test groups,
        avoiding unnecessary packages in the runtime image.
        - Updated the Gemfile.lock and bundled all dependencies to ensure
        consistency across versions.
        - Bumped the application version from 0.13.0 to 0.15.0 to make up for
        lost time.
    
    commit 8d995e1
    Author: delano <delano@onetimesecret.com>
    Date:   Sun Jul 14 11:56:35 2024 -0700
    
        [#473] Make bundle install optional (via BUNDLE_INSTALL)
    
        Also update shebang in entrypoint.sh to use bash instead of sh.
    
    commit 297379e
    Merge: 7957a44 03e88a9
    Author: Delano <1206+delano@users.noreply.github.com>
    Date:   Sun Jul 14 08:36:26 2024 -0700
    
        Merge pull request #472 from onetimesecret/471-improve-middleware-logging-and-configuration-handling
    
        Refactor middleware, logging, and configuration handling
    
    commit 03e88a9
    Author: delano <delano@onetimesecret.com>
    Date:   Sun Jul 14 08:31:42 2024 -0700
    
        [#471] Remove legacy cli command (register_build)
    
    commit 2aaaf13
    Author: delano <delano@onetimesecret.com>
    Date:   Sun Jul 14 08:31:13 2024 -0700
    
        [#471] Delay even requiring sentry until enabled (in dev)
    
    commit 1a201a0
    Author: delano <delano@onetimesecret.com>
    Date:   Sun Jul 14 08:27:46 2024 -0700
    
        [#471] Formatting
    
        And remove auto-loading debugger in dev + debug mode.
    
    commit 0f64cd1
    Author: delano <delano@onetimesecret.com>
    Date:   Sun Jul 14 07:26:44 2024 -0700
    
        [#471] Rename BUILD.yml to VERSION.yml for clarity
    
    commit bc74241
    Author: delano <delano@onetimesecret.com>
    Date:   Sun Jul 14 07:21:05 2024 -0700
    
        [#471] Improve initialization performance and clarity
    
        The initialization routine was restructured for improved performance and
        readability. Key changes:
    
        - Load configuration earlier for dependencies
        - Extract initializer steps into private methods named for their purpose
        - Prepare emailer, secrets, rate limiting and fortunes loading before
        other steps
        - Reduce duplication around locale loading
        - Print locales loaded in banner for easier debugging
    
    commit 42f5cbf
    Author: delano <delano@onetimesecret.com>
    Date:   Sun Jul 14 07:07:50 2024 -0700
    
        [#471] Simplify bootup sequence
    
        Refactor the Onetime.boot! to improve readability by extracting logic
        into private methods with descriptive names. Separate configuration,
        initialization, and connection tasks.
    
    commit 1eea4b1
    Author: delano <delano@onetimesecret.com>
    Date:   Sun Jul 14 07:06:35 2024 -0700
    
        [#471] Simplify middleware logistics at start-time
    
    commit 9fa4d0c
    Author: delano <delano@onetimesecret.com>
    Date:   Sun Jul 14 07:05:53 2024 -0700
    
        [#471] Set default log level for middlewares
    
    commit 7957a44
    Merge: 4178d4e 7ba1033
    Author: Delano <1206+delano@users.noreply.github.com>
    Date:   Sat Jul 13 16:48:19 2024 -0700
    
        Merge pull request #470 from onetimesecret/469-unexpected-404-errors-for-api-and-colonel-endpoints
    
        Standardize app attachment and rack 'em up (with route tests)
    
    commit 7ba1033
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 16:44:45 2024 -0700
    
        [#469] Add basic tryouts for web, API and colonel routes
    
        Added a new test file at try/90_routes_smoketest.rb to verify the basic
        functionality of key routes for the web application, API and colonel
        interfaces. The test uses Rack::MockRequest to simulate HTTP requests
        and check for expected status codes and response formats without
        exercising full functionality. This helps catch any major breakage in
        the core routing of these interfaces before more extensive tests are
        added.
    
    commit 828197f
    Merge: 83141a9 4178d4e
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 16:20:43 2024 -0700
    
        [#469] Merge branch 'develop' into 469-unexpected-404-errors-for-api-and-colonel-endpoints
    
    commit 4178d4e
    Merge: b00226a 50d570c
    Author: Delano <1206+delano@users.noreply.github.com>
    Date:   Sat Jul 13 16:18:40 2024 -0700
    
        Merge pull request #468 from onetimesecret/467-undefined-method-check_uri_encoding-error-in-handleinvalidpercentencoding-middleware
    
        Fix middleware app setting checks
    
    commit 83141a9
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 16:08:23 2024 -0700
    
        [#469] Disable logging tryouts (temporary
    
    commit 831570c
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 15:49:51 2024 -0700
    
        [#469] Standardize app attachment in rack 'em up
    
        Move common middleware attachment logic to inside map block for
        consistency across environments. This consolidates duplicate code and
        ensures apps are always attached with the same base set of middleware in
        both dev and prod.
    
    commit 50d570c
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 15:35:26 2024 -0700
    
        [#467] Include new AppSettings in web, colonel apps
    
    commit 98a8e44
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 15:14:44 2024 -0700
    
        [#467] Fix middleware app setting checks
    
        A change to how middleware checks for Otto app settings to determine if
        percent encoding and UTF-8 validation should be applied. Previously the
        check returned true if the app had settings defined, but this allowed
        validation to be applied even if the specific setting was disabled. The
        changes now only return true if both the app has settings AND the
        relevant setting is explicitly enabled, avoiding unnecessary validation.
    
    commit b00226a
    Merge: 8592e94 1603016
    Author: Delano <1206+delano@users.noreply.github.com>
    Date:   Sat Jul 13 14:13:55 2024 -0700
    
        Merge pull request #466 from onetimesecret/delano/20240705-logging
    
    commit 1603016
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 11:28:46 2024 -0700
    
        [#20240705] Add common logger to production apps
    
    commit dced9d1
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 11:26:33 2024 -0700
    
        [#20240705] Fix for tryout checking updated log content
    
    commit 7b67bc7
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 11:26:04 2024 -0700
    
        [#20240705] Rewind rack input stream in handle_invalid_percent_encoding middleware
    
    commit 692e6da
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 11:15:06 2024 -0700
    
        [#20240705] Update logging to include REQUEST_URI
    
    commit 940470b
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 11:12:13 2024 -0700
    
        [#20240705] Simplify middleware setup in config.ru
    
        • Move HeaderLoggerMiddleware to Rack namespace
        • Update logging to use OT.info
        • Remove redundant middleware in dev setup
    
    commit 65305f4
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 11:07:57 2024 -0700
    
        [#20240705] Improve header logging middleware for debugging
    
        The HeaderLoggerMiddleware was renamed to Rack::HeaderLoggerMiddleware
        and modified to log at debug level instead of info.
    
    commit 18a7837
    Merge: dcbc98c 8592e94
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 11:02:53 2024 -0700
    
        [#20240705] Merge branch 'develop' into delano/20240705-logging
    
    commit 8592e94
    Merge: 1028fa8 4c9485b
    Author: Delano <1206+delano@users.noreply.github.com>
    Date:   Sat Jul 13 10:47:26 2024 -0700
    
        Merge pull request #465 from onetimesecret/463-utf-8-encoding-error
    
        Fix invalid URI Encoding in API request handling
    
    commit 4c9485b
    Author: Delano <1206+delano@users.noreply.github.com>
    Date:   Sat Jul 13 10:38:50 2024 -0700
    
        Update lib/onetime/app/api/base.rb
    
        Co-authored-by: codiumai-pr-agent-pro[bot] <151058649+codiumai-pr-agent-pro[bot]@users.noreply.github.com>
        Signed-off-by: Delano <1206+delano@users.noreply.github.com>
    
    commit 118046d
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 10:34:58 2024 -0700
    
        [#463] Remove errant pry-byebug requires
    
    commit 610c039
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 10:27:32 2024 -0700
    
        [#463] Remove rack-utf8_sanitizer gem
    
        • Remove rack-utf8_sanitizer from Gemfile
        • Update Gemfile.lock
    
    commit c74e25c
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 10:21:49 2024 -0700
    
        [#463] Grab some snacks and enjoy the docs!
    
    commit c8fe5df
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 10:20:46 2024 -0700
    
        [#463] Add tryouts for AppSettings
    
    commit 8c094d8
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 10:09:35 2024 -0700
    
        [#463] Fix for uri encoding tryout (after name change)
    
    commit 77d8813
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 10:09:08 2024 -0700
    
        [#463] Fix for utf8 encoding tryout
    
        Need to explicitly enable the check in unit tests. There's no app or
        route to hang logic off of.
    
    commit 2a7e446
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 10:05:56 2024 -0700
    
        [#463] Run utf8 check before uri encoding
    
    commit 302985f
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 10:04:52 2024 -0700
    
        [#463] Add AppSettings module for middleware flags
    
        We only want to check uri and utf encodings for API routes, not for web
        routes. This change adds a new AppSettings module to define the
        middleware flags for each route. The flags are then used to determine
        whether to check for valid encodings.
    
        The middleware for handling invalid percent encoding and UTF-8
        validation has been refactored to:
    
        - Add specific check for Onetime app routes to determine configuration
        - Improve naming (URI encoding rather than percent encoding)
        - Check request parameters for valid encodings only when configured at
        the app/route level
        - Move shared AppSettings module to a common location
        - Improve logging with more contextual messages
    
        These changes:
    
        - Use more precise terminology for encoding validation
        - Avoid unnecessary checks when not configured
        - Centralize reusable settings definition
        - Enhance debuggability with improved logging
    
        Tests have been updated to demonstrate the more nuanced behavior.
    
    commit 125c4ba
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 08:26:25 2024 -0700
    
        [#463] Tidy
    
    commit 958be41
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 08:04:17 2024 -0700
    
        [#463] Validate UTF-8 encoding in request parameters too
    
        The middleware has been updated check request parameters, headers, and
        body content for invalid UTF-8 sequences. Each potential source is now
        validated and an error is raised early if invalid encoding is detected.
        This avoids silently handling requests with bad data and ensures only
        valid encoded content is passed to the application.
    
        - The middleware no longer extends `Rack::UTF8Sanitizer` and instead
        implements custom validation logic
        - An array of all relevant request sources is defined to iterate over
        for validation
        - A new method checks each value and raises an error if invalid UTF-8 is
        found
        - On failure, a descriptive error message identifies the failed
        parameter
        - This change hopefully avoids potential encoding issues down the
        application stack. And unnecessary processing of invalid data user
        headaches.
    
    commit 4676c3c
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 07:08:08 2024 -0700
    
        [#463] Add UTF-8 sanitization middleware to handle invalid encodings
    
    commit d7d4b86
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 07:07:27 2024 -0700
    
        [#463] Handle invalid UTF-8 in requests
    
        Adds Rack middleware to sanitize requests, logging and returning 400
        errors for invalid UTF-8 in headers or bodies. Tests cover a variety of
        encoding issues and ensure valid requests pass through unchanged. Now
        requests with malformed encoding will be rejected consistently according
        to standards while properly encoded content continues to be served.
    
    commit 9c2efb4
    Author: delano <delano@onetimesecret.com>
    Date:   Sat Jul 13 06:26:03 2024 -0700
    
        [#463] Cleanup middleware implementation
    
        Separated the exception handling into a separate method for better
        readability.
    
    commit 1028fa8
    Merge: d2aa9c6 2ecb142
    Author: Delano <1206+delano@users.noreply.github.com>
    Date:   Fri Jul 12 15:49:06 2024 -0700
    
        Merge pull request #464 from onetimesecret/463-fix-invalid-encoding-in-post-request-handling
    
        Fix invalid %-encoding in POST request handling
    
    commit 2ecb142
    Author: delano <delano@onetimesecret.com>
    Date:   Fri Jul 12 15:09:01 2024 -0700
    
        [#463] Fix invalid percent encoding
    
        i.e. "invalid %-encoding" error.
    
        Integrated a middleware to handle invalid percent encodings by
        normalizing characters. This avoids potential errors from corrupted
        query parameters and ensures consistency across different parsers. Added
        to development environment for debugging and added a logging middleware
        for additional visibility.
    
    commit b4459cb
    Author: delano <delano@onetimesecret.com>
    Date:   Fri Jul 12 15:05:03 2024 -0700
    
        [#463] Add middleware to handle percent-encoding error
    
        Includes tryouts.
    
        ---
    
        The `Rack::HandleInvalidPercentEncoding` middleware addresses a complex issue in web applications: handling malformed percent-encoded data in HTTP requests. Percent-encoding is used to represent special characters in URLs and form data, but when it's implemented incorrectly by clients, it can lead to parsing errors and potential security vulnerabilities.
    
        This middleware tackles several challenges:
    
        1. Detecting invalid percent-encoding early in the request processing pipeline.
        2. Preventing these errors from causing application crashes or unpredictable behavior.
        3. Providing meaningful feedback to API consumers or end-users.
        4. Logging the issue for monitoring and debugging purposes.
    
        Rather than attempting to guess the correct encoding or silently fixing the input, which could lead to data integrity issues or mask potential attacks, this middleware takes a straightforward and secure approach:
    
        1. It intercepts the request before it reaches the main application.
        2. It attempts to parse the request parameters, which will trigger an error if invalid percent-encoding is present.
        3. If an error is detected, it immediately returns a 400 Bad Request response with a clear error message.
        4. It logs the error for server-side visibility.
    
        ---
    
        Prior art:
    
        Invalid % Encoding #2159
        ruby-grape/grape#2159
    
        invalid %-encoding error in application for malformed uri #337
        rack/rack#337
    
        invalid %-encoding Rack backtrace
        https://gist.github.com/matiaskorhonen/4f67425a065fd12b43b8
    
        Invalid or incomplete POST parameters
        https://thomasleecopeland.com/2018/08/12/invalid-or-incomplete-post-parameters.html
    
        Rails/Rack: "ArgumentError: invalid %-encoding" for POST data
        https://stackoverflow.com/questions/15769681/rails-rack-argumenterror-invalid-encoding-for-post-data/24615414#24615414
    
        invalid %-encoding error in application for malformed uri
        https://gist.github.com/bf4/d26259acfa29f3b9882b#file-exception_app-rb
    
    commit 52146f4
    Author: delano <delano@onetimesecret.com>
    Date:   Fri Jul 12 14:31:14 2024 -0700
    
        [#463] Undo Rubocop autostyling for new tryouts
    
    commit 1047cfd
    Author: delano <delano@onetimesecret.com>
    Date:   Fri Jul 12 14:30:26 2024 -0700
    
        [#463] Exclude migrations, tryouts from RuboCop analysis
    
    commit 129d964
    Author: delano <delano@onetimesecret.com>
    Date:   Fri Jul 12 14:29:09 2024 -0700
    
        [#463] Add tryouts to replicate behaviour
    
        e.g. Raising `invalid %-encoding` errors in request.params.
    
        Signed-off-by: delano <delano@onetimesecret.com>
    
    commit c85720c
    Author: delano <delano@onetimesecret.com>
    Date:   Fri Jul 12 06:52:20 2024 -0700
    
        [#463] Add powershell example
    
    commit dcbc98c
    Author: delano <delano@onetimesecret.com>
    Date:   Thu Jul 11 15:27:10 2024 -0700
    
        [#20240705] Improve logging middleware for debugging
    
        Moved the HeaderLoggerMiddleware code into its own file to clean up
        config.ru. Also added documentation comments to the middleware class to
        explain its purpose and recommended usage.
    
        The HeaderLoggerMiddleware is intended for development environments to
        log HTTP request headers, which can help debug proxy and load balancer
        configs. It was previously defined inline in config.ru, making that file
        more cluttered. Extracting it to its own file in the lib/middleware
        folder results in cleaner code organization without impacting
        functionality.
    
    commit 5fa285e
    Author: delano <delano@onetimesecret.com>
    Date:   Thu Jul 11 15:01:28 2024 -0700
    
        [#20240705] Add prompts folder to ignored files
    
        The .gitignore file was updated to include ignoring new folders containing dialogue prompts and configuration for the Caddy web server. This prevents unrelated files from being tracked in source control and keeps the repository focused on code.
    
    commit c3b1e71
    Author: delano <delano@onetimesecret.com>
    Date:   Thu Jul 11 14:53:19 2024 -0700
    
        [#20240705] Improve debugging with logging headers
    
        Add middleware to log request headers to syslog for all requests. This
        provides valuable context when debugging issues, helping to understand
        origin of requests and behaviour of upstream services. The middleware is
        added to both development and production configurations so header data
        is consistently captured.
    
        Header logging provides:
    
        - Origin information like client IP, forwarded headers
        - Request metadata like HTTP method and URL
        - Insight into headers from upstream services
    
        This focuses logging on important requests attributes without cluttering
        logs with request body content. With just the header info, issues can
        often be resolved more quickly.
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    documentation Issues or pull requests related to documentation. dx Developer experience working with the project. Enhancement improvement Issues or pull requests that involve improvements to the project. maintenance Issues or tasks related to maintenance of the project. Review effort [1-5]: 4 Pull requests that require a very high level of review effort. tech debt Addressing future rework costs due to quick, suboptimal solutions taken previously. tests Issues or pull requests that involve testing code.
    Projects
    Status: Done
    Development

    Successfully merging this pull request may close these issues.

    1 participant