
Comparing changes

Choose two branches to see what’s changed or to start a new pull request. You can also learn more about diff comparisons.

base repository: snakemake/snakemake-executor-plugin-kubernetes
base: v0.3.1
head repository: snakemake/snakemake-executor-plugin-kubernetes
compare: v0.3.2
  • 2 commits
  • 3 files changed
  • 3 contributors

Commits on Mar 6, 2025

  1. fix: Adding additional logic to handle resource limit requirements (#38)

    Hey @johanneskoester I found a bug with my GPU code. It turns out that
    in many default configurations for Kubernetes clusters there is a limit
    range or some other admission controller requiring both resource
    requests and resource limits when scaling to very large jobs. This
    ultimately prevents jobs from unbounded resource use. In some scenarios
    the admission controller will reject the pod at admission time and in
    others the pod dies when it tries to auto-assign some default limit that
    is insufficient. From what I can tell, it’s actually fairly difficult
    to catch these errors - sometimes the pods die silently, or it appears
    that the job never started. The other danger is that if the cluster
    doesn’t have the required limit ranges, the configuration may
    interpret this as permission to use infinite resources - leaving you
    with an uncomfortable compute bill.
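    
    For context, this is the kind of admission-time policy being described: a
    `LimitRange` that both auto-assigns default limits to pods that omit them
    and rejects pods that exceed a cap. The manifest below is a hypothetical
    illustration; the names and values are not from any real cluster.
    
    ```yaml
    # Hypothetical LimitRange; names and values are illustrative only.
    apiVersion: v1
    kind: LimitRange
    metadata:
      name: default-container-limits
    spec:
      limits:
        - type: Container
          # Auto-assigned as the limit when a pod omits resource limits;
          # a large job that needs more than this will be killed once it
          # exceeds the assigned default.
          default:
            cpu: "1"
            memory: 1Gi
          # Pods requesting more than this are rejected at admission time.
          max:
            cpu: "8"
            memory: 32Gi
    ```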
    
    So what I added is a new resource type called `scale`. This variable
    allows us to conditionally include resource limits - those limits being
    equal to the resource requests.
    
    - If `scale=True` (the default), we omit the limits entirely. This is how
    the plugin currently operates and will allow the pods to scale up as
    needed.
    - If `scale=False` we explicitly set the resource limits for each
    requested resource type.
    
    Hopefully this logic gives enough control to handle larger/specialized
    workloads and prevent unintended behavior.
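    
    The two cases above can be sketched as follows. This is a minimal sketch,
    not the plugin’s actual code: `build_resources` and its signature are
    hypothetical, and real request values would come from the job’s declared
    Snakemake resources.
    
    ```python
    def build_resources(requests: dict, scale: bool = True) -> dict:
        """Build a Kubernetes container ``resources`` spec (hypothetical helper).

        requests: resource requests, e.g. {"cpu": "2", "memory": "4Gi",
                  "nvidia.com/gpu": "1"}
        scale:    if True (the default), omit limits entirely so the pod can
                  scale up as needed; if False, set limits equal to the
                  requests for every requested resource type.
        """
        spec = {"requests": dict(requests)}
        if not scale:
            # Pin each requested resource: limits == requests, so admission
            # controllers that require explicit limits accept the pod.
            spec["limits"] = dict(requests)
        return spec
    ```
    
    With `scale=False`, every requested resource type gets a matching limit,
    which is what a `LimitRange`-style admission controller expects.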
    
    ## Summary by CodeRabbit
    
    - **New Features**
      - Improved resource allocation for Kubernetes container deployments:
        resource limits for CPU, memory, ephemeral storage, and GPU are now
        applied only when scaling is not enabled, offering more flexibility
        in managing container workloads.
    
    ---------
    
    Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
    mrburke00 and coderabbitai[bot] authored Mar 6, 2025
    Commit: 25819c5
  2. chore(main): release 0.3.2 (#39)

    🤖 I have created a release *beep* *boop*
    ---
    
    
    ## [0.3.2](v0.3.1...v0.3.2) (2025-03-06)
    
    
    ### Bug Fixes
    
    * Adding additional logic to handle resource limit requirements
      ([#38](#38)) ([25819c5](25819c5))
    
    ---
    This PR was generated with [Release
    Please](https://github.com/googleapis/release-please). See
    [documentation](https://github.com/googleapis/release-please#release-please).
    
    Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    github-actions[bot] authored Mar 6, 2025
    Commit: f9d2c7a