don't let user set target release with unreplicated artifacts #8321

@davepacheco

I'm trying an upgrade on a4x2, and tuf_artifact_replication seems either not to be fully working or to be running slowly. I was still able to set the target release and proceed with the upgrade. The system behaves fairly well: it just gets stuck executing a blueprint, with output like this:

note: using Nexus URL http://[fd00:1122:3344:101::6]:12221
task: "blueprint_executor"
  configured period: every 1m
  currently executing: iter 83, triggered by a periodic timer firing
    started at 2025-06-11T20:19:47.485Z, running for 26067ms
  last completed activation: iter 82, triggered by a periodic timer firing
    started at 2025-06-11T20:18:47.489Z (86s ago) and ran for 51024ms
    target blueprint: 7ff85197-6be4-482c-8ded-c8b114ca07eb
    execution:        enabled
    status:           completed (14 steps)
    warning:          at: Deploy sled configs: Failed to put OmicronSledConfig {
                          generation: Generation(
                              9,
                          ),
...
                      } to sled 2d190199-1a3a-419c-8f07-13e00352306e: Error Response: status: 400 Bad Request; headers: {"content-type": "application/json", "x-request-id": "c62efedb-0d2f-47f0-90ef-0edb51c90944", "content-length": "210", "date": "Wed, 11 Jun 2025 20:18:51 GMT"}; value: Error { error_code: None, message: "sled config failed artifact store existence checks: Artifact be6aab2e39fcf5882e94e749ddd394eae45a322deac0528edae8143f9d53fed5 not found", request_id: "c62efedb-0d2f-47f0-90ef-0edb51c90944" }
    error:            (none)


Eventually, in at least some cases so far, the artifact does show up, and then execution succeeds and the upgrade continues. So the system is handling it about as well as it can, but I imagine we want to prevent users from starting an upgrade when the artifacts aren't replicated everywhere.

This is admittedly tricky: you could add a sled in the middle of an upgrade, and that shouldn't stop it. We would probably also need a way to override this check if we've got some busted sled. But if we just check this at the point where you set the target release, that might be useful.
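
To make the proposal concrete, here is a minimal sketch of what such a check could look like at the point where the target release is set. Everything in it is an assumption for illustration: the `SledId` and `ArtifactHash` aliases, both function names, and the `ignored_sleds` override are hypothetical and are not the real Nexus types or APIs. It only shows the shape of the logic: compare the artifacts the release requires against what each sled reports, skipping any sled the operator has chosen to ignore.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical stand-ins for illustration; not the real Omicron types.
type SledId = String;
type ArtifactHash = String;

/// For each sled that is not ignored, returns the required artifacts that
/// the sled does not yet hold. An empty map means every checked sled has
/// every required artifact.
fn unreplicated_artifacts(
    required: &BTreeSet<ArtifactHash>,
    sled_artifacts: &BTreeMap<SledId, BTreeSet<ArtifactHash>>,
    ignored_sleds: &BTreeSet<SledId>,
) -> BTreeMap<SledId, BTreeSet<ArtifactHash>> {
    sled_artifacts
        .iter()
        // Override: skip sleds the operator has marked as ignorable.
        .filter(|(sled, _)| !ignored_sleds.contains(*sled))
        .filter_map(|(sled, present)| {
            let absent: BTreeSet<ArtifactHash> =
                required.difference(present).cloned().collect();
            if absent.is_empty() {
                None
            } else {
                Some((sled.clone(), absent))
            }
        })
        .collect()
}

/// Hypothetical gate run once, when the target release is set.
/// Returns an error describing which sleds are still missing artifacts.
fn check_can_set_target_release(
    required: &BTreeSet<ArtifactHash>,
    sled_artifacts: &BTreeMap<SledId, BTreeSet<ArtifactHash>>,
    ignored_sleds: &BTreeSet<SledId>,
) -> Result<(), String> {
    let missing =
        unreplicated_artifacts(required, sled_artifacts, ignored_sleds);
    if missing.is_empty() {
        return Ok(());
    }
    let detail: Vec<String> = missing
        .iter()
        .map(|(sled, absent)| {
            format!("sled {} is missing {} artifact(s)", sled, absent.len())
        })
        .collect();
    Err(format!(
        "target release not fully replicated: {}",
        detail.join("; ")
    ))
}
```

Because the gate in this sketch runs only when the target release is set, a sled added in the middle of an upgrade would not trip it, and the `ignored_sleds` set is one possible shape for the override for a busted sled.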
