Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support new predicate type for pipelineRun resource #824

Open
joejstuart opened this issue Jun 8, 2023 · 12 comments
Open

Support new predicate type for pipelineRun resource #824

joejstuart opened this issue Jun 8, 2023 · 12 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten

Comments

@joejstuart
Copy link
Contributor

joejstuart commented Jun 8, 2023

Feature request

Since the buildConfig field is removed in the SLSA v1 spec, I propose that chains supports a new predicate type to store the pipelineRun and expanded taskRun resources.
This issue was originally brought up in #750.

Use case

  • Verifying required tasks in a pipeline.
  • Verification of pipeline parameters.
  • Task steps need to run with certain images.
  • Tasks should be called with certain repos or bundles.
  • Verification of task parameters.
  • Task results need to be verified.
  • TaskRun order needs to be verified.

The SLSA V1 spec can meet some of these, but not all. For example, the taskRun results need the taskRun resources to be expanded and to verify the taskRun order, the pipelineRun status is needed. Therefore, a new predicate type that contains the pipelineRun status and expanded taskRuns is proposed. I also think including the pipelineRun spec would be a good idea, because that would allow for the whole pipelineRun to be evaluated at once.

@joejstuart joejstuart added the kind/feature Categorizes issue or PR as related to a new feature. label Jun 8, 2023
@wlynch
Copy link
Member

wlynch commented Jun 14, 2023

I'm generally fine with this as long as it's gated by config!

Is the idea roughly to dump the Run object as it's own intoto predicate? Might be worth throwing together an example. 🤔

/cc @chitrangpatel who might have feedback.

@chitrangpatel
Copy link
Contributor

chitrangpatel commented Jun 14, 2023

The SLSA1.0 predicate for chains is designed here.

TL;DR:

  1. We will store the entire pipelinerun/taskrun spec in externalParameters. This means that for fully-embedded tasks and pipelines you have most of the information right there. Although, this is not the resolved spec so you will not have access to the substitutions that were made by the pipelines controller.
  2. For remotely referenced pipelines and tasks, the remote references will be added to resolvedDependencies so you "could" access the remote resources for verification if required. (Although, we may not have covered the use case of on-cluster tasks and pipelines that are referenced in the task/pipeline runs 🤔 ).
  3. We will store the top level (task/pipeline) run results in byProducts. Is that enough or do you need to dig into the underlying taskrun results as well?

Would this design cover most of your use cases?

Note that we have already started to implement this design as a part of Issue #797

I agree with @wlynch that if there are major deviations from this design then we should definitely gate it by config into a new type. A couple of examples covering different use cases would definitely help here!!

@joejstuart
Copy link
Contributor Author

The SLSA1.0 predicate does cover most of the use cases. The two cases it doesn't cover are the need for the underlying taskRun results and the list of tasks the pipeline ran. The latter is in the pipelineRun spec only if the pipelineSpec is in-lined.

The proposal included the whole pipelineRun since having the whole resource in one document may make it easier to evaluate with a policy, but including the underlying taskRuns and pipelineRun status will meet our needs.

I will create some examples like in the document @chitrangpatel shared.

Thanks!

@joejstuart
Copy link
Contributor Author

joejstuart commented Jun 14, 2023

Here is an example proposal for the new predicate. The main purpose is so the task results can be evaluated and to capture task information that would only be included in the SLSAv1.0 predicate if the pipelineSpec is in-line. Some of the questions this can answer are inline. This is mostly the buildConfig from SLSAv0.2.

"predicate": {
  "tasks": [
    {
      "name": "task1",
      "metadata": {      # used as an alternative to identify the task
        "annotations": {
          "annotation1": "annotation1-value",
        },
        "labels": {     
          "label1": "label1-value"
        }
      },
      "after": [        # what order did the tasks run in?
        "task-before"
      ],
      "startedOn":  "12-01-2001", # how long did the task take?
      "finishedOn": "12-01-2001",
      "status":     "Success", # did the task succeed?
      "steps": [ # what did the task do?
        {
           "name": "step1",
           "cmd": "echo 1"
           "container": "task1-build",
           "image": "docker.io/library/image1",
        }
      ]
      "params": [ # what parameters were passed to the task?
        "name": "param1",
        "value": "param1-value"
      ],
      "results": [ # what are the results?
        {
          "name": "result1",
          "value": "result1-value"
        }
      ],
      "workspaces": [ # did the workspace use a pvc that may be shared?
        {
          "name": "workspace1",
          "persistentVolumeClaim": {
             "claimName": "claim1"
           }
        }
      ]
    }
  ]
}

@chitrangpatel
Copy link
Contributor

chitrangpatel commented Jun 15, 2023

Thanks @joejstuart for that example. Some more thoughts:
Do you want this predicate type to look exactly like above or have a more SLSA v1.0 like layout?
i.e. Move some of the above information into designated SLSAv1.0 fields? For eg. fields like metadata, startedOn, finishedOn and status seem like runDetails, results are more like byProducts and the referenced task/pipeline spec is probably a resolved dependency.

This is what I was imagining:

  1. For fully-inlined pipeline spec, that information will be a part of the run spec in externalParameters. That should have all the information you posted above except the times (started, finished), status and metadata.
  2. For embedded case, I was thinking that may be we should be including the referenced pipeline/task spec in the resolved dependencies section. resolvedDependencies allows us to store json content so we could make use of that. Again, that will get you everything except for the times (started, finished), status and metadata.
  3. Currently, we only store the results surfaced to the pipelineRun in byProducts but we could totally also store the results of the underlying pipeline tasks.
  4. We could store the times (started, finished), status and metadata of the individual taskruns in byProducts since its a little more flexible than runDetails.Metadata.

Alternatively, we could also store the complete resolved status as an item in byProducts since we could imagine the resolved status to be a byProduct of the pipeline.

PipelineRun (but as a json):

status:
  childReferences:
  - apiVersion: tekton.dev/v1beta1
    kind: TaskRun
    name: recipe-time-m2z8r-fetch-the-recipe
    pipelineTaskName: fetch-the-recipe
  - apiVersion: tekton.dev/v1beta1
    kind: TaskRun
    name: recipe-time-m2z8r-print-the-recipe
    pipelineTaskName: print-the-recipe
  completionTime: "2023-06-15T18:40:04Z"
  conditions:
  - lastTransitionTime: "2023-06-15T18:40:04Z"
    message: 'Tasks Completed: 2 (Failed: 0, Cancelled 0), Skipped: 0'
    reason: Succeeded
    status: "True"
    type: Succeeded
  pipelineSpec:
    tasks:
    - name: fetch-the-recipe
      taskRef:
        kind: Task
        name: fetch-secure-data
      workspaces:
      - name: shared-data
    - name: print-the-recipe
      runAfter:
      - fetch-the-recipe
      taskSpec:
        metadata: {}
        spec: null
        steps:
        - computeResources: {}
          image: ubuntu
          name: print-secrets
          script: cat $(workspaces.shared-data.path)/recipe.txt
  startTime: "2023-06-15T18:39:38Z"

Child references (but as a json):

status:
  completionTime: "2023-06-15T18:39:58Z"
  conditions:
  - lastTransitionTime: "2023-06-15T18:39:58Z"
    message: All Steps have completed executing
    reason: Succeeded
    status: "True"
    type: Succeeded
  podName: recipe-time-m2z8r-fetch-the-recipe-pod
  startTime: "2023-06-15T18:39:38Z"
  steps:
  - container: step-fetch-and-write
    imageID: docker.io/library/ubuntu@sha256:ac58ff7fe25edc58bdf0067ca99df00014dbd032e2246d30a722fa348fd799a5
    name: fetch-and-write
    terminated:
      containerID: containerd://90c3dffad86bb633c65f57bc9011a728e4823dd90e55574b0b4b03df46b1e977
      exitCode: 0
      finishedAt: "2023-06-15T18:39:57Z"
      reason: Completed
      startedAt: "2023-06-15T18:39:57Z"
  taskSpec:
    steps:
    - computeResources: {}
      image: ubuntu
      name: fetch-and-write
      script: |
        echo hi >> /workspace/shared-data/recipe.txt
    workspaces:
    - name: shared-data

...

WDYT?

Thoughts @wlynch, @lcarva @chuangw6 ?

@joejstuart
Copy link
Contributor Author

I'm mostly concerned with just having that data available. I also think having to use two attestations to evaluate a build is more of a pain than having it in one place and the externalParameters are already Tekton specific. So, I guess I'd lean towards storing the information in the SLSA attestation. I'm happy to create an example of this if that would be of any value? It might help me.

@chitrangpatel
Copy link
Contributor

My only friction with putting everything in one place is that it goes against the guidelines suggested by SLSA. I totally see the value in having it all in one place. I can see how a Tekton build verifier might benefit a lot from this. However generic SLSA verifiers will have trouble because it gets too Tekton-specific (especially if the information is not where its supposed to be according to SLSA).

Like @wlynch said, if we want to just dump the entire run object as its own predicate, we could do that but gated by config. And may be that could be of better use to Tekton verifiers.

But for SLSA predicate, we should try to be in-line with what their recommendations are.

@joejstuart
Copy link
Contributor Author

@chitrangpatel Sorry for the delay in my response. After looking at the SLSAV1.0 spec and reviewing what chains is capturing, I think your first comment makes a lot of sense to store the pipelineSpec/taskSpec (in the non-embedded case) in resolvedDependencies and the taskRun results in byProducts. With this information the SLSA provenance would be more complete imho.

As for the new predicate type, I think that would be beneficial at some point, but having the above information would be a higher priority for my use cases.

@chitrangpatel
Copy link
Contributor

Created an issue #834 to deal with pipelineSpec and taskSpec in resolved dependencies.

@joejstuart
Copy link
Contributor Author

Created an issue #834 to deal with pipelineSpec and taskSpec in resolved dependencies.

Thank you!

@tekton-robot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot
Copy link

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten
Projects
None yet
Development

No branches or pull requests

4 participants