Add support for referencing Tasks in git #2298

Closed
bobcatfish opened this issue Mar 26, 2020 · 27 comments
Labels
area/roadmap Issues that are part of the project (or organization) roadmap (usually an epic) kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@bobcatfish
Collaborator

bobcatfish commented Mar 26, 2020

Expected Behavior

In #1839 we are adding support for referencing Tasks that are stored in OCI image repos. We should also be able to reference Tasks that are stored in git, e.g.

apiVersion: tekton.dev/v1alpha1
kind: TaskRun
metadata:
  name: my-task-run
spec:
  taskRef:
    git:
      url: https://github.com/my/repo
      commit: deadbeef
      path: path/to/my/task.yaml

Actual Behavior

Via #1839 we now support referencing versioned Tasks and Pipelines in OCI registries (https://github.com/tektoncd/pipeline/blob/master/docs/pipelines.md#tekton-bundles).

Use case

  • I would like to set up a triggering system that uses my git repo as the source of truth for my Pipelines and Tasks, so that they can be versioned alongside my code, and so that I can make changes to these Pipelines and Tasks in my PRs that are picked up and used by CI

For example I could make a TriggerTemplate like:

apiVersion: triggers.tekton.dev/v1alpha1
kind: TriggerTemplate
metadata:
  name: run-tests
spec:
  params:
  - name: commitish
    description: The commitish to grab the Pipeline from to run
    default: master
  resourcetemplates:
  - apiVersion: tekton.dev/v1beta1
    kind: PipelineRun
    metadata:
      generateName: run-tests-$(uid)-
    spec:
      pipelineRef:
        git:
          url: https://github.com/my/repo
          # This would let me include any changes to the Pipeline in the PR testing
          commit: $(params.commitish)
          path: path/to/my/pipeline.yaml

Additional Info

@vdemeester
Member

/kind feature

@tekton-robot tekton-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 27, 2020
@pierretasci

Just seeing this now. I think this is very doable without much additional overhead. Definitely worth looking into. There is also KPT (https://github.com/GoogleContainerTools/kpt), which promises something similar.

@tekton-robot
Collaborator

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Send feedback to tektoncd/plumbing.

@tekton-robot
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot
Collaborator

@tekton-robot: Closing this issue.

@tekton-robot tekton-robot added lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Aug 14, 2020
@vdemeester
Member

/remove-lifecycle rotten
/remove-lifecycle stale
/reopen

@tekton-robot
Collaborator

@vdemeester: Reopened this issue.

@tekton-robot tekton-robot reopened this Aug 17, 2020
@tekton-robot tekton-robot removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 17, 2020
@bobcatfish
Collaborator Author

Added some more detail to the description of this after I accidentally made a duplicate XD

@vdemeester
Member

vdemeester commented Nov 3, 2020

One thing that worries me is that it would open the door to a lot of "ways" to reference definitions.

  • For example, why do we support git and not mercurial, or darcs, or other VCSs?
  • Then, why not support http/https references, ftp, ssh, <insert-your-protocol>, …? (i.e. where do we start, where do we end)
  • Do we need some extension mechanism here to support more things that we didn't envision initially?
  • How much complexity does this add to the resolving part? Is this complexity worth putting in pipeline, or could this be solved by another component? Especially in terms of auth and secret management, both in implementation complexity and in usage complexity.
  • Given the provided example, we only point pipelineRef at the git repository. How do we envision this for the tasks used by this Pipeline (if the task definitions also live in the git repository)? How do we make the Pipeline usable locally (without triggers)?

One general thought on this too: the more ways we add to refer to definitions, the less portable the definition types (Pipeline, Task) become.

One question I am asking myself is how we can support your use case (which is more or less Pipeline as code) as of today, and/or without this feature (but with other features), for exploration purposes.
As of today there are already a number of possibilities:

  • tekton-asa-code is one approach. The trigger definition schedules a Task that does the job; in short, it takes whatever is in .tekton (configurable) and works in two modes:
    • per namespace: create a new namespace per run, apply the resources, execute, and report.
    • in the same namespace: embed everything in a PipelineRun, apply, execute, and report.
  • @afrittoli's work on plumbing (with triggers, etc.): see Mario Meets the Robocat and Cloud Native CD Pipelines with Tekton.

Without this feature, but with some changes:

  • tekton-asa-code could be updated to automatically package resources in a bundle and do the magic of referring to them from the PipelineRun spec, …
  • We could have Triggers do some work with Tekton OCI bundles given a specific configuration. It would be able to package the resources from a configured place (from whatever generated the event and is configured for it) and "adapt" the resources to refer to this bundle (using a commitish tag or, preferably, the image digest). This could be done by an interceptor, or by a Task (like tekton-asa-code does).
  • Standardize the above approach by having an official task that does this in the catalog 🙃
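
As a rough illustration of the bundle-based options above, the "adapted" PipelineRun that Triggers (or a Task) would emit might look something like the following. This is only a sketch: it assumes the feature-flagged `bundle` field on `pipelineRef` described in the Tekton Bundles docs linked in the issue description, and the registry, image name, and digest are placeholders rather than anything from this thread.

```yaml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  generateName: run-tests-
spec:
  pipelineRef:
    # Name of the Pipeline inside the bundle.
    name: run-tests
    # Placeholder image reference. Pinning by digest rather than by tag
    # means the run always resolves to exactly the packaged definitions.
    bundle: registry.example.com/my/repo-pipelines@sha256:<digest>
```

The packaging step itself (building and pushing the bundle from the repo contents) is left out here; that is the part an interceptor or Task would need to own.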

In short, I initially see more problems with having this in pipeline than without it. But as lots of people have said: No is temporary, Yes is forever.

/cc @tektoncd/core-maintainers @chmouel

@bobcatfish
Collaborator Author

> For example, why do we support git and not mercurial, or darcs, or other VCSs?

That's a good question - I think we'd have to choose which to support.

My current thought is that we start with git support and add others as people ask for them. Usually I've tried to prioritize making things like this pluggable; do we feel like that's important here? I feel like version control is so fundamental to CI/CD use cases that it's reasonable to have some built in version control support?

> How much complexity does this add to the resolving part? Is this complexity worth putting in pipeline, or could this be solved by another component? Especially in terms of auth and secret management, both in implementation complexity and in usage complexity.

Do you see this adding a lot of complexity? We've currently got git-init, and it seems like we do add features to it from time to time, but it seems worth the cost?

Are we talking about engineering effort, interface, or the time it takes to fetch the resources? Fetching would only happen once per reconcile and we could add caching if we wanted. The interface might be interesting to design but we've already had some experience via git-init.

> How do we envision this for the tasks used by this Pipeline (if the task definitions also live in the git repository)?

Good question - I could see this being up to the author. You could either parameterize the Task refs such that they use the same git commits, or you could refer to them in your cluster, or you could refer to them in an OCI registry.
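
A sketch of the first option (parameterized Task refs that follow the same commit as the Pipeline) might look like this. The `git` block under `taskRef` is the hypothetical syntax proposed in this issue, not an existing API, and the task name is made up for illustration.

```yaml
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: run-tests
spec:
  params:
  - name: commitish
    type: string
    description: The commitish to fetch Task definitions from
    default: master
  tasks:
  - name: unit-tests
    taskRef:
      # Hypothetical `git` field, as proposed in this issue.
      git:
        url: https://github.com/my/repo
        # Same commitish the Pipeline itself was fetched at.
        commit: $(params.commitish)
        path: path/to/my/task.yaml
```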

> How do we make the Pipeline usable locally (without triggers)?

You'd provide params for the location of the git repo and the commit. (Maybe this gets back to the request in TEP-0018 to have a default bundle; maybe we'd want a default/parameterizable git repo as well in this case. But to start with, I'd say folks could use params to specify this.)
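
For instance, a hand-written PipelineRun (no Triggers involved) could pass the commit explicitly. Again this is only a sketch using the hypothetical `git` field proposed in this issue; it assumes the Pipeline declares a `commitish` param, and the branch name is a placeholder.

```yaml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  generateName: run-tests-manual-
spec:
  pipelineRef:
    # Hypothetical `git` field, as proposed in this issue.
    git:
      url: https://github.com/my/repo
      commit: my-feature-branch
      path: path/to/my/pipeline.yaml
  params:
  # Forwarded to the Pipeline so its Task refs can use the same commitish.
  - name: commitish
    value: my-feature-branch
```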

> the more ways we add to refer to definitions, the less portable the definition types (Pipeline, Task) become.

Could you explain more about that? I think referring to Pipelines and Tasks within a cluster might be even worse (e.g. if someone deletes/changes a definition in the cluster)

> Without this feature, but with some changes

Both seem like they would work, but being able to refer to Tasks and Pipelines where they live in version control seems like a simple, elegant solution that would require a lot less to get up and running, and to reason about once it's running.

If I or someone else is able to make a POC at some point, that might help.

@vdemeester
Member

> My current thought is that we start with git support and add others as people ask for them. Usually I've tried to prioritize making things like this pluggable; do we feel like that's important here? I feel like version control is so fundamental to CI/CD use cases that it's reasonable to have some built in version control support?

Some questions: is the use of tektoncd/pipeline limited to version control? The current answer is definitely no. Should it be? I would answer no here too: tektoncd/pipeline should be as unopinionated as it can be in terms of "where data comes from and where it goes".

> Do you see this adding a lot of complexity? We've currently got git-init, and it seems like we do add features to it from time to time, but it seems worth the cost?
>
> Are we talking about engineering effort, interface, or the time it takes to fetch the resources? Fetching would only happen once per reconcile and we could add caching if we wanted. The interface might be interesting to design but we've already had some experience via git-init.

I am talking about engineering effort and usage/interface (user experience, verbosity, …), not about fetching resources, which is an implementation detail. Even though it is a detail, fetching should happen once for a specific run, not once per reconcile (just as it should for normal fetches and OCI bundles); after that we should use the object itself to look at the definition, which is less racy.

> How do we envision this for the tasks used by this Pipeline (if the task definitions also live in the git repository)?
>
> Good question - I could see this being up to the author. You could either parameterize the Task refs such that they use the same git commits, or you could refer to them in your cluster, or you could refer to them in an OCI registry.

Well that's the "problem": it has to be simple for the user. If the user has to adapt their pipeline to be able to parameterize this, it makes the pipeline less shareable (kinda). This is very similar to what the current PipelineResource design does: when you use the GitResource, for example, you are stuck with it, and you have to write another Task to use an input other than a git repository.

> How do we make the Pipeline usable locally (without triggers)?
>
> You'd provide params for the location of the git repo and the commit. (Maybe this gets back to the request in TEP-0018 to have a default bundle; maybe we'd want a default/parameterizable git repo as well in this case. But to start with, I'd say folks could use params to specify this.)

By locally, I mean going from "the source on my laptop" to a running pipeline, without making any commit. Right now, it is possible through tooling: I apply my definition, I find a way to populate a workspace with my data (volume, …), and I run my Pipeline with it. If the Pipeline uses a git reference notation to get task definitions, how do I test local changes to my task(s), except by changing the pipeline definition itself?

> the more ways we add to refer to definitions, the less portable the definition types (Pipeline, Task) become.
>
> Could you explain more about that? I think referring to Pipelines and Tasks within a cluster might be even worse (e.g. if someone deletes/changes a definition in the cluster)

The more choices you have for referring to something, the bigger the "matrix" of problems you encounter. The example above is one of them. It's not about where we refer to things from, but about how many possible ways we have to refer to things and how this affects the ability to author and share tasks, pipelines, ….

Note that, in this reflection, I am only looking from the tektoncd/pipeline point of view, not that of a full-fledged CI/CD system (which tektoncd/pipeline is not; it's just a component). "Should tektoncd/pipeline be a full-fledged CI/CD system?" is a question that we may want to discuss too. "Should tektoncd provide a full-fledged CI/CD system?" is another question.

Note that, just like with PipelineResource, I am trying to make us think and discuss really hard before implementing new features in tektoncd/pipeline if they are solvable by other components.

@bobcatfish
Collaborator Author

> I am trying to make us think and discuss really hard before implementing new features in tektoncd/pipeline if they are solvable by other components

Excellent! I hope we can be as rigorous with all the new features we add :D

Responding to you inline has helped me come up with a different way to present this feature which I think might help!

I want to assert that:

  1. Version control is a key element of continuous delivery (and CI)
  2. (1) is not limited to just your source code, but also your configuration (i.e. "keep absolutely everything in version control")

Both of these recommendations are backed up by every canonical piece of literature in the space I've encountered, so given that our mission is to create components for CI/CD:

  1. I think it's reasonable to assume that most of the time there is version control involved in the activities of or surrounding execution of Pipelines and Tasks
  2. The best practice we should recommend and support is for people to store their Pipeline and Task definitions in version control

We currently only support getting the definitions in (2) from a cluster or from an OCI registry.

This means that (if you agree with the above!) although we know folks will be storing these definitions in version control, we're saying they need to "do something" with those definitions before they can use them, i.e. apply them to a cluster or upload them to a registry.

The feature I'm proposing here is to recognize that folks will be storing these definitions in version control, and not require that they have to then do some extra thing with them to use them. (And this elegantly solves some other use cases like using changes to the Pipelines and Tasks that are made in the same PR.)

(I also assert that referencing Tasks and Pipelines in cluster actually buys us very little - which we have especially seen with needing to add CRD types like ClusterTask - which even then don't give us the scoping we want)

> if the user has to adapt their pipeline to be able to parameterize this, it makes the pipeline less shareable (kinda)

I'm not sure how this is worse than the current state: you can create Pipelines that refer to Tasks that only exist in your own cluster or your own OCI registry.

> If the Pipeline uses a git reference notation to get task definitions, how do I test local changes to my task(s), except by changing the pipeline definition itself?

Today if you're making changes to a Task, you need to either apply it to your cluster or upload it to the registry, right?

If you apply an updated Task to the cluster (presumably your own private cluster), you have to name it the same as the Pipeline is expecting or edit the Pipeline. If you upload it to the registry, you have to either use the same name/label/version as the Pipeline is expecting, or update the Pipeline.

If we had support for referring to Tasks in git, supporting this scenario via requiring a change to the Pipeline doesn't seem much different to me?

To me this points more toward having some kinda "local mode", maybe via the CLI, which is able to look for Task and Pipeline definitions on the filesystem (currently not supported at all) - which would probably require being able to override at runtime where Tasks and Pipelines are pulled from - something we might want to consider even without version control support.

@afrittoli
Member

If we consider the location you fetch a task/pipeline from to be a runtime concern, I think supporting multiple sources would not hinder the reusability of tasks and pipelines.

When running a pipelinerun/taskrun one has to specify the pipeline/task ref, which could be cluster, OCI, git... and perhaps even path in a workspace?

For pipeline tasks, the task name is part of the pipeline definition, but where to look for that could again be a runtime concern.

@vdemeester
Member

> This means that (if you agree with the above!) although we know folks will be storing these definitions in version control, we're saying they need to "do something" with those definitions before they can use them, i.e. apply them to a cluster or upload them to a registry.
>
> The feature I'm proposing here is to recognize that folks will be storing these definitions in version control, and not require that they have to then do some extra thing with them to use them. (And this elegantly solves some other use cases like using changes to the Pipelines and Tasks that are made in the same PR.)
>
> (I also assert that referencing Tasks and Pipelines in cluster actually buys us very little - which we have especially seen with needing to add CRD types like ClusterTask - which even then don't give us the scoping we want)

I agree with the definitions above, but they apply to a system (a CI system). tektoncd/pipeline is a component, not a full CI system, and I see handling this case (definitions living in version control) as a responsibility of the system, not necessarily of the tektoncd/pipeline component.

To try to make my point a bit clearer, I want to draw a small parallel with Pod and Deployment here. A Pod doesn't have to support all the concerns; for example, a Pod (spec) doesn't have anything related to replica counts or rollout strategy, because those are not its concern, they are the Deployment's concern. I feel some features (like this one) might not belong in the tektoncd/pipeline API and might be better achieved by tooling or higher-level constructs.

Of course, as a user, I expect to use a CI/CD system (or build one) that allows me to store my definitions, etc., in a version control system. But it doesn't mean it has to be supported by tektoncd/pipeline instead of something else (another component that is part of my CI system).

Which brings me back again to this: I am only looking from the tektoncd/pipeline point of view, not that of a full-fledged CI/CD system (which tektoncd/pipeline is not; it's just a component). "Should tektoncd/pipeline be a full-fledged CI/CD system?" is one question. "Should Tekton (the community, the tektoncd org) provide a full-fledged CI/CD system?" is another one.

> Today if you're making changes to a Task, you need to either apply it to your cluster or upload it to the registry, right?
>
> If you apply an updated Task to the cluster (presumably your own private cluster), you have to name it the same as the Pipeline is expecting or edit the Pipeline. If you upload it to the registry, you have to either use the same name/label/version as the Pipeline is expecting, or update the Pipeline.
>
> If we had support for referring to Tasks in git, supporting this scenario via requiring a change to the Pipeline doesn't seem much different to me?

It really depends on the tool you use 😉. If I use a tool that bundles everything into taskSpec and pipelineSpec, I never update any task on my cluster: I edit my yaml, and I let the tool do its thing. This is what tekton-asa-code does, for example.
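
To make the "bundle everything into taskSpec and pipelineSpec" idea concrete, here is a minimal sketch of what such a tool might generate. It is not taken from tekton-asa-code's actual output; the task name, image, and script are placeholders.

```yaml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  generateName: run-tests-
spec:
  # The whole Pipeline is inlined, so nothing has to exist in the cluster
  # (or in a registry) before the run is created.
  pipelineSpec:
    tasks:
    - name: unit-tests
      # The Task is inlined as well, instead of being referenced by name.
      taskSpec:
        steps:
        - name: test
          image: busybox
          script: |
            echo "running tests"
```

Because every definition travels with the run, editing the YAML locally and regenerating the PipelineRun is enough to test a change.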

> For pipeline tasks, the task name is part of the pipeline definition, but where to look for that could again be a runtime concern.

I also tend to agree with @afrittoli, that maybe where to look could be a runtime concern (with failover mechanisms, …).

Does where to look for the definition need to be something a Pipeline/Task definition author has to take care of? Should it even be there? (And on this, I am glad bundles are still hidden behind a feature flag.)
This is, imo, a critical question to answer, because depending on the answer, we may not need the bundle part in Pipeline, for example, and we would need to generalize the approach taken in the TEP-0018 "Allow a Run to Specify a Default Bundle" proposal.

> To me this points more toward having some kinda "local mode", maybe via the CLI, which is able to look for Task and Pipeline definitions on the filesystem (currently not supported at all) - which would probably require being able to override at runtime where Tasks and Pipelines are pulled from - something we might want to consider even without version control support.

It's more that this might make "version control support" not necessary at all.

As a summary, I feel answering the following questions is very important to be able to discuss this, and other tektoncd/pipeline features:

  • Does where to look for the definition need to be something a Pipeline/Task definition author has to take care of? Should it even be there?
  • Do we consider tektoncd/pipeline to be a full-fledged CI/CD system? Should it be?
  • Should Tekton (the community, the tektoncd org) provide a full-fledged CI/CD system?

@vdemeester
Member

> • Do we consider tektoncd/pipeline to be a full-fledged CI/CD system? Should it be?
> • Should Tekton (the community, the tektoncd org) provide a full-fledged CI/CD system?

Note that those need to be answered and clearly stated on tekton.dev, our community repository, …, so that users discovering Tekton don't get the wrong impression 🙃

@jstrachan

BTW here's the solution we've been using in the Jenkins X community to work around there being no native support yet for referencing tasks + steps in git and overriding them... https://jenkins-x.io/blog/2021/02/25/gitops-pipelines/

we're using the ko and mink trick of using a custom image URI for now.

it would obviously be better to add this explicitly into the tekton CRDs some day.

The part we've found really useful is being able to reuse all the steps in a task, or just a named step or several named steps, while adding customisations before/after/between the steps and overriding steps too.

So it's basically a (purposely) simple overlay mechanism where we can import steps from tasks referenced in git and override them locally.

@vdemeester
Member

> BTW here's the solution we've been using in the Jenkins X community to work around there being no native support yet for referencing tasks + steps in git and overriding them... https://jenkins-x.io/blog/2021/02/25/gitops-pipelines/
>
> we're using the ko and mink trick of using a custom image URI for now.
>
> it would obviously be better to add this explicitly into the tekton CRDs some day.

Would it be, though? (In the tektoncd/pipeline CRDs, I mean.) We are back to the following questions:

  • Do we consider tektoncd/pipeline to be a full-fledged CI/CD system? Should it be?
  • Should Tekton (the community, the tektoncd org) provide a full-fledged CI/CD system?

The fact that this is "not" supported in tektoncd/pipeline today (only in-cluster references, references by OCI ref, and embedded specs are supported) allows tools and products using some tektoncd components to experiment with their own solutions and use what works best for them. The more "opinion" we put into the core (tektoncd/pipeline), the less it is a component and the more it is a product.

Overall, I am all for supporting this in Tekton (aka in a project in tektoncd), but I am a bit worried about supporting this in the tektoncd/pipeline component, at least for now.

> The part we've found really useful is being able to reuse all the steps in a task, or just a named step or several named steps, while adding customisations before/after/between the steps and overriding steps too.
>
> So it's basically a (purposely) simple overlay mechanism where we can import steps from tasks referenced in git and override them locally.

Reading https://jenkins-x.io/blog/2021/02/25/gitops-pipelines/, right now you are abusing image and stepTemplate to be able to pick up all or some of the steps from another Task definition (be it in-cluster, in the catalog, …), am I right? I feel this is/was the authoring part of tektoncd/community#316 (cc @bobcatfish), but I do like that approach; it seems like a very lightweight and "customizable" replacement for PipelineResources (cc @jerop)

@tekton-robot
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 30, 2021
@tekton-robot
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 29, 2021
@bobcatfish
Collaborator Author

Discussion has been continuing around this topic via the experimental workflows project (and related projects such as pipeline as code and the WG), and also via TEP-0060 remote resolution (and tektoncd/community#493).

/lifecycle frozen
/remove-lifecycle rotten
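
For readers following the TEP-0060 pointer: remote resolution, as it later shipped in Tekton Pipelines, expresses the reference through a named resolver plus parameters rather than a dedicated `git` field. A sketch of the example from the issue description in that style is below. It assumes the built-in git resolver and its `url`, `revision`, and `pathInRepo` parameters as described in the Tekton documentation; availability and required feature flags depend on the Pipelines version, so check the docs for the release you run.

```yaml
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  name: my-task-run
spec:
  taskRef:
    # Resolver-based reference instead of the `git:` block proposed above.
    resolver: git
    params:
    - name: url
      value: https://github.com/my/repo
    - name: revision
      value: deadbeef
    - name: pathInRepo
      value: path/to/my/task.yaml
```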

@tekton-robot tekton-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Aug 10, 2021
@jerop jerop added the area/roadmap Issues that are part of the project (or organization) roadmap (usually an epic) label Feb 17, 2022
@jerop jerop moved this to In Progress in Tekton Pipelines Roadmap Feb 17, 2022
@abayer
Contributor

abayer commented Aug 11, 2022

/assign

This will be integrated into Pipeline as part of #4710.

@lbernick
Member

Closed by #4710

Repository owner moved this from Todo to Done in Tekton Community Roadmap Dec 21, 2022
Repository owner moved this from In Progress to Done in Tekton Pipelines Roadmap Dec 21, 2022