update common to fix eso #88

Closed
wants to merge 3 commits into from

Commits on Jul 8, 2023

  1. common automatic update (#84)

    * Updated namespaces template to include labels and annotations functionality
    
    * Added schema validation to support additional format for labels and annotations
    
    * Updated the values-example.yaml to include new format for namespaces
    
    * Updated Changes.md to include new namespaces functionality.
    
    * Updating CI tests
    
    * Fixed Markdown errors
    
    * Add an experimental letsencrypt chart
    
    This change adds an experimental letsencrypt chart that allows a pattern
    user/developer to have all routes and the API endpoint use signed
    certificates by letsencrypt.
    
    At this stage only AWS is supported. The full documentation is contained
    in the chart's README.md file
    
    * Do not run kubeconform on the certificate stuff just yet
    
    * Fix up kustomize example
    
    In the same vein as Industrial Edge 57f41dc135f72011d3796fe42d9cbf05d2b82052
    we call kustomize build.
    
    Newer gitops versions dropped the openshift-clients rpm by default which
    contained kubectl. Let's just invoke "kustomize" directly as the binary
    is present in both old and new gitops versions
    
    Since "kubectl kustomize" builds the set of resources by default, we
    need to switch to "kustomize build" by default
    
    We also use the same naming conventions used in Industrial Edge while
    we're at it.
    
    * Upgrade vault-helm to v0.24.0
    
    Tested on MCG with hub and spoke
    
    * Add a hello-world ansible playbook example
    
    Just a simple example that reads a helm value and puts it in a configmap
    
    * Inject ANSIBLE_CONFIG in make ansible-lint
    
    * Use new ansible-lint action
    
    * Fix some ansible-lint warnings
    
    * Fix up python versions
    
    * Skip cannot find role error
    
    Avoid checking those two playbooks, as the action seems to be too limited
    to understand where the ansible.cfg is
    
    * Added health check for pvc resource in argocd.yaml
    
    This allows argo to continue rolling out the rest of the applications.
    Without the health check the application is stuck in a progressing state
    and will not continue thus preventing any downstream application from
    deploying.
    
    * adding tests
    
    * Update super-linter image to latest
    
    * Update super-linter image to latest
    
    * Update CI workflows
    
    * updated template with why implemented comment
    
    * Add dependabot settings for github actions
    
    * adding tests
    
    * Added functionality to support the following format for labels and annotations:
          labels:
            openshift.io/node-selector: ""
          annotations:
            openshift.io/cluster-monitoring: "true"
    
    * Fixed CI Issues
    
    * Applying @claudiol recommendation
    
    * make test
    
    * Avoid exited containers proliferation
    
    When running the `pattern.sh` script multiple times, a lot of
    exited podman containers are left on the machine. Adding the
    `--rm` parameter to `podman run` makes podman automatically
    delete the exited containers, leaving the machine cleaner.
    
    * Handling of pre-release builds is too complex for a helm chart
    
    Generating the ICSP and allowing insecure registries is best done prior
    to helm upgrade, and requires VPN access to registry-proxy.engineering.redhat.com
    
    * Fixing issues with operator groups
    
    * Adding CI test
    
    * Updated operator group template
    
    * Updating CI issues
    
    * Removed duplicate code for operatorgroup by using multiple conditions
    
    * Allow overriding the pattern's name
    
    This is especially useful when multiple people are working on a pattern
    and have been using different names:
    
        $ make help |grep Pattern:
        Pattern: multicloud-gitops
        $ make NAME=foobar help |grep Pattern:
        Pattern: foobar
    
    * Add precise instruction to upgrade the vault subchart
    
    * Upgrade vault-helm to v0.24.1
    
    * Add an item to README.md
    
    * Fix up common/ tests
    
    * Fix super linter
    
    * Set gitOpsSpec.operatorSource
    
    After merging validatedpatterns/patterns-operator@235b303
    it is now effectively possible to pick a different catalogSource for
    the gitops operator. This is needed in order to allow CI to install
    the gitops operator from an IIB
    
    * Introduce EXTRA_HELM_OPTS
    
    This variable can be set in order to pass additional helm arguments from
    the command line, i.e. we can set things without having to tweak values files.
    So it is now possible to run something like the following:
    
      ./pattern.sh make install \
      EXTRA_HELM_OPTS="--set main.gitops.operatorSource=iib-49232"
    
    * Disable var-naming[no-role-prefix] in ansible lint
    
    * Add new ansible role to deal with IIBs
    
    * Simplify load-iib target
    
    * Add templates folder
    
    * Fix a couple of linting warnings
    
    * Fix some super-linter complaints
    
    * Skip the iib-ci playbook
    
    * Drop var-naming[no-role-prefix] linter
    
    * Allow for multiple images when calling load-iib
    
    * Add help for load-iib
    
    * Output index_image in make
    
    * Output index_image in make (2)
    
    * Set facts later in the playbook not in defaults/
    
    * Fix how we export vars in make load-iib
    
    * Fix how we export vars in make load-iib (2)
    
    * Use machineCount to register the number of nodes that need to be ready
    
    * Add helpful debug messages
    
    * Add | on shell now that we call pipefail
    
    * Test dropping nevercontact source
    
    * Skip insecure tls when logging in
    
    * Also allow gchr.io
    
    * Revert "Test dropping nevercontact source"
    
    This reverts commit d8746a37fce2663018f52203c892f00b825e32a7.
    
    * Fix typo
    
    * Clarify instructions in the README file
    
    * Automate the channel example
    
    * Find out KUBEADMINAPI programmatically
    
    * Use command instead of shell
    
    * Do not grep for operator bundle unless it is the gitops operator
    
    * Also whitelist ghcr.io
    
    * Fetch the operator bundle itself in a more robust way
    
    It seems that the operator bundle image itself is nowhere to be found
    inside any OCP cluster object (it's not in packagemanifests nor
    catalogsource). Resorting to parsing the IIB via opm alpha commands
    to fetch the exact image.
    
    * Add more mirrors
    
    * Some more work to support MCE
    
    * Cleanup spacing
    
    * Fix super-linter
    
    * Move task in right folder
    
    * Drop last mention of operator instead of item
    
    * Improve the grepping for the operator bundle
    
    Without also grepping for the default_channel we can end up getting
    multiple results, which breaks everything.
    
    Tested this and it fixed the issue I was seeing with the
    openshift-gitops-operator this morning
    
    * Drop display_skipped_hosts
    
    display_skipped_hosts=False has a horrible side-effect:
    When a task takes a long time, it is always the *next* task and not the
    one printed on the screen/log. That is because ansible has to wait for
    the task to finish before printing it, as it does not know beforehand if
    the host will be skipped and hence the task should not be displayed at
    all
    
    * Be more specific about the steps in the README
    
    * Upgrade ESO to v0.8.2
    
    * Update README.md
    
    * Update tests after eso 0.8.2 upgrade
    
    * Move to new spec format for dex/sso
    
    Via https://issues.redhat.com/browse/GITOPS-2761 we are told that the
    dex configuration has a new format.
    Old format:
    
        spec:
          dex:
            openShiftOAuth: true
            resources:
            ...
    
    New format:
    
        spec:
          sso:
            provider: dex
            dex:
              openShiftOAuth: true
              resources:
              ...
    
    This format is only supported starting with gitops-1.8.0, so we should
    merge this only when we are absolutely sure that no pattern needs an
    older gitops version in any situation.
    
    Tested on MCG with gitops-1.8.2
    
    Note: with this change gitops < 1.8 is not supported. Starting with
    gitops-1.9 the old format will be unsupported.
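
    The move from the old to the new layout amounts to nesting the existing
    `dex` block under `sso` and adding `provider: dex`. A minimal Python sketch
    of that transformation on a plain dict follows; the helper name
    `migrate_dex_spec` and the sample resource values are hypothetical and are
    not part of the chart or of this change.

```python
import copy


def migrate_dex_spec(spec: dict) -> dict:
    """Return a copy of an ArgoCD spec with spec.dex moved to spec.sso.dex.

    Hypothetical helper for illustration only. A spec without a
    top-level "dex" key is returned unchanged (as a deep copy).
    """
    new_spec = copy.deepcopy(spec)
    if "dex" not in new_spec:
        return new_spec
    dex = new_spec.pop("dex")            # remove the old top-level block
    sso = new_spec.setdefault("sso", {})
    sso["provider"] = "dex"              # new format names the provider explicitly
    sso["dex"] = dex                     # old settings are nested unchanged
    return new_spec


# Sample input in the old format (resource values are made up).
old = {"dex": {"openShiftOAuth": True, "resources": {"limits": {"cpu": "500m"}}}}
new = migrate_dex_spec(old)
print(new)
```

    The input dict is left untouched, so the old and new forms can be compared
    side by side.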
    
    * Disable ArgoCD from kubeconform
    
    The reason is that most of the tools we used to generate the json
    schema seem to be unmaintained, so it is getting hard to update
    our schemas in our GH org. We'll need to revisit this in the future.
    
    * Add a short line about username/token for the iib role on OCP <= 4.12
    
    * Drop https:// from podman login
    
    Seems we hit https://www.github.com/containers/podman/issues/13691 at
    least with older podman versions.
    
    If this turns out to break podman 4.5.0 I will special case it later
    
    * Set the mce-subscription-spec annotation
    
    We set it by default to "redhat-operators" and if defined to .Values.clusterGroup.subscriptions.acm.source
    The reason we do this is the following:
    1. In a default deployment scenario MCE has to be deployed as normal
       from the redhat-operators catalogSource just as ACM is
    2. When we deploy gitops-operator from an IIB instead, MCE would be
       installed trying to get it from the IIB because
       https://www.github.com/stolostron/multiclusterhub-operator/pull/975
       made it so that it picks the latest version looking at all catalog
       sources. But since we only mirrored the gitops operator in the
       cluster, this breaks as the images for MCE from the IIB are not there.
       By setting the default to "redhat-operators" we fix this case
    3. Now in the case where we want to install ACM from an IIB we need to
       be able to override this and we will pick whatever value is set in
       .Values.clusterGroup.subscriptions.acm.source, which will need to be
       defined for this to work when testing ACM+MCE from an IIB
    
    Note: Currently point 3. works only if you set it in a values file.
    Setting .Values.clusterGroup.subscriptions.acm.source via extraParams
    won't be passed down from the clusterGroup app to the applications.
    It's a bug that we need to fix.
    
    Note(2): We surround this with an 'if kindIs "map" .Values.clusterGroup.subscriptions'
    because we do not want to break things if subscription is a list and not
    a map. If we ever manage to drop subscriptions as list, then we can
    remove that if
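
    The selection logic described above can be sketched in Python as follows.
    This only mirrors the prose (default, override, and the map-versus-list
    guard); the function name `mce_catalog_source` is hypothetical, and
    treating an empty `source` as "not defined" is an assumption, not
    something taken from the template itself.

```python
from typing import Optional

DEFAULT_SOURCE = "redhat-operators"


def mce_catalog_source(values: dict) -> Optional[str]:
    """Pick the catalog source for the mce-subscription-spec annotation.

    Hypothetical helper: returns None when subscriptions is not a map,
    which corresponds to the annotation not being set at all.
    """
    subs = values.get("clusterGroup", {}).get("subscriptions")
    if not isinstance(subs, dict):
        # Same intent as the 'kindIs "map"' guard: skip list-style subscriptions.
        return None
    acm = subs.get("acm")
    if isinstance(acm, dict) and acm.get("source"):
        # Case 3: an explicit override, e.g. when testing ACM+MCE from an IIB.
        return acm["source"]
    # Cases 1 and 2: fall back to the regular catalog source.
    return DEFAULT_SOURCE


default_case = mce_catalog_source({"clusterGroup": {"subscriptions": {"acm": {}}}})
override_case = mce_catalog_source(
    {"clusterGroup": {"subscriptions": {"acm": {"source": "iib-49232"}}}}
)
list_case = mce_catalog_source({"clusterGroup": {"subscriptions": [{"name": "acm"}]}})
print(default_case, override_case, list_case)
```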
    
    * Fix typo in README for iib
    
    * Simplify the README a bit
    
    * Add support for extraParams being passed down to all applications
    
    Via validatedpatterns/patterns-operator#74
    we add the extraParams in an extraParametersNested dictionary that holds
    the extraParams key/value pairs. If they exist, let's add them as
    parameters.
    
    This allows them to end up in the applications.
    
    * Add a lookup playbook to figure out IIB numbers
    
    * Allow overriding channel and source when installing the patterns-operator
    
    This will allow us to test the patterns-operator using a different
    catalogsource (potentially installed via an IIB). So we can run:
    
    make EXTRA_HELM_OPTS="\
      --set main.extraParameters[0].name=main.patternsOperator.channel --set main.extraParameters[0].value=slow \
      --set main.extraParameters[1].name=main.patternsOperator.source --set main.extraParameters[1].value=patten-index" install
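
    Writing the indexed `--set` pairs by hand gets error-prone as more
    parameters are added. A small Python sketch that builds the same string
    from a mapping is shown here; the helper name `extra_helm_opts` is
    hypothetical and the parameter values are simply the ones from the example
    above.

```python
def extra_helm_opts(params: dict) -> str:
    """Build indexed main.extraParameters --set flags from name/value pairs.

    Hypothetical helper for illustration; relies on dicts keeping
    insertion order (Python 3.7+), so indices follow the input order.
    """
    flags = []
    for index, (name, value) in enumerate(params.items()):
        flags.append(f"--set main.extraParameters[{index}].name={name}")
        flags.append(f"--set main.extraParameters[{index}].value={value}")
    return " ".join(flags)


opts = extra_helm_opts(
    {
        "main.patternsOperator.channel": "slow",
        "main.patternsOperator.source": "patten-index",
    }
)
print(f'make EXTRA_HELM_OPTS="{opts}" install')
```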
    
    * Fix small typo in iib instructions
    
    * Drop a redirect and up retries when pushing the IIB to the internal registry
    
    * Update ESO to v0.8.3
    
    * WIP add presync for eso that waits for vault to be up
    
    * Add tests
    
    * Fix image and comment
    
    * Adding rbac to support the vault sa checking on the vault-0 pod status.
    
    * Make Test
    
    * Revert "Make Test"
    
    This reverts commit 64e9dc7.
    
    * Revert "Adding rbac to support the vault sa checking on the vault-0 pod status."
    
    This reverts commit 598bc74.
    
    * Revert "Fix image and comment"
    
    This reverts commit d4d3fe1.
    
    * Revert "Add tests"
    
    This reverts commit ab5532a.
    
    * Revert "WIP add presync for eso that waits for vault to be up"
    
    This reverts commit 2797699.
    
    * Increase the default retry limit when syncing
    
    ArgoCD will retry 5 times by default to sync an application in case of
    errors and then will give up. So if an application contains a reference
    to a CRD that has not been installed yet (say because it will be
    installed by another application), it will error out and retry later.
    This happens by default for a maximum of 5 times [1]. After those 5 times
    the application will give up and will stay in Degraded mode and
    eventually move to Failed. In this case a manual sync will usually fix
    the application just fine (i.e. as long as the missing CRD has been
    installed in the meantime).
    
    Now to solve this issue we can add complex preSync Jobs that wait for
    the needed resources, but this fundamentally breaks the simplicity of
    things and introduces unneeded dependencies. In this change we just
    increase the default retry limit to something larger (20) that should
    cover most cases. The retry limit functionality is rather undocumented
    currently in the docs but is defined at [2] and also shown at [3].
    
    In our patterns' case the concrete issue happened as follows:
    1. ESO ClusterSecrets were often not synced/degraded
    2. We introduced a Job in a preSync hook for the ESO chart that would
       wait on vault to be ready before applying the rest of ESO
    3. MCG started failing because the config-demo app had already tried to
       sync 5 times and failed every time because the ESO CRDs were not
       installed yet (due to ESO waiting on vault)
    
    So instead of adding yet another job, let's just try a lot more often.
    We picked 20 as a sane default because that should have argo try for
    about 60 minutes (3min is the default maximum backoff limit)
    
    Tested with two MCG installations (with the ESO Job hook included) and
    both worked out of the box. Whereas before I managed to get three
    failures out of three installs.
    
    [1] https://github.com/argoproj/argo-cd/blob/master/controller/appcontroller.go#L1680
    [2] https://github.com/argoproj/argo-cd/blob/master/manifests/crds/application-crd.yaml#L1476
    [3] https://github.com/argoproj/argo-cd/blob/master/docs/operator-manual/application.yaml#L202C18-L202C100
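
    The "about 60 minutes" figure can be sanity-checked with a rough
    calculation. The sketch below sums an exponential backoff capped at the
    3 minute maximum mentioned above; the 5 second initial delay and the
    factor of 2 are assumptions made for illustration, and the function name
    `total_retry_wait` is hypothetical. It counts waiting time only, not the
    time each sync attempt itself takes.

```python
def total_retry_wait(limit: int, base: float = 5.0, factor: float = 2.0,
                     max_backoff: float = 180.0) -> float:
    """Sum the capped exponential backoff delays (in seconds) over `limit` retries.

    base and factor are assumed values; max_backoff is the 3 minute cap.
    """
    total = 0.0
    for attempt in range(limit):
        # Delay grows as base * factor**attempt until it hits the cap.
        total += min(base * factor ** attempt, max_backoff)
    return total


for limit in (5, 20):
    seconds = total_retry_wait(limit)
    print(f"limit={limit}: about {seconds / 60:.1f} minutes of backoff")
```

    Under these assumptions the waiting time alone grows from under 3 minutes
    with 5 retries to roughly 47 minutes with 20, which is in line with the
    estimate above once the duration of the sync attempts is added.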
    
    * Add Changes.md entry
    
    * Fix up tests after common rebase
    
    ---------
    
    Co-authored-by: Lester Claudio <claudiol@redhat.com>
    Co-authored-by: day0hero <jonny@redhat.com>
    Co-authored-by: Lorenzo Dalrio <ldalrio@redhat.com>
    Co-authored-by: Andrew Beekhof <andrew@beekhof.net>
    Co-authored-by: Martin Jackson <mhjacks@redhat.com>
    Co-authored-by: jonny <65790298+day0hero@users.noreply.github.com>
    7 people committed Jul 8, 2023
    7ed309b

Commits on Jul 9, 2023

  1. Odf update (#85)

    * Updated namespaces template to include labels and annotations functionality
    
    * Added schema validation to support additional format for labels and annotations
    
    * Updated the values-example.yaml to include new format for namespaces
    
    * Updated Changes.md to include new namespaces functionality.
    
    * Updating CI tests
    
    * Fixed Markdown errors
    
    * Add an experimental letsencrypt chart
    
    This change adds an experimental letsencrypt chart that allows a pattern
    user/developer to have all routes and the API endpoint use signed
    certificates by letsencrypt.
    
    At this stage only AWS is supported. The full documentation is contained
    in the chart's README.md file
    
    * Do not run kubeconform on the certificate stuff just yet
    
    * Fix up kustomize example
    
    In the same vein as Industrial Edge 57f41dc135f72011d3796fe42d9cbf05d2b82052
    we call kustomize build.
    
    Newer gitops versions dropped the openshift-clients rpm by default which
    contained kubectl. Let's just invoke "kustomize" directly as the binary
    is present in both old and new gitops versions
    
    Since "kubectl kustomize" builds the set of resources by default, we
    need to switch to "kustomize build" by default
    
    We also use the same naming conventions used in Industrial Edge while
    we're at it.
    
    * Upgrade vault-helm to v0.24.0
    
    Tested on MCG with hub and spoke
    
    * Add a hello-world ansible playbook example
    
    Just a simple example that reads a helm value and puts it in a configmap
    
    * Inject ANSIBLE_CONFIG in make ansible-lint
    
    * Use new ansible-lint action
    
    * Fix some ansible-lint warnings
    
    * Fix up python versions
    
    * Skip cannot find role error
    
    Avoid checking those two playbooks, as the action seems to be too limited
    to understand where the ansible.cfg is
    
    * Added health check for pvc resource in argocd.yaml
    
    This allows argo to continue rolling out the rest of the applications.
    Without the health check the application is stuck in a progressing state
    and will not continue thus preventing any downstream application from
    deploying.
    
    * adding tests
    
    * Update super-linter image to latest
    
    * Update super-linter image to latest
    
    * Update CI workflows
    
    * updated template with why implemented comment
    
    * Add dependabot settings for github actions
    
    * adding tests
    
    * Added functionality to support the following format for labels and annotations:
          labels:
            openshift.io/node-selector: ""
          annotations:
            openshift.io/cluster-monitoring: "true"
    
    * Fixed CI Issues
    
    * Applying @claudiol recommendation
    
    * make test
    
    * Avoid exited containers proliferation
    
    When running the `pattern.sh` script multiple times, a lot of
    exited podman containers are left on the machine. Adding the
    `--rm` parameter to `podman run` makes podman automatically
    delete the exited containers, leaving the machine cleaner.
    
    * Handling of pre-release builds is too complex for a helm chart
    
    Generating the ICSP and allowing insecure registries is best done prior
    to helm upgrade, and requires VPN access to registry-proxy.engineering.redhat.com
    
    * Fixing issues with operator groups
    
    * Adding CI test
    
    * Updated operator group template
    
    * Updating CI issues
    
    * Removed duplicate code for operatorgroup by using multiple conditions
    
    * Allow overriding the pattern's name
    
    This is especially useful when multiple people are working on a pattern
    and have been using different names:
    
        $ make help |grep Pattern:
        Pattern: multicloud-gitops
        $ make NAME=foobar help |grep Pattern:
        Pattern: foobar
    
    * Add precise instruction to upgrade the vault subchart
    
    * Upgrade vault-helm to v0.24.1
    
    * Add an item to README.md
    
    * Fix up common/ tests
    
    * Fix super linter
    
    * Set gitOpsSpec.operatorSource
    
    After merging validatedpatterns/patterns-operator@235b303
    it is now effectively possible to pick a different catalogSource for
    the gitops operator. This is needed in order to allow CI to install
    the gitops operator from an IIB
    
    * Introduce EXTRA_HELM_OPTS
    
    This variable can be set in order to pass additional helm arguments from
    the command line, i.e. we can set things without having to tweak values files.
    So it is now possible to run something like the following:
    
      ./pattern.sh make install \
      EXTRA_HELM_OPTS="--set main.gitops.operatorSource=iib-49232"
    
    * Disable var-naming[no-role-prefix] in ansible lint
    
    * Add new ansible role to deal with IIBs
    
    * Simplify load-iib target
    
    * Add templates folder
    
    * Fix a couple of linting warnings
    
    * Fix some super-linter complaints
    
    * Skip the iib-ci playbook
    
    * Drop var-naming[no-role-prefix] linter
    
    * Allow for multiple images when calling load-iib
    
    * Add help for load-iib
    
    * Output index_image in make
    
    * Output index_image in make (2)
    
    * Set facts later in the playbook not in defaults/
    
    * Fix how we export vars in make load-iib
    
    * Fix how we export vars in make load-iib (2)
    
    * Use machineCount to register the number of nodes that need to be ready
    
    * Add helpful debug messages
    
    * Add | on shell now that we call pipefail
    
    * Test dropping nevercontact source
    
    * Skip insecure tls when logging in
    
    * Also allow gchr.io
    
    * Revert "Test dropping nevercontact source"
    
    This reverts commit d8746a37fce2663018f52203c892f00b825e32a7.
    
    * Fix typo
    
    * Clarify instructions in the README file
    
    * Automate the channel example
    
    * Find out KUBEADMINAPI programmatically
    
    * Use command instead of shell
    
    * Do not grep for operator bundle unless it is the gitops operator
    
    * Also whitelist ghcr.io
    
    * Fetch the operator bundle itself in a more robust way
    
    It seems that the operator bundle image itself is nowhere to be found
    inside any OCP cluster object (it's not in packagemanifests nor
    catalogsource). Resorting to parsing the IIB via opm alpha commands
    to fetch the exact image.
    
    * Add more mirrors
    
    * Some more work to support MCE
    
    * Cleanup spacing
    
    * Fix super-linter
    
    * Move task in right folder
    
    * Drop last mention of operator instead of item
    
    * Improve the grepping for the operator bundle
    
    Without also grepping for the default_channel we can end up getting
    multiple results, which breaks everything.
    
    Tested this and it fixed the issue I was seeing with the
    openshift-gitops-operator this morning
    
    * Drop display_skipped_hosts
    
    display_skipped_hosts=False has a horrible side-effect:
    When a task takes a long time, it is always the *next* task and not the
    one printed on the screen/log. That is because ansible has to wait for
    the task to finish before printing it, as it does not know beforehand if
    the host will be skipped and hence the task should not be displayed at
    all
    
    * Be more specific about the steps in the README
    
    * Upgrade ESO to v0.8.2
    
    * Update README.md
    
    * Update tests after eso 0.8.2 upgrade
    
    * Move to new spec format for dex/sso
    
    Via https://issues.redhat.com/browse/GITOPS-2761 we are told that the
    dex configuration has a new format.
    Old format:
    
        spec:
          dex:
            openShiftOAuth: true
            resources:
            ...
    
    New format:
    
        spec:
          sso:
            provider: dex
            dex:
              openShiftOAuth: true
              resources:
              ...
    
    This format is only supported starting with gitops-1.8.0, so we should
    merge this only when we are absolutely sure that no pattern needs an
    older gitops version in any situation.
    
    Tested on MCG with gitops-1.8.2
    
    Note: with this change gitops < 1.8 is not supported. Starting with
    gitops-1.9 the old format will be unsupported.
    
    * Disable ArgoCD from kubeconform
    
    The reason is that most of the tools we used to generate the json
    schema seem to be unmaintained, so it is getting hard to update
    our schemas in our GH org. We'll need to revisit this in the future.
    
    * Add a short line about username/token for the iib role on OCP <= 4.12
    
    * Drop https:// from podman login
    
    Seems we hit https://www.github.com/containers/podman/issues/13691 at
    least with older podman versions.
    
    If this turns out to break podman 4.5.0 I will special case it later
    
    * Set the mce-subscription-spec annotation
    
    We set it by default to "redhat-operators" and if defined to .Values.clusterGroup.subscriptions.acm.source
    The reason we do this is the following:
    1. In a default deployment scenario MCE has to be deployed as normal
       from the redhat-operators catalogSource just as ACM is
    2. When we deploy gitops-operator from an IIB instead, MCE would be
       installed trying to get it from the IIB because
       https://www.github.com/stolostron/multiclusterhub-operator/pull/975
       made it so that it picks the latest version looking at all catalog
       sources. But since we only mirrored the gitops operator in the
       cluster, this breaks as the images for MCE from the IIB are not there.
       By setting the default to "redhat-operators" we fix this case
    3. Now in the case where we want to install ACM from an IIB we need to
       be able to override this and we will pick whatever value is set in
       .Values.clusterGroup.subscriptions.acm.source, which will need to be
       defined for this to work when testing ACM+MCE from an IIB
    
    Note: Currently point 3. works only if you set it in a values file.
    Setting .Values.clusterGroup.subscriptions.acm.source via extraParams
    won't be passed down from the clusterGroup app to the applications.
    It's a bug that we need to fix.
    
    Note(2): We surround this with an 'if kindIs "map" .Values.clusterGroup.subscriptions'
    because we do not want to break things if subscription is a list and not
    a map. If we ever manage to drop subscriptions as list, then we can
    remove that if
    
    * Fix typo in README for iib
    
    * Simplify the README a bit
    
    * Add support for extraParams being passed down to all applications
    
    Via validatedpatterns/patterns-operator#74
    we add the extraParams in an extraParametersNested dictionary that holds
    the extraParams key/value pairs. If they exist, let's add them as
    parameters.
    
    This allows them to end up in the applications.
    
    * Add a lookup playbook to figure out IIB numbers
    
    * Allow overriding channel and source when installing the patterns-operator
    
    This will allow us to test the patterns-operator using a different
    catalogsource (potentially installed via an IIB). So we can run:
    
    make EXTRA_HELM_OPTS="\
      --set main.extraParameters[0].name=main.patternsOperator.channel --set main.extraParameters[0].value=slow \
      --set main.extraParameters[1].name=main.patternsOperator.source --set main.extraParameters[1].value=patten-index" install
    
    * Fix small typo in iib instructions
    
    * Drop a redirect and up retries when pushing the IIB to the internal registry
    
    * Update ESO to v0.8.3
    
    * WIP add presync for eso that waits for vault to be up
    
    * Add tests
    
    * Fix image and comment
    
    * Adding rbac to support the vault sa checking on the vault-0 pod status.
    
    * Make Test
    
    * Removed previous version of common to convert to subtree from https://github.com/hybrid-cloud-patterns/common.git main
    
    * updated script to check for new status
    
    * make test
    
    * make test and remove presync checks for eso
    
    * make test
    
    * make test
    
    ---------
    
    Co-authored-by: Lester Claudio <claudiol@redhat.com>
    Co-authored-by: Michele Baldessari <michele@acksyn.org>
    Co-authored-by: Lorenzo Dalrio <ldalrio@redhat.com>
    Co-authored-by: Andrew Beekhof <andrew@beekhof.net>
    Co-authored-by: Martin Jackson <mhjacks@redhat.com>
    6 people committed Jul 9, 2023
    f247f63

Commits on Jul 31, 2023

  1. Update Common (#87)

    * Updated namespaces template to include labels and annotations functionality
    
    * Added schema validation to support additional format for labels and annotations
    
    * Updated the values-example.yaml to include new format for namespaces
    
    * Updated Changes.md to include new namespaces functionality.
    
    * Updating CI tests
    
    * Fixed Markdown errors
    
    * Add an experimental letsencrypt chart
    
    This change adds an experimental letsencrypt chart that allows a pattern
    user/developer to have all routes and the API endpoint use signed
    certificates by letsencrypt.
    
    At this stage only AWS is supported. The full documentation is contained
    in the chart's README.md file
    
    * Do not run kubeconform on the certificate stuff just yet
    
    * Fix up kustomize example
    
    In the same vein as Industrial Edge 57f41dc135f72011d3796fe42d9cbf05d2b82052
    we call kustomize build.
    
    Newer gitops versions dropped the openshift-clients rpm by default which
    contained kubectl. Let's just invoke "kustomize" directly as the binary
    is present in both old and new gitops versions
    
    Since "kubectl kustomize" builds the set of resources by default, we
    need to switch to "kustomize build" by default
    
    We also use the same naming conventions used in Industrial Edge while
    we're at it.
    
    * Upgrade vault-helm to v0.24.0
    
    Tested on MCG with hub and spoke
    
    * Add a hello-world ansible playbook example
    
    Just a simple example that reads a helm value and puts it in a configmap
    
    * Inject ANSIBLE_CONFIG in make ansible-lint
    
    * Use new ansible-lint action
    
    * Fix some ansible-lint warnings
    
    * Fix up python versions
    
    * Skip cannot find role error
    
    Avoid checking those two playbooks, as the action seems to be too limited
    to understand where the ansible.cfg is
    
    * Added health check for pvc resource in argocd.yaml
    
    This allows argo to continue rolling out the rest of the applications.
    Without the health check the application is stuck in a progressing state
    and will not continue thus preventing any downstream application from
    deploying.
    
    * adding tests
    
    * Update super-linter image to latest
    
    * Update super-linter image to latest
    
    * Update CI workflows
    
    * updated template with why implemented comment
    
    * Add dependabot settings for github actions
    
    * adding tests
    
    * Added functionality to support the following format for labels and annotations:
          labels:
            openshift.io/node-selector: ""
          annotations:
            openshift.io/cluster-monitoring: "true"
    
    * Fixed CI Issues
    
    * Applying @claudiol recommendation
    
    * make test
    
    * Avoid exited containers proliferation
    
    When running the `pattern.sh` script multiple times, a lot of
    podman exited containers will be left on the machine, adding
    `--rm` parameter to `podman run` makes podman automatically
    delete the exited containers leaving the machine cleaner.
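
    A minimal sketch of the idea, assuming a hypothetical wrapper named
    `run_in_container` (the real `pattern.sh` passes more options than shown
    here). The PODMAN override is also an assumption, added only so that the
    command line can be inspected without starting a container:

```shell
# Hypothetical wrapper, not the actual pattern.sh code.
# --rm tells podman to delete the container as soon as it exits, so
# repeated runs do not accumulate in the output of `podman ps -a`.
run_in_container() {
  # PODMAN defaults to the real binary; set PODMAN=echo for a dry run.
  "${PODMAN:-podman}" run --rm "$@"
}
```

    Calling it as `PODMAN=echo run_in_container IMAGE make help` only prints
    the arguments that would be handed to podman.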
    
    * Handling of pre-release builds is too complex for a helm chart
    
    Generating the ICSP and allowing insecure registries is best done prior
    to helm upgrade, and requires VPN access to registry-proxy.engineering.redhat.com
    
    * Fixing issues with operator groups
    
    * Adding CI test
    
    * Updated operator group template
    
    * Updating CI issues
    
    * Removed duplicate code for operatorgroup by using multiple conditions
    
    * Allow overriding the pattern's name
    
    This is especially useful when multiple people are working on a pattern
    and have been using different names:
    
        $ make help |grep Pattern:
        Pattern: multicloud-gitops
        $ make NAME=foobar help |grep Pattern:
        Pattern: foobar
    
    * Add precise instruction to upgrade the vault subchart
    
    * Upgrade vault-helm to v0.24.1
    
    * Add an item to README.md
    
    * Fix up common/ tests
    
    * Fix super linter
    
    * Set gitOpsSpec.operatorSource
    
    After merging validatedpatterns/patterns-operator@235b303
    it is now effectively possible to pick a different catalogSource for
    the gitops operator. This is needed in order to allow CI to install
    the gitops operator from an IIB
    
    * Introduce EXTRA_HELM_OPTS
    
    This variable can be set in order to pass additional helm arguments from the
    command line, i.e. we can set things without having to tweak values files.
    So it is now possible to run something like the following:
    
      ./pattern.sh make install \
      EXTRA_HELM_OPTS="--set main.gitops.operatorSource=iib-49232"
    
    * Disable var-naming[no-role-prefix] in ansible lint
    
    * Add new ansible role to deal with IIBs
    
    * Simplify load-iib target
    
    * Add templates folder
    
    * Fix a couple of linting warnings
    
    * Fix some super-linter complaints
    
    * Skip the iib-ci playbook
    
    * Drop var-naming[no-role-prefix] linter
    
    * Allow for multiple images when calling load-iib
    
    * Add help for load-iib
    
    * Output index_image in make
    
    * Output index_image in make (2)
    
    * Set facts later in the playbook not in defaults/
    
    * Fix how we export vars in make load-iib
    
    * Fix how we export vars in make load-iib (2)
    
    * Use machineCount to register the number of nodes that need to be ready
    
    * Add helpful debug messages
    
    * Add | on shell now that we call pipefail
    
    * Test dropping nevercontact source
    
    * Skip insecure tls when logging in
    
    * Also allow ghcr.io
    
    * Revert "Test dropping nevercontact source"
    
    This reverts commit d8746a37fce2663018f52203c892f00b825e32a7.
    
    * Fix typo
    
    * Clarify instructions in the README file
    
    * Automate the channel example
    
    * Find out KUBEADMINAPI programmatically
    
    * Use command instead of shell
    
    * Do not grep for operator bundle unless it is the gitops operator
    
    * Also whitelist ghcr.io
    
    * Fetch the operator bundle itself in a more robust way
    
    It seems that the operator bundle image itself is nowhere to be found
    inside any OCP cluster object (it's not in packagemanifests nor
    catalogsource). Resorting to parsing the IIB via opm alpha commands
    to fetch the exact image.
    
    * Add more mirrors
    
    * Some more work to support MCE
    
    * Cleanup spacing
    
    * Fix super-linter
    
    * Move task in right folder
    
    * Drop last mention of operator instead of item
    
    * Improve the grepping for the operator bundle
    
    Without also grepping for the default_channel we can end up getting
    multiple results, which breaks everything.
    
    Tested this and it fixed the issue I was seeing with the
    openshift-gitops-operator this morning
    
    * Drop display_skipped_hosts
    
    display_skipped_hosts=False has a horrible side-effect:
    When a task takes a long time, it is always the *next* task and not the
    one printed on the screen/log. That is because ansible has to wait for
    the task to finish before printing it as it does not know beforehand if
    the host will be skipped and hence the task should not be displayed at
    all
    
    * Be more specific about the steps in the README
    
    * Upgrade ESO to v0.8.2
    
    * Update README.md
    
    * Update tests after eso 0.8.2 upgrade
    
    * Move to new spec format for dex/sso
    
    Via https://issues.redhat.com/browse/GITOPS-2761 we are told that the
    dex configuration has a new format.
    Old format:
    
        spec:
          dex:
            openShiftOAuth: true
            resources:
            ...
    
    New format:
    
        spec:
          sso:
            provider: dex
            dex:
              openShiftOAuth: true
              resources:
              ...
    
    This format is only supported starting with gitops-1.8.0, so we should
    merge this only when we are absolutely sure that no pattern needs an
    older gitops version in any situation.
    
    Tested on MCG with gitops-1.8.2
    
    Note: with this change gitops < 1.8 is not supported. Starting with
    gitops-1.9 the old format will be unsupported.
    
    * Disable ArgoCD from kubeconform
    
    The reason is that most of the tools we used to generate the json
    schema seem to be unmaintained, so it is getting hard to update
    our schemas in our GH org. We'll need to revisit this in the future.
    
    * Add a short line about username/token for the iib role on OCP <= 4.12
    
    * Drop https:// from podman login
    
    Seems we hit https://www.github.com/containers/podman/issues/13691 at
    least with older podman versions.
    
    If this turns out to break podman 4.5.0 I will special case it later
    
    * Set the mce-subscription-spec annotation
    
    We set it to "redhat-operators" by default and, if defined, to .Values.clusterGroup.subscriptions.acm.source.
    The reason we do this is the following:
    1. In a default deployment scenario MCE has to be deployed as normal
       from the redhat-operators catalogSource just as ACM is
    2. When we deploy gitops-operator from an IIB instead, MCE would be
       installed trying to get it from the IIB because
       https://www.github.com/stolostron/multiclusterhub-operator/pull/975
       made it so that it picks the latest version looking at all catalog
       sources. But since we only mirrored the gitops operator in the
       cluster, this breaks as the images for MCE from the IIB are not there.
       By setting the default to "redhat-operators" we fix this case
    3. Now in the case where we want to install ACM from an IIB we need to
       be able to override this and we will pick whatever value is set in
       .Values.clusterGroup.subscriptions.acm.source, which will need to be
       defined for this to work when testing ACM+MCE from an IIB
    
    Note: Currently point 3. works only if you set it in a values file.
    Setting .Values.clusterGroup.subscriptions.acm.source via extraParams
    won't be passed down from the clusterGroup app to the applications.
    It's a bug that we need to fix.
    
    Note(2): We surround this with an 'if kindIs "map" .Values.clusterGroup.subscriptions'
    because we do not want to break things if subscriptions is a list and not
    a map. If we ever manage to drop subscriptions as a list, then we can
    remove that if.
    
    * Fix typo in README for iib
    
    * Simplify the README a bit
    
    * Add support for extraParams being passed down to all applications
    
    Via validatedpatterns/patterns-operator#74
    we add the extraParams in an extraParametersNested dictionary that holds
    the extraParams key/value pairs. If they exist, let's add them as
    parameters.
    
    This allows them to end up in the applications.
    
    * Add a lookup playbook to figure out IIB numbers
    
    * Allow overriding channel and source when installing the patterns-operator
    
    This will allow us to test the patterns-operator using a different
    catalogsource (potentially installed via an IIB). So we can run:
    
    make EXTRA_HELM_OPTS="\
      --set main.extraParameters[0].name=main.patternsOperator.channel --set main.extraParameters[0].value=slow \
      --set main.extraParameters[1].name=main.patternsOperator.source --set main.extraParameters[1].value=patten-index" install
    
    * Fix small typo in iib instructions
    
    * Drop a redirect and up retries when pushing the IIB to the internal registry
    
    * Update ESO to v0.8.3
    
    * WIP add presync for eso that waits for vault to be up
    
    * Add tests
    
    * Fix image and comment
    
    * Adding rbac to support the vault sa checking on the vault-0 pod status.
    
    * Make Test
    
    * Revert "Make Test"
    
    This reverts commit 64e9dc7.
    
    * Revert "Adding rbac to support the vault sa checking on the vault-0 pod status."
    
    This reverts commit 598bc74.
    
    * Revert "Fix image and comment"
    
    This reverts commit d4d3fe1.
    
    * Revert "Add tests"
    
    This reverts commit ab5532a.
    
    * Revert "WIP add presync for eso that waits for vault to be up"
    
    This reverts commit 2797699.
    
    * Increase the default retry limit when syncing
    
    ArgoCD will retry 5 times by default to sync an application in case of
    errors and then will give up. So if an application contains a reference
    to a CRD that has not been installed yet (say because it will be
    installed by another application), it will error out and retry later.
    This happens by default for a maximum of 5 times [1]. After those 5 times
    the application will give up and will stay in Degraded mode and
    eventually move to Failed. In this case a manual sync will usually fix
    the application just fine (i.e. as long as the missing CRD has been
    installed in the meantime).
    
    Now to solve this issue we can add complex preSync Jobs that wait for
    the needed resources, but this fundamentally breaks the simplicity of
    things and introduces unneeded dependencies. In this change we just
    increase the default retry limit to something larger (20) that should
    cover most cases. The retry limit functionality is rather undocumented
    currently in the docs but is defined at [2] and also shown at [3].
    
    In our patterns' case the concrete issue happened as follows:
    1. ESO ClusterSecrets were often not synced/degraded
    2. We introduced a Job in a preSync hook for the ESO chart that would
       wait on vault to be ready before applying the rest of ESO
    3. MCG started failing because the config-demo app had already tried to
       sync 5 times and failed every time because the ESO CRDs were not
       installed yet (due to ESO waiting on vault)
    
    So instead of adding yet another job, let's just try a lot more often.
    We picked 20 as a sane default because that should have argo try for
    about 60 minutes (3min is the default maximum backoff limit)
    
    Tested with two MCG installations (with the ESO Job hook included) and
    both worked out of the box. Whereas before I managed to get three
    failures out of three installs.
    
    [1] https://github.com/argoproj/argo-cd/blob/master/controller/appcontroller.go#L1680
    [2] https://github.com/argoproj/argo-cd/blob/master/manifests/crds/application-crd.yaml#L1476
    [3] https://github.com/argoproj/argo-cd/blob/master/docs/operator-manual/application.yaml#L202C18-L202C100
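
    The waiting time implied by a retry limit can be estimated with a short
    sketch. It assumes an initial backoff of 5 seconds and a factor of 2
    (neither value is stated in this commit message) together with the
    3 minute cap mentioned above; the function name is hypothetical. Only the
    pauses between attempts are summed, so the time spent on the sync
    attempts themselves comes on top of the result:

```shell
# Hypothetical helper: sum of the backoff pauses for a given retry limit.
# Arguments: limit, initial delay (s), factor, maximum delay (s).
# The 5s/factor-2 inputs used below are assumptions; only the 3 minute
# cap comes from the commit message.
total_backoff_seconds() {
  limit="$1"; delay="$2"; factor="$3"; max="$4"
  total=0
  i=0
  while [ "$i" -lt "$limit" ]; do
    # Cap the pause at the maximum backoff before counting it.
    if [ "$delay" -gt "$max" ]; then delay="$max"; fi
    total=$((total + delay))
    delay=$((delay * factor))
    i=$((i + 1))
  done
  echo "$total"
}

total_backoff_seconds 5 5 2 180    # prints 155
total_backoff_seconds 20 5 2 180   # prints 2835
```

    Under these assumptions 5 retries cover under three minutes of waiting,
    while 20 retries cover roughly 47 minutes before sync time is added.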
    
    * Add Changes.md entry
    
    * Split off global helm variables to a helper definition
    
    We can only split out bits of yaml that reference $.* variables. This is
    because these snippets in _helpers.tpl are passed a single context,
    either $ or ., and cannot use both like the top-level domain.
    
    * Switch ApplicationSets to use the newly-introduced helpers
    
    I only remove the variables that are defined identically in
    ApplicationSet and in the helper. Leaving the other ones as is
    as their presence is not entirely clear to me and I do not want to
    risk breaking things.
    
    * Split off valueFiles to _helpers.tpl
    
    * Switch applicationsets to use the new helper
    
    * Drop some older comments
    
    * Tweak the load secret debug message to be clearer
    
    When HOME is set we replace it with '~' in this debug message
    because when run from inside the container the HOME is /pattern-home
    which is confusing for users. Printing out '~' when at the start of
    the string is less confusing.
    
    Before:
    ok: [localhost] => {
        "msg": "/home/michele/.config/hybrid-cloud-patterns/values-secret-multicloud-gitops.yaml"
    }
    
    After:
    ok: [localhost] => {
        "msg": "~/.config/hybrid-cloud-patterns/values-secret-multicloud-gitops.yaml"
    }
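
    The same substitution can be illustrated in shell. This is only a sketch
    of the transformation: the actual change lives in the debug message of
    the secret-loading playbook, and the function name `tildify` is
    hypothetical:

```shell
# Hypothetical helper: print a path with a leading $HOME replaced by '~'.
# Paths that do not start with "$HOME/" are printed unchanged.
tildify() {
  case "$1" in
    "$HOME"/*) printf '~/%s\n' "${1#"$HOME"/}" ;;
    *) printf '%s\n' "$1" ;;
  esac
}
```

    Only a prefix match is rewritten, so a path that merely contains the HOME
    string somewhere in the middle is left alone.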
    
    * Check if the KUBECONFIG file is pointing outside of the HOME folder
    
    If it is somewhere under /tmp or out of the HOME folder, bail out
    explaining why. This has caused a few silly situations where the user
    would save the KUBECONFIG file under /tmp. Since bind-mounting /tmp
    seems like a wrong thing to do in general, we at least bail out with a
    clear error message. To do this we rely on a bash functionality so let's
    just switch the script to use that.
    
    Tested as follows:
    export KUBECONFIG=/tmp/kubeconfig
    ./scripts/pattern-util.sh make help
    /tmp/kubeconfig is pointing outside of the HOME folder, this will make it unavailable from the container.
    Please move it somewhere inside your /home/michele folder, as that is what gets bind-mounted inside the container
    
    export KUBECONFIG=~/kubeconfig
    ./scripts/pattern-util.sh make help
    Pattern: common
    
    Usage:
      make <target>
    ...
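
    A check of this kind can be sketched as follows. The function name and
    its two-argument form are hypothetical, and the sketch uses a portable
    `case` pattern instead of whichever bash feature the script relies on:

```shell
# Hypothetical helper: succeed only if the kubeconfig path is under the
# given home folder, since only that folder is bind-mounted in the container.
check_kubeconfig() {
  kubeconfig="$1"; home="$2"
  case "$kubeconfig" in
    "$home"/*) return 0 ;;
    *)
      echo "$kubeconfig is pointing outside of the HOME folder, this will make it unavailable from the container." >&2
      echo "Please move it somewhere inside your $home folder, as that is what gets bind-mounted inside the container" >&2
      return 1
      ;;
  esac
}
```

    A caller would typically run `check_kubeconfig "$KUBECONFIG" "$HOME" || exit 1`
    before starting the container.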
    
    * Include an example SNO cluster pool in the tests
    
    * Enforce lowercase names for cluster claims
    
    * Avoid mixing yaml and json in the OCP install-config
    
    * Update provisioning tests
    
    * Sanely handle cluster pools with no clusters (yet)
    
    * Clustergroup Chart.yaml name change
    
    We currently have a small inconsistency where we use common/clustergroup
    in order to point Argo CD to this chart, but the name inside the chart
    is 'pattern-clustergroup'.
    
    This inconsistency is currently irrelevant, but in the future when
    migrating to helm charts inside proper helm repos, this becomes
    problematic. So let's fix the name to be the same as the folder.
    
    Tested on MCG successfully.
    
    * Fix the clusterPoolName in clusterClaims
    
    Currently with the following values snippet:
    
      managedClusterGroups:
        exampleRegion:
          name: group-one
          acmlabels:
          - name: clusterGroup
            value: group-one
          helmOverrides:
          - name: clusterGroup.isHubCluster
            value: false
          clusterPools:
            exampleAWSPool:
              size: 1
              name: aws-ap-bandini
              openshiftVersion: 4.12.24
              baseDomain: blueprints.rhecoeng.com
              controlPlane:
                count: 1
                platform:
                  aws:
                    type: m5.2xlarge
              workers:
                count: 0
              platform:
                aws:
                  region: ap-southeast-2
              clusters:
              - One
    
    You will get a clusterClaim that is pointing to the wrong Pool:
    NAMESPACE                 NAME                       POOL
    open-cluster-management   one-group-one              aws-ap-bandini
    
    This is wrong because the clusterPool name will be generated using the
    pool name + "-" + group name:
    
      {{- $pool := . }}
      {{- $poolName := print .name "-" $group.name }}
    
    But the clusterPoolName inside the clusterClaim is only using the
    "$pool.name" which will make the clusterClaim ineffective as the pool
    does not exist.
    
    Switch to using the same poolName that is being used when creating the
    clusterPool.
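
    The naming rule can be mirrored in a few lines of shell to show which
    pool the claim should reference for the values above. Both function
    names are hypothetical, and the claim naming (lowercased cluster name
    plus group name) is inferred from the listing above rather than taken
    from the template:

```shell
# Hypothetical helpers mirroring the naming rule described above.
# Pool name: pool name + "-" + group name.
pool_name() { printf '%s-%s\n' "$1" "$2"; }
# Claim name (inferred): lowercased cluster name + "-" + group name.
claim_name() { printf '%s-%s\n' "$1" "$2" | tr '[:upper:]' '[:lower:]'; }

pool_name aws-ap-bandini group-one   # prints aws-ap-bandini-group-one
claim_name One group-one             # prints one-group-one
```

    With the fix, the POOL column for `one-group-one` should therefore show
    `aws-ap-bandini-group-one` instead of `aws-ap-bandini`.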
    
    * Add some comments to make if/else and loops clearer
    
    Let's improve readability by adding some comments to point out which
    flow constructs are being ended.
    
    * Add some more comments in applications.yaml
    
    * Add a default for options applicationRetryLimit
    
    * Split out values files to a helper for the acm chart
    
    Just like we did for the clustergroup chart, let's split the values
    file list into a dedicated helper. This time since there are no global
    variables we include it with the current context and not with the '$'
    context.
    
    Tested with MCG: hub and spoke. Correctly observed all the applications
    running on the spoke.
    
    * Fix up tests
    
    They changed because we made the list indentation more correct (two
    extra spaces to the left)
    
    * Fix sa/namespace mixup in vault_spokes_init
    
    * Update local patch
    
    Also set seccompProfile to null to make things work on OCP 4.10
    
    * Update ESO to 0.8.5
    
    * Tweak ESO UBI images
    
    Tested the ESO upgrade on MCG on both 4.10 and 4.13
    
    * Removed previous version of common to convert to subtree from https://github.com/hybrid-cloud-patterns/common.git main
    
    * make test
    
    ---------
    
    Co-authored-by: Lester Claudio <claudiol@redhat.com>
    Co-authored-by: Michele Baldessari <michele@acksyn.org>
    Co-authored-by: Lorenzo Dalrio <ldalrio@redhat.com>
    Co-authored-by: Andrew Beekhof <andrew@beekhof.net>
    Co-authored-by: Martin Jackson <mhjacks@redhat.com>
    Co-authored-by: Tom Stockwell <2060486+stocky37@users.noreply.github.com>
    7 people committed Jul 31, 2023
    fc060b8