Incident: Kaniko and ACR
Categories:
We’ve recently had an issue with one of our packages come to light. We wanted to talk through the resolution steps we’re going to put into place.
So what happened?
Azure users started reporting seeing the following error within the build step:
error checking push permissions -- make sure you entered the correct tag name, and that you are authenticated correctly, and try again: checking push permission for "xyz.azurecr.io/myorg/myrepo:0.0.1": resolving authorization for xyz.azurecr.io failed: error getting credentials - err: exec: "docker-credential-acr-env": executable file not found in $PATH
This seemed to be indicating to an authorization issues with the terraform module. Other users were seemingly unaffected by this issue.
Upon further analysis, kaniko seems to have issues with the latest version and grabbing credentials for acr. The latest working version that we are aware of is 1.3. There wasn’t a massive influx of people seeing this issue due to it only occuring once versionstream had been updated with jx gitops upgrade
.
So what are you going to do about it?
For starters, most users are probably using the latest kaniko features, so we’re unable to just roll this back for everyone.
We’re getting started by creating a PR to kaniko to resolve the issue. However, due to release schedules etc, this will mean that getting started with azure will be broken for quite a while.
How can I fix it in the meantime?
If you’re on azure, the resolution is quite simple, here’s a step by step guide:
1. Find which buildpack you are using and navigate to the build-container-build step:
directory: csharp/ date: 12/28/21 git: main
› cat .lighthouse/jenkins-x/release.yaml | grep " image: "
image: uses:jenkins-x/jx3-pipeline-catalog/tasks/csharp/release.yaml@versionStream
2. Replace the build-container-build step to use the suggested kaniko image: gcr.io/kaniko-project/executor:v1.3.0-debug
Before
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
creationTimestamp: null
name: release
spec:
pipelineSpec:
tasks:
- name: from-build-pack
resources: {}
taskSpec:
metadata: {}
stepTemplate:
image: uses:jenkins-x/jx3-pipeline-catalog/tasks/csharp/release.yaml@versionStream
name: ""
resources:
requests:
cpu: 200m
memory: 256Mi
workingDir: /workspace/source
steps:
- image: uses:jenkins-x/jx3-pipeline-catalog/tasks/git-clone/git-clone.yaml@versionStream
name: ""
resources: {}
- name: next-version
resources: {}
- name: jx-variables
resources: {}
- name: check-registry
resources: {}
- name: build-container-build
resources: {}
- name: promote-changelog
resources: {}
- name: promote-helm-release
resources: {}
- name: promote-jx-promote
resources: {}
podTemplate: {}
serviceAccountName: tekton-bot
timeout: 12h0m0s
status: {}
After
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
creationTimestamp: null
name: release
spec:
pipelineSpec:
tasks:
- name: from-build-pack
resources: {}
taskSpec:
metadata: {}
stepTemplate:
image: uses:jenkins-x/jx3-pipeline-catalog/tasks/csharp/release.yaml@versionStream
name: ""
resources:
requests:
cpu: 200m
memory: 256Mi
workingDir: /workspace/source
steps:
- image: uses:jenkins-x/jx3-pipeline-catalog/tasks/git-clone/git-clone.yaml@versionStream
name: ""
resources: {}
- name: next-version
resources: {}
- name: jx-variables
resources: {}
- name: check-registry
resources: {}
- image: gcr.io/kaniko-project/executor:v1.3.0-debug
name: build-container-build
resources: {}
script: |
#!/busybox/sh
source .jx/variables.sh
cp /tekton/creds-secrets/tekton-container-registry-auth/.dockerconfigjson /kaniko/.docker/config.json
/kaniko/executor $KANIKO_FLAGS --context=/workspace/source --dockerfile=${DOCKERFILE_PATH:-Dockerfile} --destination=$PUSH_CONTAINER_REGISTRY/$DOCKER_REGISTRY_ORG/$APP_NAME:$VERSION
- name: promote-changelog
resources: {}
- name: promote-helm-release
resources: {}
- name: promote-jx-promote
resources: {}
podTemplate: {}
serviceAccountName: tekton-bot
timeout: 12h0m0s
status: {}
3. Do this again for your .lighthouse/jenkins-x/pullrequest.yaml
file
4. Commit and push it to your repository.
We’re hoping to make this simpler
We’re working on:
- Fixing it upstream (the proper fix, but could take a while to be merged)
- Making a temporary azure pipeline step for azure users
- Making a script so that azure users don’t have to go through this manual process each time
- Allowing overriding images without having to specify a script within lighthouse
Help us find and fix things like this in future
Talk to us on our slack channels, which are part of the Kubernetes slack. Join Kubernetes slack here and find us on our channels:
- #jenkins-x-dev for developers of Jenkins X
- #jenkins-x-user for users of Jenkins X
Find out more about becoming involved in the Jenkins X community here.