I'm testing an installation of Velero for Kubernetes backups on AKS (Azure's managed Kubernetes service), and I'm getting this error after a timeout period:
time="2019-10-14T11:22:56Z" level=info msg="Checking existence of namespace" logSource="pkg/cmd/server/server.go:336" namespace=velero
time="2019-10-14T11:22:56Z" level=info msg="Namespace exists" logSource="pkg/cmd/server/server.go:342" namespace=velero
time="2019-10-14T11:22:59Z" level=info msg="Checking existence of Velero custom resource definitions" logSource="pkg/cmd/server/server.go:371"
time="2019-10-14T11:22:59Z" level=info msg="All Velero custom resource definitions exist" logSource="pkg/cmd/server/server.go:405"
time="2019-10-14T11:22:59Z" level=info msg="Checking that all backup storage locations are valid" logSource="pkg/cmd/server/server.go:412"
An error occurred: some backup storage locations are invalid: error getting backup store for location "default": rpc error: code = Unknown desc = azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/<subscription_id>/resourceGroups/rsg-lolgroup/providers/Microsoft.Storage/storageAccounts/lolsubs/listKeys?api-version=2018-02-01: StatusCode=0 -- Original Error: adal: Failed to execute the refresh request. Error = 'Post https://login.microsoftonline.com/tenant_id/oauth2/token?api-version=1.0: x509: certificate signed by unknown authority'
This is the Velero deployment:
{
  "kind": "Deployment",
  "apiVersion": "apps/v1",
  "metadata": {
    "name": "velero",
    "namespace": "velero",
    "selfLink": "/apis/apps/v1/namespaces/velero/deployments/velero",
    "uid": "redacted",
    "resourceVersion": "701159",
    "generation": 1,
    "creationTimestamp": "2019-10-14T06:18:45Z",
    "labels": {
      "component": "velero"
    },
    "annotations": {
      "deployment.kubernetes.io/revision": "1"
    }
  },
  "spec": {
    "replicas": 1,
    "selector": {
      "matchLabels": {
        "deploy": "velero"
      }
    },
    "template": {
      "metadata": {
        "creationTimestamp": null,
        "labels": {
          "component": "velero",
          "deploy": "velero"
        },
        "annotations": {
          "prometheus.io/path": "/metrics",
          "prometheus.io/port": "8085",
          "prometheus.io/scrape": "true"
        }
      },
      "spec": {
        "volumes": [
          {
            "name": "plugins",
            "emptyDir": {}
          },
          {
            "name": "scratch",
            "emptyDir": {}
          },
          {
            "name": "cloud-credentials",
            "secret": {
              "secretName": "cloud-credentials",
              "defaultMode": 420
            }
          }
        ],
        "containers": [
          {
            "name": "velero",
            "image": "docker-rtsre-local.lol.com:8443/velero:0.0.1",
            "command": [
              "/velero"
            ],
            "args": [
              "server"
            ],
            "ports": [
              {
                "name": "metrics",
                "containerPort": 8085,
                "protocol": "TCP"
              }
            ],
            "env": [
              {
                "name": "VELERO_SCRATCH_DIR",
                "value": "/scratch"
              },
              {
                "name": "VELERO_NAMESPACE",
                "valueFrom": {
                  "fieldRef": {
                    "apiVersion": "v1",
                    "fieldPath": "metadata.namespace"
                  }
                }
              },
              {
                "name": "GOOGLE_APPLICATION_CREDENTIALS",
                "value": "/credentials/cloud"
              },
              {
                "name": "AWS_SHARED_CREDENTIALS_FILE",
                "value": "/credentials/cloud"
              },
              {
                "name": "AZURE_CREDENTIALS_FILE",
                "value": "/credentials/cloud"
              }
            ],
            "resources": {
              "limits": {
                "cpu": "1",
                "memory": "256Mi"
              },
              "requests": {
                "cpu": "500m",
                "memory": "128Mi"
              }
            },
            "volumeMounts": [
              {
                "name": "plugins",
                "mountPath": "/plugins"
              },
              {
                "name": "scratch",
                "mountPath": "/scratch"
              },
              {
                "name": "cloud-credentials",
                "mountPath": "/credentials"
              }
            ],
            "terminationMessagePath": "/dev/termination-log",
            "terminationMessagePolicy": "File",
            "imagePullPolicy": "Always"
          }
        ],
        "restartPolicy": "Always",
        "terminationGracePeriodSeconds": 30,
        "dnsPolicy": "ClusterFirst",
        "serviceAccountName": "velero",
        "serviceAccount": "velero",
        "securityContext": {},
        "schedulerName": "default-scheduler"
      }
    },
    "strategy": {
      "type": "RollingUpdate",
      "rollingUpdate": {
        "maxUnavailable": "25%",
        "maxSurge": "25%"
      }
    },
    "revisionHistoryLimit": 10,
    "progressDeadlineSeconds": 600
  },
  "status": {
    "observedGeneration": 1,
    "replicas": 1,
    "updatedReplicas": 1,
    "unavailableReplicas": 1,
    "conditions": [
      {
        "type": "Progressing",
        "status": "True",
        "lastUpdateTime": "2019-10-14T06:21:35Z",
        "lastTransitionTime": "2019-10-14T06:18:45Z",
        "reason": "NewReplicaSetAvailable",
        "message": "ReplicaSet \"velero-866985f74d\" has successfully progressed."
      },
      {
        "type": "Available",
        "status": "False",
        "lastUpdateTime": "2019-10-14T07:03:46Z",
        "lastTransitionTime": "2019-10-14T07:03:46Z",
        "reason": "MinimumReplicasUnavailable",
        "message": "Deployment does not have minimum availability."
      }
    ]
  }
}
I suspect this could be firewall-related, but is there a way to debug it further?
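One possible way to narrow this down is to repeat the TLS handshake against the two hosts named in the error from a pod on the same cluster network, and see whether certificate verification fails there too. The following is only a rough sketch, assuming Python 3.7+ is available in a debug pod (the Velero image itself may not ship it). The function names `classify_tls_error`, `probe`, `presented_cert_pem`, and `main` are made up for illustration and are not part of Velero or any Azure SDK.

```python
import socket
import ssl

# Hosts taken from the error message above.
HOSTS = ["login.microsoftonline.com", "management.azure.com"]


def classify_tls_error(exc):
    """Map an exception from a connection attempt to a short hint string."""
    if isinstance(exc, ssl.SSLCertVerificationError):
        return "untrusted-ca"      # chain not trusted by the local CA bundle
    if isinstance(exc, ssl.SSLError):
        return "tls-error"         # some other TLS-level failure
    if isinstance(exc, socket.timeout):
        return "timeout"           # packets are probably being dropped
    if isinstance(exc, OSError):
        return "network-error"     # DNS failure, connection refused, etc.
    return "unknown"


def probe(host, port=443, timeout=5.0):
    """Try a fully verified handshake using the default trust store."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return "ok"
    except Exception as exc:
        return classify_tls_error(exc)


def presented_cert_pem(host, port=443, timeout=5.0):
    """Fetch the leaf certificate WITHOUT verifying it (inspection only)."""
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            der = tls.getpeercert(binary_form=True)
            return ssl.DER_cert_to_PEM_cert(der)


def main():
    """Probe each host and dump the presented certificate if it is untrusted."""
    for host in HOSTS:
        result = probe(host)
        print(host, "->", result)
        if result == "untrusted-ca":
            print(presented_cert_pem(host))
```

Calling `main()` from inside the cluster prints one result per host. If a host reports `untrusted-ca`, the printed PEM can be decoded (for example with `openssl x509 -noout -issuer -subject`) to see who issued it. An issuer that is an internal or firewall CA would suggest TLS interception on the egress path, while a well-known public issuer would point more towards a missing CA bundle in the container image. Either reading is a hypothesis to verify, not a conclusion.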