Platform: minikube version: v1.24.0
tl;dr: Airflow won't start; logs for everything are listed below.
I'm trying to recreate the whole platform and I'm stuck at this step. I've been waiting for everything defined in ml-platform.yaml to be configured, and app-aflow-airflow-web has been in the CrashLoopBackOff state for 1 hour now.
I've tried killing it and recreating it, and nothing has worked.
Here is the list of pods created by this command:
kubectl apply -f manifests/kfdef/ml-platform.yaml -n ml-workshop
NAME READY STATUS RESTARTS AGE
app-aflow-airflow-scheduler-f7fc5d4cb-dndwb 2/2 Running 2 (6m30s ago) 14m
app-aflow-airflow-web-54659fb97d-n6lms 1/2 CrashLoopBackOff 4 (29s ago) 2m34s
app-aflow-airflow-web-7c566d79d-4v2wv 1/2 CrashLoopBackOff 4 (19s ago) 2m33s
app-aflow-airflow-worker-0 1/2 Running 0 2m17s
app-aflow-postgresql-0 1/1 Running 0 14m
app-aflow-redis-master-0 1/1 Running 0 14m
grafana-5dc6cf89d-vs8xd 1/1 Running 0 14m
jupyterhub-7848ccd4b7-jkvpr 1/1 Running 0 14m
jupyterhub-db-0 1/1 Running 0 14m
minio-ml-workshop--1-m2bh4 0/1 Completed 2 14m
minio-ml-workshop-6b84fdc7c4-7nsql 1/1 Running 0 14m
mlflow-d65ccb65d-8wpm6 2/2 Running 0 14m
mlflow-db-0 1/1 Running 0 14m
seldon-controller-manager-7f67f4985b-bs5sq 1/1 Running 0 14m
spark-operator-69cfd96bf4-7h94n 1/1 Running 0 14m
I've switched to the minikube ip address as mentioned.
Logs from failing container app-aflow-airflow-web-7c566d79d-4v2wv:airflow-web:
airflow 14:25:27.02
airflow 14:25:27.02 Welcome to the Bitnami airflow container
airflow 14:25:27.02 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-airflow
airflow 14:25:27.02 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-airflow/issues
airflow 14:25:27.02
airflow 14:25:27.02 INFO ==> Enabling non-root system user with nss_wrapper
airflow 14:25:27.03 INFO ==> ** Starting Airflow setup **
airflow 14:25:27.05 INFO ==> Initializing Airflow ...
airflow 14:25:27.06 INFO ==> No injected configuration file found. Creating default config file
airflow 14:25:27.77 INFO ==> Configuring Airflow webserver authentication
airflow 14:25:27.78 INFO ==> Configuring Airflow database
airflow 14:25:27.81 INFO ==> Configuring Celery Executor
airflow 14:25:27.83 INFO ==> Waiting for PostgreSQL to be available at app-aflow-postgresql:5432...
Stream closed EOF for ml-workshop/app-aflow-airflow-web-7c566d79d-4v2wv (airflow-web)
Describing the pod does not reveal much to me either:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 14m default-scheduler Successfully assigned ml-workshop/app-aflow-airflow-web-7c566d79d-4v2wv to minikube
Normal Pulling 14m kubelet Pulling image "registry.access.redhat.com/rhscl/postgresql-96-rhel7:latest"
Normal Pulled 14m kubelet Successfully pulled image "registry.access.redhat.com/rhscl/postgresql-96-rhel7:latest" in 945.837211ms
Normal Created 14m kubelet Created container waifordatabase
Normal Started 14m kubelet Started container waifordatabase
Normal Pulling 14m kubelet Pulling image "k8s.gcr.io/git-sync/git-sync:v3.2.2"
Normal Pulled 14m kubelet Successfully pulled image "k8s.gcr.io/git-sync/git-sync:v3.2.2" in 2.879638022s
Normal Created 14m kubelet Created container git-sync
Normal Started 14m kubelet Started container git-sync
Normal Pulled 14m kubelet Successfully pulled image "quay.io/ml-on-k8s/airflow:2.1.7.web.keycloak" in 1.810590705s
Normal Pulled 14m kubelet Successfully pulled image "quay.io/ml-on-k8s/airflow:2.1.7.web.keycloak" in 1.999765805s
Normal Pulled 13m kubelet Successfully pulled image "quay.io/ml-on-k8s/airflow:2.1.7.web.keycloak" in 2.210168418s
Normal Created 13m (x3 over 14m) kubelet Created container airflow-web
Normal Started 13m (x3 over 14m) kubelet Started container airflow-web
Normal Pulling 13m (x4 over 14m) kubelet Pulling image "quay.io/ml-on-k8s/airflow:2.1.7.web.keycloak"
Warning BackOff 4m14s (x46 over 13m) kubelet Back-off restarting failed container
replicaset:
Normal SuccessfulCreate 15m replicaset-controller Created pod: app-aflow-airflow-web-7c566d79d-4v2wv
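The events above only show a generic BackOff, without the reason the airflow-web container stopped. That reason (for example OOMKilled or Error, together with an exit code) is normally recorded in the pod status. A minimal sketch of reading it, assuming the pod and container names from this report; the helper name last_termination is made up for illustration and has not been run against this cluster:

```shell
# Hypothetical helper: print the last terminated state (reason, exit code, ...)
# that Kubernetes recorded for one container of a pod.
last_termination() {
  ns="$1"; pod="$2"; container="$3"
  kubectl get pod -n "$ns" "$pod" \
    -o jsonpath="{.status.containerStatuses[?(@.name==\"$container\")].lastState.terminated}"
}

# Example call, using the names from this report:
# last_termination ml-workshop app-aflow-airflow-web-7c566d79d-4v2wv airflow-web
```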
kubectl logs:
$ kubectl logs -n ml-workshop app-aflow-airflow-web-7c566d79d-4v2wv
Defaulted container "git-sync" out of: git-sync, airflow-web, waifordatabase (init)
INFO: detected pid 1, running init handler
I1018 14:19:00.618669 12 main.go:430] "level"=0 "msg"="starting up" "args"=["/git-sync"] "pid"=12
I1018 14:19:00.618718 12 main.go:694] "level"=0 "msg"="cloning repo" "origin"="https://github.com/airflow-dags/dags/" "path"="/tmp/git"
I1018 14:19:14.308794 12 main.go:586] "level"=0 "msg"="syncing git" "hash"="8f22697a507c40bb42d4c674edd6b5c49ea0ecbb" "rev"="HEAD"
I1018 14:19:17.552166 12 main.go:607] "level"=0 "msg"="adding worktree" "branch"="origin/main" "path"="/tmp/git/rev-8f22697a507c40bb42d4c674edd6b5c49ea0ecbb"
I1018 14:19:17.556761 12 main.go:630] "level"=0 "msg"="reset worktree to hash" "hash"="8f22697a507c40bb42d4c674edd6b5c49ea0ecbb" "path"="/tmp/git/rev-8f22697a507c40bb42d4c674edd6b5c49ea0ecbb"
I1018 14:19:17.556781 12 main.go:635] "level"=0 "msg"="updating submodules"
previous logs:
$ kubectl logs -n ml-workshop app-aflow-airflow-web-7c566d79d-4v2wv --previous
Defaulted container "git-sync" out of: git-sync, airflow-web, waifordatabase (init)
Error from server (BadRequest): previous terminated container "git-sync" in pod "app-aflow-airflow-web-7c566d79d-4v2wv" not found
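Both kubectl logs calls above defaulted to the git-sync container, which is not the one that is crashing, so the --previous request failed for a container that never restarted. Selecting the container explicitly with -c should return the airflow-web output instead. A minimal sketch, assuming the names from this report; the helper name container_logs is made up for illustration:

```shell
# Hypothetical helper: show logs of one named container in a pod,
# first from the current run, then from the previous (crashed) run.
container_logs() {
  ns="$1"; pod="$2"; container="$3"
  echo "--- current run ---"
  kubectl logs -n "$ns" "$pod" -c "$container"
  echo "--- previous run ---"
  kubectl logs -n "$ns" "$pod" -c "$container" --previous
}

# Example call, using the names from this report:
# container_logs ml-workshop app-aflow-airflow-web-7c566d79d-4v2wv airflow-web
```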
The Service for PostgreSQL exists and the waifordatabase init container ran successfully.
When I deleted everything with:
kubectl delete -f manifests/kfdef/ml-platform.yaml -n ml-workshop
and reapplied it with the same command as above, the airflow2-proxy secret was missing. I added it from manifests/airflow2/base/service-accounts.yaml, and the same error appeared.