GitLab CI/Kubernetes - run postgres migration for test env (not production)

6/21/2017

I'm pushing out my Phoenix app to a Kubernetes cluster for testing via GitLab. I'd like to be able to run mix ecto.migrate in my gitlab-ci.yml script once my app and the postgres service are ready. Here's a snippet from the gitlab-ci.yml file:

review:
  stage: review
  image: dtzar/helm-kubectl

  environment:
    name: review/$CI_COMMIT_REF_NAME
    url: https://$CI_PROJECT_NAME-${CI_ENVIRONMENT_SLUG}.$KUBE_DOMAIN
    on_stop: stop_review

  before_script:
    - command deploy/kinit.sh

  script:
    - helm upgrade --install db --wait --set postgresDatabase=app_db stable/postgresql
    - helm upgrade --install app ./deploy/app_chart --wait --set env.DATABASE_URL="${DATABASE_URL}"

    - export POD_NAME=`kubectl get pod -l "app=${CI_ENVIRONMENT_SLUG}" -o jsonpath='{.items[0].metadata.name}'`
    - kubectl exec $POD_NAME -- mix ecto.migrate

From what I understand, the --wait parameter means that helm won't return until each deployment has completed (in its entirety) before moving on. What I'm finding is that although the postgres deployment completes, that doesn't mean the postgres server itself is ready to accept connections.
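In other words, the exec step has to tolerate a window where the pod is up but postgres is not. One way to illustrate closing that gap at the CI level is a retry wrapper around the migration command. This is only a sketch; the `retry` helper name and the attempt count are my own, not part of the original script:

```shell
# Hypothetical retry helper: run a command until it succeeds,
# up to "max" attempts, sleeping one second between tries.
retry() {
  max=$1; shift
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "failed after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep 1
  done
}

# In the CI script this could wrap the exec step, e.g.:
#   retry 30 kubectl exec $POD_NAME -- mix ecto.migrate
```

The trade-off is exactly the one described below: this is hand-rolled timeout/retry code living in gitlab-ci.yml rather than a mechanism Kubernetes provides.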

More often than not, when the kubectl exec command runs, I get the following error:

** (exit) exited in: :gen_server.call(#PID<0.183.0>, {:checkout, #Reference<0.0.1.2678>, true, :infinity}, 5000)
    ** (EXIT) time out
    (db_connection) lib/db_connection/poolboy.ex:112: DBConnection.Poolboy.checkout/3
    (db_connection) lib/db_connection.ex:919: DBConnection.checkout/2
    (db_connection) lib/db_connection.ex:741: DBConnection.run/3
    (db_connection) lib/db_connection.ex:1132: DBConnection.run_meter/3
    (db_connection) lib/db_connection.ex:584: DBConnection.prepare_execute/4
    (ecto) lib/ecto/adapters/postgres/connection.ex:93: Ecto.Adapters.Postgres.Connection.execute/4
    (ecto) lib/ecto/adapters/sql.ex:243: Ecto.Adapters.SQL.sql_call/6
    (ecto) lib/ecto/adapters/sql.ex:193: Ecto.Adapters.SQL.query!/5

When I look at the Kubernetes UI, I can see the following error against my postgres pod:

SchedulerPredicates failed due to PersistentVolumeClaim is not bound: "db-postgresql", which is unexpected.

After seeing this message, I monitor the pods and everything eventually recovers, but not before my deployment script has already failed.

My initial thought is that I could create an initContainer for my app that uses psql to successfully connect to the server and check for the existence of the "app_db" database. That way I don't have to worry about writing my own code for timeouts and retries - I can just take advantage of the built-in mechanism provided by Kubernetes.

However, I don't want to do this in my production environment (I want to run mix ecto.migrate on the production system manually). In that case, the initContainer would simply waste system resources.
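For reference, the initContainer idea described above could look something like this in the app's Deployment spec. This is only a sketch: the host name `db-postgresql` and database name `app_db` follow the helm release in the question, while the image tag and the `wait-for-postgres` name are assumptions, and credentials/PGPASSWORD handling is omitted:

```yaml
# Sketch of the initContainer approach: the app container does not start
# until this container exits successfully. Kubernetes restarts a failing
# init container automatically, so no hand-rolled retry logic is needed.
spec:
  initContainers:
  - name: wait-for-postgres
    image: postgres:9.6
    command: ['sh', '-c', 'until psql -h db-postgresql -U postgres -lqt | grep -qw app_db; do sleep 2; done']
```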

Is there a robust way I can achieve this through the gitlab-ci.yml script?

-- Mitkins
gitlab-ci
kubernetes
kubernetes-helm
phoenix-framework

1 Answer

7/10/2017

From a conceptual point of view I would:

  1. Configure a readiness probe on my Postgres container, so that the Pod is not reported as Ready until the engine is actually accepting connections.

    # in the Pod template:
    # spec.containers['postgres']
    
    readinessProbe:
      exec:
        command:
        - psql
        - -U
        - postgres
        - -c
        - 'SELECT 1'
      initialDelaySeconds: 5
      periodSeconds: 5
  2. Wait for the Pod's Ready condition to become "True" before executing my Mix task. (Note that the Pod phase switches to "Running" as soon as its containers have started, even while the readiness probe is still failing, so the Ready condition, not the phase, is the thing to poll.)

    # in gitlab-ci.yml, before "mix ecto.migrate"
    
    - |
        while [ "$(kubectl get pod $POD_NAME -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}')" != "True" ]; do
          sleep 1;
        done
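One caveat with the polling loop in step 2: if the Pod never reaches the expected state, the CI job will spin forever until GitLab's own job timeout kills it. A bounded variant could look like the following sketch; the `wait_for` helper name and the timeout value are illustrative, not from the answer:

```shell
# Hypothetical wait_for helper: poll a command once per second
# until it succeeds or "timeout" seconds have elapsed.
wait_for() {
  timeout=$1; shift
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${elapsed}s waiting for: $*" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}

# Usage in the CI script, wrapping whatever kubectl check step 2 uses:
#   wait_for 120 sh -c "<kubectl check from step 2>"
```

Failing fast with a clear message is usually preferable in CI to hanging until the job-level timeout fires.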
-- Antoine Cotten
Source: StackOverflow