Readiness probe during Spring context startup

8/15/2018

We are deploying our Spring Boot applications in OpenShift.

Currently we are trying to run a potentially long-running task (a database migration) before the web context is fully set up. It is especially important that the app does not accept REST requests or process messages before the migration has completed. See the following minimal example:

// DemoApplication.java
@SpringBootApplication
public class DemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }
}

// MigrationConfig.java
@Configuration
@Slf4j
public class MigrationConfig {
    @PostConstruct
    public void run() throws InterruptedException {
        log.info("Migration...");
        // long running task
        Thread.sleep(10000);
        log.info("...Migration");
    }
}

// Controller.java
@RestController
public class Controller {

    @GetMapping("/test")
    public String test() {
        return "test";
    }
}

// MessageHandler.java
@EnableBinding(Sink.class)
public class MessageHandler {
    @StreamListener(Sink.INPUT)
    public void handle(String message) {
        System.out.println("Received: " + message);
    }
}

This works fine so far: the configuration class is processed before the app responds to requests. What we are worried about, however, is OpenShift's readiness probe: currently we use an actuator health endpoint to check whether the application is up and running. If the migration takes a long time, OpenShift might stop the container, potentially leaving us with inconsistent state in the database.

Does anybody have an idea how we could communicate that the application is starting, but prevent REST controller or message handlers from running?

Edit

There are multiple ways of blocking incoming REST requests; @martin-frey suggested a servlet filter.

The larger problem for us is the stream listener. We use Spring Cloud Stream to listen to a RabbitMQ queue. I added an example handler above. Do you have any suggestions on how to "pause" that?

-- elactic
kubernetes
openshift
spring
spring-boot
spring-boot-actuator

5 Answers

8/15/2018

I think your app pod can run undisturbed if you set an initialDelaySeconds large enough to cover your application's initialization. [0][1]

readinessProbe:
  httpGet:
    path: /_status/healthz
    port: 8080
  initialDelaySeconds: 10120
  timeoutSeconds: 3
  periodSeconds: 30
  failureThreshold: 100
  successThreshold: 1

Additionally, I recommend setting up a liveness probe with the same conditions (but with larger timing values than the readiness probe), so that your pods are recovered automatically if the application has still failed to start once initialDelaySeconds has elapsed.

[0] https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#define-readiness-probes

[1] https://docs.openshift.com/container-platform/latest/dev_guide/application_health.html

-- Daein Park
Source: StackOverflow

8/15/2018

What about a servlet filter that knows about the state of the migration? That way you should be able to handle any inbound request and return a response code of your choosing. There would also be no need to block any request handlers until the system is fully up.
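
As a rough sketch of that idea, the per-request decision of such a filter could look like the following. The class names, the `PASS_THROUGH` sentinel, and the exempt path prefix are hypothetical and chosen only for illustration; the servlet wiring is shown as a comment so that the block compiles with the JDK alone.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical holder for the migration state; the migration code calls markDone() when it finishes.
final class MigrationState {
    private final AtomicBoolean done = new AtomicBoolean(false);

    void markDone() { done.set(true); }

    boolean isDone() { return done.get(); }
}

// The decision the filter makes for each request, kept free of servlet types.
final class MigrationFilterLogic {
    static final int PASS_THROUGH = 0;          // sentinel: hand the request to the rest of the chain
    static final int SERVICE_UNAVAILABLE = 503; // HTTP status sent while the migration runs

    private final MigrationState state;
    private final String exemptPrefix; // assumption: paths that must always be answered

    MigrationFilterLogic(MigrationState state, String exemptPrefix) {
        this.state = state;
        this.exemptPrefix = exemptPrefix;
    }

    int decide(String requestPath) {
        if (state.isDone() || requestPath.startsWith(exemptPrefix)) {
            return PASS_THROUGH;
        }
        return SERVICE_UNAVAILABLE;
    }
}

// Wiring inside javax.servlet.Filter#doFilter would look roughly like this (not compiled here):
//   int status = logic.decide(((HttpServletRequest) request).getRequestURI());
//   if (status == MigrationFilterLogic.PASS_THROUGH) chain.doFilter(request, response);
//   else ((HttpServletResponse) response).sendError(status);
```

Whether the health endpoint should be exempt depends on which probe hits it: a liveness probe probably needs an answer during the migration, while a readiness probe arguably should keep failing until the migration is done.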

-- Martin Frey
Source: StackOverflow

8/15/2018

How about adding an init container whose only role is the DB migration, without the application, and then another container to serve the application? But be careful when deploying the application with more than one replica: if you are using a Deployment, the replicas will also execute the init container at the same time. If you need multiple replicas, you might want to consider a StatefulSet instead.

-- Bal Chua
Source: StackOverflow

8/16/2018

Such database migrations are best handled by switching to a Recreate deployment strategy and doing the migration in a mid lifecycle hook. At that point no instances of your application are running, so it can be done safely. If you can't have downtime, then the application needs to be switchable to some offline or read-only mode against a copy of your database while the migration runs.

-- Graham Dumpleton
Source: StackOverflow

8/16/2018

Don't keep the context busy with a long task in @PostConstruct. Instead, start the migration as a fully asynchronous task and let Spring build the rest of the context in the meantime. At the end of the task, complete a shared Future with success or failure.

Wrap the controller in a proxy (AOP can help with this, for example) in which every method except the health check tries to get the value from that future within a timeout. If it succeeds, the migration is done and all calls are available. If not, reject the call. The proxy serves as a gate that exposes only the part of the API that must stay available while the migration is running. The rest can simply respond with 503, indicating that the service is not ready yet. Those 503 responses could be improved further by measuring and averaging how long the migration typically takes and returning that value in a Retry-After header.

With the MessageHandler you can do essentially the same thing: wait for the result of the future in the handle method (provided message handlers are allowed to block indefinitely). Once the result is set, message handling proceeds from that moment on.
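
A minimal sketch of the shared-future gate described here, using only the JDK. `MigrationGate`, `isOpen`, and `awaitOpen` are hypothetical names, and the sketch assumes the migration can be passed in as a plain `Runnable`; the AOP proxy and the Spring wiring are left out.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical gate around a shared future that completes when the migration ends.
final class MigrationGate {
    private final CompletableFuture<Void> migration;

    MigrationGate(Runnable migrationTask) {
        // Run the migration off the startup thread so the rest of the context can be built.
        this.migration = CompletableFuture.runAsync(migrationTask);
    }

    // For REST calls: wait up to timeoutMillis, then report whether the migration succeeded.
    boolean isOpen(long timeoutMillis) {
        try {
            migration.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            return false; // still running, or failed: the caller answers 503
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to the caller
            return false;
        }
    }

    // For message handlers: block until the migration finishes.
    // Throws CompletionException if the migration task failed.
    void awaitOpen() {
        migration.join();
    }
}

final class MigrationGateDemo {
    public static void main(String[] args) {
        MigrationGate gate = new MigrationGate(() -> {
            try {
                Thread.sleep(200); // stand-in for the long-running migration
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        });
        System.out.println("open right away? " + gate.isOpen(10));
        gate.awaitOpen();
        System.out.println("open after waiting? " + gate.isOpen(10));
    }
}
```

In this sketch, a controller proxy would call `isOpen` with a short timeout and answer 503 (optionally with a Retry-After header) when it returns false, while the `handle` method of the message handler would call `awaitOpen()` before processing the message.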

-- yegodm
Source: StackOverflow