We have been running some stateful applications (e.g. databases) on AWS on-demand/reserved EC2 instances, and we are now considering moving those apps to a Kubernetes StatefulSet with PVCs.
My question: is it recommended to run a Kubernetes StatefulSet on spot instances to reduce cost? Since we can use kube-spot-termination-notice-handler to taint the node and move the pods elsewhere before the spot instance is terminated, it seems this should be fine as long as the StatefulSet has multiple replicas to avoid service interruption.
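For reference, this is roughly the shape of what we have in mind (names, images and sizes are placeholders): a StatefulSet whose replicas are kept on separate nodes, plus a PodDisruptionBudget so a drain (e.g. one triggered by the termination-notice handler) never evicts more than one replica at a time.

```yaml
# Placeholder manifests: replicas spread across nodes, and a
# PodDisruptionBudget limiting voluntary evictions to one pod at a time.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mydb                      # placeholder name
spec:
  serviceName: mydb
  replicas: 3
  selector:
    matchLabels:
      app: mydb
  template:
    metadata:
      labels:
        app: mydb
    spec:
      affinity:
        podAntiAffinity:          # keep replicas on different nodes
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: mydb
              topologyKey: kubernetes.io/hostname
      containers:
        - name: db
          image: mydb:latest      # placeholder image
          volumeMounts:
            - name: data
              mountPath: /var/lib/db
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi        # placeholder size
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mydb-pdb
spec:
  maxUnavailable: 1               # drain at most one replica at a time
  selector:
    matchLabels:
      app: mydb
```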
IMO, I would not recommend running a critical StatefulSet, such as a critical database, on Spot Instances. Here is some of what could happen in these examples:
MySQL master/slave/clustered: any node going down would lead to unpredictable errors and/or downtime before it recovers or nodes come back up (with different IP addresses!).
Cassandra: any node going up or down causes the cluster to rebalance. If nodes keep going up and down, the cluster will constantly be rebalancing! Not to mention that if all your nodes are on Spot Instances, there is a chance most of them go down at once.
Spots are great for large one-time batch jobs that are not critically time-bound. These can be anything from data processing to, for example, creating or updating an ML model.
They are also great for stateless services, meaning an application that sits behind a load balancer and uses a state store that is not on a spot instance (MySQL, Cassandra, CloudSQL, RDS, etc.); one way to arrange that is sketched after this list.
Spots are also great for test/dev environments, again for jobs/workloads that are not strictly time-bound.
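If you do mix spot and on-demand capacity, one common pattern is to label the node groups and pin the stateful pieces to on-demand nodes while letting stateless workloads land on spot nodes. A rough sketch, assuming hypothetical lifecycle=on-demand / lifecycle=spot node labels and a matching taint on the spot group (adjust to however your node groups are actually tagged):

```yaml
# Pin the stateful workload to on-demand capacity...
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql                    # placeholder
spec:
  serviceName: mysql
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      nodeSelector:
        lifecycle: on-demand     # stays off spot instances
      containers:
        - name: mysql
          image: mysql:8.0
---
# ...and let the stateless API run (and be evicted) on spot nodes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                      # placeholder
spec:
  replicas: 6
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      nodeSelector:
        lifecycle: spot
      tolerations:               # if your spot nodes carry a taint
        - key: lifecycle
          value: spot
          effect: NoSchedule
      containers:
        - name: api
          image: my-api:latest   # placeholder
```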
There is probably no single answer to this question: it really depends on the workload you want to run and how tolerant your application is to failures. When a spot instance is about to be interrupted (a higher bidder, no more capacity available...), a well-designed StatefulSet or any other appropriate controller will indeed do its job as expected, and usually pretty quickly (seconds).
But be aware that it is wrong to assert that the termination notice will always arrive in time to move your pods.
See the AWS documentation itself, https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-interruptions.html#using-spot-instances-managing-interruptions, and here's the excerpt: "[...] it is possible that your Spot Instance is terminated before the warning can be made available".
So the real question is: how tolerant is your application to the unannounced removal of resources?
If you have just 2 EC2 instances running hundreds of pods each, you will most likely NOT want to use spot instances: your service will be highly degraded if one of the 2 instances is interrupted, until a new one spins up or k8s redispatches the load (assuming the other instance is big enough). Hundreds of EC2 instances with a few pods each and slightly over-provisioned autoscaling rules? You might as well go for it and pocket the spot cost savings!
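To illustrate the "many nodes, few pods each" setup, a topology spread constraint (illustrative values, placeholder names) keeps replicas evenly distributed so a single spot interruption only removes a small slice of capacity:

```yaml
# Illustrative: spread API replicas evenly across nodes so one spot
# interruption only takes out a small fraction of serving capacity.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                      # placeholder
spec:
  replicas: 20
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: api
      containers:
        - name: api
          image: my-api:latest   # placeholder
```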
You'll also want to double-check your clients' behaviour: assuming you run an API on k8s and a pod is stopped before responding, make sure your clients handle that scenario and fire another request, or at the very least fail gracefully.
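On the server side you can help those clients by letting pods drain in-flight requests when they are evicted; a minimal sketch with illustrative values and a hypothetical /healthz endpoint:

```yaml
# Give the pod a short window to finish in-flight requests when it is
# evicted from an interrupted spot node (values are illustrative).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                      # placeholder
spec:
  replicas: 6
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      terminationGracePeriodSeconds: 30
      containers:
        - name: api
          image: my-api:latest   # placeholder
          lifecycle:
            preStop:
              exec:
                # brief pause so the endpoint is removed from the
                # Service before the process receives SIGTERM
                command: ["sh", "-c", "sleep 10"]
          readinessProbe:
            httpGet:
              path: /healthz     # assumed health endpoint
              port: 8080
```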
But you spoke of databases: so how about replication? Is it fast and automated? Are there enough copies of the data to tolerate the loss of 1 to n replicas?
In other words: it just needs some good planning and thorough testing at scale. The good news is that it's easy to do: run a load test and voluntarily crash an instance; the answers will be waiting for you there!