We're running multiple Kubernetes clusters that run Cassandra. Our usual procedure for a rolling restart of the Cassandra pods is to log into each pod, run nodetool drain,
and then trigger a recreation of that pod. But often when the pods restart we get errors like:
ERROR [HintsDispatcher:2] 2017-08-07 11:09:32,489 HintsDispatchExecutor.java:243 - Failed to dispatch hints file 5fdd139d-4465-4825-85ef-f380bddcb67d-1502100535128-1.hints: file is corrupted ({})
Those corrupt files prevent Cassandra from starting. Is there a way to tell Cassandra to flush all buffers and stop writing, before stopping it, to ensure there are no corrupt files left behind?
You can try to disable hinted handoff, or try to truncate hints after the drain:
nodetool truncatehints
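Truncating discards the hints this node was holding for other replicas, so if you care about consistency, run a repair (nodetool repair) once the restart is complete.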
Warning: if you write at consistency level ANY or use a replication factor of 1 (RF=1), this may lead to some data loss.
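For a rolling restart, the drain and truncate steps above could be wrapped per pod roughly like this. It is only a sketch: the function name, the KUBECTL and NAMESPACE variables, and the pod name in the usage line are made up for illustration, and it assumes nodetool still responds inside the pod after the drain and that a controller (for example a StatefulSet) recreates the pod once it is deleted.

```shell
# Hypothetical helper: drain one Cassandra pod, drop its stored hints, recreate it.
# KUBECTL and NAMESPACE are illustrative overrides, not part of any standard setup.
KUBECTL="${KUBECTL:-kubectl}"
NAMESPACE="${NAMESPACE:-default}"

restart_cassandra_pod() {
    pod="$1"
    if [ -z "$pod" ]; then
        echo "usage: restart_cassandra_pod <pod-name>" >&2
        return 2
    fi
    # Flush memtables to disk and stop the node from accepting writes.
    $KUBECTL exec -n "$NAMESPACE" "$pod" -- nodetool drain || return 1
    # Delete the hints files stored on this node so none are left half-written.
    $KUBECTL exec -n "$NAMESPACE" "$pod" -- nodetool truncatehints || return 1
    # Delete the pod; assumes its controller brings up a replacement.
    $KUBECTL delete pod -n "$NAMESPACE" "$pod" || return 1
}
```

You would call it once per pod, e.g. `restart_cassandra_pod cassandra-0`, wait for the replacement to become ready before moving to the next one, and run nodetool repair after the last pod is back.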