We have an app in production which needs to be highly available (100%), so we did the following:
My questions are:
How does affinity work with a Pod Disruption Budget? Could there be any collision, or are these redundant configs?
Affinity and anti-affinity are about where your Pods are scheduled, e.g. so that two replicas of the same app are not scheduled to the same node. A Pod Disruption Budget is about keeping enough replicas running during voluntary disruptions, e.g. draining a node for maintenance. Both improve the availability of your app, but they are not related to each other, so they neither collide nor make each other redundant.
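As a rough sketch of how the two fit together, the manifests below combine a required pod anti-affinity rule (at most one replica per node) with a Pod Disruption Budget that keeps at least two replicas up during voluntary evictions. The name `my-app`, its label, and the image are hypothetical placeholders, not taken from your setup, and `policy/v1` assumes a reasonably recent cluster version.

```yaml
# Hypothetical Deployment: 3 replicas, never two on the same node.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: do not schedule onto a node that already runs a my-app Pod.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: my-app
              topologyKey: kubernetes.io/hostname
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.0.0  # placeholder image
---
# Hypothetical PDB: voluntary evictions (e.g. kubectl drain) must leave 2 Pods running.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
```

Note that with a required anti-affinity rule the cluster needs at least as many schedulable nodes as replicas; otherwise the extra Pods stay Pending. The PDB only limits voluntary evictions, so it does not protect against a node crashing.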
Is there any other configuration which I need to add to make sure that my pods always run (as much as possible)?
Things will fail. What you need to do is embrace distributed systems and make all your workloads distributed systems, e.g. with multiple instances to remove any single point of failure. This is done differently for stateless (e.g. Deployment) and stateful (e.g. StatefulSet) workloads. What's important for you is that your app is available as much as possible, while individual instances (e.g. Pods) can fail almost without any user noticing.
We configured 3 instances for HA, but then the node died
Things will always fail. E.g. a physical node may crash. You need to design your apps so that they can tolerate some failures.
If you use a cloud provider, you should use regional clusters that span three independent Availability Zones, and you need to spread your workload so that it runs in more than one Availability Zone. That way, your app can tolerate a whole Availability Zone going down without affecting your users.
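One possible way to get this spreading is a topology spread constraint on the well-known zone label. This is only a sketch: the name `my-app`, its label, and the image are hypothetical, and it assumes your nodes carry the `topology.kubernetes.io/zone` label (cloud providers normally set it).

```yaml
# Hypothetical Deployment that spreads its replicas evenly across zones.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      topologySpreadConstraints:
        # Allow at most a difference of 1 Pod between any two zones.
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: my-app
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.0.0  # placeholder image
```

With three replicas and three zones this should place one Pod per zone, so losing one zone still leaves two replicas serving traffic. If you would rather have Pods scheduled unevenly than left Pending, `whenUnsatisfiable: ScheduleAnyway` is the softer alternative.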