What is the difference between running 2 pods (2 replicas) in Kubernetes vs. one larger pod?
I have set up a pod with a 20m memory request/limit. Is it better to have 2 replicas with 20m limits each, or a single pod with a 40m memory request/limit?
Personally, I think you will get better performance by running multiple pods on the same host. I don't know which web server you use, but each request is processed with a limited amount of CPU time, even if the server has multiple processes or threads to do the work. Multiple processes also use CPU time more efficiently, because one process can run while another is waiting on network I/O. To improve throughput, scale horizontally by adding processes or instances; otherwise response times get slower over time as the load builds up.
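To make the I/O-wait argument concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch under stated assumptions: each worker handles one request at a time, every request needs a fixed amount of CPU time plus a fixed I/O wait, and the function name and all numbers are made up for illustration rather than measured from a real server.

```python
def max_throughput(workers: int, cpu_s: float, io_wait_s: float, cores: float) -> float:
    """Upper bound on requests/second for `workers` synchronous workers
    sharing `cores` CPUs (hypothetical model, not a measurement)."""
    # Each worker completes one request every (cpu_s + io_wait_s) seconds.
    worker_bound = workers / (cpu_s + io_wait_s)
    # All workers together cannot use more CPU time than the cores provide.
    cpu_bound = cores / cpu_s
    return min(worker_bound, cpu_bound)


if __name__ == "__main__":
    # Assumed workload: 5 ms of CPU and 15 ms of I/O wait per request, 1 core.
    for n in (1, 2, 4, 8):
        print(n, max_throughput(n, cpu_s=0.005, io_wait_s=0.015, cores=1.0))
```

With these made-up numbers the bound grows with the number of workers (roughly 50, 100, then 200 requests per second) until the single core is saturated, after which extra workers add nothing. That is why horizontal scaling helps most when requests spend much of their time waiting on I/O.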
It depends mainly on the requirements of the web/mobile application being hosted, which you can ascertain by benchmarking the app's performance under the 20m and 40m configurations. Overall, you can expect better performance from the application running at 40m and scaling elastically when user traffic requires it. Running two pods in different data centers will give better failover in case of a system crash or other issues. You may pay more to run two pods when supporting the same rate of web traffic.
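One caveat when setting up such a benchmark, assuming the figures in the question are meant as megabytes: in Kubernetes quantity notation the lowercase `m` suffix means "milli", so a memory value of `20m` is read as 0.02 bytes, whereas `20Mi` is 20 mebibytes. The Python sketch below illustrates the difference. `parse_quantity` is a hypothetical helper written for this answer, not a Kubernetes API, and it handles only a small subset of the suffixes.

```python
from decimal import Decimal

# A subset of the suffix multipliers used by Kubernetes resource quantities.
_SUFFIXES = {
    "Ki": Decimal(2) ** 10,
    "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30,
    "m": Decimal("0.001"),  # milli, NOT mega
    "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9,
}


def parse_quantity(text: str) -> Decimal:
    """Convert a quantity string such as '20Mi' to a plain number (bytes for memory)."""
    for suffix, factor in _SUFFIXES.items():
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * factor
    # No recognized suffix: treat the value as a plain number.
    return Decimal(text)


if __name__ == "__main__":
    for value in ("20m", "20M", "20Mi", "40Mi"):
        print(value, "=", parse_quantity(value), "bytes")
```

If megabytes were intended, the two setups to compare would be two replicas at `20Mi` each versus one pod at `40Mi`.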
I think there is no golden rule for planning your infrastructure capacity to meet a specific level of your application's or service's objectives. Start by collecting some key performance metrics of your application, and then use those monitoring stats to size your pods properly. Kubernetes features such as Horizontal and Vertical Pod Autoscaling can help with this.
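To give a feel for how collected metrics drive the Horizontal Pod Autoscaler, the Kubernetes documentation describes its core calculation as `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, skipping any change when the ratio is within a tolerance of 1.0 (0.1 by default). The Python sketch below reproduces only that formula. The function name is my own, and the real controller also applies min/max replica bounds, stabilization windows, and handling for pods that are not ready.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    tolerance: float = 0.1,
) -> int:
    """Simplified version of the documented HPA scaling formula."""
    ratio = current_metric / target_metric
    # Within the tolerance band the autoscaler leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # Assumed example: 2 replicas averaging 200m CPU against a 100m target.
    print(desired_replicas(2, current_metric=200, target_metric=100))
```

In the assumed example the measured value is twice the target, so the formula asks for twice as many replicas.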