I am running a python parallel CPU intensive task on Google Compute Engine. Hence, the greater the number of vCPUs I can run it on, the greater the speed.
I've read that there is no point in creating a multiprocessing pool larger than the number of available vCPUs, which makes sense, so I determine the size of my multiprocessing.dummy.Pool pool using multiprocessing.cpu_count().
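In code, that looks roughly like this:

    from multiprocessing import cpu_count
    from multiprocessing.dummy import Pool

    # size the pool to the number of vCPUs the runtime reports
    pool = Pool(cpu_count())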
I am running this script in a Pod on Google Kubernetes Engine, and during development I tested it on machines with fewer than 96 vCPUs. The automatically determined pool size always seemed to match the number of vCPUs. However, on a machine with 96 vCPUs, multiprocessing.cpu_count() returns 64, not 96. I don't mind setting that size manually to 96, but the question is: will I benefit from those extra 32 vCPUs if Python is not "aware" of them?
The machine is an n1-highcpu-96 (96 vCPUs, 86.4 GB memory) running Container-Optimized OS (cos). The Python version is 3.6.3.
There is an answer on the message board that someone linked to in a comment on the question; however, it seems better to have the answer here on this page, along with some explanation.
The short answer: inside a pod, run grep -c ^processor /proc/cpuinfo - this number should agree with multiprocessing.cpu_count(). If it does, you can trust multiprocessing.cpu_count().
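If you would rather do that check from Python than from the shell, here is a minimal sketch of the same comparison (it simply counts the processor entries in /proc/cpuinfo, which is exactly what the grep counts):

    import multiprocessing

    # count the "processor" lines in /proc/cpuinfo, i.e. what
    # `grep -c ^processor /proc/cpuinfo` reports
    with open("/proc/cpuinfo") as f:
        processors = sum(line.startswith("processor") for line in f)

    print(processors, multiprocessing.cpu_count())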
However, AFAICT, this identifies all the cores on the node and completely ignores the resource limits set in your Kubernetes deployment YAML. For example, your deployment file might contain:
spec:
  containers:
  - image: IMAGENAME
    name: LABEL
    ports:
    - containerPort: 5000
    resources:
      limits:
        cpu: 100m
        memory: 400M
      requests:
        cpu: 50m
        memory: 200M
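For reference, here is roughly how that cpu: 100m limit shows up in the cgroup files that the function below reads; the exact numbers are an assumption based on the kernel's default CFS period of 100000 microseconds:

    # assumed translation of a "cpu: 100m" limit into CFS quota values
    period_us = 100_000                 # default CFS period, in microseconds
    limit_millicores = 100              # the "cpu: 100m" limit above
    quota_us = period_us * limit_millicores // 1000
    print(quota_us)                     # 10000, written to cpu.cfs_quota_us
    print(quota_us / period_us)         # 0.1 of one core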
In this article, the author gives the following function, which respects the resource limits (not requests):
import math
from pathlib import Path

def get_cpu_quota_within_docker():
    cpu_cores = None
    cfs_period = Path("/sys/fs/cgroup/cpu/cpu.cfs_period_us")
    cfs_quota = Path("/sys/fs/cgroup/cpu/cpu.cfs_quota_us")
    if cfs_period.exists() and cfs_quota.exists():
        # we are in a linux container with cpu quotas!
        with cfs_period.open('rb') as p, cfs_quota.open('rb') as q:
            p, q = int(p.read()), int(q.read())
            # get the cores allocated by dividing the quota
            # in microseconds by the period in microseconds
            cpu_cores = math.ceil(q / p) if q > 0 and p > 0 else None
    return cpu_cores
So, for the example YAML, the division yields 0.1, but because of the call to math.ceil, the function returns 1. So what you may be looking for is something like the following (assuming that you have the function get_cpu_quota_within_docker defined as above):
import multiprocessing

from somewhere import get_cpu_quota_within_docker

docker_cpus = get_cpu_quota_within_docker()
cpu_count = docker_cpus if docker_cpus else multiprocessing.cpu_count()
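Putting it together, here is a minimal sketch of sizing a pool with that count (assuming get_cpu_quota_within_docker from above is importable, and with work() as a hypothetical stand-in for your real task):

    import multiprocessing

    from somewhere import get_cpu_quota_within_docker  # placeholder module, as above

    def work(x):
        # stand-in for the real CPU-intensive function
        return x * x

    if __name__ == "__main__":
        docker_cpus = get_cpu_quota_within_docker()
        pool_size = docker_cpus if docker_cpus else multiprocessing.cpu_count()
        with multiprocessing.Pool(processes=pool_size) as pool:
            results = pool.map(work, range(1000))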