Kubernetes Pod with Python script consuming too much memory

7/4/2018

I wrote a Python script to dump my database, gzip it, and move it to cloud storage.

Locally everything works smoothly and consumes basically no memory (max 20 MB RAM; CPU is quite high, I/O is at max).

When I run it as a job in my Kubernetes cluster, the memory usage piles up to about 1.6 GB.

That is more or less the size of my gzipped dump file.

Here's my dumping logic:

mysqldump_command = ['mysqldump', f'--host={host}', f'--port={port}', f'--user={username}',
                     f'--password={password}', '--databases', '--compact',
                     '--routines', db_name, f'--log-error={self.errorfile}']

print('## Creating mysql dump')
with open(self.filename_gzip, 'wb', 0) as f:
    # Stream mysqldump straight into gzip; neither process should need to
    # hold the whole dump in memory.
    p1 = subprocess.Popen(mysqldump_command, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(['gzip'], stdin=p1.stdout, stdout=f)
p1.stdout.close()  # force a write error (SIGPIPE) in mysqldump if gzip dies
p2.wait()
p1.wait()

I tried:

  • Setting PYTHONUNBUFFERED=1; it had no effect.
  • I tried this logic, but it was even worse.
  • I tried creating the dump as a plain file first and gzipping it afterwards; that was the worst of all my experiments (a rough sketch of that variant follows this list).
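
A simplified sketch of that dump-first-then-gzip variant (file paths are placeholders, mysqldump_command is the same list as above):

import gzip
import shutil
import subprocess

dump_file = '/tmp/dump.sql'      # placeholder path
gzip_file = '/tmp/dump.sql.gz'   # placeholder path

# Write the plain dump to disk first ...
with open(dump_file, 'wb') as f:
    subprocess.run(mysqldump_command, stdout=f, check=True)

# ... then compress it in buffered chunks.
with open(dump_file, 'rb') as src, gzip.open(gzip_file, 'wb') as dst:
    shutil.copyfileobj(src, dst)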

Any further ideas?

Thanks!

-- Ron
kubernetes
memory
mysql
python

2 Answers

7/5/2018

You could use pydump, written by GuruTux. As for the buffer limit, the construct should look like subprocess.Popen(args, bufsize=-1). For more detailed usage, have a look at the Python subprocess docs.
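
Applied to your pipeline it would look roughly like this (a sketch only; the command arguments and file name are placeholders, and bufsize=-1 simply selects the io module's default buffer size):

import subprocess

mysqldump_command = ['mysqldump', '--host=db-host', '--user=backup', 'mydb']  # placeholder args

with open('dump.sql.gz', 'wb', 0) as f:
    p1 = subprocess.Popen(mysqldump_command, stdout=subprocess.PIPE, bufsize=-1)
    p2 = subprocess.Popen(['gzip'], stdin=p1.stdout, stdout=f, bufsize=-1)
    p1.stdout.close()  # let mysqldump see a broken pipe if gzip dies
    p2.wait()
    p1.wait()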

Also, you can do a remote mysql dump and just pull the .gz file to save the hassle.

-- Crou
Source: StackOverflow

7/6/2018

I reproduced your setup and the dump still works with more restrictive limits.

I just set the resource limits in my-job.yml, and it does not consume much memory any more:

resources:
  limits:
    memory: "100Mi"
  requests:
    memory: "50Mi"
-- Marius
Source: StackOverflow