I wrote a Python script to dump my database, gzip it and move it to cloud storage.
Locally everything works smoothly and consumes basically no memory (max 20 MB RAM; CPU usage is quite high and I/O is at max).
When I run it as a job in my Kubernetes cluster, the memory usage piles up to about 1.6 GB.
That is more or less the size of my gzipped dump file.
Here's my dumping logic:
import subprocess

mysqldump_command = ['mysqldump', f'--host={host}', f'--port={port}', f'--user={username}',
                     f'--password={password}', '--databases', '--compact',
                     '--routines', db_name, f'--log-error={self.errorfile}']

print('## Creating mysql dump')
with open(self.filename_gzip, 'wb', 0) as f:
    # mysqldump | gzip > dump file, streamed through pipes
    p1 = subprocess.Popen(mysqldump_command, stdout=subprocess.PIPE)
    p2 = subprocess.Popen('gzip', stdin=p1.stdout, stdout=f)
    p1.stdout.close()  # force write error (/SIGPIPE) if p2 dies
    p2.wait()
    p1.wait()
I tried PYTHONUNBUFFERED=1, with no effect. Any further ideas?
Thanks!
You could use pydump, written by GuruTux. As for the buffer limit, the construct should look like this: subprocess.Popen(args, bufsize=-1).
For more detailed usage, have a look at the Python subprocess docs.
Also, you can do a remote mysqldump and just pull the .gz file to save the hassle.
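Applied to your pipeline, that would look roughly like this (untested sketch; mysqldump_command and self.filename_gzip are your existing variables):

    import subprocess

    # Same mysqldump | gzip pipeline, but with buffered pipes (bufsize=-1,
    # i.e. the default buffer size) instead of the unbuffered file handle.
    with open(self.filename_gzip, 'wb') as f:
        p1 = subprocess.Popen(mysqldump_command, stdout=subprocess.PIPE, bufsize=-1)
        p2 = subprocess.Popen(['gzip'], stdin=p1.stdout, stdout=f, bufsize=-1)
        p1.stdout.close()  # let p1 get SIGPIPE if p2 dies
        p2.wait()
        p1.wait()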
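For the remote variant, something along these lines should work (rough sketch; user@dbhost, the database name and the paths are placeholders, and it assumes SSH access to the database host):

    import subprocess

    # Create the gzipped dump on the database host, then copy it over.
    subprocess.run(
        ['ssh', 'user@dbhost', 'mysqldump --databases mydb | gzip > /tmp/dump.sql.gz'],
        check=True)
    subprocess.run(
        ['scp', 'user@dbhost:/tmp/dump.sql.gz', 'dump.sql.gz'],
        check=True)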
I reproduced your setup, and the dump still works with more restrictive limits.
I just set the resource caps in my-job.yml and it does not consume much memory any more:
resources:
  limits:
    memory: "100Mi"
  requests:
    memory: "50Mi"