I run a program in a container in a k8s pod, with the SYS_ADMIN capability. The program allocates a 2 MB huge page with mmap(), which succeeds. It then calls mlock() on that memory, which fails with ENOMEM.
I looked at the mlock() man page entries for ENOMEM, and none of the listed reasons seems to explain the issue.
Running the program directly on the host works.
Running the program in a plain docker container, with SYS_ADMIN and the same image, also works.
I compared the OCI config.json file for the direct docker case against the k8s case shown below, and I don't see any interesting difference.
versions:
axe@axe-tester:~$ cat /proc/version
Linux version 4.15.0-29-generic (buildd@lgw01-amd64-057) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #31-Ubuntu SMP Tue Jul 17 15:39:52 UTC 2018
axe@axe-tester:~$ kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.2", GitCommit:"66049e3b21efe110454d67df4fa62b08ea79a19b", GitTreeState:"clean", BuildDate:"2019-05-16T16:23:09Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.2", GitCommit:"66049e3b21efe110454d67df4fa62b08ea79a19b", GitTreeState:"clean", BuildDate:"2019-05-16T16:14:56Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
axe@axe-tester:~$ docker version
Client:
 Version: 18.09.2
 API version: 1.39
 Go version: go1.10.4
 Git commit: 6247962
 Built: Tue Feb 26 23:52:23 2019
 OS/Arch: linux/amd64
 Experimental: false
Server:
 Engine:
  Version: 18.09.2
  API version: 1.39 (minimum version 1.12)
  Go version: go1.10.4
  Git commit: 6247962
  Built: Wed Feb 13 00:24:14 2019
  OS/Arch: linux/amd64
  Experimental: false
The pod is created from the following YAML file. The test program is at /tmp/test on the host and is mounted at /test in the container.
apiVersion: v1
kind: Pod
metadata:
  name: test
  annotations:
    seccomp.security.alpha.kubernetes.io/pod: docker/default
spec:
  restartPolicy: Never
  containers:
  - name: t
    image: amazonlinux:2
    imagePullPolicy: Never
    command: ["sleep", "1200"]
    securityContext:
      capabilities:
        add: ["SYS_ADMIN", "IPC_LOCK"]
    volumeMounts:
    - mountPath: /test
      name: test
    resources:
      limits:
        hugepages-2Mi: 100Mi
        memory: 100Mi
      requests:
        memory: 100Mi
  volumes:
  - name: test
    hostPath:
      path: /tmp/test
steps to reproduce:
kubectl create -f /tmp/test.yml
kubectl exec -it test -- /bin/bash
# in the container..
bash-4.2# /test
Previous limits: soft=16777216; hard=16777216
mlock failed: Cannot allocate memory
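The limits printed above can be checked against the first ENOMEM case in the mlock() man page (an unprivileged caller trying to lock more than RLIMIT_MEMLOCK allows). Below is a minimal sketch of that check. The function name fits_memlock_limit and its already_locked parameter are made up here for illustration and are not part of the test program.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: returns true if locking `size` more bytes would
 * stay within the soft RLIMIT_MEMLOCK value `soft`, given that
 * `already_locked` bytes are locked already.
 */
bool fits_memlock_limit(size_t size, size_t already_locked, rlim_t soft)
{
    if (soft == RLIM_INFINITY)
        return true;            /* no limit configured */
    if (already_locked > soft)
        return false;           /* already over the limit */
    return size <= soft - already_locked;
}
```

With the values from the run above, fits_memlock_limit(2 * 1024 * 1024, 0, 16777216) returns true, so a 2 MB request is well under the 16 MB soft limit. As far as I understand the man page, the limit is not enforced at all for a process that holds CAP_IPC_LOCK, so this case does not seem to apply either way.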
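test program:
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Older libc headers may lack MAP_HUGE_2MB: 21 = log2(2 MB), 26 = MAP_HUGE_SHIFT */
#ifndef MAP_HUGE_2MB
#define MAP_HUGE_2MB (21 << 26)
#endif

#define MMAP_FLAGS (MAP_PRIVATE | MAP_HUGETLB | MAP_HUGE_2MB | MAP_ANONYMOUS)
#define MMAP_MIN_SIZE (2 * 1024 * 1024)

/* Map `size` bytes backed by 2 MB huge pages and pin them.
 * Returns MAP_FAILED on any error. */
void *dma_mp_mmap_hugetlb(size_t size)
{
    void *va = mmap(NULL, size, PROT_READ | PROT_WRITE, MMAP_FLAGS, -1, 0);
    if (va == MAP_FAILED) {
        perror("mmap failed");
        return MAP_FAILED;
    }

    /* Pin the memory */
    if (mlock(va, size) != 0) {
        perror("mlock failed");
        munmap(va, size);   /* do not leak the mapping on failure */
        return MAP_FAILED;
    }
    return va;
}

int main(void)
{
    struct rlimit old;

    if (getrlimit(RLIMIT_MEMLOCK, &old) != 0) {
        perror("getrlimit failed");
        return 1;
    }
    printf("Previous limits: soft=%lld; hard=%lld\n",
           (long long) old.rlim_cur, (long long) old.rlim_max);

    /* The helper reports failure as MAP_FAILED, not NULL */
    if (dma_mp_mmap_hugetlb(MMAP_MIN_SIZE) == MAP_FAILED)
        return 1;
    return 0;
}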
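To see what the huge page pool looks like from inside the container around the failing call, the counters in /proc/meminfo can be printed before mmap() and before mlock(). Below is a rough sketch. meminfo_value and read_meminfo are hypothetical helper names, and the 8192-byte buffer is an assumption about the size of /proc/meminfo. As far as I know these counters are node-wide rather than per pod, so they only show whether the host pool itself is exhausted.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: look for a line starting with "key:" in
 * meminfo-style text and return the number that follows, or -1 if the
 * key is missing or has no numeric value.
 */
long meminfo_value(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long value;
            if (sscanf(line + klen + 1, "%ld", &value) == 1)
                return value;
            return -1;
        }
        line = strchr(line, '\n');   /* advance to the next line */
        if (line != NULL)
            line++;
    }
    return -1;
}

/* Hypothetical helper: read /proc/meminfo and return one field, or -1. */
long read_meminfo(const char *key)
{
    char buf[8192];   /* assumed to be large enough for /proc/meminfo */
    FILE *f = fopen("/proc/meminfo", "r");
    size_t n;

    if (f == NULL)
        return -1;
    n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    return meminfo_value(buf, key);
}
```

Calling read_meminfo("HugePages_Free") and read_meminfo("HugePages_Rsvd") just before mlock() in the test program would show whether any 2 MB pages are still free on the node at that point.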