I'm running a pod in Kubernetes with hugepages allocated on the host and requested in the pod spec. The Kubernetes worker node is a VM, and that VM (the host) has hugepages allocated. However, the pod fails to allocate hugepages: the application gets a SIGBUS when it writes to its first hugepage mapping.
The pod definition includes hugepages:
    securityContext:
      allowPrivilegeEscalation: true
      privileged: true
      runAsUser: 0
      capabilities:
        add: ["SYS_ADMIN", "IPC_LOCK"]
    resources:
      requests:
        intel.com/intel_sriov_netdevice : 2
        memory: 2Gi
        hugepages-2Mi: 4Gi
      limits:
        intel.com/intel_sriov_netdevice : 2
        memory: 2Gi
        hugepages-2Mi: 4Gi
    volumeMounts:
    - mountPath: /sys
      name: sysfs
    - mountPath: /dev/hugepages
      name: hugepage
      readOnly: false
    volumes:
    - name: hugepage
      emptyDir:
        medium: HugePages
    - name: sysfs
      hostPath:
        path: /sys
The VM hosting the pod has hugepages allocated:
cat /proc/meminfo | grep -i hug
AnonHugePages: 0 kB
HugePages_Total: 4096
HugePages_Free: 4096
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
The following piece of code runs fine on the VM hosting the pod: I can see the hugepage files being created in /dev/hugepages, and the HugePages_Free counter decreases while the process is running.
    #include <stdio.h>
    #include <sys/mman.h>
    #include <errno.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <fcntl.h>

    #define LENGTH (2UL*1024*1024)
    #define FILE_NAME "/dev/hugepages/hugepagefile"

    static void write_bytes(char *addr)
    {
        unsigned long i;

        for (i = 0; i < LENGTH; i++)
            *(addr + i) = (char)i;
    }

    int main ()
    {
        void *addr;
        int i;
        char buf[32];
        int fd;

        for (i = 0 ; i < 16 ; i++ ) {
            sprintf(buf, "%s_%d", FILE_NAME, i);
            fd = open(buf, O_CREAT | O_RDWR, 0755);
            addr = mmap((void *)(0x0UL), LENGTH, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_HUGETLB , fd, 0);
            printf("address returned %p \n", addr);
            if (addr == MAP_FAILED) {
                perror("mmap ");
            } else {
                write_bytes(addr);
                //munmap(addr, LENGTH);
                //unlink(FILE_NAME);
            }
            close(fd);
        }
        while (1){}
        return 0;
    }
But if I run the same code in the pod, I get a SIGBUS while trying to write to the first hugepage allocated.
Results on the VM (hosting the pod):
root@k8s-1:~# cat /proc/meminfo | grep -i hug
AnonHugePages: 0 kB
HugePages_Total: 4096
HugePages_Free: 4096
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
root@k8s-1:~# ./mmap &
[1] 19428
root@k8s-1:~# address returned 0x7ffff7800000
address returned 0x7ffff7600000
address returned 0x7ffff7400000
address returned 0x7ffff7200000
address returned 0x7ffff7000000
address returned 0x7ffff6e00000
address returned 0x7ffff6c00000
address returned 0x7ffff6a00000
address returned 0x7ffff6800000
address returned 0x7ffff6600000
address returned 0x7ffff6400000
address returned 0x7ffff6200000
address returned 0x7ffff6000000
address returned 0x7ffff5e00000
address returned 0x7ffff5c00000
address returned 0x7ffff5a00000
root@k8s-1:~# cat /proc/meminfo | grep -i hug
AnonHugePages: 0 kB
HugePages_Total: 4096
HugePages_Free: 4080
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Results in the pod:
Program received signal SIGBUS, Bus error.
0x00005555555547cb in write_bytes ()
(gdb) where
#0 0x00005555555547cb in write_bytes ()
#1 0x00005555555548a6 in main ()