I am using Kubernetes v1.2.4 (on top of CoreOS stable 1010.5.0) and would like to mount rbd/ceph volumes. Basically I’ve followed https://github.com/kubernetes/kubernetes/tree/master/examples/rbd except that I prefer YAML over JSON.
I noticed that the volume definition has to contain both:
secretRef:
name: ceph-secret
and
keyring: /etc/ceph/keyring
or else kubectl complained. Is this the expected behavior?
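For reference, the volume section in YAML looks roughly like this (a minimal sketch; the monitor address, pool, and image name are placeholders):
volumes:
  - name: rbdpd
    rbd:
      monitors:
        - 192.168.0.1:6789
      pool: rbd
      image: foo
      user: admin
      secretRef:
        name: ceph-secret
      keyring: /etc/ceph/keyring
      fsType: ext4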
It seems kubelet tries to invoke the rbd binary directly on the host (which is a problem for a "bare" system like CoreOS). Since copying over the binary and its dependencies would be a bit cumbersome, I did this trick:
$ cat /opt/bin/rbd
#!/bin/sh
docker run -v /etc/ceph:/etc/ceph ceph/rbd "$@"
I took care of the /etc/ceph configuration, made the shell script executable, and so on. If I run "rbd list" on CoreOS, everything works fine. /opt/bin (besides being on the PATH on CoreOS by default) is also on the PATH of the kubelet process (which I can confirm through /proc/<kubelet pid>/environ).
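A quick way to check that (the entries in environ are NUL-separated; pgrep -o picks the oldest matching process):
$ tr '\0' '\n' < /proc/$(pgrep -o kubelet)/environ | grep '^PATH='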
However, if I start the (test) pod I get this error (in kubectl describe pod):
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
5s 5s 1 {default-scheduler } Normal Scheduled Successfully assigned busybox4 to some-host
4s 4s 1 {kubelet some-host} Warning FailedMount Unable to mount volumes for pod "busybox4_default(5386c7f3-3959-11e6-a768-aa00009a7832)": rbd: map failed fork/exec /opt/bin/rbd: invalid argument
4s 4s 1 {kubelet some-host} Warning FailedSync Error syncing pod, skipping: rbd: map failed fork/exec /opt/bin/rbd: invalid argument
So either fork() or execve() returns EINVAL? Reading through a few man pages, I found that only execve() can actually fail with EINVAL, due to
An ELF executable had more than one PT_INTERP segment (i.e., tried to name more than one interpreter)
but that seems quite obscure.
Any idea what the matter is or how I could fix / workaround the problem?
Edit: I tried strace -fp <pid> and there are a lot of stat() calls, which I presume come from golang's os/exec LookPath. However, I don't see any execve() of "rbd", nor is there any system call failing with EINVAL. To make sure it is not related to fleet (systemd), I also tried running kubelet directly on the console as root. The results are the same.
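(A narrower trace such as strace -f -e trace=execve,clone -p <pid> makes the relevant calls easier to spot.)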
Just a follow up on the issue.
In the meantime I've upgraded to CoreOS stable (1068.9.0) with Kubernetes v1.3.5.
My /opt/bin/rbd looks like this:
#!/bin/sh
exec docker run -v /dev:/dev -v /sys:/sys --net=host --privileged=true -v /etc/ceph:/etc/ceph ceph/rbd "$@"
(partially based on your suggestions). And now everything works like a charm. So I guess it was some bug that got fixed (also, kubectl no longer requires both secretRef and keyring). Perhaps somebody can comment on what the actual issue was; otherwise, consider the case closed.
Partial answer: that rbd is a shell script doesn't matter. By looking at the strace output from kubelet when it invokes other external tools, I figured out that clone() is used. I've written some short test code to verify what happens when it fails.
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/* Child function; it never actually runs, because clone() fails below. */
int test(void *p) {
    printf("Hello there!\n");
    return 0;
}

int main(void) {
    /* Deliberately invalid: CLONE_THREAD requires CLONE_SIGHAND (which in
       turn requires CLONE_VM), and the child stack is NULL. glibc's clone()
       wrapper rejects a NULL stack with EINVAL before ever issuing the
       system call. */
    if (clone(test, NULL, CLONE_THREAD, NULL) == -1) {
        perror("clone");
    }
    return 0;
}
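Compiled the usual way:
$ cc -o test test.c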
Now if I run
strace ./test 2>&1 | grep clone
the output is
write(2, "clone: Invalid argument\n", 24clone: Invalid argument
) = 24
Which explains one part of the mystery: when clone() fails with EINVAL like this, strace doesn't show the call at all, because glibc returns the error without ever making a system call.
Then I've been looking at the Kubernetes source, and
https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/rbd/rbd_util.go#L218
seems to work like a charm:
[pid 25039] execve("/usr/sbin/modprobe", ["modprobe", "rbd"], [/* 4 vars */] <unfinished ...>
I wonder why the invocation at https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/rbd/rbd_util.go#L231 or https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/rbd/rbd_util.go#L234 wouldn't (or, specifically, what would make the clone() there fail).
I'm not that familiar with how Kubernetes kicks off that rbd script, but I think the issue is that it's a script: a script cannot always be run directly by a call to exec, which is what Kubernetes is doing.
The line #!/bin/sh at the top of the file isn't automatically going to start a shell for you; when it does take effect, it's the kernel (not another shell) that interprets it during execve(). So instead of calling your script /opt/bin/rbd directly in your kubernetes config, you want to change it to:
/bin/sh -c "/opt/bin/rbd" ...
And then it should work.
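To forward arguments through sh -c (standard sh semantics: the word after the command string becomes $0, the remaining words become $1, $2, ...), it would look something like this, with map foo --pool rbd standing in for whatever arguments get passed:
/bin/sh -c '/opt/bin/rbd "$@"' rbd map foo --pool rbd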
In fact, I'd change the script slightly, so that the shell replaces itself with docker via exec:
#!/bin/sh
exec docker run -v /etc/ceph:/etc/ceph ceph/rbd "$@"
But perhaps what you really want to do is look at this guide:
Bring persistent storage for your containers with krbd on kubernetes
Things have progressed.