Invoking rbd (docker) from Kubernetes on CoreOS returns fork/exec invalid argument

6/23/2016

I am using Kubernetes v1.2.4 (on top of CoreOS stable 1010.5.0) and would like to mount rbd/ceph volumes. Basically I’ve followed https://github.com/kubernetes/kubernetes/tree/master/examples/rbd except that I prefer YAML over JSON.

I noticed that both of these have to be present:

secretRef:
  name: ceph-secret

and

keyring: /etc/ceph/keyring

or else kubectl complained. Is this expected behavior?
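
For context, the volume part of my pod spec looks roughly like this (the monitor address, pool, and image name below are placeholders, not my real values):

volumes:
  - name: rbdpd
    rbd:
      monitors:
        - 10.0.0.1:6789
      pool: rbd
      image: foo
      user: admin
      secretRef:
        name: ceph-secret
      keyring: /etc/ceph/keyring
      fsType: ext4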

It seems kubelet tries to invoke the rbd binary directly on the host (which is a problem on a "bare" system like CoreOS). Since copying over the binary and its dependencies would be a bit cumbersome, I used this trick:

$ cat /opt/bin/rbd
#!/bin/sh
docker run -v /etc/ceph:/etc/ceph ceph/rbd "$@"

I took care of the /etc/ceph configuration, made the shell script executable and so on; if I run "rbd list" on CoreOS everything works fine. /opt/bin (besides being on PATH on CoreOS by default) is also in the PATH of the kubelet process (which I can confirm through /proc/<kubelet pid>/environ).
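
For reference, that PATH check can be done like this (assuming the process is simply named kubelet, so pgrep can find it):

tr '\0' '\n' < /proc/"$(pgrep -x kubelet)"/environ | grep '^PATH='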

However, if I start the (test) pod I get this error (in kubectl describe pod):

Events:
  FirstSeen LastSeen    Count   From                SubobjectPath   Type        Reason      Message
  --------- --------    -----   ----                -------------   --------    ------      -------
  5s        5s      1   {default-scheduler }                Normal      Scheduled   Successfully assigned busybox4 to some-host
  4s        4s      1   {kubelet some-host}         Warning     FailedMount Unable to mount volumes for pod "busybox4_default(5386c7f3-3959-11e6-a768-aa00009a7832)": rbd: map failed fork/exec /opt/bin/rbd: invalid argument
  4s        4s      1   {kubelet some-host}         Warning     FailedSync  Error syncing pod, skipping: rbd: map failed fork/exec /opt/bin/rbd: invalid argument

So either fork() or execve() returns EINVAL? Reading through a few man pages, I found that only exec can actually fail with EINVAL, due to:

An ELF executable had more than one PT_INTERP segment (i.e., tried to name more than one interpreter)

but that seems quite obscure.

Any idea what the matter is, or how I could fix or work around the problem?

Edit: I tried strace -fp <pid> and there are a lot of stat() calls, which I presume come from Go's os/exec LookPath. However, I don't see any execve() of "rbd", nor any system call failing with EINVAL. To make sure it is not related to fleet (systemd), I also tried running kubelet directly on the console as root. The results are the same.
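
For completeness, this is roughly how I attached strace, filtered to the interesting system calls (the process name and output path are just examples):

strace -f -p "$(pgrep -x kubelet)" -e trace=execve,clone -s 256 -o /tmp/kubelet.trace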

-- fiction
ceph
coreos
kubernetes
linux

3 Answers

8/27/2016

Just a follow-up on the issue.

In the meantime I've upgraded to CoreOS stable (1068.9.0) with Kubernetes v1.3.5.

My /opt/bin/rbd looks like this:

#!/bin/sh
exec docker run -v /dev:/dev -v /sys:/sys --net=host --privileged=true -v /etc/ceph:/etc/ceph ceph/rbd "$@"

(partially based on your suggestions), and now everything works like a charm. So I guess it was some bug that got fixed (also, kubectl no longer requires both secretRef and keyring). Perhaps somebody can comment on what the actual issue was; otherwise, consider the case closed.

-- fiction
Source: StackOverflow

7/2/2016

Partial answer: the fact that rbd is a shell script doesn't matter. By looking at strace output from kubelet when it invokes other external tools, I figured out that clone() is used. I've written some short test code to verify what happens when it fails:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int test(void *p) {
  printf("Hello there!\n");
  return 0;
}

int main() {
  /* Deliberately bogus arguments: CLONE_THREAD without CLONE_VM/CLONE_SIGHAND
     and a NULL child stack, so clone() fails with EINVAL. */
  if (clone(test, NULL, CLONE_THREAD, NULL) == -1) {
    perror("clone");
  }
  return 0;
}

Now if I do

strace ./test 2>&1  | grep clone

the output is

write(2, "clone: Invalid argument\n", 24clone: Invalid argument

This explains one part of the mystery: when clone() fails with EINVAL like this, strace does not show the call at all, presumably because glibc's clone() wrapper rejects the bad arguments (here, the NULL child stack) and returns EINVAL without ever issuing the actual system call.

Then I've been looking at the Kubernetes source, and

https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/rbd/rbd_util.go#L218

seems to work like a charm:

[pid 25039] execve("/usr/sbin/modprobe", ["modprobe", "rbd"], [/* 4 vars */] <unfinished ...>

I wonder why the invocation at https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/rbd/rbd_util.go#L231 or https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/rbd/rbd_util.go#L234 wouldn't (or, specifically, what would make the clone() there fail).

-- fiction
Source: StackOverflow

6/30/2016

I'm not that familiar with how Kubernetes kicks off that rbd script, but I think the issue is that it's a script. A script cannot be run directly by a call to exec, which is what Kubernetes is doing.

The #!/bin/sh line at the top of the file isn't automatically going to start a shell for you; that's actually interpreted by another shell. So instead of calling your script /opt/bin/rbd directly in your Kubernetes config, you want to change it to:

/bin/sh -c "/opt/bin/rbd" ...

And then it should work.

In fact, I'd change the script slightly:

#!/bin/sh
exec docker run -v /etc/ceph:/etc/ceph ceph/rbd "$@"

But perhaps what you really want to do is look at this guide:

Bring persistent storage for your containers with krbd on kubernetes

Things have progressed.

-- Matt
Source: StackOverflow