I have the same docker image running on two different CoreOS servers. (They're in a Kubernetes cluster, but I think that is irrelevant to the current problem.)
They are both running image hash `01e95e0a93af`. Both should have curl. One does not. This seems... impossible.
Good Server
core@ip-10-0-0-61 ~ $ docker pull gcr.io/surveyadmin-001/wolfgang:commit_e78e07eb6ce5727af6ffeb4ca3e903907e3ab83a
Digest: sha256:5d8bf456ad2d08ce3cd15f05b62fddc07fda3955267ee0d3ef73ee1a96b98e68
[cut]
Status: Image is up to date for gcr.io/surveyadmin-001/wolfgang:commit_e78e07eb6ce5727af6ffeb4ca3e903907e3ab83a
core@ip-10-0-0-61 ~ $ docker run -it --rm gcr.io/surveyadmin-001/wolfgang:commit_e78e07eb6ce5727af6ffeb4ca3e903907e3ab83a /bin/bash
root@d29cb8783830:/app/bundle# curl
curl: try 'curl --help' or 'curl --manual' for more information
root@d29cb8783830:/app/bundle#
Bad Server
core@ip-10-0-0-212 ~ $ docker pull gcr.io/surveyadmin-001/wolfgang:commit_e78e07eb6ce5727af6ffeb4ca3e903907e3ab83a
[cut]
Digest: sha256:5d8bf456ad2d08ce3cd15f05b62fddc07fda3955267ee0d3ef73ee1a96b98e68
Status: Image is up to date for gcr.io/surveyadmin-001/wolfgang:commit_e78e07eb6ce5727af6ffeb4ca3e903907e3ab83a
core@ip-10-0-0-212 ~ $ docker run -it --rm gcr.io/surveyadmin-001/wolfgang:commit_e78e07eb6ce5727af6ffeb4ca3e903907e3ab83a /bin/bash
root@fe6a536393f8:/app/bundle# curl
bash: curl: command not found
root@fe6a536393f8:/app/bundle#
Full logs available on this gist. I took the bad server out of our production cluster but still have it running if anyone wants me to do any other research.
I've run `docker tag gcr.io/surveyadmin-001/wolfgang:commit_e78e07eb6ce5727af6ffeb4ca3e903907e3ab83a weird-image` on both servers to make everything more readable.
Can you run `which curl` in the first container to check where it finds its curl, and then see whether that file exists in the second container? – VonC
Seems to not exist at all on the bad server.
Good Server
core@ip-10-0-0-61 ~ $ docker run -it --rm weird-image /bin/bash
root@529b8f20a610:/app/bundle# which curl
/usr/bin/curl
Bad Server
core@ip-10-0-0-212 ~ $ docker run -it --rm weird-image /bin/bash
root@ff98c850dbaa:/app/bundle# ls /usr/bin/curl
ls: cannot access /usr/bin/curl: No such file or directory
root@ff98c850dbaa:/app/bundle#
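Beyond checking a single binary, a broader comparison would be to hash the full file listing of the image on each host and see whether the two hashes match. This is only a rough sketch: `image_file_listing_hash` is a hypothetical helper name, and it assumes `tar`, `sort`, `sha256sum`, and `cut` are available on the host and that the image defines a default command so `docker create` succeeds without one.

```shell
#!/bin/sh
# Hypothetical helper: print a hash of the sorted file listing of an image.
# It compares file names only, not file contents.
image_file_listing_hash() {
    image="$1"
    # Create (but never start) a container so its filesystem can be exported.
    cid=$(docker create "$image") || return 1
    # Export the filesystem as a tar stream, list the paths, and hash the sorted list.
    docker export "$cid" | tar -tf - | sort | sha256sum | cut -d' ' -f1
    # Clean up the temporary container.
    docker rm "$cid" >/dev/null
}
```

Running `image_file_listing_hash weird-image` on both servers and comparing the output would show whether the set of files differs, which is what the missing `/usr/bin/curl` suggests.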
Any chance you have set up an alias on the bad box? Run `alias docker` to check. – morloch
Nope.
Good Server
core@ip-10-0-0-61 ~ $ alias docker
-bash: alias: docker: not found
Bad Server
core@ip-10-0-0-212 ~ $ alias docker
-bash: alias: docker: not found
More weirdness: it takes a lot longer to run the container on the bad server.
Good Server
core@ip-10-0-0-61 ~ $ time docker run weird-image echo "Done"
Done
real 0m0.422s
user 0m0.015s
sys 0m0.015s
Bad Server
core@ip-10-0-0-212 ~ $ time docker run weird-image echo "Done"
Done
real 0m4.602s
user 0m0.010s
sys 0m0.010s
I've seen lots of cases where Docker images on-disk get random bits of corruption (causing weird inconsistencies like the one you describe here), and deleting and re-pulling the image "fixes" the problem.
To test this, you'll want to make sure you not only run `docker rmi gcr.io/surveyadmin-001/wolfgang:commit_e78e07eb6ce5727af6ffeb4ca3e903907e3ab83a` (which will minimally output `Untagged: gcr.io/surveyadmin-001/wolfgang:commit_e78e07eb6ce5727af6ffeb4ca3e903907e3ab83a`), but also delete the individual layers (and any other tags they may have) so that they're forced to be re-pulled.
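One way this could be scripted is to resolve the tag to its image ID and remove by ID with `-f`, which untags every name pointing at that image (including a local tag such as `weird-image`) before pulling again. This is a minimal sketch, not a guaranteed fix: `force_repull` is a hypothetical helper name, and layers shared with other images would stay cached until those images are removed too.

```shell
#!/bin/sh
# Hypothetical helper: remove an image under all of its tags, then pull it again.
force_repull() {
    image="$1"
    # Resolve the tag to the image ID so other tags of the same image are found too.
    id=$(docker inspect --format '{{.Id}}' "$image") || return 1
    # Removing by ID with -f untags and deletes every tag that points at this ID.
    docker rmi -f "$id" || return 1
    # Pull the image again from the registry.
    docker pull "$image"
}
```

Usage would look like `force_repull gcr.io/surveyadmin-001/wolfgang:commit_e78e07eb6ce5727af6ffeb4ca3e903907e3ab83a`. Containers that still use the image may need to be stopped and removed first, since Docker can refuse to delete an image that is in use.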