I've taken on a deployment to Google Container Engine and have run into some odd behaviour that I don't really know how to debug. I am deploying a Ruby on Rails application using Docker and Kubernetes.
I am basically following this tutorial, skipping the section on spinning up replicas: https://cloud.google.com/container-engine/docs/tutorials/hello-node#step_2_create_a_docker_container_image. It works at first: after building and deploying, I can go to the external IP and my app functions as intended. However, after 10 minutes or so it stops responding, and requests just spin forever.
I have found the log files relatively unhelpful; the only entries that offer any hint are the following:
{
"log": "2015/11/10 05:35:18 Worker running nslookup kubernetes.default.svc.cluster.local localhost >/dev/null\n",
"stream": "stderr"
}
{
"log": "2015/11/10 05:35:19 Client ip xx.xxx.x.x:xxxxx requesting /healthz probe servicing cmd nslookup kubernetes.default.svc.cluster.local localhost >/dev/null\n",
"stream": "stderr"
}
I have been through most of the debugging suggestions on this page: https://cloud.google.com/container-engine/docs/debugging/
kubectl logs ${pod}:
[2015-11-10 05:07:44] INFO WEBrick 1.3.1
[2015-11-10 05:07:44] INFO ruby 2.1.6 (2015-04-13) [x86_64-linux]
[2015-11-10 05:07:44] INFO WEBrick::HTTPServer#start: pid=1 port=80
kubectl logs $pod $instance unnervingly returns:
Container "x" not found in Pod "x"
The Dockerfile is pretty much straight from Google:
FROM google/ruby
# [START postgres-dep]
RUN apt-get update && \
apt-get install -qy --no-install-recommends libpq-dev && \
apt-get clean
# [END postgres-dep]
ENV RACK_ENV production
WORKDIR /app
ADD Gemfile /app/Gemfile
ADD Gemfile.lock /app/Gemfile.lock
RUN /usr/bin/bundle install --deployment --without development:test
ADD . /app
RUN bundle exec rake assets:precompile
RUN bundle exec rake db:migrate
EXPOSE 8080
ENV RACK_ENV production
CMD ["/usr/bin/bundle", "exec", "rackup", "-p", "80", "/app/config.ru", "-s", "webrick", "-E", "production"]
Pinging the external IP returns the following:
PING xxxxx (xxxxx): 56 data bytes
64 bytes from xxxxx: icmp_seq=0 ttl=49 time=48.462 ms
64 bytes from xxxxx: icmp_seq=1 ttl=49 time=48.177 ms
64 bytes from xxxxx: icmp_seq=2 ttl=49 time=48.181 ms
64 bytes from xxxxx: icmp_seq=3 ttl=49 time=48.240 ms
64 bytes from xxxxx: icmp_seq=4 ttl=49 time=48.337 ms
64 bytes from xxxxx: icmp_seq=5 ttl=49 time=48.149 ms
64 bytes from xxxxx: icmp_seq=6 ttl=49 time=48.053 ms
64 bytes from xxxxx: icmp_seq=7 ttl=49 time=47.958 ms
64 bytes from xxxxx: icmp_seq=8 ttl=49 time=48.137 ms
The latency looks pretty bad, and after a very long wait the request does eventually end at the red Rails screen of death.
Questions:
1. Where are my darn application logs? I don't see anything Rails-like in the Developers Console and couldn't find them via SSH either. I sort of assumed this was a balancer/pod config issue, but it would be nice to know regardless.
2. Why does it work initially and after a while stop functioning?
3. Where do I begin to troubleshoot behaviour like this when everything says it has the green light and there are no critical logs?
4. Are rolling updates (https://cloud.google.com/container-engine/docs/rolling-updates) the process for re-deploying code changes without spinning everything up and down / re-creating it?
Thanks in advance.
Where are my darn application logs?
kubectl logs will grab any logs that are written to stdout/stderr. If your application logs to a file, then you'll need to look at that file directly to see your logs. Try kubectl exec to get a shell inside your pod, and then use your favorite tool (cat, grep, less, etc.) to view the log file. (Check out this blog post if you haven't already to see some neat uses of kubectl, including an example of kubectl exec.)
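A minimal sketch of that workflow, assuming a single-container pod and that Rails writes to its conventional log path under the WORKDIR /app from the Dockerfile above (both assumptions; adjust the pod name and path to your setup):
# Stream anything the container writes to stdout/stderr
kubectl logs -f ${pod}
# Or open a shell inside the pod...
kubectl exec -it ${pod} -- /bin/bash
# ...and inspect the Rails log file directly (path assumes WORKDIR /app)
tail -f log/production.log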
Why does it work initially and after a while stop functioning?
This probably depends on your application. Once you get the logs, you should be able to tell.
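For example, if the container is being restarted (say, by a failing health check), the current logs can look clean; the restart count and the previous container's logs are worth checking (a sketch; the pod name is a placeholder):
# Show pod events and the container restart count
kubectl describe pod ${pod}
# If it has restarted, fetch logs from the previous container instance
kubectl logs ${pod} --previous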
Are rolling updates the process for re-deploying code changes without spinning everything up and down / re-creating it?
Yes.
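For illustration, assuming your pods are managed by a replication controller and you've pushed a rebuilt image to the registry (the controller and image names below are placeholders), something like this replaces the pods with the new image one at a time:
# Roll the replication controller over to the new image (names/tags are placeholders)
kubectl rolling-update my-app-controller --image=gcr.io/my-project/my-app:v2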