Launching Helm Chart stable/minecraft on Kubernetes 1.14 on EKS Fails Liveness Probe

10/10/2019

I'm trying to deploy the vanilla Minecraft server from stable/minecraft using Helm on Kubernetes 1.14 running on AWS EKS, but I consistently get either CrashLoopBackOff or liveness probe failures. This seems strange to me, as I'm deploying the chart as specified in the documentation:

helm install --name mine-release --set minecraftServer.eula=TRUE --namespace=mine-release stable/minecraft

Already Attempted Debugging:

  1. Tried decreasing and increasing memory:
helm install --name mine-release --set resources.requests.memory="1024Mi" --set minecraftServer.memory="1024M" --set minecraftServer.eula=TRUE --namespace=mine-release stable/minecraft
  2. Tried viewing logs through kubectl logs mine-release-minecraft-56f9c8588-xn9pv --namespace mine-release, but this error always appears:
Error from server: Get https://10.0.143.216:10250/containerLogs/mine-release/mine-release-minecraft-56f9c8588-xn9pv/mine-release-minecraft: dial tcp 10.0.143.216:10250: i/o timeout

To give more context, the output of kubectl describe pods mine-release-minecraft-56f9c8588-xn9pv --namespace mine-release (pod description and events) is below:

Name:               mine-release-minecraft-56f9c8588-xn9pv
Namespace:          mine-release
Priority:           0
PriorityClassName:  <none>
Node:               ip-10-0-143-216.ap-southeast-2.compute.internal/10.0.143.216
Start Time:         Fri, 11 Oct 2019 08:48:34 +1100
Labels:             app=mine-release-minecraft
                    pod-template-hash=56f9c8588
Annotations:        kubernetes.io/psp: eks.privileged
Status:             Running
IP:                 10.0.187.192
Controlled By:      ReplicaSet/mine-release-minecraft-56f9c8588
Containers:
  mine-release-minecraft:
    Container ID:   docker://893f622e1129937fab38dc902e25e95ac86c2058da75337184f105848fef773f
    Image:          itzg/minecraft-server:latest
    Image ID:       docker-pullable://itzg/minecraft-server@sha256:00f592eb6660682f327770d639cf10692b9617fa8b9a764b9f991c401e325105
    Port:           25565/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Fri, 11 Oct 2019 08:50:56 +1100
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Fri, 11 Oct 2019 08:50:03 +1100
      Finished:     Fri, 11 Oct 2019 08:50:53 +1100
    Ready:          False
    Restart Count:  2
    Requests:
      cpu:      500m
      memory:   1Gi
    Liveness:   exec [mcstatus localhost:25565 status] delay=30s timeout=1s period=5s #success=1 #failure=3
    Readiness:  exec [mcstatus localhost:25565 status] delay=30s timeout=1s period=5s #success=1 #failure=3
    Environment:
      EULA:                          true
      TYPE:                          VANILLA
      VERSION:                       1.14.4
      DIFFICULTY:                    easy
      WHITELIST:                     
      OPS:                           
      ICON:                          
      MAX_PLAYERS:                   20
      MAX_WORLD_SIZE:                10000
      ALLOW_NETHER:                  true
      ANNOUNCE_PLAYER_ACHIEVEMENTS:  true
      ENABLE_COMMAND_BLOCK:          true
      FORCE_gameMode:                false
      GENERATE_STRUCTURES:           true
      HARDCORE:                      false
      MAX_BUILD_HEIGHT:              256
      MAX_TICK_TIME:                 60000
      SPAWN_ANIMALS:                 true
      SPAWN_MONSTERS:                true
      SPAWN_NPCS:                    true
      VIEW_DISTANCE:                 10
      SEED:                          
      MODE:                          survival
      MOTD:                          Welcome to Minecraft on Kubernetes!
      PVP:                           false
      LEVEL_TYPE:                    DEFAULT
      GENERATOR_SETTINGS:            
      LEVEL:                         world
      ONLINE_MODE:                   true
      MEMORY:                        512M
      JVM_OPTS:                      
      JVM_XX_OPTS:                   
    Mounts:
      /data from datadir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-j8zql (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  datadir:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  mine-release-minecraft-datadir
    ReadOnly:   false
  default-token-j8zql:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-j8zql
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                    From                                                      Message
  ----     ------                  ----                   ----                                                      -------
  Warning  FailedScheduling        2m25s                  default-scheduler                                         pod has unbound immediate PersistentVolumeClaims (repeated 3 times)
  Normal   Scheduled               2m24s                  default-scheduler                                         Successfully assigned mine-release/mine-release-minecraft-56f9c8588-xn9pv to ip-10-0-143-216.ap-southeast-2.compute.internal
  Warning  FailedAttachVolume      2m22s (x3 over 2m23s)  attachdetach-controller                                   AttachVolume.Attach failed for volume "pvc-b48ba754-eba7-11e9-b609-02ed13ff0a10" : "Error attaching EBS volume \"vol-08b29bb4eeca4df56\"" to instance "i-00ae1f5b96eed8e6a" since volume is in "creating" state
  Normal   SuccessfulAttachVolume  2m18s                  attachdetach-controller                                   AttachVolume.Attach succeeded for volume "pvc-b48ba754-eba7-11e9-b609-02ed13ff0a10"
  Warning  Unhealthy               60s                    kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  Readiness probe failed: Traceback (most recent call last):
  File "/usr/bin/mcstatus", line 11, in <module>
    sys.exit(cli())
  File "/usr/lib/python2.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python2.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python2.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python2.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/mcstatus/scripts/mcstatus.py", line 58, in status
    response = server.status()
  File "/usr/lib/python2.7/site-packages/mcstatus/server.py", line 49, in status
    connection = TCPSocketConnection((self.host, self.port))
  File "/usr/lib/python2.7/site-packages/mcstatus/protocol/connection.py", line 129, in __init__
    self.socket = socket.create_connection(addr, timeout=timeout)
  File "/usr/lib/python2.7/socket.py", line 575, in create_connection
    raise err
socket.error: [Errno 99] Address not available
  Normal   Pulling    58s (x2 over 2m14s)  kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  pulling image "itzg/minecraft-server:latest"
  Normal   Killing    58s                  kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  Killing container with id docker://mine-release-minecraft:Container failed liveness probe.. Container will be killed and recreated.
  Normal   Started    55s (x2 over 2m11s)  kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  Started container
  Normal   Pulled     55s (x2 over 2m11s)  kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  Successfully pulled image "itzg/minecraft-server:latest"
  Normal   Created    55s (x2 over 2m11s)  kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  Created container
  Warning  Unhealthy  25s (x2 over 100s)   kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  Readiness probe failed: Traceback (most recent call last):
  File "/usr/bin/mcstatus", line 11, in <module>
    sys.exit(cli())
  File "/usr/lib/python2.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python2.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python2.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python2.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/mcstatus/scripts/mcstatus.py", line 58, in status
    response = server.status()
  File "/usr/lib/python2.7/site-packages/mcstatus/server.py", line 61, in status
    raise exception
socket.error: [Errno 104] Connection reset by peer
  Warning  Unhealthy  20s (x8 over 95s)  kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  Readiness probe failed:
  Warning  Unhealthy  17s (x5 over 97s)  kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  Liveness probe failed:

A bit more about my Kubernetes setup:

Kubernetes version 1.14, with nodes running on m5.large instances.

-- James Marino
amazon-web-services
docker
kubernetes
kubernetes-helm
minecraft

1 Answer

10/11/2019

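I reproduced your problem, and the cause is the timing of the readiness and liveness probes.

The server does not get enough time to start. Both probes begin 30 seconds after the container starts, but in my run the server needed more than a minute to finish loading (see the logs at the end). The liveness probe therefore fails three times in a row, the kubelet kills the container and restarts it, and the cycle repeats.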

livenessProbe: Indicates whether the Container is running. If the liveness probe fails, the kubelet kills the Container, and the Container is subjected to its restart policy. If a Container does not provide a liveness probe, the default state is Success.

readinessProbe: Indicates whether the Container is ready to service requests. If the readiness probe fails, the endpoints controller removes the Pod’s IP address from the endpoints of all Services that match the Pod. The default state of readiness before the initial delay is Failure. If a Container does not provide a readiness probe, the default state is Success.
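
To make the timing concrete, here is a rough back-of-the-envelope sketch in shell. It assumes every probe fails from the first attempt onward and ignores probe timeouts and kubelet scheduling jitter, so treat the numbers as approximate. kill_deadline is a hypothetical helper name used only for illustration; it is not part of kubectl or Helm.

```shell
# Approximate time (seconds after container start) at which the kubelet
# gives up on a container whose liveness probe never succeeds.
# Arguments: initialDelaySeconds periodSeconds failureThreshold
kill_deadline() {
  # The first probe runs after the initial delay; each further failure
  # needed to reach the threshold costs one more period.
  echo $(( $1 + ($3 - 1) * $2 ))
}

kill_deadline 30 5 3     # timings from the question's pod description: prints 40
kill_deadline 90 15 3    # timings suggested in this answer: prints 120
```

With the question's timings the container can be killed roughly 40 seconds after it starts, well before the server reports "Done" in the logs at the end of this answer. With a 90-second initial delay, the first probe only runs once the server has had time to come up.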

You can either rerun your command with the probe settings added:

helm install --name mine-release --set resources.requests.memory="1024Mi" --set minecraftServer.memory="1024M" --set minecraftServer.eula=TRUE --set livenessProbe.initialDelaySeconds=90 --set livenessProbe.periodSeconds=15 --set readinessProbe.initialDelaySeconds=90 --set readinessProbe.periodSeconds=15 --namespace=mine-release stable/minecraft

OR

use helm fetch to download the chart to your machine:

helm fetch stable/minecraft --untar 

Instead of overriding values on the helm install command line, you can open minecraft/values.yaml in a text editor such as vi or nano and make the changes there:

vi ./minecraft/values.yaml    # or: nano ./minecraft/values.yaml

minecraft/values.yaml after the edit:

# ref: https://hub.docker.com/r/itzg/minecraft-server/
image: itzg/minecraft-server
imageTag: latest

## Configure resource requests and limits
## ref: http://kubernetes.io/docs/user-guide/compute-resources/
##
resources:
  requests:
    memory: 1024Mi
    cpu: 500m

nodeSelector: {}

tolerations: []

affinity: {}

securityContext:
  # Security context settings
  runAsUser: 1000
  fsGroup: 2000
# Most of these map to environment variables. See Minecraft for details:
# https://hub.docker.com/r/itzg/minecraft-server/
livenessProbe:
  command:
    - mcstatus
    - localhost:25565
    - status
  initialDelaySeconds: 90
  periodSeconds: 15
readinessProbe:
  command:
    - mcstatus
    - localhost:25565
    - status
  initialDelaySeconds: 90
  periodSeconds: 15
minecraftServer:
  # This must be overridden, since we can't accept this for the user.
  eula: "TRUE"
  # One of: LATEST, SNAPSHOT, or a specific version (ie: "1.7.9").
  version: "1.14.4"
  # This can be one of "VANILLA", "FORGE", "SPIGOT", "BUKKIT", "PAPER", "FTB", "SPONGEVANILLA"
  type: "VANILLA"
  # If type is set to FORGE, this sets the version; this is ignored if forgeInstallerUrl is set
  forgeVersion:
  # If type is set to SPONGEVANILLA, this sets the version
  spongeVersion:
  # If type is set to FORGE, this sets the URL to download the Forge installer
  forgeInstallerUrl:
  # If type is set to BUKKIT, this sets the URL to download the Bukkit package
  bukkitDownloadUrl:
  # If type is set to SPIGOT, this sets the URL to download the Spigot package
  spigotDownloadUrl:
  # If type is set to PAPER, this sets the URL to download the PaperSpigot package
  paperDownloadUrl:
  # If type is set to FTB, this sets the server mod to run
  ftbServerMod:
  # Set to true if running Feed The Beast and get an error like "unable to launch forgemodloader"
  ftbLegacyJavaFixer: false
  # One of: peaceful, easy, normal, and hard
  difficulty: easy
  # A comma-separated list of player names to whitelist.
  whitelist:
  # A comma-separated list of player names who should be admins.
  ops:
  # A server icon URL for server listings. Auto-scaled and transcoded.
  icon:
  # Max connected players.
  maxPlayers: 20
  # This sets the maximum possible size in blocks, expressed as a radius, that the world border can obtain.
  maxWorldSize: 10000
  # Allows players to travel to the Nether.
  allowNether: true
  # Allows server to announce when a player gets an achievement.
  announcePlayerAchievements: true
  # Enables command blocks.
  enableCommandBlock: true
  # If true, players will always join in the default gameMode even if they were previously set to something else.
  forcegameMode: false
  # Defines whether structures (such as villages) will be generated.
  generateStructures: true
  # If set to true, players will be set to spectator mode if they die.
  hardcore: false
  # The maximum height in which building is allowed.
  maxBuildHeight: 256
  # The maximum number of milliseconds a single tick may take before the server watchdog stops the server with the message. -1 disables this entirely.
  maxTickTime: 60000
  # Determines if animals will be able to spawn.
  spawnAnimals: true
  # Determines if monsters will be spawned.
  spawnMonsters: true
  # Determines if villagers will be spawned.
  spawnNPCs: true
  # Max view distance (in chunks).
  viewDistance: 10
  # Define this if you want a specific map generation seed.
  levelSeed:
  # One of: creative, survival, adventure, spectator
  gameMode: survival
  # Message of the Day
  motd: "Welcome to Minecraft on Kubernetes!"
  # If true, enable player-vs-player damage.
  pvp: false
  # One of: DEFAULT, FLAT, LARGEBIOMES, AMPLIFIED, CUSTOMIZED
  levelType: DEFAULT
  # When levelType == FLAT or CUSTOMIZED, this can be used to further customize map generation.
  # ref: https://hub.docker.com/r/itzg/minecraft-server/
  generatorSettings:
  worldSaveName: world
  # If set, this URL will be downloaded at startup and used as a starting point
  downloadWorldUrl:
  # force re-download of server file
  forceReDownload: false
  # If set, the modpack at this URL will be downloaded at startup
  downloadModpackUrl:
  # If true, old versions of downloaded mods will be replaced with new ones from downloadModpackUrl
  removeOldMods: false
  # Check accounts against Minecraft account service.
  onlineMode: true
  # If you adjust this, you may need to adjust resources.requests above to match.
  memory: 1024M
  # General JVM options to be passed to the Minecraft server invocation
  jvmOpts: ""
  # Options like -X that need to proceed general JVM options
  jvmXXOpts: ""
  serviceType: LoadBalancer
  rcon:
    # If you enable this, make SURE to change your password below.
    enabled: false
    port: 25575
    password: "CHANGEME!"
    serviceType: LoadBalancer

  query:
    # If you enable this, your server will be "published" to Gamespy
    enabled: false
    port: 25565

## Additional minecraft container environment variables
##
extraEnv: {}

persistence:
  ## minecraft data Persistent Volume Storage Class
  ## If defined, storageClassName: <storageClass>
  ## If set to "-", storageClassName: "", which disables dynamic provisioning
  ## If undefined (the default) or set to null, no storageClassName spec is
  ##   set, choosing the default provisioner.  (gp2 on AWS, standard on
  ##   GKE, AWS & OpenStack)
  ##
  # storageClass: "-"
  dataDir:
    # Set this to false if you don't care to persist state between restarts.
    enabled: true
    Size: 1Gi

podAnnotations: {}

Then run helm install against the local chart:

helm install --name mine-release --namespace=mine-release ./minecraft -f ./minecraft/values.yaml
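
As a lighter variant of this step, and only a sketch, the probe overrides can live in a small file of their own instead of an edited copy of the whole values.yaml. This assumes that only the probe timings need to change; the file name probe-values.yaml is a hypothetical choice, while the keys are the same ones used in the values.yaml above.

```shell
# Write only the probe overrides to a small values file.
cat > probe-values.yaml <<'EOF'
livenessProbe:
  initialDelaySeconds: 90
  periodSeconds: 15
readinessProbe:
  initialDelaySeconds: 90
  periodSeconds: 15
EOF

# Then install the chart with the overrides layered on top of its defaults
# (Helm 2 syntax, as used elsewhere in this answer):
# helm install --name mine-release --namespace=mine-release \
#   --set minecraftServer.eula=TRUE -f probe-values.yaml stable/minecraft
```

Values passed with -f are merged over the chart defaults, so everything not listed in the file keeps its default. The output that follows is from the full values.yaml install shown above, not from this sketch.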

Results from helm install:

NAME:   mine-release
LAST DEPLOYED: Fri Oct 11 14:52:17 2019
NAMESPACE: mine-release
STATUS: DEPLOYED

RESOURCES:
==> v1/PersistentVolumeClaim
NAME                            STATUS   VOLUME    CAPACITY  ACCESS MODES  STORAGECLASS  AGE
mine-release-minecraft-datadir  Pending  standard  0s

==> v1/Pod(related)
NAME                                    READY  STATUS   RESTARTS  AGE
mine-release-minecraft-f4558bfd5-mwm55  0/1    Pending  0         0s

==> v1/Secret
NAME                    TYPE    DATA  AGE
mine-release-minecraft  Opaque  1     0s

==> v1/Service
NAME                    TYPE          CLUSTER-IP   EXTERNAL-IP  PORT(S)          AGE
mine-release-minecraft  LoadBalancer  10.0.13.180  <pending>    25565:32020/TCP  0s

==> v1beta1/Deployment
NAME                    READY  UP-TO-DATE  AVAILABLE  AGE
mine-release-minecraft  0/1    1           0          0s


NOTES:
Get the IP address of your Minecraft server by running these commands in the
same shell:

!! NOTE: It may take a few minutes for the LoadBalancer IP to be available. !!

You can watch for EXTERNAL-IP to populate by running:
  kubectl get svc --namespace mine-release -w mine-release-minecraft

Results from logs:

[12:53:45] [Server-Worker-1/INFO]: Preparing spawn area: 98%
[12:53:45] [Server thread/INFO]: Time elapsed: 26661 ms
[12:53:45] [Server thread/INFO]: Done (66.833s)! For help, type "help"
[12:53:45] [Server thread/INFO]: Starting remote control listener
[12:53:45] [RCON Listener #1/INFO]: RCON running on 0.0.0.0:25575
-- jt97
Source: StackOverflow