Using healthcheck on swarm disturbs nameservices

After enabling a healthcheck in our setup, we were no longer able to deploy it. We found that once a healthcheck is enabled, containers can't resolve their service names. This includes the name of the container itself.

We deploy with docker stack deploy --compose-file docker-compose.yml minitest and tested with Docker 17.03, 17.06 and 17.10. The problem occurs on all three versions.

I've created a minimal example. The two containers ping each other, and as a primitive healthcheck each one pings localhost.

Using docker-compose with this compose-file will lead to a working setup; using stack deploy will restart the containers time and time again.

version: '3'

services:
  mini-1:
    image: debian
    deploy:
      replicas: 1
    command: bash -c "sleep 15; ping mini-2 > /tmp/mini-1.log"
    healthcheck:
      test: ["CMD", "ping", "-c", "1", "localhost"]
      interval: 1m30s
      timeout: 10s
      retries: 3
    volumes:
      - /minimal_example/mini-1.log:/tmp/mini-1.log
    networks:
      mini-network:
        aliases:
          - mini

  mini-2:
    image: debian
    deploy:
      replicas: 1
    command: bash -c "sleep 15; ping mini-1 > /tmp/mini-2.log"
    healthcheck:
      test: ["CMD", "ping", "-c", "1", "localhost"]
      interval: 1m30s
      timeout: 10s
      retries: 3
    volumes:
      - /minimal_example/mini-2.log:/tmp/mini-2.log
    networks:
      mini-network:
        aliases:
          - mini

networks:
  mini-network:
    driver: overlay
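
To confirm the symptom on a running task, something along these lines might help. It is only a rough sketch: check_resolution is a made-up helper name, the container ID has to be looked up by hand (for example with docker ps), and it assumes getent is available inside the image.

```shell
#!/usr/bin/env bash

# check_resolution CONTAINER NAME
# Prints "resolves" if NAME can be looked up from inside CONTAINER,
# "unresolved" otherwise. (Hypothetical helper, not part of Docker.)
check_resolution() {
    local container=$1
    local name=$2

    # getent hosts exits non-zero when the name cannot be resolved
    if docker exec "${container}" getent hosts "${name}" >/dev/null 2>&1; then
        echo "resolves"
    else
        echo "unresolved"
    fi
}

# Usage (container ID is a placeholder):
#   check_resolution <container-id> mini-2
```

Run against the mini-1 task container with mini-2 as the name, this would print unresolved while the problem is present.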

moby/moby

Answer by drnybble

Here is what I am doing (read it & weep):

  • create the services with 0 replicas
  • extract the VIP from the service definition (docker service inspect...)
function extract_vip()
{
	local SERVICE_NAME=$1
	local NETWORK_NAME=$2

	# Look up the network ID; the service's virtual IPs are listed per network ID
	local NETWORK_ID
	NETWORK_ID=$(docker network inspect --format='{{.Id}}' "${NETWORK_NAME}") || return 1

	# Pick the virtual IP that belongs to the requested network
	local ADDR
	ADDR=$(docker service inspect --format="{{range .Endpoint.VirtualIPs}}{{if eq .NetworkID \"${NETWORK_ID}\"}}{{.Addr}}{{end}}{{end}}" "${SERVICE_NAME}") || return 1
	# ADDR=10.0.21.11/24; Trim off the /24 to get just the address
	local VIP=${ADDR%/*}
	echo "${VIP}"
}
  • define environment variables with these values
  • use these in the YAML file's extra_hosts
  • now change the replicas to what you want & update the services (replicas can be defined using environment variables)

So your service is something like this:
deploy:
  replicas: ${REPLICAS_MYSERVICE}
extra_hosts:
  - "hostname:${hostname_vip}"

Unfortunately, extra_hosts doesn't work on Windows so I may try with a docker config to replace the hosts file for Windows.
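
For illustration, the address handling from the steps above can be tried without a swarm at all. This is a minimal sketch under my own assumptions: strip_cidr and hosts_entry are hypothetical helper names, and the address is the sample value from the comment in extract_vip.

```shell
#!/usr/bin/env bash

# strip_cidr ADDR
# Drops the "/prefix" part of an address such as 10.0.21.11/24.
# An address without a prefix is printed unchanged.
strip_cidr() {
    local addr=$1
    echo "${addr%/*}"
}

# hosts_entry HOSTNAME ADDR
# Prints one extra_hosts entry in the form HOSTNAME:IP.
hosts_entry() {
    local hostname=$1
    local addr=$2
    echo "${hostname}:$(strip_cidr "${addr}")"
}

# Usage:
#   hosts_entry mini-1 10.0.21.11/24    # prints mini-1:10.0.21.11
```

The printed value is what would end up as an entry under extra_hosts once the environment variable has been substituted.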
