Swarm restarts all containers

Description: We are running a Docker Swarm cluster with 3 managers and 5 workers. Twice now, an error in the cluster has caused every service to restart. After some time, all services recover and the cluster returns to normal.

Steps to reproduce the issue: I am unable to reproduce the error on demand. It has happened only twice on a cluster that has been running for 105 days with over 200 containers.

Describe the results you received: While looking into the issue, I came across this in the dockerd logs:

Nov 14 09:38:13 int020522 dockerd[18624]: time="2018-11-14T09:38:13.079639333+01:00" level=error msg="error receiving response" error="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Nov 14 09:38:15 int020522 dockerd[18624]: time="2018-11-14T09:38:15.079967857+01:00" level=error msg="error receiving response" error="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Nov 14 09:38:15 int020522 dockerd[18624]: time="2018-11-14T09:38:15.116089882+01:00" level=error msg="error receiving response" error="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Nov 14 09:38:17 int020522 dockerd[18624]: time="2018-11-14T09:38:17.080341930+01:00" level=error msg="error receiving response" error="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Nov 14 09:38:17 int020522 dockerd[18624]: time="2018-11-14T09:38:17.116549890+01:00" level=error msg="error receiving response" error="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Nov 14 09:38:17 int020522 dockerd[18624]: time="2018-11-14T09:38:17.973307312+01:00" level=info msg="memberlist: Marking 29fdf1feb7d0 as failed, suspect timeout reached (2 peer confirmations)"
Nov 14 09:38:17 int020522 dockerd[18624]: time="2018-11-14T09:38:17.973375887+01:00" level=info msg="Node 29fdf1feb7d0/10.2.5.32, left gossip cluster"
Nov 14 09:38:17 int020522 dockerd[18624]: time="2018-11-14T09:38:17.973421640+01:00" level=info msg="Node 29fdf1feb7d0 change state NodeActive --> NodeFailed"
Nov 14 09:38:17 int020522 dockerd[18624]: time="2018-11-14T09:38:17.976071755+01:00" level=info msg="Node 29fdf1feb7d0/10.2.5.32, added to failed nodes list"
Nov 14 09:38:18 int020522 dockerd[18624]: time="2018-11-14T09:38:18.126988968+01:00" level=info msg="memberlist: Suspect 29fdf1feb7d0 has failed, no acks received"
Nov 14 09:38:18 int020522 dockerd[18624]: time="2018-11-14T09:38:18.161264366+01:00" level=error msg="Attempting to transfer leadership" raft_id=71921ff23bc70421
Nov 14 09:38:18 int020522 dockerd[18624]: time="2018-11-14T09:38:18.290127782+01:00" level=info msg="Node 29fdf1feb7d0/10.2.5.32, joined gossip cluster"
Nov 14 09:38:18 int020522 dockerd[18624]: time="2018-11-14T09:38:18.290199543+01:00" level=info msg="Node 29fdf1feb7d0 change state NodeFailed --> NodeActive"
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 470 [running]:
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/pkg/signal.DumpStacks(0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/pkg/signal/trap.go:83 +0xaa
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/github.com/docker/swarmkit/manager/state/raft.(*Node).Run(0xc420e84000, 0x55ef18e49460, 0xc4256e8700, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/github.com/docker/swarmkit/manager/state/raft/raft.go:597 +0x17e8
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/github.com/docker/swarmkit/manager.(*Manager).Run.func6(0xc42039e340, 0x55ef18e49460, 0xc420520580)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/github.com/docker/swarmkit/manager/manager.go:584 +0x4c
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/vendor/github.com/docker/swarmkit/manager.(*Manager).Run
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/github.com/docker/swarmkit/manager/manager.go:583 +0x1544
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 1 [chan receive, 33064 minutes]:
Nov 14 09:38:18 int020522 dockerd[18624]: main.(*DaemonCli).start(0xc42047f710, 0xc420179da0, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/cmd/dockerd/daemon.go:228 +0xf26
Nov 14 09:38:18 int020522 dockerd[18624]: main.runDaemon(0xc420179da0, 0xc420196700, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/cmd/dockerd/docker_unix.go:7 +0x47
Nov 14 09:38:18 int020522 dockerd[18624]: main.newDaemonCommand.func1(0xc4200eb400, 0xc420087700, 0x0, 0x4, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/cmd/dockerd/docker.go:28 +0x5d
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/github.com/spf13/cobra.(*Command).execute(0xc4200eb400, 0xc4200c4100, 0x4, 0x4, 0xc4200eb400, 0xc4200c4100)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:762 +0x46a
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc4200eb400, 0x55ef18e1ff70, 0x55ef189fda20, 0x55ef18e1ff80)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:852 +0x30c
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/github.com/spf13/cobra.(*Command).Execute(0xc4200eb400, 0xc4200c2010, 0x55ef16d6f19f)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:800 +0x2d
Nov 14 09:38:18 int020522 dockerd[18624]: main.main()
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/cmd/dockerd/docker.go:63 +0xa2
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 20 [syscall, 8583 minutes]:
Nov 14 09:38:18 int020522 dockerd[18624]: os/signal.signal_recv(0x55ef18e33c20)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/runtime/sigqueue.go:139 +0xa8
Nov 14 09:38:18 int020522 dockerd[18624]: os/signal.loop()
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/os/signal/signal_unix.go:22 +0x24
Nov 14 09:38:18 int020522 dockerd[18624]: created by os/signal.init.0
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/os/signal/signal_unix.go:28 +0x43
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 25 [select]:
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/go.opencensus.io/stats/view.(*worker).start(0xc420086f80)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/go.opencensus.io/stats/view/worker.go:144 +0x11f
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/vendor/go.opencensus.io/stats/view.init.0
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/go.opencensus.io/stats/view/worker.go:29 +0x5a
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 28 [syscall, 33065 minutes]:
Nov 14 09:38:18 int020522 dockerd[18624]: syscall.Syscall6(0xf7, 0x1, 0x48c8, 0xc4204a75c8, 0x1000004, 0x0, 0x0, 0x0, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/syscall/asm_linux_amd64.s:44 +0x5
Nov 14 09:38:18 int020522 dockerd[18624]: os.(*Process).blockUntilWaitable(0xc420340cc0, 0x55ef16ca2b3b, 0xc42008a480, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/os/wait_waitid.go:31 +0x9a
Nov 14 09:38:18 int020522 dockerd[18624]: os.(*Process).wait(0xc420340cc0, 0xc42008a401, 0x55ef17250677, 0xc4204a7750)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/os/exec_unix.go:22 +0x3e
Nov 14 09:38:18 int020522 dockerd[18624]: os.(*Process).Wait(0xc420340cc0, 0xc4204a7728, 0xc420157960, 0x55ef18e15aa0)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/os/exec.go:123 +0x2d
Nov 14 09:38:18 int020522 dockerd[18624]: os/exec.(*Cmd).Wait(0xc4205086e0, 0xc4204a77b8, 0x55ef17261d6b)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/os/exec/exec.go:461 +0x5e
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/libcontainerd.(*remote).startContainerd.func1(0xc4205086e0, 0xc4203a28c0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/libcontainerd/remote_daemon.go:243 +0x31
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/libcontainerd.(*remote).startContainerd
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/libcontainerd/remote_daemon.go:241 +0x3db
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 29 [select, 33065 minutes]:
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc.(*ccResolverWrapper).watcher(0xc420368330)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/resolver_conn_wrapper.go:109 +0x184
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/vendor/google.golang.org/grpc.(*ccResolverWrapper).start
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/resolver_conn_wrapper.go:95 +0x41
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 30 [select, 33065 minutes]:
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc.(*ccBalancerWrapper).watcher(0xc420376240)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/balancer_conn_wrappers.go:122 +0x14c
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/vendor/google.golang.org/grpc.newCCBalancerWrapper
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/balancer_conn_wrappers.go:113 +0x151
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 31 [select, 33065 minutes]:
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc.(*addrConn).transportMonitor(0xc42048cb00)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/clientconn.go:1373 +0x23d
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc.(*addrConn).connect.func1(0xc42048cb00)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/clientconn.go:949 +0x1b7
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/vendor/google.golang.org/grpc.(*addrConn).connect
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/clientconn.go:940 +0xe3
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 33 [IO wait]:
Nov 14 09:38:18 int020522 dockerd[18624]: internal/poll.runtime_pollWait(0x7fcb32003c90, 0x72, 0xc42006abb8)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/runtime/netpoll.go:173 +0x59
Nov 14 09:38:18 int020522 dockerd[18624]: internal/poll.(*pollDesc).wait(0xc420010698, 0x72, 0xffffffffffffff00, 0x55ef18e2c800, 0x55ef199ba288)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/internal/poll/fd_poll_runtime.go:85 +0x9d
Nov 14 09:38:18 int020522 dockerd[18624]: internal/poll.(*pollDesc).waitRead(0xc420010698, 0xc420016000, 0x8000, 0x8000)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/internal/poll/fd_poll_runtime.go:90 +0x3f
Nov 14 09:38:18 int020522 dockerd[18624]: internal/poll.(*FD).Read(0xc420010680, 0xc420016000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/internal/poll/fd_unix.go:157 +0x17f
Nov 14 09:38:18 int020522 dockerd[18624]: net.(*netFD).Read(0xc420010680, 0xc420016000, 0x8000, 0x8000, 0x11, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/net/fd_unix.go:202 +0x51
Nov 14 09:38:18 int020522 dockerd[18624]: net.(*conn).Read(0xc4200c2770, 0xc420016000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/net/net.go:176 +0x6c
Nov 14 09:38:18 int020522 dockerd[18624]: bufio.(*Reader).Read(0xc4202baa20, 0xc4203f2038, 0x9, 0x9, 0x20, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/bufio/bufio.go:216 +0x23a
Nov 14 09:38:18 int020522 dockerd[18624]: io.ReadAtLeast(0x55ef18e25520, 0xc4202baa20, 0xc4203f2038, 0x9, 0x9, 0x9, 0xc42006adf0, 0x55ef16c9f0e0, 0xc42006ae9f)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/io/io.go:309 +0x88
Nov 14 09:38:18 int020522 dockerd[18624]: io.ReadFull(0x55ef18e25520, 0xc4202baa20, 0xc4203f2038, 0x9, 0x9, 0x55ef3787be7e, 0x3787be7e2794492c, 0x5bebdef9)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/io/io.go:327 +0x5a
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/golang.org/x/net/http2.readFrameHeader(0xc4203f2038, 0x9, 0x9, 0x55ef18e25520, 0xc4202baa20, 0x0, 0xbef3159e00000000, 0x70c637ff8c2f7, 0x55ef19a18fe0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/golang.org/x/net/http2/frame.go:237 +0x7d
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/golang.org/x/net/http2.(*Framer).ReadFrame(0xc4203f2000, 0xc427944920, 0xc427944920, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/golang.org/x/net/http2/frame.go:492 +0xa6
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc/transport.(*http2Client).reader(0xc4201c0000)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/http2_client.go:1123 +0x117
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/vendor/google.golang.org/grpc/transport.newHTTP2Client
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/http2_client.go:265 +0xb41
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 66 [select]:
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc/transport.(*controlBuffer).get(0xc420376440, 0x1, 0x0, 0x0, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/controlbuf.go:289 +0x135
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc/transport.(*loopyWriter).run(0xc42008bf20)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/controlbuf.go:374 +0x1be
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc/transport.newHTTP2Client.func3(0xc4201c0000)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/http2_client.go:298 +0x7e
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/vendor/google.golang.org/grpc/transport.newHTTP2Client
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/http2_client.go:296 +0xc91
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 55 [select]:
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/libcontainerd.(*remote).monitorConnection(0xc4203a28c0, 0xc42007cc60)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/libcontainerd/remote_daemon.go:267 +0x11f
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/libcontainerd.New
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/libcontainerd/remote_daemon.go:116 +0x58e
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 56 [select, 33064 minutes, locked to thread]:
Nov 14 09:38:18 int020522 dockerd[18624]: runtime.gopark(0x55ef18e15948, 0x0, 0x55ef182074ab, 0x6, 0x18, 0x1)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/runtime/proc.go:291 +0x120
Nov 14 09:38:18 int020522 dockerd[18624]: runtime.selectgo(0xc4204a2f50, 0xc4201bef00)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/runtime/select.go:392 +0xe56
Nov 14 09:38:18 int020522 dockerd[18624]: runtime.ensureSigM.func1()
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/runtime/signal_unix.go:549 +0x1f6
Nov 14 09:38:18 int020522 dockerd[18624]: runtime.goexit()
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/runtime/asm_amd64.s:2361 +0x1
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 57 [chan receive, 8583 minutes]:
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/pkg/signal.Trap.func1(0xc4202e2060, 0x55ef18e27c00, 0xc4200c41e0, 0xc42000d640)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/pkg/signal/trap.go:38 +0x5d
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/pkg/signal.Trap
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/pkg/signal/trap.go:36 +0x120
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 58 [chan receive, 33065 minutes]:
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/daemon.(*Daemon).setupDumpStackTrap.func1(0xc4202bac00, 0x55ef18215e8b, 0xf)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/debugtrap_unix.go:18 +0x46
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/daemon.(*Daemon).setupDumpStackTrap
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/debugtrap_unix.go:17 +0xc1
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 63 [IO wait, 33065 minutes]:
Nov 14 09:38:18 int020522 dockerd[18624]: internal/poll.runtime_pollWait(0x7fcb32003bc0, 0x72, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/runtime/netpoll.go:173 +0x59
Nov 14 09:38:18 int020522 dockerd[18624]: internal/poll.(*pollDesc).wait(0xc4202bd598, 0x72, 0xc420376800, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/internal/poll/fd_poll_runtime.go:85 +0x9d
Nov 14 09:38:18 int020522 dockerd[18624]: internal/poll.(*pollDesc).waitRead(0xc4202bd598, 0xffffffffffffff00, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/internal/poll/fd_poll_runtime.go:90 +0x3f
Nov 14 09:38:18 int020522 dockerd[18624]: internal/poll.(*FD).Accept(0xc4202bd580, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/internal/poll/fd_unix.go:372 +0x1aa
Nov 14 09:38:18 int020522 dockerd[18624]: net.(*netFD).accept(0xc4202bd580, 0xc42006be58, 0x55ef16cb00ea, 0x30)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/net/fd_unix.go:238 +0x44
Nov 14 09:38:18 int020522 dockerd[18624]: net.(*UnixListener).accept(0xc42003d470, 0x55ef16db30fc, 0x55ef18c90500, 0xc4203688a0)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/net/unixsock_posix.go:162 +0x34
Nov 14 09:38:18 int020522 dockerd[18624]: net.(*UnixListener).Accept(0xc42003d470, 0xc4200c0040, 0x55ef18a9c620, 0x55ef19995210, 0x55ef18dd3860)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/net/unixsock.go:253 +0x4b
Nov 14 09:38:18 int020522 dockerd[18624]: net/http.(*Server).Serve(0xc4202f05b0, 0x55ef18e476e0, 0xc42003d470, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/net/http/server.go:2770 +0x1a7
Nov 14 09:38:18 int020522 dockerd[18624]: net/http.Serve(0x55ef18e476e0, 0xc42003d470, 0x55ef18e287c0, 0xc42003d4a0, 0x55ef16cd296a, 0x55ef18e157d8)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/net/http/server.go:2389 +0x75
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/daemon.(*Daemon).listenMetricsSock.func1(0x55ef18e476e0, 0xc42003d470, 0xc42003d4a0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/metrics_unix.go:31 +0x4d
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/daemon.(*Daemon).listenMetricsSock
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/metrics_unix.go:30 +0x195
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 64 [select, 33065 minutes]:
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc.(*ccResolverWrapper).watcher(0xc42003dec0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/resolver_conn_wrapper.go:109 +0x184
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/vendor/google.golang.org/grpc.(*ccResolverWrapper).start
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/resolver_conn_wrapper.go:95 +0x41
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 65 [select, 33065 minutes]:
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc.(*ccBalancerWrapper).watcher(0xc420198640)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/balancer_conn_wrappers.go:122 +0x14c
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/vendor/google.golang.org/grpc.newCCBalancerWrapper
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/balancer_conn_wrappers.go:113 +0x151
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 82 [select, 33065 minutes]:
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc.(*addrConn).transportMonitor(0xc420436b00)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/clientconn.go:1373 +0x23d
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc.(*addrConn).connect.func1(0xc420436b00)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/clientconn.go:949 +0x1b7
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/vendor/google.golang.org/grpc.(*addrConn).connect
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/clientconn.go:940 +0xe3
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 84 [IO wait, 95 minutes]:
Nov 14 09:38:18 int020522 dockerd[18624]: internal/poll.runtime_pollWait(0x7fcb32003af0, 0x72, 0xc4202dbbb8)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/runtime/netpoll.go:173 +0x59
Nov 14 09:38:18 int020522 dockerd[18624]: internal/poll.(*pollDesc).wait(0xc4202bd798, 0x72, 0xffffffffffffff00, 0x55ef18e2c800, 0x55ef199ba288)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/internal/poll/fd_poll_runtime.go:85 +0x9d
Nov 14 09:38:18 int020522 dockerd[18624]: internal/poll.(*pollDesc).waitRead(0xc4202bd798, 0xc420266000, 0x8000, 0x8000)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/internal/poll/fd_poll_runtime.go:90 +0x3f
Nov 14 09:38:18 int020522 dockerd[18624]: internal/poll.(*FD).Read(0xc4202bd780, 0xc420266000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/internal/poll/fd_unix.go:157 +0x17f
Nov 14 09:38:18 int020522 dockerd[18624]: net.(*netFD).Read(0xc4202bd780, 0xc420266000, 0x8000, 0x8000, 0x11, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/net/fd_unix.go:202 +0x51
Nov 14 09:38:18 int020522 dockerd[18624]: net.(*conn).Read(0xc4201820f0, 0xc420266000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/net/net.go:176 +0x6c
Nov 14 09:38:18 int020522 dockerd[18624]: bufio.(*Reader).Read(0xc4202e29c0, 0xc42027e038, 0x9, 0x9, 0xc4202ecc00, 0x4, 0xc4202dbd98)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/bufio/bufio.go:216 +0x23a
Nov 14 09:38:18 int020522 dockerd[18624]: io.ReadAtLeast(0x55ef18e25520, 0xc4202e29c0, 0xc42027e038, 0x9, 0x9, 0x9, 0xc4202dbe10, 0x3, 0x18)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/io/io.go:309 +0x88
Nov 14 09:38:18 int020522 dockerd[18624]: io.ReadFull(0x55ef18e25520, 0xc4202e29c0, 0xc42027e038, 0x9, 0x9, 0x55ef16cf4410, 0xc4201bf740, 0xc4202dbe58)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/io/io.go:327 +0x5a
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/golang.org/x/net/http2.readFrameHeader(0xc42027e038, 0x9, 0x9, 0x55ef18e25520, 0xc4202e29c0, 0x0, 0x55ef00000000, 0x1007fcb3205fd90, 0xc42006d5b0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/golang.org/x/net/http2/frame.go:237 +0x7d
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/golang.org/x/net/http2.(*Framer).ReadFrame(0xc42027e000, 0xc42d78e7e0, 0xc42d78e7e0, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/golang.org/x/net/http2/frame.go:492 +0xa6
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc/transport.(*http2Client).reader(0xc420282000)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/http2_client.go:1123 +0x117
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/vendor/google.golang.org/grpc/transport.newHTTP2Client
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/http2_client.go:265 +0xb41
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 85 [select, 95 minutes]:
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc/transport.(*controlBuffer).get(0xc420198740, 0x1, 0x0, 0x0, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/controlbuf.go:289 +0x135
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc/transport.(*loopyWriter).run(0xc4202bacc0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/controlbuf.go:374 +0x1be
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc/transport.newHTTP2Client.func3(0xc420282000)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/http2_client.go:298 +0x7e
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/vendor/google.golang.org/grpc/transport.newHTTP2Client
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/http2_client.go:296 +0xc91
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 86 [select, 33065 minutes]:
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/libcontainerd.(*client).processEventStream(0xc420390000, 0x55ef18e49460, 0xc4200c7fc0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/libcontainerd/client_daemon.go:751 +0x379
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/libcontainerd.(*remote).NewClient
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/libcontainerd/remote_daemon.go:136 +0x24b
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 45 [select, 33065 minutes]:
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc.newClientStream.func5(0xc42025e000, 0xc420014200, 0x55ef18e49520, 0xc420338150)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/stream.go:311 +0x100
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/vendor/google.golang.org/grpc.newClientStream
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/stream.go:310 +0xa78
Nov 14 09:38:18 int020522 dockerd[18624]: goroutine 46 [select, 33065 minutes]:
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc/transport.(*recvBufferReader).read(0xc42032b720, 0xc42033a9f0, 0x5, 0x5, 0x65, 0x1d, 0x50)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/transport.go:142 +0x1eb
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc/transport.(*recvBufferReader).Read(0xc42032b720, 0xc42033a9f0, 0x5, 0x5, 0x55ef172f6d01, 0xc42003e3a0, 0xc4202dcae0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/transport.go:131 +0x69
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc/transport.(*transportReader).Read(0xc420338240, 0xc42033a9f0, 0x5, 0x5, 0x65, 0xc4202dcb20, 0x55ef1736618a)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/transport.go:369 +0x57
Nov 14 09:38:18 int020522 dockerd[18624]: io.ReadAtLeast(0x55ef18e283e0, 0xc420338240, 0xc42033a9f0, 0x5, 0x5, 0x5, 0xc420282000, 0xc420540000, 0xc400000005)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/io/io.go:309 +0x88
Nov 14 09:38:18 int020522 dockerd[18624]: io.ReadFull(0x55ef18e283e0, 0xc420338240, 0xc42033a9f0, 0x5, 0x5, 0xc4202dcbf0, 0x55ef16cadf8f, 0x55ef18b0eba0)
Nov 14 09:38:18 int020522 dockerd[18624]: /usr/local/go/src/io/io.go:327 +0x5a
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc/transport.(*Stream).Read(0xc420540000, 0xc42033a9f0, 0x5, 0x5, 0xc4205e20a0, 0x91, 0x91)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/transport.go:353 +0xc1
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc.(*parser).recvMsg(0xc42033a9e0, 0x1000000, 0xc4202ecd80, 0x3, 0xc4202dcf08, 0xc4202ecd80, 0xc4202d64a0, 0xc4202dcec8)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/rpc_util.go:452 +0x67
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc.recv(0xc42033a9e0, 0x7fcb3200cea0, 0x55ef19a3a188, 0xc420540000, 0x0, 0x0, 0x55ef18c98f40, 0xc420376dc0, 0x1000000, 0x0, ...)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/rpc_util.go:561 +0x4f
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc.(*csAttempt).recvMsg(0xc420542000, 0x55ef18c98f40, 0xc420376dc0, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/stream.go:529 +0x134
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/google.golang.org/grpc.(*clientStream).RecvMsg(0xc420014200, 0x55ef18c98f40, 0xc420376dc0, 0xc4202b6900, 0x1)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/google.golang.org/grpc/stream.go:395 +0x45
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/github.com/containerd/containerd/api/services/events/v1.(*eventsSubscribeClient).Recv(0xc4203f8c70, 0x0, 0x0, 0x0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/github.com/containerd/containerd/api/services/events/v1/events.pb.go:209 +0x64
Nov 14 09:38:18 int020522 dockerd[18624]: github.com/docker/docker/vendor/github.com/containerd/containerd.(*eventRemote).Subscribe.func1(0xc4202d7440, 0x55ef18e54fa0, 0xc4203f8c70, 0xc4202b6960, 0x55ef18e49460, 0xc4200c7fc0)
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/github.com/containerd/containerd/events.go:99 +0x7a
Nov 14 09:38:18 int020522 dockerd[18624]: created by github.com/docker/docker/vendor/github.com/containerd/containerd.(*eventRemote).Subscribe
Nov 14 09:38:18 int020522 dockerd[18624]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/github.com/containerd/containerd/events.go:95 +0x1bb
Nov 14 09:38:48 int020522 dockerd[18624]: sync duration of 11.076656207s, expected less than 1s

This dump appears on only one of the 3 managers; the other 2 show logs such as:

Nov 14 09:37:38 int020521 dockerd[849]: time="2018-11-14T09:37:38.351718994+01:00" level=info msg="manager selected by agent for new session: { }" module=node/agent node.id=eyfb92om7v0g8osi2i93rruy0
Nov 14 09:37:38 int020521 dockerd[849]: time="2018-11-14T09:37:38.351767611+01:00" level=info msg="waiting 356.630906ms before registering session" module=node/agent node.id=eyfb92om7v0g8osi2i93rruy0
Nov 14 09:37:39 int020521 dockerd[849]: time="2018-11-14T09:37:39.557138300+01:00" level=warning msg="memberlist: Refuting a suspect message (from: 79b85c7e9512)"
Nov 14 09:37:43 int020521 dockerd[849]: time="2018-11-14T09:37:43.542961618+01:00" level=info msg="memberlist: Suspect 0a9e7a91e12a has failed, no acks received"
Nov 14 09:37:51 int020521 dockerd[849]: time="2018-11-14T09:37:51.484614862+01:00" level=error msg="heartbeat to manager { } failed" error="rpc error: code = DeadlineExceeded desc = context deadline exceeded" method="(*session).heartbeat"
Nov 14 09:37:57 int020521 dockerd[849]: time="2018-11-14T09:37:57.651680868+01:00" level=error msg="error receiving response" error="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Nov 14 09:38:00 int020521 dockerd[849]: time="2018-11-14T09:38:00.007012307+01:00" level=error msg="error while reading from stream" error="rpc error: code = Canceled desc = context canceled"
Nov 14 09:38:00 int020521 dockerd[849]: time="2018-11-14T09:38:00.007214746+01:00" level=error msg="error while reading from stream" error="rpc error: code = Canceled desc = context canceled"
Nov 14 09:38:00 int020521 dockerd[849]: time="2018-11-14T09:38:00.007277176+01:00" level=error msg="error while reading from stream" error="rpc error: code = Canceled desc = context canceled"
Nov 14 09:38:00 int020521 dockerd[849]: time="2018-11-14T09:38:00.684388095+01:00" level=error msg="error receiving response" error="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Nov 14 09:38:00 int020521 dockerd[849]: time="2018-11-14T09:38:00.792709786+01:00" level=info msg="memberlist: Suspect 4b1cf53873fb has failed, no acks received"
Nov 14 09:38:03 int020521 dockerd[849]: time="2018-11-14T09:38:03.792478217+01:00" level=error msg="Error getting tasks: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Nov 14 09:38:03 int020521 dockerd[849]: time="2018-11-14T09:38:03.792618812+01:00" level=error msg="Handler for GET /tasks returned error: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Nov 14 09:38:04 int020521 dockerd[849]: time="2018-11-14T09:38:04.282694609+01:00" level=error msg="error while reading from stream" error="rpc error: code = Canceled desc = context canceled"
Nov 14 09:38:06 int020521 dockerd[849]: time="2018-11-14T09:38:06.554832190+01:00" level=error msg="error receiving response" error="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Nov 14 09:38:06 int020521 dockerd[849]: time="2018-11-14T09:38:06.555065184+01:00" level=error msg="error while reading from stream" error="rpc error: code = Canceled desc = context canceled"
Nov 14 09:38:06 int020521 dockerd[849]: time="2018-11-14T09:38:06.555159784+01:00" level=error msg="error while reading from stream" error="rpc error: code = Canceled desc = context canceled"
Nov 14 09:38:09 int020521 dockerd[849]: time="2018-11-14T09:38:09.454666203+01:00" level=error msg="error receiving response" error="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Nov 14 09:38:10 int020521 dockerd[849]: time="2018-11-14T09:38:10.636116137+01:00" level=warning msg="memberlist: Failed to push local state: write tcp 10.2.5.33:7946->10.2.5.37:33812: i/o timeout from=10.2.5.37:33812"
Nov 14 09:38:16 int020521 dockerd[849]: time="2018-11-14T09:38:16.820984395+01:00" level=error msg="error while reading from stream" error="rpc error: code = Canceled desc = context canceled"
Nov 14 09:38:17 int020521 dockerd[849]: time="2018-11-14T09:38:17.378482227+01:00" level=warning msg="memberlist: Failed to push local state: write tcp 10.2.5.33:7946->10.2.5.36:58784: i/o timeout from=10.2.5.36:58784"
Nov 14 09:38:17 int020521 dockerd[849]: time="2018-11-14T09:38:17.378925104+01:00" level=warning msg="memberlist: Failed to push local state: write tcp 10.2.5.33:7946->10.2.5.34:38172: i/o timeout from=10.2.5.34:38172"
Nov 14 09:38:17 int020521 dockerd[849]: time="2018-11-14T09:38:17.379366371+01:00" level=warning msg="memberlist: Refuting a suspect message (from: 79b85c7e9512)"
Nov 14 09:38:18 int020521 dockerd[849]: time="2018-11-14T09:38:18.541191141+01:00" level=error msg="error while reading from stream" error="rpc error: code = Canceled desc = context canceled"
Nov 14 09:38:18 int020521 dockerd[849]: time="2018-11-14T09:38:18.541443509+01:00" level=error msg="Error getting services: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Nov 14 09:38:18 int020521 dockerd[849]: time="2018-11-14T09:38:18.541518401+01:00" level=error msg="Handler for GET /v1.22/services returned error: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Nov 14 09:38:18 int020521 dockerd[849]: time="2018-11-14T09:38:18.590921998+01:00" level=error msg="error while reading from stream" error="rpc error: code = Canceled desc = context canceled"
Nov 14 09:38:18 int020521 dockerd[849]: time="2018-11-14T09:38:18.684823084+01:00" level=error msg="error while reading from stream" error="rpc error: code = Canceled desc = context canceled"
Nov 14 09:38:18 int020521 dockerd[849]: time="2018-11-14T09:38:18.783440428+01:00" level=info msg="Node 29fdf1feb7d0/10.2.5.32, left gossip cluster"
Nov 14 09:38:18 int020521 dockerd[849]: time="2018-11-14T09:38:18.843426761+01:00" level=info msg="Node 29fdf1feb7d0 change state NodeActive --> NodeFailed"
Nov 14 09:38:18 int020521 dockerd[849]: time="2018-11-14T09:38:18.922176988+01:00" level=info msg="Node 29fdf1feb7d0/10.2.5.32, added to failed nodes list"
Nov 14 09:38:18 int020521 dockerd[849]: time="2018-11-14T09:38:18.939693741+01:00" level=info msg="Node 29fdf1feb7d0/10.2.5.32, joined gossip cluster"
Nov 14 09:38:19 int020521 dockerd[849]: time="2018-11-14T09:38:18.975033585+01:00" level=info msg="Node 29fdf1feb7d0 change state NodeFailed --> NodeActive"
Nov 14 09:38:19 int020521 dockerd[849]: time="2018-11-14T09:38:18.975284620+01:00" level=info msg="Node 29fdf1feb7d0/10.2.5.32, left gossip cluster"
Nov 14 09:38:19 int020521 dockerd[849]: time="2018-11-14T09:38:19.171412844+01:00" level=info msg="Node 29fdf1feb7d0 change state NodeActive --> NodeFailed"
Nov 14 09:38:19 int020521 dockerd[849]: time="2018-11-14T09:38:19.246547054+01:00" level=warning msg="memberlist: Failed to push local state: write tcp 10.2.5.33:7946->10.2.5.32:54448: i/o timeout from=10.2.5.32:54448"
Nov 14 09:38:19 int020521 dockerd[849]: time="2018-11-14T09:38:19.259079148+01:00" level=info msg="Node 29fdf1feb7d0/10.2.5.32, added to failed nodes list"
Nov 14 09:38:19 int020521 dockerd[849]: time="2018-11-14T09:38:19.259849145+01:00" level=info msg="memberlist: Suspect 0a9e7a91e12a has failed, no acks received"
Nov 14 09:38:19 int020521 dockerd[849]: time="2018-11-14T09:38:19.261016113+01:00" level=info msg="Node 29fdf1feb7d0/10.2.5.32, joined gossip cluster"
Nov 14 09:38:19 int020521 dockerd[849]: time="2018-11-14T09:38:19.261254928+01:00" level=info msg="Node 29fdf1feb7d0 change state NodeFailed --> NodeActive"
Nov 14 09:38:19 int020521 dockerd[849]: time="2018-11-14T09:38:19.771817478+01:00" level=error msg="Error getting nodes: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Nov 14 09:38:19 int020521 dockerd[849]: time="2018-11-14T09:38:19.771902515+01:00" level=error msg="Handler for GET /v1.35/nodes returned error: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Nov 14 09:38:22 int020521 dockerd[849]: time="2018-11-14T09:38:22.202186309+01:00" level=warning msg="NetworkDB stats int020521(79b85c7e9512) - healthscore:5 (connectivity issues)"
Nov 14 09:38:29 int020521 dockerd[849]: time="2018-11-14T09:38:29.300432239+01:00" level=info msg="memberlist: Suspect 29fdf1feb7d0 has failed, no acks received"
Nov 14 09:38:30 int020521 dockerd[849]: time="2018-11-14T09:38:30.792494913+01:00" level=info msg="Node 29fdf1feb7d0/10.2.5.32, left gossip cluster"
Nov 14 09:38:30 int020521 dockerd[849]: time="2018-11-14T09:38:30.792552538+01:00" level=info msg="Node 29fdf1feb7d0 change state NodeActive --> NodeFailed"

It seems that the managers are having trouble communicating with each other, but I am unsure why.
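One way to narrow this down might be to check, from each node, whether the swarm TCP ports on the managers are reachable at all. The following is only a minimal sketch: `probe_tcp` and `probe_managers` are hypothetical helper names, and it assumes the standard swarm ports (2377/tcp for cluster management, 7946/tcp for gossip). It does not cover 7946/udp or 4789/udp, which a plain TCP connect cannot verify.

```python
import socket

# Swarm TCP ports: 2377 (cluster management) and 7946 (node gossip).
# 7946/udp and 4789/udp (overlay traffic) also matter, but they cannot be
# checked with a plain TCP connect, so they are not covered here.
SWARM_TCP_PORTS = (2377, 7946)


def probe_tcp(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe_managers(hosts, ports=SWARM_TCP_PORTS, timeout=2.0):
    """Map each (host, port) pair to whether it was reachable."""
    return {(h, p): probe_tcp(h, p, timeout) for h in hosts for p in ports}
```

Calling `probe_managers(["10.2.5.32", "10.2.5.33", "10.2.5.34"])` with the manager addresses from `docker info` returns a dictionary of reachability results. A successful connect only shows that the port is open at that moment; it would not catch the intermittent timeouts seen in the logs unless it is run repeatedly during an incident.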

I would appreciate any assistance you can give in solving this issue.

Output of docker version:

Client:
 Version:           18.06.0-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        0ffa825
 Built:             Wed Jul 18 19:08:18 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.0-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       0ffa825
  Built:            Wed Jul 18 19:10:42 2018
  OS/Arch:          linux/amd64
  Experimental:     true

Output of docker info:

Containers: 21
 Running: 6
 Paused: 0
 Stopped: 15
Images: 11
Server Version: 18.06.0-ce
Storage Driver: overlay2
 Backing Filesystem: xfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host ipvlan macvlan null overlay weaveworks/net-plugin:latest_release
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: active
 NodeID: 04wzeypxq4nz49yz4uhf6ydc0
 Is Manager: true
 ClusterID: j27guugg9bq6zk2k2a9mp20pc
 Managers: 3
 Nodes: 8
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 10
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Autolock Managers: false
 Root Rotation In Progress: false
 Node Address: 10.2.5.32
 Manager Addresses:
  10.2.5.32:2377
  10.2.5.33:2377
  10.2.5.34:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: d64c661f1d51c48782c9cec8fda7604785f93587
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-862.9.1.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 1.795GiB
Name: int020520
ID: UHSX:ENDP:X2RM:3XBV:IM2Y:YXPO:UIXT:RA47:EPT6:DYVG:VUZO:G3ZQ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: true
Insecure Registries:
 docker.bbn.intergral.com:5000
 docker.bbn.intergral.com:5050
 127.0.0.0/8
Live Restore Enabled: false

WARNING: bridge-nf-call-ip6tables is disabled

Additional environment details (AWS, VirtualBox, physical, etc.):

Virtualization: kvm
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:
Kernel: Linux 3.10.0-862.9.1.el7.x86_64
Architecture: x86-64


Answer questions bugwheels94

I am also seeing the same issue whenever the heartbeat fails. Node 4adb11869318 (the manager) and node e7b284330420 (the worker) have been hitting it very frequently.

My manager node logs:

Apr 18 09:49:22 kms-mediator dockerd[25059]: time="2019-04-18T09:49:22.467787577Z" level=info msg="memberlist: Suspect e7b284330420 has failed, no acks received"
Apr 18 09:49:23 kms-mediator dockerd[25059]: time="2019-04-18T09:49:23.267884494Z" level=warning msg="memberlist: Refuting a suspect message (from: e7b284330420)"
Apr 18 09:49:25 kms-mediator dockerd[25059]: time="2019-04-18T09:49:25.467471174Z" level=warning msg="memberlist: Was able to connect to e7b284330420 but other probes failed, network may be misconfigured"
Apr 18 09:49:25 kms-mediator dockerd[25059]: time="2019-04-18T09:49:25.728220236Z" level=info msg="memberlist: Marking e7b284330420 as failed, suspect timeout reached (0 peer confirmations)"
Apr 18 09:49:25 kms-mediator dockerd[25059]: time="2019-04-18T09:49:25.728834806Z" level=info msg="Node e7b284330420/142.93.61.181, left gossip cluster"
Apr 18 09:49:25 kms-mediator dockerd[25059]: time="2019-04-18T09:49:25.729191439Z" level=info msg="Node e7b284330420 change state NodeActive --> NodeFailed"
Apr 18 09:49:25 kms-mediator dockerd[25059]: time="2019-04-18T09:49:25.731386994Z" level=info msg="Node e7b284330420/142.93.61.181, added to failed nodes list"
Apr 18 09:49:25 kms-mediator kernel: [3376954.372918] IPVS: __ip_vs_del_service: enter
Apr 18 09:49:25 kms-mediator kernel: [3376954.481196] IPVS: __ip_vs_del_service: enter
Apr 18 09:49:25 kms-mediator kernel: [3376954.550956] IPVS: __ip_vs_del_service: enter
Apr 18 09:49:25 kms-mediator kernel: [3376954.622775] IPVS: __ip_vs_del_service: enter
Apr 18 09:49:26 kms-mediator kernel: [3376954.694568] IPVS: __ip_vs_del_service: enter
Apr 18 09:49:26 kms-mediator dockerd[25059]: time="2019-04-18T09:49:26.468973653Z" level=error msg="node: e7b284330420 is unknown to memberlist"
Apr 18 09:49:26 kms-mediator dockerd[25059]: time="2019-04-18T09:49:26.665523578Z" level=info msg="Node e7b284330420/142.93.61.181, joined gossip cluster"
Apr 18 09:49:26 kms-mediator dockerd[25059]: time="2019-04-18T09:49:26.666218716Z" level=info msg="Node e7b284330420 change state NodeFailed --> NodeActive"
Apr 18 09:49:28 kms-mediator dockerd[25059]: time="2019-04-18T09:49:28.467625225Z" level=info msg="memberlist: Suspect e7b284330420 has failed, no acks received"
Apr 18 09:49:30 kms-mediator dockerd[25059]: time="2019-04-18T09:49:30.468476025Z" level=info msg="memberlist: Suspect 3af987f41544 has failed, no acks received"
Apr 18 09:49:32 kms-mediator dockerd[25059]: time="2019-04-18T09:49:32.468450928Z" level=info msg="memberlist: Marking e7b284330420 as failed, suspect timeout reached (0 peer confirmations)"
Apr 18 09:49:32 kms-mediator dockerd[25059]: time="2019-04-18T09:49:32.469189666Z" level=info msg="Node e7b284330420/142.93.61.181, left gossip cluster"
Apr 18 09:49:32 kms-mediator dockerd[25059]: time="2019-04-18T09:49:32.469470452Z" level=info msg="Node e7b284330420 change state NodeActive --> NodeFailed"
Apr 18 09:49:32 kms-mediator dockerd[25059]: time="2019-04-18T09:49:32.470109864Z" level=info msg="Node e7b284330420/142.93.61.181, added to failed nodes list"
Apr 18 09:49:33 kms-mediator dockerd[25059]: time="2019-04-18T09:49:33.469522967Z" level=info msg="memberlist: Suspect e7b284330420 has failed, no acks received"
Apr 18 09:49:33 kms-mediator dockerd[25059]: time="2019-04-18T09:49:33.667422538Z" level=warning msg="NetworkDB stats kms-mediator(4adb11869318) - healthscore:3 (connectivity issues)"
Apr 18 09:49:34 kms-mediator dockerd[25059]: time="2019-04-18T09:49:34.469674875Z" level=info msg="memberlist: Marking 3af987f41544 as failed, suspect timeout reached (0 peer confirmations)"
Apr 18 09:49:34 kms-mediator dockerd[25059]: time="2019-04-18T09:49:34.470319061Z" level=info msg="Node 3af987f41544/134.209.118.8, left gossip cluster"
Apr 18 09:49:34 kms-mediator dockerd[25059]: time="2019-04-18T09:49:34.470637570Z" level=info msg="Node 3af987f41544 change state NodeActive --> NodeFailed"
Apr 18 09:49:34 kms-mediator dockerd[25059]: time="2019-04-18T09:49:34.472656043Z" level=info msg="Node 3af987f41544/134.209.118.8, added to failed nodes list"
Apr 18 09:49:34 kms-mediator kernel: [3376963.113830] IPVS: __ip_vs_del_service: enter
Apr 18 09:49:34 kms-mediator kernel: [3376963.226682] IPVS: __ip_vs_del_service: enter
Apr 18 09:49:34 kms-mediator kernel: [3376963.308286] IPVS: __ip_vs_del_service: enter
Apr 18 09:49:36 kms-mediator dockerd[25059]: time="2019-04-18T09:49:36.869239471Z" level=warning msg="bulk sync to node e7b284330420 failed: failed to send a TCP message during bulk sync: dial tcp 142.93.61.181:7946: i/o timeout"
Apr 18 09:49:38 kms-mediator dockerd[25059]: time="2019-04-18T09:49:38.467811908Z" level=info msg="memberlist: Suspect 3af987f41544 has failed, no acks received"
Apr 18 09:49:41 kms-mediator dockerd[25059]: time="2019-04-18T09:49:41.192632861Z" level=warning msg="failed to deactivate service binding for container registry.1.xsbeeiivmethu6y39sla2rhm4" error="No such container: registry.1.xsbeeiivmethu6y39sla2rhm4" module=node/agent node.id=0ipcceidwwiwvbtt17gzm85qh
Apr 18 09:49:43 kms-mediator dockerd[25059]: time="2019-04-18T09:49:43.871978135Z" level=warning msg="memberlist: Refuting a suspect message (from: 4adb11869318)"
Apr 18 09:49:43 kms-mediator dockerd[25059]: time="2019-04-18T09:49:43.872527373Z" level=info msg="Node 3af987f41544/134.209.118.8, joined gossip cluster"
Apr 18 09:49:43 kms-mediator dockerd[25059]: time="2019-04-18T09:49:43.872804716Z" level=info msg="Node 3af987f41544 change state NodeFailed --> NodeActive"
Apr 18 09:49:50 kms-mediator dockerd[25059]: time="2019-04-18T09:49:50.467724250Z" level=warning msg="memberlist: Was able to connect to 3af987f41544 but other probes failed, network may be misconfigured"
Apr 18 09:49:54 kms-mediator dockerd[25059]: time="2019-04-18T09:49:54.268218780Z" level=warning msg="bulk sync to node 3af987f41544 failed: failed to send a TCP message during bulk sync: dial tcp 134.209.118.8:7946: i/o timeout"
Apr 18 09:49:56 kms-mediator dockerd[25059]: time="2019-04-18T09:49:56.467677675Z" level=warning msg="memberlist: Was able to connect to 3af987f41544 but other probes failed, network may be misconfigured"
Apr 18 09:49:58 kms-mediator systemd-udevd[27120]: Could not generate persistent MAC address for vethbd080e0: No such file or directory
Apr 18 09:49:58 kms-mediator kernel: [3376987.439119] veth6: renamed from veth749d935
Apr 18 09:49:58 kms-mediator kernel: [3376987.439403] device veth6 entered promiscuous mode
Apr 18 09:49:58 kms-mediator systemd-udevd[27132]: Could not generate persistent MAC address for veth168e384: No such file or directory
Apr 18 09:49:58 kms-mediator kernel: [3376987.443324] device veth12a1589 entered promiscuous mode
Apr 18 09:49:58 kms-mediator kernel: [3376987.443389] IPv6: ADDRCONF(NETDEV_UP): veth12a1589: link is not ready
Apr 18 09:49:58 kms-mediator kernel: [3376987.443393] docker_gwbridge: port 5(veth12a1589) entered forwarding state
Apr 18 09:49:58 kms-mediator kernel: [3376987.443405] docker_gwbridge: port 5(veth12a1589) entered forwarding state
Apr 18 09:49:58 kms-mediator systemd-udevd[27135]: Could not generate persistent MAC address for veth12a1589: No such file or directory
Apr 18 09:49:58 kms-mediator containerd[1449]: time="2019-04-18T09:49:58.823363801Z" level=info msg="shim containerd-shim started" address="/containerd-shim/moby/8196bb6167b12c8099c5804c752b5dc37a5d72955979f78e36ed928d8bcb0859/shim.sock" debug=false pid=27142
Apr 18 09:49:58 kms-mediator kernel: [3376987.546195] IPVS: Creating netns size=2192 id=259
Apr 18 09:49:59 kms-mediator kernel: [3376987.859396] eth0: renamed from vethbd080e0
Apr 18 09:49:59 kms-mediator kernel: [3376987.859641] docker_gwbridge: port 5(veth12a1589) entered disabled state
Apr 18 09:49:59 kms-mediator kernel: [3376987.859679] br0: port 4(veth6) entered forwarding state
Apr 18 09:49:59 kms-mediator kernel: [3376987.859692] br0: port 4(veth6) entered forwarding state
Apr 18 09:49:59 kms-mediator kernel: [3376987.955523] eth1: renamed from veth168e384
Apr 18 09:49:59 kms-mediator kernel: [3376987.955779] IPv6: ADDRCONF(NETDEV_CHANGE): veth12a1589: link becomes ready
Apr 18 09:49:59 kms-mediator kernel: [3376987.955811] docker_gwbridge: port 5(veth12a1589) entered forwarding state
Apr 18 09:49:59 kms-mediator kernel: [3376987.955819] docker_gwbridge: port 5(veth12a1589) entered forwarding state
Apr 18 09:49:59 kms-mediator dockerd[25059]: time="2019-04-18T09:49:59.852392328Z" level=info msg="worker gctoo2ifeyh7q36gndsvj8wzw was successfully registered" method="(*Dispatcher).register"
Apr 18 09:50:00 kms-mediator dockerd[25059]: time="2019-04-18T09:50:00.159227736Z" level=info msg="worker mb6891ny99kiw0spipbfdt73x was successfully registered" method="(*Dispatcher).register"
Apr 18 09:50:01 kms-mediator dockerd[25059]: time="2019-04-18T09:50:01.275724633Z" level=error msg="node: e7b284330420 is unknown to memberlist"
Apr 18 09:50:01 kms-mediator dockerd[25059]: time="2019-04-18T09:50:01.466315575Z" level=info msg="Node e7b284330420/142.93.61.181, joined gossip cluster"
Apr 18 09:50:01 kms-mediator dockerd[25059]: time="2019-04-18T09:50:01.467024727Z" level=info msg="Node e7b284330420 change state NodeFailed --> NodeActive"
Apr 18 09:50:01 kms-mediator dockerd[25059]: time="2019-04-18T09:50:01.468190934Z" level=warning msg="memberlist: Was able to connect to 3af987f41544 but other probes failed, network may be misconfigured"
Apr 18 09:50:04 kms-mediator dockerd[25059]: time="2019-04-18T09:50:04.468612334Z" level=warning msg="memberlist: Was able to connect to 3af987f41544 but other probes failed, network may be misconfigured"
Apr 18 09:50:14 kms-mediator kernel: [3377002.874676] br0: port 4(veth6) entered forwarding state
Apr 18 09:50:14 kms-mediator kernel: [3377003.002697] docker_gwbridge: port 5(veth12a1589) entered forwarding state
Apr 18 09:50:31 kms-mediator dockerd[25059]: time="2019-04-18T09:50:31.668648717Z" level=error msg="Bulk sync to node e7b284330420 timed out"
Apr 18 09:50:43 kms-mediator dockerd[25059]: time="2019-04-18T09:50:43.469005548Z" level=error msg="Bulk sync to node 3af987f41544 timed out"
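
To put a number on how often nodes flap in a log like the one above, the `NodeActive --> NodeFailed` transitions can be counted per node ID. This is a rough sketch, not part of Docker: `count_failures` is a hypothetical helper, and the regular expression assumes the exact `msg="Node <id> change state <old> --> <new>"` wording shown in these logs.

```python
import re
from collections import Counter

# Matches dockerd lines such as:
#   msg="Node e7b284330420 change state NodeActive --> NodeFailed"
TRANSITION = re.compile(
    r'msg="Node (?P<node>[0-9a-f]+) change state '
    r'(?P<old>\w+) --> (?P<new>\w+)"'
)


def count_failures(lines):
    """Count NodeActive --> NodeFailed transitions per node ID."""
    failures = Counter()
    for line in lines:
        m = TRANSITION.search(line)
        if m and m["old"] == "NodeActive" and m["new"] == "NodeFailed":
            failures[m["node"]] += 1
    return failures
```

Passing an open log file or `sys.stdin` as `lines` works, since both iterate line by line. Comparing the counts across nodes may help show whether one host is consistently the one being marked as failed.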

My worker node logs:

Apr 18 09:49:21 janus dockerd[1507]: time="2019-04-18T09:49:21.265113686Z" level=warning msg="memberlist: Was able to connect to 3af987f41544 but other probes failed, network may be misconfigured"
Apr 18 09:49:22 janus dockerd[1507]: time="2019-04-18T09:49:22.266022948Z" level=warning msg="memberlist: Was able to connect to 3af987f41544 but other probes failed, network may be misconfigured"
Apr 18 09:49:23 janus dockerd[1507]: time="2019-04-18T09:49:23.266784400Z" level=info msg="memberlist: Suspect 4adb11869318 has failed, no acks received"
Apr 18 09:49:25 janus dockerd[1507]: time="2019-04-18T09:49:25.139673348Z" level=error msg="heartbeat to manager {0ipcceidwwiwvbtt17gzm85qh 157.230.233.54:2377} failed" error="rpc error: code = DeadlineExceeded desc = context deadline exceeded" method="(*session).heartbeat" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw session.id=ujzh9mwhqrsdwvhd59nx98pg7 sessionID=ujzh9mwhqrsdwvhd59nx98pg7
Apr 18 09:49:25 janus dockerd[1507]: time="2019-04-18T09:49:25.140357481Z" level=error msg="agent: session failed" backoff=100ms error="rpc error: code = DeadlineExceeded desc = context deadline exceeded" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:25 janus dockerd[1507]: time="2019-04-18T09:49:25.140808542Z" level=info msg="parsed scheme: \"\"" module=grpc
Apr 18 09:49:25 janus dockerd[1507]: time="2019-04-18T09:49:25.141096434Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Apr 18 09:49:25 janus dockerd[1507]: time="2019-04-18T09:49:25.141603122Z" level=info msg="manager selected by agent for new session: {0ipcceidwwiwvbtt17gzm85qh 157.230.233.54:2377}" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:25 janus dockerd[1507]: time="2019-04-18T09:49:25.141734630Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{157.230.233.54:2377 0  <nil>}]" module=grpc
Apr 18 09:49:25 janus dockerd[1507]: time="2019-04-18T09:49:25.141992006Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Apr 18 09:49:25 janus dockerd[1507]: time="2019-04-18T09:49:25.142073678Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420169ca0, CONNECTING" module=grpc
Apr 18 09:49:25 janus dockerd[1507]: time="2019-04-18T09:49:25.141953965Z" level=info msg="waiting 99.356175ms before registering session" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:25 janus dockerd[1507]: time="2019-04-18T09:49:25.267614963Z" level=warning msg="memberlist: Was able to connect to 4adb11869318 but other probes failed, network may be misconfigured"
Apr 18 09:49:26 janus dockerd[1507]: time="2019-04-18T09:49:26.267901488Z" level=info msg="memberlist: Suspect 3af987f41544 has failed, no acks received"
Apr 18 09:49:26 janus dockerd[1507]: time="2019-04-18T09:49:26.468992358Z" level=warning msg="memberlist: Refuting a suspect message (from: e7b284330420)"
Apr 18 09:49:28 janus dockerd[1507]: time="2019-04-18T09:49:28.268244800Z" level=warning msg="memberlist: Was able to connect to 3af987f41544 but other probes failed, network may be misconfigured"
Apr 18 09:49:30 janus dockerd[1507]: time="2019-04-18T09:49:30.241756224Z" level=error msg="agent: session failed" backoff=300ms error="session initiation timed out" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:30 janus dockerd[1507]: time="2019-04-18T09:49:30.241890984Z" level=info msg="parsed scheme: \"\"" module=grpc
Apr 18 09:49:30 janus dockerd[1507]: time="2019-04-18T09:49:30.241927218Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Apr 18 09:49:30 janus dockerd[1507]: time="2019-04-18T09:49:30.242272705Z" level=info msg="manager selected by agent for new session: {0ipcceidwwiwvbtt17gzm85qh 157.230.233.54:2377}" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:30 janus dockerd[1507]: time="2019-04-18T09:49:30.242317533Z" level=info msg="waiting 244.443604ms before registering session" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:30 janus dockerd[1507]: time="2019-04-18T09:49:30.242383708Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{157.230.233.54:2377 0  <nil>}]" module=grpc
Apr 18 09:49:30 janus dockerd[1507]: time="2019-04-18T09:49:30.242438990Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Apr 18 09:49:30 janus dockerd[1507]: time="2019-04-18T09:49:30.242495184Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420169110, CONNECTING" module=grpc
Apr 18 09:49:30 janus dockerd[1507]: time="2019-04-18T09:49:30.268553787Z" level=info msg="memberlist: Suspect 4adb11869318 has failed, no acks received"
Apr 18 09:49:33 janus dockerd[1507]: time="2019-04-18T09:49:33.268914127Z" level=info msg="memberlist: Suspect 4adb11869318 has failed, no acks received"
Apr 18 09:49:34 janus dockerd[1507]: time="2019-04-18T09:49:34.268903472Z" level=info msg="memberlist: Marking 4adb11869318 as failed, suspect timeout reached (0 peer confirmations)"
Apr 18 09:49:34 janus dockerd[1507]: time="2019-04-18T09:49:34.268993067Z" level=info msg="Node 4adb11869318/157.230.233.54, left gossip cluster"
Apr 18 09:49:34 janus dockerd[1507]: time="2019-04-18T09:49:34.269028860Z" level=info msg="Node 4adb11869318 change state NodeActive --> NodeFailed"
Apr 18 09:49:34 janus dockerd[1507]: time="2019-04-18T09:49:34.269333679Z" level=info msg="Node 4adb11869318/157.230.233.54, added to failed nodes list"
Apr 18 09:49:34 janus kernel: [3504251.394876] IPVS: __ip_vs_del_service: enter
Apr 18 09:49:34 janus kernel: [3504251.495917] IPVS: __ip_vs_del_service: enter
Apr 18 09:49:34 janus kernel: [3504251.575377] IPVS: __ip_vs_del_service: enter
Apr 18 09:49:34 janus kernel: [3504251.659992] IPVS: __ip_vs_del_service: enter
Apr 18 09:49:35 janus dockerd[1507]: time="2019-04-18T09:49:35.487208278Z" level=error msg="agent: session failed" backoff=700ms error="session initiation timed out" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:35 janus dockerd[1507]: time="2019-04-18T09:49:35.487868265Z" level=info msg="parsed scheme: \"\"" module=grpc
Apr 18 09:49:35 janus dockerd[1507]: time="2019-04-18T09:49:35.488186744Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Apr 18 09:49:35 janus dockerd[1507]: time="2019-04-18T09:49:35.488704776Z" level=info msg="manager selected by agent for new session: {0ipcceidwwiwvbtt17gzm85qh 157.230.233.54:2377}" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:35 janus dockerd[1507]: time="2019-04-18T09:49:35.489027214Z" level=info msg="waiting 365.409385ms before registering session" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:35 janus dockerd[1507]: time="2019-04-18T09:49:35.488844574Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{157.230.233.54:2377 0  <nil>}]" module=grpc
Apr 18 09:49:35 janus dockerd[1507]: time="2019-04-18T09:49:35.489582889Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Apr 18 09:49:35 janus dockerd[1507]: time="2019-04-18T09:49:35.489939617Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc421378040, CONNECTING" module=grpc
Apr 18 09:49:37 janus dockerd[1507]: time="2019-04-18T09:49:37.269649164Z" level=info msg="memberlist: Suspect 3af987f41544 has failed, no acks received"
Apr 18 09:49:38 janus do-agent[1404]: 2019/04/18 09:49:38 Sending metrics to DigitalOcean: Post https://nyc1.sonar.digitalocean.com/v1/metrics/droplet_id/135362746: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Apr 18 09:49:39 janus dockerd[1507]: time="2019-04-18T09:49:39.438069338Z" level=warning msg="memberlist: Push/Pull with 4adb11869318 failed: dial tcp 157.230.233.54:7946: i/o timeout"
Apr 18 09:49:40 janus dockerd[1507]: time="2019-04-18T09:49:40.854892764Z" level=error msg="agent: session failed" backoff=1.5s error="session initiation timed out" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:40 janus dockerd[1507]: time="2019-04-18T09:49:40.855429738Z" level=info msg="parsed scheme: \"\"" module=grpc
Apr 18 09:49:40 janus dockerd[1507]: time="2019-04-18T09:49:40.855681824Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Apr 18 09:49:40 janus dockerd[1507]: time="2019-04-18T09:49:40.856119073Z" level=info msg="manager selected by agent for new session: {0ipcceidwwiwvbtt17gzm85qh 157.230.233.54:2377}" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:40 janus dockerd[1507]: time="2019-04-18T09:49:40.856362886Z" level=info msg="waiting 432.166258ms before registering session" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:40 janus dockerd[1507]: time="2019-04-18T09:49:40.856255184Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{157.230.233.54:2377 0  <nil>}]" module=grpc
Apr 18 09:49:40 janus dockerd[1507]: time="2019-04-18T09:49:40.856800015Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Apr 18 09:49:40 janus dockerd[1507]: time="2019-04-18T09:49:40.857067979Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc421378a70, CONNECTING" module=grpc
Apr 18 09:49:41 janus dockerd[1507]: time="2019-04-18T09:49:41.269968071Z" level=info msg="memberlist: Marking 3af987f41544 as failed, suspect timeout reached (0 peer confirmations)"
Apr 18 09:49:41 janus dockerd[1507]: time="2019-04-18T09:49:41.270588789Z" level=info msg="Node 3af987f41544/134.209.118.8, left gossip cluster"
Apr 18 09:49:41 janus dockerd[1507]: time="2019-04-18T09:49:41.270947436Z" level=info msg="Node 3af987f41544 change state NodeActive --> NodeFailed"
Apr 18 09:49:41 janus dockerd[1507]: time="2019-04-18T09:49:41.271527701Z" level=info msg="Node 3af987f41544/134.209.118.8, added to failed nodes list"
Apr 18 09:49:41 janus kernel: [3504258.396729] IPVS: __ip_vs_del_service: enter
Apr 18 09:49:41 janus kernel: [3504258.486638] IPVS: __ip_vs_del_service: enter
Apr 18 09:49:41 janus kernel: [3504258.575787] IPVS: __ip_vs_del_service: enter
Apr 18 09:49:42 janus dockerd[1507]: time="2019-04-18T09:49:42.270019328Z" level=info msg="memberlist: Suspect 3af987f41544 has failed, no acks received"
Apr 18 09:49:45 janus dockerd[1507]: time="2019-04-18T09:49:45.142356141Z" level=warning msg="Failed to dial 157.230.233.54:2377: grpc: the connection is closing; please retry." module=grpc
Apr 18 09:49:46 janus dockerd[1507]: time="2019-04-18T09:49:46.288940894Z" level=error msg="agent: session failed" backoff=3.1s error="session initiation timed out" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:46 janus dockerd[1507]: time="2019-04-18T09:49:46.289051117Z" level=info msg="parsed scheme: \"\"" module=grpc
Apr 18 09:49:46 janus dockerd[1507]: time="2019-04-18T09:49:46.289067940Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Apr 18 09:49:46 janus dockerd[1507]: time="2019-04-18T09:49:46.289289871Z" level=info msg="manager selected by agent for new session: {0ipcceidwwiwvbtt17gzm85qh 157.230.233.54:2377}" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:46 janus dockerd[1507]: time="2019-04-18T09:49:46.289332266Z" level=info msg="waiting 2.506216736s before registering session" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:46 janus dockerd[1507]: time="2019-04-18T09:49:46.289365687Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{157.230.233.54:2377 0  <nil>}]" module=grpc
Apr 18 09:49:46 janus dockerd[1507]: time="2019-04-18T09:49:46.289382561Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Apr 18 09:49:46 janus dockerd[1507]: time="2019-04-18T09:49:46.289446337Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420d15910, CONNECTING" module=grpc
Apr 18 09:49:50 janus dockerd[1507]: time="2019-04-18T09:49:50.242871855Z" level=warning msg="Failed to dial 157.230.233.54:2377: grpc: the connection is closing; please retry." module=grpc
Apr 18 09:49:53 janus dockerd[1507]: time="2019-04-18T09:49:53.796030247Z" level=error msg="agent: session failed" backoff=6.3s error="session initiation timed out" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:53 janus dockerd[1507]: time="2019-04-18T09:49:53.796147710Z" level=info msg="parsed scheme: \"\"" module=grpc
Apr 18 09:49:53 janus dockerd[1507]: time="2019-04-18T09:49:53.796164114Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Apr 18 09:49:53 janus dockerd[1507]: time="2019-04-18T09:49:53.796396647Z" level=info msg="manager selected by agent for new session: {0ipcceidwwiwvbtt17gzm85qh 157.230.233.54:2377}" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:53 janus dockerd[1507]: time="2019-04-18T09:49:53.796442884Z" level=info msg="waiting 5.029222702s before registering session" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw
Apr 18 09:49:53 janus dockerd[1507]: time="2019-04-18T09:49:53.796478551Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{157.230.233.54:2377 0  <nil>}]" module=grpc
Apr 18 09:49:53 janus dockerd[1507]: time="2019-04-18T09:49:53.796497536Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Apr 18 09:49:53 janus dockerd[1507]: time="2019-04-18T09:49:53.796563651Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4214ed5b0, CONNECTING" module=grpc
Apr 18 09:49:55 janus dockerd[1507]: time="2019-04-18T09:49:55.490382209Z" level=warning msg="Failed to dial 157.230.233.54:2377: grpc: the connection is closing; please retry." module=grpc
Apr 18 09:49:59 janus dockerd[1507]: time="2019-04-18T09:49:59.265377050Z" level=error msg="Failed to join memberlist [157.230.233.54] on retry: 1 error(s) occurred:\n\n* Failed to join 157.230.233.54: dial tcp 157.230.233.54:7946: i/o timeout"
Apr 18 09:49:59 janus dockerd[1507]: time="2019-04-18T09:49:59.801429986Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4214ed5b0, READY" module=grpc
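To make an excerpt like this easier to read, the following is a minimal sketch (not part of the original report) that pulls out two kinds of events from such journal lines: gossip members marked as failed, and agent session failures with their backoff. The function name `summarize` and the regular expressions are assumptions based only on the message formats visible in the log above; other Docker versions may word these messages differently.

```python
import re

# Patterns derived from the dockerd messages in the log excerpt (assumed formats).
FAILED = re.compile(r'msg="memberlist: Marking (\w+) as failed')
BACKOFF = re.compile(r'msg="agent: session failed" backoff=([\d.]+m?s)')
STAMP = re.compile(r'time="([^"]+)"')


def summarize(lines):
    """Return (failed_nodes, backoffs) found in an iterable of journal lines.

    Each entry is a (timestamp, value) tuple; the timestamp is None when the
    line carries no time="..." field.
    """
    failed, backoffs = [], []
    for line in lines:
        stamp = STAMP.search(line)
        when = stamp.group(1) if stamp else None
        match = FAILED.search(line)
        if match:
            failed.append((when, match.group(1)))
            continue
        match = BACKOFF.search(line)
        if match:
            backoffs.append((when, match.group(1)))
    return failed, backoffs


# Two lines copied from the excerpt above.
sample = [
    'Apr 18 09:49:34 janus dockerd[1507]: time="2019-04-18T09:49:34.268903472Z" '
    'level=info msg="memberlist: Marking 4adb11869318 as failed, suspect timeout '
    'reached (0 peer confirmations)"',
    'Apr 18 09:49:35 janus dockerd[1507]: time="2019-04-18T09:49:35.487208278Z" '
    'level=error msg="agent: session failed" backoff=700ms error="session '
    'initiation timed out" module=node/agent node.id=gctoo2ifeyh7q36gndsvj8wzw',
]

failed, backoffs = summarize(sample)
print(failed)
print(backoffs)
```

For the lines between 09:49:30 and 09:49:59 shown above, the same patterns pick up two members marked as failed (4adb11869318 and 3af987f41544) and four session failures whose backoff grows from 700ms to 1.5s, 3.1s and 6.3s.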

Output of docker version:

Client:
 Version:           18.09.3
 API version:       1.39
 Go version:        go1.10.8
 Git commit:        774a1f4
 Built:             Thu Feb 28 06:40:58 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.3
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.8
  Git commit:       774a1f4
  Built:            Thu Feb 28 05:59:55 2019
  OS/Arch:          linux/amd64
  Experimental:     false

Output of docker info:

Containers: 4
 Running: 4
 Paused: 0
 Stopped: 0
Images: 75
Server Version: 18.09.3
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: active
 NodeID: 0ipcceidwwiwvbtt17gzm85qh
 Is Manager: true
 ClusterID: plnokvrq4yq7dqi9js0x5wqdd
 Managers: 1
 Nodes: 3
 Default Address Pool: 10.0.0.0/8  
 SubnetSize: 24
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 10
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Autolock Managers: false
 Root Rotation In Progress: false
 Node Address: 157.230.233.54
 Manager Addresses:
  157.230.233.54:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: e6b3f5632f50dbc4e9cb6288d911bf4f5e95b18e
runc version: 6635b4f0c6af3810594d2770f662f34ddc15b40d
init version: fec3683
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-142-generic
Operating System: Ubuntu 16.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 1.953GiB
Name: kms-mediator
ID: YWTG:TJEH:IYMA:UKAE:ZBOE:ODEB:3IX6:GWTG:ILHM:Q3J7:53PJ:5EBV
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
 provider=digitalocean
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

WARNING: No swap limit support

Environment: Both nodes are separate DigitalOcean droplets
