Unable to get DNS resolution on swarm overlay network
A swarm cluster on our internal network uses an internal DNS server at 10.0.0.20. NOTE: the DNS server is not on the same network as the swarm cluster.
A swarm overlay network is using 10.0.0.0/24. Containers attached to this network can't get DNS resolution.
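The two facts above line up in a suspicious way: the overlay subnet 10.0.0.0/24 contains the upstream DNS server 10.0.0.20, so when the embedded resolver forwards a query, the container tries to reach 10.0.0.20 through the overlay rather than the underlay. A quick check, sketched with a placeholder network name (`my-overlay` is an assumption, not from the report):

```shell
# Hypothetical check: print the overlay's subnet and the host's nameserver.
# "my-overlay" is a placeholder for the actual overlay network name.
docker network inspect my-overlay \
  --format '{{range .IPAM.Config}}{{.Subnet}}{{end}}'
grep '^nameserver' /etc/resolv.conf
# If the nameserver IP falls inside the printed subnet, forwarded DNS
# queries are routed into the overlay and never reach the real server.
```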
`/etc/resolv.conf` on the host:

```
# Generated by NetworkManager
nameserver 10.0.0.20
```
`/etc/resolv.conf` in the container:

```
nameserver 127.0.0.11
options ndots:0
```
`ip addr` in the container:

```
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
119: eth0@if120: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP
    link/ether 02:42:0a:00:00:05 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.5/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet 10.0.0.4/32 scope global eth0
       valid_lft forever preferred_lft forever
121: eth1@if122: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
    link/ether 02:42:ac:12:00:07 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.7/16 scope global eth1
       valid_lft forever preferred_lft forever
123: eth2@if124: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP
    link/ether 02:42:0a:ff:00:0d brd ff:ff:ff:ff:ff:ff
    inet 10.255.0.13/16 scope global eth2
       valid_lft forever preferred_lft forever
    inet 10.255.0.12/32 scope global eth2
       valid_lft forever preferred_lft forever
```
Anything that tries to resolve a name fails:

```
# ping google.com
ping: bad address 'google.com'
```
I'm seeing errors from dockerd like this:

```
Sep 13 16:52:21 swarm-demo1 dockerd: time="2017-09-13T16:52:21.473125855-04:00" level=error msg="Reprogramming on L3 miss failed for 10.0.0.20, no peer entry"
```
After switching to a different internal DNS server that is not on the 10.0.0.0/24 network, resolution works normally:

```
# ping google.com
PING google.com (126.96.36.199): 56 data bytes
64 bytes from 188.8.131.52: seq=0 ttl=55 time=22.758 ms
64 bytes from 184.108.40.206: seq=1 ttl=55 time=9.352 ms
64 bytes from 220.127.116.11: seq=2 ttl=55 time=15.719 ms
^C
```
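That observation suggests a workaround, sketched below under assumed names (`my-overlay`, `demo`, and the addresses are placeholders): either recreate the overlay on a subnet that does not contain the upstream DNS server, or give the service an explicit resolver outside the overlay range.

```shell
# Option 1: move the overlay off the subnet that contains the DNS server.
docker network rm my-overlay
docker network create --driver overlay --subnet 10.10.0.0/24 my-overlay

# Option 2: point the service at a resolver outside the overlay subnet.
docker service create --name demo --network my-overlay \
  --dns 10.1.0.53 alpine sleep 1d
```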
Steps to reproduce the issue: I've done this on several internal swarms running different versions of Docker.
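A minimal reproduction as I understand it (network name and image are placeholders, not from the original report):

```shell
# On a host whose /etc/resolv.conf points at 10.0.0.20, create an
# attachable overlay whose subnet contains that address:
docker network create --driver overlay --attachable \
  --subnet 10.0.0.0/24 dns-overlap

# Run a container on it and try to resolve a name; this fails with
# "bad address" because forwarded DNS traffic is routed into the overlay:
docker run --rm --network dns-overlap alpine ping -c 1 google.com
```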
Describe the results you received: No DNS resolution
Describe the results you expected: Working DNS resolution inside containers attached to the overlay network.
Additional information you deem important (e.g. issue happens only occasionally):
Output of `docker version`:

```
Client:
 Version:      17.06.1-ce
 API version:  1.30
 Go version:   go1.8.3
 Git commit:   874a737
 Built:        Thu Aug 17 22:53:49 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.06.1-ce
 API version:  1.30 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   874a737
 Built:        Thu Aug 17 23:01:50 2017
 OS/Arch:      linux/amd64
 Experimental: false
```
Output of `docker info`:

```
Containers: 7
 Running: 7
 Paused: 0
 Stopped: 0
Images: 6
Server Version: 17.06.1-ce
Storage Driver: devicemapper
 Pool Name: docker-thinpool
 Pool Blocksize: 524.3kB
 Base Device Size: 10.74GB
 Backing Filesystem: xfs
 Data file:
 Metadata file:
 Data Space Used: 4.098GB
 Data Space Total: 102GB
 Data Space Available: 97.91GB
 Metadata Space Used: 1.552MB
 Metadata Space Total: 1.074GB
 Metadata Space Available: 1.072GB
 Thin Pool Minimum Free Space: 10.2GB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Library Version: 1.02.135-RHEL7 (2016-11-16)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: active
 NodeID: dej1htlbh6dultjpa5wt1ao9r
 Is Manager: true
 ClusterID: j666ldxnl401vc8et51c5ti60
 Managers: 3
 Nodes: 6
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Root Rotation In Progress: false
 Node Address: 10.2.252.121
 Manager Addresses:
  10.2.252.121:2377
  10.2.252.122:2377
  10.2.252.123:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 6e23458c129b551d5c9871e5174f6b1b7f6d1170
runc version: 810190ceaa507aa2727d7ae6f4790c76ec150bd2
init version: 949e6fa
Security Options: seccomp
 Profile: default
Kernel Version: 3.10.0-514.26.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 6
Total Memory: 7.781GiB
Name: swarm-demo1
ID: KHSR:U7MT:TRP2:U3EZ:OFD5:VTVD:NKUV:SOYD:3LS5:KCKF:5ZNC:3ANG
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 xxx.local:5000
 127.0.0.0/8
Live Restore Enabled: false
```
Additional environment details (AWS, VirtualBox, physical, etc.):
---

**itsgk92** commented:

I see a similar issue on a user-defined bridge network in swarm scope.