
etcd3 and kube-apiserver fail on terraform apply after terraform destroy with a kops-generated config

1. What kops version are you running? The command kops version will display this information.

Version 1.12.3 (git-e55205471)

2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag.

Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.10", GitCommit:"e3c134023df5dea457638b614ee17ef234dc34a6", GitTreeState:"clean", BuildDate:"2019-07-08T03:50:59Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}
Unable to connect to the server: EOF

3. What cloud provider are you using?

AWS

4. What commands did you run? What is the simplest way to reproduce this issue?

  • kops edit cluster to configure.
  • kops update cluster --out=./ --target=terraform to generate the Terraform config
  • terraform apply to apply the config.
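
Concretely, the destroy/apply cycle that triggers this looks roughly like the following sketch (the cluster name and state store are taken from the manifest below; adjust as needed):

terraform destroy
kops update cluster example.net --state s3://kops.example.internal --out=./ --target=terraform
terraform init
terraform apply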

5. What happened after the commands executed?

  • The terraform apply completes successfully and all instances are created. However, I cannot connect to the cluster even after 30 minutes, and all nodes are marked OutOfService by the load balancer.
  • Running a kops validate cluster results in: unexpected error during validation: error listing nodes: Get https://api.example.net/api/v1/nodes: EOF
  • kubectl get cluster results in: Unable to connect to the server: EOF
  • When I ssh into the master nodes, I find the following in kube-apiserver.log:
0815 14:23:15.697247       1 storage_decorator.go:57] Unable to create storage backend: config (&{etcd3 /registry [https://127.0.0.1:4001] /etc/kubernetes/pki/kube-apiserver/etcd-client.key /etc/kubernetes/pki/kube-apiserver/etcd-client.crt /etc/kubernetes/pki/kube-apiserver/etcd-ca.crt true true 1000 0xc420b3ddd0 <nil> 5m0s 1m0s}), err (dial tcp 127.0.0.1:4001: connect: connection refused)
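
For anyone trying to confirm the same state on a master, a few quick checks (log paths assume the standard kops master layout):

netstat -tlnp | grep -E ':400[12]'    # is anything listening on the etcd client ports?
tail -n 50 /var/log/etcd.log          # main etcd cluster log
tail -n 50 /var/log/etcd-events.log   # events etcd cluster log
lsblk                                 # did the etcd EBS volumes attach?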

6. What did you expect to happen?

I expected the cluster to be up and ready to go, and to be able to connect to it.

7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.

apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: 2019-07-31T15:59:08Z
  name: example.net
spec:
  additionalPolicies:
    node: |
      [   
        {
          "Effect": "Allow",
          "Action": [
            "sts:AssumeRole"
          ],
          "Resource": [
            "arn:aws:iam:::role/k8s-*"
          ]
        }
      ]   
  api:
    loadBalancer:
      sslCertificate: arn:aws:acm:us-east-1:secretsecretsecret
      type: Internal
  authorization:
    rbac: {}
  channel: stable
  cloudProvider: aws 
  configBase: s3://kops.example.internal/example.net
  dnsZone: example.net
  etcdClusters:
  - etcdMembers:
    - instanceGroup: master-us-east-1b
      name: b
    - instanceGroup: master-us-east-1d
      name: d
    - instanceGroup: master-us-east-1e
      name: e
    name: main
  - etcdMembers:
    - instanceGroup: master-us-east-1b
      name: b
    - instanceGroup: master-us-east-1d
      name: d
    - instanceGroup: master-us-east-1e
      name: e
    name: events
  iam:
    allowContainerRegistry: true
    legacy: false
  kubelet:
    anonymousAuth: false
    authenticationTokenWebhook: true
    authorizationMode: Webhook
  kubernetesApiAccess:
  - office_cidr
  - another_cidr
  kubernetesVersion: 1.12.10
  masterInternalName: api.internal.example.net
  masterPublicName: api.example.net
  networkCIDR: another_cidr
  networkID: vpc-abcdefg
  networking:
    weave:
      mtu: 8912
  nonMasqueradeCIDR: another_cidr
  sshAccess:
  - another_cidr
  - another_cidr
  subnets:
  - cidr: another_cidr
    egress: nat-0e0e0e0e0e0e0
    id: subnet-abcdefg
    name: us-east-1b
    type: Private
    zone: us-east-1b
  - cidr: another_cidr
    egress: nat-0e0e0e0e0e0e0
    id: subnet-abcdefg
    name: us-east-1d
    type: Private
    zone: us-east-1d
  - cidr: another_cidr
    egress: nat-0e0e0e0e0e0e0
    id: subnet-abcdefg
    name: us-east-1e
    type: Private
    zone: us-east-1e
  - cidr: another_cidr
    id: subnet-abcdefg
    name: utility-us-east-1b
    type: Utility
    zone: us-east-1b
  - cidr: another_cidr
    id: subnet-abcdefg
    name: utility-us-east-1d
    type: Utility
    zone: us-east-1d
  - cidr: another_cidr
    id: subnet-abcdefg
    name: utility-us-east-1e
    type: Utility
    zone: us-east-1e
  topology:
    dns:
      type: Public
    masters: private
    nodes: private

---

apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: 2019-08-02T02:43:49Z
  labels:
    kops.k8s.io/cluster: example.net
  name: master-us-east-1b
spec:
  image: kope.io/k8s-1.12-debian-stretch-amd64-hvm-ebs-2019-06-21
  machineType: m3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-us-east-1b
  role: Master
  subnets:
  - us-east-1b

---

apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: 2019-08-02T02:43:49Z
  labels:
    kops.k8s.io/cluster: example.net
  name: master-us-east-1d
spec:
  image: kope.io/k8s-1.12-debian-stretch-amd64-hvm-ebs-2019-06-21
  machineType: m3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-us-east-1d
  role: Master
  subnets:
  - us-east-1d

---

apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: 2019-08-02T02:43:50Z
  labels:
    kops.k8s.io/cluster: example.net
  name: master-us-east-1e
spec:
  image: kope.io/k8s-1.12-debian-stretch-amd64-hvm-ebs-2019-06-21
  machineType: m3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-us-east-1e
  role: Master
  subnets:
  - us-east-1e

---

apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: 2019-08-02T02:43:50Z
  labels:
    kops.k8s.io/cluster: example.net
  name: nodes
spec:
  image: kope.io/k8s-1.12-debian-stretch-amd64-hvm-ebs-2019-06-21
  machineType: m4.xlarge
  maxSize: 3
  minSize: 3
  nodeLabels:
    kops.k8s.io/instancegroup: nodes
  role: Node
  subnets:
  - us-east-1b
  - us-east-1d
  - us-east-1e


8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.

I don't think the verbose logs from the kops or terraform commands are relevant here; the more relevant logs are those I have posted below.

9. Anything else do we need to know?

  • All of this works perfectly with version 1.11 of k8s and kops. We're moving forward with development on that version for now, but we're concerned about the future.
  • If I ssh into the master nodes, the last message in kube-apiserver.log is
0815 14:23:15.697247       1 storage_decorator.go:57] Unable to create storage backend: config (&{etcd3 /registry [https://127.0.0.1:4001] /etc/kubernetes/pki/kube-apiserver/etcd-client.key /etc/kubernetes/pki/kube-apiserver/etcd-client.crt /etc/kubernetes/pki/kube-apiserver/etcd-ca.crt true true 1000 0xc420b3ddd0 <nil> 5m0s 1m0s}), err (dial tcp 127.0.0.1:4001: connect: connection refused)
  • All three masters show this same message.
  • Although it looks like the volume eventually attaches at a different mount point, a possibly relevant error in /var/log/etcd.log is the following (see the attachment checks after this list):
W0815 13:46:56.016362    3495 mounter.go:293] Error attaching volume "vol-0e7dd951edc73df76": Error attaching EBS volume "vol-0e7dd951edc73df76": InvalidParameterValue: Invalid value '/dev/xvdu' for unixDevice. Attachment point /dev/xvdu is already in use
        status code: 400, request id: 1cbf8ec1-49e1-4be0-8f20-487d79228007
  • The cluster is running etcd version 3.2.24.
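
To dig into that /dev/xvdu conflict, something like this shows what is holding the attachment point (the volume ID is from the log above; the instance ID is a placeholder, and us-east-1 is the region from the manifest):

aws ec2 describe-volumes --region us-east-1 --volume-ids vol-0e7dd951edc73df76 --query 'Volumes[].Attachments[].[InstanceId,Device,State]'
aws ec2 describe-instances --region us-east-1 --instance-ids i-xxxxxxxxxxxxxxxxx --query 'Reservations[].Instances[].BlockDeviceMappings[].[DeviceName,Ebs.Status]'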

Answer (mamoit)

I just had this same issue, but it only happened on one of the masters, so I didn't even notice until I ran kubectl get node and only 2 nodes showed up. Running:

  • terraform 0.11.14
  • kops 1.12.3
  • kubernetes 1.12.10

This didn't happen to me with kops 1.12.2 and the same versions of everything else.

When I run kops validate I get all green except:

VALIDATION ERRORS
KIND    NAME                    MESSAGE
Machine i-0123     machine "i-0123" has not yet joined cluster

logs

I can provide extra logs from the machine, just let me know what is needed.

etcd

They seem to show that it's restarting for no good reason every 10s:

I0822 10:05:04.919181    1411 controller.go:531] controller loop complete
I0822 10:05:14.920505    1411 controller.go:173] starting controller iteration
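
For what it's worth, that 10s cadence may just be etcd-manager's normal control-loop interval rather than a restart. One way to check whether etcd itself is healthy behind the manager is something like this (cert paths copied from the apiserver's storage backend config below; etcdctl may need to be run from inside the etcd container):

ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:4001 \
  --cacert=/etc/kubernetes/pki/kube-apiserver/etcd-ca.crt \
  --cert=/etc/kubernetes/pki/kube-apiserver/etcd-client.crt \
  --key=/etc/kubernetes/pki/kube-apiserver/etcd-client.key \
  endpoint health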

kube-apiserver

These show something way more interesting:

W0822 10:21:10.297431       1 admission.go:76] PersistentVolumeLabel admission controller is deprecated. Please remove this controller from your configuration files and scripts.

and

F0822 10:21:30.373645       1 storage_decorator.go:57] Unable to create storage backend: config (&{etcd3 /registry [https://127.0.0.1:4002] /etc/kubernetes/pki/kube-apiserver/etcd-client.key /etc/kubernetes/pki/kube-apiserver/etcd-client.crt /etc/kubernetes/pki/kube-apiserver/etcd-ca.crt true true 1000 {0xc42042b950 0xc4209ac1b0} <nil> 5m0s 1m0s}), err (dial tcp 127.0.0.1:4002: connect: connection refused)
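
So the apiserver is dialing the events etcd on 127.0.0.1:4002 and getting connection refused. A quick way to see whether that member ever came up (assuming the standard kops layout with Docker as the container runtime):

docker ps | grep etcd                 # expect one container each for main and events
ss -tlnp | grep -E ':400[12]'         # which etcd client ports are actually bound?
tail -n 50 /var/log/etcd-events.log   # events etcd log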