
rke up --local fails to deploy successfully

RKE version: v0.2.1

Docker version: (docker version, docker info preferred)

$ docker info
Containers: 2
 Running: 2
 Paused: 0
 Stopped: 0
Images: 3
Server Version: 18.09.2
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9754871865f7fe2f4e74d43e2fc7ccd237edcbce
runc version: 09c8266bf2fcf9519a651b04ae54c967b9ab86ec
init version: v0.18.0 (expected: fec3683b971d9c3ef73f284f176672c44b448662)
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.15.0-47-generic
Operating System: Ubuntu 18.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 985.5MiB
Name: rke-node1
ID: V4C5:PQRQ:7AY7:E7QP:NQ7A:MLGF:DUHB:FRAI:DGTK:CWMU:NFGR:JZJW
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support

Operating system and kernel: (cat /etc/os-release, uname -r preferred)

$ cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.2 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.2 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO) VirtualBox

cluster.yml file: none!

Steps to Reproduce: Ran rke up --local

Results: The above command fails on the etcd health check. Output:

$ rke up --local
INFO[0000] Failed to resolve cluster file, using default cluster instead
INFO[0000] Initiating Kubernetes cluster
INFO[0000] [certificates] Generating admin certificates and kubeconfig
INFO[0000] Successfully Deployed state file at [./cluster.rkestate]
INFO[0000] Building Kubernetes cluster
INFO[0000] [network] Deploying port listener containers
INFO[0000] [network] Successfully started [rke-cp-port-listener] container on host [127.0.0.1]
INFO[0001] [network] Successfully started [rke-worker-port-listener] container on host [127.0.0.1]
INFO[0001] [network] Port listener containers deployed successfully
INFO[0001] [network] Running control plane -> etcd port checks
INFO[0001] [network] Successfully started [rke-port-checker] container on host [127.0.0.1]
INFO[0001] [network] Running control plane -> worker port checks
INFO[0002] [network] Successfully started [rke-port-checker] container on host [127.0.0.1]
INFO[0002] [network] Running workers -> control plane port checks
INFO[0002] [network] Successfully started [rke-port-checker] container on host [127.0.0.1]
INFO[0002] [network] Checking KubeAPI port Control Plane hosts
INFO[0002] [network] Removing port listener containers
INFO[0002] [remove/rke-etcd-port-listener] Successfully removed container on host [127.0.0.1]
INFO[0002] [remove/rke-cp-port-listener] Successfully removed container on host [127.0.0.1]
INFO[0003] [remove/rke-worker-port-listener] Successfully removed container on host [127.0.0.1]
INFO[0003] [network] Port listener containers removed successfully
INFO[0003] [certificates] Deploying kubernetes certificates to Cluster nodes
INFO[0008] [reconcile] Rebuilding and updating local kube config
INFO[0008] Successfully Deployed local admin kubeconfig at [./kube_config_cluster.yml]
INFO[0008] [certificates] Successfully deployed kubernetes certificates to Cluster nodes
INFO[0008] [reconcile] Reconciling cluster state
INFO[0008] [reconcile] This is newly generated cluster
INFO[0008] Pre-pulling kubernetes images
INFO[0008] Kubernetes images pulled successfully
INFO[0008] [etcd] Building up etcd plane..
INFO[0008] [etcd] Saving snapshot [etcd-rolling-snapshots] on host [127.0.0.1]
INFO[0008] [remove/etcd-rolling-snapshots] Successfully removed container on host [127.0.0.1]
INFO[0009] [etcd] Successfully started [etcd-rolling-snapshots] container on host [127.0.0.1]
INFO[0014] [certificates] Successfully started [rke-bundle-cert] container on host [127.0.0.1]
INFO[0014] [certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [127.0.0.1]
INFO[0015] [etcd] Successfully started [rke-log-linker] container on host [127.0.0.1]
INFO[0015] [remove/rke-log-linker] Successfully removed container on host [127.0.0.1]
INFO[0015] [etcd] Successfully started etcd plane.. Checking etcd cluster health
FATA[0015] [etcd] Failed to bring up Etcd Plane: [etcd] Etcd Cluster is not healthy
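RKE only reports that the health check failed; the underlying cause is usually visible in the etcd container itself. A diagnostic sketch, assuming the default RKE container name `etcd` and etcd v3.2's etcdctl (which defaults to the v2 API, where the health command is `cluster-health`); it is guarded so it is a no-op on a machine without Docker or without that container:

```shell
#!/bin/sh
# Diagnostic sketch: assumes the default RKE container name "etcd".
# Guarded so the script is a harmless no-op where Docker or the container
# is absent.
if command -v docker >/dev/null 2>&1 && docker inspect etcd >/dev/null 2>&1; then
  # The etcd container's own logs usually show the real error behind
  # "Etcd Cluster is not healthy" (bind failures, disk pressure, clock skew, ...)
  docker logs --tail 50 etcd

  # etcd 3.2's etcdctl defaults to the v2 API; its health command is:
  docker exec etcd etcdctl cluster-health
fi
```

On a single-node `--local` deploy, resource pressure is also worth ruling out: the `docker info` above shows only 1 CPU and ~985 MiB of memory on this VirtualBox host.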

Answer from dnoland1:

Tested on RKE v0.2.2. It no longer fails on the etcd health check and gets further, but still fails. Latest logs:

$ ./rke-0.2.2 up --local
INFO[0000] Failed to resolve cluster file, using default cluster instead
INFO[0000] Initiating Kubernetes cluster
INFO[0000] [certificates] Generating CA kubernetes certificates
INFO[0000] [certificates] Generating Kubernetes API server aggregation layer requestheader client CA certificates
INFO[0000] [certificates] Generating admin certificates and kubeconfig
INFO[0000] [certificates] Generating Kubernetes API server proxy client certificates
INFO[0000] [certificates] Generating etcd-127.0.0.1 certificate and key
INFO[0001] [certificates] Generating Kube Scheduler certificates
INFO[0001] [certificates] Generating Kube Controller certificates
INFO[0001] [certificates] Generating Kube Proxy certificates
INFO[0001] [certificates] Generating Node certificate
INFO[0001] [certificates] Generating Kubernetes API server certificates
INFO[0001] Successfully Deployed state file at [./cluster.rkestate]
INFO[0001] Building Kubernetes cluster
INFO[0001] [network] Deploying port listener containers
INFO[0001] [network] Pulling image [rancher/rke-tools:v0.1.27] on host [127.0.0.1]
INFO[0009] [network] Successfully pulled image [rancher/rke-tools:v0.1.27] on host [127.0.0.1]
INFO[0010] [network] Successfully started [rke-etcd-port-listener] container on host [127.0.0.1]
INFO[0010] [network] Successfully started [rke-cp-port-listener] container on host [127.0.0.1]
INFO[0011] [network] Successfully started [rke-worker-port-listener] container on host [127.0.0.1]
INFO[0011] [network] Port listener containers deployed successfully
INFO[0011] [network] Running control plane -> etcd port checks
INFO[0011] [network] Successfully started [rke-port-checker] container on host [127.0.0.1]
INFO[0011] [network] Running control plane -> worker port checks
INFO[0012] [network] Successfully started [rke-port-checker] container on host [127.0.0.1]
INFO[0012] [network] Running workers -> control plane port checks
INFO[0013] [network] Successfully started [rke-port-checker] container on host [127.0.0.1]
INFO[0013] [network] Checking KubeAPI port Control Plane hosts
INFO[0013] [network] Removing port listener containers
INFO[0013] [remove/rke-etcd-port-listener] Successfully removed container on host [127.0.0.1]
INFO[0013] [remove/rke-cp-port-listener] Successfully removed container on host [127.0.0.1]
INFO[0013] [remove/rke-worker-port-listener] Successfully removed container on host [127.0.0.1]
INFO[0013] [network] Port listener containers removed successfully
INFO[0013] [certificates] Deploying kubernetes certificates to Cluster nodes
INFO[0019] [reconcile] Rebuilding and updating local kube config
INFO[0019] Successfully Deployed local admin kubeconfig at [./kube_config_cluster.yml]
INFO[0019] [certificates] Successfully deployed kubernetes certificates to Cluster nodes
INFO[0019] [reconcile] Reconciling cluster state
INFO[0019] [reconcile] This is newly generated cluster
INFO[0019] Pre-pulling kubernetes images
INFO[0019] [pre-deploy] Pulling image [rancher/hyperkube:v1.13.5-rancher1] on host [127.0.0.1]
INFO[0056] [pre-deploy] Successfully pulled image [rancher/hyperkube:v1.13.5-rancher1] on host [127.0.0.1]
INFO[0056] Kubernetes images pulled successfully
INFO[0056] [etcd] Building up etcd plane..
INFO[0056] [etcd] Pulling image [rancher/coreos-etcd:v3.2.24-rancher1] on host [127.0.0.1]
INFO[0059] [etcd] Successfully pulled image [rancher/coreos-etcd:v3.2.24-rancher1] on host [127.0.0.1]
INFO[0060] [etcd] Successfully started [etcd] container on host [127.0.0.1]
INFO[0060] [etcd] Saving snapshot [etcd-rolling-snapshots] on host [127.0.0.1]
INFO[0060] [etcd] Successfully started [etcd-rolling-snapshots] container on host [127.0.0.1]
INFO[0066] [certificates] Successfully started [rke-bundle-cert] container on host [127.0.0.1]
INFO[0066] [certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [127.0.0.1]
INFO[0066] [etcd] Successfully started [rke-log-linker] container on host [127.0.0.1]
INFO[0067] [remove/rke-log-linker] Successfully removed container on host [127.0.0.1]
INFO[0067] [etcd] Successfully started etcd plane.. Checking etcd cluster health
INFO[0067] [controlplane] Building up Controller Plane..
INFO[0067] [controlplane] Successfully started [kube-apiserver] container on host [127.0.0.1]
INFO[0067] [healthcheck] Start Healthcheck on service [kube-apiserver] on host [127.0.0.1]
INFO[0079] [healthcheck] service [kube-apiserver] on host [127.0.0.1] is healthy
INFO[0080] [controlplane] Successfully started [rke-log-linker] container on host [127.0.0.1]
INFO[0080] [remove/rke-log-linker] Successfully removed container on host [127.0.0.1]
INFO[0081] [controlplane] Successfully started [kube-controller-manager] container on host [127.0.0.1]
INFO[0081] [healthcheck] Start Healthcheck on service [kube-controller-manager] on host [127.0.0.1]
INFO[0086] [healthcheck] service [kube-controller-manager] on host [127.0.0.1] is healthy
INFO[0086] [controlplane] Successfully started [rke-log-linker] container on host [127.0.0.1]
INFO[0086] [remove/rke-log-linker] Successfully removed container on host [127.0.0.1]
INFO[0086] [controlplane] Successfully started [kube-scheduler] container on host [127.0.0.1]
INFO[0086] [healthcheck] Start Healthcheck on service [kube-scheduler] on host [127.0.0.1]
INFO[0091] [healthcheck] service [kube-scheduler] on host [127.0.0.1] is healthy
INFO[0092] [controlplane] Successfully started [rke-log-linker] container on host [127.0.0.1]
INFO[0092] [remove/rke-log-linker] Successfully removed container on host [127.0.0.1]
INFO[0092] [controlplane] Successfully started Controller Plane..
INFO[0092] [authz] Creating rke-job-deployer ServiceAccount
INFO[0092] [authz] rke-job-deployer ServiceAccount created successfully
INFO[0092] [authz] Creating system:node ClusterRoleBinding
INFO[0092] [authz] system:node ClusterRoleBinding created successfully
INFO[0092] Successfully Deployed state file at [./cluster.rkestate]
INFO[0092] [state] Saving full cluster state to Kubernetes
INFO[0092] [state] Successfully Saved full cluster state to Kubernetes ConfigMap: cluster-state
INFO[0092] [worker] Building up Worker Plane..
INFO[0092] [sidekick] Sidekick container already created on host [127.0.0.1]
INFO[0093] [worker] Successfully started [kubelet] container on host [127.0.0.1]
INFO[0093] [healthcheck] Start Healthcheck on service [kubelet] on host [127.0.0.1]
INFO[0098] [healthcheck] service [kubelet] on host [127.0.0.1] is healthy
INFO[0098] [worker] Successfully started [rke-log-linker] container on host [127.0.0.1]
INFO[0099] [remove/rke-log-linker] Successfully removed container on host [127.0.0.1]
INFO[0099] [worker] Successfully started [kube-proxy] container on host [127.0.0.1]
INFO[0099] [healthcheck] Start Healthcheck on service [kube-proxy] on host [127.0.0.1]
INFO[0104] [healthcheck] service [kube-proxy] on host [127.0.0.1] is healthy
INFO[0105] [worker] Successfully started [rke-log-linker] container on host [127.0.0.1]
INFO[0105] [remove/rke-log-linker] Successfully removed container on host [127.0.0.1]
INFO[0105] [worker] Successfully started Worker Plane..
INFO[0106] [cleanup] Successfully started [rke-log-cleaner] container on host [127.0.0.1]
INFO[0106] [remove/rke-log-cleaner] Successfully removed container on host [127.0.0.1]
INFO[0106] [sync] Syncing nodes Labels and Taints
INFO[0106] [sync] Successfully synced nodes Labels and Taints
INFO[0106] [network] Setting up network plugin: canal
INFO[0106] [addons] Saving ConfigMap for addon rke-network-plugin to Kubernetes
INFO[0106] [addons] Successfully saved ConfigMap for addon rke-network-plugin to Kubernetes
INFO[0106] [addons] Executing deploy job rke-network-plugin
FATA[0136] Failed to get job complete status for job rke-network-plugin-deploy-job in namespace kube-system
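The FATA line only says the deploy job never reported completion; the reason typically shows up on the job's pod or in the namespace events. A diagnostic sketch, assuming the kubeconfig RKE wrote next to the binary (`./kube_config_cluster.yml`) and the job name from the log above; guarded so it is a no-op without kubectl or a reachable API server:

```shell
#!/bin/sh
# Diagnostic sketch for the failed rke-network-plugin deploy job.
# Assumes the kubeconfig RKE deployed locally; harmless no-op otherwise.
export KUBECONFIG=./kube_config_cluster.yml
if command -v kubectl >/dev/null 2>&1 && kubectl get nodes >/dev/null 2>&1; then
  # Was the job's pod ever scheduled? The Job controller labels its pods
  # with job-name=<job>; a Pending pod points at taints or resources.
  kubectl -n kube-system get pods -l job-name=rke-network-plugin-deploy-job
  kubectl -n kube-system describe job rke-network-plugin-deploy-job

  # Recent events usually name the exact reason
  # (FailedScheduling, ImagePullBackOff, ...)
  kubectl -n kube-system get events --sort-by=.lastTimestamp | tail -n 20
fi
```

On a 1 CPU / ~1 GiB VirtualBox node like the one in this report, FailedScheduling due to insufficient resources is a plausible culprit for the canal deploy job.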

I'll change the issue title to make it more general...

