Benjamin Elder (BenTheElder) · @Google · Sunnyvale, CA · https://elder.dev/ · maintaining @kubernetes things, https://sigs.k8s.io/kind creator

BenTheElder/creaturebox 2

golang/gomobile evolutionary avoidance simulation

BenTheElder/color-schemes 1

sublime text themes

BenTheElder/api 0

The canonical location of the Kubernetes API definition.

BenTheElder/autoscaler 0

Autoscaling components for Kubernetes

BenTheElder/bluebliss-atom 0

bluebliss atom syntax theme

BenTheElder/cadvisor 0

Analyzes resource usage and performance characteristics of running containers.

BenTheElder/cluster-api 0

Home for the Cluster Management API work, a subproject of sig-cluster-lifecycle

BenTheElder/complx 0

Complx the LC-3 Simulator used in CS2110 managed by Brandon

issue comment kubernetes-sigs/kind

Cannot run kind from HEAD

fce9435798937bebdd4418a19d32e0f52436b722 definitely seems to be the culprit, now to identify which change ...

https://github.com/kubernetes-sigs/kind/compare/v0.8.1..544bf5ad0aad65f9f044ddd5ead3ba946ae161d0#diff-065f169938b73c43248f266b4fc6291b

howardjohn

comment created time in a few seconds

issue comment kubernetes-sigs/kind

Cannot run kind from HEAD

i have glinux fully updated, bisecting kind.

howardjohn

comment created time in 30 minutes

issue comment kubernetes-sigs/kind

Cannot run kind from HEAD

/priority critical-urgent

Does the old image there work? kind v0.8.1 works but not HEAD? I ran into some issues updating the machine; I'm around them now but going to need to reboot to finish.

howardjohn

comment created time in 2 hours

issue comment kubernetes-sigs/kind

increase kubernetes component verbosity

/priority important-soon

BenTheElder

comment created time in 2 hours

issue comment kubernetes-sigs/kind

cluster logs may not be dumped when tests timeout

/priority critical-urgent

BenTheElder

comment created time in 2 hours

issue comment kubernetes-sigs/kind

Cannot run kind from HEAD

I was using kindest/node:v1.18.2 so I don't think k8s version matters (?). Let me try building an image with this new binary at HEAD

Possibly the difference in these two images? https://github.com/kubernetes-sigs/kind/commit/544bf5ad0aad65f9f044ddd5ead3ba946ae161d0#diff-820f108d566cca3ded58ac0913cf33a6

howardjohn

comment created time in 2 hours

push event BenTheElder/kubernetes

Johannes M. Scheuermann

commit sha cfb24eeebce8e6c384322b0d197e5d2efa1b2ca4

Update internal traffic shaping docs

view details

Johannes M. Scheuermann

commit sha a9cf6cec729ea97b3713c05935401adc43247a2a

Correct typos in docs and add better explanation for burstrate

view details

mrobson

commit sha e401ee9158e184df093671525cbe9859b606fd93

Errors from cgroup destroy and pid kills are swallowed. Log a warning when that happens.

view details

kanozec

commit sha 1c5506ae645f9b58160c9929b0ac94051a1324cb

fix golint: don't use underscores in Go names

view details

Valentyn Boginskey

commit sha 6b8b8491bb742603431b5dfc6d8bc482eb697e74

Raise verbosity of kubelet_node_status messages In standalone kubelet scenarios, `initialNode` gets called in a loop via `syncPod` -> `GetNode` -> `initialNode`. This causes excessive log spam from the controller attach/detach messages.

view details

Philipp Stehle

commit sha ff69810f1a5131855a0b41dfa1542b3f2a70772c

Fix: UpdateStrategy.RollingUpdate.Partition is lost when UpdateStrategy.Type is not set

view details

Odin Ugedal

commit sha 65e9b6099d0a15fab9a1549c24bb57b868b2df59

Add odinuge to sig-node-reviewers

view details

RainbowMango

commit sha 7b7c73bf8754f2d9c7c3bfa7f3a196fdb19e74f1

Clean up duplicate code and remove import cycle.

view details

Eric Mountain

commit sha 22e0ee768bfaa56dac511759207ac0c42a33b545

Removes container RefManager

view details

Zhou Peng

commit sha 930bedf14418f9f94d28fd47581695b5c0c65989

[pkg/kubelet]: make func a little comfortable This func has only 1 argument, don't wrap it across multiple lines Signed-off-by: Zhou Peng <p@ctriple.cn>

view details

Sascha Grunert

commit sha 2dfb22b5b79dd0956b1fb23db4bbb56b7c127817

Remove unnecessary sprintf in node status tests There is no invocation to sprintf needed for those strings so we can remove them. Signed-off-by: Sascha Grunert <sgrunert@suse.com>

view details

Noah Kantrowitz

commit sha 14969831e9a2ff40e1ee71ce5afdc504f7536a59

Apply the same style of fix as #87913 but for HTTP methods too. Go does not validate HTTP methods beyond len!=0 and that they don't contain HTTP meta chars like a newline. Also to using string sets instead of maps.

view details

marosset

commit sha b81a418b2e1cc8f3148359c77d9e7bde3c418432

CRI - Adding annotations to ImageSpec and ImageSpec to Image

view details

marosset

commit sha 348153d55a0dacc8e9f023b58ffe30526219e331

CRI - updating fake_image_server.go to support Annotations in ImageSpec

view details

Keerthan Reddy,Mala

commit sha aae8a2847aa4831b4e8514ca061d391b3b163bcd

Check for sandboxes before deleting the pod from apiserver

view details

Keerthan Reddy,Mala

commit sha 1e42737e5819f586ccc628fec567d75bc669a5da

add unit tests

view details

Keerthan Reddy,Mala

commit sha c24349e9f288d8789046a2db125d6a60807e7b41

update the build file for bazel

view details

Keerthan Reddy,Mala

commit sha 70e2559acab102553b340600f905f0f08840fa1a

use runtime sandbox status instead of calling cri

view details

Keerthan Reddy,Mala

commit sha 9b9cf33771a7797570cd917f5ca404a2457a99c5

fake remote runtime should call correct method on remove pod sandbox

view details

Quan Tian

commit sha 23e54301abc54e691cb34a601f516d2bd7440829

Delete the wrong comment about CertDirectory of kubelet The default value of CertDirectory was changed to /var/lib/kubelet/pki.

view details

push time in 2 hours

issue comment kubernetes-sigs/kind

Cannot run kind from HEAD

also to confirm: do you have kubernetes checked out to HEAD? Kubernetes (well cAdvisor) had a nasty bug that broke running it on some hosts (kind or otherwise) in 1.19 alpha, which has since been fixed.

currently updating everything on glinux before attempting repro..

howardjohn

comment created time in 2 hours

push event BenTheElder/kind

Benjamin Elder

commit sha 786255b677a020a19483563559b5fb238785af41

use path package for container paths

view details

Benjamin Elder

commit sha 94031cdb8590c99a75d153626b02b713ee7298d8

Merge pull request #1630 from BenTheElder/path-not-filepath use path package for container paths

view details

push time in 2 hours

issue comment kubernetes-sigs/kind

Cannot run kind from HEAD

Re: cgroups ... cgroupsv2 is a kernel boot parameter. It's possible your setup was updated to use it, if cgroupsv2 is enabled kubernetes does not work (or runc or ...)

If previous versions of kind are working right now then you're not in cgroupsv2.
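
A quick way to check which mode a host is in (a sketch, assuming a typical Linux machine with /sys/fs/cgroup mounted):

$ stat -fc %T /sys/fs/cgroup/
# "cgroup2fs" => unified cgroupsv2 (unsupported here); "tmpfs" => cgroupsv1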

howardjohn

comment created time in 2 hours

pull request comment kubernetes/test-infra

Update root OWNERS

/lgtm

spiffxp

comment created time in 5 hours

issue closed kubernetes-sigs/kind

How to access the local registry from inside the cluster?

I need to create, from within running pods, deployments that contain containers from the local registry. Is this possible? I can't see anything that implies so in https://kind.sigs.k8s.io/docs/user/local-registry/

closed time in 7 hours

lolin9

issue comment kubernetes-sigs/kind

How to access the local registry from inside the cluster?

this part:

  1. And now we can use the image kubectl create deployment hello-server --image=localhost:5000/hello-app:1.0

kubernetes doesn't care if your deployment object was created inside or outside of the cluster.

use localhost:5000/foo:bar for deployments created from either.

lolin9

comment created time in 7 hours

issue comment kubernetes-sigs/kind

coredns Error: [FATAL] plugin/loop: Loop

1.16.3 in particular is NOT an image we've published with v0.8+, unless you have built your own.

teoincontatto

comment created time in 7 hours

issue closed kubernetes-sigs/kind

coredns Error: [FATAL] plugin/loop: Loop


What happened:

coredns pods fail for kubernetes versions 1.12.10, 1.14.10 and 1.16.3 with an error similar to:

$ kubectl logs -n kube-system coredns-6dcc67dcbc-b249g
.:53
2020-05-29T14:45:51.415Z [INFO] CoreDNS-1.3.1
2020-05-29T14:45:51.416Z [INFO] linux/amd64, go1.11.4, 6b56a9c
CoreDNS-1.3.1
linux/amd64, go1.11.4, 6b56a9c
2020-05-29T14:45:51.416Z [INFO] plugin/reload: Running configuration MD5 = 599b9eb76b8c147408aed6a0bbe0f669
2020-05-29T14:45:52.418Z [FATAL] plugin/loop: Loop (127.0.0.1:47903 -> :53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 6331369333141562632.3881881777978083923."

What you expected to happen:

coredns pods running without errors

How to reproduce it (as minimally and precisely as possible):

cat << 'EOF' > kind-config.yaml
  kind: Cluster
  apiVersion: kind.sigs.k8s.io/v1alpha3
  networking:
    apiServerAddress: "0.0.0.0"
  nodes:
  - role: control-plane
EOF

kind create cluster --name kind --config kind-config.yaml --image kindest/node:v1.12.10

kubectl get pod -n kube-system -o name --watch | stdbuf -o0 grep '^pod/coredns-' \
  | xargs -r -n 1 -I % kubectl logs -n kube-system % --prefix 

Environment:

  • kind version: kind v0.8.1 go1.14.2 linux/amd64
  • Kubernetes versions: 1.12.10, 1.14.10, 1.16.3
  • Docker version:
$ docker info
Client:
 Debug Mode: false

Server:
 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 81
 Server Version: 19.03.8
 Storage Driver: overlay2
  Backing Filesystem: <unknown>
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.3.0-53-generic
 Operating System: Ubuntu 18.04.4 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 15.5GiB
 Name: ThinkPadX1C
 ID: SOAI:57RW:N6V7:UY7H:NMDK:RXTA:S7YC:MHYI:Q4ZJ:6TMA:6SMP:YRJ4
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support
  • OS:
 Kernel Version: 5.3.0-53-generic
 Operating System: Ubuntu 18.04.4 LTS
 OSType: linux
 Architecture: x86_64

closed time in 7 hours

teoincontatto

issue comment kubernetes-sigs/kind

coredns Error: [FATAL] plugin/loop: Loop

this is almost definitely happening due to using an out of date node image with kind https://github.com/kubernetes-sigs/kind/releases/tag/v0.8.0#breaking-changes

teoincontatto

comment created time in 7 hours

issue comment kubernetes-sigs/kind

coredns Error: [FATAL] plugin/loop: Loop

you must specify the images including the hashes as mentioned in the release notes.
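
For illustration, the shape of such an invocation (the version and digest below are placeholders; the real values come from the release notes):

$ kind create cluster --image kindest/node:<version>@sha256:<digest-from-the-v0.8.x-release-notes>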

teoincontatto

comment created time in 7 hours

pull request comment kubernetes-sigs/kind

Add host path mount support for k8s e2e test script

It needs to go in soon, we're having fun with bash #1626

saschagrunert

comment created time in 7 hours

Pull request review comment kubernetes/kubernetes

vendor: update google/cadvisor and opencontainers/runc

 require (
 	github.com/onsi/ginkgo v1.11.0
 	github.com/onsi/gomega v1.7.0
 	github.com/opencontainers/go-digest v1.0.0-rc1
-	github.com/opencontainers/runc v1.0.0-rc10
-	github.com/opencontainers/selinux v1.3.3
+	github.com/opencontainers/runc v1.0.0-rc9.0.20200519182809-b207d578ec2d

this is not really a chicken and egg, if the work is experimental it can be tested with a patchset that is not in master.

giuseppe

comment created time in 8 hours

pull request comment kubernetes/test-infra

Migrate 100 node scalability release-blocking job to k8s-infra-prow-build

/lgtm /approve

spiffxp

comment created time in 8 hours

issue comment kubernetes-sigs/kind

Use memory storage for etcd

We've already tried this and taken nearly all of the obvious steps that don't require upstream changes. Boot time is very important to us.

As I said, the upstream Kubernetes components may be optimizable. The bootstrapping process with kubeadm is suspiciously long, but you'll have to track down what's slow yourself, we haven't gotten to this yet.

On Fri, May 29, 2020, 07:52 Thomas Havlik notifications@github.com wrote:

I've tried using tmpfs for Docker's data-root (silly, yes) and there is no performance benefit to that either, so I am wondering how exactly I should go about optimizing cluster creation. I am able to confirm that my tmpfs grows by about 1.2gb while my persistent disks are untouched by the cluster creation process. While the tmpfs grows, all cores are basically idle. Sometimes I will see a relevant process (e.g. kubeadm) jump to ~1% usage.

Any ideas? Obviously setting data-root is far from ideal. At this point I'm just trying to figure out how this all should behave.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/kubernetes-sigs/kind/issues/845#issuecomment-636017110, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHADK6TSVY7MWNMVBCEPL3RT7D3VANCNFSM4IUPOOHA .

aojea

comment created time in 8 hours

pull request comment kubernetes/kubernetes

Revert "Revert "Rely on default watch cache capacity and ignore its requested size""

it's not obvious to me what we should do about the api server timeouts, I'm hesitant to have testing depend on non-default values for things like this.

wojtek-t

comment created time in a day

pull request comment kubernetes/kubernetes

Revert "Revert "Rely on default watch cache capacity and ignore its requested size""

correct, there are multiple timeouts that haven't matched:

  • job duration
  • grace period for cleanup after job (was 15s instead of 15m!)
  • api server timeout config (kubeadm does not match GCE testing by a long shot)
  • bootstrap (legacy prowjob) also implements job timeouts incorrectly in such a way that much more (possibly infinite) time may be granted
wojtek-t

comment created time in a day

pull request comment kubernetes/kubernetes

Revert "Revert "Rely on default watch cache capacity and ignore its requested size""

I think I know the correct short-term fix for the log dumping now, ugh bash. Sorry about that :/ We hadn't been hitting timeouts usually, so this wasn't highlighted.

The long term fix is to not write this test script in bash, which I have a long standing PR to rebase. https://github.com/kubernetes-sigs/kind/pull/1269

wojtek-t

comment created time in a day

pull request comment kubernetes/kubernetes

Revert "Revert "Rely on default watch cache capacity and ignore its requested size""

I'm attempting to but it hasn't obviously correlated with anything else.

The tests are taking a long time, under normal circumstances kind is about like this:

  • 5m to build with bazel cache
  • a couple min for things like git cloning, downloading base image
  • around 1 minute for cluster up
  • ~1s for cluster down
  • ~20m for testing

Whereas GCE spends much, much longer on cluster setup / tear down. KIND is flaking out right now though, I suspect due to either a regression in kubernetes, containerd, or the CI hosts. We've not changed anything interesting on our end other than upgrading containerd.

The last time we saw flakiness like this, the prow / CI host nodes had bugged kubelets wasting multiple cores and trashing performance.

On Thu, May 28, 2020 at 1:00 PM Jordan Liggitt notifications@github.com wrote:

For now, I don't see any reason to think that this is related to this PR (or that this PR makes the failures more probable). I would like to hold cancel on this PR. @liggitt https://github.com/liggitt - wdyt?

It's really hard to tell what's going on with the noisy signal we have. Did we look back to see if any of the timeout/flake failures had a noticeable starting point?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/kubernetes/kubernetes/pull/91491#issuecomment-635564928, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHADK7UORAWU7MDVDSCX73RT27HJANCNFSM4NMCVAQA .

wojtek-t

comment created time in a day

delete branch BenTheElder/kind

delete branch : path-not-filepath

delete time in a day

push event kubernetes-sigs/kind

Benjamin Elder

commit sha 786255b677a020a19483563559b5fb238785af41

use path package for container paths

view details

Benjamin Elder

commit sha 94031cdb8590c99a75d153626b02b713ee7298d8

Merge pull request #1630 from BenTheElder/path-not-filepath use path package for container paths

view details

push time in a day

PR merged kubernetes-sigs/kind

Reviewers
use path package for container paths

approved cncf-cla: yes lgtm size/S

cc @aojea @amwat

Note that filepath should only be used for local paths on the local host (e.g. when exporting logs to the current machine), path should be used for container paths.
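
A minimal sketch of the distinction (hypothetical paths, not the actual kind code):

package main

import (
	"fmt"
	"path"          // always "/" separated; use for in-container paths
	"path/filepath" // host-specific separators; use for paths on the local machine
)

func main() {
	// container path: must stay "/" separated regardless of the host OS
	fmt.Println(path.Join("/etc/kubernetes/pki", "ca.crt"))
	// host path: uses the host separator, e.g. "\" when kind runs on Windows
	fmt.Println(filepath.Join("logs", "kind-control-plane", "kubelet.log"))
}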

Fixes https://github.com/kubernetes-sigs/kind/issues/1555

+5 -5

3 comments

1 changed file

BenTheElder

pr closed time in a day

issue closed kubernetes-sigs/kind

HA cluster with kind on Windows errors on copying certificate.

What happened: When i try to create a cluster with three control-plane nodes I get an error on copying the ca certificate to the 2nd control plane node.

ERROR: failed to create cluster: failed to copy admin kubeconfig: failed to write "/etc/kubernetes/pki/ca.crt" to node: command "docker exec --privileged -i cluster1-control-plane2 cp /dev/stdin /etc/kubernetes/pki/ca.crt" failed with error: exit status 1

Command Output: cp: cannot create regular file '/etc/kubernetes/pki/ca.crt': No such file or directory

Stack Trace: sigs.k8s.io/kind/pkg/errors.WithStack /src/pkg/errors/errors.go:51 sigs.k8s.io/kind/pkg/exec.(*LocalCmd).Run /src/pkg/exec/local.go:124 sigs.k8s.io/kind/pkg/cluster/internal/providers/docker.(*nodeCmd).Run /src/pkg/cluster/internal/providers/docker/node.go:146 sigs.k8s.io/kind/pkg/cluster/nodeutils.CopyNodeToNode /src/pkg/cluster/nodeutils/util.go:70 sigs.k8s.io/kind/pkg/cluster/internal/create/actions/kubeadminit.(*action).Execute /src/pkg/cluster/internal/create/actions/kubeadminit/init.go:95 sigs.k8s.io/kind/pkg/cluster/internal/create.Cluster /src/pkg/cluster/internal/create/create.go:135 sigs.k8s.io/kind/pkg/cluster.(*Provider).Create /src/pkg/cluster/provider.go:138 sigs.k8s.io/kind/pkg/cmd/kind/create/cluster.runE /src/pkg/cmd/kind/create/cluster/createcluster.go:91 sigs.k8s.io/kind/pkg/cmd/kind/create/cluster.NewCommand.func1 /src/pkg/cmd/kind/create/cluster/createcluster.go:56 github.com/spf13/cobra.(*Command).execute /go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:842 github.com/spf13/cobra.(*Command).ExecuteC /go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950 github.com/spf13/cobra.(*Command).Execute /go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887 sigs.k8s.io/kind/cmd/kind/app.Run /src/cmd/kind/app/main.go:53 sigs.k8s.io/kind/cmd/kind/app.Main /src/cmd/kind/app/main.go:35 main.main /src/main.go:25 runtime.main /usr/local/go/src/runtime/proc.go:203 runtime.goexit /usr/local/go/src/runtime/asm_amd64.s:1373

What you expected to happen: A multi-master deployed kind cluster

How to reproduce it (as minimally and precisely as possible):

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: control-plane
- role: control-plane
- role: worker
- role: worker
- role: worker

command: kind create cluster --config config.yaml

Anything else we need to know?:
- Docker is running in Linux mode
- Single master cluster and multiple workers go fine

Environment: Windows 10 Pro

  • kind version: 0.8.1:
  • Kubernetes version: 1.18.2:
  • Docker version: 19.03.8:
  • OS: Windows 10 Professional

closed time in a day

nielsvbrecht

pull request comment kubernetes-sigs/kind

use path package for container paths

CI is having issues again, I'm looking into it but have not found the root cause yet.

BenTheElder

comment created time in a day

delete branch BenTheElder/k8s.io

delete branch : revert-912-add-certmanager

delete time in a day

PR opened kubernetes/k8s.io

Revert "Add cert-manager image bucket and project"

Reverts kubernetes/k8s.io#912

suggested by @justaugustus

We don't have consensus on this (hosting / providing resources for projects that are not even CNCF, let alone in-org) yet. See also https://github.com/kubernetes/k8s.io/issues/915

/cc @thockin @dims @spiffxp

+0 -35

0 comment

5 changed files

pr created time in a day

create branch BenTheElder/k8s.io

branch : revert-912-add-certmanager

created branch time in a day

pull request comment kubernetes/kubernetes

Revert "Revert "Rely on default watch cache capacity and ignore its requested size""

there's been a broad issue with flakiness / timeouts in CI right now, it's not clear why yet https://kubernetes.slack.com/archives/C09QZ4DQB/p1590691844400100

wojtek-t

comment created time in a day

pull request comment kubernetes/test-infra

Add job to push images from kubernetes-sigs/boskos

/approve /lgtm

ixdy

comment created time in a day

pull request comment kubernetes-sigs/kind

Update docs for WSL2

/ok-to-test

sozercan

comment created time in a day

pull request comment kubernetes-sigs/kind

Add host path mount support for k8s e2e test script

/hold

I'm very supportive of expanding the testing we can do :upside_down_face:

However, I want to stop expanding the environment variable "API surface" especially with #1269 pending

Does this work as expected? What happens if the host doesn't have selinux enabled (or does?)

saschagrunert

comment created time in a day

issue comment kubernetes-sigs/kind

HA cluster with kind on Windows errors on copying certificate.

this doesn't make much sense, we mkdir -p the directory before ...

except that we use the filepath package, which uses the host's path formats... :man_facepalming:

we've been pretty good about this, but it's easy to slip up, usually it would be more correct to use filepath for file paths (shocker!) than path but ... not in this case. we need #1529

fix in https://github.com/kubernetes-sigs/kind/pull/1630

nielsvbrecht

comment created time in a day

PR opened kubernetes-sigs/kind

use path package for container paths

cc @aojea @amwat

Note that filepath should only be used for local paths on the local host (e.g. when exporting logs to the current machine), path should be used for container paths.

Fixes https://github.com/kubernetes-sigs/kind/issues/1555

+5 -5

0 comment

1 changed file

pr created time in a day

create branch BenTheElder/kind

branch : path-not-filepath

created branch time in a day

issue comment kubernetes-sigs/kind

ipv6 port-forward has a consistent 36s delay on each request

I think this should be fixed if using a node image built with HEAD

howardjohn

comment created time in a day

pull request comment kubernetes/test-infra

Move Kops to its own testgrid dashboard

Spoke to @spiffxp: for now :man_shrugging: but there may be more enforcement of the organization of testgrid in the future. /approve

hakman

comment created time in a day

issue comment kubernetes/kubernetes

Zero cpu's (cpu: 0) reported for nodes on K8s built on "master"

as far as I know this is fixed now.

uablrek

comment created time in a day

issue comment kubernetes-sigs/kind

Cache Docker images

No worries! I almost wonder if we should offer an admission controller to enforce images are not always

On Thu, May 28, 2020, 09:49 Ruben Suarez Alvarez notifications@github.com wrote:

What is the pull policy on the metallb deployment? Manifests can specify always pull.

Kubernetes internally normalizes the names, docker images without an explicit host typically assume docker.io as the host. There is also the special case of foo:bar => docker.io/library/foo:bar

Ouch!!! You are totally right!!

image: docker.io/metallb/speaker:v0.9.3 imagePullPolicy: Always

So much sorry for bothering you :(

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/kubernetes-sigs/kind/issues/1591#issuecomment-635465650, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHADK477E67O327CIFO553RT2I3LANCNFSM4NAUFPQQ .

rubensa

comment created time in a day

issue comment kubernetes-sigs/kind

Cache Docker images

This normalization is true of the whole ecosystem but isn't always as user visible.

rubensa

comment created time in a day

issue comment kubernetes-sigs/kind

Cache Docker images

What is the pull policy on the metallb deployment? Manifests can specify always pull.

Kubernetes internally normalizes the names, docker images without an explicit host typically assume docker.io as the host. There is also the special case of foo:bar => docker.io/library/foo:bar
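
For example, the normalization looks roughly like this (a sketch, not exhaustive):

foo:bar                 => docker.io/library/foo:bar
metallb/speaker:v0.9.3  => docker.io/metallb/speaker:v0.9.3
gcr.io/kuar-demo/kuard-amd64:blue  (already fully qualified, unchanged)

so a manifest that names an image one way and a kind load docker-image that names it another can appear not to match.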

rubensa

comment created time in a day

issue comment kubernetes/test-infra

Prow issue: how to get rid of `do-not-merge/invalid-commit-message`?

A repo admin can bypass this by removing the label and forcibly merging, which is what I'd expect during a repo migration for preserving history.

If the history is just from your own development, please update the commit message.

humblec

comment created time in a day

issue comment kubernetes-sigs/kind

Use memory storage for etcd

Persistence beyond host reboot was the most highly requested issue in the tracker, people do use kind outside of CI ...

For startup performance improvements, the most impact will come from improving the upstream bootstrapping / upstream component performance. You'll be hard pressed to find a kubeadm environment starting much faster than kind with the node image already downloaded...

Apiserver, kubeadm, kubelet etc. are all upstream and the majority of the boot time is spent on those things coming up.

On Thu, May 28, 2020, 08:03 Antonio Ojea notifications@github.com wrote:

None of the manifests in this issue decreased the amount of time it took kind to create a cluster

and that's not the bottleneck creating a cluster, this patch is because etcd is very IO intensive, if you are using slow disks or a laptop with other apps running you will notice the difference, but the time to create a cluster does not depend on this.

persistence is unnecessary.

well, for a CI or dev environment it may not be necessary, but any production clusters needs to persist the data 😅

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/kubernetes-sigs/kind/issues/845#issuecomment-635406892, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHADKY2VNS3T5NEIZQNLOLRTZ4MRANCNFSM4IUPOOHA .

aojea

comment created time in a day

issue comment kubernetes/kubernetes

Remove the apiserver insecure port

cc @neolit123

tallclair

comment created time in 2 days

issue comment kubernetes-sigs/kind

cluster logs may not be dumped when tests timeout

grace period is correct, though it's difficult to tell from the CI output that anything happened in that time, it looks as if the ginkgo after suite simply hung for 15m ... https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_kind/1627/pull-kind-e2e-kubernetes-1-18/1265854255458160642

BenTheElder

comment created time in 2 days

issue closed kubernetes-sigs/kind

Cache Docker images

I have a slow internet connection so I'm trying to cache the Docker images used by my deployments in the kind cluster.

For example, If my deployment references gcr.io/kuar-demo/kuard-amd64:blue image I tried:

$ docker pull gcr.io/kuar-demo/kuard-amd64:blue
$ kind load docker-image gcr.io/kuar-demo/kuard-amd64:blue --name kind
$ cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kuard
  labels:
    app: kuard
spec:
  replicas: 3
  selector:
    matchLabels:
      app: kuard
  template:
    metadata:
      labels:
        app: kuard
    spec:
      containers:
      - name: kuard
        image: gcr.io/kuar-demo/kuard-amd64:blue
        ports:
        - containerPort: 8080
EOF

But it looks like the image is always downloaded from dockerhub by the node's containerd.

I couldn't find specific documentation about kind load docker-image but thought this was the purpose (to install the local image in the nodes that do not have it already).

The only thing I could find related to my problem is an issue in minkube.

Is this possible? Or are my assumptions wrong?

Environment:

  • kind version: (use kind version):
kind v0.8.0 go1.14.2 linux/amd64
  • Kubernetes version: (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-16T11:56:40Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-30T20:19:45Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
  • Docker version: (use docker info):
Client:
 Debug Mode: false

Server:
 Containers: 11
  Running: 5
  Paused: 0
  Stopped: 6
 Images: 86
 Server Version: 19.03.8
 Storage Driver: overlay2
  Backing Filesystem: <unknown>
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.3.0-51-generic
 Operating System: Ubuntu 19.10
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.53GiB
 Name: tluportatil082
 ID: 77FN:SH5V:Z2NG:P64K:S7YC:IRIO:GOXY:KNL5:ODQB:K4GX:N25N:4XFZ
 Docker Root Dir: /media/data/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  nexus.treelogic.local:8082
  nexus.treelogic.local:8083
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine

WARNING: No swap limit support
  • OS (e.g. from /etc/os-release):
NAME="Ubuntu"
VERSION="19.10 (Eoan Ermine)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 19.10"
VERSION_ID="19.10"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=eoan
UBUNTU_CODENAME=eoan

closed time in 2 days

rubensa

issue comment kubernetes-sigs/kind

Cache Docker images

I'm not sure why you're seeing network traffic that looks like something is being downloaded, but that's not a lot to go on; on my end I don't see anything wrong happening here. We do not have the minikube bug.

rubensa

comment created time in 2 days

issue comment kubernetes-sigs/kind

Cache Docker images

I also see that the image shows up in crictl images on the node:

$ docker exec kind-control-plane crictl images
IMAGE                                      TAG                 IMAGE ID            SIZE
docker.io/kindest/kindnetd                 0.5.4               2186a1a396deb       113MB
docker.io/rancher/local-path-provisioner   v0.0.12             db10073a6f829       42MB
gcr.io/kuar-demo/kuard-amd64               blue                1db936caa6acc       23.2MB
k8s.gcr.io/coredns                         1.6.7               67da37a9a360e       43.9MB
k8s.gcr.io/debian-base                     v2.0.0              9bd6154724425       53.9MB
k8s.gcr.io/etcd                            3.4.3-0             303ce5db0e90d       290MB
k8s.gcr.io/kube-apiserver                  v1.18.2             7df05884b1e25       147MB
k8s.gcr.io/kube-controller-manager         v1.18.2             31fd71c85722f       133MB
k8s.gcr.io/kube-proxy                      v1.18.2             312d3d1cb6c72       133MB
k8s.gcr.io/kube-scheduler                  v1.18.2             121edc8356c58       113MB
k8s.gcr.io/pause                           3.2                 80d28bedfe5de       686kB
rubensa

comment created time in 2 days

issue comment kubernetes-sigs/kind

Cache Docker images

If I run:

$ docker pull gcr.io/kuar-demo/kuard-amd64:blue
$ kind load docker-image gcr.io/kuar-demo/kuard-amd64:blue --name kind
$ cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kuard
  labels:
    app: kuard
spec:
  replicas: 3
  selector:
    matchLabels:
      app: kuard
  template:
    metadata:
      labels:
        app: kuard
    spec:
      containers:
      - name: kuard
        image: gcr.io/kuar-demo/kuard-amd64:blue
        ports:
        - containerPort: 8080
EOF

with a fresh v0.8.0 install, I see this when describing the pods:

$ kubectl describe po | grep Pulled
  Normal  Pulled     65s   kubelet, kind-control-plane  Container image "gcr.io/kuar-demo/kuard-amd64:blue" already present on machine
  Normal  Pulled     65s   kubelet, kind-control-plane  Container image "gcr.io/kuar-demo/kuard-amd64:blue" already present on machine
  Normal  Pulled     65s   kubelet, kind-control-plane  Container image "gcr.io/kuar-demo/kuard-amd64:blue" already present on machine

Which means it did use the loaded image and did not pull.

rubensa

comment created time in 2 days

issue comment kubernetes-sigs/kind

Cache Docker images

sorry for the lack of comments so far, ctr --namespace=k8s.io images ... was the command I got from speaking with a containerd maintainer previously. I can't speak to what minikube is doing here.

you can see the correct behavior upstream in containerd here https://github.com/containerd/cri/pull/1170/files#diff-43d291f587f0ddc4781702cedc3d6e81R61

rubensa

comment created time in 2 days

pull request comment kubernetes/test-infra

correct grace period config and match bootstrap

https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_kind/1627/pull-kind-e2e-kubernetes-1-18/1265854255458160642/build-log.txt

entrypoint logs show the 15m timeout, but not anything happening during it.

BenTheElder

comment created time in 2 days

delete branch BenTheElder/test-infra

delete branch : grace-again

delete time in 2 days

issue closed kubernetes-sigs/kind

Kind v0.8.x - K8s v1.11.10 Node Image Errors on Cluster Creation

What happened: Cluster provisioning when using the kind provided K8s v1.11.10 node images fails on Kind version v0.8.1

What you expected to happen: Expected successful provisioning of a kind k8s cluster for v1.11.10 (all other K8s versions work with the provided images)

How to reproduce it (as minimally and precisely as possible):

$ kind create cluster --name nth-test-68e3cc35 --image kindest/node:v1.11.10@sha256:74c8740710649a3abb169e7f348312deff88fc97d74cfb874c5095ab3866bb42 --config kind-two-node-cluster.yaml --kubeconfig build/tmp-nth-test-68e3cc35/kubeconfig --retain

Anything else we need to know?:

$ kind create cluster --name nth-test-68e3cc35 --image kindest/node:v1.11.10@sha256:74c8740710649a3abb169e7f348312deff88fc97d74cfb874c5095ab3866bb42 --config kind-two-node-cluster.yaml --kubeconfig build/tmp-nth-test-68e3cc35/kubeconfig --retain
Creating cluster "nth-test-68e3cc35" ...
 ✓ Ensuring node image (kindest/node:v1.11.10) 🖼
 ✓ Preparing nodes 📦 📦
 ✓ Writing configuration 📜
 ✗ Starting control-plane 🕹️
ERROR: failed to create cluster: failed to init node with kubeadm: command "docker exec --privileged nth-test-68e3cc35-control-plane kubeadm init --ignore-preflight-errors=all --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1
Command Output: I0505 15:37:34.907857     166 masterconfig.go:113] loading configuration from the given file
I0505 15:37:34.910748     166 feature_gate.go:230] feature gates: &{map[]}
I0505 15:37:34.910843     166 init.go:250] [init] validating feature gates
[init] using Kubernetes version: v1.11.10
[preflight] running pre-flight checks
I0505 15:37:34.910912     166 checks.go:581] validating kubernetes and kubeadm version
I0505 15:37:34.911106     166 checks.go:179] validating if the firewall is enabled and active
I0505 15:37:34.918022     166 checks.go:216] validating availability of port 6443
I0505 15:37:34.918277     166 checks.go:216] validating availability of port 10251
I0505 15:37:34.918394     166 checks.go:216] validating availability of port 10252
I0505 15:37:34.918647     166 checks.go:291] validating the existence of file /etc/kubernetes/manifests/kube-apiserver.yaml
I0505 15:37:34.918915     166 checks.go:291] validating the existence of file /etc/kubernetes/manifests/kube-controller-manager.yaml
I0505 15:37:34.918969     166 checks.go:291] validating the existence of file /etc/kubernetes/manifests/kube-scheduler.yaml
I0505 15:37:34.918978     166 checks.go:291] validating the existence of file /etc/kubernetes/manifests/etcd.yaml
I0505 15:37:34.919020     166 checks.go:438] validating if the connectivity type is via proxy or direct
I0505 15:37:34.919054     166 checks.go:474] validating http connectivity to first IP address in the CIDR
I0505 15:37:34.919090     166 checks.go:474] validating http connectivity to first IP address in the CIDR
I0505 15:37:34.919130     166 checks.go:138] validating if the service is enabled and active
	[WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
	[WARNING Service-Docker]: docker service is not active, please run 'systemctl start docker.service'
I0505 15:37:34.928593     166 checks.go:340] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables
	[WARNING FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
I0505 15:37:34.928614     166 checks.go:340] validating the contents of file /proc/sys/net/ipv4/ip_forward
I0505 15:37:34.928709     166 checks.go:653] validating whether swap is enabled or not
	[WARNING Swap]: running with swap on is not supported. Please disable swap
I0505 15:37:34.929045     166 checks.go:381] validating the presence of executable crictl
I0505 15:37:34.929122     166 checks.go:381] validating the presence of executable ip
I0505 15:37:34.929178     166 checks.go:381] validating the presence of executable iptables
I0505 15:37:34.929197     166 checks.go:381] validating the presence of executable mount
I0505 15:37:34.929208     166 checks.go:381] validating the presence of executable nsenter
I0505 15:37:34.929321     166 checks.go:381] validating the presence of executable ebtables
I0505 15:37:34.929422     166 checks.go:381] validating the presence of executable ethtool
I0505 15:37:34.929455     166 checks.go:381] validating the presence of executable socat
I0505 15:37:34.929537     166 checks.go:381] validating the presence of executable tc
I0505 15:37:34.929563     166 checks.go:381] validating the presence of executable touch
I0505 15:37:34.929581     166 checks.go:523] running all checks
I0505 15:37:34.931021     166 kernel_validator.go:81] Validating kernel version
I0505 15:37:34.931174     166 kernel_validator.go:96] Validating kernel config
[preflight] The system verification failed. Printing the output from the verification:
KERNEL_VERSION: 4.19.76-linuxkit
CONFIG_NAMESPACES: enabled
CONFIG_NET_NS: enabled
CONFIG_PID_NS: enabled
CONFIG_IPC_NS: enabled
CONFIG_UTS_NS: enabled
CONFIG_CGROUPS: enabled
CONFIG_CGROUP_CPUACCT: enabled
CONFIG_CGROUP_DEVICE: enabled
CONFIG_CGROUP_FREEZER: enabled
CONFIG_CGROUP_SCHED: enabled
CONFIG_CPUSETS: enabled
CONFIG_MEMCG: enabled
CONFIG_INET: enabled
CONFIG_EXT4_FS: enabled
CONFIG_PROC_FS: enabled
CONFIG_NETFILTER_XT_TARGET_REDIRECT: enabled
CONFIG_NETFILTER_XT_MATCH_COMMENT: enabled
CONFIG_OVERLAY_FS: enabled
CONFIG_AUFS_FS: not set - Required for aufs.
CONFIG_BLK_DEV_DM: enabled
OS: Linux
CGROUPS_CPU: enabled
CGROUPS_CPUACCT: enabled
CGROUPS_CPUSET: enabled
CGROUPS_DEVICES: enabled
CGROUPS_FREEZER: enabled
CGROUPS_MEMORY: enabled
	[WARNING SystemVerification]: failed to get docker info: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
I0505 15:37:34.935173     166 checks.go:411] checking whether the given node name is reachable using net.LookupHost
I0505 15:37:34.936392     166 checks.go:622] validating kubelet version
I0505 15:37:35.050300     166 checks.go:138] validating if the service is enabled and active
I0505 15:37:35.058544     166 checks.go:216] validating availability of port 10250
I0505 15:37:35.058650     166 checks.go:216] validating availability of port 2379
[preflight/images] Pulling images required for setting up a Kubernetes cluster
[preflight/images] This might take a minute or two, depending on the speed of your internet connection
[preflight/images] You can also perform this action in beforehand using 'kubeadm config images pull'
I0505 15:37:35.058790     166 checks.go:253] validating the existence and emptiness of directory /var/lib/etcd
`docker` is required when docker is the container runtime and the kubelet is not running: exec: "docker": executable file not found in $PATH
$ cat kind-two-node-cluster.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  kubeadmConfigPatches:
    - |
      apiVersion: kubeadm.k8s.io/v1beta2
      kind: ClusterConfiguration
      metadata:
        name: config
      apiServer:
        extraArgs:
          "enable-admission-plugins": "NodeRestriction,PodSecurityPolicy"
- role: worker

Environment:

  • kind version: (use kind version):
$ kind version
kind v0.8.1 go1.14.2 darwin/amd64
  • Kubernetes version: (use kubectl version):
kind-worker-node $ kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.10", GitCommit:"7a578febe155a7366767abce40d8a16795a96371", GitTreeState:"clean", BuildDate:"2020-05-01T03:01:03Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}
  • Docker version: (use docker info):
docker info
Client:
 Debug Mode: false
 Plugins:
  buildx: Build with BuildKit (Docker Inc., v0.3.1-tp-docker)
  app: Docker Application (Docker Inc., v0.8.0)

Server:
 Containers: 2
  Running: 2
  Paused: 0
  Stopped: 0
 Images: 1
 Server Version: 19.03.8
 Storage Driver: overlay2
  Backing Filesystem: <unknown>
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.19.76-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 7.778GiB
 Name: docker-desktop
 ID: TO3B:GCJF:M7JQ:NVVW:KOQ3:756K:NIBR:UEIZ:UR2M:MFO3:SXEX:Q3Z6
 Docker Root Dir: /var/lib/docker
 Debug Mode: true
  File Descriptors: 52
  Goroutines: 63
  System Time: 2020-05-05T15:43:54.827453043Z
  EventsListeners: 3
 HTTP Proxy: gateway.docker.internal:3128
 HTTPS Proxy: gateway.docker.internal:3129
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine
  • OS (e.g. from /etc/os-release): Mac OS X Catalina 10.15.4
$ uname -a
Darwin imac.lan 19.4.0 Darwin Kernel Version 19.4.0: Wed Mar  4 22:28:40 PST 2020; root:xnu-6153.101.6~15/RELEASE_X86_64 x86_64

closed time in 2 days

bwagner5

issue comment kubernetes-sigs/kind

Kind v0.8.x - K8s v1.11.10 Node Image Errors on Cluster Creation

confirmed that 1.12.10 works. I've updated the release notes.

bwagner5

comment created time in 2 days

issue comment kubernetes-sigs/kind

Kind v0.8.x - K8s v1.11.10 Node Image Errors on Cluster Creation

I'm sorry this took so long to get to, I think this is a kubeadm bug. I'm going to drop the image from the release and drop 1.11.X at least going forward, as it's been out of community support for a full year now.

I'm going to discuss this with some of our other contributors but I think our support policy going forward is going to be something like "everything kubernetes supports plus the unreleased code plus 3 older versions", which would currently be: (1.18.X, 1.17.X, 1.16.X) + 1.19.0-alpha + (1.15.X, 1.14.X, 1.13.X).

bwagner5

comment created time in 2 days

pull request comment kubernetes-sigs/kind

trap INT and TERM in CI

/test all

BenTheElder

comment created time in 2 days

pull request comment kubernetes/test-infra

Add ingress-nginx push image postsubmit job

/lgtm /approve

aledbf

comment created time in 2 days

pull request comment kubernetes/test-infra

correct grace period config and match bootstrap

To be reviewed

BenTheElder

comment created time in 2 days

issue comment kubernetes-sigs/kind

cluster logs may not be dumped when tests timeout

... correcting grace period https://github.com/kubernetes/test-infra/pull/17736

BenTheElder

comment created time in 2 days

pull request comment kubernetes/test-infra

correct grace period config and match bootstrap

15m seems high, but for now i'm just going to match bootstrap, I'm most concerned with making sure we trigger cleanup etc. properly when we hit the timeout, we can narrow the period later if we wish.

BenTheElder

comment created time in 2 days

PR opened kubernetes/test-infra

correct grace period config and match bootstrap

the previous key was a bad copy-paste; I'm surprised it passed validation, I guess we still don't have strict key checking

this also updates the value to match bootstrap, I did some thorough digging this time and i'm pretty certain that bootstrap jobs get a 15m grace period if the timeout actually is triggered (which is very buggy in bootstrap).

/cc @aojea @amwat

+9 -9

0 comment

2 changed files

pr created time in 2 days

create branch BenTheElder/test-infra

branch : grace-again

created branch time in 2 days

Pull request review comment kubernetes/test-infra

set grace period on some key kind jobs

 presubmits:
     path_alias: sigs.k8s.io/kind
     decoration_config:
       timeout: 40m
+      terminationGracePeriodSeconds: 300

this is wrong, bad cut and paste, it's grace_period: 15s (for the default anyhow)
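
i.e. the intended shape is roughly (a sketch, using prow's decoration config keys):

decoration_config:
  timeout: 40m
  grace_period: 15m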

BenTheElder

comment created time in 2 days

push event BenTheElder/test-infra

Justin Santa Barbara

commit sha 2c400b44bc98914cad6b220cd3b87b85d55a02a1

kops: capture sysctl settings We're seeing some differences across OSes, particularly around flannel networking. One workaround is to change some sysctls, so let's capture those sysctl values.

view details

Marko Mudrinić

commit sha 93b54445fe40408a72a0d82d4a747858e5ccba8f

CAPDO: Add image-pushing postsubmit

view details

Mateusz Szostok

commit sha fb20ecb377e947a5b4c925a8788f0de4f580c99e

Bump Go version to 1.13 for svcat unit-tests

view details

Alex Pavel

commit sha efa8945edbcb2cffd9e0fa383f1add0a3ef88ec6

config: use fsnotify-based watcher for config agent

view details

Alex Pavel

commit sha f210fd4d54889503eb9123d131ccf09ec9868020

config agent: improve godoc

view details

Ernest Wong

commit sha 5b6a44c7921f8a81b335e96ba144048c42b27d7a

Azure: delete msft.asc after installing azure-cli & add additional fields in API model

view details

Dexter Rivera

commit sha a353e64bb36f167b644fc070cbf3431cad4c46a8

Modify gci option in --extract flag By default, the --extract=gci/FAMILY flag will look at COS images in the GCP project "container-vm-image-staging" and use the GCS bucket "gs://container-vm-image-staging/k8s-version-map" to map the image name to k8s version to be extracted. These values are hard-coded, and it would be ideal to pass that information to the --extract flag instead.

view details

Dexter Rivera

commit sha fa4b6c24a7e7b00f62f95cd47bdb9d3b1851f1cc

Fix unformatted go files

view details

Arnaud Meukam

commit sha 7f0068dcffdd60f122c14af3c043e206ae6456c8

Migrate slack-infra images Signed-off-by: Arnaud Meukam <ameukam@gmail.com>

view details

Dexter Rivera

commit sha d04ab84b49213822d085928c9e9791f8b0efc628

Add go unit tests for changes to extract_k8s.go

view details

Chao Dai

commit sha 70e9dd26df9351299727bd4cbf2e04f3d616092f

Just moving code around

view details

Chao Dai

commit sha 294e0e544ee0608feb125dd95afac4708f1e55c7

Deck runlocal supports dynamic job listing

view details

Alex Pavel

commit sha 143cd644d4535d899360bad3a3fdefa2e9c5f8a1

config agent: improvements based on review

view details

Chao Dai

commit sha f527158aea1f15df85d8727d8ad57690d279a9a2

Update based on code review

view details

Benjamin Elder

commit sha b3940552fe70edbb4b1300bd1606c2546039e02e

remove extra p

view details

Kubernetes Prow Robot

commit sha 4f38406fcbd6102a1607f62f68df53e484fee84c

Merge pull request #17604 from BenTheElder/one-two-many-p remove extra p

view details

Ahmad Nurus S

commit sha bc599d3acfc4b321e5e8efbfbb02d506c3991b6e

Add capdo e2e job

view details

Kubernetes Prow Robot

commit sha 1c40f4e835564575d032d60e46da4d3230801d60

Merge pull request #17603 from prksu/add-capdo-e2e-job Add capdo e2e job

view details

Kubernetes Prow Robot

commit sha 59705fc01a8d64ddae729d025505102fa4163a2f

Merge pull request #17567 from mszostok/change-go-ver Bump Go version to 1.13 for svcat unit-tests

view details

Alvaro Aleman

commit sha 33fe09506485342354ad6a3a913534d36a57ffe6

Deck rerun: Fix trusted user check for presubmits We have to check if the user that requests the rerun is authorized, not if the user that created the PullRequest associated with the presubmit is.

view details

push time in 2 days

pull request comment kubernetes-sigs/kind

fix NO_PROXY

The document asks to add "172.17.0.0/16" to NO_PROXY, however these versions seem to require adding the docker network's subnet and all nodes' names, like "172.18.0.0/16,kind-control-plane". (And after this commit is included, users won't need to specify anything special to NO_PROXY for changing kind's behavior.)

That's correct, sorry about that.

The next release is tentatively due by june 1st, though we got a bit less done than I hoped so far .. https://github.com/kubernetes-sigs/kind/milestones

BTW, this PR is very nice work, for it removed much complexity for users in deploying kind behind a proxy. Thank you very much for your work!

thanks :-)

BenTheElder

comment created time in 2 days

pull request comment kubernetes-sigs/kind

trap INT and TERM in CI

/test all

BenTheElder

comment created time in 2 days

Pull request review comment kubernetes/test-infra

add greenhouse-metrics for k8s-infra-prow-build

 stringData:
       static_configs:
         - targets:
             - "35.225.115.154" # external ip greenhouse-metrics for k8s-prow-builds
+            - "34.72.140.202" # external ip greenhouse-metrics for k8s-infra-prow-build

not sure how this behaves but /lgtm /hold

spiffxp

comment created time in 2 days

pull request comment kubernetes/k8s.io

Add greenhouse to k8s-infra-prow-build

/lgtm

spiffxp

comment created time in 2 days

issue comment kubernetes-sigs/kind

[connection]: Unable to access kind cluster inside a custom Github Action

awesome thanks!

I'd like to figure out 0.8+, but we can revisit that later :-)

uditgaurav

comment created time in 2 days

issue closed kubernetes-sigs/kind

[connection]: Unable to access kind cluster inside a custom Github Action

What is happening?

  • I'm Using a KinD cluster in Github Action the flow of workflow yaml is:
    • Creating KinD cluster (Created successfully)
    • Getting Nodes via kubectl get nodes
NAME                 STATUS   ROLES    AGE   VERSION
kind-control-plane   Ready    master   68s   v1.17.0
  • Now I'm trying to use the cluster inside my personal Github action by passing the kubeconfig. It is showing a connection error:
exit status 1: The connection to the server 127.0.0.1:32768 was refused - did you specify the right host or port?
  • Did I need to do anything else?

Workflow-yaml

name: Push
on:
  push:
    branches: [ master ]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v2

    - name: Creating KinD cluster
      run: |
        GO111MODULE=on 
        curl -Lo ./kind https://github.com/kubernetes-sigs/kind/releases/download/v0.7.0/kind-$(uname)-amd64
        chmod +x ./kind
        sudo mv ./kind /usr/local/bin/kind
        kind version
        kind create cluster --loglevel debug
        sleep 60

    - name: Getting Nodes
      run: |
        kubectl get nodes

    - name: Set config
      run: echo ::set-env name=KUBE_CONFIG_DATA::$(base64 -w 0 ~/.kube/config)
  
      #This is the custom Github Action inside which the kubectl command is not running.
    - name: Running Litmus pod delete chaos experiment
      uses: uditgaurav/kubernetes-chaos@v0.1.0

closed time in 2 days

uditgaurav

issue comment kubernetes-sigs/kind

cluster logs may not be dumped when tests timeout

setting a grace period here https://github.com/kubernetes/test-infra/pull/17735

BenTheElder

comment created time in 2 days

PR opened kubernetes/test-infra

set grace period on some key kind jobs

/cc @amwat @aojea @spiffxp

xref: https://github.com/kubernetes-sigs/kind/issues/1626

+9 -0

0 comment

2 changed files

pr created time in 2 days

create branch BenTheElder/test-infra

branch : set-grace-period

created branch time in 2 days

issue comment kubernetes-sigs/kind

cluster logs may not be dumped when tests timeout

the GCE jobs are using test-infra's deprecated "bootstrap.py" runner on prow instead of the pod utils. as a result the kind jobs are only getting 15s to do any graceful cleanup.

BenTheElder

comment created time in 2 days

pull request comment kubernetes-sigs/kind

trap INT and TERM in CI

notably I'm concerned about how podutils / our config is handling these, I think we need a longer grace period.

May 27 21:06:34.545: INFO: Running AfterSuite actions on all nodes

{"component":"entrypoint","file":"prow/entrypoint/run.go:164","func":"k8s.io/test-infra/prow/entrypoint.Options.ExecuteProcess","level":"error","msg":"Process did not finish before 40m0s timeout","time":"2020-05-27T21:29:25Z"} {"component":"entrypoint","file":"prow/entrypoint/run.go:245","func":"k8s.io/test-infra/prow/entrypoint.gracefullyTerminate","level":"error","msg":"Process did not exit before 15s grace period","time":"2020-05-27T21:29:40Z"}

note those times, 15s is not even enough time for the ginkgo after-suite to finish. the GCE jobs are on bootstrap.py

BenTheElder

comment created time in 2 days

pull request comment kubernetes-sigs/kind

trap INT and TERM in CI

/hold cancel there's other reasons we may not work during timeouts, this change is still correct

BenTheElder

comment created time in 2 days

pull request comment kubernetes-sigs/kind

[doc] Fix 1361 node label

no problem. i haven't actually found docs for this yet 😅, but it is a "known" upstream restriction.

antoinetran

comment created time in 2 days

Pull request review comment kubernetes-sigs/kind

trap INT and TERM in CI

 install_kind() { main() {   # create temp dir and setup cleanup   TMP_DIR=$(mktemp -d)-  trap cleanup EXIT+  trap cleanup INT TERM EXIT

There's also what @Shawn said: Ash and Dash don't trap signals with EXIT.

that's the only relevant part really.

POSIX has requirements about what it looks like when EXIT is called, but when it is called is underspecified. we explicitly want these

BenTheElder

comment created time in 2 days

issue comment kubernetes-sigs/kind

[connection]: Unable to access kind cluster inside a custom Github Action

I forgot we only have "kind get kubeconfig --internal >$HOME/.kube/config"

This may be an issue on 0.8+ ... I'm not sure if github exposes what docker network they use for actions or any way to configure it. it looks like no and no.

On Wed, May 27, 2020 at 1:53 PM UDIT GAURAV notifications@github.com wrote:

My kind version is 0.7 kind export kubeconfig --internal gives ERROR: unknown flag: --internal maybe I'm missing some parameter? So, I passed this to action and now it seems to be running fine for me.

- name: Export kubeconfig
  run: |
    kind get kubeconfig --name "kind" --internal | sed "s/kind-control-plane/$(docker inspect "kind-control-plane" --format "{{ .NetworkSettings.Networks.kind.IPAddress }}")/g" > config

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/kubernetes-sigs/kind/issues/1619#issuecomment-634933009, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHADK3BS4Q5BGCH4YRHVODRTV4SRANCNFSM4NISAQSQ .

uditgaurav

comment created time in 2 days

issue comment kubernetes-sigs/kind

cluster logs may not be dumped when tests timeout

filed https://github.com/kubernetes-sigs/kind/pull/1627 to possibly fix the exit handler in the CI script(s)

BenTheElder

comment created time in 2 days

Pull request review comment kubernetes-sigs/kind

trap INT and TERM in CI

 install_kind() { main() {   # create temp dir and setup cleanup   TMP_DIR=$(mktemp -d)-  trap cleanup EXIT+  trap cleanup INT TERM EXIT

this one should already be doing this because it's bash, but the other script may be ash/dash (it's using sh)

BenTheElder

comment created time in 2 days

PR opened kubernetes-sigs/kind

trap INT and TERM in CI

EXIT is not handled consistently across shells

this will be obviated by #986 but the need to debug some timeouts is a bit more pressing and this change is tiny ...
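
The pattern, as a minimal sh sketch (not the exact CI script):

#!/bin/sh
cleanup() {
  # dump cluster logs, delete the cluster, remove the temp dir, ...
  rm -rf "${TMP_DIR}"
}
TMP_DIR=$(mktemp -d)
# some shells (ash/dash) do not run the EXIT trap when killed by INT/TERM,
# so trap those signals explicitly too
trap cleanup INT TERM EXIT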

+2 -2

0 comment

2 changed files

pr created time in 2 days

create branch BenTheElder/kind

branch : all-the-traps

created branch time in 2 days

pull request comment kubernetes-sigs/kind

[doc] Fix 1361 node label

only certain labels may be used. additionally kubeadm uses node role labels for its own purposes currently.

antoinetran

comment created time in 2 days

pull request comment kubernetes-sigs/kind

[doc] Fix 1361 node label

kubernetes.io is reserved by kubernetes upstream, you need to use your own labels https://github.com/kubernetes/kops/issues/7494

antoinetran

comment created time in 2 days

pull request comment kubernetes/kubernetes

Revert "Revert "Rely on default watch cache capacity and ignore its requested size""

tracking in https://github.com/kubernetes-sigs/kind/issues/1626, we chatted and I think kind / kubernetes test setup is not handling timeout properly. working on it.

wojtek-t

comment created time in 2 days

issue opened kubernetes-sigs/kind

cluster logs may not be dumped when tests timeout


What happened:

when running kubernetes's e2e tests we sometimes lose the cluster logs; this appears to be due to poor handling of the timeout

What you expected to happen:

we should handle test timeout correctly

See:

https://github.com/kubernetes/kubernetes/pull/91491#issuecomment-634689364

/assign

created time in 2 days

issue opened kubernetes-sigs/kind

increase kubernetes component verbosity


What would you like to be added:

during e2e testing we should increase the component log verbosity to match the GCE testing

Why is this needed:

so components can be more easily debugged

it might? not make sense to do this in clusters used for other purposes (e.g. istio development) but it definitely makes sense here.

requested by @wojtek-t https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/config-test.sh#L204
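
One possible way to wire this up from a cluster config, shown only as a sketch (kind may instead set it internally for its CI clusters):

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  kubeadmConfigPatches:
  - |
    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterConfiguration
    metadata:
      name: config
    apiServer:
      extraArgs:
        v: "4"
    controllerManager:
      extraArgs:
        v: "4"
    scheduler:
      extraArgs:
        v: "4"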

/assign

created time in 2 days

push event BenTheElder/kind

Benjamin Elder

commit sha fefccb7a7dc8ac6d249c2d8bc5d54a6d7b0ea73c

upgrade to containerd v1.4.0-beta.0-2-g6312b52d

view details

Benjamin Elder

commit sha fce9435798937bebdd4418a19d32e0f52436b722

bump base image

view details

Benjamin Elder

commit sha 544bf5ad0aad65f9f044ddd5ead3ba946ae161d0

bump node image

view details

Kubernetes Prow Robot

commit sha 82c4dae30441b78de410a34c09b1059921b29b58

Merge pull request #1599 from BenTheElder/upgrade-containerd Upgrade containerd to 1.4-beta.0

view details

push time in 2 days

issue comment kubernetes-sigs/kind

[connection]: Unable to access kind cluster inside a custom Github Action

Taking a second look: does this last action run in a container?

Localhost / the loopback interface is unique to each container. 127.0.0.1 is local to each container...

You can use kind export kubeconfig --internal on 0.7, on 0.8+ you'd either need this step to run on the kind network somehow (not sure if this is possible with GitHub actions), or else https://github.com/kubernetes-sigs/kind/issues/1558

uditgaurav

comment created time in 2 days

issue comment kubernetes/kubernetes

kube-proxy:1.17 for s390x image is broken?

can someone link back the fix mentioned in https://github.com/kubernetes/kubernetes/issues/87197#issuecomment-634424564 ?

cheeye

comment created time in 2 days

issue comment kubernetes-sigs/kind

Unhelpful info log verbosity level

We can change the message a bit, but there are no advertised values. Kubernetes doesn't advertise any either.

lukehinds

comment created time in 2 days

issue comment kubernetes-sigs/kind

Unhelpful info log verbosity level

This is a standard kubernetes / klog / glog thing. You increase the value until it shows what you're looking for.

The verbose logs aren't a contract of any sort; they're for developing kind. The loglevel mapping is purely there to ensure that calling the CLI won't break until users get a chance to switch.

User facing information emits at the default, and can be silenced with the quiet flag.
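
For example (assuming kind's -v/--verbosity and -q/--quiet flags):

$ kind create cluster -v 3   # more verbose
$ kind create cluster -v 6   # even more
$ kind create cluster -q     # quiet: silence non-error output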

lukehinds

comment created time in 2 days

pull request comment kubernetes-sigs/kind

Upgrade containerd to 1.4-beta.0

/hold cancel

BenTheElder

comment created time in 3 days
