Michael Crosby crosbymichael Indiana http://crosbymichael.com Building things for others who build things

containerd/cgroups 441

cgroups package for Go

ClusterHQ/powerstrip 306

Powerstrip: A tool for prototyping Docker extensions

crosbymichael/boss 257

Run containers like a boss

containerd/ttrpc 180

GRPC for low-memory environments

christopherhesse/rethinkgo 139

OBSOLETE Go language driver for RethinkDB

containerd/continuity 78

A transport-agnostic, filesystem metadata manifest system

containerd/go-cni 71

A generic CNI library to provide APIs for CNI plugin interactions

containerd/go-runc 61

runc bindings for Go

containerd/fifo 56

fifo pkg for Go

crosbymichael/.dotfiles 53

bootstrap for my dev setup

pull request commentopencontainers/runc

Release 1.0.0-rc91

LGTM

mrunalp

comment created time in 4 days

startedcontainernetworking/cni

started time in 6 days

pull request commentcontainerd/containerd

Use path based unix socket for shims

Ok, it looks like we have 1-2 tests to fix. The handling and timeouts of file based unix sockets seem to differ from those of abstract sockets. I'm guessing this has something to do with whether someone is listening on the other end or not.

If anyone else has time to look into this go for it, I'll try to debug Monday.

crosbymichael

comment created time in 9 days

push eventcrosbymichael/containerd

Michael Crosby

commit sha 1839f67616349ed2fd92877260a0c6e9abbd664b

Use path based unix socket for shims This allows filesystem based ACLs for configuring access to the socket of a shim. Signed-off-by: Michael Crosby <michael@thepasture.io>

view details

push time in 9 days

push eventcrosbymichael/containerd

Johannes Frey

commit sha 87f9fdb06519594d8f26d6a20f85e79b9a35d8bf

Cope with double quotes in Linux Mountinfo Signed-off-by: Johannes Frey <me@johannes-frey.de>

view details

Johannes Frey

commit sha cb91b1724dec212db7ba68958f2b7aba8a4ceee9

Add testcase containing mountpoint with escaped backslash Signed-off-by: Johannes Frey <me@johannes-frey.de>

view details

Johannes Frey

commit sha 8897e152030ec3d6076558388f84e447a7be1b64

Add more test cases with single quotes Signed-off-by: Johannes Frey <me@johannes-frey.de>

view details

Johannes Frey

commit sha ee734e867ab9732a7c42028be1e8a8a76ac3da84

Add test case with backticks Signed-off-by: Johannes Frey <me@johannes-frey.de>

view details

Akihiro Suda

commit sha fd99b6566be4f2303e71274976e2cd6eee4553c0

decrease log level of cgroup2 ToggleController error when running in UserNS Fix #4312 Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>

view details

Avi Deitcher

commit sha e7f069e2c337bf77d31b7460bda980482fdaf508

describe content flow and dependencies Signed-off-by: Avi Deitcher <avi@deitcher.net>

view details

Derek McGowan

commit sha 1127ffc7400e2d1b438979fd782b7ed9c73e5c9b

Merge pull request #4207 from deitch/doc-content describe content flow and dependencies

view details

Michael Crosby

commit sha 492c014136a301eff66a970311cd480d1d31228b

Merge pull request #4340 from AkihiroSuda/fix-4312 decrease log level of cgroup2 ToggleController error when running in UserNS

view details

Michael Crosby

commit sha c75180740937d4b2d44b9c1edc1c27b208e66e32

Merge pull request #4325 from c445/mountinfo-linux-double-quotes Cope with double quotes in Linux Mountinfo

view details

Michael Crosby

commit sha 578337fd5c89caee95f2fb282c8c82d30eb67700

Use path based unix socket for shims This allows filesystem based ACLs for configuring access to the socket of a shim. Signed-off-by: Michael Crosby <michael@thepasture.io>

view details

push time in 9 days

Pull request review commentcontainerd/containerd

Use path based unix socket for shims

 func AdjustOOMScore(pid int) error {
 	return nil
 }
-// SocketAddress returns an abstract socket address
+// SocketAddress returns a socket address
 func SocketAddress(ctx context.Context, id string) (string, error) {

I think @AkihiroSuda's comment above about prefixing the paths with file:// or unix:// gives us a way to solve this problem. If there is no URI based path in the address file, we will treat it as an abstract socket for backwards compatibility. I think this will work out really well :)

crosbymichael

comment created time in 10 days
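
The backwards-compatibility rule described in the review comment above can be sketched in Go. This is only an illustration under stated assumptions: `dialTarget` is a hypothetical helper, not the function containerd uses, and the sample addresses are made up. It shows the idea that an address carrying a `unix://` prefix is treated as a filesystem path, while an address without a URI prefix is treated as a legacy abstract socket (marked on Linux by a leading NUL byte).

```go
package main

import (
	"fmt"
	"strings"
)

// dialTarget converts an address read from a shim's address file into the
// string that would be handed to a unix-socket dialer.
// Hypothetical helper for illustration; not containerd's actual code.
func dialTarget(address string) string {
	const prefix = "unix://"
	if strings.HasPrefix(address, prefix) {
		// URI form: the socket lives on the filesystem, so ACLs can apply.
		return strings.TrimPrefix(address, prefix)
	}
	// No URI prefix: assume a legacy abstract socket, which Linux
	// identifies by a leading NUL byte in the name.
	return "\x00" + address
}

func main() {
	// Both sample addresses are invented for this sketch.
	fmt.Printf("%q\n", dialTarget("unix:///run/containerd/s/abc123"))
	fmt.Printf("%q\n", dialTarget("/containerd-shim/abc123.sock"))
}
```

Keeping the decision in one place means older address files, which contain no prefix, keep working without any migration step.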

Pull request review commentcontainerd/containerd

Use path based unix socket for shims

 func AdjustOOMScore(pid int) error {
 	return nil
 }
-// SocketAddress returns an abstract socket address
+// SocketAddress returns a socket address
 func SocketAddress(ctx context.Context, id string) (string, error) {
 	ns, err := namespaces.NamespaceRequired(ctx)
 	if err != nil {
 		return "", err
 	}
 	d := sha256.Sum256([]byte(filepath.Join(ns, id)))
-	return filepath.Join(string(filepath.Separator), "containerd-shim", fmt.Sprintf("%x.sock", d)), nil
+	return filepath.Join("/run/containerd/s", fmt.Sprintf("%x", d)), nil

ok, let me see if I can get that data easily.

crosbymichael

comment created time in 10 days

Pull request review commentcontainerd/containerd

Use path based unix socket for shims

 func AdjustOOMScore(pid int) error {
 	return nil
 }
-// SocketAddress returns an abstract socket address
+// SocketAddress returns a socket address
 func SocketAddress(ctx context.Context, id string) (string, error) {
 	ns, err := namespaces.NamespaceRequired(ctx)
 	if err != nil {
 		return "", err
 	}
 	d := sha256.Sum256([]byte(filepath.Join(ns, id)))
-	return filepath.Join(string(filepath.Separator), "containerd-shim", fmt.Sprintf("%x.sock", d)), nil
+	return filepath.Join("/run/containerd/s", fmt.Sprintf("%x", d)), nil
 }

 // AnonDialer returns a dialer for an abstract socket
 func AnonDialer(address string, timeout time.Duration) (net.Conn, error) {
 	address = strings.TrimPrefix(address, "unix://")
-	return dialer.Dialer("\x00"+address, timeout)
+	return dialer.Dialer(address, timeout)

I think that could work. Instead of file what about unix://? Most of our code strips unix:// anyways.

crosbymichael

comment created time in 10 days

Pull request review commentcontainerd/containerd

Use path based unix socket for shims

 func AdjustOOMScore(pid int) error {
 	return nil
 }
-// SocketAddress returns an abstract socket address
+// SocketAddress returns a socket address
 func SocketAddress(ctx context.Context, id string) (string, error) {

That is a tricky one. I think, with a little more work, I can make it support existing abstract sockets based on the static /containerd-shim path (outside of /run). We can just assume this is an abstract socket path. Does that sound ok to you?

crosbymichael

comment created time in 10 days

push eventcontainerd/containerd

Johannes Frey

commit sha 87f9fdb06519594d8f26d6a20f85e79b9a35d8bf

Cope with double quotes in Linux Mountinfo Signed-off-by: Johannes Frey <me@johannes-frey.de>

view details

Johannes Frey

commit sha cb91b1724dec212db7ba68958f2b7aba8a4ceee9

Add testcase containing mountpoint with escaped backslash Signed-off-by: Johannes Frey <me@johannes-frey.de>

view details

Johannes Frey

commit sha 8897e152030ec3d6076558388f84e447a7be1b64

Add more test cases with single quotes Signed-off-by: Johannes Frey <me@johannes-frey.de>

view details

Johannes Frey

commit sha ee734e867ab9732a7c42028be1e8a8a76ac3da84

Add test case with backticks Signed-off-by: Johannes Frey <me@johannes-frey.de>

view details

Michael Crosby

commit sha c75180740937d4b2d44b9c1edc1c27b208e66e32

Merge pull request #4325 from c445/mountinfo-linux-double-quotes Cope with double quotes in Linux Mountinfo

view details

push time in 11 days

PR merged containerd/containerd

Cope with double quotes in Linux Mountinfo

This PR resolves the double quote problem described in issue https://github.com/containerd/containerd/issues/4257

+107 -2

12 comments

2 changed files

johannesfrey

pr closed time in 11 days

push eventcontainerd/containerd

Akihiro Suda

commit sha fd99b6566be4f2303e71274976e2cd6eee4553c0

decrease log level of cgroup2 ToggleController error when running in UserNS Fix #4312 Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>

view details

Michael Crosby

commit sha 492c014136a301eff66a970311cd480d1d31228b

Merge pull request #4340 from AkihiroSuda/fix-4312 decrease log level of cgroup2 ToggleController error when running in UserNS

view details

push time in 11 days

issue closedcontainerd/containerd

[cgroup2+rootless] silence ToggleControllers EACCES log

Description

An extra log is printed on cgroup2+rootless. Does not affect the actual behavior.

Steps to reproduce the issue:

  1. Start dockerd-rootless.sh on a cgroup v2 host (needs git master for everything. Prebuilt binaries are here: https://github.com/AkihiroSuda/moby-snapshot)
  2. Run some container: docker run --rm hello-world
  3. Observe the daemon log

Describe the results you received: Works as expected, but an extra message is logged:

time="2020-06-08T21:17:14.892316833+09:00" level=error msg="failed to enable controllers ([cpuset cpu io memory pids rdma])" error="failed to write subtree controllers [cpuset cpu io memory pids rdma] to \"/sys/fs/cgroup/user.slice/user-1001.slice/cgroup.subtree_control\": open /sys/fs/cgroup/user.slice/user-1001.slice/cgroup.subtree_control: permission denied"

https://github.com/containerd/containerd/blob/38cb1c1a54e3180edd29933974d715b69334f0f1/runtime/v2/runc/v2/service.go#L353-L354

Describe the results you expected: The log should be silenced.

Output of containerd --version:

containerd github.com/containerd/containerd v1.4.0-beta.1-18-g38cb1c1a 38cb1c1a54e3180edd29933974d715b69334f0f1

closed time in 11 days

AkihiroSuda

push eventcrosbymichael/containerd

Michael Crosby

commit sha af42ed328d2df97f5bf84d3566b03b6ba191866f

Use path based unix socket for shims This allows filesystem based ACLs for configuring access to the socket of a shim. Signed-off-by: Michael Crosby <michael@thepasture.io>

view details

push time in 11 days

push eventcrosbymichael/containerd

Michael Crosby

commit sha 91a9d951bee10abadc5d4b27d5ebb147a3faba62

Use path based unix socket for shims This allows filesystem based ACLs for configuring access to the socket of a shim. Signed-off-by: Michael Crosby <michael@thepasture.io>

view details

push time in 11 days

PR opened containerd/containerd

Use path based unix socket for shims

This allows filesystem based ACLs for configuring access to the socket of a shim.

Signed-off-by: Michael Crosby michael@thepasture.io

+23 -14

0 comment

4 changed files

pr created time in 11 days

create branch crosbymichael/containerd

branch : shim-socket-path

created branch time in 11 days

push eventcontainerd/containerd

Akihiro Suda

commit sha bebfbab03163da35300c360504ce5df33a19a40d

vendor: update bbolt to v1.3.5 We had once updated bbolt from v1.3.3 to v1.3.4 in #4134, but reverted to v1.3.3 in #4156 due to "fatal error: sweep increased allocation count" (etcd-io/bbolt#214). The issue was fixed in bbolt v1.3.5 (etcd-io/bbolt#220). Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>

view details

Michael Crosby

commit sha c2f8011ff84dc584f821f155d17f36bdf550a157

Merge pull request #4334 from AkihiroSuda/bbolt-1.3.5 vendor: update bbolt to v1.3.5

view details

push time in 12 days

PR merged containerd/containerd

vendor: update bbolt to v1.3.5

We had once updated bbolt from v1.3.3 to v1.3.4 in #4134, but reverted to v1.3.3 in #4156 due to "fatal error: sweep increased allocation count" (etcd-io/bbolt#214).

The issue was fixed in bbolt v1.3.5 (etcd-io/bbolt#220).

Full changes: https://github.com/etcd-io/bbolt/compare/v1.3.3...v1.3.5

+251 -148

2 comments

25 changed files

AkihiroSuda

pr closed time in 12 days

pull request commentcontainerd/containerd

vendor: update bbolt to v1.3.5

LGTM

AkihiroSuda

comment created time in 12 days

pull request commentcontainerd/containerd

[release/1.3] Bump Golang 1.13.12

LGTM

AkihiroSuda

comment created time in 12 days

pull request commentcontainerd/project

Add security advisor role

LGTM

dmcgowan

comment created time in 17 days

startedAnuken/Mindustry

started time in 17 days

startedstamblerre/gocode

started time in 18 days

startedfatih/vim-go

started time in 18 days

startedmhinz/vim-signify

started time in 19 days

Pull request review commentcontainerd/project

Add security advisor role

+# containerd project security advisors+#+# See GOVERNANCE.md for description of role+#+# SECURITY ADVISORS+# GitHub ID, Name, Email address

Should we have a column for who these advisors may represent? i.e. a company?

dmcgowan

comment created time in 19 days

startednlpodyssey/spago

started time in 19 days

delete branch crosbymichael/containerd

delete branch : cri-bump1.4x

delete time in 20 days

PR opened containerd/containerd

Bump CRI for 1.4x release

includes selinux bump.

Signed-off-by: Michael Crosby michael@thepasture.io

+68 -18

0 comment

13 changed files

pr created time in 20 days

create branch crosbymichael/containerd

branch : cri-bump1.4x

created branch time in 20 days

issue closedcontainerd/containerd

Native nvidia support

Seeing https://github.com/NVIDIA/nvidia-docker#quickstart

Note that with the release of Docker 19.03, usage of nvidia-docker2 packages are deprecated since NVIDIA GPUs are now natively supported as devices in the Docker runtime.

I was wondering if this is coming to containerd, assuming this is an implementation in runc? Or will we continue to need to add nvidia-container-runtime to the config.toml?

closed time in 20 days

joedborg

issue commentcontainerd/containerd

Native nvidia support

I'm also looking into better ways to handle this in CRI.

For now, I tested this the other day on my Ubuntu system and WithGPUs still works great. You can also use ctr run --gpus 0 ... to test this.

joedborg

comment created time in 20 days

issue commentcontainerd/containerd

Containerd cannot create containers that mount paths containing double quotes

I think this is a good patch to accept. Feel free to open a PR if you can.

johannesfrey

comment created time in 20 days

delete branch crosbymichael/cri

delete branch : selinux-bump

delete time in 20 days

push eventcontainerd/containerd

Wei Fu

commit sha d656fa38ca32fc0e08f31b74169552f371bbd4e0

restart plugin: support binary log uri Introduce LogURIGenerator helper function in cio package. It is used in the restart options, like WithBinaryLogURI and WithFileLogURI. And restart.LogPathLabel might be used in production and work well. In order to reduce breaking change, the LogPathLabel is still recognized if new LogURILabel is not set. In next release 1.5, the LogPathLabel will be removed. Signed-off-by: Wei Fu <fuweid89@gmail.com>

view details

Michael Crosby

commit sha ae2f3fdfd1a435fe83fb083e4db9fa63a9e0a13e

Merge pull request #4315 from fuweid/fix-4294 restart plugin: support binary log uri

view details

push time in 20 days

PR merged containerd/containerd

restart plugin: support binary log uri

Introduce LogURIGenerator helper function in cio package. It is used in the restart options, like WithBinaryLogURI and WithFileLogURI.

restart.LogPathLabel may already be used in production and works well. To reduce breaking changes, LogPathLabel is still recognized if the new LogURILabel is not set. In the next release, 1.5, LogPathLabel will be removed.

Signed-off-by: Wei Fu fuweid89@gmail.com


fix: #4294

+135 -21

5 comments

5 changed files

fuweid

pr closed time in 20 days
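
The fallback rule described in the PR above (prefer the new log URI label, but still honor the legacy log path label when it is the only one set) can be sketched as follows. This is a minimal illustration, not the restart plugin's code: `resolveLogURI` is a hypothetical helper, and the two label keys are assumptions standing in for the values of `restart.LogURILabel` and `restart.LogPathLabel`.

```go
package main

import "fmt"

// Label keys are assumptions for this sketch; consult the restart package
// for the real values of LogURILabel and LogPathLabel.
const (
	logURILabel  = "containerd.io/restart.loguri"
	logPathLabel = "containerd.io/restart.logpath"
)

// resolveLogURI prefers the new URI label and falls back to the legacy
// path label, turning an absolute path into a file:// URI.
// Hypothetical helper for illustration only.
func resolveLogURI(labels map[string]string) (string, bool) {
	if uri := labels[logURILabel]; uri != "" {
		return uri, true
	}
	if path := labels[logPathLabel]; path != "" {
		// Assumes path is absolute, e.g. /var/log/c.log.
		return "file://" + path, true
	}
	return "", false
}

func main() {
	// Sample label values are invented for this sketch.
	legacy := map[string]string{logPathLabel: "/var/log/c.log"}
	both := map[string]string{
		logPathLabel: "/var/log/c.log",
		logURILabel:  "binary:///usr/bin/logger",
	}
	fmt.Println(resolveLogURI(legacy))
	fmt.Println(resolveLogURI(both))
}
```

Checking the new label first keeps existing deployments working while giving the URI form precedence once both are present.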

issue closedcontainerd/containerd

Plugin restart doesn't follow log-uri tasks

Description

With the change introduced by #3085, we can now specify an external binary to execute to handle logs. When a container is restarted by the restart plugin, this is lost.

If any binary:// URI is specified for logs, it is lost when the container restarts. Is there any way to keep it alive?

Steps to reproduce the issue:

  1. Set a task via container.NewTask(ctx, cio.LogURI(uri))
  2. Set restart.WithStatus(containerd.Running) on container
  3. Kill the container main process

Describe the results you received:

The shim task is lost. The restart plugin supports an extra argument (WithLogPath) to handle logs, but it only supports a log file, while tasks support a lot more (fifo, binary, etc.).

Describe the results you expected:

There should be a way to restart the same task (keeping the URI instead of specifying a file log path) and to use this URI inside the restart module.

closed time in 20 days

maxux

pull request commentcontainerd/containerd

restart plugin: support binary log uri

LGTM

fuweid

comment created time in 20 days

PR opened containerd/cri

bump selinux dep

Includes fixes for the category range and mount labeling.

Signed-off-by: Michael Crosby michael@thepasture.io

+13 -3

0 comment

4 changed files

pr created time in 20 days

create branch crosbymichael/cri

branch : selinux-bump

created branch time in 20 days

push eventcontainerd/cri

Laszlo Janosi

commit sha 479dfbac453a6af15bb871a1efc23339772b3a3f

Remove the protocol filter from the portMappings constructor. Reason: originally it was introduced to prevent the loading of the SCTP kernel module on the nodes. But iptables chain creation alone does not load the kernel module. The module would be loaded if an SCTP socket was created, but neither cri nor the portmap CNI plugin starts managing SCTP sockets if hostPort / portmappings are defined. Signed-off-by: Laszlo Janosi <laszlo.janosi@ibm.com>

view details

Michael Crosby

commit sha 61648227141a843ce71ecaf25997cf8d37c31c5a

Merge pull request #1508 from janosi/sctp-hostport Remove the protocol filter from the HostPort management

view details

push time in 20 days

PR merged containerd/cri

Remove the protocol filter from the HostPort management ok-to-test size/S

Reason: originally it was introduced to prevent the loading of the SCTP kernel module on the nodes. But iptables chain creation alone does not load the kernel module. The module would be loaded if an SCTP socket was created, but neither cri nor the portmap CNI plugin starts managing SCTP sockets if hostPort / portmappings are defined.

Trigger: Failing SCTP HostPort e2e test in K8s: https://github.com/kubernetes/kubernetes/issues/92041

Containerd issue (if this one is merged containerd/vendor shall be updated): https://github.com/containerd/containerd/issues/4321

+12 -9

20 comments

2 changed files

janosi

pr closed time in 20 days

pull request commentcontainerd/cri

Remove the protocol filter from the HostPort management

LGTM

janosi

comment created time in 20 days

push eventcontainerd/containerd

Davanum Srinivas

commit sha cbdfca8157f00c278105e4391e144c86e332520b

Build runc with selinux support docker-ce seems to be building runc with selinux support, let us follow the same pattern here please: https://github.com/docker/docker-ce/search?p=1&q=RUNC_BUILDTAGS&unscoped_q=RUNC_BUILDTAGS Signed-off-by: Davanum Srinivas <davanum@gmail.com> (cherry picked from commit 7a252f3ca1f158203574f2c7786a28e6c0368f5e)

view details

Michael Crosby

commit sha 48cc59890abbd0f0f88eb2014569f4ee7434582b

Merge pull request #4319 from hakman/runc-selinux-1.2 [release/1.2 backport] Build runc with selinux support

view details

push time in 23 days

PR merged containerd/containerd

[release/1.2 backport] Build runc with selinux support

docker-ce seems to be building runc with selinux support, let us follow the same pattern here please: https://github.com/docker/docker-ce/search?p=1&q=RUNC_BUILDTAGS&unscoped_q=RUNC_BUILDTAGS

Signed-off-by: Davanum Srinivas davanum@gmail.com

(cherry picked from commit 7a252f3ca1f158203574f2c7786a28e6c0368f5e)

+2 -2

2 comments

2 changed files

hakman

pr closed time in 23 days

pull request commentcontainerd/containerd

[release/1.3] Make killing shims more resilient

LGTM

fuweid

comment created time in 23 days

pull request commentcontainerd/containerd

restart plugin: support binary log uri

Code LGTM.

Let's wait for @maxux to confirm that this resolves their issue before merging.

fuweid

comment created time in 23 days

push eventcontainerd/containerd

Kenta Tada

commit sha 730b7a932e36472428e0c4147b6f794963c87033

Change the type of PdeathSignal Use x/sys as same as runtime/v1/linux/runtime.go Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>

view details

Michael Crosby

commit sha 185ea541d2254c734a5d123797868e8d3ac399f4

Merge pull request #4317 from KentaTada/modify-pdeathsignal-type Change the type of PdeathSignal

view details

push time in 23 days

PR merged containerd/containerd

Change the type of PdeathSignal

The type of PdeathSignal is different between https://github.com/containerd/containerd/blob/v1.4.0-beta.1/runtime/v1/linux/runtime.go#L509 and https://github.com/containerd/containerd/blob/v1.4.0-beta.1/pkg/process/init.go#L90

I think we should use x/sys instead of syscall. I'll modify go-runc.

Signed-off-by: Kenta Tada Kenta.Tada@sony.com

+4 -4

2 comments

1 changed file

KentaTada

pr closed time in 23 days

pull request commentcontainerd/containerd

Change the type of PdeathSignal

LGTM

KentaTada

comment created time in 23 days

PR merged containerd/go-runc

Change the type of PdeathSignal

As related to https://github.com/containerd/containerd/pull/4317, this commit changes the definition of PdeathSignal to use x/sys.

Signed-off-by: Kenta Tada Kenta.Tada@sony.com

+2 -2

2 comments

1 changed file

KentaTada

pr closed time in 23 days

push eventcontainerd/go-runc

Kenta Tada

commit sha 421b4cab7d101dabcb4a1ff6348ce9b695431b0b

Change the type of PdeathSignal Use x/sys instead of syscall Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>

view details

Michael Crosby

commit sha 0d1871416c41225a9b23326e94e9744fa9f02b01

Merge pull request #61 from KentaTada/modify-pdeathsignal-type Change the type of PdeathSignal

view details

push time in 23 days

pull request commentcontainerd/go-runc

Change the type of PdeathSignal

LGTM

KentaTada

comment created time in 23 days

pull request commentcontainerd/containerd

[release/1.2 backport] Build runc with selinux support

LGTM

hakman

comment created time in 23 days

pull request commentcontainerd/containerd

[release/1.3 backport] Build runc with selinux support

LGTM

hakman

comment created time in 23 days

pull request commentopencontainers/selinux

mountLabel can be changed without changing processLabel

LGTM

rhatdan

comment created time in 23 days

issue closedopencontainers/selinux

Detecting duplicates in new API

If we use the new API in the root of the selinux package, functions like:

// ReserveLabel reserves the MLS/MCS level component of the specified label
func ReserveLabel(label string) {
	if len(label) != 0 {
		con := strings.SplitN(label, ":", 4)
		if len(con) > 3 {
			mcsAdd(con[3])
		}
	}
}

do not return the error from mcsAdd().

How do we detect duplicate labels from this or does this matter anymore? Should we handle this in higher layers or should we expand the package a little bit to add functions like:

MustReserveLabel(label string) error { ???

closed time in 23 days

crosbymichael

issue commentopencontainers/selinux

Detecting duplicates in new API

OK, I think I'll just handle duplicates in the higher layers for now and not touch the current API.

Thanks!

crosbymichael

comment created time in 23 days

startedhodgesds/perf-utils

started time in 23 days

issue openedopencontainers/selinux

Detecting duplicates in new API

If we use the new API in the root of the selinux package, functions like:

// ReserveLabel reserves the MLS/MCS level component of the specified label
func ReserveLabel(label string) {
	if len(label) != 0 {
		con := strings.SplitN(label, ":", 4)
		if len(con) > 3 {
			mcsAdd(con[3])
		}
	}
}

do not return the error from mcsAdd().

How do we detect duplicate labels from this or does this matter anymore? Should we handle this in higher layers or should we expand the package a little bit to add functions like:

MustReserveLabel(label string) error { ???

created time in 25 days

push eventcontainerd/containerd

Gaurav Singh

commit sha ae08491bff2fdef7a91ff9c2d9e532d2f63d4bbd

waitForPid: fix goroutine leak Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>

view details

Michael Crosby

commit sha 7868e8d6aab20a44dafe6f330aa8e2afadf3b750

Merge pull request #4309 from gaurav1086/waitForPid_fix_goroutine_leak waitForPid: fix goroutine leak

view details

push time in a month

PR merged containerd/containerd

waitForPid: fix goroutine leak

Signed-off-by: Gaurav Singh gaurav1086@gmail.com

The goroutine fetching the Process pid may time out and cause a leak if there is no reader for the channel. To fix this, make it a buffered channel.

+1 -1

2 comments

1 changed file

gaurav1086

pr closed time in a month

pull request commentcontainerd/containerd

waitForPid: fix goroutine leak

LGTM

gaurav1086

comment created time in a month

push eventcontainerd/containerd

Gaurav Singh

commit sha 7213cd89d659876c31468dd1c9f5c98ec16ecdcb

Process I/O: Fix goroutine leak Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>

view details

Michael Crosby

commit sha 7fdcd07febba0aea18b543587897efd6744f62d1

Merge pull request #4310 from gaurav1086/process_io_fix_goroutine_leak Process I/O: Fix goroutine leak

view details

push time in a month

PR merged containerd/containerd

Process I/O: Fix goroutine leak

Signed-off-by: Gaurav Singh gaurav1086@gmail.com

The goroutine executing b.cmd.Wait() can leak if it times out and there is no corresponding reader for the done channel. To fix this, make the error channel an asynchronous buffered channel.

+1 -1

2 comments

1 changed file

gaurav1086

pr closed time in a month

pull request commentcontainerd/containerd

Process I/O: Fix goroutine leak

LGTM

gaurav1086

comment created time in a month

pull request commentcontainerd/containerd

overlay: use index=off to fix EBUSY on mount

LGTM

Thanks!

rudyfly

comment created time in a month

delete branch crosbymichael/containerd

delete branch : allow-list

delete time in a month

delete branch crosbymichael/selinux

delete branch : cat-range

delete time in a month

PR opened containerd/containerd

Update usage of whitelist in project

Signed-off-by: Michael Crosby michael@thepasture.io

+3 -3

0 comment

3 changed files

pr created time in a month

push eventcrosbymichael/containerd

Michael Crosby

commit sha 0f831093ce6ed28a9bb21f839d3f369ca6be9113

Update usage of whitelist in project Signed-off-by: Michael Crosby <michael@thepasture.io>

view details

push time in a month

create branch crosbymichael/containerd

branch : allow-list

created branch time in a month

startediovisor/bpftrace

started time in a month

pull request commentopencontainers/selinux

Allow the category range to be changed

@rhatdan I do have 1024 but want to limit the upper range for some reservations.

crosbymichael

comment created time in a month

pull request commentopencontainers/selinux

Allow the category range to be changed

Travis passed, I'm not sure why GitHub hasn't updated the status

crosbymichael

comment created time in a month

PR opened opencontainers/selinux

Allow the category range to be changed

This keeps the range as part of the global state of the package but allows users to modify the upper bounds of the category range.

Signed-off-by: Michael Crosby michael@thepasture.io

+11 -1

0 comment

2 changed files

pr created time in a month

create branch crosbymichael/selinux

branch : cat-range

created branch time in a month

pull request commentcontainerd/containerd

Make killing shims more resilient

Thanks for your first PR @ashrayjain !

ashrayjain

comment created time in a month

push eventcontainerd/containerd

Ashray Jain

commit sha 3e95727f390b530b91908aedf20a2769435d93bf

Make killing shims more resilient Currently, we send a single SIGKILL to the shim process once and then we spin in a loop where we use kill(pid, 0) to detect when the pid has disappeared completely. Unfortunately, this has a race condition since pids can be reused causing us to spin in an infinite loop when that happens. This adds a timeout to this loop which logs a warning and exits the infinite loop. Signed-off-by: Ashray Jain <ashrayj@palantir.com>

view details

Michael Crosby

commit sha 7ce8a9d7d3e0e972ad118c7f72203a79a6b29a38

Merge pull request #4204 from ashrayjain/aj/add-kill-retry Make killing shims more resilient

view details

push time in a month

PR merged containerd/containerd

Make killing shims more resilient

Currently, we send a single SIGKILL to the shim process once and then we spin in a loop where we use kill(pid, 0) to detect when the pid has disappeared completely.

Unfortunately, this has a race condition since pids can be reused causing us to spin in an infinite loop when that happens.

This adds a timeout to this loop which logs a warning and exits the infinite loop.

This fixes https://github.com/containerd/cri/issues/1427

+15 -5

38 comments

1 changed file

ashrayjain

pr closed time in a month

issue closedcontainerd/cri

CRI stops receiving events, causes timeouts in StopContainer and StopSandboxContainer

# ctr version
Client:
  Version:  1.3.3
  Revision: d76c121f76a5fc8a462dc64594aea72fe18e1178

Server:
  Version:  1.3.3
  Revision: d76c121f76a5fc8a462dc64594aea72fe18e1178
  UUID: 32b2cc77-d405-4038-81f6-daf6752ab018
# kubelet --version
Kubernetes v1.17.

We've been seeing an issue where a single node gets stuck in a state where pods from CronJobs that should have terminated long ago show as "Running" in Kubernetes. When I look on the machine, the processes are dead, and the containerd task for the container is STOPPED.

We were finally able to catch this happening and take time to investigate, and we're seeing lots of log messages like this from containerd:

time="2020-03-27T21:45:13.132058945Z" level=error msg="StopContainer for "94ade1615964afc99b3135f6aaefb4593da0ebc3357f674ce3181ccc569cd515" failed" error="rpc error: code = DeadlineExceeded desc = an error occurs during waiting for container "94ade1615964afc99b3135f6aaefb4593da0ebc3357f674ce3181ccc569cd515" to be killed: wait container "94ade1615964afc99b3135f6aaefb4593da0ebc3357f674ce3181ccc569cd515": context deadline exceeded"

It seems like some of our users noticed their jobs were still running and were stuck (as far as they could tell) and tried to kill the job. containerd seems to have gone into a loop trying to stop the containers for those jobs, timing out every time.

I think that the reason these start failing is because for some reason, the CRI service is no longer receiving events from containerd. If I search my logs for TaskExit, I can see lots of messages right up until when the issue started, at which point they drop off completely. Without those events, the CRI service can't update the status of the container, and stopping the container relies on waiting for that status to update.

Happy to provide more information if you need it! This can be really frustrating for our users, as it's hard for us to notice before it's caused downstream negative effects for them.

closed time in a month

mmoriarity-stripe

pull request commentcontainerd/containerd

Make killing shims more resilient

LGTM

This is fine for v1 but the proper fix is to move to v2 for the runtime shim :)

ashrayjain

comment created time in a month

pull request commentcontainerd/containerd

Fallback to request scope if missing from auth challange

@dmcgowan could you take a look at this one?

ecordell

comment created time in a month

pull request commentcontainerd/containerd

Revendor CRI to 62c91260d2f43b57fff408a9263a800b7a06a647

I bumped the build to see if that test just dropped out because of actions

dims

comment created time in a month

pull request commentcontainerd/containerd

Revendor CRI to 62c91260d2f43b57fff408a9263a800b7a06a647

LGTM

dims

comment created time in a month

pull request commentopencontainers/tob

Simplify mission of OCI

LGTM

caniszczyk

comment created time in a month

push eventcontainerd/containerd

Akihiro Suda

commit sha 2f601013e67c49c446311f980f235c964293a87f

cgroup2: implement `containerd.events.TaskOOM` event
How to test (from https://github.com/opencontainers/runc/pull/2352#issuecomment-620834524):
(host)$ sudo swapoff -a
(host)$ sudo ctr run -t --rm --memory-limit $((1024*1024*32)) docker.io/library/alpine:latest foo
(container)$ sh -c 'VAR=$(seq 1 100000000)'
An event `/tasks/oom {"container_id":"foo"}` will be displayed in `ctr events`.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>

view details

Michael Crosby

commit sha 62dd14114d5452f772c2f487b6a292ef18ad6fe5

Merge pull request #4273 from AkihiroSuda/oomv2 cgroup2: implement `containerd.events.TaskOOM` event

view details

push time in a month

PR merged containerd/containerd

cgroup2: implement `containerd.events.TaskOOM` event

How to test (from https://github.com/opencontainers/runc/pull/2352#issuecomment-620834524):

(host)$ sudo swapoff -a
(host)$ sudo ctr run -t --rm --memory-limit $((1024*1024*32)) docker.io/library/alpine:latest foo
(container)$ sh -c 'VAR=$(seq 1 100000000)'

An event /tasks/oom {"container_id":"foo"} will be displayed in ctr events.

DRAFT: Requires https://github.com/containerd/cgroups/pull/158

+861 -149

1 comment

17 changed files

AkihiroSuda

pr closed time in a month

issue closedcontainerd/cri

Custom Runtime Spec Defaults

There are many different defaults that an operator would want to change when it comes to the container runtime. Anything from rlimits and capabilities to default mounts could be configurable in a deployment. Right now, CRI does not have a way to specify this at the daemon level.

I would like to add defaults to the daemon for runtime configuration on how the default runtime spec is created. However, I would also like to avoid having many fields in the daemon's configuration for an ever-growing number of runtime options that an operator could want to change.

To solve both these requirements, I would like to propose custom spec defaults by changing the default spec. Currently all specs are created from a static, compiled-in struct with default options. ref: https://github.com/containerd/cri/blob/master/pkg/server/container_create.go#L291

This function uses the default found in the oci package as the base spec that all containers are created from. I want to add a simple daemon-level option called base_runtime_spec, which is a path to a serialized runtime spec on disk.

base_runtime_spec = "/etc/containerd/cri-base.json"

This would allow an operator to change ANYTHING on the spec to suit their needs and is forward compatible with all future runtime spec changes. The only real code change we would need in this codebase is to read this spec from disk and substitute it in place of the GenerateSpec's default, compiled-in struct.

Issues:

However, there could be a few issues because of the way CRI handles the spec. It looks like functions and opts will unset or replace a lot of options on the spec and will need to be accounted for.

ref: https://github.com/containerd/cri/blob/master/pkg/server/container_create_unix.go#L115

Comments, questions, concerns?

closed time in a month

crosbymichael

push eventcontainerd/cri

Maksym Pavlenko

commit sha 8d54f397534c737023c7d41c12e0202e140710a6

Allow specify base OCI runtime spec Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>

view details

Maksym Pavlenko

commit sha df8d6c5b7b58ab43846e289caccf27b065126da0

Update documentation for base OCI spec files Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>

view details

Maksym Pavlenko

commit sha 17c61e36cb5ed6ee59d27074e6be7e08663646fa

Fix cgroups path for base OCI spec Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>

view details

Michael Crosby

commit sha 8898550e348932e406049e937d98fb7564ac4e7a

Merge pull request #1498 from mxpv/base Specify base OCI runtime spec

view details

push time in a month

PR merged containerd/cri

Specify base OCI runtime spec size/L

This PR adds base_runtime_spec to runtime config to use a file with base OCI spec.

Details: https://github.com/containerd/cri/issues/1488

Signed-off-by: Maksym Pavlenko pavlenko.maksym@gmail.com

+197 -9

1 comment

11 changed files

mxpv

pr closed time in a month

pull request commentcontainerd/cri

Specify base OCI runtime spec

LGTM

mxpv

comment created time in a month

Pull request review commentcontainerd/cri

Specify base OCI runtime spec

 func (c *criService) volumeMounts(containerRootDir string, criMounts []*runtime.
 }
 
 // runtimeSpec returns a default runtime spec used in cri-containerd.
-func runtimeSpec(id string, opts ...oci.SpecOpts) (*runtimespec.Spec, error) {
+func (c *criService) runtimeSpec(id string, baseSpecFile string, opts ...oci.SpecOpts) (*runtimespec.Spec, error) {
 	// GenerateSpec needs namespace.
 	ctx := ctrdutil.NamespacedContext()
-	spec, err := oci.GenerateSpec(ctx, nil, &containers.Container{ID: id}, opts...)
+	container := &containers.Container{ID: id}
+
+	if baseSpecFile != "" {
+		baseSpec, ok := c.baseOCISpecs[baseSpecFile]
+		if !ok {
+			return nil, errors.Errorf("can't find base OCI spec %q", baseSpecFile)
+		}
+
+		spec := oci.Spec{}
+		if err := util.DeepCopy(&spec, &baseSpec); err != nil {
+			return nil, errors.Wrap(err, "failed to clone OCI spec")
+		}
+
+		if err := oci.ApplyOpts(ctx, nil, container, &spec, opts...); err != nil {

After doing this, you need to make sure that you set up the cgroups path based on the namespace in the context.

var s oci.Spec
if err := json.NewDecoder(f).Decode(&s); err != nil {
	return errors.Wrap(err, "decode spec")
}
s.Linux.CgroupsPath = filepath.Join("/", ns, c.ID)

if err := oci.ApplyOpts(ctx, client, c, &s, opts...); err != nil {
	return errors.Wrap(err, "apply opts")
}

mxpv

comment created time in a month

pull request commentopencontainers/tob

README: @crosbymichael is at Apple now

LGTM

cyphar

comment created time in a month

push eventcontainerd/containerd

Derek McGowan

commit sha 8f1ddb1428577783079f23a0399d448753cc05ce

Update release for 1.4.0-beta.1 Signed-off-by: Derek McGowan <derek@mcg.dev>

view details

Michael Crosby

commit sha 77bc753024ac3ed0da699ead3acb985979494515

Merge pull request #4289 from dmcgowan/next-1.4-beta Update release notes for 1.4.0-beta.1

view details

push time in a month

PR merged containerd/containerd

Update release notes for 1.4.0-beta.1

For preparing the next beta release

+2 -1

1 comment

2 changed files

dmcgowan

pr closed time in a month

pull request commentcontainerd/containerd

Update release notes for 1.4.0-beta.1

LGTM

dmcgowan

comment created time in a month
