
containers/libpod 4525

libpod is a library used to create container pods. Home of Podman.

containers/conmon 116

An OCI container runtime monitor.

containers/ocicrypt 14

Encryption libraries for Encrypted OCI Container images

jwhonce/origin-server 2

Public open source repository for the OpenShift Origin server components (CrankCase)

containers/libtrust 0

Primitives for identity and authorization

mrunalp/archive 0

self contained version of docker's pkg/archive

mrunalp/buildah 0

A tool which facilitates building OCI images

Pull request review comment opencontainers/runc

Cpu quota fixes

 EOF
     check_systemd_value "TasksMax" 20
 }

-@test "update cgroup v1 cpu limits" {
-    [[ "$ROOTLESS" -ne 0 ]] && requires rootless_cgroup
-    requires cgroups_v1
+function check_cpu_quota() {
+	local quota=$1
+	local period=$2
+	local sd_quota=$3
+
+	if [ "$CGROUP_UNIFIED" = "yes" ]; then
+		if [ "$quota" = "-1" ]; then
+			quota="max"
+		fi
+		check_cgroup_value "cpu.max" "$quota $period"
+		check_systemd_value "CPUQuotaPerSecUSec" $sd_quota
+	else
+		check_cgroup_value "cpu.cfs_quota_us" $quota
+		check_cgroup_value "cpu.cfs_period_us" $period
+		# no systemd support in v1
+	fi
+}

-    # run a few busyboxes detached
-    runc run -d --console-socket $CONSOLE_SOCKET test_update
-    [ "$status" -eq 0 ]
+function check_cpu_shares() {
+	local shares=$1
+
+	if [ "$CGROUP_UNIFIED" = "yes" ]; then
+		local weight=$((1 + ((shares - 2) * 9999) / 262142))
+		check_cgroup_value "cpu.weight" $weight
+		check_systemd_value "CPUWeight" $weight
+	else
+		check_cgroup_value "cpu.shares" $shares
+		check_systemd_value "CPUShares" $shares
+	fi
+}

-    # check that initial values were properly set
-    check_cgroup_value "cpu.cfs_period_us" 1000000
-    check_cgroup_value "cpu.cfs_quota_us" 500000
-    check_systemd_value "CPUQuotaPerSecUSec" 500ms
+@test "update cgroup cpu limits" {
+	[[ "$ROOTLESS" -ne 0 ]] && requires rootless_cgroup

-    check_cgroup_value "cpu.shares" 100
-    check_systemd_value "CPUShares" 100
+	# run a few busyboxes detached
+	runc run -d --console-socket $CONSOLE_SOCKET test_update
+	[ "$status" -eq 0 ]

-    # systemd driver does not allow to update quota and period separately
-    if [ -z "$RUNC_USE_SYSTEMD" ]; then
-        # update cpu period
-        runc update test_update --cpu-period 900000
-        [ "$status" -eq 0 ]
-        check_cgroup_value "cpu.cfs_period_us" 900000
+	# check that initial values were properly set
+	check_cpu_quota 500000 1000000 "500ms"
+	check_cpu_shares 100
+
+	# updating cpu period alone is not allowed
+	runc update test_update --cpu-period 900000
+	[ "$status" -eq 1 ]

-        # update cpu quota
-        runc update test_update --cpu-quota 600000
+	# update cpu quota
+	runc update test_update --cpu-quota 600000
+	[ "$status" -eq 0 ]
+	check_cpu_quota 600000 1000000 "600ms"
+
+	# remove cpu quota

Is there any tool we can run to ensure consistency?

kolyshkin

comment created time in 2 hours

pull request comment opencontainers/runc

Cpu quota fixes

LGTM

kolyshkin

comment created time in 2 hours

pull request comment opencontainers/runc

Fix some cases of swap setting

LGTM

kolyshkin

comment created time in a day

pull request comment openshift/release

Add ds/bpf-sync-ds on build01

This looks fine to me. Will let @rphillips review and give the go-ahead for merge.

hongkailiu

comment created time in 4 days

pull request comment cri-o/cri-o

[1.11] oci: handle timeouts correctly for probes

/lgtm

haircommander

comment created time in 4 days

pull request comment cri-o/cri-o

only set cpu period and quota when both are defined

@kolyshkin ptal

haircommander

comment created time in 4 days

pull request comment cri-o/cri-o

[1.11] conmon: make un-OOM-killable

/lgtm

haircommander

comment created time in 4 days

pull request comment kubernetes/kubernetes

cri-api: Introduce errors package for the CRI

/retest

mrunalp

comment created time in 4 days

push event mrunalp/kubernetes

Mrunal Patel

commit sha ba90b40cd67265927e003d1f9a0f60f399316dde

cri-api: Introduce errors package for the CRI

We start by adding a helper function for IsNotFound errors. The expectation is that CRI implementations return the gRPC "not found" status code for situations where they can't find a container or a pod. This is the lowest-hanging fruit to start improving the kubelet to detect such conditions and react better.

Signed-off-by: Mrunal Patel <mpatel@redhat.com>

view details

push time in 4 days

push event mrunalp/kubernetes

Mrunal Patel

commit sha 840f0d10bd75ea34599dbfb70b29fb20b083e532

cri-api: Introduce errors package for the CRI

We start by adding a helper function for IsNotFound errors. The expectation is that CRI implementations return the gRPC "not found" status code for situations where they can't find a container or a pod. This is the lowest-hanging fruit to start improving the kubelet to detect such conditions and react better.

Signed-off-by: Mrunal Patel <mpatel@redhat.com>

view details

push time in 5 days

push event mrunalp/kubernetes

Mrunal Patel

commit sha 3e8ad7c7058517ce4a7bd1c871a143d96123d993

cri-api: Introduce errors package for the CRI

We start by adding a helper function for IsNotFound errors. The expectation is that CRI implementations return the gRPC "not found" status code for situations where they can't find a container or a pod. This is the lowest-hanging fruit to start improving the kubelet to detect such conditions and react better.

Signed-off-by: Mrunal Patel <mpatel@redhat.com>

view details

push time in 5 days

push event opencontainers/runc

Kir Kolyshkin

commit sha 236ec04599a734f738f3d6e76e1c713ed24f79e2

Dockerfile: speed up criu build

... in case we have more than one CPU, that is.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

view details

Mrunal Patel

commit sha 3c8da9dae0f3e673839d0fe3f0d862b9b2325f1c

Merge pull request #2422 from kolyshkin/criu-j

Dockerfile: speed up criu build

view details

push time in 5 days

PR merged opencontainers/runc

Dockerfile: speed up criu build

Labels: easy-to-review, enhancement

... in case we have more than one CPU, that is.

+1 -1

2 comments

1 changed file

kolyshkin

pr closed time in 5 days

pull request comment opencontainers/runc

Dockerfile: speed up criu build

LGTM

kolyshkin

comment created time in 5 days

push event opencontainers/runtime-spec

Paulo Gomes

commit sha a9f11705d1697b64c1d6427a29b154c04c271182

Add seccomp kill process

Signed-off-by: Paulo Gomes <pjbgf@linux.com>

view details

Mrunal Patel

commit sha 44341cdd36f6fee6ddd73e602f9e3eca1466052f

Merge pull request #1044 from pjbgf/add-seccomp-kill-process

seccomp: Add support for SCMP_ACT_KILL_PROCESS

view details

push time in 5 days

PR merged opencontainers/runtime-spec

seccomp: Add support for SCMP_ACT_KILL_PROCESS

Adds support for SCMP_ACT_KILL_PROCESS, which allows users to kill the entire process when a syscall blocked by seccomp is called.

Signed-off-by: Paulo Gomes pjbgf@linux.com
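For context, a minimal runtime-spec seccomp fragment using the new action might look like this (illustrative only; the syscall name is an arbitrary choice, not taken from the PR):

```json
{
  "seccomp": {
    "defaultAction": "SCMP_ACT_ALLOW",
    "syscalls": [
      {
        "names": ["ptrace"],
        "action": "SCMP_ACT_KILL_PROCESS"
      }
    ]
  }
}
```

Unlike SCMP_ACT_KILL (which kills only the offending thread), SCMP_ACT_KILL_PROCESS terminates the whole process.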

+9 -6

3 comments

3 changed files

pjbgf

pr closed time in 5 days

pull request comment opencontainers/runtime-spec

seccomp: Add support for SCMP_ACT_KILL_PROCESS

LGTM

pjbgf

comment created time in 5 days

push event opencontainers/runtime-spec

Renaud Gaubert

commit sha 6042999e6864615ddf5636583ae901125d59f2c0

Define State for container and runtime namespace

Signed-off-by: Renaud Gaubert <rgaubert@nvidia.com>

view details

Mrunal Patel

commit sha e5487283b900cf5db59f4badfb2e2a6cc0287618

Merge pull request #1045 from RenaudWasTaken/state-pid

Define State for container and runtime namespace

view details

push time in 5 days

PR merged opencontainers/runtime-spec

Define State for container and runtime namespace

Hello!

As a followup of the OCI Weekly Discussion on 5/20/2020, as well as discussions on the runc hooks Pull Request, I've updated the definition of the PID field for the state structure.

We are updating it with the following definition in mind:

  • Runtime namespace hooks get runtime/absolute information
  • Container Namespace hooks get container/relative information

/cc @cyphar @mrunalp @mikebrow

Signed-off-by: Renaud Gaubert rgaubert@nvidia.com
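An illustrative State document (all field values made up) — under this change, a hook running in the runtime namespace sees the absolute pid, while a container-namespace hook sees the container-relative view:

```json
{
  "ociVersion": "1.0.2",
  "id": "mycontainer",
  "status": "running",
  "pid": 4422,
  "bundle": "/run/bundle/mycontainer"
}
```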

+3 -1

3 comments

1 changed file

RenaudWasTaken

pr closed time in 5 days

pull request comment cri-o/cri-o

vendor: update seccomp/containers-golang to v0.4.1

restarted integration on circle

giuseppe

comment created time in 5 days

push event opencontainers/runc

Kir Kolyshkin

commit sha 4fc9fa05dad959fc260b4b5f98c202878ea354c0

tests/int: simplify check_systemd_value use

...so it will be easier to write more tests

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

view details

Kir Kolyshkin

commit sha e4a84bea99926e855d662e130c59a9f6b28106b8

cgroupv2+systemd: set MemoryLow

For some reason, this was not set before. Test case is added by the next commit.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

view details

Kir Kolyshkin

commit sha 7abd93d1565a0db942249c44ddc65916e9d2e74e

tests/integration/update.bats: more systemd checks

1. add missing checks for systemd's MemoryMax / MemoryLimit.
2. add checks for systemd's MemoryLow and MemorySwapMax.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

view details

Kir Kolyshkin

commit sha 06d7c1d2616052974ebb39a403a31e4ff95b9a20

systemd+cgroupv1: fix updating CPUQuotaPerSecUSec

1. do not allow to set quota without period or period without quota, as we won't be able to calculate the new value for CPUQuotaPerSecUSec otherwise.
2. do not ignore setting quota to -1 when a period is not set.
3. update the test case accordingly.

Note that systemd value checks will be added in the next commit.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

view details

Kir Kolyshkin

commit sha 95413ecdb09488595103bd61779a839c3e399c46

tests/int/update: add cgroupv1 systemd CPU checks

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

view details

Kir Kolyshkin

commit sha 59897367c4685b58b80e02ced0f66d4de8f7e9ea

cgroups/systemd: allow to set -1 as pids.limit

Currently, both systemd cgroup drivers (v1 and v2) only set the "TasksMax" unit property if the value > 0, so there is no way to update the limit to -1 / unlimited / infinity / max.

Since the systemd driver is backed by the fs driver, and both fs and fs2 set the limit of -1 properly, it works, but systemd still has the old value:

 # runc --systemd-cgroup update $CT --pids-limit 42
 # systemctl show runc-$CT.scope | grep TasksMax
 TasksMax=42
 # cat /sys/fs/cgroup/system.slice/runc-$CT.scope/pids.max
 42
 # ./runc --systemd-cgroup update $CT --pids-limit -1
 # systemctl show runc-$CT.scope | grep TasksMax=
 TasksMax=42
 # cat /sys/fs/cgroup/system.slice/runc-xx77.scope/pids.max
 max

Fix by changing the condition to allow -1 as a valid value. NOTE other negative values are still being ignored by systemd drivers (as it was done before). I am not sure whether this is correct, or should we return an error.

A test case is added.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

view details

Mrunal Patel

commit sha 6a6ba0c0363865506de0f67bab1fac4350e5d4af

Merge pull request #2423 from kolyshkin/systemd-v2-pids-max

Fix setting some systemd limits, add more tests

view details

push time in 5 days

PR merged opencontainers/runc

Fix setting some systemd limits, add more tests

Labels: area/cgroupv2, area/ci, area/systemd, bug

This fixes a few systemd cgroup bugs for both v1 and v2, and adds tests for systemd limit values (which uncover the bugs being fixed).

In particular, the bug fixes are:

  • allow to set TasksMax=infinity (aka pids.limit: -1) [v1 and v2]
  • set MemoryLow (aka memory.reservation) [v2 only]
  • update CPUQuotaPerSecUSec for some cases [v1 only]

The integration test improvements are:

  • simplify check_systemd_value usage
  • add missing checks for systemd's MemoryMax / MemoryLimit
  • add checks for systemd's MemoryLow and MemorySwapMax
  • add checks for systemd's TasksMax
  • add cgroupv1 systemd CPU checks (CPUShares and CPUQuotaPerSecUSec)
  • add a test that sets pids.limit=-1

Please see individual commits for details.

+104 -39

6 comments

4 changed files

kolyshkin

pr closed time in 5 days

pull request comment opencontainers/runc

Fix setting some systemd limits, add more tests

LGTM

kolyshkin

comment created time in 5 days

pull request comment cri-o/cri-o

test: install conntrack to fix udp test

/test e2e_rhel

haircommander

comment created time in 5 days

pull request comment cri-o/cri-o

test: install conntrack to fix udp test

/lgtm

haircommander

comment created time in 5 days

pull request comment cri-o/cri-o

vendor: update seccomp/containers-golang to v0.4.1

/lgtm

giuseppe

comment created time in 5 days

pull request comment cri-o/cri-o

test: install conntrack to fix udp test

/lgtm

haircommander

comment created time in 5 days

push event mrunalp/fileutils

Kir Kolyshkin

commit sha 356118694b4349258a93b5e4bdf3cad57f0d218b

MkdirAllNewAs: error out if dir exists as file

The os.MkdirAll() function returns a "not a directory" error in case a directory to be created already exists but is not a directory (e.g. a file). The MkdirAllNewAs function does not replicate that behavior. This is a bug, since it is expected to ensure the required directory exists and is indeed a directory, and to return an error otherwise.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

view details

Kir Kolyshkin

commit sha 5f27393c553a4faf32f4143074c5c72092af55d5

idtools: fix MkdirAll usage

This subtle bug keeps lurking in because error checking for `Mkdir()` and `MkdirAll()` is slightly different wrt `EEXIST`/`IsExist`:

- for `Mkdir()`, `IsExist` error should (usually) be ignored (unless you want to make sure the directory was not there before) as it means "the destination directory was already there";
- for `MkdirAll()`, `IsExist` error should NEVER be ignored.

This commit removes ignoring the IsExist error, as it should not be ignored.

For more details, a quote from my runc commit 6f82d4b (July 2015):

TL;DR: check for IsExist(err) after a failed MkdirAll() is both redundant and wrong -- so two reasons to remove it.

Quoting MkdirAll documentation:

> MkdirAll creates a directory named path, along with any necessary
> parents, and returns nil, or else returns an error. If path
> is already a directory, MkdirAll does nothing and returns nil.

This means two things:

1. If a directory to be created already exists, no error is returned.

2. If the error returned is IsExist (EEXIST), it means there exists a non-directory with the same name as MkdirAll needs to use for the directory. Example: we want to MkdirAll("a/b"), but file "a" (or "a/b") already exists, so MkdirAll fails.

The above is a theory, based on the quoted documentation and my UNIX knowledge.

3. In practice, though, the current MkdirAll implementation [1] returns ENOTDIR in most of the cases described in #2, with the exception when there is a race between MkdirAll and someone else creating the last component of the MkdirAll argument as a file. In this very case MkdirAll() will indeed return EEXIST.

Because of #1, an IsExist check after MkdirAll is not needed.

Because of #2 and #3, ignoring the IsExist error is just plain wrong, as the directory we require is not created. It's cleaner to report the error now.

Note this error is all over the tree, I guess due to copy-paste, or trying to follow the same usage pattern as for Mkdir(), or some not quite correct examples on the Internet.

[1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

view details

Mrunal Patel

commit sha abd8a0e7697675e96f030dde7a43359d67369dcb

Merge pull request #5 from kolyshkin/mkall Fixes for MkdirAllAsNew

view details

push time in 5 days

PR merged mrunalp/fileutils

Fixes for MkdirAllAsNew

A couple of fixes; same as https://github.com/moby/moby/pull/35618

1. MkdirAllNewAs: error out if dir exists as file

The os.MkdirAll() function returns a "not a directory" error in case a directory to be created already exists but is not a directory (e.g. a file). The MkdirAllNewAs function does not replicate that behavior.

This is a bug since it is expected to ensure the required directory exists and is indeed a directory, and return an error otherwise.

2. idtools: fix MkdirAll usage

This subtle bug keeps lurking in because error checking for Mkdir() and MkdirAll() is slightly different wrt EEXIST/IsExist:

  • for Mkdir(), IsExist error should (usually) be ignored (unless you want to make sure directory was not there before) as it means "the destination directory was already there";

  • for MkdirAll(), IsExist error should NEVER be ignored.

This commit removes ignoring the IsExist error, as it should not be ignored.

For more details, a quote from my https://github.com/opencontainers/runc/pull/162:


TL;DR: check for IsExist(err) after a failed MkdirAll() is both redundant and wrong -- so two reasons to remove it.

Quoting MkdirAll documentation:

MkdirAll creates a directory named path, along with any necessary parents, and returns nil, or else returns an error. If path is already a directory, MkdirAll does nothing and returns nil.

This means two things:

  1. If a directory to be created already exists, no error is returned.

  2. If the error returned is IsExist (EEXIST), it means there exists a non-directory with the same name as MkdirAll need to use for directory. Example: we want to MkdirAll("a/b"), but file "a" (or "a/b") already exists, so MkdirAll fails.

The above is a theory, based on quoted documentation and my UNIX knowledge.

  3. In practice, though, the current MkdirAll implementation [1] returns ENOTDIR in most of the cases described in #2, with the exception when there is a race between MkdirAll and someone else creating the last component of the MkdirAll argument as a file. In this very case MkdirAll() will indeed return EEXIST.

Because of #1, IsExist check after MkdirAll is not needed.

Because of #2 and #3, ignoring IsExist error is just plain wrong, as directory we require is not created. It's cleaner to report the error now.

Note this error is all over the tree, I guess due to copy-paste, or trying to follow the same usage pattern as for Mkdir(), or some not quite correct examples on the Internet.

[1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go


+7 -2

2 comments

1 changed file

kolyshkin

pr closed time in 5 days

pull request comment mrunalp/fileutils

Fixes for MkdirAllAsNew

LGTM

kolyshkin

comment created time in 5 days

push event opencontainers/runc

Giuseppe Scrivano

commit sha 510c79f9cf6b9df0b74a938769e716122a982b8b

vendor: update runtime-specs to 237cc4f519e

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>

view details

Giuseppe Scrivano

commit sha 41aa19662b6aa05b8ec70962f0c74f6f77098835

libcontainer: honor seccomp errnoRet

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>

view details

Mrunal Patel

commit sha 9a808dd0143ab2dfd1b72615efc772eb4ab44120

Merge pull request #2424 from giuseppe/errno-ret

libcontainer: honor seccomp errnoRet

view details

push time in 5 days

PR merged opencontainers/runc

libcontainer: honor seccomp errnoRet

Signed-off-by: Giuseppe Scrivano gscrivan@redhat.com
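errnoRet lets a seccomp rule return a specific errno instead of the default EPERM. An illustrative runtime-spec fragment (syscall name and errno value chosen arbitrarily; 38 is ENOSYS on Linux):

```json
{
  "syscalls": [
    {
      "names": ["clone3"],
      "action": "SCMP_ACT_ERRNO",
      "errnoRet": 38
    }
  ]
}
```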

+98 -15

2 comments

9 changed files

giuseppe

pr closed time in 5 days

pull request comment opencontainers/runc

libcontainer: honor seccomp errnoRet

LGTM

giuseppe

comment created time in 5 days

pull request comment kubernetes/enhancements

KEP: Custom errors in the CRI

Opened PR - https://github.com/kubernetes/kubernetes/pull/91273

mrunalp

comment created time in 6 days

PR opened kubernetes/kubernetes

cri-api: Introduce errors package for the CRI

What type of PR is this?

/kind feature

What this PR does / why we need it: We start by adding a helper function for IsNotFound errors. The expectation is that CRI implementations return the gRPC "not found" status code for situations where they can't find a container or a pod. This is the lowest-hanging fruit to start improving the kubelet to detect such conditions and react better.

Which issue(s) this PR fixes: None

Does this PR introduce a user-facing change?:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

[KEP]: https://github.com/kubernetes/enhancements/pull/1654 

cc: @derekwaynecarr @dchen1107

+103 -0

0 comments

3 changed files

pr created time in 6 days

create branch mrunalp/kubernetes

branch : cri_errors

created branch time in 6 days

pull request comment opencontainers/runc

fix "libcontainer/cgroups/fs/cpuset.go:63:14: undefined: fmt"

@kolyshkin ptal.

AkihiroSuda

comment created time in 6 days

pull request comment kubernetes/enhancements

KEP: Custom errors in the CRI

Updated. I will open a PR for the cri api later today.

mrunalp

comment created time in 6 days

Pull request review comment kubernetes/enhancements

KEP: Custom errors in the CRI

+---
+title: Custom errors in the CRI
+authors:
+  - "@mrunalp"
+owning-sig: sig-node
+reviewers:
+  - "@derekwaynecarr"
+  - "@dchen1107"
+approvers:
+  - "@derekwaynecarr"
+  - "@dchen1107"
+editor: "@mrunalp"
+creation-date: 2020-03-16
+last-updated: 2020-03-30
+status:
+---
+# Custom errors in the CRI
+<!-- toc -->
+- [Release Signoff Checklist](#release-signoff-checklist)
+- [Summary](#summary)
+- [Motivation](#motivation)
+  - [Goals](#goals)
+  - [Non-Goals](#non-goals)
+- [Proposal](#proposal)
+  - [User Stories](#user-stories)
+    - [Story 1](#story-1)
+    - [Story 2](#story-2)
+  - [Options](#options)
+    - [Option 1: Introduce an error package for the CRI](#option-1-introduce-an-error-package-for-the-cri)
+    - [Option 2: Add additional states for pods and containers](#option-2-add-additional-states-for-pods-and-containers)
+    - [Option 3: Use grpc rpc status and add helper package](#option-3-use-grpc-rpc-status-and-add-helper-package)
+- [Design Details](#design-details)
+  - [Test Plan](#test-plan)
+  - [Graduation Criteria](#graduation-criteria)
+- [Implementation History](#implementation-history)
+<!-- /toc -->
+
+## Release Signoff Checklist
+- [ ] Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
+- [ ] KEP approvers have approved the KEP status as `implementable`
+- [ ] Design details are appropriately documented
+- [ ] Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
+- [ ] Graduation criteria is in place
+- [ ] "Implementation History" section is up-to-date for milestone
+- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
+- [ ] Supporting documentation e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
+
+
+## Summary
+Add custom errors to the CRI for common error scenarios to make the kubelet
+more efficient.
+
+## Motivation
+Kubelet spends a lot of time during termination determining whether a pod or a
+container is still running. Implementing some basic custom errors types for
+atleast the 'not found' case should improve the time it takes for the kubelet
+to tear down.
+
+### Goals
+Introduce custom errors in the CRI to cover the most common error scenarios
+that the kubelet will benefit from.
+
+### Non-Goals
+Introduce a custom error for every possible error that we return from
+the container runtimes.
+
+## Proposal
+
+### User Stories
+
+#### Story 1
+Detect containers and pods not found by the container runtimes, so the kubelet could exit
+early out of some loops.
+
+#### Story 2
+Inform the kubelet to retry an operation after some time if we are waiting for
+some condition to be satisfied.
+
+### Options
+
+#### Option 1: Introduce an error package for the CRI
+```go
+k8s.io/cri-api/pkg/errors
+
+const (
+    ErrNotFound = errors.New("not found")
+    ...
+)
+```
+
+Pros:
+- Easy to add support first class errors for golang.
+
+Cons:
+- No support for container runtime servers implemented in other languages.
+
+#### Option 2: Add additional states for pods and containers
+```
+--- a/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto
++++ b/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto
+@@ -435,6 +435,7 @@ message LinuxPodSandboxStatus {
+ enum PodSandboxState {
+     SANDBOX_READY    = 0;
+     SANDBOX_NOTREADY = 1;
++    SANDBOX_NOTFOUND = 2;
+ }
+
+ // PodSandboxStatus contains the status of the PodSandbox.
+@@ -828,10 +829,11 @@ message RemoveContainerRequest {
+ message RemoveContainerResponse {}
+
+ enum ContainerState {
+-    CONTAINER_CREATED = 0;
+-    CONTAINER_RUNNING = 1;
+-    CONTAINER_EXITED  = 2;
+-    CONTAINER_UNKNOWN = 3;
++    CONTAINER_CREATED  = 0;
++    CONTAINER_RUNNING  = 1;
++    CONTAINER_EXITED   = 2;
++    CONTAINER_UNKNOWN  = 3;
++    CONTAINER_NOTFOUND = 4;
+ }
+```
+
+Pros:
+- Easier to support different programming languages for container runtimes.
+
+Cons:
+- Needs more changes to support errors for the APIs.
+- Requires inspecting objects for errors on the client side which doesn't
+  look idiomatic.
+
+#### Option 3: Use grpc rpc status and add helper package
+```

We will use the error codes from grpc with a wrapper helper package in the CRI API.

mrunalp

comment created time in 6 days

push event mrunalp/enhancements

Pablo Castellano

commit sha ea7dd8f0c9c70062aa74b5fdf472a3bb3959cdc1

Removed tab

Tabs are shown in red by the github markdown renderer

view details

colstuwjx

commit sha 28e8137920022553bd617dbfbd73a81b28a0430c

Correct max endpoints per slice flag.

view details

Zihong Zheng

commit sha fa614a77e0f5ecdeefa4ad1111cdfe5292a7ec2a

Update service finalizer KEP to implemented

view details

Nick Young

commit sha 2d03ef544dae1e962c83420c6d2115cd753de339

Add Kubernetes Yearly Support Period KEP

Signed-off-by: Nick Young <ynick@vmware.com>

view details

Nick Young

commit sha 3729e4fb0b9a6afa10678aabd6d4fd1a1cd08acd

Move KEP to sig-release

view details

Nick Young

commit sha 6185e84ef0758db6efa9049b5086edaa60afb1ed

Update timeframe from 1.18 to 1.19

view details

Antonio Ojea

commit sha f82b250215dd355b70ac341c55da16ccaa722742

Update DualStack KEP

Add section for the different types of services. Headless services weren't considered before. Remove the need to bind on multiple ip addresses. Be more restrictive using comma separated strings with CIDRs.

view details

Nick Young

commit sha 888ac59c96c924ce6803c876ca033698d6c246cc

Update with feedback on test process changes

view details

Josh Berkus

commit sha c1e2ca61dce02c8f137a676ab037a789450ed728

Straighten out documentation and testing sections for the KEP.

Add some new test plan steps, and documentation locations. Delete some duplicate text.

view details

Nick Young

commit sha 780421876c4aa0fa8da8c1dafabe719329e8d5e5

Merge pull request #1 from jberkus/one-year-support-window

Straighten out documentation and testing sections for the KEP.

view details

Nick Young

commit sha db3f7c042b3d23232a8168c7c3efc3d693437c5f

Update table of contents

view details

jay vyas

commit sha 06b640a88617a7a151eb367e26b38dcbdfd0f42e

Create 20200204-cni-verification-rearchitecture.md

view details

jay vyas

commit sha 418e382a84368a3fdb2d077885ae19f0950f3e35

Update 20200204-cni-verification-rearchitecture.md

view details

jay vyas

commit sha 97d7cebd4432acb14e6860b2f5dfba990ec803f4

Update 20200204-cni-verification-rearchitecture.md

view details

jay vyas

commit sha e834f3f1bf6ade553507e64628b51851e1e5e384

move it to the right directory

view details

jay vyas

commit sha b1750ab89d6d7bce3bf59e89f9df7435a4e9e507

update contents to latest KEP from other repo

view details

Matt Fenwick

commit sha 395292e1d526dec15641f2411baa4ae182b1dfe3

cleanup

view details

Matt Fenwick

commit sha f2b98329839f7be8746dd19fc56f87cee1def0df

more cleanup

view details

Cria Hu

commit sha acbb7aa2c467957684d1dae90009c24bf861b957

fix broken link : https://github.com/kubernetes/community/blob/master/contributors/devel/instrumentation.md

view details

jay vyas

commit sha 8d4561f365528d1c46c484d54dd9097714b08c89

Update 20200204-cni-verification-rearchitecture.md

view details

push time in 6 days

pull request comment google/cadvisor

libcontainer, cgroupv2: adapt to new API

It is going to take some time to establish a release cadence in runc that lets us iterate quickly enough on cgroups v2. I propose that we keep updating runc and then pin to a release, which gives us a way to keep making progress on cgroups v2.

giuseppe

comment created time in 6 days

push event opencontainers/runc

Kir Kolyshkin

commit sha f160352682f4518c8356ba394e5f147822a1eb74

libct/cgroup: prep to rm GetClosestMountpointAncestor

This function is not very efficient, does not really belong to the cgroup package, and is only used once (from fs/cpuset.go). Prepare to remove it by replacing it with an implementation based on the parser from github.com/moby/sys/mountinfo. This commit is here to make sure the proposed replacement passes the unit test.

Funny, but the unit test needs to be slightly modified since it supplies the wrong mountinfo (space as the first character, empty line at the end).

Validated by

 $ go test -v -run Ance
 === RUN TestGetClosestMountpointAncestor
 --- PASS: TestGetClosestMountpointAncestor (0.00s)
 PASS
 ok github.com/opencontainers/runc/libcontainer/cgroups 0.002s

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

view details

Kir Kolyshkin

commit sha 2db3240f35b0f082d5129b52a0622b303cb4ccd1

libct/cgroups: rm GetClosestMountpointAncestor

The function GetClosestMountpointAncestor is not very efficient, does not really belong to the cgroup package, and is only used once (from fs/cpuset.go). Remove it, replacing it with an implementation based on the moby/sys/mountinfo parser.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

view details

Mrunal Patel

commit sha 53a46497768e7d55ceb18c4edf277ff5911d54b0

Merge pull request #2401 from kolyshkin/fs-cpuset-mountinfo

libct/cgroup: rm GetClosestMountpointAncestor using moby/sys/mountinfo parser

view details

push time in 7 days

PR merged opencontainers/runc

libct/cgroup: rm GetClosestMountpointAncestor using moby/sys/mountinfo parser

Labels: kind/refactor

The function GetClosestMountpointAncestor is not very efficient, does not really belong to the cgroup package, and is only used once (from fs/cpuset.go).

The removal is done in two stages/commits:

1

Prepare to remove it by replacing with the implementation based on the github.com/moby/sys/mountinfo parser.

This commit is here to make sure the proposed replacement passes the unit test.

Funny, but the unit test needs to be slightly modified since it supplies the wrong mountinfo (space as the first character, empty line at the end).

Validated by

 $ go test -v -run Ance
 === RUN   TestGetClosestMountpointAncestor
 --- PASS: TestGetClosestMountpointAncestor (0.00s)
 PASS
 ok  	github.com/opencontainers/runc/libcontainer/cgroups	0.002s

2

Actually remove GetClosestMountpointAncestor, move the replacement code to libcontainer/cgroups/fs/cpuset.go.

+25 -46

6 comments

3 changed files

kolyshkin

pr closed time in 7 days

push event opencontainers/runc

Kir Kolyshkin

commit sha ca1d135bd4f9e7fc2f14d52ac28e8fa15efcc4ec

runc checkpoint: fix --status-fd to accept fd

1. The command `runc checkpoint --lazy-server --status-fd $FD` actually accepts a file name as an $FD. Make it accept a file descriptor, like its name implies and the documentation states.

In addition, since runc itself does not use the result of the CRIU status fd, remove the code which relays it, and pass the FD directly to CRIU.

Note 1: runc should close this file descriptor itself after passing it to criu, otherwise whoever waits on it might wait forever.

Note 2: due to the way criu swrk consumes the fd (it reopens /proc/$SENDER_PID/fd/$FD), runc can't close it as soon as criu swrk has started. There is no good way to know when criu swrk has reopened the fd, so we assume that as soon as we have received something back, the fd is already reopened.

2. Since the meaning of --status-fd has changed, the test case using it needs to be fixed as well. Modify the lazy migration test to remove "sleep 2", actually waiting for the lazy page server to be ready. While at it,

- remove the double fork (using shell's background process is sufficient here);
- check the exit code for "runc checkpoint" and "criu lazy-pages";
- remove the check for no errors in dump.log after restore, as we are already checking its exit code.

[v2: properly close status fd after spawning criu]
[v3: move close status fd to after the first read]

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

view details

Mrunal Patel

commit sha 825e91ada6bd0055b595cb70f41d5dd7267f48ba

Merge pull request #2341 from kolyshkin/test-cpt-lazy runc checkpoint: fix --status-fd to accept fd

view details

push time in 7 days

PR merged opencontainers/runc

runc checkpoint: fix --status-fd to accept fd area/checkpoint-restore area/ci

1

The command runc checkpoint --lazy-server --status-fd $FD actually accepts a file name as an $FD. Make it accept a file descriptor, like its name implies and the documentation states.

In addition, since runc itself does not use the result of CRIU status fd, remove the code which relays it, and pass the FD directly to CRIU.

Note 1: runc should close this file descriptor itself after passing it to criu, otherwise whoever waits on it might wait forever.

Note 2: due to the way criu swrk consumes the fd (it reopens /proc/$SENDER_PID/fd/$FD), runc can't close it as soon as criu swrk has started. There is no good way to know when criu swrk has reopened the fd, so we assume that as soon as we have received something back, the fd is already reopened.
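The core of the fix — treating the --status-fd argument as a numeric file descriptor rather than a file name — can be sketched like this (hypothetical helper, not runc's actual code):

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// parseStatusFD treats its argument as a file descriptor number, as the
// flag name implies, and wraps it in an *os.File so it can be handed to
// CRIU and closed after the first response is read back.
func parseStatusFD(arg string) (*os.File, error) {
	fd, err := strconv.Atoi(arg)
	if err != nil {
		return nil, fmt.Errorf("--status-fd must be a file descriptor: %w", err)
	}
	return os.NewFile(uintptr(fd), "status-fd"), nil
}

func main() {
	// fd 1 is stdout, so an argument of "1" must now be accepted.
	f, err := parseStatusFD("1")
	if err != nil {
		panic(err)
	}
	fmt.Println(f.Fd()) // 1
}
```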

2

Since the meaning of --status-fd has changed, the test case using it needs to be fixed as well.

Modify the lazy migration test to remove "sleep 2", actually waiting for the lazy page server to be ready.

While at it:

  • remove the double fork (using shell's background process is sufficient here);

  • check the exit code for "runc checkpoint" and "criu lazy-pages";

  • remove the check for no errors in dump.log after restore, as we are already checking its exit code.

History

[v2: properly close status fd after spawning criu] [v3: move close status fd to after the first read]

+56 -64

13 comments

5 changed files

kolyshkin

pr closed time in 7 days

pull request commentopencontainers/runc

runc checkpoint: fix --status-fd to accept fd

LGTM

kolyshkin

comment created time in 7 days

push eventopencontainers/runc

lifubang

commit sha 9ad1beb40fe6867ef1fe5bd8d536e2baf6003831

never write empty string to memory.swap.max Because the empty string means set swap to 0. Signed-off-by: lifubang <lifubang@acmcoder.com>

view details

Mrunal Patel

commit sha 67fac528d0b6a8c6b586da3e05b4193a2cdeba37

Merge pull request #2410 from lifubang/swap0patch cgroupv2: never write empty string to memory.swap.max

view details

push time in 7 days

PR merged opencontainers/runc

cgroupv2: never write empty string to memory.swap.max area/cgroupv2 bug

Never write an empty string to memory.swap.max, because an empty string means setting swap to 0.

For example: with memory limit:

"memory": {
    "limit": 33554432,
    "swap": 33558528
}

The value of memory.swap.max is 4096. After we run runc update --memory 33554432 test, the value of memory.swap.max becomes 0. This is a regression introduced by #2370. I'm so sorry.
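The arithmetic behind the 4096: in the OCI spec, "swap" is memory+swap, while cgroup v2's memory.swap.max holds swap alone, so the written value is swap minus limit (33558528 - 33554432 = 4096). A hedged sketch of that conversion, with skip-the-write instead of emitting an empty string (illustrative helper, not the exact runc code):

```go
package main

import "fmt"

// convertMemorySwap converts an OCI-style memory+swap value into the
// string to write to cgroup v2's memory.swap.max. The second return
// value says whether to write at all: when nothing was requested we
// must skip the write, rather than write "" (which would set swap to 0).
func convertMemorySwap(limit, swap int64) (string, bool) {
	switch {
	case swap == 0:
		return "", false // unset: leave memory.swap.max alone
	case swap == -1:
		return "max", true // unlimited
	default:
		return fmt.Sprintf("%d", swap-limit), true
	}
}

func main() {
	v, ok := convertMemorySwap(33554432, 33558528)
	fmt.Println(v, ok) // 4096 true
}
```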

Signed-off-by: lifubang lifubang@acmcoder.com

+5 -2

3 comments

1 changed file

lifubang

pr closed time in 7 days

pull request commentopencontainers/runc

cgroupv2: never write empty string to memory.swap.max

LGTM

lifubang

comment created time in 7 days

Pull request review commentopenshift/enhancements

[WIP] Mount another image's filesystem to a container

+---+title: mount-image-filesystem+authors:+  - "@jwforres"+reviewers:+  - TBD+  - "@alicedoe"+approvers:+  - TBD+  - "@oscardoe"+creation-date: 2020-05-11+last-updated: 2020-05-11+status: provisional|implementable(?)+see-also:+replaces:+superseded-by:+---++# Mount Image Filesystem++## Release Signoff Checklist++- [ ] Enhancement is `implementable`+- [ ] Design details are appropriately documented from clear requirements+- [ ] Test plan is defined+- [ ] Graduation criteria for dev preview, tech preview, GA+- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)++## Open Questions [optional]++1. Should it be possible to swap out an image mount while a container is running like we do for ConfigMaps and Secrets when their data changes? Example, my large file changed and now I have a new image available and I want to hot swap that file. Unlike a ConfigMap or Secret who's reference doesn't change on the container spec, to make this possible for image mounts the container would have to now reference a new image pullspec.++## Summary++A running container in a Pod on OpenShift will be able to directly mount a read-only filesystem from another image.++## Motivation++There are many situations where it is beneficial to ship the main runtime image separately from a large binary file that will be used by the application at runtime. Putting this large binary inside another image makes it easy to use existing image pull/push semantics to move content around. This pattern is used frequently, but in order to make the content available to the runtime image it must be copied from an initContainer into the shared filesystem of the Pod. For very large files this creates a significant startup cost while copying. 
It also requires needlessly running the image containing the binary content for the sole purpose of moving the data.++This enhancement proposes allowing the image's filesystem to be directly mounted to the container of the main runtime image. This eliminates the need for file copying within the Pod and gives the runtime container's processes access to the data immediately.++### Goals++The goal of this proposal is to work through the details of a functional prototype within containers/storage, containers/libpod, and/or a CSI driver that can back the ephemeral CSI volume.++### Non-Goals++---++## Proposal++### User Stories [optional]++#### Story 1+As a user deploying an application on OpenShift I can specify the pullspec of an image as a read-only Volume mount for my Pod.++#### Story 2+As a user deploying an application on OpenShift I can specify the pullspec of an image as a read-only Volume mount for my Pod and restrict the mount to a path within the image's filesystem.++### Implementation Details/Notes/Constraints [optional]++For the CSI driver there is some previous work in this space that it may be possible to build on: https://github.com/kubernetes-csi/csi-driver-image-populator

This is interesting.

jwforres

comment created time in 7 days

Pull request review commentopenshift/enhancements

[WIP] Mount another image's filesystem to a container

+---+title: mount-image-filesystem+authors:+  - "@jwforres"+reviewers:+  - TBD+  - "@alicedoe"+approvers:+  - TBD+  - "@oscardoe"+creation-date: 2020-05-11+last-updated: 2020-05-11+status: provisional|implementable(?)+see-also:+replaces:+superseded-by:+---++# Mount Image Filesystem++## Release Signoff Checklist++- [ ] Enhancement is `implementable`+- [ ] Design details are appropriately documented from clear requirements+- [ ] Test plan is defined+- [ ] Graduation criteria for dev preview, tech preview, GA+- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)++## Open Questions [optional]++1. Should it be possible to swap out an image mount while a container is running like we do for ConfigMaps and Secrets when their data changes? Example, my large file changed and now I have a new image available and I want to hot swap that file. Unlike a ConfigMap or Secret who's reference doesn't change on the container spec, to make this possible for image mounts the container would have to now reference a new image pullspec.++## Summary++A running container in a Pod on OpenShift will be able to directly mount a read-only filesystem from another image.++## Motivation++There are many situations where it is beneficial to ship the main runtime image separately from a large binary file that will be used by the application at runtime. Putting this large binary inside another image makes it easy to use existing image pull/push semantics to move content around. This pattern is used frequently, but in order to make the content available to the runtime image it must be copied from an initContainer into the shared filesystem of the Pod. For very large files this creates a significant startup cost while copying. 
It also requires needlessly running the image containing the binary content for the sole purpose of moving the data.++This enhancement proposes allowing the image's filesystem to be directly mounted to the container of the main runtime image. This eliminates the need for file copying within the Pod and gives the runtime container's processes access to the data immediately.++### Goals++The goal of this proposal is to work through the details of a functional prototype within containers/storage, containers/libpod, and/or a CSI driver that can back the ephemeral CSI volume.++### Non-Goals++---++## Proposal++### User Stories [optional]++#### Story 1+As a user deploying an application on OpenShift I can specify the pullspec of an image as a read-only Volume mount for my Pod.++#### Story 2+As a user deploying an application on OpenShift I can specify the pullspec of an image as a read-only Volume mount for my Pod and restrict the mount to a path within the image's filesystem.++### Implementation Details/Notes/Constraints [optional]++For the CSI driver there is some previous work in this space that it may be possible to build on: https://github.com/kubernetes-csi/csi-driver-image-populator++The CSI driver must not pull images into the same image filesystem as the one the kubelet uses, otherwise the image will be garbage collected by the kubelet even though its filesytem is in use by a container.

We could instantiate a separate image store but then we will probably need some controller to gc the images in that store. If we end up going that path, it will be useful for the use case of a buildah build cache as well. cc: @nalind

jwforres

comment created time in 7 days

Pull request review commentopenshift/enhancements

[WIP] Mount another image's filesystem to a container

+---+title: mount-image-filesystem+authors:+  - "@jwforres"+reviewers:+  - TBD+  - "@alicedoe"+approvers:+  - TBD+  - "@oscardoe"+creation-date: 2020-05-11+last-updated: 2020-05-11+status: provisional|implementable(?)+see-also:+replaces:+superseded-by:+---++# Mount Image Filesystem++## Release Signoff Checklist++- [ ] Enhancement is `implementable`+- [ ] Design details are appropriately documented from clear requirements+- [ ] Test plan is defined+- [ ] Graduation criteria for dev preview, tech preview, GA+- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)++## Open Questions [optional]++1. Should it be possible to swap out an image mount while a container is running like we do for ConfigMaps and Secrets when their data changes? Example, my large file changed and now I have a new image available and I want to hot swap that file. Unlike a ConfigMap or Secret who's reference doesn't change on the container spec, to make this possible for image mounts the container would have to now reference a new image pullspec.++## Summary++A running container in a Pod on OpenShift will be able to directly mount a read-only filesystem from another image.++## Motivation++There are many situations where it is beneficial to ship the main runtime image separately from a large binary file that will be used by the application at runtime. Putting this large binary inside another image makes it easy to use existing image pull/push semantics to move content around. This pattern is used frequently, but in order to make the content available to the runtime image it must be copied from an initContainer into the shared filesystem of the Pod. For very large files this creates a significant startup cost while copying. It also requires needlessly running the image containing the binary content for the sole purpose of moving the data.

Could we add a more concrete example, please? :)

jwforres

comment created time in 7 days

Pull request review commentopenshift/enhancements

[WIP] Mount another image's filesystem to a container

+---+title: mount-image-filesystem+authors:+  - "@jwforres"+reviewers:+  - TBD+  - "@alicedoe"+approvers:+  - TBD+  - "@oscardoe"+creation-date: 2020-05-11+last-updated: 2020-05-11+status: provisional|implementable(?)+see-also:+replaces:+superseded-by:+---++# Mount Image Filesystem++## Release Signoff Checklist++- [ ] Enhancement is `implementable`+- [ ] Design details are appropriately documented from clear requirements+- [ ] Test plan is defined+- [ ] Graduation criteria for dev preview, tech preview, GA+- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)++## Open Questions [optional]++1. Should it be possible to swap out an image mount while a container is running like we do for ConfigMaps and Secrets when their data changes? Example, my large file changed and now I have a new image available and I want to hot swap that file. Unlike a ConfigMap or Secret who's reference doesn't change on the container spec, to make this possible for image mounts the container would have to now reference a new image pullspec.

I think this depends on how we envision exposing this in a pod spec and the trigger that we can use to update the mount for the image. I think we want something like this:

apiVersion: v1
kind: Pod
metadata:
  name: test-image-volume
spec:
  containers:
  - image: quay.io/fedora:32
    name: test-container
    volumeMounts:
      - mountPath: /data
        name: test-volume
  volumes:
  - name: test-volume
    imagePath:
      image: quay.io/mydata:1.3.0
      # optional field that specifies what subpath to mount.
      subPath: /image/subpath/to/mount

There could be a controller that periodically watches the image and then updates it as needed. What kind of latency will be acceptable for an update for our use cases?

jwforres

comment created time in 7 days

pull request commentopenshift/enhancements

Add enhancement proposal for selinux-operator

@JAORMX It went into container-selinux in https://github.com/containers/container-selinux/commit/b321ea4107bae3eb73859031467f2416ddc0b28f and follow-on PRs. I am more wondering here whether that should be split out into an optional policy that we only install on kata nodes. Thanks!

JAORMX

comment created time in 7 days

pull request commentopenshift/enhancements

Add enhancement proposal for selinux-operator

@rhatdan recently added support for kata containers to SELinux. @rhatdan do you think it may be worth only installing that policy on nodes where we end up installing kata, using whatever we end up with from this proposal?

JAORMX

comment created time in 8 days

pull request commentopenshift/enhancements

Add enhancement proposal for selinux-operator

cc: @rhatdan

JAORMX

comment created time in 8 days

pull request commentcri-o/cri-o

oci: always have conmon log to syslog

With systemd you could run journalctl -u crio-conmon-<id>.scope and get the logs, but with the cgroupfs manager there is no way to get those logs, hence it was done that way. That said, once the scope is gone, you can't see the logs in systemd anymore either.

haircommander

comment created time in 10 days

pull request commentcri-o/cri-o

[1.16] unmount network namespace on remove

/lgtm

korivka

comment created time in 11 days

Pull request review commentcri-o/cri-o

Introduce state lock for finer grained access control

 func (c *Container) Metadata() *pb.ContainerMetadata {  // State returns the state of the running container func (c *Container) State() *ContainerState {-	c.opLock.RLock()-	defer c.opLock.RUnlock()-	return c.state+	c.stateLock.RLock()+	defer c.stateLock.RUnlock()+	if c.state != nil {+		cs := *c.state+		return &cs+	}+	return nil } -// StateNoLock returns the state of a container without using a lock.-func (c *Container) StateNoLock() *ContainerState {-	return c.state+func (c *Container) Pid() int {+	c.stateLock.RLock()

The pid also shouldn't change after we set it for the container. We do need to be careful, though, around the runc state calls that return stopped status.

tedyu

comment created time in 12 days
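The pattern under review — a dedicated RWMutex for container state, with accessors returning a copy so callers can't race on in-place mutation — can be sketched minimally as follows (illustrative types, not cri-o's actual ones):

```go
package main

import (
	"fmt"
	"sync"
)

// ContainerState holds mutable runtime state; Container guards it with
// its own stateLock, separate from the coarser operation lock.
type ContainerState struct{ Pid int }

type Container struct {
	stateLock sync.RWMutex
	state     *ContainerState
}

// State returns a copy of the state under a read lock, so the caller
// cannot mutate the shared struct.
func (c *Container) State() *ContainerState {
	c.stateLock.RLock()
	defer c.stateLock.RUnlock()
	if c.state != nil {
		cs := *c.state
		return &cs
	}
	return nil
}

func (c *Container) Pid() int {
	c.stateLock.RLock()
	defer c.stateLock.RUnlock()
	return c.state.Pid
}

func main() {
	c := &Container{state: &ContainerState{Pid: 42}}
	s := c.State()
	s.Pid = 0 // mutating the copy does not affect the container
	fmt.Println(c.Pid()) // 42
}
```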

Pull request review commentcri-o/cri-o

Introduce state lock for finer grained access control

 func (c *Container) StatePath() string {  // CreatedAt returns the container creation time func (c *Container) CreatedAt() time.Time {+	c.stateLock.RLock()

We don't need a read lock for this since this is set as part of NewContainer and then never modified.

tedyu

comment created time in 12 days

pull request commentcri-o/cri-o

[1.18] bump to v1.18.1

/lgtm

haircommander

comment created time in 12 days

pull request commentcri-o/cri-o

Update nix image and dependencies

/lgtm

saschagrunert

comment created time in 12 days

pull request commentcri-o/cri-o

crio wipe: log less

/lgtm

haircommander

comment created time in 12 days

pull request commentcontainers/image

blob-copy detection

:+1: to reviving this

vrothberg

comment created time in 12 days

pull request commentopencontainers/runtime-spec

MAINTAINERS: Add @cyphar as maintainer

LGTM

giuseppe

comment created time in 12 days

pull request commentopencontainers/runc

Release rc11

As @derekwaynecarr pointed out, kubernetes has relied on the libcontainer cgroups library for a long time now. @giuseppe is helping to make cgroups v2 a reality in k8s, and @kolyshkin has been refactoring and improving the cgroups code in libcontainer and runc. I think we should find a path to quick iteration on testing cgroups v2 in k8s, as it gives us a valuable feedback loop to improve cgroups v2 support in runc. We can't possibly assume we have fixed everything in cgroups v2 without that feedback. So, I think it makes sense for us to cut rc releases frequently to help facilitate updates in k8s. Once we get enough signal from there, we can start thinking of factoring out libcontainer/cgroups as a separate library.

mrunalp

comment created time in 12 days

pull request commentcontainers/storage

new interface for MountImage added

@kunalkushwaha Could you rebase this?

kunalkushwaha

comment created time in 12 days

pull request commentopenshift/enhancements

[WIP] Mount another image's filesystem to a container

I will make a pass tomorrow. Thanks!

On May 12, 2020, at 7:31 PM, Jessica Forrester notifications@github.com wrote:

@mrunalp this is the proposal we chatted about, if there are any gaps / better detail that you want to fill in

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

jwforres

comment created time in 13 days

push eventopencontainers/runc

Kir Kolyshkin

commit sha f0daf65100dae51ff31760c0f1f184ff2adcb33e

Vagrantfile: use criu from stable repo CRIU 3.14 has made its way to the F32 stable repo, let's use it. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

view details

Mrunal Patel

commit sha df3d7f673aff8344c2b9d0747501f1002af136bb

Merge pull request #2393 from kolyshkin/criu-pi Vagrantfile: use criu from stable repo

view details

push time in 13 days

PR merged opencontainers/runc

Vagrantfile: use criu from stable repo area/ci easy-to-review

CRIU 3.14 has made its way to the F32 stable repo, let's use it.

+1 -4

3 comments

1 changed file

kolyshkin

pr closed time in 13 days

pull request commentopencontainers/runc

Vagrantfile: use criu from stable repo

LGTM

kolyshkin

comment created time in 13 days

pull request commentopencontainers/runc

Dockerfile: bump bats to 1.2.0

LGTM

kolyshkin

comment created time in 13 days

pull request commentkubernetes/kubernetes

Separate out cri related code into self contained package

@dims Thanks for opening a PR with your analysis.

dims

comment created time in 13 days

pull request commentkubernetes/kubernetes

Separate out cri related code into self contained package

This is what we have in the crio go.mod file today:

require (
        ...
        k8s.io/apimachinery v0.17.4
        k8s.io/client-go v0.0.0
        k8s.io/cri-api v0.0.0
        k8s.io/klog v1.0.0
        k8s.io/kubernetes v1.18.1
        k8s.io/release v0.3.0
        k8s.io/utils v0.0.0-20200327001022-6496210b90e8
        ...
)

replace (
        k8s.io/api => k8s.io/kubernetes/staging/src/k8s.io/api v0.0.0-20200505125908-b48f5af2602b
        k8s.io/apiextensions-apiserver => k8s.io/kubernetes/staging/src/k8s.io/apiextensions-apiserver v0.0.0-20200505125908-b48f5af2602b
        k8s.io/apimachinery => k8s.io/kubernetes/staging/src/k8s.io/apimachinery v0.0.0-20200505125908-b48f5af2602b
        k8s.io/apiserver => k8s.io/kubernetes/staging/src/k8s.io/apiserver v0.0.0-20200505125908-b48f5af2602b
        k8s.io/cli-runtime => k8s.io/kubernetes/staging/src/k8s.io/cli-runtime v0.0.0-20200505125908-b48f5af2602b
        k8s.io/client-go => k8s.io/kubernetes/staging/src/k8s.io/client-go v0.0.0-20200505125908-b48f5af2602b
        k8s.io/cloud-provider => k8s.io/kubernetes/staging/src/k8s.io/cloud-provider v0.0.0-20200505125908-b48f5af2602b
        k8s.io/cluster-bootstrap => k8s.io/kubernetes/staging/src/k8s.io/cluster-bootstrap v0.0.0-20200505125908-b48f5af2602b
        k8s.io/code-generator => k8s.io/kubernetes/staging/src/k8s.io/code-generator v0.0.0-20200505125908-b48f5af2602b
        k8s.io/component-base => k8s.io/kubernetes/staging/src/k8s.io/component-base v0.0.0-20200505125908-b48f5af2602b
        k8s.io/cri-api => k8s.io/kubernetes/staging/src/k8s.io/cri-api v0.0.0-20200505125908-b48f5af2602b
        k8s.io/csi-translation-lib => k8s.io/kubernetes/staging/src/k8s.io/csi-translation-lib v0.0.0-20200505125908-b48f5af2602b
        k8s.io/kube-aggregator => k8s.io/kubernetes/staging/src/k8s.io/kube-aggregator v0.0.0-20200505125908-b48f5af2602b
        k8s.io/kube-controller-manager => k8s.io/kubernetes/staging/src/k8s.io/kube-controller-manager v0.0.0-20200505125908-b48f5af2602b
        k8s.io/kube-proxy => k8s.io/kubernetes/staging/src/k8s.io/kube-proxy v0.0.0-20200505125908-b48f5af2602b
        k8s.io/kube-scheduler => k8s.io/kubernetes/staging/src/k8s.io/kube-scheduler v0.0.0-20200505125908-b48f5af2602b
        k8s.io/kubectl => k8s.io/kubernetes/staging/src/k8s.io/kubectl v0.0.0-20200505125908-b48f5af2602b
        k8s.io/kubelet => k8s.io/kubernetes/staging/src/k8s.io/kubelet v0.0.0-20200505125908-b48f5af2602b
        k8s.io/kubernetes => k8s.io/kubernetes v1.19.0-alpha.3
        k8s.io/legacy-cloud-providers => k8s.io/kubernetes/staging/src/k8s.io/legacy-cloud-providers v0.0.0-20200505125908-b48f5af2602b
        k8s.io/metrics => k8s.io/kubernetes/staging/src/k8s.io/metrics v0.0.0-20200505125908-b48f5af2602b
        k8s.io/sample-apiserver => k8s.io/kubernetes/staging/src/k8s.io/sample-apiserver v0.0.0-20200505125908-b48f5af2602b
)
dims

comment created time in 13 days

pull request commentcri-o/cri-o

[1.18] fix naming test panic

/test e2e_fedora

kolyshkin

comment created time in 13 days

pull request commentcri-o/cri-o

[1.18] fix naming test panic

/lgtm

kolyshkin

comment created time in 13 days

pull request commentcri-o/cri-o

[1.18] version: small tweaks

/lgtm

haircommander

comment created time in 13 days

pull request commentopencontainers/runtime-spec

seccomp: fix go-specs for errnoRet

LGTM

giuseppe

comment created time in 13 days

pull request commentcri-o/cri-o

Pin runc for CI

/lgtm

kolyshkin

comment created time in 13 days

create barnchprojectatomic/runc

branch : rhaos-4.5

created branch time in 13 days

push eventopencontainers/runc

Kir Kolyshkin

commit sha 714c91e9f73a1512808476eb532b4aa36bbb7530

Simplify cgroup path handing in v2 via unified API This unties the Gordian Knot of using GetPaths in cgroupv2 code. The problem is, the current code uses GetPaths for three kinds of things: 1. Get all the paths to cgroup v1 controllers to save its state (see (*linuxContainer).currentState(), (*LinuxFactory).loadState() methods). 2. Get all the paths to cgroup v1 controllers to have the setns process enter the proper cgroups in `(*setnsProcess).start()`. 3. Get the path to a specific controller (for example, `m.GetPaths()["devices"]`). Now, for cgroup v2 instead of a set of per-controller paths, we have only one single unified path, and a dedicated function `GetUnifiedPath()` to get it. This discrepancy between v1 and v2 cgroupManager API leads to the following problems with the code: - multiple if/else code blocks that have to treat v1 and v2 separately; - backward-compatible GetPaths() methods in v2 controllers; - - repeated writing of the PID into the same cgroup for v2; Overall, it's hard to write the right code with all this, and the code that is written is kinda hard to follow. The solution is to slightly change the API to do the 3 things outlined above in the same manner for v1 and v2: 1. Use `GetPaths()` for state saving and setns process cgroups entering. 2. Introduce and use Path(subsys string) to obtain a path to a subsystem. For v2, the argument is ignored and the unified path is returned. This commit converts all the controllers to the new API, and modifies all the users to use it. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

view details

Mrunal Patel

commit sha 867c9f5bc417a85ca0e5a80412b55961a4f20352

Merge pull request #2386 from kolyshkin/gordian-knot Simplify cgroup paths handling in v2 via unified v1/v2 API

view details

push time in 14 days

PR merged opencontainers/runc

Simplify cgroup paths handling in v2 via unified v1/v2 API area/cgroupv2 kind/refactor

This unties the Gordian Knot of using GetPaths in cgroupv2 code.

The problem is, the current code uses GetPaths for three kinds of things:

  1. Get all the paths to cgroup v1 controllers to save its state (see (*linuxContainer).currentState(), (*LinuxFactory).loadState() methods).

  2. Get all the paths to cgroup v1 controllers to have the setns process enter the proper cgroups in (*setnsProcess).start().

  3. Get the path to a specific controller (for example, m.GetPaths()["devices"]).

Now, for cgroup v2 instead of a set of per-controller paths, we have only one single unified path, and a dedicated function GetUnifiedPath() to get it.

This discrepancy between v1 and v2 cgroupManager API leads to the following problems with the code:

  • multiple if/else code blocks that have to treat v1 and v2 separately;

  • backward-compatible GetPaths() methods in v2 controllers;

  • repeated writing of the PID into the same cgroup for v2;

Overall, it's hard to write the right code with all this, and the code that is written is kinda hard to follow.

The solution is to slightly change the API to do the 3 things outlined above in the same manner for v1 and v2:

  1. Use GetPaths() for state saving and setns process cgroups entering.

  2. Introduce and use Path(subsys string) to obtain a path to a subsystem. For v2, the argument is ignored and the unified path is returned.

This commit converts all the controllers to the new API, and modifies all the users to use it.
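The shape of the unified API can be sketched like this — Path(subsys) hides the v1/v2 difference behind one method (illustrative types only, not runc's actual cgroup Manager interface):

```go
package main

import "fmt"

// v1Manager keeps a per-controller path map, as cgroup v1 requires.
type v1Manager struct{ paths map[string]string }

// v2Manager has a single unified path for all controllers.
type v2Manager struct{ unifiedPath string }

// Path returns the path for the requested v1 controller.
func (m *v1Manager) Path(subsys string) string { return m.paths[subsys] }

// Path ignores its argument for v2: there is only the unified path.
func (m *v2Manager) Path(_ string) string { return m.unifiedPath }

func main() {
	v1 := &v1Manager{paths: map[string]string{"devices": "/sys/fs/cgroup/devices/foo"}}
	v2 := &v2Manager{unifiedPath: "/sys/fs/cgroup/foo"}
	// Callers no longer need if/else blocks: m.Path("devices") works for both.
	fmt.Println(v1.Path("devices")) // /sys/fs/cgroup/devices/foo
	fmt.Println(v2.Path("devices")) // /sys/fs/cgroup/foo
}
```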

+88 -127

8 comments

8 changed files

kolyshkin

pr closed time in 14 days

pull request commentopencontainers/runc

Simplify cgroup paths handling in v2 via unified v1/v2 API

LGTM

kolyshkin

comment created time in 14 days

Pull request review commentcri-o/cri-o

Introduce state lock for finer grained access control

 func waitContainerStop(ctx context.Context, c *Container, timeout time.Duration, 			return fmt.Errorf("failed to wait process, timeout reached after %.0f seconds", 				timeout.Seconds()) 		}-		err := kill(c.state.Pid)+		err := kill(pid) 		if err != nil { 			return fmt.Errorf("failed to kill process: %v", err) 		} 	} -	c.state.Finished = time.Now()

Any reason we are dropping this logic?

tedyu

comment created time in 14 days

Pull request review commentcri-o/cri-o

Introduce state lock for finer grained access control

 func (r *runtimeOCI) StopContainer(ctx context.Context, c *Container, timeout in 		} 	} -	return waitContainerStop(ctx, c, killContainerTimeout, false)+	err = waitContainerStop(ctx, c, killContainerTimeout, false)

This block should be reverted to previous version.

tedyu

comment created time in 14 days

Pull request review commentcri-o/cri-o

Introduce state lock for finer grained access control

 func (r *runtimeOCI) StopContainer(ctx context.Context, c *Container, timeout in
 		}
 	}

-	return waitContainerStop(ctx, c, killContainerTimeout, false)
+	err = waitContainerStop(ctx, c, killContainerTimeout, false)
+	if err == nil {
+		c.SetFinished(time.Now())

Same here.

tedyu

comment created time in 14 days

Pull request review commentcri-o/cri-o

Introduce state lock for finer grained access control

 func (r *runtimeOCI) StopContainer(ctx context.Context, c *Container, timeout in
 		}
 		err = waitContainerStop(ctx, c, time.Duration(timeout)*time.Second, true)
 		if err == nil {
+			c.SetFinished(time.Now())

This shouldn't be added. We already set the finished time from the exit file timestamp.

tedyu

comment created time in 14 days

Pull request review commentcri-o/cri-o

Introduce state lock for finer grained access control

 func waitContainerStop(ctx context.Context, c *Container, timeout time.Duration,
 			return fmt.Errorf("failed to wait process, timeout reached after %.0f seconds",
 				timeout.Seconds())
 		}
-		err := kill(c.state.Pid)
+		err := kill(pid)
 		if err != nil {
 			return fmt.Errorf("failed to kill process: %v", err)
 		}
 	}

-	c.state.Finished = time.Now()
 	return nil
 }

 // StopContainer stops a container. Timeout is given in seconds.
-func (r *runtimeOCI) StopContainer(ctx context.Context, c *Container, timeout int64) error {
+func (r *runtimeOCI) StopContainer(ctx context.Context, c *Container, timeout int64) (errRet error) {
 	c.opLock.Lock()
-	defer c.opLock.Unlock()
+	defer func() {

this could be defer c.opLock.Unlock()

tedyu

comment created time in 14 days
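The review comment above is about a Go defer idiom: deferring a closure whose only body is Unlock() is equivalent to deferring the call directly. A minimal sketch, where the container type and stop functions are illustrative stand-ins, not cri-o's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

type container struct{ opLock sync.Mutex }

// Wrapping a single call in a deferred closure adds noise:
func stopVerbose(c *container) {
	c.opLock.Lock()
	defer func() {
		c.opLock.Unlock()
	}()
	fmt.Println("stopping (verbose defer)")
}

// The direct form the reviewer suggests is equivalent and idiomatic:
// the method receiver is evaluated at defer time, and the call runs
// when the function returns.
func stopIdiomatic(c *container) {
	c.opLock.Lock()
	defer c.opLock.Unlock()
	fmt.Println("stopping (direct defer)")
}

func main() {
	var c container
	stopVerbose(&c)
	stopIdiomatic(&c)
}
```

A deferred closure does earn its keep when extra work must run at return time, e.g. updating a named return value such as errRet, which is presumably why the closure form appeared in this diff in the first place.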

Pull request review commentcri-o/cri-o

Introduce state lock for finer grained access control

 func (r *runtimeOCI) StopContainer(ctx context.Context, c *Container, timeout in
 		}
 	}

-	return waitContainerStop(ctx, c, killContainerTimeout, false)
+	err = waitContainerStop(ctx, c, killContainerTimeout, false)
+	if err == nil {
+		c.SetFinished(time.Now())
+		return nil
+	}
+	return err
 }

 func checkProcessGone(c *Container) error {
-	process, perr := findprocess.FindProcess(c.state.Pid)
+	process, perr := findprocess.FindProcess(c.Pid())

Same thing here: get the pid at the beginning instead of making multiple calls to c.Pid().

tedyu

comment created time in 14 days

Pull request review commentcri-o/cri-o

Introduce state lock for finer grained access control

 func waitContainerStop(ctx context.Context, c *Container, timeout time.Duration,
 			return fmt.Errorf("failed to wait process, timeout reached after %.0f seconds",
 				timeout.Seconds())
 		}
-		err := kill(c.state.Pid)
+		err := kill(c.Pid())

kill(pid)

tedyu

comment created time in 14 days

Pull request review commentcri-o/cri-o

Introduce state lock for finer grained access control

 func (c *Container) Metadata() *pb.ContainerMetadata {

 // State returns the state of the running container
 func (c *Container) State() *ContainerState {
-	c.opLock.RLock()
-	defer c.opLock.RUnlock()
-	return c.state
+	c.stateLock.RLock()
+	defer c.stateLock.RUnlock()
+	if c.state != nil {
+		cs := *c.state
+		return &cs

Sure!

tedyu

comment created time in 14 days

Pull request review commentcri-o/cri-o

Introduce state lock for finer grained access control

 func waitContainerStop(ctx context.Context, c *Container, timeout time.Duration,
 			return fmt.Errorf("failed to wait process, timeout reached after %.0f seconds",
 				timeout.Seconds())
 		}
-		err := kill(c.state.Pid)
+		err := kill(c.Pid())
 		if err != nil {
 			return fmt.Errorf("failed to kill process: %v", err)
 		}
 	}

-	c.state.Finished = time.Now()
 	return nil
 }

 // StopContainer stops a container. Timeout is given in seconds.
-func (r *runtimeOCI) StopContainer(ctx context.Context, c *Container, timeout int64) error {
+func (r *runtimeOCI) StopContainer(ctx context.Context, c *Container, timeout int64) (errRet error) {
 	c.opLock.Lock()
-	defer c.opLock.Unlock()
+	defer func() {
+		c.opLock.Unlock()
+	}()

 	// Check if the process is around before sending a signal
-	process, err := findprocess.FindProcess(c.state.Pid)
+	process, err := findprocess.FindProcess(c.Pid())

Get the pid once and store it locally instead of making multiple calls to c.Pid() in this function.

tedyu

comment created time in 14 days

Pull request review commentcri-o/cri-o

Introduce state lock for finer grained access control

 func (c *Container) Metadata() *pb.ContainerMetadata {

 // State returns the state of the running container
 func (c *Container) State() *ContainerState {
-	c.opLock.RLock()
-	defer c.opLock.RUnlock()
-	return c.state
+	c.stateLock.RLock()
+	defer c.stateLock.RUnlock()
+	if c.state != nil {
+		cs := *c.state
+		return &cs

I think we just return ContainerState instead of a pointer from this function.

tedyu

comment created time in 14 days
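A sketch of the copy-on-read pattern the reviewers are converging on: copy the state while holding the read lock and return the copy by value, so callers cannot mutate or race on the live struct after the lock is released. The types and fields here are simplified stand-ins for cri-o's, not its actual definitions:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// ContainerState is an illustrative subset of a container's state.
type ContainerState struct {
	Pid      int
	Finished time.Time
}

type Container struct {
	stateLock sync.RWMutex
	state     ContainerState
}

// State returns a copy of the state by value. Because the copy is made
// under the read lock and the caller never sees a pointer into the
// Container, later mutations by the caller (or by the runtime) cannot
// race with each other.
func (c *Container) State() ContainerState {
	c.stateLock.RLock()
	defer c.stateLock.RUnlock()
	return c.state
}

func main() {
	c := &Container{state: ContainerState{Pid: 42}}
	s := c.State()
	s.Pid = 7 // mutates only the caller's copy
	fmt.Println(c.State().Pid)
}
```

Returning the value directly also removes the nil check and the explicit `cs := *c.state` copy from the diff above: a value type cannot be nil, and the copy happens at the return statement.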

Pull request review commentcri-o/cri-o

Introduce state lock for finer grained access control

 func waitContainerStop(ctx context.Context, c *Container, timeout time.Duration,
 			case <-chControl:
 				return
 			default:
-				process, err := findprocess.FindProcess(c.state.Pid)
+				process, err := findprocess.FindProcess(c.Pid())

We can get the pid once at the beginning instead of calling into c.Pid(), which takes a read lock. The pid isn't expected to change.

tedyu

comment created time in 14 days
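A small sketch of that suggestion: since the pid cannot change while the container is being stopped, read it once through the locked accessor and reuse the local copy instead of re-taking the read lock on every loop iteration. The types are illustrative, and kill is passed in as a stand-in function, not the real signal-sending wrapper:

```go
package main

import (
	"fmt"
	"sync"
)

type Container struct {
	stateLock sync.RWMutex
	pid       int
}

// Pid takes a read lock on every call.
func (c *Container) Pid() int {
	c.stateLock.RLock()
	defer c.stateLock.RUnlock()
	return c.pid
}

// waitStop reads the pid once up front and reuses the local variable,
// avoiding repeated lock acquisitions in the retry loop.
func waitStop(c *Container, kill func(int) error) error {
	pid := c.Pid() // single locked read
	for i := 0; i < 3; i++ {
		if err := kill(pid); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	c := &Container{pid: 1234}
	calls := 0
	_ = waitStop(c, func(pid int) error { calls++; return nil })
	fmt.Println(calls)
}
```

Besides the minor cost savings, caching the pid also guarantees every iteration targets the same process, even if the state were to be rewritten concurrently.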

pull request commentcri-o/cri-o

[HACK][WIP] storage: disable fdatasync() for atomic writes

@giuseppe mind opening a release-1.18 PR to get e2e-aws run data as well? Thanks!

giuseppe

comment created time in 14 days

Pull request review commentcri-o/cri-o

Introduce state lock for finer grained access control

 func waitContainerStop(ctx context.Context, c *Container, timeout time.Duration,
 			return fmt.Errorf("failed to wait process, timeout reached after %.0f seconds",
 				timeout.Seconds())
 		}
-		err := kill(c.state.Pid)
+		err := kill(c.Pid())
 		if err != nil {
 			return fmt.Errorf("failed to kill process: %v", err)
 		}
 	}

-	c.state.Finished = time.Now()
 	return nil
 }

 // StopContainer stops a container. Timeout is given in seconds.
-func (r *runtimeOCI) StopContainer(ctx context.Context, c *Container, timeout int64) error {
+func (r *runtimeOCI) StopContainer(ctx context.Context, c *Container, timeout int64) (errRet error) {
 	c.opLock.Lock()
-	defer c.opLock.Unlock()

 	// Check if the process is around before sending a signal
-	process, err := findprocess.FindProcess(c.state.Pid)
+	process, err := findprocess.FindProcess(c.Pid())
 	if err == findprocess.ErrNotFound {
-		c.state.Finished = time.Now()
+		c.opLock.Unlock()

Any reason we moved this down to the return paths instead of the defer?

tedyu

comment created time in 14 days
