
containers/libpod 4050

libpod is a library used to create container pods. Home of Podman.

containers/skopeo 1929

Work with remote images registries - retrieving information, images, signing content

docker/go-plugins-helpers 248

Go helper packages to extend the Docker Engine

docker/go-connections 126

Utility package to work with network connections

containers/conmon 99

An OCI container runtime monitor.

docker/go-units 91

Parse and print size and time units in human-readable format

projectatomic/docker 70

Docker - the open-source application container engine

cri-o/cri-o-ansible 40

Playbooks to install CRI-O from source

containers/Demos 27

Repository is a location of user demos for technologies listed on github.com/containers

Pull request review comment openshift/machine-config-operator

Ignition types.Config (v2) to runtime.RawExtension conversion

 package common
 
 import (
+	"bytes"
+	"encoding/json"
+	"fmt"
 	"io/ioutil"
 	"reflect"
+	"sort"
 
+	ign "github.com/coreos/ignition/config/v2_2"
 	igntypes "github.com/coreos/ignition/config/v2_2/types"
 	validate "github.com/coreos/ignition/config/validate"
 	"github.com/golang/glog"
 	mcfgv1 "github.com/openshift/machine-config-operator/pkg/apis/machineconfiguration.openshift.io/v1"
 	errors "github.com/pkg/errors"
+	"k8s.io/apimachinery/pkg/runtime"
+	"k8s.io/apimachinery/pkg/util/yaml"
 )
 
-// NewIgnConfig returns an empty ignition config with version set as latest version
-func NewIgnConfig() igntypes.Config {
+const (
+	kernelTypeDefault  = "default"
+	kernelTypeRealtime = "realtime"
+)
+
+// DecodeIgnitionConfigSpecV2 decodes byte slices to Ignition Config Spec v2 or errors out
+func DecodeIgnitionConfigSpecV2(data []byte) (*igntypes.Config, error) {

Yes we will need such a function! But since we only support V2 at this time, contents are always assumed to be V2 (or error out otherwise).

I think that's Kirsten's WIP PR
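The "assume V2 or error out" behaviour discussed above can be sketched roughly as follows. This is only an illustration using the standard library: `ignitionHeader` and `decodeV2Only` are hypothetical names, and the real `DecodeIgnitionConfigSpecV2` works on the full `igntypes.Config` type rather than on this minimal struct.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ignitionHeader is a hypothetical, minimal view of an Ignition config.
// Only the version field is inspected in this sketch.
type ignitionHeader struct {
	Ignition struct {
		Version string `json:"version"`
	} `json:"ignition"`
}

// decodeV2Only is a hypothetical helper: it parses the raw bytes and
// rejects anything whose version does not start with "2.".
func decodeV2Only(data []byte) (*ignitionHeader, error) {
	var cfg ignitionHeader
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("parsing Ignition config failed: %w", err)
	}
	if !strings.HasPrefix(cfg.Ignition.Version, "2.") {
		return nil, fmt.Errorf("unsupported Ignition version %q, expected 2.x", cfg.Ignition.Version)
	}
	return &cfg, nil
}

func main() {
	for _, raw := range []string{
		`{"ignition":{"version":"2.2.0"}}`,
		`{"ignition":{"version":"3.0.0"}}`,
	} {
		cfg, err := decodeV2Only([]byte(raw))
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("accepted version", cfg.Ignition.Version)
	}
}
```

A prefix check on the version string is the simplest possible gate; a real implementation would more likely rely on the Ignition library's own parsing and validation.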

LorbusChris

comment created time in 3 days

pull request comment openshift/machine-config-operator

Make `mcfgv1` and `ctrlcommon` canonical names for importing

lifting hold I guess - this has 0 risk

/hold cancel

LorbusChris

comment created time in 3 days

pull request comment openshift/machine-config-operator

WIP: mcc: accept ign3 & translate down

amazing, I think this has to be a combination of this and https://github.com/openshift/machine-config-operator/pull/996 in order to actually accept Ign v3 and have a smoke e2e that creates a v3 MachineConfig (which is later translated down to v2 internally for the 4.5 timeframe cc @yuqi-zhang )

kikisdeliveryservice

comment created time in 4 days

pull request comment openshift/machine-config-operator

make etcd quorum guard privileged to read files from etcd operator

/approve

@hexfusion can you bless this?

deads2k

comment created time in 4 days

pull request comment openshift/machine-config-operator

*: remove etcd quorum guard

depends on openshift/cluster-etcd-operator#142

this PR has been closed :thinking: does this still depend on something?

hexfusion

comment created time in 4 days

pull request comment openshift/machine-config-operator

WIP: Move GCP routes service into gcp-routes-controller

@sttts could you take a look at this?

LorbusChris

comment created time in 4 days

pull request comment openshift/machine-config-operator

Bug 1705750: generate CRD manifests and fix for oc explain

awesome 👍

/approve

will leave it to the team to take a further look and address comments, well done!

yuqi-zhang

comment created time in 4 days

PR opened openshift/machine-config-operator

Revert 1473 and skip etcd-member.yaml validation

Closes #1480

Enable CEO testing - return early when validating the file CEO cares about

@deads2k @kikisdeliveryservice

+4 -10

0 comment

2 changed files

pr created time in 5 days

create branch runcom/machine-config-operator

branch : ceo-enablement-0

created branch time in 5 days

pull request comment openshift/machine-config-operator

Make `mcfgv1` and `ctrlcommon` canonical names for importing

loving this - unfortunately we can't merge this right away as we need to wait until at least Monday

/hold

LorbusChris

comment created time in 9 days

pull request comment openshift/machine-config-operator

Bug 1794493: add ctrcfg e2e test

/bugzilla refresh

haircommander

comment created time in 13 days

fork runcom/oi-userland

Unified build system for OpenIndiana distribution components

fork in 15 days

pull request comment openshift/machine-config-operator

Bug 1794495: [release-4.3] add ctrcfg e2e test

/approve /lgtm

haircommander

comment created time in 15 days

pull request comment openshift/machine-config-operator

Bug 1794495: [release-4.3] add ctrcfg e2e test

yeah the tests seem legit 😕

maybe something is different between 4.4/master and 4.3 🤔

haircommander

comment created time in 16 days

pull request comment openshift/machine-config-operator

Bug 1794495: [release-4.3] add ctrcfg e2e test

@haircommander there seems to be some issue in op here instead :(

haircommander

comment created time in 16 days

pull request comment openshift/machine-config-operator

Bug 1794493: add ctrcfg e2e test

/skip /approve /lgtm

we need a backport for this for 4.3 as well right? @umohnani8 @haircommander

haircommander

comment created time in 16 days

pull request comment openshift/machine-config-operator

Bug 1794493: add ctrcfg e2e test

ok now it's ready (assuming the aws failure is unrelated, which I think it is?)

🎉 yep, it looks like it's ok to merge - we'll fix any flake afterwards :trollface:

haircommander

comment created time in 16 days

started OpenIndiana/oi-userland

started time in 17 days

pull request comment openshift/machine-config-operator

Bug 1794493: add ctrcfg e2e test

I don't know if it's an "issue". one could call the sleeps a bit hacky, but I don't know of a better way to make sure the operation has made it to the openshift api-server without them.

oh no sorry, I meant to say the gcp-op test is still failing with something that I think is related to this change, uhm

haircommander

comment created time in 17 days

pull request comment openshift/machine-config-operator

Bug 1794493: add ctrcfg e2e test

there seems to still be some kind of issue/race right? 🤔

haircommander

comment created time in 17 days

pull request comment openshift/machine-config-operator

ovirt: ipv6, switch to NM dispatcher for DNS VIP prepending

/lgtm

@celebdor need a BZ for 4.4 to get this in now

rgolangh

comment created time in 18 days

Pull request review comment openshift/machine-config-operator

bug 1797797: fix(machineconfig.crd.yaml): add passwd schema

 spec:
     singular: machineconfig
     # kind is normally the CamelCased singular type. Your resource manifests use this.
     kind: MachineConfig
+  validation:
+    openAPIV3Schema:
+      type: object
+      properties:
+        apiVersion:
+          description: "APIVersion defines the versioned schema of this representation
+            of an object. Servers should convert recognized schemas to the latest
+            internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/api-conventions.md#resources"
+          type: string
+        kind:
+          description: "Kind is a string value representing the REST resource this
+            object represents. Servers may infer this from the endpoint the client
+            submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/api-conventions.md#types-kinds"
+          type: string
+        metadata:
+          type: object
+        spec:
+          description: "spec hold the intent of how this operator should behave"
+          type: object
+          properties:
+            config:
+              description: "config contains options related to the configuration"
+              type: object
+              properties:
+                ignition:
+                  description: "ignition section contains metadata about the configuration itself"
+                  type: object
+                  properties:
+                    version:
+                      description: "version string is the semantic version number of the spec"
+                      type: string
+                passwd:
+                  type: object
+                  properties:
+                    users:
+                      type: array
+                      items:
+                        type: object
+                        properties:
+                          name:
+                            description: "name of user. Must be \"core\" user."
+                            type: string
+                          sshAuthorizedKeys:
+                            description: "public keys to be assigned to user core"
+                            type: array
+                            items:
+                              type: string
+                storage:
+                  description: "storage describes the desired state of the system's storage devices"
+                  type: object
+                  properties:
+                    directories:
+                      description: "directories is the list of directories to be created"
+                      type: array
+                      items:
+                        description: "items is list of directories to be written"
+                        type: object
+                        properties:
+                          filesystem:
+                            description: "filesystem is the internal identifier of the filesystem in which to write the file. This matches the last filesystem with the given identifier"
+                            type: string
+                          mode:
+                            description: "mode is the file's permission mode. Note that the mode must be properly specified as a decimal value (i.e. 0644 -> 420)"
+                            type: integer
+                          path:
+                            description: "path is the absolute path to the file"
+                            type: string
+                          user:
+                            description: "user object specifies the file's owner"
+                            type: object
+                            properties:
+                              id:
+                                description: "id is the user ID of the owner"
+                                type: integer
+                              name:
+                                description: "name is the user name of the owner"
+                                type: string
+                          group:
+                            description: "group object specifies group of the owner"
+                            type: object
+                            properties:
+                              id:
+                                description: "id specifies group ID of the owner"
+                                type: integer
+                              name:
+                                description: "name is the group name of the owner"
+                                type: string
+                          overwrite:
+                            description: "overwrite specifies whether to delete preexisting nodes at the path"
+                            type: boolean
+                    files:
+                      description: "files is the list of files to be created"
+                      type: array
+                      items:
+                        description: "items is list of files to be written"
+                        type: object
+                        properties:
+                          contents:
+                            description: "contents specifies options related to the contents of the file"
+                            type: object
+                            properties:
+                              compression:
+                                description: "the type of compression used on the contents (null or gzip). Compression cannot be used with S3."
+                                type: string
+                              source:
+                                description: "source is the URL of the file contents. Supported schemes are http, https, tftp, s3, and data. When using http, it is advisable to use the verification option to ensure the contents haven't been modified."
+                                type: string
+                              verification:
+                                description: "verification specifies options related to the verification of the file contents"
+                                type: object
+                                properties:
+                                  hash:
+                                    description: "hash is the hash of the config, in the form <type>-<value> where type is sha512"
+                                    type: string
+                          filesystem:
+                            description: "filesystem is the internal identifier of the filesystem in which to write the file. This matches the last filesystem with the given identifier"
+                            type: string
+                          mode:
+                            description: "mode specifies the file's permission mode. Note that the mode must be properly specified as a decimal value (i.e. 0644 -> 420)"
+                            type: integer
+                          path:
+                            description: "path is the absolute path to the file"
+                            type: string
+                          user:
+                            description: "user object specifies the file's owner"
+                            type: object
+                            properties:
+                              id:
+                                description: "id is the user ID of the owner"
+                                type: integer
+                              name:
+                                description: "name is the user name of the owner"
+                                type: string
+                          group:
+                            description: "group object specifies group of the owner"
+                            type: object
+                            properties:
+                              id:
+                                description: "id specifies group ID of the owner"
+                                type: integer
+                              name:
+                                description: "name is the group name of the owner"
+                                type: string
+                          overwrite:
+                            description: "overwrite specifies whether to delete preexisting nodes at the path"
+                            type: boolean
+                          append:
+                            description: "append specifies whether to append to the specified file. Creates a new file if nothing exists at the path. Cannot be set if overwrite is set to true."
+                            type: boolean
+                systemd:
+                  description: "systemd describes the desired state of the systemd units"
+                  type: object
+                  properties:
+                    units:
+                      description: "units is a list of units to be configured"
+                      type: array
+                      items:
+                        description: "items describes unit configuration"
+                        type: object
+                        properties:
+                          name:
+                            description: "name is the name of the unit. This must be suffixed with a valid unit type (e.g. 'thing.service')"
+                            type: string
+                          enabled:
+                            description: "enabled describes whether or not the service shall be enabled. When true, the service is enabled. When false, the service is disabled. When omitted, the service is unmodified. In order for this to have any effect, the unit must have an install section"
+                            type: boolean
+                          mask:
+                            description: "mask describes whether or not the service shall be masked. When true, the service is masked by symlinking it to /dev/null"
+                            type: boolean
+                          contents:
+                            description: "contents is the contents of the unit"
+                            type: string
+                          dropins:
+                            description: "dropins is the list of drop-ins for the unit"
+                            type: array
+                            items:
+                              description: "items describes unit dropin"
+                              type: object
+                              properties:
+                                contents:
+                                  description: "contents is the contents of the drop-in"
+                                  type: string
+                                name:
+                                  description: "name is the name of the drop-in. This must be suffixed with '.conf'."
+                                  type: string
+          kargs:

both aren't supported in 4.1
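As an aside on the schema in the diff above: its `mode` descriptions require file modes as decimal integers (0644 -> 420). A minimal Go sketch of that conversion follows; `octalModeToDecimal` is a hypothetical helper written for illustration and is not part of the PR.

```go
package main

import (
	"fmt"
	"strconv"
)

// octalModeToDecimal is a hypothetical helper: it parses a permission mode
// written in octal (for example "0644") and returns its decimal value.
func octalModeToDecimal(mode string) (int, error) {
	v, err := strconv.ParseUint(mode, 8, 32)
	if err != nil {
		return 0, fmt.Errorf("invalid octal mode %q: %w", mode, err)
	}
	return int(v), nil
}

func main() {
	for _, m := range []string{"0644", "0755"} {
		d, err := octalModeToDecimal(m)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %d\n", m, d)
	}
}
```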

ericavonb

comment created time in 18 days

Pull request review comment openshift/machine-config-operator

bug 1797790: prevent hitting annotation max size limit on nodes (cherry-pick)

 func (nw *clusterNodeWriter) SetWorking(client corev1client.NodeInterface, liste
 // SetUnreconcilable sets the state to Unreconcilable.
 func (nw *clusterNodeWriter) SetUnreconcilable(err error, client corev1client.NodeInterface, lister corev1lister.NodeLister, node string) error {
 	glog.Errorf("Marking Unreconcilable due to: %v", err)
+	// truncatedErr caps error message at a reasonable length to limit the risk of hitting the total
+	// annotation size limit (256 kb) at any point
+	truncatedErr := fmt.Sprintf("%.2000s", err.Error())
 	annos := map[string]string{
 		constants.MachineConfigDaemonStateAnnotationKey:  constants.MachineConfigDaemonStateUnreconcilable,
-		constants.MachineConfigDaemonReasonAnnotationKey: err.Error(),
+		constants.MachineConfigDaemonReasonAnnotationKey: truncatedErr,
 	}
+	MCDState.WithLabelValues(constants.MachineConfigDaemonStateUnreconcilable, truncatedErr).SetToCurrentTime()

yea we need to drop these calls
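For context, the `%.2000s` verb in the diff above truncates because, in Go's `fmt` package, a precision on `%s` limits how much of the input string is formatted. A minimal sketch of that behaviour follows; `truncate` and the limits used in `main` are hypothetical and chosen only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// truncate is a hypothetical helper (not from the PR). It caps msg at limit
// runes using the precision of the %s verb, like fmt.Sprintf("%.2000s", msg).
func truncate(msg string, limit int) string {
	return fmt.Sprintf("%.*s", limit, msg)
}

func main() {
	long := strings.Repeat("x", 5000)
	// A long input is cut down to the limit.
	fmt.Println(len(truncate(long, 2000)))
	// A short input passes through unchanged.
	fmt.Println(truncate("short error", 2000))
}
```

Note that the precision is counted in runes, so a message with multi-byte characters can still exceed 2000 bytes after truncation.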

ericavonb

comment created time in 18 days

pull request comment openshift/machine-config-operator

Bug 1794824: If PlatformStatus.VSphere is nil do not template the file

LGTM. Already approved so I will give other reviewers a chance.

/lgtm /retest

jcpowermac

comment created time in 18 days

pull request comment openshift/machine-config-operator

Bug 1798146: [baremetal] Ipv6 non virtual ip fix

/approve /retest

celebdor

comment created time in 18 days

pull request comment openshift/machine-config-operator

Bug 1797908: kubelet: Remove quotes from log level argument

/retest /approve

@rphillips ptal for lgtm

mtnbikenc

comment created time in 18 days

issue comment openshift/machine-config-operator

ImageContentSourcePolicy restarts all worker nodes.

/cc @umohnani8 @mtrmac @haircommander

cdjohnson

comment created time in 18 days

Pull request review comment openshift/machine-config-operator

Bug 1794493: add ctrcfg e2e test

+package e2e_test
+
+import (
+	"fmt"
+	"regexp"
+	"testing"
+	"time"
+
+	mcfgv1 "github.com/openshift/machine-config-operator/pkg/apis/machineconfiguration.openshift.io/v1"
+	"github.com/openshift/machine-config-operator/test/e2e/framework"
+	"github.com/pkg/errors"
+	"github.com/stretchr/testify/require"
+	corev1 "k8s.io/api/core/v1"
+	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+	"k8s.io/apimachinery/pkg/util/uuid"
+	"k8s.io/apimachinery/pkg/util/wait"
+)
+
+func TestContainerRuntimeConfigPidsLimit(t *testing.T) {
+	runTestWithCtrcfg(t, "pids-limit", `pids_limit = (\S+)`, "12345", &mcfgv1.ContainerRuntimeConfiguration{
+		PidsLimit: 12345,
+	})
+}
+
+// runTestWithCtrcfg creates a ctrcfg and checks whether the expected updates were applied, then deletes the ctrcfg and makes
+// sure the node rolled back as expected
+// testName is a string to identify the objects created (MCP, MC, ctrcfg)
+// regex key is the searching critera in the crio.conf. It is expected that a single field is in a capture group, and this field
+//   should equal expectedConfValue upon update
+// cfg is the ctrcfg config to update to and rollback from
+func runTestWithCtrcfg(t *testing.T, testName, regexKey, expectedConfValue string, cfg *mcfgv1.ContainerRuntimeConfiguration) {
+	cs := framework.NewClientSet("")
+	matchValue := fmt.Sprintf("%s-%s", testName, uuid.NewUUID())
+	ctrcfgName := fmt.Sprintf("ctrcfg-%s", matchValue)
+	poolName := fmt.Sprintf("node-%s", matchValue)
+	mcName := fmt.Sprintf("mc-%s", matchValue)
+
+	// label one node from the pool to specify which worker to update
+	unlabelFunc := labelRandomNodeFromPool(t, cs, "worker", mcpNameToRole(poolName))
+	defer unlabelFunc()

I think I understand what's going on here, follow https://github.com/openshift/machine-config-operator/blob/master/docs/custom-pools.md#removing-a-custom-pool

Basically, the first thing to do is unlabel the node. In this function you should then wait for the pool to go idle as the node goes back to worker, and only after that go ahead and clean up 1) the MC and 2) the MCP.
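A rough sketch of that ordering is shown below. It uses a plain standard-library polling loop instead of the `wait` package imported in the diff, and every name in it (`waitFor`, `cleanupInOrder`, the step strings, the timings) is hypothetical; it only illustrates "unlabel, wait for idle, then delete MC and MCP".

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical polling settings, chosen only to keep the example fast.
const (
	pollInterval = 10 * time.Millisecond
	pollTimeout  = 500 * time.Millisecond
)

// waitFor polls cond until it returns true or pollTimeout elapses.
func waitFor(cond func() bool) error {
	deadline := time.Now().Add(pollTimeout)
	for !cond() {
		if time.Now().After(deadline) {
			return errors.New("timed out waiting for condition")
		}
		time.Sleep(pollInterval)
	}
	return nil
}

// cleanupInOrder is a hypothetical stand-in for the test teardown. It records
// each step so the ordering is visible: unlabel, wait for idle, then delete.
func cleanupInOrder(poolIdle func() bool) ([]string, error) {
	steps := []string{"unlabel node"}
	if err := waitFor(poolIdle); err != nil {
		// Nothing is deleted if the pool never settles.
		return steps, fmt.Errorf("pool never went idle: %w", err)
	}
	steps = append(steps, "pool idle", "delete MC", "delete MCP")
	return steps, nil
}

func main() {
	checks := 0
	// Pretend the pool reports idle on the third check.
	steps, err := cleanupInOrder(func() bool { checks++; return checks >= 3 })
	fmt.Println(steps, err)
}
```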

haircommander

comment created time in 18 days

pull request comment openshift/machine-config-operator

Ipv6 non virtual ip

can you link this PR to https://bugzilla.redhat.com/show_bug.cgi?id=1797647 as well?

celebdor

comment created time in 18 days

pull request comment openshift/machine-config-operator

WIP: Bug 1794493: add ctrcfg e2e test

I don’t believe that ever worked; you can use oc rsh to execute commands within a given node

haircommander

comment created time in 22 days

pull request comment openshift/machine-config-operator

Bug 1794824: If PlatformStatus.VSphere is nil do not template the file

/approve

can we get some more eyes from vsphere folks?

jcpowermac

comment created time in 23 days

pull request comment openshift/machine-config-operator

Bug 1796563: [release-4.3] gcp: use readyz endpoint

Need a lgtm on this one also🙏

openshift-cherrypick-robot

comment created time in 23 days

Pull request review comment openshift/machine-config-operator

WIP: Bug 1794493: add ctrcfg e2e test

+package e2e_test
+
+import (
+	"fmt"
+	"testing"
+
+	mcfgv1 "github.com/openshift/machine-config-operator/pkg/apis/machineconfiguration.openshift.io/v1"
+	"github.com/openshift/machine-config-operator/test/e2e/framework"
+	"github.com/stretchr/testify/require"
+	"k8s.io/apimachinery/pkg/api/resource"
+	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+	"k8s.io/apimachinery/pkg/util/uuid"
+)
+
+func TestContainerRuntimeConfigLogSizeMax(t *testing.T) {
+	runTestWithCtrcfg(t, "log-size-max", &mcfgv1.ContainerRuntimeConfiguration{
+		LogSizeMax: resource.MustParse("-1"),
+	})
+}
+
+func TestContainerRuntimeConfigPidsLimit(t *testing.T) {
+	runTestWithCtrcfg(t, "pids-limit", &mcfgv1.ContainerRuntimeConfiguration{
+		PidsLimit: 2048,
+	})
+}
+
+func runTestWithCtrcfg(t *testing.T, testName string, cfg *mcfgv1.ContainerRuntimeConfiguration) {
+	cs := framework.NewClientSet("")
+	testUUID := uuid.NewUUID()
+	matchValue := fmt.Sprintf("%s-%s", testName, testUUID)
+	ctrcfgName := fmt.Sprintf("ctrcfg-%s", testUUID)
+
+	// cache the old machine config value to compare against
+	oldMC := getMcName(t, cs, "worker")
+
+	// label one node from the pool to specify which worker to update
+	unlabelFunc := labelRandomNodeFromPool(t, cs, "worker", mcpNameToRole(matchValue))
+	defer unlabelFunc()
+
+	// create an MCP to match the node we tagged
+	mcpCleanupFunc := createMCP(t, cs, matchValue)

It's just that the return value is itself a function which cleans things up

haircommander

comment created time in 23 days

Pull request review comment openshift/machine-config-operator

WIP: Bug 1794493: add ctrcfg e2e test

+package e2e_test
+
+import (
+	"fmt"
+	"testing"
+
+	mcfgv1 "github.com/openshift/machine-config-operator/pkg/apis/machineconfiguration.openshift.io/v1"
+	"github.com/openshift/machine-config-operator/test/e2e/framework"
+	"github.com/stretchr/testify/require"
+	"k8s.io/apimachinery/pkg/api/resource"
+	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+	"k8s.io/apimachinery/pkg/util/uuid"
+)
+
+func TestContainerRuntimeConfigLogSizeMax(t *testing.T) {
+	runTestWithCtrcfg(t, "log-size-max", &mcfgv1.ContainerRuntimeConfiguration{
+		LogSizeMax: resource.MustParse("-1"),
+	})
+}
+
+func TestContainerRuntimeConfigPidsLimit(t *testing.T) {
+	runTestWithCtrcfg(t, "pids-limit", &mcfgv1.ContainerRuntimeConfiguration{
+		PidsLimit: 2048,
+	})
+}
+
+func runTestWithCtrcfg(t *testing.T, testName string, cfg *mcfgv1.ContainerRuntimeConfiguration) {
+	cs := framework.NewClientSet("")
+	testUUID := uuid.NewUUID()
+	matchValue := fmt.Sprintf("%s-%s", testName, testUUID)
+	ctrcfgName := fmt.Sprintf("ctrcfg-%s", testUUID)
+
+	// cache the old machine config value to compare against
+	oldMC := getMcName(t, cs, "worker")
+
+	// label one node from the pool to specify which worker to update
+	unlabelFunc := labelRandomNodeFromPool(t, cs, "worker", mcpNameToRole(matchValue))
+	defer unlabelFunc()
+
+	// create an MCP to match the node we tagged
+	mcpCleanupFunc := createMCP(t, cs, matchValue)
+
+	// create our ctrcfg and attach it to our created node pool
+	cleanupCtrcfg := createCtrcfgWithConfig(t, cs, ctrcfgName, matchValue, cfg)

It’s because the function returns a function which cleans up the resource, a pretty standard pattern in golang
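For readers unfamiliar with the pattern, here is a small self-contained sketch. `createResource` and `run` are hypothetical stand-ins for helpers like `createMCP` and `createCtrcfgWithConfig` in the diff; the log slice exists only to make the ordering observable.

```go
package main

import "fmt"

// createResource is a hypothetical stand-in for helpers such as createMCP.
// It "creates" a resource and returns a function that cleans it up again.
func createResource(name string, log *[]string) func() {
	*log = append(*log, "create "+name)
	return func() {
		*log = append(*log, "delete "+name)
	}
}

// run shows how the returned cleanup functions are typically deferred.
// Deferred calls run in reverse order, so the ctrcfg is deleted before the pool.
func run() []string {
	var log []string
	func() {
		cleanupPool := createResource("pool", &log)
		defer cleanupPool()
		cleanupCfg := createResource("ctrcfg", &log)
		defer cleanupCfg()
		log = append(log, "run test body")
	}()
	return log
}

func main() {
	for _, entry := range run() {
		fmt.Println(entry)
	}
}
```

Returning the cleanup as a closure keeps creation and teardown next to each other, and `defer` guarantees teardown runs even if the test body returns early.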

haircommander

comment created time in 23 days

pull request comment openshift/machine-config-operator

[release-4.3] Bug 1796444: gcp: use readyz endpoint

/retitle Bug 1796563: [release-4.3] gcp: use readyz endpoint

openshift-cherrypick-robot

comment created time in 23 days

delete branch runcom/machine-config-operator

delete branch : gcp-readyz

delete time in 23 days

pull request comment openshift/machine-config-operator

WIP: Bug 1794493: add ctrcfg e2e test

e2e-gcp-op still related to this now that we're back with that

haircommander

comment created time in 24 days

pull request comment openshift/machine-config-operator

Bug 1777379: [Baremetal] add CPU and memory resources for Haproxy pod

/approve /retest /skip

yboaron

comment created time in 24 days

pull request comment openshift/machine-config-operator

[release-4.2] Bug 1781665: pkg/daemon: randomize pivot container name

@yuqi-zhang @kikisdeliveryservice @sinnykumari @ericavonb can anyone review this and lgtm?

openshift-cherrypick-robot

comment created time in 24 days

pull request comment openshift/machine-config-operator

Bug 1785279: [release-4.3] Discard audit messages from journald

Is this a backport?

It is yes

fbac

comment created time in 24 days

pull request comment openshift/machine-config-operator

Bug 1796444: gcp: use readyz endpoint

I suspect we need this in 4.3 as well, since that is where we introduced this (I'll clone the BZ accordingly)

/cherry-pick release-4.3

runcom

comment created time in 24 days

pull request comment openshift/machine-config-operator

Bug 1796440: [release-4.3] DR: Cherry-pick keep keys and data separate for snapshot backup

/bugzilla refresh

@retroflexer no need to refresh again, the BZs are all good now, just need to wait for the dependent BZ to be VERIFIED by QE before jumping back here to refresh :)

retroflexer

comment created time in 24 days

pull request comment openshift/cluster-api

Bug 1745772: [openshift-4.2-cluster-api-0.1.0] pkg/drain: always honor pod termination timeout

The bug referenced was closed, so was this PR required for the BZ? We need a new one. Do we have something that fixes this in 4.3/master? We're also going to need review/lgtm/approval from the team.

This has been fixed in 4.2 with https://github.com/openshift/machine-config-operator/pull/1154 so we don't necessarily need this and we can close it (and the BZ as well), I don't have power to close this tho

openshift-cherrypick-robot

comment created time in 24 days

pull request comment openshift/cluster-api

Bug 1758345: [openshift-4.1-cluster-api-0.0.0-alpha.4] pkg/drain: always honor pod termination timeout

I feel like we can close this and the BZ (but don't have power here)

openshift-cherrypick-robot

comment created time in 24 days

pull request comment openshift/cluster-api

Bug 1758345: [openshift-4.1-cluster-api-0.0.0-alpha.4] pkg/drain: always honor pod termination timeout

4.1 has been fixed anyway with https://github.com/openshift/machine-config-operator/pull/1166 - this might still be something we want to get in, but it does not need a bug, nor does it need to land any time soon

openshift-cherrypick-robot

comment created time in 24 days

pull request comment openshift/machine-config-operator

Bug 1796440: [release-4.3] DR: Cherry-pick keep keys and data separate for snapshot backup

@retroflexer: This pull request references Bugzilla bug 1796440, which is invalid:

  • expected dependent Bugzilla bug 1795235 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), but it is MODIFIED instead

Looks ok now :+1:

retroflexer

comment created time in 24 days

pull request comment openshift/machine-config-operator

Bug 1796444: gcp: use readyz endpoint

/skip

runcom

comment created time in 24 days

pull request comment openshift/machine-config-operator

Bug 1796440: [release-4.3] DR: Cherry-pick keep keys and data separate for snapshot backup

/approve

But you probably need to fix some things in bugzilla to make the bot happy

retroflexer

comment created time in 24 days

pull request comment openshift/machine-config-operator

WIP: Bug 1794493: add ctrcfg e2e test

/retest

haircommander

comment created time in 24 days

pull request comment openshift/machine-config-operator

Bug 1785279: [release-4.3] Discard audit messages from journald

@runcom: This pull request references Bugzilla bug 1785279, which is invalid:

  • expected dependent Bugzilla bug 1785281 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), but it is ON_QA instead

do not change and attach links to bugzillas, the above is fully correct as the master 4.4 BZ is still in ON_QA. Once the 4.4 is verified, we'll come here refreshing and it will all be good.

fbac

comment created time in 24 days

pull request comment openshift/machine-config-operator

Bug 1785219: [release-4.2] Discard audit messages from journald

@runcom: This pull request references Bugzilla bug 1785219, which is invalid:

  • expected dependent Bugzilla bug 1785279 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), but it is NEW instead

do not change and attach links to bugzillas, the above is fully correct as the 4.3 BZ is still in NEW even if it has a PR already. It's in turn blocked by the master (4.4) BZ which is ON_QA. Once the 4.4 is verified, we'll come here refreshing (and in the 4.3 PR) and it will all be good.

fbac

comment created time in 24 days

pull request comment openshift/machine-config-operator

Bug 1785279: [release-4.3] Discard audit messages from journald

/bugzilla refresh

fbac

comment created time in 24 days

pull request comment openshift/machine-config-operator

Bug 1785219: [release-4.2] Discard audit messages from journald

/bugzilla refresh

fbac

comment created time in 24 days

pull request comment openshift/machine-config-operator

Bug 1785219: [release-4.2] Discard audit messages from journald

/refresh bugzilla

fbac

comment created time in 24 days

pull request comment openshift/machine-config-operator

Bug 1785279: [release-4.3] Discard audit messages from journald

/refresh bugzilla

fbac

comment created time in 24 days

pull request comment openshift/machine-config-operator

gcp: use readyz endpoint

uhm, not sure this is fully related but it might be (cc @hexfusion ):

level=info msg="Cluster operator openshift-etcd Progressing is True with Progressing: "
level=info msg="Cluster operator openshift-etcd Available is False with Available: "

/retest

runcom

comment created time in 24 days

pull request commentopenshift/machine-config-operator

WIP: add ctrcfg e2e test

/retest

haircommander

comment created time in 24 days

pull request commentopenshift/machine-config-operator

daemon: Move "Validated on-disk state" log to only when we have

Uhm, we probably need a BZ now that BZs have been enforced on master, let's see

/refresh

cgwalters

comment created time in 24 days

pull request commentopenshift/machine-config-operator

gcp: use readyz endpoint

/retest

runcom

comment created time in 24 days

pull request commentopenshift/machine-config-operator

Provide RoleLabelKey as const to be re-used by other operators

BZs have been enforced on master, so we need a bug for this if you want it. This can probably be postponed to the next release though, right?

isimluk

comment created time in 24 days

delete branch runcom/machine-config-operator

delete branch : no-workers-bootstrap-mcs

delete time in 24 days

PR opened openshift/machine-config-operator

Bug 1796147: pkg/server: serve config only to master in bootstrap server

The new cluster etcd operator flow is:

  1. start bootstrap mcs
  2. start etcd on bootstrap
  3. wait for bootstrapping to finish, i.e. at least one control-plane is ready and there is an MCS running on the cluster
  4. turn down bootstrap mcs

What the above does is give workers a chance to grab the Ignition config from the bootstrap server, which now stays up longer. However, by the time they attempt to create a CSR, the kube-apiserver has rotated that bootstrap chain of trust out, which causes the workers to error out with:

Jan 29 19:55:20 ip-10-0-130-205 hyperkube[2623]: E0129 19:55:20.869251 2623 certificate_manager.go:421] Failed while requesting a signed certificate from the master: cannot create certificate signing request: Unauthorized

As a result, the workers are never able to join the cluster.

This patch makes the bootstrap server refuse to serve the configuration to any pool but master, which keeps workers from grabbing the wrong config from the wrong server. Workers will keep polling for configuration and will eventually grab the correct one from the server running within the new cluster.

Signed-off-by: Antonio Murdaca <runcom@linux.com>
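The behaviour described in this PR could look roughly like the sketch below. It is a minimal illustration under assumptions, not the code from the patch: `bootstrapServer`, `getConfig`, and `errPoolNotServed` are hypothetical names, and configs are modelled as plain strings keyed by pool name.

```go
package main

import (
	"errors"
	"fmt"
)

// errPoolNotServed is a hypothetical sentinel error for refused pools.
var errPoolNotServed = errors.New("bootstrap server only serves the master pool")

// bootstrapServer is a hypothetical stand-in for the bootstrap MCS,
// holding one rendered config per pool name.
type bootstrapServer struct {
	configs map[string]string
}

// getConfig serves the config for the master pool only. Any other pool is
// refused, so its machines keep polling until the in-cluster server answers.
func (s *bootstrapServer) getConfig(pool string) (string, error) {
	if pool != "master" {
		return "", fmt.Errorf("refusing to serve pool %q: %w", pool, errPoolNotServed)
	}
	cfg, ok := s.configs[pool]
	if !ok {
		return "", fmt.Errorf("no config found for pool %q", pool)
	}
	return cfg, nil
}

func main() {
	s := &bootstrapServer{configs: map[string]string{
		"master": "master-ignition",
		"worker": "worker-ignition",
	}}

	for _, pool := range []string{"master", "worker"} {
		cfg, err := s.getConfig(pool)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("served:", cfg)
	}
}
```

Refusing at the bootstrap server rather than on the worker side means nothing on the workers has to change: they retry as they already do and pick up the config once the in-cluster server is up.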

+36 -2

0 comment

3 changed files

pr created time in 24 days

create branch runcom/machine-config-operator

branch : no-workers-bootstrap-mcs

created branch time in 24 days

pull request commentopenshift/machine-config-operator

WIP: add ctrcfg e2e test

/skip

haircommander

comment created time in 24 days

pull request commentopenshift/machine-config-operator

Provide RoleLabelKey as const to be re-used by other operators

/skip /lgtm /approve

isimluk

comment created time in 25 days

pull request commentopenshift/machine-config-operator

daemon: Move "Validated on-disk state" log to only when we have

(after some chats...) PR is super tiny so it can go in as it's not really requiring a BZ

/hold cancel /lgtm

cgwalters

comment created time in 25 days

pull request commentopenshift/machine-config-operator

gcp: use readyz endpoint

uhm, not fully sure the test failures are related

/retest

runcom

comment created time in 25 days

pull request commentopenshift/machine-config-operator

WIP: add ctrcfg e2e test

uhm I think e2e-gcp-op failures are now related

haircommander

comment created time in 25 days

pull request commentopenshift/machine-config-operator

baremetal: Remove .template from path in dhcp-dhclient-conf.yaml

but it looks like it never worked even on ipv4

I think we would love a BZ if this bug is present in 4.3 and below

/hold

hardys

comment created time in 25 days

pull request commentopenshift/machine-config-operator

baremetal: Remove .template from path in dhcp-dhclient-conf.yaml

/approve /lgtm

this is ipv6 work right?

hardys

comment created time in 25 days

pull request commentopenshift/machine-config-operator

Provide RoleLabelKey as const to be re-used by other operators

Makes sense to me :+1:

Will leave it to the team to review again and merge, but we may not get this in w/o a BZ; I'll get back with more info on that soon

isimluk

comment created time in 25 days

Pull request review commentopenshift/machine-config-operator

WIP: add ctrcfg e2e test

+package e2e_test
+
+import (
+	"fmt"
+	"os/exec"
+	"testing"
+	"time"
+
+	"github.com/stretchr/testify/require"
+	"k8s.io/apimachinery/pkg/util/uuid"
+	corev1 "k8s.io/api/core/v1"
+	"github.com/openshift/machine-config-operator/test/e2e/framework"
+	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+	mcfgv1 "github.com/openshift/machine-config-operator/pkg/apis/machineconfiguration.openshift.io/v1"
+	"github.com/pkg/errors"
+	"k8s.io/apimachinery/pkg/util/wait"
+)
+
+func TestContainerRuntimeConfigPidsLimit(t *testing.T) {
+	cs := framework.NewClientSet("")
+	testUUID := uuid.NewUUID()
+
+	matchValue := fmt.Sprintf("pids-limit-%s", testUUID)
+	mcpKey := "machine-config-pool/test"
+	ctrcfgName := fmt.Sprintf("ctrcfg-%s", testUUID)
+
+	_, err := exec.Command("oc", "label", "mcp", "worker", mcpKey+"="+matchValue).CombinedOutput()
+	require.Nil(t, err, "unable to label worker mcp")
+
+	defer func() {
+		_, err := exec.Command("oc", "label", "mcp", "worker", mcpKey+"-").CombinedOutput()
+		require.Nil(t, err, "unable to remove label worker mcp")
+	}()
+
+	cfg := &mcfgv1.ContainerRuntimeConfiguration{
+		PidsLimit: 2048,
+	}
+
+	cleanupCRC := createCRCWithConfig(t, cs, ctrcfgName, mcpKey, matchValue, cfg)
+	defer cleanupCRC()
+
+	mcName, err := waitForRenderedConfigFromCtrcfg(t, cs, "worker", ctrcfgName)
+	require.Nil(t, err, "failed to render machine config from container runtime config")
+
+	err = waitForPoolComplete(t, cs, "worker", mcName)
+	require.Nil(t, err)
+}
+
+func createCRCWithConfig(t *testing.T, cs *framework.ClientSet, name, key, value string, config *mcfgv1.ContainerRuntimeConfiguration) func() {
+	ctrcfg := &mcfgv1.ContainerRuntimeConfig{}
+	ctrcfg.ObjectMeta = metav1.ObjectMeta{
+		Name: name,
+	}
+	spec := mcfgv1.ContainerRuntimeConfigSpec{
+		MachineConfigPoolSelector: &metav1.LabelSelector{
+			MatchLabels: make(map[string]string),
+		},
+		ContainerRuntimeConfig: config,
+	}
+	spec.MachineConfigPoolSelector.MatchLabels[key] = value
+	ctrcfg.Spec = spec
+
+	_, err := cs.ContainerRuntimeConfigs().Create(ctrcfg)
+	require.Nil(t, err)
+	return func() {
+		cs.ContainerRuntimeConfigs().Delete(name, &metav1.DeleteOptions{})

also still not sure why the tests failed 🤔 looks super weird, but this is OT from this convo

haircommander

comment created time in 25 days

Pull request review commentopenshift/machine-config-operator

WIP: add ctrcfg e2e test

+package e2e_test
+
+import (
+	"fmt"
+	"os/exec"
+	"testing"
+	"time"
+
+	"github.com/stretchr/testify/require"
+	"k8s.io/apimachinery/pkg/util/uuid"
+	corev1 "k8s.io/api/core/v1"
+	"github.com/openshift/machine-config-operator/test/e2e/framework"
+	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+	mcfgv1 "github.com/openshift/machine-config-operator/pkg/apis/machineconfiguration.openshift.io/v1"
+	"github.com/pkg/errors"
+	"k8s.io/apimachinery/pkg/util/wait"
+)
+
+func TestContainerRuntimeConfigPidsLimit(t *testing.T) {
+	cs := framework.NewClientSet("")
+	testUUID := uuid.NewUUID()
+
+	matchValue := fmt.Sprintf("pids-limit-%s", testUUID)
+	mcpKey := "machine-config-pool/test"
+	ctrcfgName := fmt.Sprintf("ctrcfg-%s", testUUID)
+
+	_, err := exec.Command("oc", "label", "mcp", "worker", mcpKey+"="+matchValue).CombinedOutput()
+	require.Nil(t, err, "unable to label worker mcp")
+
+	defer func() {
+		_, err := exec.Command("oc", "label", "mcp", "worker", mcpKey+"-").CombinedOutput()
+		require.Nil(t, err, "unable to remove label worker mcp")
+	}()
+
+	cfg := &mcfgv1.ContainerRuntimeConfiguration{
+		PidsLimit: 2048,
+	}
+
+	cleanupCRC := createCRCWithConfig(t, cs, ctrcfgName, mcpKey, matchValue, cfg)
+	defer cleanupCRC()
+
+	mcName, err := waitForRenderedConfigFromCtrcfg(t, cs, "worker", ctrcfgName)
+	require.Nil(t, err, "failed to render machine config from container runtime config")
+
+	err = waitForPoolComplete(t, cs, "worker", mcName)
+	require.Nil(t, err)
+}
+
+func createCRCWithConfig(t *testing.T, cs *framework.ClientSet, name, key, value string, config *mcfgv1.ContainerRuntimeConfiguration) func() {
+	ctrcfg := &mcfgv1.ContainerRuntimeConfig{}
+	ctrcfg.ObjectMeta = metav1.ObjectMeta{
+		Name: name,
+	}
+	spec := mcfgv1.ContainerRuntimeConfigSpec{
+		MachineConfigPoolSelector: &metav1.LabelSelector{
+			MatchLabels: make(map[string]string),
+		},
+		ContainerRuntimeConfig: config,
+	}
+	spec.MachineConfigPoolSelector.MatchLabels[key] = value
+	ctrcfg.Spec = spec
+
+	_, err := cs.ContainerRuntimeConfigs().Create(ctrcfg)
+	require.Nil(t, err)
+	return func() {
+		cs.ContainerRuntimeConfigs().Delete(name, &metav1.DeleteOptions{})

I think this should be a continuation of the actual test where you make sure that rolling back (deleting) the ctrcfg brings the cluster back to normal (see #1412 for instance). TLDR: it should fail

haircommander

comment created time in 25 days

Pull request review commentopenshift/machine-config-operator

WIP: add ctrcfg e2e test

+package e2e_test
+
+import (
+	"fmt"
+	"os/exec"
+	"testing"
+	"time"
+
+	"github.com/stretchr/testify/require"
+	"k8s.io/apimachinery/pkg/util/uuid"
+	corev1 "k8s.io/api/core/v1"
+	"github.com/openshift/machine-config-operator/test/e2e/framework"
+	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+	mcfgv1 "github.com/openshift/machine-config-operator/pkg/apis/machineconfiguration.openshift.io/v1"
+	"github.com/pkg/errors"
+	"k8s.io/apimachinery/pkg/util/wait"
+)
+
+func TestContainerRuntimeConfigPidsLimit(t *testing.T) {
+	cs := framework.NewClientSet("")
+	testUUID := uuid.NewUUID()
+
+	matchValue := fmt.Sprintf("pids-limit-%s", testUUID)
+	mcpKey := "machine-config-pool/test"
+	ctrcfgName := fmt.Sprintf("ctrcfg-%s", testUUID)
+
+	_, err := exec.Command("oc", "label", "mcp", "worker", mcpKey+"="+matchValue).CombinedOutput()
+	require.Nil(t, err, "unable to label worker mcp")
+
+	defer func() {
+		_, err := exec.Command("oc", "label", "mcp", "worker", mcpKey+"-").CombinedOutput()
+		require.Nil(t, err, "unable to remove label worker mcp")
+	}()
+
+	cfg := &mcfgv1.ContainerRuntimeConfiguration{
+		PidsLimit: 2048,
+	}
+
+	cleanupCRC := createCRCWithConfig(t, cs, ctrcfgName, mcpKey, matchValue, cfg)
+	defer cleanupCRC()
+
+	mcName, err := waitForRenderedConfigFromCtrcfg(t, cs, "worker", ctrcfgName)
+	require.Nil(t, err, "failed to render machine config from container runtime config")
+
+	err = waitForPoolComplete(t, cs, "worker", mcName)
+	require.Nil(t, err)
+}
+
+func createCRCWithConfig(t *testing.T, cs *framework.ClientSet, name, key, value string, config *mcfgv1.ContainerRuntimeConfiguration) func() {
+	ctrcfg := &mcfgv1.ContainerRuntimeConfig{}
+	ctrcfg.ObjectMeta = metav1.ObjectMeta{
+		Name: name,
+	}
+	spec := mcfgv1.ContainerRuntimeConfigSpec{
+		MachineConfigPoolSelector: &metav1.LabelSelector{
+			MatchLabels: make(map[string]string),
+		},
+		ContainerRuntimeConfig: config,
+	}
+	spec.MachineConfigPoolSelector.MatchLabels[key] = value
+	ctrcfg.Spec = spec
+
+	_, err := cs.ContainerRuntimeConfigs().Create(ctrcfg)
+	require.Nil(t, err)
+	return func() {
+		cs.ContainerRuntimeConfigs().Delete(name, &metav1.DeleteOptions{})

While I love how this all works, it's kinda suboptimal as we're ignoring the error - maybe handle it somehow?
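One way to handle it, sketched under assumptions: have the cleanup closure report a failed delete through the test handle instead of dropping it. The `deleter` and `reporter` interfaces and the `cleanupFunc` helper below are hypothetical stand-ins so the example runs on its own; in the real test the `Delete` call and `*testing.T` would take their place.

```go
package main

import (
	"errors"
	"fmt"
)

// deleter is a hypothetical stand-in for the client's Delete call.
type deleter interface {
	Delete(name string) error
}

// reporter is the subset of *testing.T used here, so a real test can pass t.
type reporter interface {
	Errorf(format string, args ...interface{})
}

// cleanupFunc returns a closure that deletes the named object and reports
// a failure through t rather than silently ignoring it.
func cleanupFunc(t reporter, d deleter, name string) func() {
	return func() {
		if err := d.Delete(name); err != nil {
			t.Errorf("failed to delete %s during cleanup: %v", name, err)
		}
	}
}

// failingDeleter always fails, to exercise the error path.
type failingDeleter struct{}

func (failingDeleter) Delete(string) error { return errors.New("delete failed") }

// recorder collects reported errors in place of a real *testing.T.
type recorder struct{ msgs []string }

func (r *recorder) Errorf(format string, args ...interface{}) {
	r.msgs = append(r.msgs, fmt.Sprintf(format, args...))
}

func main() {
	r := &recorder{}
	cleanup := cleanupFunc(r, failingDeleter{}, "ctrcfg-example")
	cleanup()
	fmt.Println(r.msgs)
}
```

Reporting with `Errorf` marks the test as failed but still lets any remaining deferred cleanup run.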

haircommander

comment created time in 25 days

pull request commentopenshift/installer

docs: Docs for setting kernelType during initial cluster install

@abhinavdahiya ptal

sinnykumari

comment created time in 25 days

Pull request review commentopenshift/machine-config-operator

cmd: add gcp-routes-controllers to manage the routes service to prevent blackholes

+filesystem: "root"
+mode: 0644
+path: "/etc/kubernetes/manifests/gcp-routes-controller.yaml"
+contents:
+  inline: |
+    apiVersion: v1
+    kind: Pod
+    metadata:
+      name: gcp-routes-controller
+      namespace: kube-system
+    spec:
+      containers:
+      - name: gcp-routes-controller
+        image: "{{.Images.gcpRoutesControllerKey}}"
+        command: ["gcp-routes-controller"]
+        args:
+        - "run"
+        - "--health-check-url=https://127.0.0.1:6443/healthz"

created https://github.com/openshift/machine-config-operator/pull/1417 - please let me know about BZs and I'll go ahead and create them

abhinavdahiya

comment created time in 25 days
