
opencontainers/selinux 92

common selinux implementation

spdx/license-list-XML 69

This is the repository for the master files that comprise the SPDX License List

jbenet/depviz 49

dependency visualizer for the web

mndrix/tap-go 21

Test Anything Protocol for Go

adina/boot-camps 2

Software Carpentry boot camp material

LJWilliams/2014-03-17-uw 1

Software Carpentry workshop at U. Washington, March 17-18 2014

wking/angular-validation-match 1

Checks if one input matches another. Useful for confirming passwords, emails, or anything.

wking/awk-lesson 1

test run of a bower workflow for lesson distribution

wking/bes 1

Bulk uploader for Elasticsearch, written in Python

PullRequestReviewEvent

Pull request review comment openshift/cluster-kube-controller-manager-operator

Bug 1881246: change group name for PDB alerts

 metadata:
     exclude.release.openshift.io/internal-openshift-hosted: "true"
 spec:
   groups:
-    - name: cluster-version
+    - name: kube-controller-manager

Changing spec.groups[].name like this makes sense to me. But the bug was about the overlap in metadata.name. Can we combine the two PrometheusRule files into a single file with a single group containing multiple rules?

soltysh

comment created time in an hour

pull request comment openshift/cluster-kube-controller-manager-operator

Bug 1881246: change group name for PDB alerts

/bugzilla refresh

soltysh

comment created time in an hour

PullRequestReviewEvent
PullRequestReviewEvent

delete branch wking/openshift-release

delete branch : proxy-ssh-public-ip

delete time in an hour

delete branch wking/openshift-docs

delete branch : PDBB-typo

delete time in 8 hours

delete branch wking/openshift-docs

delete branch : internal-must-gather-link

delete time in 8 hours

pull request comment openshift/release

ci-operator/step-registry/ipi/conf/aws/proxy: Use public IP for SSH

/hold cancel

wking

comment created time in 8 hours

pull request comment openshift/cluster-version-operator

Bug 1872906: pkg/start: Release leader lease on graceful shutdown

4.6 bug is VERIFIED, so we should be unblocked here:

/bugzilla refresh

wking

comment created time in 11 hours

PullRequestReviewEvent

pull request comment openshift/release

ci-operator/step-registry/ipi/conf/aws/proxy: Use public IP for SSH

/hold

for rehearsals

wking

comment created time in 15 hours

PR opened openshift/release

ci-operator/step-registry/ipi/conf/aws/proxy: Use public IP for SSH

Avoid:

ssh: connect to host 10.0.1.169 port 22: Connection timed out

and similar when we try to SSH in from outside the VPN.

+8 -3

0 comments

1 changed file

pr created time in 15 hours

create branch wking/openshift-release

branch : proxy-ssh-public-ip

created branch time in 15 hours

Pull request review comment openshift/enhancements

enhancements/update/manifest-install-levels: Propose a new enhancement

+---
+title: manifest-install-levels
+authors:
+  - "@wking"
+reviewers:
+  - "@LalatenduMohanty"
+approvers:
+  - TBD
+creation-date: 2020-09-12
+last-updated: 2020-09-12
+status: implementable
+---
+
+# Manifest Install Levels
+
+## Release Signoff Checklist
+
+- [x] Enhancement is `implementable`
+- [x] Design details are appropriately documented from clear requirements
+- [x] Test plan is defined
+- [x] Graduation criteria for dev preview, tech preview, GA
+- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)
+
+## Summary
+
+[Several operators][extra-manifests] pass manifests [like these][extra-manifests-example] to the cluster-version operator which are nice-to-have, [create-only][] defaults, but not critical to bootstrap completion.
+That's fine, and makes life in the post-install clusters easier.
+However, when a user feeds the installer their own alternative content, [cluster-bootstrap][] will [race][] the cluster-version operator to push the content into the cluster.
+Sometimes cluster-bootstrap wins, and the user-provided manifest ends up in-cluster.
+But sometimes the cluster-version operator wins, and the user-provided content is forgotten when the create-only default ends up in-cluster.
+This enhancement adds a new enhancement to delay non-critical manifests to later in the installation, ensuring the cluster-version operator will always lose the race and cluster-bootstrap will push the user-provided manifest into the cluster.
+
+## Motivation
+
+### Goals
+
+* Respect manifests that users feed the installer, even if they target a resource that is also backed by a cluster-version operator, create-only manifest.
+
+### Non-Goals
+
+* Completely ordering installation.
+    The cluster-version operator used to order manifests during installation, but we [dropped that][parallel-install] for quicker installs.
+    This enhancement should restore enough install-time ordering to avoid the races, but not so much that it significantly delays installs.
+
+## Proposal
+
+This enhancemnt defines a new manifest annotation, `release.openshift.io/install-level`, which allows manifests to be ordered by run-level during installation.
+The default value is `0`, and currently the only other allowed value is `1`.
+
+The cluster-version operator will ignore the new annotation for all phases except installation.

I've explicitly mentioned all three CVO phases in 916cef8 -> 949a91b. Hopefully it's more clear now that this new annotation is completely ignored outside of the install phase.

wking

comment created time in 21 hours

PullRequestReviewEvent

push event wking/openshift-enhancements

W. Trevor King

commit sha 949a91b77bf5fd9eb323f1f7f83e54c7253e15e1

enhancements/update/manifest-install-levels: Propose a new enhancement

view details

push time in 21 hours

Pull request review comment openshift/enhancements

enhancements/update/manifest-install-levels: Propose a new enhancement

+---
+title: manifest-install-levels
+authors:
+  - "@wking"
+reviewers:
+  - "@LalatenduMohanty"
+approvers:
+  - TBD
+creation-date: 2020-09-12
+last-updated: 2020-09-12
+status: implementable
+---
+
+# Manifest Install Levels
+
+## Release Signoff Checklist
+
+- [x] Enhancement is `implementable`
+- [x] Design details are appropriately documented from clear requirements
+- [x] Test plan is defined
+- [x] Graduation criteria for dev preview, tech preview, GA
+- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)
+
+## Summary
+
+[Several operators][extra-manifests] pass manifests [like these][extra-manifests-example] to the cluster-version operator which are nice-to-have, [create-only][] defaults, but not critical to bootstrap completion.
+That's fine, and makes life in the post-install clusters easier.
+However, when a user feeds the installer their own alternative content, [cluster-bootstrap][] will [race][] the cluster-version operator to push the content into the cluster.
+Sometimes cluster-bootstrap wins, and the user-provided manifest ends up in-cluster.
+But sometimes the cluster-version operator wins, and the user-provided content is forgotten when the create-only default ends up in-cluster.
+This enhancement adds a new enhancement to delay non-critical manifests to later in the installation, ensuring the cluster-version operator will always lose the race and cluster-bootstrap will push the user-provided manifest into the cluster.
+
+## Motivation
+
+### Goals
+
+* Respect manifests that users feed the installer, even if they target a resource that is also backed by a cluster-version operator, create-only manifest.

I've added a terminology section to the enhancement, and I've also added an explicit guard to restrict this new annotation to manifests which already set the create-only annotation.

$ oc adm release extract --to manifests quay.io/openshift-release-dev/ocp-release:4.6.0-fc.7-x86_64
Extracted release payload from digest sha256:99fdfbd1af951f1b5cf2e2aa1a79b6c2ec565950817ffcaa8bac54014d1852dc created at 2020-09-18T09:31:51Z
$ grep -r /create-only manifests
manifests/0000_05_config-operator_02_proxy.cr.yaml:    release.openshift.io/create-only: "true"
manifests/0000_05_config-operator_02_build.cr.yaml:    release.openshift.io/create-only: "true"
manifests/0000_05_config-operator_02_ingress.cr.yaml:    release.openshift.io/create-only: "true"
manifests/0000_50_console-operator_04-rbac-rolebinding-cluster.yaml:    "release.openshift.io/create-only": 'true'
...

I am not clear on how breaking down customer-provided manifests, or any manifests applied via cluster-bootstrap, into subcategories is useful. This enhancement proposal is about keeping optional release image manifests out of the cluster until cluster-bootstrap is done. It is not about changing anything about how cluster-bootstrap operates.

I am not clear on the importance of distinguishing based on support for "customer overrides" (do you mean "customer configuration"?). I would expect release image manifests which support customer configuration to set the create-only annotation, otherwise the CVO would just stomp the customer's changes.

It's possible that there are create-only release image manifests which are still not customer-managed on day 2. For example, perhaps the CVO creates the manifest, but then a second-level operator picks up management of that manifest later? I am not aware of any such manifests. But if there's something similar that drives a need to distinguish some subset of create-only release image manifests where the new release.openshift.io/install-level should not apply, can we give a specific example from 4.6.0-fc.7?

wking

comment created time in 21 hours
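To make the guard discussed above concrete, here is a minimal Go sketch of restricting the new release.openshift.io/install-level annotation to create-only manifests. The helper name, error text, and allowed-level check are assumptions for illustration, not the enhancement's actual implementation.

package main

import (
	"fmt"
	"strconv"
)

// installLevel is a hypothetical helper illustrating the guard described
// above: a manifest may only request a delayed install level if it already
// sets release.openshift.io/create-only.
func installLevel(annotations map[string]string) (int, error) {
	value, ok := annotations["release.openshift.io/install-level"]
	if !ok {
		return 0, nil // default level
	}
	if annotations["release.openshift.io/create-only"] != "true" {
		return 0, fmt.Errorf("release.openshift.io/install-level requires release.openshift.io/create-only=true")
	}
	level, err := strconv.Atoi(value)
	if err != nil || level < 0 || level > 1 {
		return 0, fmt.Errorf("invalid install-level %q: only 0 and 1 are allowed", value)
	}
	return level, nil
}

func main() {
	level, err := installLevel(map[string]string{
		"release.openshift.io/create-only":   "true",
		"release.openshift.io/install-level": "1",
	})
	fmt.Println(level, err)
}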

PullRequestReviewEvent

Pull request review comment openshift/enhancements

enhancements/update/manifest-install-levels: Propose a new enhancement

+---
+title: manifest-install-levels
+authors:
+  - "@wking"
+reviewers:
+  - "@LalatenduMohanty"

since this is affecting control plane, I would expect this to be reviewed by...

The more, the merrier. But I'm not clear on the control-plane connection. Can you clarify?

wking

comment created time in 21 hours

PullRequestReviewEvent
PullRequestReviewEvent

Pull request review comment openshift/enhancements

enhancements/update/manifest-install-levels: Propose a new enhancement

+---
+title: manifest-install-levels
+authors:
+  - "@wking"
+reviewers:
+  - "@LalatenduMohanty"
+approvers:
+  - TBD
+creation-date: 2020-09-12
+last-updated: 2020-09-12
+status: implementable
+---
+
+# Manifest Install Levels
+
+## Release Signoff Checklist
+
+- [x] Enhancement is `implementable`
+- [x] Design details are appropriately documented from clear requirements
+- [x] Test plan is defined
+- [x] Graduation criteria for dev preview, tech preview, GA
+- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)
+
+## Summary
+
+[Several operators][extra-manifests] pass manifests [like these][extra-manifests-example] to the cluster-version operator which are nice-to-have, [create-only][] defaults, but not critical to bootstrap completion.
+That's fine, and makes life in the post-install clusters easier.
+However, when a user feeds the installer their own alternative content, [cluster-bootstrap][] will [race][] the cluster-version operator to push the content into the cluster.
+Sometimes cluster-bootstrap wins, and the user-provided manifest ends up in-cluster.
+But sometimes the cluster-version operator wins, and the user-provided content is forgotten when the create-only default ends up in-cluster.
+This enhancement adds a new enhancement to delay non-critical manifests to later in the installation, ensuring the cluster-version operator will always lose the race and cluster-bootstrap will push the user-provided manifest into the cluster.
+
+## Motivation
+
+### Goals
+
+* Respect manifests that users feed the installer, even if they target a resource that is also backed by a cluster-version operator, create-only manifest.
+
+### Non-Goals
+
+* Completely ordering installation.
+    The cluster-version operator used to order manifests during installation, but we [dropped that][parallel-install] for quicker installs.
+    This enhancement should restore enough install-time ordering to avoid the races, but not so much that it significantly delays installs.
+
+## Proposal
+
+This enhancemnt defines a new manifest annotation, `release.openshift.io/install-level`, which allows manifests to be ordered by run-level during installation.

AFAIK there is already an existing concept for run-level which allows KAS and other core operators to bypass the SCC (that require OAS to be running).

Not in the cluster-version operator. Maybe this is something internal to cluster-bootstrap?

wking

comment created time in 21 hours

push event wking/oc

W. Trevor King

commit sha 170242ca4b0688d312ffadbe29da6a2d559cfff7

pkg/cli/admin/upgrade/channel: Add 'oc adm upgrade channel ...'

A new subcommand for conveniently managing channels. Example workflow, starting in a channel:

$ oc adm upgrade
Cluster version is 4.6.0-fc.3

Upstream: https://api.openshift.com/api/upgrades_info/v1/graph
Channel: candidate-4.6 (choices: candidate-4.6)
Updates:

VERSION    IMAGE
4.6.0-fc.4 quay.io/openshift-release-dev/ocp-release@sha256:960ec73733150827076cbb5fa2c1f5aaa9a94bfbce1b4897e46432a56ac976c1
4.6.0-fc.5 quay.io/openshift-release-dev/ocp-release@sha256:5883d0db15939484bd477147e6949c53fbc6f551ec20a0f1106b8a3acfb86ef8

Trying to change to the same channel is a no-op:

$ oc adm upgrade channel candidate-4.6
info: Cluster is already in candidate-4.6

Trying to change to an unrecognized channel gets a warning:

$ oc adm upgrade channel does-not-exist
error: the requested channel "does-not-exist" is not one of the available channels (candidate-4.6), you must pass --allow-explicit-channel to continue
$ oc adm upgrade channel --allow-explicit-channel does-not-exist
warning: The requested channel "does-not-exist" is not one of the available channels (candidate-4.6).  You have used --allow-explicit-channel to proceed anyway.
$ oc adm upgrade
Cluster version is 4.6.0-fc.3

Channel: does-not-exist
warning: Cannot display available updates:
  Reason: VersionNotFound
  Message: Unable to retrieve available updates: currently reconciling cluster version 4.6.0-fc.3 not found in the "does-not-exist" channel

When we have no known channels, changing requires no override:

$ oc adm upgrade channel does-not-exist-either
warning: No channels known to be compatible with the current version "4.6.0-fc.3"; unable to validate "does-not-exist-either".
$ oc adm upgrade channel candidate-4.6
warning: No channels known to be compatible with the current version "4.6.0-fc.3"; unable to validate "candidate-4.6".

Clearing a known channel needs an explicit override:

$ oc adm upgrade channel
error: the requested channel "" is not one of the available channels (candidate-4.6), you must pass --allow-explicit-channel to continue
$ oc adm upgrade channel --allow-explicit-channel
warning: Clearing channel "candidate-4.6"; cluster will no longer request available update recommendations.
$ oc adm upgrade
Cluster version is 4.6.0-fc.3

warning: Cannot display available updates:
  Reason: NoChannel
  Message: The update channel has not been configured.

Trying to re-clear the channel is a no-op:

$ oc adm upgrade channel
info: Cluster channel is already clear

And you can set any channel from a cleared channel without an override, because this is another case where we have no idea what the valid choices are:

$ oc adm upgrade channel candidate-4.6
warning: No channels known to be compatible with the current version "4.6.0-fc.3"; unable to validate "candidate-4.6".
$ oc adm upgrade
Cluster version is 4.6.0-fc.3

Upstream: https://api.openshift.com/api/upgrades_info/v1/graph
Channel: candidate-4.6 (choices: candidate-4.6)
Updates:

VERSION    IMAGE
4.6.0-fc.4 quay.io/openshift-release-dev/ocp-release@sha256:960ec73733150827076cbb5fa2c1f5aaa9a94bfbce1b4897e46432a56ac976c1
4.6.0-fc.5 quay.io/openshift-release-dev/ocp-release@sha256:5883d0db15939484bd477147e6949c53fbc6f551ec20a0f1106b8a3acfb86ef8

Clearing from an unknown channel does not require an override either:

$ oc adm upgrade channel --allow-explicit-channel does-not-exist
warning: The requested channel "does-not-exist" is not one of the available channels (candidate-4.6).  You have used --allow-explicit-channel to proceed anyway.
$ oc adm upgrade channel
warning: Clearing channel "does-not-exist"; cluster will no longer request available update recommendations.

Completions updated with:

$ hack/update-generated-completions.sh

view details

push time in 3 days

Pull request review comment openshift/openshift-docs

BZ#1828814 Update production env support to include fast channels

 $ oc get clusterversion -o json|jq ".items[0].spec"
+

Yup, releases and recommended updates in fast-* and stable-* channels are all explicitly supported, as described in the channel docs in this repo.

codyhoag

comment created time in 3 days

PullRequestReviewEvent

pull request comment openshift/openshift-docs

modules/understanding-upgrade-channels: Recommend clearing channel

I've floated openshift/oc#576 for convenient, patch-free channel management.

wking

comment created time in 4 days

PR opened openshift/oc

pkg/cli/admin/upgrade/channel: Add 'oc adm upgrade channel ...'

A new subcommand for conveniently managing channels. Example workflow, starting in a channel:

$ oc adm upgrade
Cluster version is 4.6.0-fc.3

Upstream: https://api.openshift.com/api/upgrades_info/v1/graph
Channel: candidate-4.6 (choices: candidate-4.6)
Updates:

VERSION    IMAGE
4.6.0-fc.4 quay.io/openshift-release-dev/ocp-release@sha256:960ec73733150827076cbb5fa2c1f5aaa9a94bfbce1b4897e46432a56ac976c1
4.6.0-fc.5 quay.io/openshift-release-dev/ocp-release@sha256:5883d0db15939484bd477147e6949c53fbc6f551ec20a0f1106b8a3acfb86ef8

Trying to change to the same channel is a no-op:

$ oc adm upgrade channel candidate-4.6
info: Cluster is already in candidate-4.6

Trying to change to an unrecognized channel gets a warning:

$ oc adm upgrade channel does-not-exist
error: the requested channel "does-not-exist" is not one of the available channels (candidate-4.6), you must pass --allow-explicit-channel to continue
$ oc adm upgrade channel --allow-explicit-channel does-not-exist
warning: The requested channel "does-not-exist" is not one of the available channels (candidate-4.6).  You have used --allow-explicit-channel to proceed anyway.
$ oc adm upgrade
Cluster version is 4.6.0-fc.3

Channel: does-not-exist
warning: Cannot display available updates:
  Reason: VersionNotFound
  Message: Unable to retrieve available updates: currently reconciling cluster version 4.6.0-fc.3 not found in the "does-not-exist" channel

When we have no known channels, changing requires no override:

$ oc adm upgrade channel does-not-exist-either
warning: No channels known to be compatible with the current version "4.6.0-fc.3"; unable to validate "does-not-exist-either".
$ oc adm upgrade channel candidate-4.6
warning: No channels known to be compatible with the current version "4.6.0-fc.3"; unable to validate "candidate-4.6".

Clearing a known channel needs an explicit override:

$ oc adm upgrade channel
error: the requested channel "" is not one of the available channels (candidate-4.6), you must pass --allow-explicit-channel to continue
$ oc adm upgrade channel --allow-explicit-channel
warning: Clearing channel "candidate-4.6"; cluster will no longer request available update recommendations.
$ oc adm upgrade
Cluster version is 4.6.0-fc.3

warning: Cannot display available updates:
  Reason: NoChannel
  Message: The update channel has not been configured.

Trying to re-clear the channel is a no-op:

$ oc adm upgrade channel
info: Cluster channel is already clear

And you can set any channel from a cleared channel without an override, because this is another case where we have no idea what the valid choices are:

$ oc adm upgrade channel candidate-4.6
warning: No channels known to be compatible with the current version "4.6.0-fc.3"; unable to validate "candidate-4.6".
$ oc adm upgrade
Cluster version is 4.6.0-fc.3

Upstream: https://api.openshift.com/api/upgrades_info/v1/graph
Channel: candidate-4.6 (choices: candidate-4.6)
Updates:

VERSION    IMAGE
4.6.0-fc.4 quay.io/openshift-release-dev/ocp-release@sha256:960ec73733150827076cbb5fa2c1f5aaa9a94bfbce1b4897e46432a56ac976c1
4.6.0-fc.5 quay.io/openshift-release-dev/ocp-release@sha256:5883d0db15939484bd477147e6949c53fbc6f551ec20a0f1106b8a3acfb86ef8

Clearing from an unknown channel does not require an override either:

$ oc adm upgrade channel --allow-explicit-channel does-not-exist
warning: The requested channel "does-not-exist" is not one of the available channels (candidate-4.6).  You have used --allow-explicit-channel to proceed anyway.
$ oc adm upgrade channel
warning: Clearing channel "does-not-exist"; cluster will no longer request available update recommendations.
+145 -0

0 comments

2 changed files

pr created time in 4 days

create branch wking/oc

branch : channel-chooser

created branch time in 4 days

delete branch wking/cluster-version-operator

delete branch : log-explainer

delete time in 4 days

push event wking/cluster-version-operator

W. Trevor King

commit sha ddad11e86998eb5ca06fc1764783b99673bd499e

waterfall: Add script to plot CVO waterfalls

view details

W. Trevor King

commit sha a4acb2a32826b6a191500ccac1de175a41a59c29

waterfall/cvo-waterfall: Log incomplete manifests

For example:

$ curl -s https://gist.githubusercontent.com/steveeJ/eeea8316f676727008274eeaf22b7487/raw/ecce637665f59a5d7b967b59e713867389c29c10/gistfile1.txt | cvo-waterfall.py >/tmp/cvo.svg
WARNING:root:not finished: clusteroperator authentication

view details

W. Trevor King

commit sha ac83d98dd2c3c355e64fdee0dce3f84dd9c80e70

waterfall/cvo-waterfall: Display uncompleted manifests too

In red, so you can distinguish them from completed manifests.

view details

W. Trevor King

commit sha 3e116c65c7c48e295a60fc474c81ce8ed29afd4d

waterfall/cvo-waterfall: Add a title and 5-minute bars

view details

W. Trevor King

commit sha 087b506a2e58316a90d1a4d4eb9dd231bf9aa9c7

waterfalls/cvo-waterfall: Soften the log regexp's ^ anchor

Must-gather includes a timestamp prefix:

$ curl -s --compressed https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-version-operator/178/pull-ci-openshift-cluster-version-operator-master-e2e-aws-upgrade/134/artifacts/e2e-aws-upgrade/must-gather/namespaces/openshift-cluster-version/pods/cluster-version-operator-7cb5846b6b-ljbd5/cluster-version-operator/cluster-version-operator/logs/current.log | grep 'Running sync' | tail -n1
2019-06-13T15:50:32.447432391Z I0613 15:50:32.447421 1 sync_worker.go:574] Running sync for clusteroperator "kube-apiserver" (48 of 381)

view details

W. Trevor King

commit sha 8410557dc27709370c993d282b6c3fa28930a1ee

hack/log-explainer: Expand SVG waterfall to explain more of the log

Making it easier to dig into the individual sync cycles, find goroutines that have hung, etc.

view details

W. Trevor King

commit sha 632e763fa209c3e1bd07098c2e33599f8c663e3e

pkg/payload/task_graph: Avoid deadlocking on cancel with workCh queue

Before 55ef3d3027 (pkg/payload/task_graph: Handle node pushing and result collection without a goroutine, 2019-10-21, #264), RunGraph had a separate goroutine that managed the work queue, with results fed into errCh to be collected by the main RunGraph goroutine. It didn't matter if that work queue goroutine hung; as long as all the worker goroutines exited, RunGraph would collect their errors from errCh and return.

In 55ef3d3027, I removed the queue goroutine and moved queue management into the main RunGraph goroutine. With that change, we became exposed to the following race:

1. Main goroutine pushes work into workCh.
2. Context canceled.
3. Workers exit via the "Canceled worker..." case, so they don't pick the work out of workCh.
4. Main goroutine deadlocks because there is work in flight, but nothing in resultCh, and no longer any workers to feed resultCh.

In logs, this looks like "sync-worker goroutine has gone to sleep, and is no longer synchronizing manifests" [1]. There are two mitigating factors:

a. 'select' docs [2]: If one or more of the communications can proceed, a single one that can proceed is chosen via a uniform pseudo-random selection. So the race's step 3 will happen in about half of the cases where the context has been canceled. In the other half of cases, the worker will randomly decide to pick up the queued work, notice it's been canceled while processing that work, and return a "context canceled" result.

b. We scale workCh by the number of workers. So the deadlock risk requires enough parallel work to fill the queue faster than workers are draining it and enough bad luck in the worker's select that the canceled workers don't drain the queue on their own. E.g. with our eight ClusterOperator-precreate workers, we'd have a 0.5^8 ~= 0.4% chance of not draining a single in-queue node post-cancel, and a 1 - 0.5^8 ~= 99.6% chance of not draining eight in-queue nodes post-cancel.

With this commit, we drain results when they are available, but we also respect the context to allow the resultCh read to be canceled. When we have been canceled with work in flight, we also attempt a non-blocking read from workCh to drain out anything there that has not yet been picked up by a worker. Because 'done' will not be set true, we'll call getNextNode again and come in with a fresh pass through the for loop. ctx.Err() will no longer be nil, but if the workCh drain worked, we may now have inflight == 0, and we'll end up in the case that sets 'done' true, and break out of the for loop on that round.

The unit test sets up two parallel nodes: a and b. We configure one worker, which picks up node a. Node b doesn't block on node a, so it gets pushed into workCh while the worker grinds through node a. On its second task in node a, the worker cancels the run. Because the sleeps do not have select-ctx.Done guards, the worker finishes off that second task, notices the cancel as it enters its third task, and exits with the "context canceled" error. This leaves node b stuck in workCh, and we need the fix from this commit to avoid deadlocking on that in-flight node.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1873900
[2]: https://golang.org/ref/spec#Select_statements

view details
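The commit message above lends itself to a small illustration. This is a hedged Go sketch of the drain-on-cancel pattern it describes; the channel names and overall structure are assumptions, not the CVO's actual RunGraph code.

package main

import (
	"context"
	"fmt"
	"time"
)

// runGraph sketches the pattern from the commit message: collect results
// while also watching for cancellation, and drain unclaimed work from
// workCh ourselves once canceled, so workers that exited without picking
// up queued nodes cannot deadlock the collector.
func runGraph(ctx context.Context, nodes, workers int) []error {
	workCh := make(chan int, nodes)
	resultCh := make(chan error)
	for i := 0; i < workers; i++ {
		go func() {
			for {
				select {
				case <-ctx.Done():
					return // may exit without picking queued work out of workCh
				case node := <-workCh:
					_ = node
					time.Sleep(10 * time.Millisecond) // stand-in for syncing the node
					resultCh <- ctx.Err()
				}
			}
		}()
	}
	for i := 0; i < nodes; i++ {
		workCh <- i
	}
	var errs []error
	for inflight := nodes; inflight > 0; {
		if ctx.Err() != nil {
			// Canceled with work in flight: non-blocking drain of workCh.
			select {
			case <-workCh:
				inflight--
				errs = append(errs, ctx.Err())
				continue
			default:
			}
		}
		if err := <-resultCh; err != nil {
			errs = append(errs, err)
		}
		inflight--
	}
	return errs
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 25*time.Millisecond)
	defer cancel()
	fmt.Printf("errors: %v\n", runGraph(ctx, 8, 2))
}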

W. Trevor King

commit sha 1d9a31926dba4ed0e35c9891591003a9308fb9e0

Dockerfile.rhel: Bump to Go 1.15

ART is waffling around the 1.14 <-> 1.15 transition [1], but we don't need to wait. Let the future start today!

[1]: https://github.com/openshift/ocp-build-data/pull/665

view details

OpenShift Merge Robot

commit sha 02467f9ad01a97870b3765621a5b239508b8d2f8

Merge pull request #457 from wking/go-1.15

Bug 1878163: Dockerfile.rhel: Bump to Go 1.15

view details

OpenShift Merge Robot

commit sha 4cd5b399d8f59ff8ed5dbee119af53248da2f1cb

Merge pull request #455 from wking/context-canceled-locked-task-graph

Bug 1873900: pkg/payload/task_graph: Avoid deadlocking on cancel with workCh queue

view details

OpenShift Merge Robot

commit sha 22696615350d73b0c848ef32ae255ba57db3649d

Merge pull request #452 from wking/log-explainer

Bug 1880285: hack/log-explainer: Render CVO logs for easier analysis

view details

push time in 4 days

PR opened openshift/cluster-version-operator

pkg/cvo/upgradeable: Fix "Upgradebale" -> "Upgradeable"

Fixing a typo from 04528144fe (#243).

Generated with:

$ sed -i 's/Upgradebale/Upgradeable/g' $(git grep -l Upgradebale)

No rush; we can land this after 4.6 forks off and master reopens.

+3 -3

0 comments

1 changed file

pr created time in 4 days

pull request comment openshift/openshift-docs

modules/understanding-upgrade-channels: Recommend clearing channel

Is this blocked on docs with an oc patch ... command to clear the channel? I'll try to get an oc wrapper around channel maintenance in place in the meantime...

wking

comment created time in 4 days

Pull request review comment openshift/cincinnati-graph-data

Add rust-based CI tool

+use cincinnati::plugins::internal::openshift_secondary_metadata_parser::plugin;
+use regex::Regex;
+use semver::Version;
+use serde::de::DeserializeOwned;
+use serde_yaml;
+use std::collections::HashSet;
+use std::path::PathBuf;
+use tokio;
+
+use anyhow::Context;
+use anyhow::Result as Fallible;
+
+pub async fn run() -> Fallible<HashSet<Version>> {
+  let data_dir = PathBuf::from("..");
+  let extension_re = Regex::new("ya+ml")?;
+  // Collect a list of mentioned versions
+  let mut found_versions: HashSet<Version> = HashSet::new();
+
+  println!("Verifying blocked edge files are valid");
+  let blocked_edge_path = data_dir.join(plugin::BLOCKED_EDGES_DIR).canonicalize()?;
+  let blocked_edge_vec =
+    walk_files::<plugin::graph_data_model::BlockedEdge>(&blocked_edge_path, &extension_re).await?;

Somewhat in this space, and feel free to punt as out of scope for a v1 PR, but it would be nice to be able to feed this tool graph-builder and policy-engine plugin configs, which we could copy/edit/paste from our production update service deployment, and have it ingest the local graph-data checkout, wash through the rest of the plugin stack, and spit out assorted policy-engine output graphs for all architectures and channels. Or diff vs. the current production graph. Or something...

vrutkovs

comment created time in 4 days

PullRequestReviewEvent
PullRequestReviewEvent

Pull request review comment openshift/cincinnati-graph-data

Add rust-based CI tool

+use cincinnati::plugins::internal::openshift_secondary_metadata_parser::plugin;
+use regex::Regex;
+use semver::Version;
+use serde::de::DeserializeOwned;
+use serde_yaml;
+use std::collections::HashSet;
+use std::path::PathBuf;
+use tokio;
+
+use anyhow::Context;
+use anyhow::Result as Fallible;
+
+pub async fn run() -> Fallible<HashSet<Version>> {
+  let data_dir = PathBuf::from("..");
+  let extension_re = Regex::new("ya+ml")?;
+  // Collect a list of mentioned versions
+  let mut found_versions: HashSet<Version> = HashSet::new();
+
+  println!("Verifying blocked edge files are valid");
+  let blocked_edge_path = data_dir.join(plugin::BLOCKED_EDGES_DIR).canonicalize()?;
+  let blocked_edge_vec =
+    walk_files::<plugin::graph_data_model::BlockedEdge>(&blocked_edge_path, &extension_re).await?;

Do we need local walkers for validation? I was expecting us to be able to wash through the github-secondary-metadata-scrape plugin.

vrutkovs

comment created time in 4 days

Pull request review comment openshift/cincinnati-graph-data

Add rust-based CI tool

+[package]
+name = "cincinnati-graph-data"
+version = "0.1.0"
+authors = ["Vadim Rutkovsky <vrutkovs@redhat.com>"]
+edition = "2018"

Is this some Cargo magic, or should it be 2020?

vrutkovs

comment created time in 4 days

PullRequestReviewEvent

Pull request review comment openshift/enhancements

Cluster profile: add details about the default profile

 has been specified.
 Manifests may support inclusion in multiple profiles by including as many of these annotations as needed.
+Manifests with no explicit inclusion annotations implicitly belong to the `default` profile.
+If no profile is configured, the cluster-version operator should use the `default` profile.

@deads2k : #482 reminded me that this might still be in flight, and seems like rhbz#1871890 still links a bunch of open PRs, mostly in the kube-API sphere of influence. And seems like maybe there aren't even PRs up yet for many components outside the kube-API sphere of influence. Any preliminary thoughts on whether this is something that can be accomplished with a single driving dev, vs. something where you need to call for teams to self-serve in aos-devel@, and then run around and poke folks who don't act on that call?

guillaumerose

comment created time in 4 days
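A small Go sketch of the default-profile rule quoted in the diff above. The include.release.openshift.io/ annotation prefix comes from the cluster-profiles enhancement, but this particular helper and its semantics are an assumption, not the CVO's actual filtering code.

package main

import (
	"fmt"
	"strings"
)

// includeManifest sketches the quoted rule: manifests with no explicit
// inclusion annotations implicitly belong to the default profile, and an
// unconfigured profile means the default profile.
func includeManifest(annotations map[string]string, profile string) bool {
	if profile == "" {
		profile = "default" // no profile configured: fall back to default
	}
	explicit := false
	for key, value := range annotations {
		if !strings.HasPrefix(key, "include.release.openshift.io/") {
			continue
		}
		explicit = true
		if strings.TrimPrefix(key, "include.release.openshift.io/") == profile && value == "true" {
			return true
		}
	}
	return !explicit && profile == "default"
}

func main() {
	fmt.Println(includeManifest(nil, "")) // true: implicit default membership
	fmt.Println(includeManifest(map[string]string{
		"include.release.openshift.io/some-profile": "true",
	}, "")) // false: explicitly scoped to another profile
}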

PullRequestReviewEvent
PullRequestReviewEvent

delete branch wking/openshift-release

delete branch : platform-agnostic-pivot-tooling

delete time in 4 days

push event wking/openshift-release

flacatus

commit sha f72ef1cb74f28f9b0826aa81a681e5d608c2f8d1

Add credentials to Eclipse Che performance tests

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha a62c664f6f85561b29ac114baa08bbf6f6878df8

Add cluster

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha bd144e392e89672680a3169aef4d6d4c8110f864

Fixes

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha 610382082cc983783fcc356b420e8f2848f205d5

Add ipi-aws workflow

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha 00b76f4dd01758f5ef773b26f002215af3484c68

Fix cluster_profile

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha 93f9763c740d487e7a2a4c7d8afa7111e32c36b5

Fixes

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha 7ad8023b11a23098203a779c57bd74b23399c0ce

Fixes

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha f88ea5ce2cee33601ba92b7ebe1fd67275b5c93b

Another try

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha 6888a23a06f5086a786e7c8c74df0e4819437956

Another try

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha 5eda0f1699f1f4c2c9f8df3b973b9863456f912c

Anothe try

Signed-off-by: flacatus <flacatus@redhat.com>

view details

Yang Le

commit sha db67e7e9c0e513c5fa58d895f867889b4a2b2387

Enable e2e test of registration-operator

view details

Vadim Rutkovsky

commit sha 9be53e4cf0a2a9d40506e639b552e62062624f1d

Loki-step: install Loki via manifest sideloading

This would ensure that Loki is being installed early in the install process

view details

Vadim Rutkovsky

commit sha bab9bc7e8fb87ecf1f47ce4f5f9692ea96639229

ipi-install-loki-commands: convert JSON manifests to YAML, create separate files

JSON or multidocument manifests are not being applied by bootstrap CVO correctly

view details

Vadim Rutkovsky

commit sha 6280a45202d3bde19ebf0c2002df7075309a91a4

ipi-install-loki: set a single version for all components

view details

Mike Fedosin

commit sha 929869a518724dda675122605bfb4e45b2aa2b55

Add OpenStack Cinder CSI driver image

view details

Russell Teague

commit sha 5185f5021df1b7a1dc39efb927257d6e23d57e52

jobs/openshift/release: Update workers-rhel7 periodics

view details

Kiran Thyagaraja

commit sha a4b57b304fc35951a199578de4b745623b5c2b96

Add CAPBM tests for code-generation and manifest creation

view details

Lisa Seelye

commit sha 43564ebd3dfb2bbc19065c6172d178c36ba3a0a2

Retire openshift/managed-prometheus-exporter-machine-api

Mentioned in https://github.com/openshift/managed-prometheus-exporter-machine-api/issues/6, this repository is no longer in use and as a pre-requisite for DPP-5823, this will remove openshift/managed-prometheus-exporter-machine-api from prow.

Signed-off-by: Lisa Seelye <lisa@users.noreply.github.com>

view details

Colin Walters

commit sha 8b0dbe8f57e1bdeb426cda8ec571033895a6a39b

steps/proxy: Port to Fedora CoreOS

We're currently using RHCOS as a way to run a container image in a single disposable VM. Let's use FCOS because it's more oriented towards this use case and also gets us out of needing to deal with Ignition version dependencies - we can just unconditionally use spec 3 (which RHCOS also uses in 4.6).

Switch instance type to `m5.xlarge` to match the current OpenShift standard on general principle; there's no obvious reason we'd need "storage optimized".

view details

baude

commit sha 2a446e206db94d5d55cc165f577785065f4afec8

remove automatic reviewer assignments from container/

the containers/ team would like to remove the automatic assignment of reviewers.

Signed-off-by: baude <bbaude@redhat.com>

view details

push time in 4 days

Pull request review comment openshift/machine-config-operator

Bug 1877984: Convert networkType from status to lowercase

 func createDiscoveredControllerConfigSpec(infra *configv1.Infrastructure, networ
 	}
 	if network.Status.NetworkType == "" {
 		// At install time, when CNO has not started, status is unset, use the value in spec.
-		ccSpec.NetworkType = network.Spec.NetworkType
+		ccSpec.NetworkType = strings.ToLower(network.Spec.NetworkType)

Downcasing outside of the network operator is easy. Canonical casing outside of the network operator would be more elegant from a user perspective, but requires teaching the canonicalizing repo about the canonical values. Are there more than these?

rcarrillocruz

comment created time in 4 days
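A hedged Go sketch of the "canonical casing" alternative floated above, using NetworkType values current at the time (OpenShiftSDN, OVNKubernetes, Kuryr); the helper and its placement are hypothetical, and keeping this table up to date is exactly the cost the comment is weighing.

package main

import (
	"fmt"
	"strings"
)

// canonicalNetworkTypes maps lower-cased NetworkType values back to their
// canonical spellings.
var canonicalNetworkTypes = map[string]string{
	"openshiftsdn":  "OpenShiftSDN",
	"ovnkubernetes": "OVNKubernetes",
	"kuryr":         "Kuryr",
}

// canonicalNetworkType is a hypothetical helper: return the canonical
// casing when the type is known, otherwise fall back to plain downcasing.
func canonicalNetworkType(networkType string) string {
	if canonical, ok := canonicalNetworkTypes[strings.ToLower(networkType)]; ok {
		return canonical
	}
	return strings.ToLower(networkType)
}

func main() {
	fmt.Println(canonicalNetworkType("openshiftsdn")) // OpenShiftSDN
}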

PullRequestReviewEvent

Pull request review comment openshift/openshift-docs

BZ#1828814 Update production env support to include fast channels

 $ oc get clusterversion -o json|jq ".items[0].spec"
+
 [IMPORTANT]
 ====
-For production clusters, you must subscribe to a `stable-*` channel.
+For production clusters, you must subscribe to a `stable-\*` or `fast-*` channel.
 ====
 
 . View the available updates and note the version number of the update that

And a few lines down, there are some old 4.1 references that we might want to freshen up.

codyhoag

comment created time in 4 days

PullRequestReviewEvent

Pull request review comment openshift/openshift-docs

BZ#1828814 Update production env support to include fast channels

 $ oc get clusterversion -o json|jq ".items[0].spec"
+

Also a few lines up, confirm that your channel is set to stable-4.5:, which probably also needs to be softened.

And do we want to overhaul this to use {product-version}?

codyhoag

comment created time in 4 days

PullRequestReviewEvent
PullRequestReviewEvent

delete branch wking/cluster-version-operator

delete branch : context-canceled-locked-task-graph

delete time in 4 days

Pull request review comment openshift/hive

remove clusterVersionStatus field from clusterdeployment

 func machineNamePrefix(cd *hivev1.ClusterDeployment, poolName string) (string, e
 	// GCP clusters running an OpenShift version earlier than 4.4.8 require leases for machine pool names because the
 	// pool name is limited to a single character.
 	if p := cd.Spec.Platform; p.GCP != nil {
-		if cd.Status.ClusterVersionStatus != nil && semver.Compare("v4.4.8", "v"+cd.Status.ClusterVersionStatus.Desired.Version) > 0 {
+		version, versionPresent := cd.Labels[constants.VersionMajorMinorPatchLabel]

So what's this version for? If it's display-only, the desired target is already reasonable.

staebler

comment created time in 4 days

PullRequestReviewEvent

pull request comment openshift/release

ci-operator/config/openshift: Pivot from GCP to AWS

Rebased onto master, now that #11974 has landed the tooling parts. This is less urgent now that openshift/installer#4193 and openshift/installer#4195 have landed, and openshift/installer#4197 is in flight, to fix the leak in those releases. Also, not much spare capacity in AWS, with Boskos pegged for most of the day.

wking

comment created time in 5 days

push event wking/openshift-release

Hongkai Liu

commit sha c49e53df99cb4bf7df550ca2c0402386033677f0

Migrate more presubmits to build01 (2/5)

view details

Hongkai Liu

commit sha 67fabc6880a17971c4e74f1d934c90dc59893334

Apply CRC with pidsLimit=8096 to build01/2 and vsphere

view details

openshift-bot

commit sha 489c03cc3b7d1c8edd666327959228c81d907383

Registry-replacer autocommit

view details

OpenShift Merge Robot

commit sha cee76c6739f9f70be27ee002c8029e3afa86db19

Merge pull request #11954 from openshift-bot/registry-replacer

Registry-Replacer autoupdate

view details

Justin Pierce

commit sha 86e7bf800ee332b013f6ba4fc4ab87469f27d0cf

Moving the responsibility for ART equivalent image transforms to ART

After the migration to ubi8, emerging requirements from upstream development teams demanded a significant increase in the complexity of providing ART equivalent images to upstream CI. It requires dynamically generated buildconfigs based on the current state of ocp-build-data. Creating these buildconfigs directly from ART tooling allows for a straightforward approach to accomplish this.

view details

Justin Pierce

commit sha 562d528d603d7ac3e6aee1697c36957f5c65a498

Merge pull request #11955 from jupierce/move_input_images_to_doozer

Moving the responsibility for ART equivalent image transforms to ART

view details

Hongkai Liu

commit sha c33e24890d01812543cdaf8f10eb2e9f2f501099

Migrate more presubmits to build02 (1/5)

view details

openshift-bot

commit sha 6f4c46165713edd46a1af7f43bc12317bcbe1248

Registry-replacer autocommit

view details

OpenShift Merge Robot

commit sha 0df7eb8e946dad18ee1af9ad7bf533dd1a340c5f

Merge pull request #11957 from openshift-bot/registry-replacer

Registry-Replacer autoupdate

view details

OpenShift Merge Robot

commit sha 48682c04a4783319e1a56b1a8f70444948a378c5

Merge pull request #11956 from hongkailiu/mig_n_build02

Migrate more presubmits to build02 (1/5)

view details

openshift-bot

commit sha e4f8ea2f963212646af915aba751a6920e28dfe5

Registry-replacer autocommit

view details

OpenShift Merge Robot

commit sha 129f0e66ebee61d5e524c774a7385cf3004b00c1

Merge pull request #11961 from openshift-bot/registry-replacer

Registry-Replacer autoupdate

view details

openshift-bot

commit sha 70abbd4efc997543405fc151615f2508e94d1717

Registry-replacer autocommit

view details

OpenShift Merge Robot

commit sha 2a094dcb3bca6f2c75f5af76deafb1a282002316

Merge pull request #11962 from openshift-bot/registry-replacer

Registry-Replacer autoupdate

view details

OpenShift Merge Robot

commit sha 0193eb852a7ced2f15a1fa3b087827ff5b9f20f3

Merge pull request #11948 from hongkailiu/mig_n

Migrate more presubmits to build01 (2/5)

view details

Lukas Berk

commit sha 8e54f4d72e161ef407a48534afb814b471788337

Remove github source from knative eventing contrib nightly ci

view details

openshift-bot

commit sha 6b47584fd00356c122acc6c29fe1f23b6e4bc40d

Registry-replacer autocommit

view details

openshift-bot

commit sha a8423182d53ee2acebf8ef481e34162573c2f2f8

config-brancher --config-dir ./ci-operator/config --current-release 4.6 --future-release 4.7 --confirm

view details

openshift-bot

commit sha 8e6b7aedfe0eba9351b8cf5be322ef31089cb01a

ci-operator-config-mirror --config-path ./ci-operator/config --to-org openshift-priv --only-org openshift --whitelist-file ./core-services/openshift-priv/_whitelist.yaml

view details

openshift-bot

commit sha 12ce00c4b2e0c8bdcd4765e41099344e901b1c6c

ci-operator-prowgen --from-dir ./ci-operator/config --to-dir ./ci-operator/jobs

view details

push time in 5 days

PR opened openshift/release

ci-operator/platform-balance/step-jobs-by-platform: Add tooling to pivot jobs

Make it easy to shift as many platform-agnostic jobs from one platform to an alternative as possible. For example, if GCP throughput is broken because we are over quota and leaking faster than we can clean up we might want to push all platform-agnostic GCP jobs to AWS.

This is the tooling from #11964 without the actual pivot, so it's under version control and available to folks for future incidents even if we decide not to use it for this incident.

+68 -11

0 comments

1 changed file

pr created time in 5 days

create branch wking/openshift-release

branch : platform-agnostic-pivot-tooling

created branch time in 5 days

Pull request review comment openshift/oc

Bug 1878925: pkg/cli/admin/upgrade: Teach --to about history lookup

 func (o *Options) Run() error {
 				fmt.Fprintf(o.Out, "info: Cluster is already at version %s\n", o.To)
 				return nil
 			}
-			for _, available := range cv.Status.AvailableUpdates {
-				if available.Version == o.To {
-					update = &configv1.Update{
-						Version: available.Version,
-						Image:   available.Image,
-					}
-					break
-				}
-			}
+			update = findUpdateFromConfigVersion(cv, o.To)
 			if update == nil {
 				if len(cv.Status.AvailableUpdates) == 0 {
 					if c := findCondition(cv.Status.Conditions, configv1.RetrievedUpdates); c != nil && c.Status == configv1.ConditionFalse {
 						return fmt.Errorf("Can't look up image for version %s. %v", o.To, c.Message)
 					}
 					return fmt.Errorf("No available updates, specify --to-image or wait for new updates to be available")
 				}
-				return fmt.Errorf("The update %s is not one of the available updates: %s", o.To, strings.Join(versionStrings(cv.Status.AvailableUpdates), ", "))
+				return fmt.Errorf("The update %s is neither one of the available updates (%s) nor a history entry", o.To, strings.Join(versionStrings(cv.Status.AvailableUpdates), ", "))

@wking nor a history entry , will the customer understand what this means?

Are you asking for a rephrase that says "you can draw from either set" without using "neither ... nor ..."? Or something that mentions the fact this history entries may not be recommended updates? Or...?

wking

comment created time in 5 days
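For context, the diff's findUpdateFromConfigVersion amounts to something like the following Go sketch: check the recommended updates first, then fall back to the cluster's own history so --to can target previously-seen versions. Field names follow github.com/openshift/api/config/v1; the helper body here is a guess, not necessarily the PR's code.

package upgrade

import (
	configv1 "github.com/openshift/api/config/v1"
)

// findUpdateFromConfigVersion prefers status.availableUpdates, then falls
// back to status.history, returning nil when the version is unknown.
func findUpdateFromConfigVersion(cv *configv1.ClusterVersion, version string) *configv1.Update {
	for _, available := range cv.Status.AvailableUpdates {
		if available.Version == version {
			return &configv1.Update{Version: available.Version, Image: available.Image}
		}
	}
	for _, entry := range cv.Status.History {
		if entry.Version == version {
			return &configv1.Update{Version: entry.Version, Image: entry.Image}
		}
	}
	return nil
}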

PullRequestReviewEvent

delete branch wking/cluster-version-operator

delete branch : go-1.15

delete time in 5 days

Pull request review comment openshift/hive

remove clusterVersionStatus field from clusterdeployment

 func machineNamePrefix(cd *hivev1.ClusterDeployment, poolName string) (string, e
 	// GCP clusters running an OpenShift version earlier than 4.4.8 require leases for machine pool names because the
 	// pool name is limited to a single character.
 	if p := cd.Spec.Platform; p.GCP != nil {
-		if cd.Status.ClusterVersionStatus != nil && semver.Compare("v4.4.8", "v"+cd.Status.ClusterVersionStatus.Desired.Version) > 0 {
+		version, versionPresent := cd.Labels[constants.VersionMajorMinorPatchLabel]

For the future work, machine-config is one of the last operators to update. We don't bump bootimages in-cluster today, but if you were looking up a bootimage to use for a new MachineSet, it's probably safer to look through status.history for the most recent entry which was actually completed. Or crib from the control-plane Machines. Or something...

staebler

comment created time in 5 days
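A minimal Go sketch of the status.history scan suggested above; history is ordered most-recent-first in config.openshift.io/v1, so the first Completed entry is the newest fully-rolled-out version. The helper itself is hypothetical, not Hive's actual code.

package hive

import (
	configv1 "github.com/openshift/api/config/v1"
)

// lastCompletedVersion walks ClusterVersion status.history (newest first)
// and returns the most recent entry that actually finished rolling out.
func lastCompletedVersion(cv *configv1.ClusterVersion) (string, bool) {
	for _, entry := range cv.Status.History {
		if entry.State == configv1.CompletedUpdate {
			return entry.Version, true
		}
	}
	return "", false
}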

PullRequestReviewEvent

pull request comment openshift/release

ci-operator/config/openshift: Pivot from GCP to AWS

Dropped some debugging print calls and ran make jobs with cbb1dc7c5a -> 5004645ea3.

wking

comment created time in 5 days

push event wking/openshift-release

W. Trevor King

commit sha 4c38d4cca988127d53361b1e8c854ad55d17e505

ci-operator/platform-balance/step-jobs-by-platform: Add tooling to pivot jobs

Make it easy to shift as many platform-agnostic jobs from one platform to an alternative as possible. For example, if GCP throughput is broken because we are over quota and leaking faster than we can clean up, we might want to push all platform-agnostic GCP jobs to AWS.

view details

W. Trevor King

commit sha 5004645ea34ac0e8abffe3076a9f80dbbdfa0029

ci-operator/config/openshift: Pivot from GCP to AWS

GCP is currently experiencing

Generated with:

$ sed -i 's/#pivot_platform/pivot_platform/' ci-operator/platform-balance/step-jobs-by-platform.py
$ ci-operator/platform-balance/step-jobs-by-platform.py
$ git add -p
$ make jobs

and lots of tedious work to select the workflow/cluster_profile changes without pulling in random changes because my local PyYAML has different formatting opinions than whatever wrote the rest of these files.

view details

push time in 5 days

PR opened openshift/release

ci-operator/config/openshift: Pivot from GCP to AWS

GCP is currently experiencing

Generated with:

$ sed -i 's/#pivot_platform/pivot_platform/' ci-operator/platform-balance/step-jobs-by-platform.py
$ ci-operator/platform-balance/step-jobs-by-platform.py
$ git add -p

and lots of tedious work to select the workflow / cluster_profile changes without pulling in random changes because my local PyYAML has different formatting opinions than whatever wrote the rest of these files.

Also has a few leading commits to adjust some unrelated script issues turned up by this new feature.

+205 -139

0 comments

56 changed files

pr created time in 5 days

create branch wking/openshift-release

branch : platform-agnostic-to-aws

created branch time in 5 days

create branch wking/cluster-version-operator

branch : upgradebale-typo

created branch time in 5 days

pull request comment openshift/cloud-credential-operator

Bug 1879628: Upgradeable false if upcoming secrets are not provisioned.

Does upgradeable=false trigger an install failure? That shouldn't be the case...

Agreed. If folks see this happening, grab a must-gather and installer logs and pass them along to us.

dgoodwin

comment created time in 5 days

delete branch wking/openshift-release

delete branch : openshift-e2e-gcp-crc

delete time in 5 days

delete branch wking/openshift-release

delete branch : discuss-gcp-secret-rotation

delete time in 5 days

delete branch wking/bugzilla-operator

delete branch : serious-keywords-reporting-mfojtik

delete time in 5 days

Pull request review comment openshift/release

Bug 1875773: ci-operator/step-registry/ipi/conf/aws/blackholenetwork/blackhole_vpc_yaml: Add EC2 endpoint

 Resources:
     Properties:
       SubnetId: !Ref PrivateSubnet3
       RouteTableId: !Ref PrivateRouteTable3
+  EC2Endpoint:

#11750 fixed the proxy instance setup (SSH access is still broken, but is not critical). I've pushed a639ef9b97 -> ae4da132f3 with a stab at your recommended changes, and have updated the live, long-running blackhole VPC stacks to match. But I wouldn't be surprised if I've still got the new security group wrong. I guess we'll see how rehearsals are working out...

wking

comment created time in 5 days

PullRequestReviewEvent

push event wking/openshift-release

flacatus

commit sha f72ef1cb74f28f9b0826aa81a681e5d608c2f8d1

Add credentials to Eclipse Che performance tests

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha a62c664f6f85561b29ac114baa08bbf6f6878df8

Add cluster

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha bd144e392e89672680a3169aef4d6d4c8110f864

Fixes

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha 610382082cc983783fcc356b420e8f2848f205d5

Add ipi-aws workflow

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha 00b76f4dd01758f5ef773b26f002215af3484c68

Fix cluster_profile

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha 93f9763c740d487e7a2a4c7d8afa7111e32c36b5

Fixes

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha 7ad8023b11a23098203a779c57bd74b23399c0ce

Fixes

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha f88ea5ce2cee33601ba92b7ebe1fd67275b5c93b

Another try

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha 6888a23a06f5086a786e7c8c74df0e4819437956

Another try

Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha 5eda0f1699f1f4c2c9f8df3b973b9863456f912c

Anothe try

Signed-off-by: flacatus <flacatus@redhat.com>

view details

Alvaro Aleman

commit sha b3c35ad16cd64dcb1db704c2e928450baa5a6ab1

Fixing openshift-origin-aggregated-logging-master.yaml to use dockerfile from ocp-build-data

view details

Justin Pierce

commit sha 83f99612872a023972e796cbe56a1043a4539ff7

Migrating to RHEL8 images

view details

Vadim Rutkovsky

commit sha 44e5705ba1931950fbd2dd12760b521a6f737c80

Cincinnati: avoid using artifact_dir

This seems to confuse ci-operator and makes it add duplicate artifacts mounts

view details

Ben Parees

commit sha 4c41fde5d921435de3c15652e6c1f4a5596692bf

Bug 1873293: switch 4.6 promotion back to using azure

https://bugzilla.redhat.com/show_bug.cgi?id=1873293

view details

Doug Hellmann

commit sha ea98c8571796edad50e2edab80aac046f9355d0c

cluster-api-provider-baremetal: add unit test job

We were running an end-to-end test job, but not the unit tests.

Signed-off-by: Doug Hellmann <dhellmann@redhat.com>

view details

Vadim Rutkovsky

commit sha f30d1eac7db912cc4d62cca35233f3a91c8a76ab

cluster-authentication: use upgrade workflow with Loki

This would port upgrade tests to multistep and ensure container logs are persisted with Loki

view details

Vadim Rutkovsky

commit sha 69f40af50a2081295398978c1b6f08e8679b4d19

OKD machine-os-promotion: rework composed image

Extract binaries from necessary RPMs (kubelet and crio) so that bootstrap node could start. Compose os-extensions with necessary packages for workers/masters. These would install additional packages via machineconfig instruction. This change ensures all post-install actions - like creating users etc. - are running via `rpm-ostree install` on first boot.

view details

Carlos Eduardo Arango Gutierrez

commit sha 127aeaba2010facc53a1a5ea46766bfbecc1ea7d

Add extra documentation for CONTAINER_ENGINE env var

Add extra documentation for developers on how to set a different CONTAINER_ENGINE when calling make targets.

Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>

view details

Vadim Rutkovsky

commit sha 9c9b06db3ab69d75cf303e0c8f28829f0aaa88aa

oauth-server: use multistep job for e2e-upgrade

Run GCP upgrade using multistep and persist container logs using Loki

view details

Stefan Junker

commit sha cc10834ee1d729f5143aacbeac99b705b7349b12

cincinnati: run rustfmt tests for Rust 1.46 only

Intermittent stable versions of rustfmt, 1.44 and 1.45, are reformatting generated code unexpectedly. Version 1.46 returns to the previous behavior so it seems reasonable to ignore the intermittent formatting.

view details

push time in 5 days

push event wking/cluster-version-operator

W. Trevor King

commit sha 1d9a31926dba4ed0e35c9891591003a9308fb9e0

Dockerfile.rhel: Bump to Go 1.15

ART is waffling around the 1.14 <-> 1.15 transition [1], but we don't need to wait. Let the future start today!

[1]: https://github.com/openshift/ocp-build-data/pull/665

view details

push time in 5 days

PR opened openshift/cluster-version-operator

Bug 1878163: Dockerfile.rhel: Bump to Go 1.15

ART is waffling around the 1.14 <-> 1.15 transition, but we don't need to wait. Let the future start today!

Replaces #456, but without the "to mach" commit-message typo or the bogus link to an ocp-build-data branch where the CVO is currently reverted back to Go 1.14.

+1 -1

0 comments

1 changed file

pr created time in 5 days

create branch wking/cluster-version-operator

branch : go-1.15

created branch time in 5 days

pull request comment openshift/ci-tools

cmd/ocp-build-data-enforcer: Fix "to mach" -> "to match" typo

Yeah, I'll let other teams merge or close as they see fit. Using the mass commenter would allow DPTP to mass-close if you want, but I'm not in a hurry here.

wking

comment created time in 5 days

pull request comment openshift/ci-tools

cmd/ocp-build-data-enforcer: Fix "to mach" -> "to match" typo

I went ahead and closed openshift/cluster-version-operator#456, so we're ready for the replacement ;).

wking

comment created time in 5 days

pull request comment openshift/cluster-version-operator

Bug 1878163: Updating Dockerfile.rhel baseimages to mach ocp-build-data config

I've revived openshift/ci-tools#1157 to fix "mach" -> "match" in the PR and commit message.

openshift-bot

comment created time in 5 days

push event wking/ci-tools

Bruno Barcarol Guimarães

commit sha b67a96823229af77093970a2ce8ebe6eff951ad7

Add a test image with boskos

view details

Bruno Barcarol Guimarães

commit sha 892d1ec4e451581d90c929f8076967e60cb79c60

ci-operator: wait 15m for pending pods

We've frequently seen image pulls take 10m (!), e.g.: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-kube-controller-manager-operator/449/pull-ci-openshift-cluster-kube-controller-manager-operator-master-e2e-aws-operator/1306162734701744128

```
2020-09-16T09:29:58Z pulling image "docker-registry.default.svc:5000/ci-op-dftcl0f8/stable@sha256:e4f935c038e45a58ca4d2e46233930dc1f0354027249690297d766f24a86f7a2"
2020-09-16T09:39:13Z Successfully pulled image "docker-registry.default.svc:5000/ci-op-dftcl0f8/stable@sha256:e4f935c038e45a58ca4d2e46233930dc1f0354027249690297d766f24a86f7a2"
```

view details

geobk

commit sha 5dac11db91dfb4822c3bda99fac5d9a74f69097f

AutotTestgridGen: Add testgrid-config binary to Dockerfile

view details

OpenShift Merge Robot

commit sha e65647d53ecbe1e4286aeb62b2a4413d12cbfe22

Merge pull request #1229 from bbguimaraes/e2e-bin

Add a test image with boskos

view details

OpenShift Merge Robot

commit sha 013f43649c054be7f0bb09b6da25e137fbd4588c

Merge pull request #1231 from bbguimaraes/pending_15

ci-operator: wait 15m for pending pods

view details

Steve Kuznetsov

commit sha 50007863d9147f3bba7660c488fcf7bc725a1a11

Merge pull request #1228 from GeoBK/chore/autotestgridgen/add-image

AutotTestgridGen: Add testgrid-config binary to Dockerfile

view details

W. Trevor King

commit sha 947413dd5d8f2465da19b699ebe0cda84d0fde62

cmd/ocp-build-data-enforcer: Fix "to mach" -> "to match" typo

Typo from fe83470357 (OCP build data enforcer: Add pr creation capabilities, 2020-08-12, #1106).

view details

push time in 5 days

pull request comment openshift/ci-tools

cmd/ocp-build-data-enforcer: Fix "to mach" -> "to match" typo

Re-opened and rebased onto master, because #1156 ended up getting closed.

wking

comment created time in 5 days

push event wking/ci-tools

Bruno Barcarol Guimarães

commit sha 93552365b9dcfc6d3835f33acf5f64c727a3f167

webreg: document configuration sharding

view details

Alvaro Aleman

commit sha d03d7c49345b7f1a9b9bde546297e8d9b574773d

Secret syncer: Keep existing fields not in source secret

view details
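The "keep existing fields" behavior in that commit is easy to picture as a map merge. Here's a minimal Go sketch under assumed types: a hypothetical mergeSecretData helper over the map[string][]byte data of two Secrets. The real syncer in ci-tools may be structured quite differently.

```go
package main

import "fmt"

// mergeSecretData is a hypothetical helper illustrating the behavior:
// fields that only the existing secret has are preserved, while keys
// present in the source secret overwrite the existing values.
func mergeSecretData(existing, source map[string][]byte) map[string][]byte {
	merged := make(map[string][]byte, len(existing)+len(source))
	for k, v := range existing {
		merged[k] = v // keep fields the source does not know about
	}
	for k, v := range source {
		merged[k] = v // source wins for keys present in both
	}
	return merged
}

func main() {
	existing := map[string][]byte{"token": []byte("old"), "extra": []byte("keep-me")}
	source := map[string][]byte{"token": []byte("new")}
	fmt.Printf("%s\n", mergeSecretData(existing, source)) // map[extra:keep-me token:new]
}
```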

geobk

commit sha b74eff4ed5742f1552ba2290a852459ba8279156

Fail after uploading valid secrets

view details

geobk

commit sha 92e9e8e6c4f7325f1e87dcf6bc9ed870a8ed7303

Update test-infra to 43b6c04

view details

Alvaro Aleman

commit sha 8a25d076ce4303fa859e2b24ba039d704ec4e24a

Remove stale todo from ocp-build-data enforcer readme

view details

Alvaro Aleman

commit sha 00fb1919a023b04765471936da376f09f4f318c0

Secret bootstrapper: Respect existing secrets

view details

OpenShift Merge Robot

commit sha 7d36c2f8dbe5ebf46e5c86b9de62a3cfdee15f8b

Merge pull request #1150 from GeoBK/fix/bootstrapper/dont-fail-if-bw-item-missing Secret Bootstrap: Finish processing valid secrets before failing when encountering errors

view details

OpenShift Merge Robot

commit sha e20bce6d369d717aee9fe12aca788c6e83dd1769

Merge pull request #1143 from bbguimaraes/doc_shard webreg: document configuration sharding

view details

OpenShift Merge Robot

commit sha 7ed16a43c0af719cfb9daebfeda8b0ff915579d1

Merge pull request #1148 from alvaroaleman/respect-existing Secret syncer: Keep existing fields not in source secret

view details

Steve Kuznetsov

commit sha 19d97f85ce09b73b0c234e17e3af206109eae5b5

ci-operator: allow overriding dependency definitions This allows e.g. the normal install step to define the install target to be release:latest and the upgrade workflow to override it to be release:initial. Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>

view details

Steve Kuznetsov

commit sha 5fbf7d87a8dbd74f4a823a2b7eee0161b9b57703

ci-operator: create Pods with resolved refs The built-in ImageStreamTag resolution process in OpenShift is arcane and very difficult to understand in conjuction with all of the different levels of image imports and distribution we have. We always want to pull from the local registry when we're creating a Pod with an ImageStreamTag so we can just do the digest resolution ourselves and use the resolved pull specification in the Pod's configuration. If that ends up being pull-through with the central registry as the source, that's fine, but opaque to the end-user ci-operator process. Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>

view details
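The "resolved refs" idea above is easier to see in code. A hedged Go sketch, assuming you have already fetched an imagev1.ImageStreamTag via the openshift/api types: pin the Pod's image to the tag's digest in the local registry, rather than leaning on OpenShift's in-cluster ImageStreamTag resolution. The helper name and registry host are illustrative, not ci-operator's actual code.

```go
package main

import (
	"fmt"

	imagev1 "github.com/openshift/api/image/v1"
)

// resolvedPullSpec builds a digest-pinned pull spec for an ImageStreamTag,
// so Pods pull the exact image from the local registry. Hypothetical helper.
func resolvedPullSpec(registry, namespace, stream string, ist *imagev1.ImageStreamTag) string {
	// ist.Image.Name holds the image digest, e.g. "sha256:e4f935c0...".
	return fmt.Sprintf("%s/%s/%s@%s", registry, namespace, stream, ist.Image.Name)
}

func main() {
	ist := &imagev1.ImageStreamTag{}
	ist.Image.Name = "sha256:0123456789abcdef"
	fmt.Println(resolvedPullSpec("image-registry.openshift-image-registry.svc:5000", "ci-op-dftcl0f8", "stable", ist))
}
```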

OpenShift Merge Robot

commit sha ae31d1102348b34a2cc218c6df66420927bb3b42

Merge pull request #1153 from GeoBK/chore/update-test-infra-135e7b Update test-infra to 43b6c04

view details

Petr Muller

commit sha 8eb87eb7cc83422e38380f19182ae470899d2ab9

ocp-build-data-enforcer: cleaup input paths The code in gatherAllOCPImageConfigs relies on the directory path not having a trailing slash. `Clean` cleans this up.

view details
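The fix is just Go's standard filepath.Clean; a self-contained demo of why it helps code that assumes a directory path has no trailing slash:

```go
package main

import (
	"fmt"
	"path/filepath"
)

func main() {
	// Clean strips trailing slashes and resolves "." / ".." segments,
	// so path-handling code downstream sees a canonical directory path.
	fmt.Println(filepath.Clean("ocp-build-data/"))        // ocp-build-data
	fmt.Println(filepath.Clean("./ocp-build-data//x/..")) // ocp-build-data
}
```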

OpenShift Merge Robot

commit sha db1f710c9c01abbf65464eb66de9c01c2e5e6988

Merge pull request #1158 from stevekuznetsov/skuznets/image-digest-for-pods ci-operator: create Pods with resolved refs

view details

Petr Muller

commit sha 6a079950e0a0e6a483a323ed51a67d5e43a4f1aa

pj-rehearse: fix misleading copy-pasted error message

view details

Alvaro Aleman

commit sha 62e3e4b4abeb79aa0da64b6f619d89fb4e91f4e1

CI operator: Tolerate IsAlreadyExists when creating PDBs

view details
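Tolerating IsAlreadyExists is a common Kubernetes idiom for racing creators. A hedged sketch using controller-runtime (ci-operator's actual code may use a different client and shape); the ensurePDB name is hypothetical:

```go
package ensure

import (
	"context"
	"fmt"

	policyv1beta1 "k8s.io/api/policy/v1beta1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	ctrlclient "sigs.k8s.io/controller-runtime/pkg/client"
)

// ensurePDB creates the PodDisruptionBudget but tolerates losing the
// race: if another creator got there first, that still counts as success.
func ensurePDB(ctx context.Context, c ctrlclient.Client, pdb *policyv1beta1.PodDisruptionBudget) error {
	if err := c.Create(ctx, pdb); err != nil && !apierrors.IsAlreadyExists(err) {
		return fmt.Errorf("failed to create PDB %s/%s: %w", pdb.Namespace, pdb.Name, err)
	}
	return nil
}
```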

Petr Muller

commit sha 3cc650ac4040553a76328ab40d8143c2374d424f

Revert "ci-operator: create Pods with resolved refs"

view details

Steve Kuznetsov

commit sha e39d1acd5cfdaf0ed09925d44289bdfbdfe43e6d

Merge pull request #1163 from petr-muller/revert-1158-skuznets/image-digest-for-pods Revert "ci-operator: create Pods with resolved refs"

view details

Steve Kuznetsov

commit sha 9cf43c62650556e535eab6590f332b6d89bf26e0

ci-operator: create imagestreams with local reference policy We want users of these ImageStreams to refer to their content with the local registry's domain, not wherever their content might be sourced during pull-through. Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>

view details
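For readers unfamiliar with reference policy, here is a hedged sketch of what "local" means when defining an ImageStream tag with the openshift/api types; the stream and tag names are illustrative:

```go
package main

import (
	"fmt"

	imagev1 "github.com/openshift/api/image/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	stream := &imagev1.ImageStream{
		ObjectMeta: metav1.ObjectMeta{Namespace: "ci-op-example", Name: "pipeline"},
		Spec: imagev1.ImageStreamSpec{
			Tags: []imagev1.TagReference{{
				Name: "latest",
				// Local reference policy makes consumers resolve the tag to
				// the cluster's own registry, even when the content is
				// pulled through from an external source.
				ReferencePolicy: imagev1.TagReferencePolicy{Type: imagev1.LocalTagReferencePolicy},
			}},
		},
	}
	fmt.Println(stream.Spec.Tags[0].ReferencePolicy.Type) // Local
}
```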

OpenShift Merge Robot

commit sha 2150504ece69112b5cc860916d1db9dd7ad48c4a

Merge pull request #1160 from petr-muller/clean-path-in-ocp-build ocp-build-data-enforcer: cleaup input path

view details

push time in 5 days

pull request comment openshift/ci-tools

cmd/ocp-build-data-enforcer: Fix "to mach" -> "to match" typo

/reopen

wking

comment created time in 5 days

create branch wking/ci-tools

branch : to-mach-typo

created branch time in 5 days

pull request comment openshift/release

core-services/ci-secret-generator: Document GCP service account rotation

All green :)

wking

comment created time in 6 days

pull request comment openshift/release

ci-operator/step-registry/openshift/e2e/gcp/crc: Include "gcp" in workflow name

Oops, b60c2eb35e -> e2e4000309 removes the old *.metadata.json created in #11821.

wking

comment created time in 6 days

push event wking/openshift-release

W. Trevor King

commit sha e2e4000309fcf6c03573e75a05978fb8573f4e23

ci-operator/step-registry/openshift/e2e/gcp/crc: Include "gcp" in workflow name

It's a GCP workflow, and we might grow additional workflows for other platforms in the future. Including the platform in the workflow name makes space for that, and avoids:

$ hack/step-jobs-by-platform.py
unable to determine platform-agnostic workflows for: openshift-e2e-crc gcp (pull-ci-openshift-installer-master-e2e-crc, pull-ci-openshift-installer-release-4.6-e2e-crc, pull-ci-openshift-installer-release-4.7-e2e-crc)
...

Leaving ci-operator/config/openshift-priv alone, because those will be auto-updated at some point after the ci-operator/config/openshift changes land, as seen in [1].

*.metadata.json updated via:

$ make update

[1]: https://github.com/openshift/release/pull/10382

view details

push time in 6 days

pull request comment openshift/release

ci-operator/step-registry/openshift/e2e/gcp/crc: Include "gcp" in workflow name

Rebased around #11821 with 92ac81133a -> b60c2eb35e.

wking

comment created time in 6 days

push event wking/openshift-release

Alvaro Aleman

commit sha 6b0f789313239e739de08b34bc2efd367b7b013c

Fixing kube-reporting-metering-operator-master.yaml to use dockerfile from ocp-build-data

view details

Alvaro Aleman

commit sha f8ab12d9783066bc3e4de49bc6ab672cc00b577c

Fixing openshift-jenkins-master.yaml to use dockerfile from ocp-build-data

view details

Justin Pierce

commit sha 821bc3464081a81cba3d663c5c0a57da86fec1ff

Trying a new release image to avoid buildah issue https://bugzilla.redhat.com/show_bug.cgi?id=1868388

view details

Justin Pierce

commit sha 72131379d3107641bcbc4b9601c9d4d529c9c689

Add replacement strings for ART Dockerfile

view details

flacatus

commit sha f72ef1cb74f28f9b0826aa81a681e5d608c2f8d1

Add credentials to Eclipse Che performance tests Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha a62c664f6f85561b29ac114baa08bbf6f6878df8

Add cluster Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha bd144e392e89672680a3169aef4d6d4c8110f864

Fixes Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha 610382082cc983783fcc356b420e8f2848f205d5

Add ipi-aws workflow Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha 00b76f4dd01758f5ef773b26f002215af3484c68

Fix cluster_profile Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha 93f9763c740d487e7a2a4c7d8afa7111e32c36b5

Fixes Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha 7ad8023b11a23098203a779c57bd74b23399c0ce

Fixes Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha f88ea5ce2cee33601ba92b7ebe1fd67275b5c93b

Another try Signed-off-by: flacatus <flacatus@redhat.com>

view details

flacatus

commit sha 6888a23a06f5086a786e7c8c74df0e4819437956

Another try Signed-off-by: flacatus <flacatus@redhat.com>

view details

Justin Pierce

commit sha c4d29bdb2fbffb8e245146e6d79e145375c87593

Remove source_path and add context_dir

view details

flacatus

commit sha 5eda0f1699f1f4c2c9f8df3b973b9863456f912c

Anothe try Signed-off-by: flacatus <flacatus@redhat.com>

view details

Alvaro Aleman

commit sha b3c35ad16cd64dcb1db704c2e928450baa5a6ab1

Fixing openshift-origin-aggregated-logging-master.yaml to use dockerfile from ocp-build-data

view details

Justin Pierce

commit sha 83f99612872a023972e796cbe56a1043a4539ff7

Migrating to RHEL8 images

view details

Maru Newby

commit sha c1133af91279ddca9083df58fc6b80c70affaf53

Add optional on-demand e2e-aws job to origin master

view details

Vadim Rutkovsky

commit sha 44e5705ba1931950fbd2dd12760b521a6f737c80

Cincinnati: avoid using artifact_dir This seems to confuse ci-operator and makes it add duplicate artifacts mounts

view details

Jacob Tanenbaum

commit sha 7418e62dc80a548fb1fb9e3cf32813cae1a269a9

make the windows testing blocking for ovn and cno testing the windows testing is stable and should be made blocking for openshift/ovn-kubernetes and openshift/cluster-network-operator when using ovn-kubernetes network plugin

view details

push time in 6 days

delete branch wking/cincinnati-operator

delete branch : operator-sdk-install

delete time in 6 days

Pull request review comment openshift/cluster-storage-operator

Bug 1879365: Move CSO namespace to lower runlevel

+# Create the namespace at runlevel 49, so cluster-csi-snapshot-controller-operator (running at level 50) can use it.
 apiVersion: v1
 kind: Namespace
 metadata:
jsafrane

comment created time in 6 days

PullRequestReviewEvent

push event wking/openshift-release

W. Trevor King

commit sha de19d426517e930b8e65fc4ce1299ca949353954

core-services/ci-secret-generator: Document GCP service account rotation

If you happen to blow away the GCP service account used for provisioning CI jobs (hypothetically ;), you can recover following [1]. If you don't have gcloud installed locally, you can click "Activate Cloud Shell" in [2]. Once you have gcloud in the openshift-gce-devel-ci project, run:

gcloud iam service-accounts create do-not-delete-ci-provisioner --description='Credentials for creating GCP clusters in CI jobs.' --display-name='CI provisioner'
for ROLE in admin compute.instanceAdmin compute.networkAdmin compute.securityAdmin compute.viewer iam.serviceAccountUser storage.admin
do
  gcloud projects add-iam-policy-binding openshift-gce-devel-ci --member=serviceAccount:do-not-delete-ci-provisioner@openshift-gce-devel-ci.iam.gserviceaccount.com --role="roles/${ROLE}"
done

to create a new service account with the recommended roles. Then the 'keys create' I'm adding here will create a new key and push it into BitWarden. You may also wish to revoke previous keys with:

$ gcloud iam service-accounts keys list --iam-account do-not-delete-ci-provisioner@openshift-gce-devel-ci.iam.gserviceaccount.com
KEY_ID         CREATED_AT            EXPIRES_AT
...redacted... 2020-09-16T20:23:20Z  9999-12-31T23:59:59Z
...more entries...
$ gcloud iam service-accounts keys delete $KEY_ID_TO_DELETE --iam-account do-not-delete-ci-provisioner@openshift-gce-devel-ci.iam.gserviceaccount.com

[1]: https://docs.openshift.com/container-platform/4.5/installing/installing_gcp/installing-gcp-account.html#installation-gcp-permissions_installing-gcp-account
[2]: https://console.cloud.google.com/iam-admin/iam?organizationId=54643501348&project=openshift-gce-devel-ci

view details

push time in 6 days

PR opened openshift/release

core-services/ci-secret-generator: Document GCP service account rotation

If you happen to blow away the GCP service account used for provisioning CI jobs (hypothetically ;), you can recover following these docs. If you don't have gcloud installed locally, you can click Activate Cloud Shell here. Once you have gcloud in the openshift-gce-devel-ci project, run:

$ gcloud iam service-accounts create do-not-delete-ci-provisioner --description='Credentials for creating GCP clusters in CI jobs.' --display-name='CI provisioner'
$ for ROLE in admin compute.instanceAdmin compute.networkAdmin compute.securityAdmin compute.viewer iam.serviceAccountUser storage.admin
> do
>   gcloud projects add-iam-policy-binding openshift-gce-devel-ci --member=serviceAccount:do-not-delete-ci-provisioner@openshift-gce-devel-ci.iam.gserviceaccount.com --role="roles/${ROLE}"
> done

to create a new service account with the recommended roles. Then the `keys create` I'm adding here will create a new key and push it into BitWarden.

You may also wish to revoke previous keys with:

$ gcloud iam service-accounts keys list --iam-account do-not-delete-ci-provisioner@openshift-gce-devel-ci.iam.gserviceaccount.com
KEY_ID         CREATED_AT            EXPIRES_AT
...redacted... 2020-09-16T20:23:20Z  9999-12-31T23:59:59Z
...more entries...
$ gcloud iam service-accounts keys delete $KEY_ID_TO_DELETE --iam-account do-not-delete-ci-provisioner@openshift-gce-devel-ci.iam.gserviceaccount.com
+5 -1

0 comment

1 changed file

pr created time in 6 days

create branch wking/openshift-release

branch : discuss-gcp-secret-rotation

created branch time in 6 days

pull request comment openshift/cluster-version-operator

Bug 1873900: pkg/payload/task_graph: Avoid deadlocking on cancel with workCh queue

I've pushed 622e04fe74 -> 632e763fa2 with some discussion of workCh capacity and select randomness, showing that even with the old code there were some cases where in-flight work would still not have deadlocked. That reasoning is why, before the code fix from this PR, the new unit test will sometimes fail with two cancels instead of deadlocking.

wking

comment created time in 6 days
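The spec language cited in that discussion ("uniform pseudo-random selection") is easy to verify; here is a self-contained Go demo of the coin flip each canceled worker effectively makes between two ready select cases:

```go
package main

import "fmt"

func main() {
	a, b := make(chan struct{}, 1), make(chan struct{}, 1)
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		a <- struct{}{}
		b <- struct{}{}
		// Both cases are ready, so Go chooses between them via
		// uniform pseudo-random selection.
		select {
		case <-a:
			counts["a"]++
			<-b // drain the other side for the next iteration
		case <-b:
			counts["b"]++
			<-a
		}
	}
	fmt.Println(counts) // roughly map[a:5000 b:5000]
}
```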

push event wking/cluster-version-operator

W. Trevor King

commit sha 632e763fa209c3e1bd07098c2e33599f8c663e3e

pkg/payload/task_graph: Avoid deadlocking on cancel with workCh queue

Before 55ef3d3027 (pkg/payload/task_graph: Handle node pushing and result collection without a goroutine, 2019-10-21, #264), RunGraph had a separate goroutine that managed the work queue, with results fed into errCh to be collected by the main RunGraph goroutine. It didn't matter if that work queue goroutine hung; as long as all the worker goroutines exited, RunGraph would collect their errors from errCh and return.

In 55ef3d3027, I removed the queue goroutine and moved queue management into the main RunGraph goroutine. With that change, we became exposed to the following race:

1. Main goroutine pushes work into workCh.
2. Context canceled.
3. Workers exit via the "Canceled worker..." case, so they don't pick the work out of workCh.
4. Main goroutine deadlocks because there is work in flight, but nothing in resultCh, and no longer any workers to feed resultCh.

In logs, this looks like "sync-worker goroutine has gone to sleep, and is no longer synchronizing manifests" [1]. There are two mitigating factors:

a. 'select' docs [2]: "If one or more of the communications can proceed, a single one that can proceed is chosen via a uniform pseudo-random selection." So the race's step 3 will happen in about half of the cases where the context has been canceled. In the other half of cases, the worker will randomly decide to pick up the queued work, notice it's been canceled while processing that work, and return a "context canceled" result.

b. We scale workCh by the number of workers. So the deadlock risk requires enough parallel work to fill the queue faster than workers are draining it and enough bad luck in the workers' selects that the canceled workers don't drain the queue on their own. E.g. with our eight ClusterOperator-precreate workers, we'd have an 0.5^8 ~= 0.4% chance of not draining a single in-queue node post-cancel, and a 1 - 0.5^8 ~= 99.6% chance of not draining eight in-queue nodes post-cancel.

With this commit, we drain results when they are available, but we also respect the context to allow the resultCh read to be canceled. When we have been canceled with work in flight, we also attempt a non-blocking read from workCh to drain out anything there that has not yet been picked up by a worker. Because 'done' will not be set true, we'll call getNextNode again and come in with a fresh pass through the for loop. ctx.Err() will no longer be nil, but if the workCh drain worked, we may now have inflight == 0, and we'll end up in the case that sets 'done' true, and break out of the for loop on that round.

The unit test sets up two parallel nodes: a and b. We configure one worker, which picks up node a. Node b doesn't block on node a, so it gets pushed into workCh while the worker grinds through node a. On its second task in node a, the worker cancels the run. Because the sleeps do not have select-ctx.Done guards, the worker finishes off that second task, notices the cancel as it enters its third task, and exits with the "context canceled" error. This leaves node b stuck in workCh, and we need the fix from this commit to avoid deadlocking on that in-flight node.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1873900
[2]: https://golang.org/ref/spec#Select_statements

view details
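A hedged Go sketch of the drain-on-cancel shape the commit message describes; the names (workCh, resultCh, inflight) follow the message, but the real RunGraph control flow (getNextNode, the done flag) is more involved:

```go
package taskgraph

import "context"

// collectResults drains worker results, and once the context is canceled
// also pulls queued-but-unclaimed nodes back out of workCh so inflight
// can reach zero even though no worker will ever claim them.
func collectResults(ctx context.Context, inflight int, workCh <-chan int, resultCh <-chan error) []error {
	var errs []error
	for inflight > 0 {
		select {
		case err := <-resultCh:
			inflight--
			if err != nil {
				errs = append(errs, err)
			}
		case <-ctx.Done():
			// Non-blocking drain: anything still in workCh was pushed
			// before the cancel and will never be picked up now.
			for {
				select {
				case <-workCh:
					inflight--
					continue
				default:
				}
				break
			}
			// Whatever remains in flight is held by a worker, which will
			// notice the cancel and report a result on resultCh.
			for inflight > 0 {
				if err := <-resultCh; err != nil {
					errs = append(errs, err)
				}
				inflight--
			}
		}
	}
	return errs
}
```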

push time in 6 days
