Joe Stringer joestringer Isovalent San Francisco Bay Area

cilium/ebpf 668

eBPF Library for Go

joestringer/jsonflowagent 4

JSONFlowAgent is a Net-SNMP subagent which retrieves OpenFlow switch statistics from the JSONStats NOX component and pushes them to an AgentX-based SNMP master agent.

cilium/mtu-update 2

Update the MTU inside running containers managed by Cilium

joestringer/arch-ppa 1

Create and maintain personal Arch linux package repositories

joestringer/ct_perf 1

Connection tracking benchmark scripts.

cilium/iproute2 0

Cilium development for BPF loader

joestringer/bcc 0

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more

joestringer/bpftrace 0

High-level tracing language for Linux eBPF

pull request commentcilium/cilium

redirectpolicy: Check lrp type before restoring service

:facepalm: GitHub really doesn't like multiple people making single-label changes at once, haha. The commit which this fixes was in 1.9.0-rc2, so we can mark it as release-note/bug. Agree it should be backported to 1.9. It also picks up release-blocker from the issue that this is fixing.

aditighag

comment created time in 2 days

push eventcilium/cilium

Maciej Kwiek

commit sha 2643a7727a761a7c9ed9340942b588e5449601f9

docs: Various LRP gsg fixups Signed-off-by: Maciej Kwiek <maciej@isovalent.com>

view details

Aditi Ghag

commit sha e93da19801dd5a2e9e012cc6c24f1f0ad78b0197

doc: Wrap command output under shell-session highlighting Signed-off-by: Aditi Ghag <aditi@cilium.io>

view details

Aditi Ghag

commit sha 17c4f80ea32495ecd62c4f1a39455c45b2210f6c

doc: Correct the instruction to pass whitelist argument The whitelist argument needs to be added only to the kiam-agent daemonset. Signed-off-by: Aditi Ghag <aditi@cilium.io>

view details

push time in 2 days

delete branch cilium/cilium

delete branch : pr/lrp-docs-followup

delete time in 2 days

PR merged cilium/cilium

Reviewers
docs: Various LRP gsg fixups area/documentation needs-backport/1.9 release-note/misc
+25 -27

0 comment

1 changed file

nebril

pr closed time in 2 days

PullRequestReviewEvent

Pull request review commentcilium/cilium

docs: Various LRP gsg fixups

 security credentials for pods.

       $ helm repo add uswitch https://uswitch.github.io/kiam-helm-charts/charts/
       $ helm repo update
-      $ helm template kiam uswitch/kiam > kiam.yaml

-  - If you see an error like "request blocked by whitelist-route-regexp" while
-    running requests to the metadata server, then you may need to whitelist the
-    metadata requests by passing the below argument to the ``kiam-agent Deamonset``.
+  - You may see an error like "request blocked by whitelist-route-regexp" while
+    running requests to the metadata server once the kiam-agent pods are deployed.
+    In such cases, you may need to whitelist the metadata requests.

     .. code-block:: bash

-        $ sed -i '/args:/a \ \ \ \ \ \ \ \ \ \ \ \ - --whitelist-route-regexp=meta-data' kiam.yaml
+        $ helm template --set agent.extraArgs.whitelist-route-regexp=meta-data kiam uswitch/kiam > kiam.yaml

Reading the overall list of items here, we have

  • Deploy
    • Instructions add the repository, but don't actually deploy
  • You may see an error....
    • Template instructions
  • ...
  • Apply the configuration
    • Finally, the step where we deploy.

Which seems a bit odd in terms of ordering if you just read through?

Also, now that we don't need to run the extra sed commands, we could just do helm install directly with this option?

I'm wondering whether it's even worth mentioning things like the "request blocked by whitelist-route-regexp" bullet point above or the "iptables" bullet point below if we just tell people to install with the following command:

   helm install --set agent.host.iptables=false --set agent.extraArgs.whitelist-route-regexp=meta-data kiam uswitch/kiam

We can always add items afterwards to explain why each setting is configured like this; that's fine, but I feel like we should provide the exact set of options that we expect users to deploy with to have the best chance of success. Then if a user deviates from these instructions, they can consult the documentation and see exactly which options we recommended.

nebril

comment created time in 2 days

PullRequestReviewEvent
PullRequestReviewEvent

Pull request review commentcilium/cilium

docs: Various LRP gsg fixups

 security credentials for pods.

   .. code-block:: bash

       $ kubectl exec app-pod -- curl -s -w "\n" -X GET http://169.254.169.254/latest/meta-data/
+      ami-id

If we're going to show the output (which I think is good), I think we should also fix the syntax highlighting for this block: .. code-block:: shell-session

nebril

comment created time in 2 days

PullRequestReviewEvent
PullRequestReviewEvent

push eventcilium/cilium

Chris Tarazi

commit sha 320b83f19a41f387cc3bd2567bad0f415181d055

eventqueue: Add missing godoc Signed-off-by: Chris Tarazi <chris@isovalent.com>

view details

Chris Tarazi

commit sha 253368a681e8c39d6c31c26fdd5419efdcf33b8e

eventqueue: Sort functions by importance This commit contains no functional changes. It just reorders some of the functions by importance in this file so that it's easier to parse. This should hopefully reduce the number of times needed to scroll up and down. Signed-off-by: Chris Tarazi <chris@isovalent.com>

view details

Chris Tarazi

commit sha 1c0f00df47cb0a3180e4f8cb547843e7838ff918

eventqueue: Forcefully drain to prevent deadlock This commit ensures that the EventQueue is fully drained, even when its not running its loop. When endpoints are being restored, their EventQueue is initialized, but non-running state (processing events). It is the job of the endpoint manager to kick off the event loop by calling Expose() on the endpoint. This commit fixes the following commits which causes Cilium to be stuck waiting for the EventQueue to drain (WaitToBeDrained()): 290d9e942 ("daemon: Init endpoint queue during validation") 79bf42515 ("endpoint: Add function to initialize event queue") Cilium becoming stuck is described in the following flow: - Endpoints began restoration - Endpoint's EventQueue initialized (but never run) - Endpoint's metadata data resolver controller kicked off - Visilbity and bandwidth policy events enqueued - Endpoint fails restoration due to some issue (e.g. interface not found, etc) - Endpoint queued for deletion because it failed restoration - As part of endpoint deletion, the EventQueue is stopped and drained - Cilium deadlocks trying to drain, but the EventQueue run loop was never run, which would pop events off the `events` channel, and close the `eventsClosed` channel This commit fixes this deadlock by forcefully running the event loop to drain the queue. After the `events` channel is closed (from Stop()), the loop will terminate and the `eventsClosed` channel will close, thereby unblocking WaitToBeDrained(). Stacktrace from `gops`: ``` goroutine 632 [chan receive, 1 minutes]: github.com/cilium/cilium/pkg/eventqueue.(*EventQueue).WaitToBeDrained(0xc00013c960) /go/src/github.com/cilium/cilium/pkg/eventqueue/eventqueue.go:322 +0x1ad github.com/cilium/cilium/pkg/endpoint.(*Endpoint).Delete(0xc000ad6900, 0x27faee0, 0xc00062ac40, 0x27fba60, 0xc00099a120, 0x2877280, 0xc0005d2340, 0x430101, 0x0, 0x0, ...) /go/src/github.com/cilium/cilium/pkg/endpoint/endpoint.go:2194 +0x91 github.com/cilium/cilium/daemon/cmd.(*Daemon).deleteEndpointQuiet(...) /go/src/github.com/cilium/cilium/daemon/cmd/endpoint.go:674 github.com/cilium/cilium/daemon/cmd.(*Daemon).regenerateRestoredEndpoints.func2(0xc00062ac40, 0xc000b63214, 0xc000ad6900) /go/src/github.com/cilium/cilium/daemon/cmd/state.go:302 +0x7c created by github.com/cilium/cilium/daemon/cmd.(*Daemon).regenerateRestoredEndpoints /go/src/github.com/cilium/cilium/daemon/cmd/state.go:296 +0x8a0 ``` Signed-off-by: Chris Tarazi <chris@isovalent.com>

view details

push time in 2 days

PR merged cilium/cilium

Fix deadlock on eventqueue when it's being drained when endpoints are being restored area/misc kind/bug kind/regression needs-backport/1.8 needs-backport/1.9 priority/release-blocker ready-to-merge release-note/misc

See commit msgs.

Backporters, please ensure that https://github.com/cilium/cilium/pull/13608 is backported first, as this PR depends on it.

Fixes: https://github.com/cilium/cilium/pull/13608

This does not require a release-note as this was a regression introduced by https://github.com/cilium/cilium/pull/13608 which has not made it into any official release.

+93 -63

2 comments

2 changed files

christarazi

pr closed time in 2 days
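As a reader's aid for the deadlock fix described in the eventqueue commit above, here is a minimal, self-contained Go sketch of the forced-drain idea. The type and method names are illustrative only, not the actual cilium eventqueue API: if the run loop never started, Stop() closes the events channel but nothing ever closes eventsClosed, so WaitToBeDrained() blocks forever; forcing the loop to run lets it observe the closed channel and signal the drain.

```go
package main

import (
	"fmt"
	"sync"
)

// eventQueue mimics the shape of the problem: events are queued on a channel,
// stop() closes that channel, and waitToBeDrained() waits for eventsClosed.
type eventQueue struct {
	events       chan struct{}
	eventsClosed chan struct{}
	runOnce      sync.Once
}

func newEventQueue() *eventQueue {
	return &eventQueue{
		events:       make(chan struct{}, 16),
		eventsClosed: make(chan struct{}),
	}
}

// run drains events until the channel is closed, then signals drained.
// Calling it even after stop() is what breaks the deadlock described above.
func (q *eventQueue) run() {
	q.runOnce.Do(func() {
		go func() {
			for range q.events {
				// process (or simply discard) queued events
			}
			close(q.eventsClosed)
		}()
	})
}

func (q *eventQueue) stop()            { close(q.events) }
func (q *eventQueue) waitToBeDrained() { <-q.eventsClosed }

func main() {
	q := newEventQueue()
	q.events <- struct{}{} // event enqueued before the loop was ever started
	q.stop()
	q.run() // forcefully run the loop so the drain can complete
	q.waitToBeDrained()
	fmt.Println("drained without deadlock")
}
```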

pull request commentcilium/cilium

helm: Add check for prometheus service monitoring CRDs

@aanm what are the consequences if the CRD is missing? Cilium presumably runs fine and does its core functions (i.e. handles networking & policy); I assume that monitoring just doesn't work. If someone upgrades, the CRD is already in place, so I guess existing users are fine? Then it's only new users who need to take additional action, so we can document how to do it right for new installs.

Then beyond that, I'd guess there are the corner cases where someone gets into this state and doesn't know how to debug it: how do we make it obvious to users that the CRD is missing and that they need to take action? This PR presents one approach - force it during install - but there could be other approaches, like just documenting it as a troubleshooting step, or trying to perform runtime detection of the missing CRD when Cilium is configured with monitoring enabled, then logging a warning in that case.

sayboras

comment created time in 2 days
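A rough sketch of the runtime-detection alternative mentioned above, assuming client-go's discovery API and the Prometheus Operator's monitoring.coreos.com/v1 ServiceMonitor kind. This is not what the PR implements; it only illustrates how an agent could log a warning instead of failing at install time:

```go
package main

import (
	"log"

	"k8s.io/client-go/discovery"
	"k8s.io/client-go/rest"
)

// hasServiceMonitorCRD reports whether the cluster serves the Prometheus
// Operator's monitoring.coreos.com/v1 ServiceMonitor resource.
func hasServiceMonitorCRD(dc discovery.DiscoveryInterface) bool {
	resources, err := dc.ServerResourcesForGroupVersion("monitoring.coreos.com/v1")
	if err != nil {
		// group/version not served (or discovery failed): treat as missing
		return false
	}
	for _, r := range resources.APIResources {
		if r.Kind == "ServiceMonitor" {
			return true
		}
	}
	return false
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatalf("not running in a cluster: %v", err)
	}
	dc, err := discovery.NewDiscoveryClientForConfig(cfg)
	if err != nil {
		log.Fatalf("creating discovery client: %v", err)
	}
	if !hasServiceMonitorCRD(dc) {
		log.Println("warning: monitoring is enabled but the ServiceMonitor CRD is not installed; metrics scraping will not be configured")
	}
}
```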

push eventcilium/cilium

arthurchiao

commit sha e13de04aa20df50134f9e0f5f6a6413f13c3932c

metrics: fix negative identity count [ upstream commit 9673c485a72ec93c10e2db1f4fdc8feab45d3d98 ] Identity allocation uses cache and refcnt mechanisms, if the identity info is already in remote kvstore and localkeys store, it will just increase the refcnt, then notify the caller that this identity is reused. The caller will then not bump up the identity counter. However, there is a corner case that not get handled: refcnt from 0 to 1, which will result to negative identity count in the metrics output. This patch fixes the problem by returning another flag to indicate whether the identity is first-time referenced (refcnt from 0 to 1) or not. The caller then uses this information to determine whether or not to increase the counter. Signed-off-by: arthurchiao <arthurchiao@hotmail.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>

view details

Jianlin Lv

commit sha 4819438cedb36ba55dcd3a2e4fd6527c9d3ca355

lbmap: Fixed lb cmd byte order issue [ upstream commit c5b709b22cfc0c127c7bc92d7ea2eacdc6b59179 ] The port/RevNat info is stored in the bpf map in network byte order; When displaying the given lbmap content, the port needs to be converted to host byte order. Add ToHost() function for lbmap ToHost converts fields to host byte order. Signed-off-by: Jianlin Lv <Jianlin.Lv@arm.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>

view details

Jianlin Lv

commit sha e75201516104b7d6ce7dceb8d307d75d6d8ea9ab

lbmap: Optimize lbmap byte order related code [ upstream commit 682f6826bce735056a5e0d285ec0cdbb1e6cd9c8 ] Gets rid of ToNetwork() in DumpParser(). Signed-off-by: Jianlin Lv <Jianlin.Lv@arm.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>

view details

André Martins

commit sha 792414ce5e80aa0db787474437b88fd06c96994a

pkg/labelsfilter: add more unit test and rewrite docs [ upstream commit 5733f6af771c127ecc2777af4da3d2bcbb870154 ] This change is only adding more unit tests to better understand the behavior of the labelsfilter as well as improving the documentation for the expectation of filtering labels. Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>

view details

push time in 2 days

delete branch cilium/cilium

delete branch : pr/v1.7-backport-2020-10-23

delete time in 2 days

PR merged cilium/cilium

v1.7 backports 2020-10-23 backport/1.7 kind/backports ready-to-merge
  • #12313 -- metrics: fix negative identity count (@ArthurChiao)
  • #13244 -- lbmap: Correct issue that port info display error (@Jianlin-lv)
  • #13696 -- pkg/labelsfilter: add more unit test and rewrite docs (@aanm)

Once this PR is merged, you can update the PR labels via:

$ for pr in 12313 13244 13696; do contrib/backporting/set-labels.py $pr done 1.7; done

Each commit had some conflicts. They all seemed rather simple, but it would be good to get an additional set of eyes on this:

commit 9673c485a72ec93c10e2db1f4fdc8feab45d3d98

Rather trivial merge conflict where v1.7 used errors.Wrapf instead of fmt.Errorf.

commit c5b709b22cfc0c127c7bc92d7ea2eacdc6b59179

Simple multiple merge conflicts in the String() functions, as v1.7 seemed to use fmt.Sprintf instead of net.JoinHostPort (see the small example after these notes).

I skipped the changes to the pkg/maps/lbmap/affinity.go file, which does not seem to exist in the v1.7 branch.

commit 682f6826bce735056a5e0d285ec0cdbb1e6cd9c8

Simple merge conflict in String() due to loadbalancer.ScopeInternal check which is not present in v1.7

Skipped the changes to functions related affinity map dumping in pkg/maps/lbmap/lbmap.go. They do not seem present in v1.7.

Skipped changes to pkg/maps/lbmap/source_range.go and pkg/maps/lbmap/affinity.go, neither of which are present in v1.7.

commit 5733f6af771c127ecc2777af4da3d2bcbb870154

Simple removal of the labels. references in pkg/labels/filter_test.go due to this file being part of the labels package instead of the labelsfilter package in v1.7.
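As an aside to the net.JoinHostPort conflict noted above, a tiny illustration of why the swap is mostly mechanical for IPv4 host:port strings (net.JoinHostPort additionally brackets IPv6 hosts, which fmt.Sprintf does not):

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

func main() {
	host, port := "10.0.0.1", 8080

	viaSprintf := fmt.Sprintf("%s:%d", host, port)        // style used on v1.7
	viaJoin := net.JoinHostPort(host, strconv.Itoa(port)) // style used on master

	// For IPv4 both forms agree; JoinHostPort also brackets IPv6 hosts.
	fmt.Println(viaSprintf, viaJoin, viaSprintf == viaJoin) // 10.0.0.1:8080 10.0.0.1:8080 true
}
```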

+259 -145

3 comments

14 changed files

gandro

pr closed time in 2 days

PullRequestReviewEvent

pull request commentcilium/cilium

nodeinit: Update image tag

Might be interesting to think about whether we can collapse the four places in the tree where we're tracking this version into a single source of truth, but LGTM.

errordeveloper

comment created time in 2 days

push eventcilium/cilium

Ilya Dmitrichenko

commit sha 7948535b9c3b9029b2a32f7191f2dfe835ffee5d

nodeinit: Update image tag This tag has no functional changes over previous one, however it includes a fix to the manifest format issue that was seen due to a bug in BuildKit (see #13429). Signed-off-by: Ilya Dmitrichenko <errordeveloper@gmail.com>

view details

push time in 2 days

delete branch cilium/cilium

delete branch : pr/errordeveloper/update-startup-script-image-tag-v1.8

delete time in 2 days

PR merged cilium/cilium

nodeinit: Update image tag backport/1.8 kind/backports

In effect this is a backport of https://github.com/cilium/cilium/pull/13726, but the change is quite different so not a cherry-pick as such.

+1 -1

0 comment

1 changed file

errordeveloper

pr closed time in 2 days

PullRequestReviewEvent

push eventcilium/cilium

Ilya Dmitrichenko

commit sha 6b54fde9d90ab232fcd4abebc3cdf45d491678f5

nodeinit: Update image tag This tag has no functional changes over previous one, however it includes a fix to the manifest format issue that was seen due to a bug in BuildKit (see #13429). Signed-off-by: Ilya Dmitrichenko <errordeveloper@gmail.com>

view details

push time in 2 days

delete branch cilium/cilium

delete branch : pr/errordeveloper/update-startup-script-image-tag

delete time in 2 days

PR merged cilium/cilium

Reviewers
nodeinit: Update image tag needs-backport/1.9 release-note/misc

This tag has no functional changes over previous one, however it includes a fix to the manifest format issue that was seen due to a bug in BuildKit (see #13429).

+4 -4

2 comments

4 changed files

errordeveloper

pr closed time in 2 days

PullRequestReviewEvent

Pull request review commentcilium/cilium

monitor: Display human-readable identities

 import (
 	"strings"
 	"time"
+	"github.com/cilium/cilium/pkg/identity"

Using https://github.com/KyleBanks/depth :

$ depth ./pkg/identity
github.com/cilium/cilium/pkg/identity
  ├ errors
  ├ fmt
  ├ net
  ├ strconv
  ├ sync
  ├ github.com/cilium/cilium/pkg/k8s/apis/cilium.io
  ├ github.com/cilium/cilium/pkg/labels
    ├ bytes
    ├ crypto/sha512
    ├ encoding/json
    ├ fmt
    ├ net
    ├ sort
    ├ strings
    ├ github.com/cilium/cilium/pkg/logging/logfields
      └ fmt
    └ github.com/cilium/cilium/vendor/github.com/sirupsen/logrus
      ├ bufio
      ├ bytes
      ├ context
      ├ encoding/json
      ├ fmt
      ├ io
      ├ log
      ├ os
      ├ reflect
      ├ runtime
      ├ sort
      ├ strconv
      ├ strings
      ├ sync
      ├ sync/atomic
      ├ time
      ├ unicode/utf8
      └ github.com/cilium/cilium/vendor/golang.org/x/sys/unix
        ├ bytes
        ├ encoding/binary
        ├ math/bits
        ├ runtime
        ├ sort
        ├ strings
        ├ sync
        ├ syscall
        ├ time
        ├ unsafe
        └ github.com/cilium/cilium/vendor/golang.org/x/sys/internal/unsafeheader
          └ unsafe
  └ github.com/cilium/cilium/pkg/lock
    ├ context
    ├ sync
    ├ sync/atomic
    ├ github.com/cilium/cilium/vendor/github.com/sasha-s/go-deadlock
      ├ bufio
      ├ bytes
      ├ fmt
      ├ io
      ├ io/ioutil
      ├ os
      ├ os/user
      ├ path/filepath
      ├ runtime
      ├ strings
      ├ sync
      ├ time
      └ github.com/cilium/cilium/vendor/github.com/petermattis/goid
        ├ bytes
        ├ runtime
        └ strconv
    └ github.com/cilium/cilium/vendor/golang.org/x/sync/semaphore
      ├ container/list
      ├ context
      └ sync
38 dependencies (28 internal, 10 external, 0 testing).

Seems like a lot but I'm guessing the callers probably already pull in many of these?

pchaigno

comment created time in 2 days

Pull request review commentcilium/cilium

monitor: Display human-readable identities

 var (
 		labels.IDNameInit:       ReservedIdentityInit,
 		labels.IDNameRemoteNode: ReservedIdentityRemoteNode,
 	}
-	reservedIdentityNames = map[NumericIdentity]string{
+	ReservedIdentityNames = map[NumericIdentity]string{
+		IdentityUnknown:            "unknown",

At a glance it seems like it should be OK?

I'm a little bit wary that this moves us more in the direction of defining what an "unknown" identity is, when it really represents a lack of information rather than something specific. If we had perfect information in all monitor output messages it wouldn't exist. But I don't think this change will particularly influence this.

:+1:

pchaigno

comment created time in 2 days

Pull request review commentcilium/cilium

monitor: Display human-readable identities

 func (m PolicyMatchType) String() string {
 	}
 	return "unknown"
 }
+
+// Entity prints the ID in a human readable string
+func Entity(id uint32) string {

I think Entity as a name here conflates two separate concepts, and I don't think it's a good idea to mix them up. The input is a numeric Identity, which represents exactly one possible security context of the peer in the connection. An Entity, on the other hand, is a policy-level selector which covers multiple individual Identities (for example, the cluster entity covers various different identities that are all conceptually part of the 'cluster').

I wonder whether we could just convert the uint32 from the monitor side into a native security identity type instead and get the string conversion from there? I'm looking at this, which seems to be equivalent to this function:

https://github.com/cilium/cilium/blob/91748c7e19b4afd3a7d30087b7ba5f69da25452f/pkg/identity/numericidentity.go#L364-L370

pchaigno

comment created time in 2 days
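A minimal sketch of that suggestion, assuming the NumericIdentity type in pkg/identity exposes the String() method linked above; the helper name here is hypothetical:

```go
package main

import (
	"fmt"

	"github.com/cilium/cilium/pkg/identity"
)

// identityString shows the conversion: cast the raw uint32 carried in the
// monitor payload to NumericIdentity and reuse its existing String() method
// rather than adding a new api.Entity() helper.
func identityString(id uint32) string {
	return identity.NumericIdentity(id).String()
}

func main() {
	// 1 is the reserved "host" identity in cilium.
	fmt.Println(identityString(1))
}
```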

Pull request review commentcilium/cilium

monitor: Display human-readable identities

 type DropNotify struct {

 // DumpInfo prints a summary of the drop messages.
 func (n *DropNotify) DumpInfo(data []byte) {
-	fmt.Printf("xx drop (%s) flow %#x to endpoint %d, identity %d->%d: %s\n",
-		api.DropReason(n.SubType), n.Hash, n.DstID, n.SrcLabel, n.DstLabel,
+	fmt.Printf("xx drop (%s) flow %#x to endpoint %d, identity %s->%s: %s\n",
+		api.DropReason(n.SubType), n.Hash, n.DstID, api.Entity(n.SrcLabel), api.Entity(n.DstLabel),

I wonder if it would still be useful to also have something like a -n, --numeric option to support printing the numeric identities, in case we find the numbers easier to read?

pchaigno

comment created time in 2 days
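A small sketch of what such an option could look like; the flag name and formatting are hypothetical, not an existing cilium monitor option:

```go
package main

import (
	"flag"
	"fmt"
)

// --numeric is a hypothetical flag: when set, print raw identity numbers
// instead of their human-readable names.
var numeric = flag.Bool("numeric", false, "print numeric security identities")

func formatIdentity(id uint32, name string) string {
	if *numeric {
		return fmt.Sprintf("%d", id)
	}
	return name
}

func main() {
	flag.Parse()
	// 6 is the reserved "remote-node" identity in cilium.
	fmt.Println(formatIdentity(6, "remote-node"))
}
```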

PullRequestReviewEvent
PullRequestReviewEvent

pull request commentcilium/cilium

v1.8 backports 2020-10-22

Same test failed: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.17-kernel-4.19/19/testReport/junit/Suite-k8s-1/17/K8sServicesTest_Checks_service_across_nodes_Tests_NodePort_BPF_Tests_with_direct_routing_Tests_NodePort_with_externalTrafficPolicy_Local/

If it has failed twice in a row, it seems less likely to be a random flake, so we should investigate the failure further.

I'm not sure how it compares to the other issues in K8sServicesTest reported previously: https://github.com/cilium/cilium/issues?q=is%3Aissue+is%3Aopen+%22exitcode%3A+42%22 but it seems like those were more occasional flakes rather than a reliable failure.

pchaigno

comment created time in 2 days

Pull request review commentcilium/cilium

docs: Add Azure troubleshooting tips

 Limitations

 * All VMs and VM scale sets used in a cluster must belong to the same resource
   group.
+
+.. _azure_troubleshooting:
+
+Troubleshooting
+===============
+* If ``kubectl exec`` to a pod fails to connect, restarting the ``tunnelfront`` pod may help.
+* Pods may fail to gain a ``.spec.hostNetwork`` status even if restarted and managed by Cilium.
+* If some connectivity tests fail to reach the ready state you may need to restart the unmanaged pods again.
+* Some connectivity tests may fail. This is being tracked in `Cilium GitHub issue #12113
+  <https://github.com/cilium/cilium/issues/12113>`_.
+* ``hubble observe`` may report one or more nodes being unavailable and ``hubble-ui`` may fail to connect to the backends.

Do we need to file an issue for this one too?

jrajahalme

comment created time in 2 days

PullRequestReviewEvent
PullRequestReviewEvent

push eventcilium/cilium

Martynas Pumputis

commit sha 91748c7e19b4afd3a7d30087b7ba5f69da25452f

docs: Do not over promise in BPF-masq docs We haven't decoupled both yet. Signed-off-by: Martynas Pumputis <m@lambda.lt>

view details

push time in 2 days

PR merged cilium/cilium

docs: Do not over promise in BPF-masq docs needs-backport/1.8 needs-backport/1.9 pending-review release-note/misc

We haven't decoupled both yet.

+1 -1

0 comment

1 changed file

brb

pr closed time in 2 days

delete branch cilium/cilium

delete branch : pr/brb/docs-bpf-masq-cleanup

delete time in 2 days

PullRequestReviewEvent

push eventcilium/cilium

Martynas Pumputis

commit sha 61100c50b8fece5cac963c67f71c259ca1b05052

docs: Add a note about systemd 245 rp_filter issue Signed-off-by: Martynas Pumputis <m@lambda.lt>

view details

push time in 2 days

delete branch cilium/cilium

delete branch : pr/brb/doc-systemd-245

delete time in 2 days

PR merged cilium/cilium

docs: Add a note about systemd 245 rp_filter issue area/documentation needs-backport/1.7 needs-backport/1.8 needs-backport/1.9 pending-review priority/release-blocker release-note/misc

This doc change removes release-blocker note from https://github.com/cilium/cilium/issues/10645.

+10 -0

2 comments

1 changed file

brb

pr closed time in 2 days

PullRequestReviewEvent

delete branch joestringer/cilium

delete branch : submit/bpff-docs

delete time in 2 days

pull request commentcilium/cilium

helm: Add check for prometheus service monitoring CRDs

@aanm and what's the behaviour from Cilium side right now if the monitor CRD is not available but Cilium is configured with monitoring?

sayboras

comment created time in 2 days

pull request commentcilium/cilium

helm: Add check for prometheus service monitoring CRDs

One point on this would be, how do we expect the overall flow for install to work?

Option A:

  • Install monitoring
  • Install Cilium

Option B:

  • Install Cilium without monitoring
  • Install monitoring
  • Modify Cilium to enable monitoring

And how do we tell users to set up Cilium with monitoring in the documentation?

In some ways it seems nice to figure out during install whether the prerequisites are satisfied and let the user know, since it makes the requirements more visible at install time. Sometimes, though, these kinds of checks can cause more trouble down the road because they force a certain ordering of operations.

Is it possible for us to provide a runtime warning or something instead to signal when the user has installed Cilium but hasn't set up the monitoring CRDs? Then ideally whenever the user installs the CRD, Cilium would automatically just start working with it (but I could understand if a first cut also requires restarting Cilium).

sayboras

comment created time in 2 days

issue commentcilium/cilium

daemon: Do not warn if kernel config is not available

Another option would be to perform a bpf() syscall and fail out specifically if that fails. That would provide a more reliable signal that CONFIG_BPF is not enabled.

brb

comment created time in 2 days
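A rough sketch of that probe idea, assuming Linux and golang.org/x/sys/unix. This is not cilium's actual check; it only illustrates that an ENOSYS errno from bpf(2) is a strong signal the kernel was built without BPF syscall support:

```go
//go:build linux

package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

// bpfSyscallAvailable invokes bpf(2) with command 0 (BPF_MAP_CREATE) and a nil
// attribute pointer. The call is expected to fail either way; what matters is
// the errno: ENOSYS means the kernel has no bpf() syscall at all (e.g. BPF
// support compiled out), while anything else (EINVAL, EPERM, ...) means it exists.
func bpfSyscallAvailable() bool {
	_, _, errno := unix.Syscall(unix.SYS_BPF, 0, 0, 0)
	return errno != unix.ENOSYS
}

func main() {
	if bpfSyscallAvailable() {
		fmt.Println("bpf() syscall present")
	} else {
		fmt.Println("bpf() syscall missing: kernel likely built without BPF support")
	}
}
```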

Pull request review commentcilium/cilium

docs: Add a note about systemd 245 rp_filter issue

 RancherOS_                 >= 1.5.5

           Linux distribution that works well, please let us know by opening a
           GitHub issue or by creating a pull request that updates this guide.

+.. note:: Systemd 245 and above (``systemctl --version``) overrides ``rp_filter`` setting
+          of Cilium network interfaces. This introduces connectivity issues (see
+          `GH-10645 <https://github.com/cilium/cilium/issues/10645>`_ for details). To
+          avoid that, the following can be run:

nit: better to avoid the passive voice. Rather than "the following can be run", say "run the following".

          `GH-10645 <https://github.com/cilium/cilium/issues/10645>`_ for details). To
          avoid that, configure rp_filter in systemd using the following commands:
brb

comment created time in 2 days

PullRequestReviewEvent
PullRequestReviewEvent

push eventjoestringer/cilium

Joe Stringer

commit sha d12aacd070211054dd5251d9da3321c9400317e0

helm: Fix KeepDeprecatedProbes for 1.8 upgrade Commit c8360a6b12ba ("test/k8s: keep configmap across upgrade test") introduced the option KeepDeprecatedProbes which allowed smooth upgrade from 1.7 or earlier to 1.8 or later by generating cilium-agent DaemonSet health probes in the old style if the user specified the keepDeprecatedProbes option during helm install. However, it didn't take into account users upgrading from fresh 1.8.x installs to 1.9.x or later, where deprecated probes should never be used. Teach this setting about "upgradeCompatibility" so that when the user either directly specifies the option, or if they provide the appropriate upgrade compatibility version, the deprecated probes will only be used if the initial install was with Cilium 1.7 or earlier. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha ba7c8224ec108087fc6fa0da4effd3eab4aacf8f

docs: Improve upgradeCompatibility option docs The upgradeCompatibility option was not specified in the default instructions, but there was a more specific option that provided only one piece of upgrade compatibility. Convert the main instructions to use the common variable for providing smooth upgrade which allows control over multiple options. While we're at it, fix the indentation for the values file instructions. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha 7d356ff2cf40528e7be54ce9f627071a33277442

helm: Disable bpf tproxy by default Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha 9de5fd6359212c69a55978a4a75c10291b1c22d4

helm: Fix L7 proxy option The way this was written, you could never disable the l7 proxy because the helm chart ignored the setting if it was false. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha ac29aa76ce35bd3d82bf457b707f6769250da797

helm: Remove default options to support upgrade These settings cannot be specified by default in the values.yaml file, because otherwise it would override the upgradeCompatibility value for the flag and hence break compatibility during upgrade from v1.7 (no such option) -> v1.8 (should be disabled during upgrade) -> v1.9. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

André Martins

commit sha 26878ecfa3762dcf494831f4a6af55aff7efea5f

install/kubernetes: add disableEnvoyVersionCheck option [Forward-cherry-pick] With this option users will be able to deploy Cilium without envoy support which is helpful for arm64 clusters. Related: #13650 Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Aditi Ghag

commit sha 4d07a345860ece4ad8b8ae3867cbc2b26e1a3aa8

examples/doc: Add example LRP yamls and update guide for real-world use cases Update the getting started guide with the kiam use case. Signed-off-by: Aditi Ghag <aditi@cilium.io>

view details

Aditi Ghag

commit sha 92f0a9bcb9c1165b0767f60138b1799e2fdc5ab5

Adds documentation about running node-local-dns with LRP. Signed-off-by: Weilong Cui <cuiwl@google.com> Signed-off-by: Aditi Ghag <aditi@cilium.io>

view details

Joe Stringer

commit sha 2f43aaa96a71663752b35cb33481658b5ba2a55d

docs: Move bpffs requirement to main requirements This requirement is not specific to kubernetes, and the kubernetes guide already points towards the main system requirements document. Move the BPFFS requirement there. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha 28f50fb0af7222395a67ac1593ad35a2c89a13fc

docs: Clarify consequences of unmounted BPFFS Signed-off-by: Joe Stringer <joe@cilium.io>

view details

push time in 3 days

push eventcilium/cilium

Aditi Ghag

commit sha 4d07a345860ece4ad8b8ae3867cbc2b26e1a3aa8

examples/doc: Add example LRP yamls and update guide for real-world use cases Update the getting started guide with the kiam use case. Signed-off-by: Aditi Ghag <aditi@cilium.io>

view details

Aditi Ghag

commit sha 92f0a9bcb9c1165b0767f60138b1799e2fdc5ab5

Adds documentation about running node-local-dns with LRP. Signed-off-by: Weilong Cui <cuiwl@google.com> Signed-off-by: Aditi Ghag <aditi@cilium.io>

view details

push time in 3 days

PR merged cilium/cilium

Reviewers
examples/doc Add example LRP yaml files for real-world use cases needs-backport/1.9 priority/release-blocker release-note/misc

Question: I need to add some notes about the yaml files (e.g., the port names need to match with the ones defined in Daemonset, etc). What's the best place to add such notes? One of the options is the local-redirect policy getting started guide [1].

Also, there are some changes required for the upstream yaml files (nodelocaldns and kiam daemonsets). Should we just mention the required changes in the gsg?

Fixes:#11646 Fixes:#13040

[1] https://github.com/cilium/cilium/blob/master/Documentation/gettingstarted/local-redirect-policy.rst

+396 -5

4 comments

5 changed files

aditighag

pr closed time in 3 days

Pull request review commentcilium/cilium

examples/doc Add example LRP yaml files for real-world use cases

+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: node-local-dns
+  namespace: kube-system
+  labels:
+    kubernetes.io/cluster-service: "true"
+    addonmanager.kubernetes.io/mode: Reconcile
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: kube-dns-upstream
+  namespace: kube-system
+  labels:
+    k8s-app: kube-dns
+    kubernetes.io/cluster-service: "true"
+    addonmanager.kubernetes.io/mode: Reconcile
+    kubernetes.io/name: "KubeDNSUpstream"
+spec:
+  ports:
+  - name: dns
+    port: 53
+    protocol: UDP
+    targetPort: 53
+  - name: dns-tcp
+    port: 53
+    protocol: TCP
+    targetPort: 53
+  selector:
+    k8s-app: kube-dns
+---
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: node-local-dns
+  namespace: kube-system
+  labels:
+    addonmanager.kubernetes.io/mode: Reconcile
+data:
+  Corefile: |
+    cluster.local:53 {
+        errors
+        cache {
+                success 9984 30
+                denial 9984 5
+        }
+        reload
+        loop
+        bind 0.0.0.0
+        forward . __PILLAR__CLUSTER__DNS__ {
+                force_tcp
+        }
+        prometheus :9253
+        health
+        }
+    in-addr.arpa:53 {
+        errors
+        cache 30
+        reload
+        loop
+        bind 0.0.0.0
+        forward . __PILLAR__CLUSTER__DNS__ {
+                force_tcp
+        }
+        prometheus :9253
+        }
+    ip6.arpa:53 {
+        errors
+        cache 30
+        reload
+        loop
+        bind 0.0.0.0
+        forward . __PILLAR__CLUSTER__DNS__ {
+                force_tcp
+        }
+        prometheus :9253
+        }
+    .:53 {
+        errors
+        cache 30
+        reload
+        loop
+        bind 0.0.0.0
+        forward . __PILLAR__UPSTREAM__SERVERS__

Looks like not, from some brief googling, as these will be substituted by the node cache. Anyhow, from the above it sounds like this was already validated, so it should be good to go.

aditighag

comment created time in 3 days

PullRequestReviewEvent

Pull request review commentcilium/cilium

examples/doc Add example LRP yaml files for real-world use cases

+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: node-local-dns
+  namespace: kube-system
+  labels:
+    kubernetes.io/cluster-service: "true"
+    addonmanager.kubernetes.io/mode: Reconcile
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: kube-dns-upstream
+  namespace: kube-system
+  labels:
+    k8s-app: kube-dns
+    kubernetes.io/cluster-service: "true"
+    addonmanager.kubernetes.io/mode: Reconcile
+    kubernetes.io/name: "KubeDNSUpstream"
+spec:
+  ports:
+  - name: dns
+    port: 53
+    protocol: UDP
+    targetPort: 53
+  - name: dns-tcp
+    port: 53
+    protocol: TCP
+    targetPort: 53
+  selector:
+    k8s-app: kube-dns
+---
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: node-local-dns
+  namespace: kube-system
+  labels:
+    addonmanager.kubernetes.io/mode: Reconcile
+data:
+  Corefile: |
+    cluster.local:53 {
+        errors
+        cache {
+                success 9984 30
+                denial 9984 5
+        }
+        reload
+        loop
+        bind 0.0.0.0
+        forward . __PILLAR__CLUSTER__DNS__ {
+                force_tcp
+        }
+        prometheus :9253
+        health
+        }
+    in-addr.arpa:53 {
+        errors
+        cache 30
+        reload
+        loop
+        bind 0.0.0.0
+        forward . __PILLAR__CLUSTER__DNS__ {
+                force_tcp
+        }
+        prometheus :9253
+        }
+    ip6.arpa:53 {
+        errors
+        cache 30
+        reload
+        loop
+        bind 0.0.0.0
+        forward . __PILLAR__CLUSTER__DNS__ {
+                force_tcp
+        }
+        prometheus :9253
+        }
+    .:53 {
+        errors
+        cache 30
+        reload
+        loop
+        bind 0.0.0.0
+        forward . __PILLAR__UPSTREAM__SERVERS__

Do these other PILLAR variables also need substitution or are they fine like this?

aditighag

comment created time in 3 days

PullRequestReviewEvent

pull request commentcilium/cilium

contrib: Add script to bump stable docker image tags

@aanm oh, I guess there was a disconnect here. I marked this as draft under the assumption it's not necessary at this point. But it also has priority/release-blocker. Maybe I'll revert from draft again and we can evaluate whether to accept or just close it out.

joestringer

comment created time in 3 days

Pull request review commentcilium/cilium

Update wording for BPFFS requirement and move to main system requirements page

 Port Range / Protocol    Description
 9876/tcp                 cilium-agent health status API
 ======================== ==========================================

+.. _admin_mount_bpffs:
+
+Mounted eBPF filesystem
+=======================
+
+.. Note::
+
+        Some distributions mount the bpf filesystem automatically. Check if the
+        bpf filesystem is mounted by running the command.
+
+        .. code-block:: shell-session
+
+                  mount | grep /sys/fs/bpf
+                  # if present should output, e.g. "none on /sys/fs/bpf type bpf"...
+
+This step is **required for production** environments but optional for testing
+and development. It allows the ``cilium-agent`` to pin eBPF resources to a
+persistent filesystem and make them persistent across restarts of the agent.
+If the eBPF filesystem is not mounted in the host filesystem, Cilium will
+automatically mount the filesystem but it will be unmounted and re-mounted when
+the Cilium pod is restarted. This in turn will cause eBPF resources to be
+removed which impacts cross-node forwarding until the eBPF filesystem is
+remounted and Cilium restarts. Mounting the eBPF filesystem in the host mount

It's more the Cilium restarts bit at the end that should fix it. Maybe it should just be:

This in turn will cause eBPF resources to be removed which impacts cross-node forwarding until Cilium restarts.

joestringer

comment created time in 3 days

PullRequestReviewEvent

push eventcilium/cilium

Paul Chaignon

commit sha 984f7bf1829c326ef431d843f1b996dff80aef69

test: Display BPF map content on fail [ upstream commit ec2c18a074de2446186854188c4853c5c5664ffc ] Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

push time in 3 days

delete branch cilium/cilium

delete branch : pr/v1.7-backport-2020-10-20

delete time in 3 days

PR merged cilium/cilium

v1.7 backport 2020-10-20 backport/1.7 kind/backports

v1.7 backports 2020-10-20

  • #13295 -- test: Debug RuntimeConntrackInVethModeTest flake (@pchaigno)

Once this PR is merged, you can update the PR labels via:

$ for pr in 13295 ; do contrib/backporting/set-labels.py $pr done 1.7; done
+1 -0

7 comments

1 changed file

nathanjsweet

pr closed time in 3 days

push eventcilium/cilium

Tobias Klauser

commit sha 62cb734049c7138b8bdff9ac583b4b39623df5f3

contrib: match commit subject exactly when searching for upstream commit [ upstream commit 6557f7557ca7ef65f8097743d4fdb935967d686e ] In generate_commit_list_for_pr, the commit subject is used to determine the upstream commit ID from $REMOTE/master. However, if in the meantime another commit with e.g. a Fixes tag that mentions this commit subject, it appears first and leads to the original commit not being found. This can be demonstrated using #13383: ``` * PR: 13383 -- daemon: Enable configuration of iptables --random-fully (@kh34) -- https://github.com/cilium/cilium/pull/13383 Merge with 2 commit(s) merged at: Wed, 14 Oct 2020 11:41:51 +0200! Branch: master (!) refs/pull/13383/head ---------- ------------------- v (start) | Warning: No commit correlation found! via dbac86cffc6d57e8c093d2821e0d794f4c13d284 ("daemon: Enable configuration of iptables --random-fully") | 350f0b36fd9b4cf23ebc11f4365c5c89591d0ff4 via 22d4554e963e2d8029ff95087ac03e55e90a7377 ("test: Test iptables masquerading with --random-fully") v (end) $ # this is the git log command (with the subject added) from $ # contrib/backporting/check-stable that should extract a single $ # upstream commit $ git log -F --since="1year" --pretty="%H %s" --no-merges --grep "daemon: Enable configuration of iptables --random-fully" origin/master 078ec543d36a8f5d6caed5c4649c74c72090ae20 install/kubernetes: consistent case spelling of iptables related values 4e39def13bca568a21087238877fbc60f8751567 daemon: Enable configuration of iptables --random-fully $ git show 078ec543d36a8f5d6caed5c4649c74c72090ae20 commit 078ec543d36a8f5d6caed5c4649c74c72090ae20 Author: Tobias Klauser <tklauser@distanz.ch> Date: Wed Oct 14 11:58:29 2020 +0200 install/kubernetes: consistent case spelling of iptables related values Make the case spelling of the newly introduced "ipTablesRandomFully" value consistent with other iptables option values which use the "iptables" spelling. Fixes: 4e39def13bca ("daemon: Enable configuration of iptables --random-fully") Signed-off-by: Tobias Klauser <tklauser@distanz.ch> ``` Note the `Fixes: ...` line in commit 078ec543d36a8f5d6caed5c4649c74c72090ae20 above. Fix this behavior by grepping for the subject line from start of line: ``` $ git log -F --since="1year" --pretty="%H %s" --no-merges --extended-regexp --grep "^daemon: Enable configuration of iptables --random-fully" origin/master 4e39def13bca568a21087238877fbc60f8751567 daemon: Enable configuration of iptables --random-fully * PR: 13383 -- daemon: Enable configuration of iptables --random-fully (@kh34) -- https://github.com/cilium/cilium/pull/13383 Merge with 2 commit(s) merged at: Wed, 14 Oct 2020 11:41:51 +0200! Branch: master (!) refs/pull/13383/head ---------- ------------------- v (start) | 4e39def13bca568a21087238877fbc60f8751567 via dbac86cffc6d57e8c093d2821e0d794f4c13d284 ("daemon: Enable configuration of iptables --random-fully") | 350f0b36fd9b4cf23ebc11f4365c5c89591d0ff4 via 22d4554e963e2d8029ff95087ac03e55e90a7377 ("test: Test iptables masquerading with --random-fully") v (end) ``` Reported-by: Robin Hahling <robin.hahling@gw-computing.net> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Maciej Kwiek

commit sha 4bdf13b7f672e68820a2e1d370083f8b5c875570

helm: disable priorityClass for gke [ upstream commit 7c265a660a3b42e7bf67a90a08dc95210cf910e0 ] Since `helm template` doesn't allow to set `.Capabilities.KubeVersion`, we need additional option to force helm into not setting priorityClass for pods running in namespace other than `kube-system`, which is currently the case on GKE. Signed-off-by: Maciej Kwiek <maciej@isovalent.com> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Maciej Kwiek

commit sha 27f857a880808e63452d9cac5ae3cd2b900dd560

ci: run baseline perf tests in gh action [ upstream commit e1dcb61401e47670a4517895984c6c11c8eda7cb ] Signed-off-by: Maciej Kwiek <maciej@isovalent.com> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Ilya Dmitrichenko

commit sha f180eabfb878330c411752e83f8f729d2cdbea12

images: Fix handing of dev suffix when tag is used (cilium/image-tools#76) [ upstream commit cc976735855d25d184ecb9b98049674b03ca45c0 ] Signed-off-by: Ilya Dmitrichenko <errordeveloper@gmail.com> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Tobias Klauser

commit sha 8555a99c5fc968ddaef400641c4d82d105f42821

k8s/watchers: fix data race in (*K8sWatcher).addK8sServiceV1 [ upstream commit 122af2c9e6f9a56f2ea50dafc8e938129e8f7549 ] Access to k.podStore must be protected by k.podStoreMU. Closes #13603 Fixes: e7bb8a7eadb5 ("k8s/cilium Event handlers and processing logic for LRPs") Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Tobias Klauser

commit sha 3f3e0e8caa1239da11b7cb78b084ad6f48bbc0ab

redirectpolicy: use StoreGetter to get pod store [ upstream commit 7fae9855fdae085dce6621faf19522b313be7de2 ] Dont' pass k.podStore for every k8s service add. Instead, use a StoreGetter to get the pod store. Suggested-by: André Martins <andre@cilium.io> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

hui.kong

commit sha 6525380a4157f9e79fdec8229392856498a70166

add error log When ipam allocate nodecidr failure [ upstream commit ef6ecbdb6183e5bd3ecd37590d0ddc46c79cfc13 ] Signed-off-by: hui.kong <konghui@live.cn> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Paul Chaignon

commit sha 251c08a29632de080f2ded8d40229e738345dd28

test: Display BPF map content on fail [ upstream commit ec2c18a074de2446186854188c4853c5c5664ffc ] Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Chris Tarazi

commit sha 80530cc84b464191d42b3e4dd485bd8f4038df74

eni: Fix data race during multiple updates at once [ upstream commit e2d8e224247e40e9ee0a60c7c922639ed597301b ] This data race was found by modifying the (*ENISuite).TestNodeManagerDefaultAllocation test to inject spurious update calls to the IPAM NodeManager for the CiliumNode that's already being updated. Fixes: ``` WARNING: DATA RACE Read at 0x00c0005a09e8 by goroutine 36: github.com/cilium/cilium/pkg/aws/eni.(*Node).CreateInterface() /home/chris/code/cilium/cilium/pkg/aws/eni/node.go:358 +0x4aa github.com/cilium/cilium/pkg/ipam.(*Node).createInterface() /home/chris/code/cilium/cilium/pkg/ipam/node.go:441 +0x2a1 github.com/cilium/cilium/pkg/ipam.(*Node).maintainIPPool() /home/chris/code/cilium/cilium/pkg/ipam/node.go:634 +0x868 github.com/cilium/cilium/pkg/ipam.(*Node).MaintainIPPool() /home/chris/code/cilium/cilium/pkg/ipam/node.go:678 +0xd0 github.com/cilium/cilium/pkg/ipam.(*NodeManager).Update.func2() /home/chris/code/cilium/cilium/pkg/ipam/node_manager.go:279 +0xab github.com/cilium/cilium/pkg/trigger.(*Trigger).waiter() /home/chris/code/cilium/cilium/pkg/trigger/trigger.go:206 +0x4db Previous write at 0x00c0005a09e8 by goroutine 40: github.com/cilium/cilium/pkg/aws/eni.(*Node).UpdatedNode() /home/chris/code/cilium/cilium/pkg/aws/eni/node.go:66 +0x85 github.com/cilium/cilium/pkg/ipam.(*Node).UpdatedResource() /home/chris/code/cilium/cilium/pkg/ipam/node.go:325 +0x81 github.com/cilium/cilium/pkg/ipam.(*NodeManager).Update.func1() /home/chris/code/cilium/cilium/pkg/ipam/node_manager.go:263 +0x8e github.com/cilium/cilium/pkg/ipam.(*NodeManager).Update() /home/chris/code/cilium/cilium/pkg/ipam/node_manager.go:321 +0x1e6 github.com/cilium/cilium/pkg/aws/eni.(*ENISuite).TestNodeManagerDefaultAllocation.func2() /home/chris/code/cilium/cilium/pkg/aws/eni/node_manager_test.go:276 +0x6f Goroutine 36 (running) created at: github.com/cilium/cilium/pkg/trigger.NewTrigger() /home/chris/code/cilium/cilium/pkg/trigger/trigger.go:129 +0x24e github.com/cilium/cilium/pkg/ipam.(*NodeManager).Update() /home/chris/code/cilium/cilium/pkg/ipam/node_manager.go:274 +0x659 github.com/cilium/cilium/pkg/aws/eni.(*ENISuite).TestNodeManagerDefaultAllocation() /home/chris/code/cilium/cilium/pkg/aws/eni/node_manager_test.go:262 +0xc15 runtime.call32() /usr/lib/go/src/runtime/asm_amd64.s:540 +0x3d reflect.Value.Call() /usr/lib/go/src/reflect/value.go:336 +0xd8 gopkg.in/check%2ev1.(*suiteRunner).forkTest.func1() /home/chris/code/cilium/cilium/vendor/gopkg.in/check.v1/check.go:781 +0xabb gopkg.in/check%2ev1.(*suiteRunner).forkCall.func1() /home/chris/code/cilium/cilium/vendor/gopkg.in/check.v1/check.go:675 +0xe1 Goroutine 40 (running) created at: github.com/cilium/cilium/pkg/aws/eni.(*ENISuite).TestNodeManagerDefaultAllocation() /home/chris/code/cilium/cilium/pkg/aws/eni/node_manager_test.go:274 +0x10b2 runtime.call32() /usr/lib/go/src/runtime/asm_amd64.s:540 +0x3d reflect.Value.Call() /usr/lib/go/src/reflect/value.go:336 +0xd8 gopkg.in/check%2ev1.(*suiteRunner).forkTest.func1() /home/chris/code/cilium/cilium/vendor/gopkg.in/check.v1/check.go:781 +0xabb gopkg.in/check%2ev1.(*suiteRunner).forkCall.func1() /home/chris/code/cilium/cilium/vendor/gopkg.in/check.v1/check.go:675 +0xe1 ``` Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Chris Tarazi

commit sha e3b83ab53f220c8d95875941333d0eb400ca2b9a

eni: Signal to avoid locking IPAM node in ENI [ upstream commit 6a5ecf0f992bcb2a31e909d7f29d19148f07f0e5 ] This field does not need to be protected by the mutex because it is already protected by its own mutex. This commit moves the declaration of the mutex field to be below the ipam.Node field to indicate that it's not protected by the mutex, and therefore does not make any functional changes. Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Martynas Pumputis

commit sha 216c2be7d7c2ed52c6f4448310616fff0bf904e6

docs: Add note about src/dst IP check on cloud for DSR [ upstream commit ad6cffe1efc69551fdada27e184394855b38867e ] Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Martynas Pumputis

commit sha df09f0f9a80d7bcf1e73f934f1dd8320006711a6

docs: Mention about no ClusterIP access from outside cluster [ upstream commit 6e85cb49682e9a1eedbebb0079ebc35be263b9c2 ] The k8s Service "spec" disallows accessing ClusterIP services from outside a cluster. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Daniel Borkmann

commit sha dab3559f5418aad251f3105f10ddc95dc722f03c

bpf: lift v4-in-v6 limitation on lrp services [ upstream commit f1c3c71f0003a62c6db65342fa87b68c9ceb8e9b ] v4-in-v6 for LRP services was originally not implemented due to throwing an obscure verifier error with no particular log message involved: [...] level=warning msg=“Prog section ‘connect6’ rejected: Unknown error 524 (524)!” subsys=datapath-loader level=warning msg=” - Type: 18" subsys=datapath-loader level=warning msg=” - Attach Type: 11" subsys=datapath-loader level=warning msg=” - Instructions: 1105 (0 over limit)” subsys=datapath-loader level=warning msg=” - License: GPL” subsys=datapath-loader level=warning subsys=datapath-loader level=warning msg=“Verifier analysis:” subsys=datapath-loader level=warning subsys=datapath-loader level=warning msg=“processed 2500 insns (limit 1000000) max_states_per_insn 3 total_states 169 peak_states 169 mark_read 26" subsys=datapath-loader level=warning subsys=datapath-loader [...] This may indeed be an odd combination of either LLVM code generation or verifier quirk. It is fixed by changing the return 0 to -ENXIO (or some other error code). Changing to an error code is the correct approach either way as otherwise we indicate to sock6_xlate_v4_in_v6() that a front-to-backend mapping actually took place which clearly did not though. I've opened #13637 to investigate verifier/kernel side for later. Also, simplify the ifdef wrapping a bit and let the compiler optimize away the conditional when sock4_skip_xlate_if_same_netns() statically returns false. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Daniel Borkmann

commit sha 582956601f34762ce9a70881c0265d1da8c24607

test: add v4-in-v6 test cases on lrp services [ upstream commit cc63b9874b7aae1fc1f149c94b8f74232c86c5d9 ] Extend the LRP curl tests to also include v4-in-v6 URLs. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Andor Nemeth

commit sha 9ab95441045ceda1de81db6e602a028a20ee6d2b

Expose operator azure-user-assigned-identity-id flag to its chart [ upstream commit ad8f0510175f48179934299a88db3ec542b70809 ] Signed-off-by: Andor Nemeth <ombre9@gmail.com> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Tom Payne

commit sha 21774b2687ee0a68c98c342c1f00de4bed7849e1

build: Factor out test-docs target [ upstream commit 3c15ad1f11b97a40bd94d252bbef819b7c14e2a2 ] run-server starts a Docker container, which is not needed if you only want to verify the structure of the documentation. Signed-off-by: Tom Payne <tom@isovalent.com> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Tom Payne

commit sha abd89f197fd03989b7b7ef6156c8ec00c78e0596

docs/gettingstarted: Update AKS guide [ upstream commit 992c321178f1330e3dd1f987cea43a130e3e3417 ] Refs: #13627 Signed-off-by: Tom Payne <tom@isovalent.com> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Daniel Borkmann

commit sha cd02aa3c0ef631c00cc15020b0a26beff1d9a396

bpf: update/sync helper list [ upstream commit 318abd223b1769274f0e7c253e0bc4940309b467 ] Pull in latest BPF uapi helpers from Linus' tree. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Daniel Borkmann

commit sha b5f036de450ff143abf02e9c340205d4455bfeb9

bpf: remove warning on host legacy routing when reenabling [ upstream commit 11c03b90180bc5c9dc07a32a4580de65888acc6c ] Given it's an optimization, remove the warning to avoid users hitting this every time. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

Daniel Borkmann

commit sha 343563a4347298aa27ff7609731c5bbdde4ce60d

bpf: rename ENABLE_REDIRECT_NEIGH into _FAST [ upstream commit f75bc53aa0c10dfa2954b497f7395cdc3f70fc13 ] Fwiw, it's slightly more appropriate given it's not just about the availability of that helper but about being able to have fast path through host ns. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

view details

push time in 3 days

delete branch cilium/cilium

delete branch : pr/v1.9-backport-2020-10-21

delete time in 3 days

PR merged cilium/cilium

v1.9 backports 2020-10-21 backport/1.9 kind/backports ready-to-merge

v1.9 backports 2020-10-21

  • [x] #13630 -- contrib: match commit subject exactly when searching for upstream commit (@tklauser)
  • [ ] #13376 -- ci: run baseline perf tests nightly (@nebril)
  • [ ] #13639 -- images: Fix handing of dev suffix when tag is used (cilium/image-tools#76) (@errordeveloper)
  • [x] #13604 -- k8s/watchers: fix data race in (*K8sWatcher).addK8sServiceV1 (@tklauser)
  • [ ] #13299 -- Add log when allocate nodecidr failure (@konghui)
  • [x] #13295 -- test: Debug RuntimeConntrackInVethModeTest flake (@pchaigno)
  • [x] #13612 -- eni: Fix data race during multiple updates at once (@christarazi)
  • [ ] #13640 -- docs: Document some caveats of kube-proxy replacement (@brb)
  • [ ] #13638 -- bpf: fix up lrp for v4-in-v6 sockets (@borkmann)
  • [ ] #13424 -- Expose operator azure-user-assigned-identity-id flag to its chart (@ombre9)
  • [ ] #13632 -- docs/gettingstarted: Update AKS instructions (@twpayne)
  • [ ] #13646 -- bpf: redirect fixes and follow-ups (@borkmann)
  • [x] #13643 -- docker: update Hubble CLI to v0.7.0 (@Rolinh)
  • [x] #13642 -- checkpatch: switch to an external container image (@qmonnet)
  • [ ] #13620 -- vendor: pin yaml.v2 to v2.2.8 (@twpayne)
  • [ ] #13550 -- operator: Fix CEP owner type (@jrajahalme)
  • [x] #13651 -- docs: fix minor issue in cilium support with external etcd gsg (@fristonio)
  • [x] #13644 -- Fixes for troubleshooting guide re. Hubble/Hubble Relay (@tklauser)
  • [ ] #13645 -- docs: GKE - fix some indentation, specify bash code segments (@ti-mo)
  • [x] #13665 -- docs: NodePort XDP on GCP is not supported (@gandro)
  • [x] #13654 -- k8s: update k8s libraries to 1.19.3 (@aanm)
  • [x] #13661 -- docs: Fix broken formating and link (@pchaigno)
  • [x] #13607 -- helm: Remove hardcoded port check for hubble, etc (@nathanjsweet)
  • [x] #13677 -- helm: keep encryption interface value undefined (@kkourt)
  • [x] #13684 -- install/kubernetes: remove nodePort.device from values (@tklauser)

Once this PR is merged, you can update the PR labels via:

$ for pr in 13630 13376 13639 13604 13299 13295 13612 13640 13638 13424 13632 13646 13643 13642 13620 13550 13651 13644 13645 13665 13654 13661 13607 13677 13684; do contrib/backporting/set-labels.py $pr done 1.9; done
+6349 -10789

5 comments

96 changed files

nathanjsweet

pr closed time in 3 days

PullRequestReviewEvent

push eventcilium/cilium

Joe Stringer

commit sha d12aacd070211054dd5251d9da3321c9400317e0

helm: Fix KeepDeprecatedProbes for 1.8 upgrade Commit c8360a6b12ba ("test/k8s: keep configmap across upgrade test") introduced the option KeepDeprecatedProbes which allowed smooth upgrade from 1.7 or earlier to 1.8 or later by generating cilium-agent DaemonSet health probes in the old style if the user specified the keepDeprecatedProbes option during helm install. However, it didn't take into account users upgrading from fresh 1.8.x installs to 1.9.x or later, where deprecated probes should never be used. Teach this setting about "upgradeCompatibility" so that when the user either directly specifies the option, or if they provide the appropriate upgrade compatibility version, the deprecated probes will only be used if the initial install was with Cilium 1.7 or earlier. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha ba7c8224ec108087fc6fa0da4effd3eab4aacf8f

docs: Improve upgradeCompatibility option docs The upgradeCompatibility option was not specified in the default instructions, but there was a more specific option that provided only one piece of upgrade compatibility. Convert the main instructions to use the common variable for providing smooth upgrade which allows control over multiple options. While we're at it, fix the indentation for the values file instructions. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha 7d356ff2cf40528e7be54ce9f627071a33277442

helm: Disable bpf tproxy by default Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha 9de5fd6359212c69a55978a4a75c10291b1c22d4

helm: Fix L7 proxy option The way this was written, you could never disable the l7 proxy because the helm chart ignored the setting if it was false. Signed-off-by: Joe Stringer <joe@cilium.io>
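
A quick way to see the symptom described here (a sketch only; the release name is assumed, the value name l7Proxy and the enable-l7-proxy ConfigMap key are taken from this page's ConfigMap diff):

$ helm template cilium cilium/cilium --namespace kube-system --set l7Proxy=false \
    | grep enable-l7-proxy
# Before this fix, the grep would still show enable-l7-proxy: "true",
# because the chart only honoured the value when it was set to true.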

view details

Joe Stringer

commit sha ac29aa76ce35bd3d82bf457b707f6769250da797

helm: Remove default options to support upgrade These settings cannot be specified by default in the values.yaml file, because otherwise it would override the upgradeCompatibility value for the flag and hence break compatibility during upgrade from v1.7 (no such option) -> v1.8 (should be disabled during upgrade) -> v1.9. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

André Martins

commit sha 26878ecfa3762dcf494831f4a6af55aff7efea5f

install/kubernetes: add disableEnvoyVersionCheck option [Forward-cherry-pick] With this option users will be able to deploy Cilium without envoy support which is helpful for arm64 clusters. Related: #13650 Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Joe Stringer <joe@cilium.io>
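
Hedged example of what using the new option might look like (the value name comes from the commit title; the rest is illustrative):

$ helm install cilium cilium/cilium --namespace kube-system \
    --set disableEnvoyVersionCheck=true   # skip the Envoy binary check, e.g. on arm64 nodes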

view details

push time in 3 days

delete branch joestringer/cilium

delete branch : submit/helm-changes-2020-10-21

delete time in 3 days

PR merged cilium/cilium

Fix Helm upgrade compatibility needs-backport/1.9 priority/release-blocker release-note/misc

Review commit-by-commit.

Here's the diff between a v1.7 configmap generated from the tip of v1.7 against the new v1.9 configmap generated with --set upgradeCompatibility=1.7 using this PR:

$ diff -u cilium-1.7-cm.yaml cilium-1.9-with-17-compat-cm.yaml
--- cilium-1.7-cm.yaml  2020-10-21 22:12:29.611909244 -0700
+++ cilium-1.9-with-17-compat-cm.yaml   2020-10-21 22:34:07.738427408 -0700
@@ -1,4 +1,4 @@
-# Source: cilium/charts/config/templates/configmap.yaml
+# Source: cilium/templates/cilium-configmap.yaml
 apiVersion: v1
 kind: ConfigMap
 metadata:
@@ -17,6 +17,7 @@
   #   the kvstore by commenting out the identity-allocation-mode below, or
   #   setting it to "kvstore".
   identity-allocation-mode: crd
+  cilium-endpoint-gc-interval: "5m0s"

   # If you want to run cilium in debug mode change this value to true
   debug: "false"
@@ -28,7 +29,9 @@
   # Enable IPv6 addressing. If enabled, all endpoints are allocated an IPv6
   # address.
   enable-ipv6: "false"
-
+  # Users who wish to specify their own custom CNI configuration file must set
+  # custom-cni-conf to "true", otherwise Cilium may overwrite the configuration.
+  custom-cni-conf: "false"
   # If you want cilium monitor to aggregate tracing for packets, set this level
   # to "low", "medium", or "maximum". The higher the level, the less packets
   # that will be seen in monitor output.
@@ -45,8 +48,7 @@
   #
   # Only effective when monitor aggregation is set to "medium" or higher.
   monitor-aggregation-flags: all
-
-  # ct-global-max-entries-* specifies the maximum number of connections
+  # bpf-ct-global-*-max specifies the maximum number of connections
   # supported across all endpoints, split by protocol: tcp or other. One pair
   # of maps uses these values for IPv4 connections, and another pair of maps
   # use these values for IPv6 connections.
@@ -56,14 +58,15 @@
   # policy drops or a change in loadbalancing decisions for a connection.
   #
   # For users upgrading from Cilium 1.2 or earlier, to minimize disruption
-  # during the upgrade process, comment out these options.
+  # during the upgrade process, set bpf-ct-global-tcp-max to 1000000.
   bpf-ct-global-tcp-max: "524288"
   bpf-ct-global-any-max: "262144"
-
-  # bpf-policy-map-max specified the maximum number of entries in endpoint
+  # bpf-policy-map-max specifies the maximum number of entries in endpoint
   # policy map (per endpoint)
   bpf-policy-map-max: "16384"
-
+  # bpf-lb-map-max specifies the maximum number of entries in bpf lb service,
+  # backend and affinity maps.
+  bpf-lb-map-max: "65536"
   # Pre-allocation of map entries allows per-packet latency to be reduced, at
   # the expense of up-front memory allocation for the entries in the maps. The
   # default value below will minimize memory usage in the default installation;
@@ -95,42 +98,33 @@

   # Name of the cluster. Only relevant when building a mesh of clusters.
   cluster-name: default
-
-  # DNS Polling periodically issues a DNS lookup for each `matchName` from
-  # cilium-agent. The result is used to regenerate endpoint policy.
-  # DNS lookups are repeated with an interval of 5 seconds, and are made for
-  # A(IPv4) and AAAA(IPv6) addresses. Should a lookup fail, the most recent IP
-  # data is used instead. An IP change will trigger a regeneration of the Cilium
-  # policy for each endpoint and increment the per cilium-agent policy
-  # repository revision.
-  #
-  # This option is disabled by default starting from version 1.4.x in favor
-  # of a more powerful DNS proxy-based implementation, see [0] for details.
-  # Enable this option if you want to use FQDN policies but do not want to use
-  # the DNS proxy.
-  #
-  # To ease upgrade, users may opt to set this option to "true".
-  # Otherwise please refer to the Upgrade Guide [1] which explains how to
-  # prepare policy rules for upgrade.
-  #
-  # [0] http://docs.cilium.io/en/stable/policy/language/#dns-based
-  # [1] http://docs.cilium.io/en/stable/install/upgrade/#changes-that-may-require-action
-  tofqdns-enable-poller: "false"
+  # Enables L7 proxy for L7 policy enforcement and visibility
+  enable-l7-proxy: "true"

   # wait-bpf-mount makes init container wait until bpf filesystem is mounted
   wait-bpf-mount: "false"

   masquerade: "true"
+
   enable-xt-socket-fallback: "true"
   install-iptables-rules: "true"
+
   auto-direct-node-routes: "false"
+  enable-bandwidth-manager: "false"
   kube-proxy-replacement:  "probe"
-  enable-host-reachable-services: "false"
-  enable-external-ips: "false"
-  enable-node-port: "false"
+  kube-proxy-replacement-healthz-bind-address: ""
   enable-health-check-nodeport: "true"
   node-port-bind-protection: "true"
   enable-auto-protect-node-port-range: "true"
   enable-endpoint-health-checking: "true"
+  enable-health-checking: "true"
   enable-well-known-identities: "false"
   enable-remote-node-identity: "true"
+  # Enable Hubble gRPC service.
+  enable-hubble: "true"
+  # UNIX domain socket for Hubble server to listen to.
+  hubble-socket-path:  "/var/run/cilium/hubble.sock"
+  ipam: "cluster-pool"
+  cluster-pool-ipv4-cidr: "10.0.0.0/8"
+  cluster-pool-ipv4-mask-size: "24"
+

Basically this represents the eventual set of options that someone would end up with if they upgrade from 1.7 to 1.8 (with --set config.upgradeCompatibility=1.7) and then to 1.9 (with --set upgradeCompatibility=1.7).
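
For reference, a comparison like the one above can be reproduced along these lines (branch names, paths and Helm flags are assumptions; the v1.7 chart predates Helm 3 conventions, so the exact invocation may differ):

$ git checkout v1.7
$ helm template install/kubernetes/cilium --namespace kube-system \
    -x charts/config/templates/configmap.yaml > cilium-1.7-cm.yaml   # Helm 2 syntax
$ git checkout <this-pr-branch>
$ helm template cilium install/kubernetes/cilium --namespace kube-system \
    --set upgradeCompatibility=1.7 \
    --show-only templates/cilium-configmap.yaml > cilium-1.9-with-17-compat-cm.yaml
$ diff -u cilium-1.7-cm.yaml cilium-1.9-with-17-compat-cm.yaml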

Related: #12288 Related: #13650

+50 -33

7 comments

7 changed files

joestringer

pr closed time in 3 days

pull request commentcilium/cilium

Fix Helm upgrade compatibility

The only failing checks are the false-negative coveralls check and a known flake. Merging.

joestringer

comment created time in 3 days

pull request commentcilium/cilium

v1.8 backports 2020-10-22

There was an out-of-band thread discussing whether the presence of #13383 means that we should also wait for https://github.com/cilium/cilium/pull/13694 before merging this set of backports.

pchaigno

comment created time in 3 days

push eventcilium/cilium

Jarno Rajahalme

commit sha a2c9ee29ab4fa5b845f5d0a30084540dec89f539

endpoint: Avoid unnecessary warning logs [ upstream commit df6ede7fa5555984d5b60d961cf3c90a965a6cdb ] Do not log a warning when can't release the ID of a disconnected endpoint. These changes remove warning logs like: msg="Unable to restore endpoint, ignoring" endpointID=1925 error="interface lxc18d62e89ea16 could not be found" k8sPodName=default/spaceship-d5d56b59-6c582 subsys=daemon msg="Unable to release endpoint ID" error="Unable to release endpoint ID 1925" state=disconnected subsys=endpoint Signed-off-by: Jarno Rajahalme <jarno@covalent.io> Signed-off-by: Paul Chaignon <paul@cilium.io>

view details

Paul Chaignon

commit sha bb49619febba44d56ca090792ccf13c5aece2219

endpoint: Avoid benign error messages on restoration [ upstream commit 228a485f2a5441f506ba9c0f357321c060f93590 ] During the endpoint restoration process, when we parse the endpoints, we assign them a reserved init identity if they don't already have an identity [0]. If we later remove the endpoint (because the corresponding K8s pod or interface are missing), we attempt to remove the identity from the identity manager. That last operation results in the following error message because the init identity was never added to the manager. level=error msg="removing identity not added to the identity manager!" identity=5 subsys=identitymanager This commit fixes it by skipping the removal attempt from the manager in the case of identity init. 0 - https://github.com/cilium/cilium/blob/80a71791320df34df5b6252b9680553e38d88d20/pkg/endpoint/endpoint.go#L819 Signed-off-by: Paul Chaignon <paul@cilium.io>

view details

Paul Chaignon

commit sha 9494169cd0b1e56a06e2d2483922c29584760c4b

backporting: Update labels by default when submitting backport [ upstream commit a8e67f1139aa9f384d2c0365aecfacd3532bda10 ] When submitting the backport PR using submit-backport, the script proposes to update the labels (i.e., remove needs-backport/X and add backport-pending/X): Sending pull request... Everything up-to-date https://github.com/cilium/cilium/pull/13700 Updating labels for PRs 13383 13608 12975 Set labels for all PRs above? [y/N] y Setting labels for PR 13383... ✓ Setting labels for PR 13608... ✓ Setting labels for PR 12975... ✓ The choice defaults to not updating the labels. That may give the wrong impression that it is an optional step---and if you're like me, when you're unsure what an optional step does, you skip it. We should default to setting the labels because we later rely on the labels being set (e.g., when we update them after the PR is merged). This commit changes it to the following: Sending pull request... Everything up-to-date https://github.com/cilium/cilium/pull/13700 Updating labels for PRs 13383 13608 12975 Set labels for all PRs above? [Y/n] Setting labels for PR 13383... ✓ Setting labels for PR 13608... ✓ Setting labels for PR 12975... ✓ Signed-off-by: Paul Chaignon <paul@cilium.io>
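
The change amounts to flipping the default answer of a yes/no prompt; a minimal shell sketch of the new behaviour (not the actual submit-backport code) looks like:

# Minimal sketch only: empty input now defaults to "yes".
read -r -p 'Set labels for all PRs above? [Y/n] ' reply
case "${reply:-y}" in
  [Nn]*) echo 'Skipping label update' ;;
  *)     echo 'Updating labels...' ;;   # the real script invokes contrib/backporting/set-labels.py here
esac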

view details

André Martins

commit sha fda842233aa17c29ea46808a40cbf2d20cebb24a

pkg/endpoint: reduce cardinality of prometheus labels [ upstream commit ec16cab361309155d012ce12b93750fc5b876c9d ] If the controller that is used for label resolution fails, the prometheus metrics will increase its cardinality since the uniquely controller name was being used as a prometheus label. To avoid this, we will reference these warnings with a common subsystem name, 'resolve-labels'. Fixes: a31ab29f57b2 ("endpoint: Run labels controller under ep manager") Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Paul Chaignon <paul@cilium.io>

view details

push time in 3 days

delete branch cilium/cilium

delete branch : pr/v1.7-backport-2020-10-22

delete time in 3 days

PR merged cilium/cilium

Reviewers
v1.7 backports 2020-10-22 backport/1.7 kind/backports
  • #10974 -- endpoint: Avoid logging about disconnected EPs during restore (@jrajahalme)
  • #13667 -- endpoint: Avoid benign error messages on restoration (@pchaigno)
  • #13703 -- backporting: Update labels by default when submitting backport (@pchaigno)
  • #13699 -- pkg/endpoint: reduce cardinality of prometheus labels (@aanm)

Skipped:

  • #12313 -- metrics: fix negative identity count (@ArthurChiao)
  • #13244 -- lbmap: Correct issue that port info display error (@Jianlin-lv)

Once this PR is merged, you can update the PR labels via:

$ for pr in 10974 13667 13703 13699; do contrib/backporting/set-labels.py $pr done 1.7; done
+18 -8

1 comment

5 changed files

pchaigno

pr closed time in 3 days

pull request commentcilium/cilium

v1.7 backport 2020-10-20

test-upstream-k8s

nathanjsweet

comment created time in 3 days

pull request commentcilium/cilium

v1.7 backport 2020-10-20

Cluster setup for upstream tests seems to have failed:

https://jenkins.cilium.io/job/Cilium-PR-Kubernetes-Upstream/2612/execution/node/46/log/

will re-kick.

nathanjsweet

comment created time in 3 days

issue commentcilium/cilium

CI: K8sChaosTest Connectivity demo application Endpoint can still connect while Cilium is not running

Hit during the K8sBandwidthTest Checks Bandwidth Rate-Limiting test in #13691:

https://jenkins.cilium.io/job/Cilium-PR-Ginkgo-Tests-Kernel/3558/testReport/Suite-k8s-1/18/K8sBandwidthTest_Checks_Bandwidth_Rate_Limiting/

pchaigno

comment created time in 3 days

pull request commentcilium/cilium

Fix Helm upgrade compatibility

4.19 test hit what looks like a variation on known flake #13552.

joestringer

comment created time in 3 days

PullRequestReviewEvent

push eventjoestringer/cilium

Tam Mach

commit sha 64e681d9878dd183374baef7f27507895d2f3d7b

vagrant: Add host name argument in vagrant port command This commit is to add required argument for `vagrant port` command to avoid below error ```shell script $ vagrant port --guest 6443 The provider reported there are no forwarded ports for this virtual machine. This can be caused if there are no ports specified in the Vagrantfile or if the virtual machine is not currently running. Please check that the virtual machine is running and try again. ``` Relates to b68af484d Signed-off-by: Tam Mach <sayboras@yahoo.com>
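
For clarity, the fixed invocation presumably names the target VM explicitly, e.g. (the machine name is an assumption):

$ vagrant port --guest 6443 k8s1
# Without the machine name, multi-machine Vagrantfiles report that no ports are forwarded.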

view details

Gilberto Bertin

commit sha 4931f6b81fe52fdd23f3ae2c3361b8fce7757950

docs: docker: update output of `cilium status` command For #13627 Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>

view details

Gilberto Bertin

commit sha 2c2b5debc82784d26df1af844aedcd29b818d369

docs: docker: update output of `cilium policy` commands For #13627 Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>

view details

Robin Hahling

commit sha bfc0078511ab45074edae662b6702ab489b581c2

docker: update Hubble CLI to v0.7.1 This release fixes a bug which prevents certain environment variables from being used for configuration. See the release notes[0] for details. [0]: https://github.com/cilium/hubble/releases/tag/v0.7.1 Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>

view details

Gilberto Bertin

commit sha 1672c81ef5f86f4de97a7ca5ef05bbabd4750ba8

examples: getting-started: bump Cilium docker image to 1.9 For #13627 Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>

view details

Gilberto Bertin

commit sha f648be63df92ad22e24f5ff5c137ce6dbf812047

examples: getting-started: get rid of cilium_tag in Vagrantfile This commit unifies the `cilium_version` and `cilium_tag` variables in the Vagrantfile for the getting started tutorial into a single variable, `cilium_version`. Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>

view details

Paul Chaignon

commit sha a8e67f1139aa9f384d2c0365aecfacd3532bda10

backporting: Update labels by default when submitting backport When submitting the backport PR using submit-backport, the script proposes to update the labels (i.e., remove needs-backport/X and add backport-pending/X): Sending pull request... Everything up-to-date https://github.com/cilium/cilium/pull/13700 Updating labels for PRs 13383 13608 12975 Set labels for all PRs above? [y/N] y Setting labels for PR 13383... ✓ Setting labels for PR 13608... ✓ Setting labels for PR 12975... ✓ The choice defaults to not updating the labels. That may give the wrong impression that it is an optional step---and if you're like me, when you're unsure what an optional step does, you skip it. We should default to setting the labels because we later rely on the labels being set (e.g., when we update them after the PR is merged). This commit changes it to the following: Sending pull request... Everything up-to-date https://github.com/cilium/cilium/pull/13700 Updating labels for PRs 13383 13608 12975 Set labels for all PRs above? [Y/n] Setting labels for PR 13383... ✓ Setting labels for PR 13608... ✓ Setting labels for PR 12975... ✓ Signed-off-by: Paul Chaignon <paul@cilium.io>

view details

André Martins

commit sha ec16cab361309155d012ce12b93750fc5b876c9d

pkg/endpoint: reduce cardinality of prometheus labels If the controller that is used for label resolution fails, the prometheus metrics will increase its cardinality since the uniquely controller name was being used as a prometheus label. To avoid this, we will reference these warnings with a common subsystem name, 'resolve-labels'. Fixes: a31ab29f57b2 ("endpoint: Run labels controller under ep manager") Signed-off-by: André Martins <andre@cilium.io>

view details

Joe Stringer

commit sha 770bad585fc64117f8ba7038081ad71057487c17

docs: Fix shell session highlighting ".. code:: shell-session" does not work to provide syntax highlighting for shell sessions, this needs to be ".. code-block:: shell-session". Signed-off-by: Joe Stringer <joe@cilium.io>
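
The kind of mechanical rewrite this implies could be done with something like the following (paths assumed; the actual commit may well have been edited by hand):

$ grep -rl 'code:: shell-session' Documentation/ \
    | xargs sed -i 's/code:: shell-session/code-block:: shell-session/g'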

view details

Joe Stringer

commit sha e168e09fc2a5c81739637189dc034164c62459f5

docs: Update references to Helm 2 Helm 2 support was deprecated as of commit 31f130612ea9 ("feat(helm): Move requirements.yaml to Chart.yaml"), and will reach end of support from upstream on November 13, 2020 [0]. Remove references to Helm 2. [0] https://helm.sh/blog/helm-v2-deprecation-timeline/ Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha 532bbd500fefe5a08e4f733361d378eba741f09e

docs: Improve upgrade guide grammar, flow While proofreading the upgrade guide, I found some phrasings awkward and some duplicate text. Improve it by fixing the grammar / tense and rewording sentences that flow poorly. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha 76505f708a237a727ce019fb5f8ec701fc6b63c6

docs: Improve upgrade rollback section This paragraph seemed to be vaguely referencing some section that might exist in the future at some point, which doesn't make any sense. We can only refer users to the documentation that exists below. Do so. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha 6b4e3c05010538b6dda4edb357d6eacc6fa3ce67

docs: Bump upgrade notes for 1.9 upgrade, 1.6 EOL Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha e62fd90c6636880a7b5e27516489e65ce9408081

docs: Remove YAML upgrade column for upgrades Since Cilium 1.6, we have provided Helm charts and have instructed users to upgrade with full YAML updates (either directly through helm or by templating first). There's no point in having a column for handling partial upgrades any more, remove it. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha 7058593bb1f0ccdba2a21c936cc2df84b39866a1

docs: Improve 1.9 helm options change upgrade notes Try to make it more clear to users how they should specify the options in the primary command for upgrading the deployment, to prevent them from using 1.8.x helm options which are no longer recognized by Cilium 1.9 Helm charts. Signed-off-by: Joe Stringer <joe@cilium.io>
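
An illustrative before/after of the renaming pattern the note is about (the specific option and version numbers are examples, not an exhaustive mapping from the docs):

# Cilium 1.8 chart style:
$ helm upgrade cilium cilium/cilium --version 1.8.5 --namespace kube-system \
    --set global.kubeProxyReplacement=probe
# Cilium 1.9 chart style: the global. prefix was dropped, so the same intent becomes:
$ helm upgrade cilium cilium/cilium --version 1.9.0 --namespace kube-system \
    --set kubeProxyReplacement=probe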

view details

Joe Stringer

commit sha 88cf4df8768fa7774121241dcb7fc84283d2c96f

docs: Fix links in upgrade guide The Helm chart one isn't interpreted by sphinx so goes to a dead link, the api ratelimiting one was using double graves rather than single which meant it would highlight the text rather than linking it, and the preflight section vaguely referred to earlier in the document even though it's super easy to just directly link it. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha 48fc7dd9bf1e7e939a95a13d0167bbb944d36fda

docs: Clarify consequences of unmounted BPFFS Signed-off-by: Joe Stringer <joe@cilium.io>
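
For context, the requirement in question can be checked (and satisfied manually) with commands along these lines:

$ mount | grep /sys/fs/bpf
# If nothing is printed, the BPF filesystem is not mounted; it can be mounted with:
$ sudo mount bpffs /sys/fs/bpf -t bpf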

view details

Joe Stringer

commit sha 76e14b11ad2827884f3dd5530fb73d57bbf3e751

docs: Move bpffs requirement to main requirements This requirement is not specific to kubernetes, and the kubernetes guide already points towards the main system requirements document. Move the BPFFS requirement there. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

push time in 3 days

Pull request review commentcilium/cilium

examples/doc Add example LRP yaml files for real-world use cases

[Collapsed review diff context: the hunk adds a "Use Cases" section (node-local DNS cache quick deployment) to the Local Redirect Policy docs; the comment below is on the note that the kube-system quick-deployment yaml does not work for GKE clusters.]

any pod should be able to talk to kube-dns assuming there isn't a network policy to prevent it. Is there something special or different about node-local DNS here?

aditighag

comment created time in 3 days

PullRequestReviewEvent

Pull request review commentcilium/cilium

examples/doc Add example LRP yaml files for real-world use cases

[Collapsed review diff context: same "Use Cases" hunk as above; the comment below is on the same GKE note.]

I wonder, does this work if you deploy in a different namespace? Just wondering whether we're trying to tell people there's no way to use this on GKE or whether they just need to do things a little differently.

aditighag

comment created time in 3 days

PullRequestReviewEvent

Pull request review commentcilium/cilium

examples/doc Add example LRP yaml files for real-world use cases

[Collapsed review diff context: the hunk covers verifying the node-local DNS LRP via the coredns cache metrics and the "kiam redirect on EKS" instructions (helm template kiam uswitch/kiam, the whitelist-route-regexp workaround, removing the "--iptables" argument from the kiam-agent DaemonSet); the comment below is on the "--iptables" instruction.]

Were you planning to update the helm template command above to disable this via helm to avoid the manual instruction here?

aditighag

comment created time in 3 days
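For reference, the kind of Helm-level switch being asked about in the preceding review comment would look roughly like this (the uswitch/kiam chart value name is an assumption and should be checked against the chart's values):

$ helm template kiam uswitch/kiam --set agent.host.iptables=false > kiam.yaml
# i.e. render the agent DaemonSet without the --iptables flag instead of editing kiam.yaml afterwards.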

Pull request review commentcilium/cilium

examples/doc Add example LRP yaml files for real-world use cases

[Collapsed review diff context: the hunk covers the remaining kiam-on-EKS steps (hostNetwork mode, applying kiam.yaml, the kiam LRP, and verification via curl to the metadata server plus tcpdump on port 8181), ending at a new "Miscellaneous" heading; the comment below is on that heading.]

:thinking: can we combine miscellaneous + limitations?

aditighag

comment created time in 3 days

PullRequestReviewEvent
PullRequestReviewEvent

pull request commentcilium/cilium

redirectpolicy: Move the feature behind feature flag

Saw this under release-blocker issue https://github.com/cilium/cilium/issues/13294, so just inheriting the label to the PR for visibility.

aditighag

comment created time in 3 days

push eventcilium/cilium

Joe Stringer

commit sha e168e09fc2a5c81739637189dc034164c62459f5

docs: Update references to Helm 2 Helm 2 support was deprecated as of commit 31f130612ea9 ("feat(helm): Move requirements.yaml to Chart.yaml"), and will reach end of support from upstream on November 13, 2020 [0]. Remove references to Helm 2. [0] https://helm.sh/blog/helm-v2-deprecation-timeline/ Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha 532bbd500fefe5a08e4f733361d378eba741f09e

docs: Improve upgrade guide grammar, flow While proofreading the upgrade guide, I found some phrasings awkward and some duplicate text. Improve it by fixing the grammar / tense and rewording sentences that flow poorly. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha 76505f708a237a727ce019fb5f8ec701fc6b63c6

docs: Improve upgrade rollback section This paragraph seemed to be vaguely referencing some section that might exist in the future at some point, which doesn't make any sense. We can only refer users to the documentation that exists below. Do so. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha 6b4e3c05010538b6dda4edb357d6eacc6fa3ce67

docs: Bump upgrade notes for 1.9 upgrade, 1.6 EOL Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha e62fd90c6636880a7b5e27516489e65ce9408081

docs: Remove YAML upgrade column for upgrades Since Cilium 1.6, we have provided Helm charts and have instructed users to upgrade with full YAML updates (either directly through helm or by templating first). There's no point in having a column for handling partial upgrades any more, remove it. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha 7058593bb1f0ccdba2a21c936cc2df84b39866a1

docs: Improve 1.9 helm options change upgrade notes Try to make it more clear to users how they should specify the options in the primary command for upgrading the deployment, to prevent them from using 1.8.x helm options which are no longer recognized by Cilium 1.9 Helm charts. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

Joe Stringer

commit sha 88cf4df8768fa7774121241dcb7fc84283d2c96f

docs: Fix links in upgrade guide The Helm chart one isn't interpreted by sphinx so goes to a dead link, the api ratelimiting one was using double graves rather than single which meant it would highlight the text rather than linking it, and the preflight section vaguely referred to earlier in the document even though it's super easy to just directly link it. Signed-off-by: Joe Stringer <joe@cilium.io>

view details

push time in 3 days

delete branch joestringer/cilium

delete branch : submit/upgrade-docs-2020-10-21

delete time in 3 days

PR merged cilium/cilium

Refresh upgrade guide for v1.9 needs-backport/1.9 priority/release-blocker release-note/misc

This PR improves the upgrade guide by overhauling most of it for how the v1.9 upgrade procedure is expected to occur. This includes general proofreading, link fixing, improved references to what the user must pay attention to during 1.9 upgrade.

I kept helm chart changes out of this PR so we can focus just on general docs improvements here. I will submit a separate PR for changes that include helm modifications.

Review commit-by-commit.

Related: #12288

+75 -189

0 comment

2 changed files

joestringer

pr closed time in 3 days

pull request commentcilium/cilium

Fix Helm upgrade compatibility

test-gke

joestringer

comment created time in 3 days

pull request commentcilium/cilium

Fix Helm upgrade compatibility

Scaling the GKE cluster failed:

https://jenkins.cilium.io/job/Cilium-PR-K8s-GKE/2916/

11:28:22  ERROR: (gcloud.container.clusters.resize) FAILED_PRECONDITION: Operation operation-1603391301625-8eb1c018 is currently operating on cluster cilium-ci-18. Please wait and try again once it is done.

Will retry.

joestringer

comment created time in 3 days

issue openedcilium/cilium

v1.9.0 release tracker

  • [x] Feature freeze
  • [x] Merge feature freeze exceptions
  • [x] Evaluate which release blockers we intend to address in release timeframe
  • [x] Branch for release
  • [x] Tag a release and provide docs for GSG testing (v1.9.0-rc2)
  • [ ] Validate the getting started guides (#13627)
  • [ ] Complete all release blockers
  • [ ] Ensure that v1.9 CI jobs are stable
  • [ ] Prepare release blog

created time in 3 days

pull request commentcilium/cilium

docs: remove redundant "--set ipam.node=kubernetes" from GKE GSG

@aanm I don't follow the logic there. In chaining mode, the chaining plugin is responsible for IPAM. But if you configure Cilium's ipam option to eni or azure, Cilium is responsible for IPAM; it just performs it via the cloud API.

qmonnet

comment created time in 3 days

pull request commentcilium/cilium

docs: remove redundant "--set ipam.node=kubernetes" from GKE GSG

@aanm First part I agree. Chaining mode though is a separate issue entirely. The IPAM modes for Azure / ENI are specifically configured when Cilium is not in chaining mode, just like the standard GKE instructions.

qmonnet

comment created time in 3 days

more