Andreas Neumann (ANeumann82) · Hamburg, Germany · Software Developer

mesosphere/kudo-cassandra-operator 3

KUDO Cassandra Operator

ANeumann82/ag-grid 0

Advanced Data Grid / Data Table supporting Javascript / React / AngularJS / Web Components

ANeumann82/bamboo 0

HAProxy auto configuration and auto service discovery for Mesos Marathon

ANeumann82/cassandra-medusa 0

Apache Cassandra Backup and Restore Tool

ANeumann82/core 0

Core functionality for PrimeFaces Extensions

ANeumann82/dcos-core-cli 0

Core plugin for the DC/OS CLI

ANeumann82/gs-collections 0

A supplement or replacement for the Java Collections Framework.

ANeumann82/i18n-polyfill 0

A speculative polyfill to support i18n code translations in Angular

ANeumann82/kubernetes 0

Production-Grade Container Scheduling and Management

ANeumann82/pootle 0

Docker Pootle translation server

push event kudobuilder/kudo

Ken Sipe

commit sha cdbbade020b441e3f1a76a680a0e35c13a43e240

Namespace Package Verify (#1536) Co-authored-by: Andreas Neumann <aneumann@mesosphere.com> Signed-off-by: Ken Sipe <kensipe@gmail.com>

view details

Ken Sipe

commit sha f3e6f9d60b8f064d767cdd8f56a323c5f14bfeae

KEP31: Template Support for Namespace Manifest (#1535) Co-authored-by: Andreas Neumann <aneumann@mesosphere.com> Signed-off-by: Ken Sipe <kensipe@gmail.com>

view details

Ken Sipe

commit sha 73cd8c24b3381596438ef8b5e69e106f699bf5cc

Plan Update and Trigger with Wait (#1470) Signed-off-by: Ken Sipe <kensipe@gmail.com>

view details

Jan Schlicht

commit sha 5e6ccf2f754de83efa5762f4794ff7883f96f08d

Separate E2E and operator tests (#1540) This runs operator tests as a separate test in parallel with the other tests. It makes operator tests independent from E2E test results and their failures distinguishable. Signed-off-by: Jan Schlicht <jan@d2iq.com> Signed-off-by: Ken Sipe <kensipe@gmail.com> Co-authored-by: Ken Sipe <kensipe@gmail.com>

view details

Jan Schlicht

commit sha 03edc91be048d6321ffefbb6e8d3135350686d04

Fix namespace create without manifest (#1543) If a namespace was created without a manifest, KUDO would fail with a segfault, because it tries to add an annotation to a nil map. This has been fixed and tests have been added for namespace creation. Signed-off-by: Jan Schlicht <jan@d2iq.com>

view details

Ken Sipe

commit sha daa7ac8a20b6cb3e9410c1457c5384fe5c11a325

kuttl v0.4.0 bump (#1545) Signed-off-by: Ken Sipe <kensipe@gmail.com>

view details

Aleksey Dukhovniy

commit sha f71e81c51024ff0b19710dddc3ebad4bc5b818ff

KEP-29: Add `KudoOperatorTask` implementation (#1541) Summary: implemented `KudoTaskOperator` which, given a `KudoOperator` task in the operator will create the `Instance` object and wait for it to become healthy. Additionally added `paramsFile` to the `KudoOperatorTaskSpec`. Fixes: #1509 Signed-off-by: Aleksey Dukhovniy <alex.dukhovniy@googlemail.com>

view details

Jan Schlicht

commit sha 2464231002af838499e7b96586c419cdddb089da

Refactor operator package installation (#1542) The old 'InstallPackage' function has been extracted into a separate package. Its functionality has been split up into multiple functions handling different installation resources. Signed-off-by: Jan Schlicht <jan@d2iq.com>

view details

Andreas Neumann

commit sha f8d96bd16a67f0964500de3c0c83489d52853518

Detect different versions of cert-manager (#1546) Co-authored-by: Ken Sipe <kensipe@gmail.com> Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

Andreas Neumann

commit sha 3cd8c2549509cee193a521b2af43ed7803ffe715

Merge branch 'master' into an/upgrading Signed-off-by: Andreas Neumann <aneumann@mesosphere.com> # Conflicts: # hack/run-e2e-tests.sh # pkg/kudoctl/cmd/init.go # pkg/kudoctl/cmd/init_test.go # pkg/kudoctl/kudoinit/prereq/webhook.go # pkg/kudoctl/kudoinit/setup/setup.go

view details

push time in 12 hours

push event kudobuilder/kudo

Andreas Neumann

commit sha 45d5aa402602025f2d62a86c403bfdc37cf29be4

Fixed version for old certmanager group Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 14 hours

push event kudobuilder/kudo

Andreas Neumann

commit sha d4d2005bc66e2ab607a4ed2ccf7e424fc5da6d81

Removed test that is not used anymore Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 14 hours

push event kudobuilder/kudo

Andreas Neumann

commit sha a766e8a394f934e7c206c051c93f6179f836a2ee

Add simple cluster version check in PreInstallVerify to provide better error message when no cluster is reachable Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 15 hours

push event kudobuilder/kudo

Aleksey Dukhovniy

commit sha f71e81c51024ff0b19710dddc3ebad4bc5b818ff

KEP-29: Add `KudoOperatorTask` implementation (#1541) Summary: implemented `KudoTaskOperator` which, given a `KudoOperator` task in the operator will create the `Instance` object and wait for it to become healthy. Additionally added `paramsFile` to the `KudoOperatorTaskSpec`. Fixes: #1509 Signed-off-by: Aleksey Dukhovniy <alex.dukhovniy@googlemail.com>

view details

Jan Schlicht

commit sha 2464231002af838499e7b96586c419cdddb089da

Refactor operator package installation (#1542) The old 'InstallPackage' function has been extracted into a separate package. Its functionality has been split up into multiple functions handling different installation resources. Signed-off-by: Jan Schlicht <jan@d2iq.com>

view details

Andreas Neumann

commit sha 77840f5465f45ad57a86412f8ad05b392a1a8b08

Merge branch 'master' into an/allow-older-cert-manager # Conflicts: # pkg/kudoctl/cmd/testdata/deploy-kudo-webhook.yaml.golden

view details

push time in 15 hours

push event kudobuilder/kudo

Andreas Neumann

commit sha 473fe25951ea2cbf5c1953f2c991bc3f8fe7691b

Dry-Run does PreInstallVerify now to generate correct yaml Refactoring of webhook cert-manager detection Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 15 hours

pull request comment kudobuilder/kudo

Detect different versions of cert-manager

I pushed a version of cert-man detection that works for konvoy and should work for any distro that provides the standard label of app=cert-manager for the certman deployment.

Yeah, I like that. And I agree that we probably can get rid of the additional checks for cert-manager health, etc.

The remaining blocking issue is around the kinds Certificate and Issuer. I guess I don't understand when and why they are used. I spent significant time around this today... it doesn't look like they are used when running init on a cluster with a certman... nor are they used on a cluster with an --unsafe-self-signed-webhook-ca. It looks like they are only provided for dry-run with yaml output... why? It seems like we should remove them... I wanted to discuss first.

The cert is used here: https://github.com/kudobuilder/kudo/blob/2464231002af838499e7b96586c419cdddb089da/pkg/kudoctl/kudoinit/prereq/webhook.go#L248

It's either this or the self-signed-webhook-ca...

ANeumann82

comment created time in 16 hours
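To make the label-based detection concrete, here is a minimal client-go sketch of the approach discussed in this thread, assuming client-go v0.18+ signatures and the conventional app=cert-manager deployment label; KUDO's actual implementation lives in pkg/kudoctl/kudoinit/prereq/webhook.go and may differ in detail:

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// detectCertManager reports whether any Deployment in the cluster carries the
// conventional app=cert-manager label. Sketch only, not KUDO's actual code.
func detectCertManager(client kubernetes.Interface) (bool, error) {
	deployments, err := client.AppsV1().Deployments(metav1.NamespaceAll).List(context.TODO(), metav1.ListOptions{
		LabelSelector: "app=cert-manager",
	})
	if err != nil {
		return false, fmt.Errorf("failed to list cert-manager deployments: %v", err)
	}
	return len(deployments.Items) > 0, nil
}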

Pull request review comment kudobuilder/kudo

Detect different versions of cert-manager

 func (k KudoWebHook) Resources() []runtime.Object {
 	return k.resourcesWithCertManager()
 }

-func (k KudoWebHook) resourcesWithCertManager() []runtime.Object {
+func (k *KudoWebHook) resourcesWithCertManager() []runtime.Object {
+	// We have to fall back to a default here as for a dry-run we can't detect the actual version of a cluster
+	k.issuer = issuer(k.opts.Namespace, certManagerNewGroup, certManagerAPIVersions[0])
+	k.certificate = certificate(k.opts.Namespace, certManagerNewGroup, certManagerAPIVersions[0])

Well, they are not used when --unsafe-self-signed-webhook-ca is used. Issuer and Certificate are created when we don't use a self-signed-ca.

With this refactoring they are detected in the preInstallVerify step and the correct version of Issuer and Certificate for the installed cert-manager are created.

They are then applied in the Install step.

The problem with resources is that this step currently does not use the preInstallVerify step, and it usually doesn't even have a client initialized because it doesn't need an existing K8s cluster at all. Now that we need to detect the correct cert-manager version, we may have to do that. Maybe that's not a bad thing...

ANeumann82

comment created time in 17 hours

issue comment kudobuilder/kudo

Toggle Task can not be used for uninstalled custom resources

Well, if the parameter for the Toggle Task is "false", then the CRD is not required to install the operator, correct? And in this case the toggle task should simply do nothing and not fail the execution because it doesn't know about the CR that it wants to delete.

ANeumann82

comment created time in a day

issue opened kudobuilder/kudo

Toggle Task can not be used for uninstalled custom resources

What happened: Tried to change a resource deployment to use a Toggle Task. The resource is a custom resource for a CRD that may not exist on the cluster. KUDO failed with:

16:17:06  +        - message: 'A transient error when executing task deploy.nodes.pre-node.monitor-deploy.
16:17:06  +            Will retry. failed to determine if object &{map[apiVersion:monitoring.coreos.com/v1
16:17:06  +            kind:ServiceMonitor metadata:map[annotations:map[kudo.dev/last-plan-execution-uid:6521e1bf-e0c4-45ab-a295-356198b3557b
16:17:06  +            kudo.dev/phase:nodes kudo.dev/plan:deploy kudo.dev/step:pre-node] labels:map[app:prometheus-operator
16:17:06  +            heritage:kudo kudo.dev/instance:cassandra kudo.dev/operator:cassandra
16:17:06  +            release:prometheus-kubeaddons] name:cassandra-monitor namespace:cassandra-install-test]
16:17:06  +            spec:map[endpoints:[map[interval:30s port:prometheus-exporter-port]] namespaceSelector:map[matchNames:[cassandra-install-test]]
16:17:06  +            selector:map[matchLabels:map[kudo.dev/instance:cassandra kudo.dev/servicemonitor:true]]]]}
16:17:06  +            is namespaced: a resource with GVK monitoring.coreos.com/v1, Kind=ServiceMonitor
16:17:06  +            seems to be missing in API resource list'

What you expected to happen: The toggle task should be able to "delete" or not deploy a custom resource for which the CRD is not known to the cluster

How to reproduce it (as minimally and precisely as possible): Use a toggle task with a custom resource which is not known to the cluster.

Anything else we need to know?: The task_delete.go uses the enhancer which tries to determine if the resource to deploy is namespaced or not - which fails for an unknown custom resource.

created time in a day
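The failure described above suggests consulting the discovery API before the delete is attempted. A sketch of such a check, assuming client-go's discovery package; gvkIsServed is a hypothetical helper, not the actual task_delete.go fix:

import (
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/discovery"
)

// gvkIsServed reports whether the cluster serves the given GVK at all, so a
// toggle task could treat a missing CRD as "nothing to delete" instead of failing.
func gvkIsServed(dc discovery.DiscoveryInterface, gvk schema.GroupVersionKind) (bool, error) {
	resources, err := dc.ServerResourcesForGroupVersion(gvk.GroupVersion().String())
	if apierrors.IsNotFound(err) {
		return false, nil // the group/version is unknown, i.e. the CRD is not installed
	}
	if err != nil {
		return false, fmt.Errorf("failed to query API resources for %s: %v", gvk.GroupVersion(), err)
	}
	for _, r := range resources.APIResources {
		if r.Kind == gvk.Kind {
			return true, nil
		}
	}
	return false, nil
}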

Pull request review comment kudobuilder/kudo

Detect different versions of cert-manager

 import (

 var _ kudoinit.Step = &KudoWebHook{}

 type KudoWebHook struct {
-	opts        kudoinit.Options
-	issuer      unstructured.Unstructured
-	certificate unstructured.Unstructured
+	opts kudoinit.Options
+
+	certManagerGroup      string
+	certManagerAPIVersion string
+
+	issuer      *unstructured.Unstructured
+	certificate *unstructured.Unstructured
 }

 const (
-	certManagerAPIVersion        = "v1alpha2"
-	certManagerControllerVersion = "v0.12.0"
+	certManagerOldGroup = "certmanager.k8s.io"

It's not necessarily a map... It might be a map of lists:

certManagerAPIs := map[string][]string{
  "certmanager.k8s.io": []string{ "v1alpha1"},
  "cert-manager.io": []string{ "v1alpha1", "v1alpha2", "v1alpha3" },
}

(at least I think "cert-manager.io" has "v1alpha1" as well in some version, it certainly has v1alpha2 and v1alpha3)

ANeumann82

comment created time in 2 days

Pull request review comment mesosphere/kudo-cassandra-operator

Bump operator version and add missing changelog entries.

 This is achieved by creating and merging a PR _against the stable branch_ where:

 1.  (as needed) the various `*_VERSION` variables are set as necessary for base
     tech and the operator version, according to the versioning scheme shown
     above,
-1.  necessary files are updated by running `./tools/compile_template.sh`
+1.  necessary files are updated by running `./tools/compile_template.sh` and
+    `./tools/generate_parameters_markdown.sh`
1.  necessary files are updated by running
    - `./tools/compile_template.sh`
    - `./tools/generate_parameters_markdown.sh`
    - `./tools/format_files.sh`
porridge

comment created time in 2 days

Pull request review comment mesosphere/kudo-cassandra-operator

Switching to kuttl for MWT

+apiVersion: kuttl.dev/v1beta1
+kind: TestStep
+commands:
+  - command: kubectl kudo install cassandra --skip-instance --namespace cassandra

I would generally agree, although with a project that is heavily in development this might not always be possible. For the last MWT I used a dev-version from master.

I do hope that we have reached a state where this won't be necessary anymore though.

kensipe

comment created time in 2 days

Pull request review comment mesosphere/kudo-cassandra-operator

Update documentation

 in your cluster as you use in the NODE_TOPOLOGY definition.

 ### Full list of required parameters

-```
-    ENDPOINT_SNITCH=GossipingPropertyFileSnitch
-    NODE_ANTI_AFFINITY=true
-    NODE_TOPOLOGY=<the cluster topology>
+```yaml
+ENDPOINT_SNITCH=GossipingPropertyFileSnitch NODE_ANTI_AFFINITY=true
+NODE_TOPOLOGY=<the cluster topology>
```yaml
ENDPOINT_SNITCH=GossipingPropertyFileSnitch
NODE_ANTI_AFFINITY=true
NODE_TOPOLOGY=<the cluster topology>
```
nfnt

comment created time in 2 days

Pull request review comment mesosphere/kudo-cassandra-operator

Update documentation

 rack awareness.

 ## Kubernetes cluster prerequisites

-At this time, KUDO Cassandra needs a single Kubernetes cluster spanning all the
-datacenters. A Cassandra cluster running on two or more Kubernetes clusters is
-not supported at the moment
+### Naming
+
+Cassandra datacenters can either run in a single Kubernetes cluster that is
Cassandra clusters can either run in a single Kubernetes cluster that is

?

nfnt

comment created time in 2 days

Pull request review comment mesosphere/kudo-cassandra-operator

Update documentation

 configurable settings.

 See the [document on upgrading](upgrading.md).

+## Failure handling
+
+When using local storage, a Cassandra pod is using a local persistent volume
+that is only available when the pod is scheduled in a specific node. Any
+rescheduling will land the pod to the very same node due to the volume node
+affinity.
+
+This is an issue in case of a total Kubernetes node loss: the pods running on an
+unreachable Node enter the states Terminating or Unknown. Kubernetes doesn’t
+allow the deletion of those pods to avoid any brain-split.
+
+KUDO Cassandra provides a way to automatically handle these failure modes and
+move a Cassandra node that is located on a failed Kubernetes node to a different
+node in the cluster.
+
+### Recovery controller
+
+To enable this feature, use the following parameter:
+
+```bash
+RECOVERY_CONTROLLER=true
+```
+
+When this parameter is set, KUDO Cassandra will deploy an additional controller
+that monitors the deployed Cassandra pods. If any pod reaches an unschedulable
+state and detects that the kubernetes node is gone, it will remove the local
+volume of that pod and allow Kubernetes to schedule the pod to a different node.
+Additionally, the rescheduling can be triggered by an eviction label.
+
+The recovery controller relies on the Kubernetes state of a node, not the actual
+running processes. This means that the failure of the hardware on which a
+Cassandra node runs does not trigger the recovery. The only way an automatic
+recovery is triggered is when the Kubernetes node is removed from the cluster by
+kubectl delete node <failed-node-name>. This allows a Kubernetes node to be shut
+down for a maintenance period without KUDO Cassandra triggering a recovery.
+
+:warning: This feature will remove persistent volume claims in the Kubernetes
+cluster. This may lead to data loss. Additionally, you must not use any
+keyspaces with a replication factor of ONE, or the data of the failed Cassandra
+node will be lost.
+
+#### Node eviction
+
+Evicting a Cassandra node is similar to Failure recovery described above. The
+recovery controller will automate certain steps. The main difference is that
+during node eviction the Kubernetes node should stay available, i.e. other pods
+on that node shouldn’t get evicted. To evict a Cassandra node, first cordon or
+taint the Kubernetes node the Cassandra node is running on. This ensures that

We can now also add a section describing that KUDO Cassandra supports cordoning a node for Cassandra only.

If you want to cordon only Cassandra from a specific node, add a label kudo-cassandra/cordon=true to that node. KUDO Cassandra pods will not be scheduled to Kubernetes nodes with this label present.

nfnt

comment created time in 2 days
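As a rough illustration of the mechanism, the anti-label affinity could be rendered as the following Go structs; the label key kudo-cassandra/cordon is taken from the comment above, everything else is an assumption about how the operator's YAML templates express it:

import (
	corev1 "k8s.io/api/core/v1"
)

// cordonAffinity schedules pods only onto nodes that do NOT carry the
// kudo-cassandra/cordon label, so labeling a node keeps new Cassandra pods off it.
var cordonAffinity = &corev1.Affinity{
	NodeAffinity: &corev1.NodeAffinity{
		RequiredDuringSchedulingIgnoredDuringExecution: &corev1.NodeSelector{
			NodeSelectorTerms: []corev1.NodeSelectorTerm{{
				MatchExpressions: []corev1.NodeSelectorRequirement{{
					Key:      "kudo-cassandra/cordon",
					Operator: corev1.NodeSelectorOpDoesNotExist,
				}},
			}},
		},
	},
}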

create branch mesosphere/kudo-cassandra-operator

branch : an/use-toggle-for-monitor

created branch time in 2 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

+package diagnostics

Generally: Yes :) But at the moment the whole code is in a single package; this would require a full restructuring because of circular dependencies...

vemelin-epm

comment created time in 2 days

push event kudobuilder/kudo

Andreas Neumann

commit sha 4c2489422f74da0ae8f5ae4ba746b20a6b781bfb

Small cleanup and added comment Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 2 days

PR opened kudobuilder/kudo

Reviewers
Detect different versions of cert-manager

Detect different versions of cert-manager and install the correct CRDs

  • Supports cert-manager 0.10.1 and 0.11.0+
  • kudo init --dry-run --output yaml uses a default with the newest cert-manager API
  • Removed the warning and check for a cert-manager deployment, as we do not really depend on the exact version
  • Had to make the whole setup/installation stateful so we can hold the required cert-manager versions, but this may come in handy later if we expand the installation

Signed-off-by: Andreas Neumann aneumann@mesosphere.com

+166 -71

0 comments

4 changed files

pr created time in 2 days

create branch kudobuilder/kudo

branch : an/allow-older-cert-manager

created branch time in 2 days

push event kudobuilder/kudo

Andreas Neumann

commit sha 0be876c07b8277518606aad44268eac0d46d746b

Fixed missing return Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 2 days

push event mesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha 26724f0addec27dd1302ca7a194b21cb7990431c

Update Medusa to 0.6.0 (#124) * Update Medusa to 0.6.0 * Include TLS backup/restore test Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 2 days

delete branch mesosphere/kudo-cassandra-operator

delete branch : an/medusa-0.6.0

delete time in 2 days

PR merged mesosphere/kudo-cassandra-operator

Reviewers
Update Medusa to 0.6.0

This allows us to enable backup/restore for encrypted clusters

Signed-off-by: Andreas Neumann aneumann@mesosphere.com

+123 -62

1 comment

10 changed files

ANeumann82

pr closed time in 2 days

push event mesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha 6345c7da1e47fa1adf386c62c9bf2e8565a1b7bd

adefrasdfg Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 2 days

Pull request review comment kudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

 type KudoOperatorTask struct {
 	InstanceName    string
 	AppVersion      string
 	OperatorVersion string
+	ParameterFile   string
 }

 // Run method for the KudoOperatorTask. Not yet implemented
 func (dt KudoOperatorTask) Run(ctx Context) (bool, error) {
-	return false, errors.New("kudo-operator task is not yet implemented. Stay tuned though ;)")
+
+	// 0. - A few prerequisites -
+	// Note: ctx.Meta has Meta.OperatorName and Meta.OperatorVersion fields but these are of the **parent instance**
+	// However, since we don't support multiple namespaces yet, we can use the Meta.InstanceNamespace for the namespace
+	namespace := ctx.Meta.InstanceNamespace
+	operatorName := dt.Package
+	operatorVersion := dt.OperatorVersion
+	operatorVersionName := v1beta1.OperatorVersionName(operatorName, operatorVersion)
+	instanceName := dependencyInstanceName(ctx.Meta.InstanceName, dt.InstanceName, operatorName)
+
+	// 1. - Expand parameter file if exists -
+	params, err := instanceParameters(dt.ParameterFile, ctx.Templates, ctx.Meta, ctx.Parameters)
+	if err != nil {
+		return false, fatalExecutionError(err, taskRenderingError, ctx.Meta)
+	}
+
+	// 2. - Build the instance object -
+	instance, err := instanceResource(instanceName, operatorName, operatorVersionName, namespace, params, ctx.Meta.ResourcesOwner, ctx.Scheme)
+	if err != nil {
+		return false, fatalExecutionError(err, taskRenderingError, ctx.Meta)
+	}
+
+	// 3. - Apply the Instance object -
+	err = applyInstance(instance, namespace, ctx.Client)
+	if err != nil {
+		return false, err
+	}
+
+	// 4. - Check the Instance health -
+	if err := health.IsHealthy(instance); err != nil {
+		return false, nil
+	}
+
+	return true, nil
+}
+
+// dependencyInstanceName returns a name for the child instance in an operator with dependencies looking like
+// <parent-instance.<child-instance> if a child instance name is provided e.g. `kafka-instance.custom-name` or
+// <parent-instance.<child-operator> if not e.g. `kafka-instance.zookeeper`. This way we always have a valid child
+// instance name and user can install the same operator multiple times in the same namespace, because the instance
+// names will be unique thanks to the top-level instance name prefix.
+func dependencyInstanceName(parentInstanceName, instanceName, operatorName string) string {
+	if instanceName != "" {
+		return fmt.Sprintf("%s.%s", parentInstanceName, instanceName)
+	}
+	return fmt.Sprintf("%s.%s", parentInstanceName, operatorName)
+}
+
+// render method takes templated parameter file and a map of parameters and then renders passed template using kudo engine.
+func instanceParameters(pf string, templates map[string]string, meta renderer.Metadata, parameters map[string]interface{}) (map[string]string, error) {
+	if len(pf) != 0 {
+		pft, ok := templates[pf]
+		if !ok {
+			return nil, fmt.Errorf("error finding parameter file %s", pf)
+		}
+
+		rendered, err := renderParametersFile(pf, pft, meta, parameters)
+		if err != nil {
+			return nil, fmt.Errorf("error expanding parameter file %s: %w", pf, err)
+		}
+
+		parameters := map[string]string{}
+		errs := []string{}
+		parser.GetParametersFromFile(pf, []byte(rendered), errs, parameters)
+		if len(errs) > 0 {
+			return nil, fmt.Errorf("failed to unmarshal parameter file %s: %s", pf, strings.Join(errs, ", "))
+		}
+
+		return parameters, nil
+	}
+
+	return nil, nil
+}
+
+func renderParametersFile(pf string, pft string, meta renderer.Metadata, parameters map[string]interface{}) (string, error) {
+	vals := renderer.
+		NewVariableMap().
+		WithInstance(meta.OperatorName, meta.InstanceName, meta.InstanceNamespace, meta.AppVersion, meta.OperatorVersion).
+		WithParameters(parameters)
+
+	engine := renderer.New()
+
+	return engine.Render(pf, pft, vals)
+}
+
+func instanceResource(instanceName, operatorName, operatorVersionName, namespace string, parameters map[string]string, owner metav1.Object, scheme *runtime.Scheme) (*v1beta1.Instance, error) {
+	instance := &v1beta1.Instance{
+		TypeMeta: metav1.TypeMeta{
+			Kind:       "Instance",
+			APIVersion: packages.APIVersion,
+		},
+		ObjectMeta: metav1.ObjectMeta{
+			Name:      instanceName,
+			Namespace: namespace,
+			Labels:    map[string]string{kudo.OperatorLabel: operatorName},
+		},
+		Spec: v1beta1.InstanceSpec{
+			OperatorVersion: corev1.ObjectReference{
+				Name: operatorVersionName,
+			},
+			Parameters: parameters,
+		},
+		Status: v1beta1.InstanceStatus{},
+	}
+	if err := controllerutil.SetControllerReference(owner, instance, scheme); err != nil {
+		return nil, fmt.Errorf("failed to set resource ownership for the new instance: %v", err)
+	}
+
+	return instance, nil
+}
+
+// applyInstance creates the passed instance if it doesn't exist or patches the existing one. Patch will override
+// current spec.parameters and Spec.operatorVersion the same way, kudoctl does it. If the was no error, then the passed
+// instance object is updated with the content returned by the server
+func applyInstance(new *v1beta1.Instance, ns string, c client.Client) error {
+	old := &v1beta1.Instance{}
+	err := c.Get(context.TODO(), types.NamespacedName{Name: new.Name, Namespace: ns}, old)
+
+	switch {
+	// 1. if instance doesn't exist, create it
+	case apierrors.IsNotFound(err):
+		log.Printf("Instance %s/%s doesn't exist. Creating it", new.Namespace, new.Name)
+		return createInstance(new, c)
+	// 2. if the instance exists (there was no error), try to patch it
+	case err == nil:
+		log.Printf("Instance %s/%s already exist. Patching it", new.Namespace, new.Name)
+		return patchInstance(new, c)
+	// 3. any other error is treated as transient
+	default:
+		return fmt.Errorf("failed to check if instance %s/%s already exists: %v", new.Namespace, new.Name, err)
+	}
+}
+
+func createInstance(i *v1beta1.Instance, c client.Client) error {
+	gvk := i.GroupVersionKind()
+	err := c.Create(context.TODO(), i)
+
+	// reset the GVK since it is removed by the c.Create call
+	// https://github.com/kubernetes/kubernetes/issues/80609
+	i.SetGroupVersionKind(gvk)
+
+	return err
+}
+
+func patchInstance(i *v1beta1.Instance, c client.Client) error {
+	patch, err := json.Marshal(struct {
+		Spec *v1beta1.InstanceSpec `json:"spec"`
+	}{
+		Spec: &i.Spec,
+	})
+
+	if err != nil {
+		return fmt.Errorf("failed to serialize instance %s/%s patch: %v", i.Namespace, i.Name, err)
+	}
+
+	return c.Patch(context.TODO(), i, client.RawPatch(types.MergePatchType, patch))

Can we utilize the applyResources from the task_apply.go?

zen-dog

comment created time in 3 days

Pull request review comment mesosphere/kudo-cassandra-operator

Update Medusa to 0.6.0

 fi

 docker build "${REPO_ROOT}/tools" -t kudo-cassandra-tools
-docker run --rm -u "$(id -u):$(id -g)" -v "${REPO_ROOT}:${REPO_ROOT}" -w "$(pwd)" kudo-cassandra-tools "$@"
+
+# Run docker and copy all ENV vars except any that contain PATH
+docker run --rm -u "$(id -u):$(id -g)" --env-file <(env | grep -v "PATH") -v "${REPO_ROOT}:${REPO_ROOT}" -w "$(pwd)" kudo-cassandra-tools "$@"

I've removed this after merging master

ANeumann82

comment created time in 3 days

push event mesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha da9b3ad73b1eb6de6dc646382b31ef0a2fd01821

Add nodeAffinity label to cordon only cassandra from a k8s node, kudo v0.13.0 (#123) * Add nodeAffinity label to cordon only cassandra from a k8s node * Update KUDO dependency to 0.13.0 Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

Zain Malik

commit sha 48bbbf7cca29df4877706460e80b1ca064ffe6ee

bump cassandra to 3.11.6 (#116) * bump cassandra to 3.11.6 * update CHANGELOG.md

view details

Marcin Owsiany

commit sha aeca8b61dd2ec01b6136dcd848429c9b38eeb611

Create a CODEOWNERS file (#128) Save us from having to select reviewers by hand.

view details

Marcin Owsiany

commit sha b977d9ec8495dcbec13f2b8dddd6f1632b1b3270

Bump the shared submodule to current master. (#129) This includes the following commits: Submodule shared f5116d5..361ff73: > Introduce a way of disabling $IMAGE_DISAMBIGUATION_SUFFIX. (#64) > Fix retry counter. (#62) > Bump to a recent konvoy version with better diagnostics. (#63)

view details

Marcin Owsiany

commit sha f5e0e5c1a90ce80e69955ed65f0e691b0b64160a

Rewrite the release instructions to use a simpler workflow. (#125) This changes the release workflow to use another approach, where we simply create the branch and tag via GitHub UI rather the (IMHO) overly complicated Python script that strangely uses Golang conventions for error handling. * Tolerate unset $IMAGE_DISAMBIGUATION_SUFFIX. * Use the disable-disambiguation-suffix branch of shared submodule. * Remove unused file. * Rewrite the release instructions.

view details

Marcin Owsiany

commit sha 013daf5ff86f44319f3bbbf02f7e0baa6cd3fc7c

Make sure that suffix is passed when necessary. (#134) Pass the variable explicitly. Wrapping with docker.sh was added for convenience of dispatch in #109 but effectively made this line a no-op.

view details

Marcin Owsiany

commit sha 7f000369d3e718e57660567ae78f53ed2970b93b

Remove the VERBOSE variable in compile_templates.sh (#135) We generally want information about what's going on. Let's reduce the number of knobs for simplicity.

view details

Andreas Neumann

commit sha 5ec01ce4e54fda72835a1630344dd09774e8f8b0

Merge branch 'master' into an/medusa-0.6.0

view details

Andreas Neumann

commit sha 6d1663ea6623c3e41c08ea17ef7f3130b418a944

Adjustments from master merge Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 3 days

push event kudobuilder/kudo

Andreas Neumann

commit sha fdc6dd2f0f4359c4aa11c7e991921b2959c6cba8

Fixed test Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 3 days

push event kudobuilder/kudo

Aleksey Dukhovniy

commit sha a7b98bf97c0a3ca9a4729b27d412b90c8a90a1ef

Removed `--webhook` option (#1497) Summary: Now that KUDO moves towards a better support of multiple plans (e.g. `kudoctl plan trigger` command), the existing instance admission webhook becomes necessary to guarantee the plan execution consistency. More on KUDO [admission controller](https://kudo.dev/docs/developing-operators/plans.html#admission-controllers) in the documentation. This PR removes the `--webhook` option and thus makes the instance admission webhook required. This is a breaking change since the users will have to either have [cert-manager](https://cert-manager.io/) installed or use the `--unsafe-self-signed-webhook-ca` option when initializing KUDO. For existing installations, one would need to run [kudo init](https://kudo.dev/docs/cli.html#examples) to create missing secret/webhook configuration. Signed-off-by: Aleksey Dukhovniy <alex.dukhovniy@googlemail.com>

view details

Ken Sipe

commit sha 609b74442b3a0cd78d3a0335f6a6dec9cd1cc4e2

KUTTL 0.2.2 Bump (#1532) Signed-off-by: Ken Sipe <kensipe@gmail.com>

view details

Andreas Neumann

commit sha 2e1b38c457934b3f4104523f1d30c59729d7de6c

KEP-30: Immutable parameters (#1485) * Added KEP for immutable parameters Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

Ken Sipe

commit sha 66cada219ba062b79bd154a823e5d26e3c2880bf

removing 32-bit darwin from release (#1534) Signed-off-by: Ken Sipe <kensipe@gmail.com>

view details

Ken Sipe

commit sha cdbbade020b441e3f1a76a680a0e35c13a43e240

Namespace Package Verify (#1536) Co-authored-by: Andreas Neumann <aneumann@mesosphere.com> Signed-off-by: Ken Sipe <kensipe@gmail.com>

view details

Ken Sipe

commit sha f3e6f9d60b8f064d767cdd8f56a323c5f14bfeae

KEP31: Template Support for Namespace Manifest (#1535) Co-authored-by: Andreas Neumann <aneumann@mesosphere.com> Signed-off-by: Ken Sipe <kensipe@gmail.com>

view details

Ken Sipe

commit sha 73cd8c24b3381596438ef8b5e69e106f699bf5cc

Plan Update and Trigger with Wait (#1470) Signed-off-by: Ken Sipe <kensipe@gmail.com>

view details

Jan Schlicht

commit sha 5e6ccf2f754de83efa5762f4794ff7883f96f08d

Separate E2E and operator tests (#1540) This runs operator tests as a separate test in parallel with the other tests. It makes operator tests independent from E2E test results and their failures distinguishable. Signed-off-by: Jan Schlicht <jan@d2iq.com> Signed-off-by: Ken Sipe <kensipe@gmail.com> Co-authored-by: Ken Sipe <kensipe@gmail.com>

view details

Jan Schlicht

commit sha 03edc91be048d6321ffefbb6e8d3135350686d04

Fix namespace create without manifest (#1543) If a namespace was created without a manifest, KUDO would fail with a segfault, because it tries to add an annotation to a nil map. This has been fixed and tests have been added for namespace creation. Signed-off-by: Jan Schlicht <jan@d2iq.com>

view details

Andreas Neumann

commit sha a2535356d559fed5aa0eafc14dee06c448f096d0

Merge branch 'master' into ve/simplified-diagnostics

view details

push time in 3 days

push event kudobuilder/kudo

Andreas Neumann

commit sha fd9864a661bdad7bf9b231107d5107c76ee1e456

Small cleanup from code review Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

Andreas Neumann

commit sha 2a439b9667d3d9ba522aaf727981cc05313e3543

Reworked collectors and runner Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

Andreas Neumann

commit sha 71a20f12c0b70e475d945a3d974aeed1dd0b61c4

Another small cleanup Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 3 days

Pull request review comment mesosphere/kudo-cassandra-operator

Update Medusa to 0.6.0

 fi

 docker build "${REPO_ROOT}/tools" -t kudo-cassandra-tools
-docker run --rm -u "$(id -u):$(id -g)" -v "${REPO_ROOT}:${REPO_ROOT}" -w "$(pwd)" kudo-cassandra-tools "$@"
+
+# Run docker and copy all ENV vars except any that contain PATH
+docker run --rm -u "$(id -u):$(id -g)" --env-file <(env | grep -v "PATH") -v "${REPO_ROOT}:${REPO_ROOT}" -w "$(pwd)" kudo-cassandra-tools "$@"

Also: In this case the docker.sh is called from the image-build-and-push.sh, which is located in the shared directory...

I have adjusted the pattern to only look for PATH in the var name

ANeumann82

comment created time in 3 days

push event mesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha 29c1ae6746e441219e6e59ff5b3592eb46028ccd

Fixed typo, grep only for PATH in var name Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 3 days

Pull request review comment mesosphere/kudo-cassandra-operator

Update Medusa to 0.6.0

 fi

 docker build "${REPO_ROOT}/tools" -t kudo-cassandra-tools
-docker run --rm -u "$(id -u):$(id -g)" -v "${REPO_ROOT}:${REPO_ROOT}" -w "$(pwd)" kudo-cassandra-tools "$@"
+
+# Run docker and copy all ENV vars except any that contain PATH
+docker run --rm -u "$(id -u):$(id -g)" --env-file <(env | grep -v "PATH") -v "${REPO_ROOT}:${REPO_ROOT}" -w "$(pwd)" kudo-cassandra-tools "$@"

Well... It's a problem with docker-ception and running every nook and cranny inside docker. The problem is that we need a lot of stuff from metadata.sh, which in turn needs a lot of other variables. I agree that this is not optimal, but tbh I would prefer to not have the docker.sh at all in the cassandra project. I think it should rather be in the shared tools, pass all required vars from TC to that docker invocation, and let the project do its stuff inside there.

ANeumann82

comment created time in 3 days

push event mesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha 109736262fac53bf3e26ab33979bf46da6c59218

Wait for all nodes to be UN state, UJ is not yet ok Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 7 days

push event mesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha 957657ffdbea8db64649a81a23b446f7f19fd15a

Don't use *PATH* env vars in docker environment Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 7 days

push event mesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha 6e3ac9a9444e1a7b158d7acbbde3ee6d1fafe184

Copy env into docker.sh Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 7 days

Pull request review comment kudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

 type KudoOperatorTask struct {
 	InstanceName    string
 	AppVersion      string
 	OperatorVersion string
+	ParameterFile   string
 }

 // Run method for the KudoOperatorTask. Not yet implemented
 func (dt KudoOperatorTask) Run(ctx Context) (bool, error) {
-	return false, errors.New("kudo-operator task is not yet implemented. Stay tuned though ;)")
+
+	// 0. - A few prerequisites -
+	// Note: ctx.Meta has Meta.OperatorName and Meta.OperatorVersion fields but these are of the **parent instance**
+	// However, since we don't support multiple namespaces yet, we can use the Meta.InstanceNamespace for the namespace
+	namespace := ctx.Meta.InstanceNamespace
+	operatorName := dt.Package
+	operatorVersion := dt.OperatorVersion
+	operatorVersionName := v1beta1.OperatorVersionName(operatorName, operatorVersion)
+	instanceName := dependencyInstanceName(ctx.Meta.InstanceName, dt.InstanceName, operatorName)
+
+	// 1. - Expand parameter file if exists -
+	params, err := instanceParameters(dt.ParameterFile, ctx.Templates, ctx.Meta, ctx.Parameters)
+	if err != nil {
+		return false, fatalExecutionError(err, taskRenderingError, ctx.Meta)
+	}
+
+	// 2. - Build the instance object -
+	instance, err := instanceResource(instanceName, operatorName, operatorVersionName, namespace, params, ctx.Meta.ResourcesOwner, ctx.Scheme)
+	if err != nil {
+		return false, fatalExecutionError(err, taskRenderingError, ctx.Meta)
+	}
+
+	// 3. - Apply the Instance object -
+	err = applyInstance(instance, namespace, ctx.Client)
+	if err != nil {
+		return false, err
+	}
+
+	// 4. - Check the Instance health -
+	if err := health.IsHealthy(instance); err != nil {
+		return false, nil
+	}
+
+	return true, nil
+}
+
+// dependencyInstanceName returns a name for the child instance in an operator with dependencies looking like
+// <parent-instance.<child-instance> if a child instance name is provided e.g. `kafka-instance.custom-name` or
+// <parent-instance.<child-operator> if not e.g. `kafka-instance.zookeeper`. This way we always have a valid child
+// instance name and user can install the same operator multiple times in the same namespace, because the instance
+// names will be unique thanks to the top-level instance name prefix.
+func dependencyInstanceName(parentInstanceName, instanceName, operatorName string) string {
+	if instanceName != "" {
+		return fmt.Sprintf("%s.%s", parentInstanceName, instanceName)
+	}
+	return fmt.Sprintf("%s.%s", parentInstanceName, operatorName)
+}
+
+// render method takes templated parameter file and a map of parameters and then renders passed template using kudo engine.
+func instanceParameters(pf string, templates map[string]string, meta renderer.Metadata, parameters map[string]interface{}) (map[string]string, error) {
+	if len(pf) != 0 {
+		pft, ok := templates[pf]
+		if !ok {
+			return nil, fmt.Errorf("error finding parameter file %s", pf)
+		}
+
+		rendered, err := renderParametersFile(pf, pft, meta, parameters)
+		if err != nil {
+			return nil, fmt.Errorf("error expanding parameter file %s: %w", pf, err)
+		}
+
+		parameters := map[string]string{}
+		errs := []string{}
+		parser.GetParametersFromFile(pf, []byte(rendered), errs, parameters)
+		if len(errs) > 0 {
+			return nil, fmt.Errorf("failed to unmarshal parameter file %s: %s", pf, strings.Join(errs, ", "))
+		}
+
+		return parameters, nil
+	}
+
+	return nil, nil
+}
+
+func renderParametersFile(pf string, pft string, meta renderer.Metadata, parameters map[string]interface{}) (string, error) {
+	vals := renderer.
+		NewVariableMap().
+		WithInstance(meta.OperatorName, meta.InstanceName, meta.InstanceNamespace, meta.AppVersion, meta.OperatorVersion).
+		WithParameters(parameters)
+
+	engine := renderer.New()
+
+	return engine.Render(pf, pft, vals)
+}
+
+func instanceResource(instanceName, operatorName, operatorVersionName, namespace string, parameters map[string]string, owner metav1.Object, scheme *runtime.Scheme) (*v1beta1.Instance, error) {
+	instance := &v1beta1.Instance{
+		TypeMeta: metav1.TypeMeta{
+			Kind:       "Instance",
+			APIVersion: packages.APIVersion,
+		},
+		ObjectMeta: metav1.ObjectMeta{
+			Name:      instanceName,
+			Namespace: namespace,
+			Labels:    map[string]string{kudo.OperatorLabel: operatorName},
+		},
+		Spec: v1beta1.InstanceSpec{
+			OperatorVersion: corev1.ObjectReference{
+				Name: operatorVersionName,
+			},
+			Parameters: parameters,
+		},
+		Status: v1beta1.InstanceStatus{},
+	}
+	if err := controllerutil.SetControllerReference(owner, instance, scheme); err != nil {
+		return nil, fmt.Errorf("failed to set resource ownership for the new instance: %v", err)
+	}
+
+	return instance, nil
+}
+
+// applyInstance creates the passed instance if it doesn't exist or patches the existing one. Patch will override
+// current spec.parameters and Spec.operatorVersion the same way, kudoctl does it. If the was no error, then the passed
+// instance object is updated with the content returned by the server
+func applyInstance(new *v1beta1.Instance, ns string, c client.Client) error {
+	old := &v1beta1.Instance{}
+	err := c.Get(context.TODO(), types.NamespacedName{Name: new.Name, Namespace: ns}, old)
+
+	switch {
+	// 1. if instance doesn't exist, create it
+	case apierrors.IsNotFound(err):
+		log.Printf("Instance %s/%s doesn't exist. Creating it", new.Namespace, new.Name)
+		return createInstance(new, c)
+	// 2. if the instance exists (there was no error), try to patch it
+	case err == nil:
+		log.Printf("Instance %s/%s already exist. Patching it", new.Namespace, new.Name)
+		return patchInstance(new, c)
+	// 3. any other error is treated as transient
+	default:
+		return fmt.Errorf("failed to check if instance %s/%s already exists: %v", new.Namespace, new.Name, err)
+	}
+}
+
+func createInstance(i *v1beta1.Instance, c client.Client) error {
+	gvk := i.GroupVersionKind()
+	err := c.Create(context.TODO(), i)
+
+	// reset the GVK since it is removed by the c.Create call
+	// https://github.com/kubernetes/kubernetes/issues/80609
+	i.SetGroupVersionKind(gvk)
+
+	return err
+}
+
+func patchInstance(i *v1beta1.Instance, c client.Client) error {
+	patch, err := json.Marshal(struct {
+		Spec *v1beta1.InstanceSpec `json:"spec"`
+	}{
+		Spec: &i.Spec,
+	})
+
+	if err != nil {
+		return fmt.Errorf("failed to serialize instance %s/%s patch: %v", i.Namespace, i.Name, err)
+	}
+
+	return c.Patch(context.TODO(), i, client.RawPatch(types.MergePatchType, patch))

I'm pretty sure this is not enough to correctly patch the instance; I want to see at least a test that removes a parameter from the updated instance. This could for example happen if the paramsFile for the dependent operator has something like:

{{ if eq ".Params.INCLUDE_STUFF }}
FANCY_PARAM_THAT_CAN_BE_THERE_OR NOT: "cool value"
{{ end }}
zen-dog

comment created time in 7 days

Pull request review comment kudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

 type KudoOperatorTaskSpec struct {
 	// a specific operator version in the official repo, defaults to the most recent one
 	// +optional
 	OperatorVersion string `json:"operatorVersion,omitempty"`
+	// a parameter file name that will be used to populate Instance.Spec.Parameters

Well, it is as proposed in the KEP, and it is different from the CLI's `kudo install -P`, although it serves a similar purpose. I agree that it is a bit misleading; I had to re-read the KEP as well to fully understand it.

zen-dog

comment created time in 7 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

 import (

 // Client is a KUDO Client providing access to a kudo clientset and kubernetes clientsets
 type Client struct {
 	kudoClientset versioned.Interface
-	kubeClientset kubernetes.Interface
+	kubernetes.Interface

I'm not too fixed on this either; I just think it's clearer when we have `client.kubeClientset.*`, `client.kudoClientset.*`, `client.discovery.*`, `client.dynamic.*`, `client.extensions.*` etc...

vemelin-epm

comment created time in 8 days
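For comparison, a sketch of the two shapes under discussion side by side; the import path of the generated KUDO clientset is assumed:

import (
	"k8s.io/client-go/kubernetes"

	// assumed location of KUDO's generated clientset
	"github.com/kudobuilder/kudo/pkg/client/clientset/versioned"
)

// clientEmbedded promotes all kubernetes.Interface methods onto the struct,
// so callers write c.CoreV1() directly and the kudo clientset is the odd one out.
type clientEmbedded struct {
	kudoClientset versioned.Interface
	kubernetes.Interface
}

// clientNamed keeps both clientsets behind explicit names, so every call site
// spells out which clientset it talks to, e.g. c.kubeClientset.CoreV1().
type clientNamed struct {
	kudoClientset versioned.Interface
	kubeClientset kubernetes.Interface
}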

push event mesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha 56ab122b275b923b01fba36e89b270c1659bf548

Uninstall operator correctly between tests Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 8 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

 import (

 // Client is a KUDO Client providing access to a kudo clientset and kubernetes clientsets
 type Client struct {
 	kudoClientset versioned.Interface
-	kubeClientset kubernetes.Interface
+	kubernetes.Interface

I'm not sure about this change. At the moment it's rather easy to see that we have two distinct clients here; if we apply this change, the kudoClientset will mostly be hidden by funcs from the kubernetes.Interface. And if we add more clients at some point it gets even worse.

vemelin-epm

comment created time in 8 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

+package cmd
+
+import (
+	"fmt"
+	"time"
+
+	"github.com/spf13/afero"
+	"github.com/spf13/cobra"
+
+	"github.com/kudobuilder/kudo/pkg/kudoctl/cmd/diagnostics"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"
+)
+
+const (
+	diagCollectExample = `  # collect diagnostics example
+  kubectl kudo diagnostics collect --instance flink
+`
+)
+
+func newDiagnosticsCmd(fs afero.Fs) *cobra.Command {
+	cmd := &cobra.Command{
+		Use:   "diagnostics",
+		Short: "collect diagnostics",
+		Long:  "diagnostics command has sub-commands to collect and analyze diagnostics data",
+	}
+	cmd.AddCommand(newDiagnosticsCollectCmd(fs))
+	return cmd
+}
+
+func newDiagnosticsCollectCmd(fs afero.Fs) *cobra.Command {
+	var logSince time.Duration
+	var instance string
+	cmd := &cobra.Command{
+		Use:     "collect",
+		Short:   "collect diagnostics",
+		Long:    "collect data relevant for diagnostics of the provided instance's state",
+		Example: diagCollectExample,
+		RunE: func(cmd *cobra.Command, args []string) error {
+			c, err := kudo.NewClient(Settings.KubeConfig, Settings.RequestTimeout, Settings.Validate)
+			if err != nil {
+				return fmt.Errorf("failed to create kudo client: %v", err)
+			}
+			return diagnostics.Collect(fs, instance, diagnostics.NewOptions(logSince), c, &Settings)
+		},
+	}
+	cmd.Flags().StringVar(&instance, "instance", "", "The instance name.")
+	cmd.Flags().DurationVar(&logSince, "log-since", 0, "Only return logs newer than a relative duration like 5s, 2m, or 3h. Defaults to all logs.")

I just noticed: We should probably have an (optional) parameter for the target output directory, right? Can be added in a later PR though..

vemelin-epm

comment created time in 8 days
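A sketch of what such a flag could look like, slotted in next to the existing flag definitions in newDiagnosticsCollectCmd above; the flag name and default are invented for illustration:

// inside newDiagnosticsCollectCmd, alongside the --instance and --log-since flags
var outputDir string
cmd.Flags().StringVar(&outputDir, "output-dir", "diag", "Target directory for the collected diagnostics data.")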

Pull request review comment kudobuilder/kudo

simplified diagnostics

+package diagnostics
+
+import (
+	"fmt"
+
+	"github.com/kudobuilder/kudo/pkg/apis/kudo/v1beta1"
+
+	v1 "k8s.io/api/core/v1"
+	"k8s.io/apimachinery/pkg/api/meta"
+	"k8s.io/apimachinery/pkg/runtime"
+)
+
+// processingContext - shared data for the resource collectors
+// provides property accessors allowing to define a collector before the data it needs is available
+// provides update callback functions. callbacks panic if called on a wrong type of runtime.object
+type processingContext struct {
+	podNames      []string
+	root          string
+	opName        string
+	opVersionName string
+	instanceName  string
+}
+
+func (ctx *processingContext) attachToRoot() string {
+	return ctx.root
+}
+
+func (ctx *processingContext) attachToOperator() string {
+	return fmt.Sprintf("%s/operator_%s", ctx.root, ctx.opName)
+}
+
+func (ctx *processingContext) attachToInstance() string {
+	return fmt.Sprintf("%s/instance_%s", ctx.attachToOperator(), ctx.instanceName)
+}
+
+func (ctx *processingContext) mustSetOperatorNameFromOperatorVersion(o runtime.Object) {
+	ctx.opName = o.(*v1beta1.OperatorVersion).Spec.Operator.Name
+}
+
+func (ctx *processingContext) mustSetOperatorVersionNameFromInstance(o runtime.Object) {
+	ctx.opVersionName = o.(*v1beta1.Instance).Spec.OperatorVersion.Name
+}
+
+func (ctx *processingContext) mustAddPodNames(o runtime.Object) {

Hmmm. It's a weird convention that doesn't really add much to my understanding of the code, but ok. I would still prefer not having that prefix, but it's not a blocker.

I agree with the unchecked cast btw, that's just how it is in Go-World...

vemelin-epm

comment created time in 8 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

+package diagnostics
+
+import (
+	"fmt"
+	"io"
+	"path/filepath"
+	"reflect"
+
+	"k8s.io/apimachinery/pkg/api/meta"
+	"k8s.io/apimachinery/pkg/runtime"
+)
+
+// resourceCollector - collector interface implementation for Kubernetes resources (runtime objects)
+type resourceCollector struct {
+	loadResourceFn func() (runtime.Object, error)
+	errKind        string                 // object kind used to describe the error
+	parentDir      func() string          // parent dir to attach the printer's output
+	failOnError    bool                   // define whether the collector should return the error
+	callback       func(o runtime.Object) // should be used to update some shared context
+	printer        *nonFailingPrinter
+	printMode      printMode
+}
+
+// collect - load a resource and send either the resource or collection error to printer
+// return error if failOnError field is set to true
+// if failOnError is true, finding no object(s) is treated as an error
+func (c *resourceCollector) collect() error {
+	obj, err := c.loadResourceFn()
+	switch {
+	case err != nil:
+		if c.failOnError {
+			return fmt.Errorf("failed to retrieve object(s) of kind %s: %v", c.errKind, err)
+		}
+		c.printer.printError(err, c.parentDir(), c.errKind)
+	case obj == nil || reflect.ValueOf(obj).IsNil() || meta.IsListType(obj) && meta.LenList(obj) == 0:

I like expressiveness. I usually don't use many short circuit operators, and having the extra brackets there helps readability, at least for me.

I don't need to have the brackets around the == part though, my code would probably look like this:

case obj == nil || reflect.ValueOf(obj).IsNil() || (meta.IsListType(obj) && meta.LenList(obj) == 0):
vemelin-epm

comment created time in 8 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

+package diagnostics
+
+// collector - generic interface for diagnostic data collection
+// implementors are expected to return only fatal errors and handle non-fatal ones themselves
+type collector interface {
+	collect() error
+}

Hmmm, good question. My main issue with it is that I can't quickly see which interface is implemented by the resourceCollector, logCollector etc.

It would help a lot if we could use this pattern at the implementing side:

// Ensure collector is implemented
var _ collector = &resourceCollector{}

type resourceCollector struct {
...

Then there's a direct link to the interface and it can stay here.

vemelin-epm

comment created time in 8 days

PR closed mesosphere/kudo-cassandra-operator

kudo 0.12 bump

and running gen scripts Signed-off-by: Ken Sipe kensipe@gmail.com

+14 -2

1 comment

5 changed files

kensipe

pr closed time in 8 days

pull request comment mesosphere/kudo-cassandra-operator

kudo 0.12 bump

Not required anymore, master is now at 0.13.0

kensipe

comment created time in 8 days

push event mesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha d2be57851739e1d32c006defff211935dbf51dbf

Include TLS backup/restore test Don't include own node in seed list Adjust deployment of resources so that the dependency hash change is not triggered Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 8 days

Pull request review comment mesosphere/kudo-cassandra-operator

add resources and production docs

+# Resources
+
+## Computable Resources
+
+KUDO Cassandra by default requests 1.5 cpu and 4.5Gi of memory for each
+Cassandra node. The limits by default are 2 cpu and 4.5Gi of memory.
+
+Those requests and limits can be tuned using the parameters:
+
+- NODE_CPU_MC
+- NODE_CPU_LIMIT_MC
+- NODE_MEM_MIB
+- NODE_MEM_LIMIT_MIB
+- PROMETHEUS_EXPORTER_CPU_MC
+- PROMETHEUS_EXPORTER_CPU_LIMIT_MC
+- PROMETHEUS_EXPORTER_MEM_MIB
+- PROMETHEUS_EXPORTER_MEM_LIMIT_MIB
+- RECOVERY_CONTROLLER_CPU_MC
+- RECOVERY_CONTROLLER_CPU_LIMIT_MC
+- RECOVERY_CONTROLLER_MEM_MIB
+- RECOVERY_CONTROLLER_MEM_LIMIT_MIB
+
+## Storage resources
+
+By default KUDO Cassandra uses 20GiB PV. This isn't recommended for production
+use. Please refer to [production](./production.md) docs to see the storage and
+compute resources recommendations
+
+## Resources per container
+
+#### Cassandra container
+
+```
+resources:
+  limits:
+    cpu: 1
+    memory: 4Gi
+  requests:
+    cpu: 1
+    memory: 4Gi
+```
+
+#### Bootstrap init container
+
+```
+resources:
+  limits:
+    cpu: 200m
+    memory: 256Mi
+  requests:
+    cpu: 100m
+    memory: 128Mi
+```
+
+#### prometheus exporter sidecar
+
+```
+resources:
+  limits:
+    cpu: 1
+    memory: 512Mi
+  requests:
+    cpu: 500m
+    memory: 512Mi
+```
+
+#### cassandra-recovery controller pod
+
+```
+resources:
+  limits:
+    cpu: 200m
+    memory: 256Mi
+  requests:
+    cpu: 50m
+    memory: 50Mi
+```
+
+## Kubernetes Objects
+
+KUDO Cassandra is delivered with a different set of features, enabling or
+disabling those features creates more or fewer objects in Kubernetes.
+
+Let’s take a look at the resources created by default when doing a simple
+
+```
+$ kubectl kudo install cassandra
+operator.kudo.dev/v1beta1/cassandra created
+operatorversion.kudo.dev/v1beta1/cassandra-<version> created
+instance.kudo.dev/v1beta1/cassandra-instance created
+instance.kudo.dev/v1beta1/cassandra-instance ready
+
+$ kubectl tree instance cassandra-instance
+NAMESPACE  NAME                                                             READY  REASON  AGE
+default    Instance/cassandra-instance                                      -              6m11s
+default    ├─ConfigMap/cassandra-instance-cassandra-env-sh                  -              6m9s
+default    ├─ConfigMap/cassandra-instance-cassandra-exporter-config-yml     -              6m9s
+default    ├─ConfigMap/cassandra-instance-generate-cassandra-yaml           -              6m9s
+default    ├─ConfigMap/cassandra-instance-generate-cqlshrc-sh               -              6m9s
+default    ├─ConfigMap/cassandra-instance-generate-nodetool-ssl-properties  -              6m9s
+default    ├─ConfigMap/cassandra-instance-generate-tls-artifacts-sh         -              6m9s
+default    ├─ConfigMap/cassandra-instance-jvm-options                       -              6m9s
+default    ├─ConfigMap/cassandra-instance-node-scripts                      -              6m9s
+default    ├─ConfigMap/cassandra-instance-topology-lock                     -              6m9s
+default    ├─PodDisruptionBudget/cassandra-instance-pdb                     -              6m9s
+default    ├─Role/cassandra-instance-node-role                              -              6m10s
+default    ├─Role/cassandra-instance-role                                   -              6m9s
+default    ├─RoleBinding/cassandra-instance-binding                         -              6m9s
+default    ├─RoleBinding/cassandra-instance-node-default-binding            -              6m10s
+default    ├─Secret/cassandra-instance-tls-store-credentials                -              6m9s
+default    ├─Service/cassandra-instance-svc                                 -              6m9s
+default    ├─ServiceAccount/cassandra-instance-sa                           -              6m9s
+default    ├─ServiceMonitor/cassandra-instance-monitor                      -              6m9s
+default    └─StatefulSet/cassandra-instance-node                            -              6m9s
+default      ├─ControllerRevision/cassandra-instance-node-659c89769d        -              2m24s
+default      ├─Pod/cassandra-instance-node-0                                True           2m14s
+default      ├─Pod/cassandra-instance-node-1                                True           94s
+default      └─Pod/cassandra-instance-node-2                                True           53s
+```
+
+### Statefulset
+
+Statefulsets are designed to manage stateful workload in Kubernetes. KUDO
+Cassandra uses statefulsets. The operator by default uses `OrderedReady`pod
+management policy. Which guarantees that pods are created sequentially. This
+makes sure that when the Cassandra cluster is coming up, only one node starts at
+the same time. Pod names are <instance-name>-node-<ordinal-index> starting from
+ordinal-index 0. For example a 3 node cluster created using KUDO Cassandra
+instance name cass-prod will have these pods:
+
+```
+$ kubectl get pods
+NAME               READY   STATUS    RESTARTS   AGE
+cass-prod-node-0   1/1     Running   0          101s
+cass-prod-node-1   1/1     Running   0          49s
+cass-prod-node-2   1/1     Running   0          8s
+```
+
+### Configmaps
+
+KUDO Cassandra generates the configurable scripts and properties used in KUDO
+Cassandra operator as configmaps objects PodDisruptionBudget KUDO Cassandra
Cassandra operator as configmap objects. 

### PodDisruptionBudget 
KUDO Cassandra
zmalik

comment created time in 8 days

Pull request review comment mesosphere/kudo-cassandra-operator

add resources and production docs

# Resources

## Compute Resources

By default, KUDO Cassandra requests 1.5 CPUs and 4.5GiB of memory for each Cassandra node. The default limits are 2 CPUs and 4.5GiB of memory.

These requests and limits can be tuned using the following parameters:

- NODE_CPU_MC
- NODE_CPU_LIMIT_MC
- NODE_MEM_MIB
- NODE_MEM_LIMIT_MIB
- PROMETHEUS_EXPORTER_CPU_MC
- PROMETHEUS_EXPORTER_CPU_LIMIT_MC
- PROMETHEUS_EXPORTER_MEM_MIB
- PROMETHEUS_EXPORTER_MEM_LIMIT_MIB
- RECOVERY_CONTROLLER_CPU_MC
- RECOVERY_CONTROLLER_CPU_LIMIT_MC
- RECOVERY_CONTROLLER_MEM_MIB
- RECOVERY_CONTROLLER_MEM_LIMIT_MIB

## Storage Resources

By default, KUDO Cassandra uses a 20GiB persistent volume per node. This is not recommended for production use. Please refer to the [production](./production.md) docs for storage and compute resource recommendations.

## Resources per Container

### Cassandra container

```
resources:
  limits:
    cpu: 1
    memory: 4Gi
  requests:
    cpu: 1
    memory: 4Gi
```

### Bootstrap init container

```
resources:
  limits:
    cpu: 200m
    memory: 256Mi
  requests:
    cpu: 100m
    memory: 128Mi
```

### Prometheus exporter sidecar

```
resources:
  limits:
    cpu: 1
    memory: 512Mi
  requests:
    cpu: 500m
    memory: 512Mi
```

### Cassandra recovery controller pod

```
resources:
  limits:
    cpu: 200m
    memory: 256Mi
  requests:
    cpu: 50m
    memory: 50Mi
```

## Kubernetes Objects

KUDO Cassandra ships with a number of optional features; enabling or disabling them creates more or fewer objects in Kubernetes.

Let's take a look at the objects created by default when doing a simple install:

```
$ kubectl kudo install cassandra
operator.kudo.dev/v1beta1/cassandra created
operatorversion.kudo.dev/v1beta1/cassandra-<version> created
instance.kudo.dev/v1beta1/cassandra-instance created
instance.kudo.dev/v1beta1/cassandra-instance ready

$ kubectl tree instance cassandra-instance
NAMESPACE  NAME                                                             READY  REASON  AGE
default    Instance/cassandra-instance                                      -              6m11s
default    ├─ConfigMap/cassandra-instance-cassandra-env-sh                  -              6m9s
default    ├─ConfigMap/cassandra-instance-cassandra-exporter-config-yml     -              6m9s
default    ├─ConfigMap/cassandra-instance-generate-cassandra-yaml           -              6m9s
default    ├─ConfigMap/cassandra-instance-generate-cqlshrc-sh               -              6m9s
default    ├─ConfigMap/cassandra-instance-generate-nodetool-ssl-properties  -              6m9s
default    ├─ConfigMap/cassandra-instance-generate-tls-artifacts-sh         -              6m9s
default    ├─ConfigMap/cassandra-instance-jvm-options                       -              6m9s
default    ├─ConfigMap/cassandra-instance-node-scripts                      -              6m9s
default    ├─ConfigMap/cassandra-instance-topology-lock                     -              6m9s
default    ├─PodDisruptionBudget/cassandra-instance-pdb                     -              6m9s
default    ├─Role/cassandra-instance-node-role                              -              6m10s
default    ├─Role/cassandra-instance-role                                   -              6m9s
default    ├─RoleBinding/cassandra-instance-binding                         -              6m9s
default    ├─RoleBinding/cassandra-instance-node-default-binding            -              6m10s
default    ├─Secret/cassandra-instance-tls-store-credentials                -              6m9s
default    ├─Service/cassandra-instance-svc                                 -              6m9s
default    ├─ServiceAccount/cassandra-instance-sa                           -              6m9s
default    ├─ServiceMonitor/cassandra-instance-monitor                      -              6m9s
default    └─StatefulSet/cassandra-instance-node                            -              6m9s
default      ├─ControllerRevision/cassandra-instance-node-659c89769d        -              2m24s
default      ├─Pod/cassandra-instance-node-0                                True           2m14s
default      ├─Pod/cassandra-instance-node-1                                True           94s
default      └─Pod/cassandra-instance-node-2                                True           53s
```

### StatefulSet

StatefulSets are designed to manage stateful workloads in Kubernetes, and KUDO Cassandra builds on them. By default, the operator uses the `OrderedReady` pod management policy, which guarantees that pods are created sequentially; this makes sure that when the Cassandra cluster is coming up, only one node starts at a time. Pod names follow `<instance-name>-node-<ordinal-index>`, starting from ordinal-index 0. For example, a 3-node cluster created with the KUDO Cassandra instance name cass-prod will have these pods:

```
$ kubectl get pods
NAME               READY   STATUS    RESTARTS   AGE
cass-prod-node-0   1/1     Running   0          101s
cass-prod-node-1   1/1     Running   0          49s
cass-prod-node-2   1/1     Running   0          8s
```
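
Where the defaults don't fit, the compute parameters listed above can be overridden at install time. A minimal sketch using KUDO's `-p` parameter flag; the values are illustrative, not sizing recommendations (the `*_MC` parameters are millicores, the `*_MIB` parameters are MiB):

```
$ kubectl kudo install cassandra \
    -p NODE_CPU_MC=2000 \
    -p NODE_CPU_LIMIT_MC=2000 \
    -p NODE_MEM_MIB=8192 \
    -p NODE_MEM_LIMIT_MIB=8192
```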

Note: When a multi-datacenter configuration with NODE_TOPOLOGY is used, the pod names include the datacenter name as well. See (link to Multi-DC-Setup documentation).

zmalik

comment created time in 8 days

push event mesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha da9b3ad73b1eb6de6dc646382b31ef0a2fd01821

Add nodeAffinity label to cordon only cassandra from a k8s node, kudo v0.13.0 (#123) * Add nodeAffinity label to cordon only cassandra from a k8s node * Update KUDO dependency to 0.13.0 Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 9 days

delete branch mesosphere/kudo-cassandra-operator

delete branch : an/add-cordon-label

delete time in 9 days

PR merged mesosphere/kudo-cassandra-operator

Add nodeAffinity label to cordon only cassandra from a k8s node, kudo v0.13.0

Added a nodeAffinity rule requiring that nodes do not carry a specific label. This makes it a lot easier to evict Cassandra from a k8s node without draining or cordoning the whole node.
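
As a rough sketch, such a rule in the pod spec would look like the following; the label key below is hypothetical, standing in for whichever key the operator actually uses:

```
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kudo-cassandra/cordon # hypothetical label key
              operator: DoesNotExist
```

Because the rule applies at scheduling time, labeling a node keeps new and rescheduled Cassandra pods off that node while other workloads stay untouched, instead of cordoning or draining the node for everything.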

Updated KUDO dependency to 0.13.0

Signed-off-by: Andreas Neumann aneumann@mesosphere.com

+99 -23

0 comments

8 changed files

ANeumann82

pr closed time in 9 days

push event mesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha f8cd907030eb7050e44d373ab033b232fd1a2a8e

Updated parameters Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 9 days

PR closed mesosphere/kudo-cassandra-operator

Revert "Started new branch for recovery controller"

Reverts mesosphere/kudo-cassandra-operator#96 in an attempt to de-flake the tests.

The kuttl tests failed on every commit since it was merged:

The change itself: https://github.com/mesosphere/kudo-cassandra-operator/commit/9265e3022f3aed7e8fee64905ce51a87d8977ea1

  • https://teamcity.mesosphere.io/buildConfiguration/Frameworks_DataServices_Kudo_Cassandra_Pr_CassandraPrKonvoyKudo_2/2739677?showLog=2739677_2338_1247.2338
  • https://teamcity.mesosphere.io/buildConfiguration/Frameworks_DataServices_Kudo_Cassandra_Nightly_CassandraNightlyKonvoyKudo/2739991?showLog=2739991_2110_961.2110

Both failed with: failed to get file: command "docker exec --privileged kind-control-plane cat /kind/version" failed with error: exit status 1

My two PRs:

  • https://github.com/mesosphere/kudo-cassandra-operator/pull/109 (https://teamcity.mesosphere.io/buildConfiguration/Frameworks_DataServices_Kudo_Cassandra_Pr_CassandraPrKonvoyKudo_2/2739678?showLog=2739678_2611_1247.2611) failed in kuttl/harness/node-failure:
      logger.go:41: 20:07:14 | node-failure | 2020-05-14 20:07:07 +0000 UTC  Warning  Unhealthy  Liveness probe failed: File replace.ip found with IP:10.244.2.7
      nodetool: Failed to connect to '127.0.0.1:7199' - ConnectException: 'Connection refused (Connection refused)'.
  • https://github.com/mesosphere/kudo-cassandra-operator/pull/108 (https://teamcity.mesosphere.io/buildConfiguration/Frameworks_DataServices_Kudo_Cassandra_Pr_CassandraPrKonvoyKudo_2/2739676?showLog=2739676_2423_1247.2422.2423) again failed with: failed to get file: command "docker exec --privileged kind-control-plane cat /kind/version" failed with error: exit status 1
+104 -1224

2 comments

51 changed files

porridge

pr closed time in 9 days

PR opened mesosphere/kudo-cassandra-operator

Update Medusa to 0.6.0

This allows us to enable backup/restore for encrypted clusters

Signed-off-by: Andreas Neumann aneumann@mesosphere.com

+61 -34

0 comments

4 changed files

pr created time in 9 days

push event mesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha bcd67fb232ffe66d8306ef75e217e8b016837a0f

Update KUDO CLI as well Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 9 days

create branch mesosphere/kudo-cassandra-operator

branch : an/medusa-0.6.0

created branch time in 9 days

pull request comment kudobuilder/kuttl

Update kind dependency to 0.8.1

So, as a quick update: all other flake causes are now fixed in the Cassandra operator. There are still some flakes that come from this issue, but they hit only about 1 in 10 runs, so it's not too urgent to get this updated.

Still, at some point I think we need to push to the new version.

ANeumann82

comment created time in 9 days

push event mesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha 7b1da3f536db089b4183331cb392de0098061ce3

Update test tools dependency to 0.5.0 Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 9 days

push event mesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha e4af7633733aaa44d5c9ff08b2ac19d393798778

Update test tools dependency to 0.5.0 Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 9 days

created tag kudobuilder/test-tools

tag v0.5.0

Helper functions for writing tests for Kubernetes clusters in Go

created time in 9 days

release kudobuilder/test-tools

v0.5.0

released time in 9 days

push event kudobuilder/test-tools

Andreas Neumann

commit sha e70acfcd353f178b4a907c344002d7f068c7ccdb

Generate save func (#29) Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 9 days

delete branch kudobuilder/test-tools

delete branch : an/generate-save-func

delete time in 9 days

PR merged kudobuilder/test-tools

Generate save function

Signed-off-by: Andreas Neumann aneumann@mesosphere.com

+207 -37

0 comments

12 changed files

ANeumann82

pr closed time in 9 days

push event kudobuilder/test-tools

Jan Schlicht

commit sha 0ca90a056bde57cf5d8e79adc9fa98b53462f5f3

Update dependencies (#28) Signed-off-by: Jan Schlicht <jan@d2iq.com>

view details

Andreas Neumann

commit sha 800efb43ac1c93b872d0a54fa0df46a63fb17589

Merge branch 'master' into an/generate-save-func

view details

push time in 9 days

PR opened kudobuilder/test-tools

Generate save function

Signed-off-by: Andreas Neumann aneumann@mesosphere.com

+207 -37

0 comments

12 changed files

pr created time in 10 days

create branch kudobuilder/test-tools

branch : an/generate-save-func

created branch time in 10 days

push event kudobuilder/kudo

Andreas Neumann

commit sha 9a9d8d9eed8bb946f7080be68299a4dbdcd3a9c2

Use --unsafe-self-signed-webhook-ca in upgrade test Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 10 days

push event kudobuilder/kudo

Andreas Neumann

commit sha 7e6f3c797729bcf16053f013fbfeebaf86344252

Removed old code, fixed webhook deletion Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 10 days

push event mesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha 5784c9d2f882ab06f9db325f2379d1e3c23d2fcc

Fixed affinity assertation Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 10 days

push event kudobuilder/kudo

Andreas Neumann

commit sha 85dd61d03b25f023d850b9e6221c14ebacf6c2dc

Fixed bug when uninstalling webhook Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 10 days

push event kudobuilder/kudo

Andreas Neumann

commit sha de66b1924471147373602b3d20b3e6a75a46a406

Fixed run-e2e-tests script Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 10 days

push event kudobuilder/kudo

Andreas Neumann

commit sha fdfe3566b5a7bf7755dd2c9d86e2ad8cde6b9bf1

Skip resources that are NotFound for dependency calculation (#1519) * Skip resources that are NotFound for dependency calculation * Don't fail when LastAppliedConfigAnnotation is not set, but use original object Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

Ken Sipe

commit sha 900ca042b758004d88cc1f31c5fdebf20b8847e9

bump to kuttl 0.2.1 release and adding darwin back to release for local testing (#1524) Signed-off-by: Ken Sipe <kensipe@gmail.com>

view details

Aleksey Dukhovniy

commit sha 5a4cdc5d0f316d874fd3a92d9e66eb96b01422f1

Enable `skipDelete` for e2e tests (#1502) Summary: `skipDelete` option, if set, will not delete the resources after running the tests. This improves debuggability of the e2e tests as we will have all the container logs including KUDO manager. Signed-off-by: Aleksey Dukhovniy <alex.dukhovniy@googlemail.com>

view details

Aleksey Dukhovniy

commit sha f3183aef680b093a97b53ea91f8b13e9db18adbc

Add `KudoOperator` task types (#1515) to help parallelize upcoming work on KEP-29. Signed-off-by: Aleksey Dukhovniy <alex.dukhovniy@googlemail.com> Fixes: #1509

view details

Aleksey Dukhovniy

commit sha b10d86957142f7045e33cf14f4b720aec59a1b88

Rename `UnknownTaskKind` to `TaskBuildError` (#1520) Summary: previously, building a task would only fail for `UnknownTaskKind` reason. However, we're past that and have more error reasons e.g. validation. For now, the reason is kept generic, but we might revisit this should the need arise. Signed-off-by: Aleksey Dukhovniy <alex.dukhovniy@googlemail.com>

view details

Aleksey Dukhovniy

commit sha fc5900701441950dd50613ccbfe318270e026429

Better encapsulation for e2e test manifests (#1522) Summary: Now that we've introduced `--kudo-image-pull-policy` init option, we can avoid the workaround of generating e2e KUDO manifests and make it part of the `kudo-e2e-test.yaml` Signed-off-by: Aleksey Dukhovniy <alex.dukhovniy@googlemail.com>

view details

Ken Sipe

commit sha c22e53a1b1654234d2597a1ec69919d32800e463

CLI, YAML and Go CMP Dependency Bumps (#1525) Signed-off-by: Ken Sipe <kensipe@gmail.com>

view details

Ken Sipe

commit sha 87b7966b8a8628e8e34d3c0337ed148095fb6d5a

Adding Kubernetes Clientset to kudo.Client (#1528) Co-authored-by: Aleksey Dukhovniy <adukhovniy@mesosphere.io> Signed-off-by: Ken Sipe <kensipe@gmail.com>

view details

Jan Schlicht

commit sha 98a90e6d091bbd24e03fe4d4bb1e45aeb057cfdf

Lint files with 'integration' build flag (#1508) 'golangci-lint' has to be instructed to not skip code that has build tags. By adding 'integration' to the 'build-tags' property, we ensure that integration tests are linted. Signed-off-by: Jan Schlicht <jan@d2iq.com>

view details

Ken Sipe

commit sha 474c4e370f12731a5f57a6e068d06969e32f3a05

Display Gen Tool Versions in CI Output (#1530) Signed-off-by: Ken Sipe <kensipe@gmail.com>

view details

Ken Sipe

commit sha ec540a1ce33f0f1b0ecf24521d7f03afe21cd9d1

Create Namespace with `--create-namespace` Flag (#1531) Signed-off-by: Ken Sipe <kensipe@gmail.com>

view details

Aleksey Dukhovniy

commit sha a7b98bf97c0a3ca9a4729b27d412b90c8a90a1ef

Removed `--webhook` option (#1497) Summary: Now that KUDO moves towards a better support of multiple plans (e.g. `kudoctl plan trigger` command), the existing instance admission webhook becomes necessary to guarantee the plan execution consistency. More on KUDO [admission controller](https://kudo.dev/docs/developing-operators/plans.html#admission-controllers) in the documentation. This PR removes the `--webhook` option and thus makes the instance admission webhook required. This is a breaking change since the users will have to either have [cert-manager](https://cert-manager.io/) installed or use the `--unsafe-self-signed-webhook-ca` option when initializing KUDO. For existing installations, one would need to run [kudo init](https://kudo.dev/docs/cli.html#examples) to create missing secret/webhook configuration. Signed-off-by: Aleksey Dukhovniy <alex.dukhovniy@googlemail.com>

view details

Ken Sipe

commit sha 609b74442b3a0cd78d3a0335f6a6dec9cd1cc4e2

KUTTL 0.2.2 Bump (#1532) Signed-off-by: Ken Sipe <kensipe@gmail.com>

view details

Andreas Neumann

commit sha 2e1b38c457934b3f4104523f1d30c59729d7de6c

KEP-30: Immutable parameters (#1485) * Added KEP for immutable parameters Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

Ken Sipe

commit sha 66cada219ba062b79bd154a823e5d26e3c2880bf

removing 32-bit darwin from release (#1534) Signed-off-by: Ken Sipe <kensipe@gmail.com>

view details

Andreas Neumann

commit sha 058406609f70e72b10094ce2e4df1f94bd6738ba

Merge branch 'master' into an/upgrading Signed-off-by: Andreas Neumann <aneumann@mesosphere.com> # Conflicts: # hack/run-e2e-tests.sh # pkg/kudoctl/cmd/init.go # pkg/kudoctl/cmd/init_integration_test.go # pkg/kudoctl/kudoinit/manager/manager.go # pkg/kudoctl/kudoinit/options.go # pkg/kudoctl/kudoinit/prereq/namespace_test.go # pkg/kudoctl/kudoinit/prereq/serviceaccount_test.go # pkg/kudoctl/kudoinit/prereq/webhook_test.go

view details

push time in 10 days

PR opened mesosphere/kudo-cassandra-operator

Add nodeAffinity label to cordon only cassandra from a k8s node, kudo v0.13.0

Added a nodeAffinity rule requiring that nodes do not carry a specific label. This makes it a lot easier to evict Cassandra from a k8s node without draining or cordoning the whole node.

Updated KUDO dependency to 0.13.0

Signed-off-by: Andreas Neumann aneumann@mesosphere.com

+26 -14

0 comments

5 changed files

pr created time in 10 days

push event mesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha 9e4dad662f12391ffeb98c83a0c3ff49eb36ef41

Enable shutdown of old reachable node Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

push time in 10 days

create branch mesosphere/kudo-cassandra-operator

branch : an/add-cordon-label

created branch time in 10 days

push event mesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha 2da1b6578847a62c125785bc72716d43f7b71699

More fixes for node replace (#114) - Do not delete PVC if the Phase is still pending - Support for JMX/SSL in bootstrap - use kubectl node drain before deleting the kubernetes node Signed-off-by: Andreas Neumann <aneumann@mesosphere.com> Co-authored-by: Zain Malik <zmalikshxil@gmail.com>

view details

Andreas Neumann

commit sha 122da20cf100c1f77f12bfd5f7e8f06e2017557a

Merge branch 'master' into an/kuttl-with-0.8.1-kind # Conflicts: # docs/parameters.md # images/bootstrap/pkg/service/cassandra.go # kuttl-tests/.gitignore # kuttl-tests/Makefile # kuttl-tests/suites/failure-recovery/node-failure/delete-node-for-pod.sh # operator/params.yaml # operator/templates/stateful-set.yaml # templates/operator/params.yaml.template # templates/operator/templates/stateful-set.yaml.template

view details

push time in 10 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
package diagnostics

import (
	"fmt"
	"io"
	"path/filepath"
	"strings"

	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"

	"github.com/spf13/afero"
	"gopkg.in/yaml.v2"

	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/cli-runtime/pkg/printers"
	"k8s.io/client-go/kubernetes/scheme"
)

const (
	DiagDir = "diag"
	KudoDir = "diag/kudo"
)

type printMode int

const (
	ObjectWithDir      printMode = iota // print object into its own nested directory based on its name and kind
	ObjectListWithDirs                  // print each object into its own nested directory based on its name and kind
	RuntimeObject                       // print as a file based on its kind only
)

// nonFailingPrinter - print provided data into provided directory and accumulate errors instead of returning them.
// Creates a nested directory if an object type requires so.
type nonFailingPrinter struct {
	fs     afero.Fs
	errors []string
}

func (p *nonFailingPrinter) printObject(o runtime.Object, parentDir string, mode printMode) {
	switch mode {
	case ObjectWithDir:
		if err := printSingleObject(p.fs, o, parentDir); err != nil {
			p.errors = append(p.errors, err.Error())
		}
	case ObjectListWithDirs:
		err := meta.EachListItem(o, func(ro runtime.Object) error {
			if err := printSingleObject(p.fs, ro, parentDir); err != nil {
				p.errors = append(p.errors, err.Error())
			}
			return nil
		})
		if err != nil {
			p.errors = append(p.errors, err.Error())
		}
	case RuntimeObject:
		fallthrough
	default:
		if err := printSingleRuntimeObject(p.fs, o, parentDir); err != nil {
			p.errors = append(p.errors, err.Error())
		}
	}
}

func (p *nonFailingPrinter) printError(err error, parentDir, name string) {
	b := []byte(err.Error())
	if err := printBytes(p.fs, b, parentDir, fmt.Sprintf("%s.err", name)); err != nil {
		p.errors = append(p.errors, err.Error())
	}
}

func (p *nonFailingPrinter) printLog(log io.ReadCloser, parentDir, name string) {
	if err := printLog(p.fs, log, parentDir, name); err != nil {
		p.errors = append(p.errors, err.Error())
	}
}

func (p *nonFailingPrinter) printYaml(v interface{}, parentDir, name string) {
	if err := printYaml(p.fs, v, parentDir, name); err != nil {
		p.errors = append(p.errors, err.Error())
	}
}

// printSingleObject - print a runtime.object assuming it exposes metadata by implementing metav1.object
// or panic otherwise. object is put into a nested directory.
func printSingleObject(fs afero.Fs, obj runtime.Object, parentDir string) error {
	if !isKudoCR(obj) {
		err := kudo.SetGVKFromScheme(obj, scheme.Scheme)
		if err != nil {
			return err
		}
	}

	o, _ := obj.(object)
	relToParentDir := fmt.Sprintf("%s_%s", strings.ToLower(o.GetObjectKind().GroupVersionKind().Kind), o.GetName())
```
	o, _ := obj.(metav1.Object)
	relToParentDir := fmt.Sprintf("%s_%s", strings.ToLower(obj.GetObjectKind().GroupVersionKind().Kind), o.GetName())

Let's get rid of the object interface that merges metav1.Object and runtime.Object. It's really confusing, and it's only used to call o.GetName().

vemelin-epm

comment created time in 11 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
package diagnostics

import (
	"github.com/kudobuilder/kudo/pkg/kudoctl/env"
	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"
	"github.com/kudobuilder/kudo/pkg/version"
)

type runnerHelper struct {
	p *nonFailingPrinter
}

func (rh *runnerHelper) runForInstance(instance string, options *Options, c *kudo.Client, info version.Info, s *env.Settings) error {
	ir, err := newInstanceResources(instance, options, c, s)
	if err != nil {
		return err
	}

	ctx := &processingContext{root: DiagDir, instanceName: instance}
	instanceDiagRunner := &runner{}
	instanceDiagRunner.
		run(resourceCollectorGroup{
			{
				loadResourceFn: ir.instance,
				errKind:        "instance",
				parentDir:      ctx.attachToOperator,
				failOnError:    true,
				callback:       ctx.mustSetOperatorVersionNameFromInstance,
				printer:        rh.p,
				printMode:      ObjectWithDir},
			{
				loadResourceFn: ir.operatorVersion(ctx.operatorVersionName),
				errKind:        "operatorversion",
				parentDir:      ctx.attachToOperator,
				failOnError:    true,
				callback:       ctx.mustSetOperatorNameFromOperatorVersion,
				printer:        rh.p,
				printMode:      ObjectWithDir},
			{
				loadResourceFn: ir.operator(ctx.operatorName),
				errKind:        "operator",
				parentDir:      ctx.attachToRoot,
				failOnError:    true,
				printer:        rh.p,
				printMode:      ObjectWithDir}}).
		run(&resourceCollector{
			loadResourceFn: ir.pods,
			errKind:        "pod",
			parentDir:      ctx.attachToInstance,
			callback:       ctx.mustAddPodNames,
			printer:        rh.p,
			printMode:      ObjectListWithDirs}).
		run(&resourceCollector{
			loadResourceFn: ir.services,
			errKind:        "service",
			parentDir:      ctx.attachToInstance,
			printer:        rh.p,
			printMode:      RuntimeObject}).
```

Not a big fan of the builder pattern here. What's the upside of

r := &runner{}
r.run(...).run(...).run(...)

vs

r := &runner{}
r.run(...)
r.run(...)
r.run(...)

?

I can parse the latter one a lot more easily, but that might be personal preference, so it's not a blocker here.

vemelin-epm

comment created time in 11 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
...

// printSingleObject - print a runtime.object assuming it exposes metadata by implementing metav1.object
// or panic otherwise. object is put into a nested directory.
func printSingleObject(fs afero.Fs, obj runtime.Object, parentDir string) error {
	if !isKudoCR(obj) {
		err := kudo.SetGVKFromScheme(obj, scheme.Scheme)
		if err != nil {
			return err
		}
	}

	o, _ := obj.(object)
	relToParentDir := fmt.Sprintf("%s_%s", strings.ToLower(o.GetObjectKind().GroupVersionKind().Kind), o.GetName())
	dir := filepath.Join(parentDir, relToParentDir)
	err := fs.MkdirAll(dir, 0700)
	if err != nil {
		return fmt.Errorf("failed to create directory %s: %v", dir, err)
	}

	fileWithPath := filepath.Join(dir, fmt.Sprintf("%s.yaml", o.GetName()))
	file, err := fs.Create(fileWithPath)
	if err != nil {
		return fmt.Errorf("failed to create %s: %v", fileWithPath, err)
	}
	defer file.Close()

	printer := printers.YAMLPrinter{}
	return printer.PrintObj(o, file)
```
	return printer.PrintObj(obj, file)
vemelin-epm

comment created time in 11 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
package diagnostics

import (
	"bytes"
	"fmt"
	"io"
	"io/ioutil"
	"reflect"

	"github.com/kudobuilder/kudo/pkg/apis/kudo/v1beta1"
	"github.com/kudobuilder/kudo/pkg/kudoctl/env"
	"github.com/kudobuilder/kudo/pkg/kudoctl/kudoinit"
	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"
	kudoutil "github.com/kudobuilder/kudo/pkg/util/kudo"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/rest"
)

// resourceFuncsConfig - a wrapper for Kube and Kudo clients and common invocation parameters
// for loading Kube and Kudo resources
type resourceFuncsConfig struct {
	c           *kudo.Client
	ns          string
	instanceObj *v1beta1.Instance
	opts        metav1.ListOptions
	logOpts     corev1.PodLogOptions
}

// newInstanceResources is a configuration for instance-related resources
func newInstanceResources(instanceName string, options *Options, c *kudo.Client, s *env.Settings) (*resourceFuncsConfig, error) {
	instance, err := c.GetInstance(instanceName, s.Namespace)
	if err != nil {
		return nil, fmt.Errorf("failed to get instance %s/%s: %v", s.Namespace, instanceName, err)
	}
	if instance == nil {
		return nil, fmt.Errorf("instance %s/%s not found", s.Namespace, instanceName)
	}
	return &resourceFuncsConfig{
		c:           c,
		ns:          s.Namespace,
		instanceObj: instance,
		opts:        metav1.ListOptions{LabelSelector: fmt.Sprintf("%s=%s", kudoutil.OperatorLabel, instance.Labels[kudoutil.OperatorLabel])},
		logOpts:     corev1.PodLogOptions{SinceSeconds: options.LogSince},
	}, nil
}

// newKudoResources is a configuration for Kudo controller related resources
// panics if used to load Kudo CRDs (e.g. instance etc.)
func newKudoResources(options *Options, c *kudo.Client) (*resourceFuncsConfig, error) {
	opts := metav1.ListOptions{LabelSelector: fmt.Sprintf("app=%s", kudoinit.DefaultKudoLabel)}
	ns, err := c.CoreV1().Namespaces().List(opts)
	if err != nil {
		return nil, fmt.Errorf("failed to get kudo system namespace: %v", err)
	}
	if ns == nil || len(ns.Items) == 0 {
		return nil, fmt.Errorf("kudo system namespace not found")
	}
	return &resourceFuncsConfig{
		c:       c,
		ns:      ns.Items[0].Name,
		opts:    opts,
		logOpts: corev1.PodLogOptions{SinceSeconds: options.LogSince},
	}, nil
}

// object implements runtime.object and
// metav1.object interfaces.
// copied from K8 internal type metaRuntimeInterface
type object interface {
	runtime.Object
	metav1.Object
}
```

I just looked it up in the code; we can delete it. It's only used in printSingleObject, to call GetName on the metav1.Object. I'll comment up there.

vemelin-epm

comment created time in 11 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
...

// object implements runtime.object and
// metav1.object interfaces.
// copied from K8 internal type metaRuntimeInterface
type object interface {
	runtime.Object
	metav1.Object
}
```

Erks. Do we really need this? I was already wondering above how it was possible to cast obj.(object) when the comment talked about metav1.Object.
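
For illustration, a minimal runnable sketch of why such a cast works (a sketch, not code from the PR): typed Kubernetes API objects embed `ObjectMeta` and therefore satisfy both `runtime.Object` and `metav1.Object`; the combined interface merely names that intersection.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
)

func main() {
	// A typed API object held behind the runtime.Object interface.
	var obj runtime.Object = &corev1.Pod{ObjectMeta: metav1.ObjectMeta{Name: "demo"}}

	// Typed objects embed ObjectMeta, so they also implement metav1.Object.
	if m, ok := obj.(metav1.Object); ok {
		fmt.Println(m.GetName()) // prints "demo"
	}
}
```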

vemelin-epm

comment created time in 11 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
package diagnostics

import (
	"fmt"

	"github.com/kudobuilder/kudo/pkg/apis/kudo/v1beta1"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/apimachinery/pkg/runtime"
)

// processingContext - shared data for the resource collectors
// provides property accessors allowing to define a collector before the data it needs is available
// provides update callback functions. callbacks panic if called on a wrong type of runtime.object
type processingContext struct {
	podNames      []string
	root          string
	opName        string
	opVersionName string
	instanceName  string
}

func (ctx *processingContext) attachToRoot() string {
```
func (ctx *processingContext) rootDirectory() string {
vemelin-epm

comment created time in 11 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
...

func (ctx *processingContext) attachToOperator() string {
	return fmt.Sprintf("%s/operator_%s", ctx.root, ctx.opName)
}

func (ctx *processingContext) attachToInstance() string {
	return fmt.Sprintf("%s/instance_%s", ctx.attachToOperator(), ctx.instanceName)
}

func (ctx *processingContext) mustSetOperatorNameFromOperatorVersion(o runtime.Object) {
	ctx.opName = o.(*v1beta1.OperatorVersion).Spec.Operator.Name
}

func (ctx *processingContext) mustSetOperatorVersionNameFromInstance(o runtime.Object) {
	ctx.opVersionName = o.(*v1beta1.Instance).Spec.OperatorVersion.Name
}

func (ctx *processingContext) mustAddPodNames(o runtime.Object) {
```
func (ctx *processingContext) addPodNames(o runtime.Object) {
vemelin-epm

comment created time in 11 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
...

func (ctx *processingContext) mustSetOperatorVersionNameFromInstance(o runtime.Object) {
```
func (ctx *processingContext) setOperatorVersionNameFromInstance(o runtime.Object) {
vemelin-epm

comment created time in 11 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
...

func (ctx *processingContext) mustSetOperatorNameFromOperatorVersion(o runtime.Object) {
```
func (ctx *processingContext) setOperatorNameFromOperatorVersion(o runtime.Object) {

Same here: I don't understand the "must" prefix; having just the "set..." part makes it a lot clearer at the invocation site:

callback:       ctx.setOperatorNameFromOperatorVersion,

This tells me: Ah, the callback will set the operator name from the operator version. Nice. :)

btw, I don't mind long descriptive names

vemelin-epm

comment created time in 11 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
...

// printSingleObject - print a runtime.object assuming it exposes metadata by implementing metav1.object
// or panic otherwise. object is put into a nested directory.
func printSingleObject(fs afero.Fs, obj runtime.Object, parentDir string) error {
```

There's a lot of overlap between this func and printSingleRuntimeObject. Maybe we can remove some of the duplication?
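
One way to cut that overlap, as a minimal sketch (the `printObjectToFile` name is assumed here, not taken from the PR): both functions end by creating a directory, creating a `<name>.yaml` file, and YAML-printing the object into it, so that tail could be factored out and each caller would only compute its directory and file base name.

```go
package diagnostics

import (
	"fmt"
	"path/filepath"

	"github.com/spf13/afero"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/cli-runtime/pkg/printers"
)

// printObjectToFile holds the tail shared by printSingleObject and
// printSingleRuntimeObject: ensure the directory exists, create the
// YAML file, and print the object into it.
func printObjectToFile(fs afero.Fs, obj runtime.Object, dir, baseName string) error {
	if err := fs.MkdirAll(dir, 0700); err != nil {
		return fmt.Errorf("failed to create directory %s: %v", dir, err)
	}
	fileWithPath := filepath.Join(dir, fmt.Sprintf("%s.yaml", baseName))
	file, err := fs.Create(fileWithPath)
	if err != nil {
		return fmt.Errorf("failed to create %s: %v", fileWithPath, err)
	}
	defer file.Close()

	printer := printers.YAMLPrinter{}
	return printer.PrintObj(obj, file)
}
```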

vemelin-epm

comment created time in 11 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
...

func (ctx *processingContext) attachToOperator() string {
	return fmt.Sprintf("%s/operator_%s", ctx.root, ctx.opName)
}

func (ctx *processingContext) attachToInstance() string {
```
func (ctx *processingContext) instanceDirectory() string {

I think these names describe the functions a lot better. The old names may be remnants of the old architecture, but with the new structure the new names are a lot clearer:

parentDir:      ctx.attachToOperator,

feels weird to me, but

parentDir:      ctx.operatorDirectory,

makes a lot more sense.

Additionally, this func does not attach anything to anything; it just returns a directory.

vemelin-epm

comment created time in 11 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
...

func (ctx *processingContext) attachToOperator() string {
```
func (ctx *processingContext) operatorDirectory() string {
vemelin-epm

comment created time in 11 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
...

// nonFailingPrinter - print provided data into provided directory and accumulate errors instead of returning them.
// Creates a nested directory if an object type requires so.
```
// Creates parent directories if they are required and do not exist.

Is my suggestion correct?

vemelin-epm

comment created time in 11 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
...

type printMode int

const (
	ObjectWithDir      printMode = iota // print object into its own nested directory based on its name and kind
```

I don't really like the iota feature of Go very much, but in this case I think it's ok. The value of the enum is not stored anywhere; it's only used at runtime without any persistence.

I'd still prefer the "use strings for enums" approach, as my brain parses it a lot more easily, but it's not a blocker for me.
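
For comparison, the string-backed variant alluded to would look roughly like this (a sketch, not code from the PR); the values are never persisted, so the change is purely about readability in code and debug output, at the cost of allowing arbitrary values via explicit conversion:

```go
package diagnostics

type printMode string

const (
	ObjectWithDir      printMode = "ObjectWithDir"      // object in its own nested directory based on name and kind
	ObjectListWithDirs printMode = "ObjectListWithDirs" // each list item in its own nested directory
	RuntimeObject      printMode = "RuntimeObject"      // a single file based on kind only
)
```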

vemelin-epm

comment created time in 11 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
package diagnostics

// collector - generic interface for diagnostic data collection
// implementors are expected to return only fatal errors and handle non-fatal ones themselves
type collector interface {
	collect() error
}
```

I would expect this to be in collectors.go.

vemelin-epm

comment created time in 11 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
package diagnostics

import (
	"fmt"
	"strings"
	"time"

	"github.com/spf13/afero"

	"github.com/kudobuilder/kudo/pkg/kudoctl/env"
	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"
	"github.com/kudobuilder/kudo/pkg/version"
)

type Options struct {
	LogSince *int64
}

func NewOptions(logSince time.Duration) *Options {
	opts := Options{}
	if logSince > 0 {
		sec := int64(logSince.Round(time.Second).Seconds())
		opts.LogSince = &sec
	}
	return &opts
}

func Collect(fs afero.Fs, instance string, options *Options, c *kudo.Client, s *env.Settings) error {
	p := &nonFailingPrinter{fs: fs}
	rh := runnerHelper{p}

	instanceErr := rh.runForInstance(instance, options, c, version.Get(), s)
	kudoErr := rh.runForKudoManager(options, c)

	errMsgs := p.errors
	if instanceErr != nil {
		errMsgs = append(errMsgs, instanceErr.Error())
	}
	if kudoErr != nil {
		errMsgs = append(errMsgs, kudoErr.Error())
	}
```

Nit:

errMsgs := p.errors

if err := rh.runForInstance(instance, options, c, version.Get(), s); err != nil {
	errMsgs = append(errMsgs, err.Error())
}
if err := rh.runForKudoManager(options, c); err != nil {
	errMsgs = append(errMsgs, err.Error())
}
vemelin-epm

comment created time in 11 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
package diagnostics

import (
	"fmt"
	"io"
	"path/filepath"
	"reflect"

	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/apimachinery/pkg/runtime"
)

// resourceCollector - collector interface implementation for Kubernetes resources (runtime objects)
type resourceCollector struct {
	loadResourceFn func() (runtime.Object, error)
	errKind        string                 // object kind used to describe the error
	parentDir      func() string          // parent dir to attach the printer's output
	failOnError    bool                   // define whether the collector should return the error
	callback       func(o runtime.Object) // should be used to update some shared context
```
	callback       func(o runtime.Object) // will be called with the retrieved resource after collection to update shared context
vemelin-epm

comment created time in 11 days

Pull request review comment kudobuilder/kudo

simplified diagnostics

```go
package diagnostics

import (
	"github.com/kudobuilder/kudo/pkg/kudoctl/env"
	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"
	"github.com/kudobuilder/kudo/pkg/version"
)

type runnerHelper struct {
	p *nonFailingPrinter
}
```

This shouldn't be an object. Just pass the nonFailingPrinter into runForInstance and runForKudoManager; I don't think having an object here adds anything useful. The name Helper already kind of implies that ;)

vemelin-epm

comment created time in 11 days
