Jan Schlicht (nfnt) · @mesosphere · Hamburg, Germany · https://kaput.life · Distributed systems expert, building solutions for the cloud with @kubernetes, @kudobuilder, @mesos, and @dcos.

nfnt/resize 2371

Pure golang image resizing

dcos/dcos-e2e 59

Spin up and manage DC/OS clusters in test environments

mesosphere/kudo-cassandra-operator 3

KUDO Cassandra Operator

mesosphere/kudo-kafka-operator 0

KUDO based Kafka Operator

nfnt/mesos-vagrant-env 0

Mesos development environment for Vagrant

push event mesosphere/kudo-cassandra-operator

Jan Schlicht

commit sha 03dc001bc741e054db3725081b4637f1eebe81ab

Change markdown language of non-YAML snippets Signed-off-by: Jan Schlicht <jan@d2iq.com>


push time in 16 hours

Pull request review comment mesosphere/kudo-cassandra-operator

Update documentation

 in your cluster as you use in the NODE_TOPOLOGY definition.

 ### Full list of required parameters

-```
-    ENDPOINT_SNITCH=GossipingPropertyFileSnitch
-    NODE_ANTI_AFFINITY=true
-    NODE_TOPOLOGY=<the cluster topology>
+```yaml
+ENDPOINT_SNITCH=GossipingPropertyFileSnitch
+NODE_ANTI_AFFINITY=true
+NODE_TOPOLOGY=<the cluster topology>

It's the file formatter for YAML. I just noticed that this isn't actually YAML; I'm changing it to text formatting here and in other places.

nfnt

comment created time in 16 hours

push event mesosphere/kudo-cassandra-operator

Jan Schlicht

commit sha afad903cfbdc4149d92838113610cc22863cc939

Format files Signed-off-by: Jan Schlicht <jan@d2iq.com>


push time in 16 hours

push event kudobuilder/kudo

Jan Schlicht

commit sha 2464231002af838499e7b96586c419cdddb089da

Refactor operator package installation (#1542) The old 'InstallPackage' function has been extracted into a separate package. Its functionality has been split up into multiple functions handling different installation resources. Signed-off-by: Jan Schlicht <jan@d2iq.com>


push time in 17 hours

delete branch kudobuilder/kudo

delete branch: nfnt/refactor-package-install

delete time in 17 hours

PR merged kudobuilder/kudo

Refactor operator package installation


What this PR does / why we need it: The old InstallPackage function has been extracted into a separate package. Its functionality has been split up into multiple functions handling different installation resources. The function signature of install.Package introduces variadic option parameters to provide a backward-compatible API.

In preparation for #1514

+395 -224

0 comments

15 changed files

nfnt

pr closed time in 17 hours

push event mesosphere/kudo-cassandra-operator

Jan Schlicht

commit sha 4001d5b76d9f52985bf6c816e1ca2ff3a5f1b63d

Update docs following review comments Signed-off-by: Jan Schlicht <jan@d2iq.com>


push time in 17 hours

issue comment kudobuilder/kudo

Highly sub-optimal UX of `make cli-install` at HEAD of `master`

Yes, you're right.

porridge

comment created time in 18 hours

push event mesosphere/kudo-cassandra-operator

Jan Schlicht

commit sha 6257691f51b1342710e1169beb67ad689e88dbe1

Apply suggestions from code review Signed-off-by: Jan Schlicht <jan@d2iq.com> Co-authored-by: Andreas Neumann <aneumann@mesosphere.com>


push time in 18 hours

Pull request review comment mesosphere/kudo-cassandra-operator

Update documentation

 in your cluster as you use in the NODE_TOPOLOGY definition.

 ### Full list of required parameters

-```
-    ENDPOINT_SNITCH=GossipingPropertyFileSnitch
-    NODE_ANTI_AFFINITY=true
-    NODE_TOPOLOGY=<the cluster topology>
+```yaml
+ENDPOINT_SNITCH=GossipingPropertyFileSnitch
+NODE_ANTI_AFFINITY=true
+NODE_TOPOLOGY=<the cluster topology>

Oops, how did that happen?

nfnt

comment created time in 18 hours

Pull request review comment mesosphere/kudo-cassandra-operator

Update documentation

 rack awareness.

 ## Kubernetes cluster prerequisites

-At this time, KUDO Cassandra needs a single Kubernetes cluster spanning all the
-datacenters. A Cassandra cluster running on two or more Kubernetes clusters is
-not supported at the moment
+### Naming
+
+Cassandra datacenters can either run in a single Kubernetes cluster that is

Multiple datacenters form a single Cassandra cluster. At least that's what I'm trying to say here. I'll add another sentence to distinguish between a Cassandra cluster and a datacenter.

nfnt

comment created time in 18 hours

push event kudobuilder/kudo

Ken Sipe

commit sha daa7ac8a20b6cb3e9410c1457c5384fe5c11a325

kuttl v0.4.0 bump (#1545) Signed-off-by: Ken Sipe <kensipe@gmail.com>


Aleksey Dukhovniy

commit sha f71e81c51024ff0b19710dddc3ebad4bc5b818ff

KEP-29: Add `KudoOperatorTask` implementation (#1541) Summary: implemented `KudoTaskOperator` which, given a `KudoOperator` task in the operator will create the `Instance` object and wait for it to become healthy. Additionally added `paramsFile` to the `KudoOperatorTaskSpec`. Fixes: #1509 Signed-off-by: Aleksey Dukhovniy <alex.dukhovniy@googlemail.com>


Jan Schlicht

commit sha 8a34e8988dff9bd2a98fd991360309db29dc46d2

Merge branch 'master' into nfnt/refactor-package-install Signed-off-by: Jan Schlicht <jan@d2iq.com>


push time in 19 hours

pull request comment mesosphere/kudo-cassandra-operator

Disable the Prometheus exporter by default

Thanks for catching and correcting that @zmalik!

nfnt

comment created time in 19 hours

issue comment kudobuilder/kudo

Highly sub-optimal UX of `make cli-install` at HEAD of `master`

Isn't the issue here that we're referencing the last released controller even when master is already ahead? Once a version of KUDO is released, we should consider every commit to master afterwards as a commit towards the next version of KUDO, and this should also be reflected in the controller image that kubectl kudo init will use. I.e., it should reference the next, still-unreleased image, which users would have to build on their own.

porridge

comment created time in 20 hours

Pull request review comment kudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

 func fatalExecutionError(cause error, eventName string, meta renderer.Metadata)

 func newKudoOperator(task *v1beta1.Task) (Tasker, error) {
 	// validate KudoOperatorTask
-	if len(task.Spec.KudoOperatorTaskSpec.Package) == 0 {
+	if task.Spec.KudoOperatorTaskSpec.Package == "" {
 		return nil, fmt.Errorf("task validation error: kudo operator task '%s' has an empty package name", task.Name)
 	}

-	if len(task.Spec.KudoOperatorTaskSpec.OperatorVersion) == 0 {
+	if task.Spec.KudoOperatorTaskSpec.OperatorVersion == "" {
 		return nil, fmt.Errorf("task validation error: kudo operator task '%s' has an empty operatorVersion", task.Name)
 	}

 	return KudoOperatorTask{
 		Name:            task.Name,
-		Package:         task.Spec.KudoOperatorTaskSpec.Package,
+		OperatorName:    task.Spec.KudoOperatorTaskSpec.Package,

:+1: That should work!

zen-dog

comment created time in 2 days

Pull request review comment kudobuilder/kudo

Refactor operator package installation

+// Package install provides function to install package resources
+// on a Kubernetes cluster.
+package install
+
+import (
+	"strings"
+	"time"
+
+	"github.com/kudobuilder/kudo/pkg/apis/kudo/v1beta1"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/clog"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/packages"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"
+)
+
+type Options struct {
+	skipInstance    bool
+	wait            *time.Duration
+	createNamespace bool
+}
+
+type Option func(*Options)
+
+// SkipInstance installs only Operator and OperatorVersion
+// of an operator package.
+func SkipInstance() Option {
+	return func(o *Options) {
+		o.skipInstance = true
+	}
+}
+
+// WaitForInstance waits an amount of time for the instance
+// to complete installation.
+func WaitForInstance(duration time.Duration) Option {
+	return func(o *Options) {
+		o.wait = &duration
+	}
+}
+
+// CreateNamespace creates the specified namespace before installation.
+// If available, a namespace manifest in the operator package is
+// rendered using the installation parameters.
+func CreateNamespace() Option {
+	return func(o *Options) {
+		o.createNamespace = true
+	}
+}
+
+// Package installs an operator package with parameters into a namespace.
+// Instance name, namespace and operator parameters are applied to the
+// operator package resources. These rendered resources are then created
+// on the Kubernetes cluster.
+func Package(
+	client *kudo.Client,
+	instanceName string,
+	namespace string,
+	resources packages.Resources,
+	parameters map[string]string,
+	opts ...Option) error {
+	clog.V(3).Printf("operator name: %v", resources.Operator.Name)
+	clog.V(3).Printf("operator version: %v", resources.OperatorVersion.Spec.Version)
+
+	options := Options{}
+	for _, o := range opts {
+		o(&options)
+	}
+
+	applyOverrides(&resources, instanceName, namespace, parameters)
+
+	if err := client.ValidateServerForOperator(resources.Operator); err != nil {
+		return err
+	}
+
+	if options.createNamespace {
+		if err := installNamespace(client, resources, parameters); err != nil {
+			return err
+		}
+	}
+
+	if err := installOperatorAndOperatorVersion(client, resources); err != nil {
+		return err
+	}
+
+	if options.skipInstance {
+		return nil
+	}
+
+	if err := validateParameters(

The old code skipped this validation if no instance was to be installed. I've moved it to run as early as possible, for the reasons you mentioned.

nfnt

comment created time in 2 days

push event kudobuilder/kudo

Jan Schlicht

commit sha 6d594f2da17d92095351a05e80575c5b59965962

Validate early Signed-off-by: Jan Schlicht <jan@d2iq.com>


push time in 2 days

push event kudobuilder/kudo

Jan Schlicht

commit sha c6748d250999e8cf0e16c3d6c9951b4b655d00f3

Improved consistency of resource logging Signed-off-by: Jan Schlicht <jan@d2iq.com>


push time in 2 days

Pull request review comment kudobuilder/kudo

Refactor operator package installation

+package install
+
+import (
+	"fmt"
+
+	"github.com/thoas/go-funk"
+
+	"github.com/kudobuilder/kudo/pkg/kudoctl/clog"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/packages"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"
+)
+
+func installOperatorAndOperatorVersion(client *kudo.Client, resources packages.Resources) error {
+	if !client.OperatorExistsInCluster(resources.Operator.Name, resources.Operator.Namespace) {
+		if _, err := client.InstallOperatorObjToCluster(resources.Operator, resources.Operator.Namespace); err != nil {
+			return fmt.Errorf(
+				"failed to install %s-operator.yaml in namespace %s: %v",
+				resources.Operator.Name,
+				resources.Operator.Namespace,
+				err)
+		}
+		clog.Printf(
+			"operator.%s/%s created in namespace %s",

Well, I counted at least 3 different approaches in KUDO while working on this refactor: $namespace/$name, $resource.$apiversion/$name and $name-$resource.yaml. Not sure if this was intended or just a lack of consistency. I'll change all occurrences to $namespace/$name.

nfnt

comment created time in 2 days
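For illustration, the $namespace/$name convention settled on in the comment above amounts to a one-line formatter. The helper below is hypothetical, not from the KUDO codebase:

```go
package main

import "fmt"

// resourceID renders a resource identifier in the $namespace/$name form
// discussed above. Hypothetical helper for illustration only.
func resourceID(namespace, name string) string {
	return fmt.Sprintf("%s/%s", namespace, name)
}

func main() {
	// e.g. logging a created operator resource
	fmt.Println(resourceID("default", "cassandra-operator")) // default/cassandra-operator
}
```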

Pull request review comment kudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

 func fatalExecutionError(cause error, eventName string, meta renderer.Metadata)

 func newKudoOperator(task *v1beta1.Task) (Tasker, error) {
 	// validate KudoOperatorTask
-	if len(task.Spec.KudoOperatorTaskSpec.Package) == 0 {
+	if task.Spec.KudoOperatorTaskSpec.Package == "" {
 		return nil, fmt.Errorf("task validation error: kudo operator task '%s' has an empty package name", task.Name)
 	}

-	if len(task.Spec.KudoOperatorTaskSpec.OperatorVersion) == 0 {
+	if task.Spec.KudoOperatorTaskSpec.OperatorVersion == "" {
 		return nil, fmt.Errorf("task validation error: kudo operator task '%s' has an empty operatorVersion", task.Name)
 	}

 	return KudoOperatorTask{
 		Name:            task.Name,
-		Package:         task.Spec.KudoOperatorTaskSpec.Package,
+		OperatorName:    task.Spec.KudoOperatorTaskSpec.Package,

We should rename this in KudoOperatorTaskSpec as well. As that API isn't in use yet, it's safe to rename without breaking backwards compatibility.

zen-dog

comment created time in 2 days

Pull request review comment kudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

+package v1beta1
+
+import (
+	"fmt"
+)
+
+func OperatorInstanceName(operatorName string) string {
+	return fmt.Sprintf("%s-instance", operatorName)
+}
+
+func OperatorVersionName(operatorName, operatorVersion string) string {

More nitpicking: aren't this and OperatorInstanceName better placed in operator_types_helpers, because they extract information from operator names? I mistook operatorVersion for an OperatorVersion, that's why I suggested that this should be in operatorversion_types_helpers... but it's in fact just a version :). So let's call this parameter version so that we don't mistake it for an OperatorVersion.

zen-dog

comment created time in 2 days

PR opened mesosphere/kudo-cassandra-operator

Update documentation
  • Updated ToCs
  • Some markdown linting
  • Update some outdated descriptions
  • Add sections for failure handling


+148 -63

0 comments

8 changed files

pr created time in 2 days

create branch mesosphere/kudo-cassandra-operator

branch: nfnt/additional-docs

created branch time in 2 days

Pull request review comment kudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

 type KudoOperatorTask struct {
 	InstanceName    string
 	AppVersion      string
 	OperatorVersion string
+	ParameterFile   string
 }

 // Run method for the KudoOperatorTask. Not yet implemented
 func (dt KudoOperatorTask) Run(ctx Context) (bool, error) {
-	return false, errors.New("kudo-operator task is not yet implemented. Stay tuned though ;)")
+
+	// 0. - A few prerequisites -
+	// Note: ctx.Meta has Meta.OperatorName and Meta.OperatorVersion fields but these are of the **parent instance**
+	// However, since we don't support multiple namespaces yet, we can use the Meta.InstanceNamespace for the namespace
+	namespace := ctx.Meta.InstanceNamespace
+	operatorName := dt.Package
+	operatorVersion := dt.OperatorVersion
+	operatorVersionName := v1beta1.OperatorVersionName(operatorName, operatorVersion)
+	instanceName := dependencyInstanceName(ctx.Meta.InstanceName, dt.InstanceName, operatorName)
+
+	// 1. - Expand parameter file if exists -
+	params, err := instanceParameters(dt.ParameterFile, ctx.Templates, ctx.Meta, ctx.Parameters)
+	if err != nil {
+		return false, fatalExecutionError(err, taskRenderingError, ctx.Meta)
+	}
+
+	// 2. - Build the instance object -
+	instance, err := instanceResource(instanceName, operatorName, operatorVersionName, namespace, params, ctx.Meta.ResourcesOwner, ctx.Scheme)
+	if err != nil {
+		return false, fatalExecutionError(err, taskRenderingError, ctx.Meta)
+	}
+
+	// 3. - Apply the Instance object -
+	err = applyInstance(instance, namespace, ctx.Client)
+	if err != nil {
+		return false, err
+	}
+
+	// 4. - Check the Instance health -
+	if err := health.IsHealthy(instance); err != nil {
+		return false, nil
+	}
+
+	return true, nil
+}
+
+// dependencyInstanceName returns a name for the child instance in an operator with dependencies looking like
+// <parent-instance.<child-instance> if a child instance name is provided e.g. `kafka-instance.custom-name` or
+// <parent-instance.<child-operator> if not e.g. `kafka-instance.zookeeper`. This way we always have a valid child
+// instance name and user can install the same operator multiple times in the same namespace, because the instance
+// names will be unique thanks to the top-level instance name prefix.
+func dependencyInstanceName(parentInstanceName, instanceName, operatorName string) string {
+	if instanceName != "" {
+		return fmt.Sprintf("%s.%s", parentInstanceName, instanceName)
+	}
+	return fmt.Sprintf("%s.%s", parentInstanceName, operatorName)
+}
+
+// render method takes templated parameter file and a map of parameters and then renders passed template using kudo engine.
+func instanceParameters(pf string, templates map[string]string, meta renderer.Metadata, parameters map[string]interface{}) (map[string]string, error) {
+	if len(pf) != 0 {
+		pft, ok := templates[pf]
+		if !ok {
+			return nil, fmt.Errorf("error finding parameter file %s", pf)
+		}
+
+		rendered, err := renderParametersFile(pf, pft, meta, parameters)
+		if err != nil {
+			return nil, fmt.Errorf("error expanding parameter file %s: %w", pf, err)
+		}
+
+		parameters := map[string]string{}
+		errs := []string{}
+		parser.GetParametersFromFile(pf, []byte(rendered), errs, parameters)
+		if len(errs) > 0 {
+			return nil, fmt.Errorf("failed to unmarshal parameter file %s: %s", pf, strings.Join(errs, ", "))
+		}
+
+		return parameters, nil
+	}
+
+	return nil, nil
+}
+
+func renderParametersFile(pf string, pft string, meta renderer.Metadata, parameters map[string]interface{}) (string, error) {
+	vals := renderer.
+		NewVariableMap().
+		WithInstance(meta.OperatorName, meta.InstanceName, meta.InstanceNamespace, meta.AppVersion, meta.OperatorVersion).
+		WithParameters(parameters)
+
+	engine := renderer.New()
+
+	return engine.Render(pf, pft, vals)
+}
+
+func instanceResource(instanceName, operatorName, operatorVersionName, namespace string, parameters map[string]string, owner metav1.Object, scheme *runtime.Scheme) (*v1beta1.Instance, error) {
+	instance := &v1beta1.Instance{
+		TypeMeta: metav1.TypeMeta{
+			Kind:       "Instance",
+			APIVersion: packages.APIVersion,
+		},
+		ObjectMeta: metav1.ObjectMeta{
+			Name:      instanceName,
+			Namespace: namespace,
+			Labels:    map[string]string{kudo.OperatorLabel: operatorName},
+		},
+		Spec: v1beta1.InstanceSpec{
+			OperatorVersion: corev1.ObjectReference{
+				Name: operatorVersionName,
+			},
+			Parameters: parameters,
+		},
+		Status: v1beta1.InstanceStatus{},
+	}
+	if err := controllerutil.SetControllerReference(owner, instance, scheme); err != nil {
+		return nil, fmt.Errorf("failed to set resource ownership for the new instance: %v", err)
+	}
+
+	return instance, nil
+}
+
+// applyInstance creates the passed instance if it doesn't exist or patches the existing one. Patch will override
+// current spec.parameters and Spec.operatorVersion the same way, kudoctl does it. If the was no error, then the passed
+// instance object is updated with the content returned by the server
+func applyInstance(new *v1beta1.Instance, ns string, c client.Client) error {

#1542 won't resolve this, because it's a refactor in a different area. We should keep the API unification in mind though and see if we can tackle this as part of the dependencies work. All the points you mentioned as reasons for adding these functions are more reasons to refactor the existing similar functions to also cover this use-case.

zen-dog

comment created time in 2 days
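The dependencyInstanceName helper quoted in the diff above can be exercised standalone. This sketch copies its naming logic verbatim to show the resulting child-instance names:

```go
package main

import "fmt"

// dependencyInstanceName mirrors the helper from the quoted diff: the child
// instance is named <parent>.<child-instance> when an explicit instance name
// is given, and <parent>.<operator> otherwise, so child names stay unique
// per top-level instance.
func dependencyInstanceName(parentInstanceName, instanceName, operatorName string) string {
	if instanceName != "" {
		return fmt.Sprintf("%s.%s", parentInstanceName, instanceName)
	}
	return fmt.Sprintf("%s.%s", parentInstanceName, operatorName)
}

func main() {
	fmt.Println(dependencyInstanceName("kafka-instance", "custom-name", "zookeeper")) // kafka-instance.custom-name
	fmt.Println(dependencyInstanceName("kafka-instance", "", "zookeeper"))            // kafka-instance.zookeeper
}
```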

Pull request review comment kudobuilder/kudo

Refactor operator package installation

+// Package install provides function to install package resources
+// on a Kubernetes cluster.
+package install
+
+import (
+	"strings"
+	"time"
+
+	"github.com/kudobuilder/kudo/pkg/apis/kudo/v1beta1"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/clog"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/packages"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"
+)
+
+type Options struct {

I have yet to see where the Kubernetes code base disagrees. Their usage of types.go is done to provide consistency when using generators like deepcopy-gen or other kubebuilder tools on structures that are supposed to be used as part of the Kubernetes API. Which isn't the case here.

nfnt

comment created time in 2 days

Pull request review comment kudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

 func getParamsFromFiles(fs afero.Fs, filePaths []string, errs []string) (map[str
 			errs = append(errs, fmt.Sprintf("error reading from parameter file %s: %v", filePath, err))
 			continue
 		}
-		data := make(map[string]interface{})
-		err = yaml.Unmarshal(rawData, &data)
+
+		errs = GetParametersFromFile(filePath, rawData, errs, parameters)
+
 	}
+	return parameters, errs
+}
+
+func GetParametersFromFile(filePath string, bytes []byte, errs []string, parameters map[string]string) []string {

Your example looks great, that's what I had in mind.

zen-dog

comment created time in 2 days

push event mesosphere/kudo-cassandra-operator

Jan Schlicht

commit sha 66dd713ad22505463b3b92dc1fec87cb229d6a3d

Update more docs Signed-off-by: Jan Schlicht <jan@d2iq.com>


push time in 2 days

Pull request review comment kudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

 func GetPhaseStatus(phaseName string, planStatus *PlanStatus) *PhaseStatus {

 	return nil
 }
+
+func InstanceName(operatorName string) string {

I see, it's supposed to be used in environments that don't have these objects (yet). I'm a bit nitpicky and would prefer a clearer name for this, e.g. InstanceNameFromOperatorName, but given that operatorName is already part of the function signature, I'm fine with the current solution as well.

zen-dog

comment created time in 3 days

push event kudobuilder/kudo

Jan Schlicht

commit sha 5ac19d94a98c8fe4d1368cc413741ba89705a38f

Apply review suggestions * Easier option handling * Improved logging Signed-off-by: Jan Schlicht <jan@d2iq.com>


push time in 3 days

Pull request review comment kudobuilder/kudo

Refactor operator package installation

+// Package install provides function to install package resources
+// on a Kubernetes cluster.
+package install
+
+import (
+	"strings"
+	"time"
+
+	"github.com/kudobuilder/kudo/pkg/apis/kudo/v1beta1"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/clog"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/packages"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"
+)
+
+type Options struct {

Adding an install.types.go won't change anything regarding cyclic dependencies, because it would still be in the same package. For that, it would need to be in types.go of the packages package. I consider this an anti-pattern; types should be declared close to the functions that use them. Cyclic dependencies can be resolved by good package management following the single-responsibility principle.

nfnt

comment created time in 3 days

Pull request review comment kudobuilder/kudo

Refactor operator package installation

+// Package install provides function to install package resources
+// on a Kubernetes cluster.
+package install
+
+import (
+	"strings"
+	"time"
+
+	"github.com/kudobuilder/kudo/pkg/apis/kudo/v1beta1"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/clog"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/packages"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"
+)
+
+type Options struct {
+	skipInstance    bool
+	wait            *time.Duration
+	createNamespace bool
+}
+
+type Option func(*Options)
+
+// SkipInstance installs only Operator and OperatorVersion
+// of an operator package.
+func SkipInstance() Option {
+	return func(o *Options) {
+		o.skipInstance = true
+	}
+}
+
+// WaitForInstance waits an amount of time for the instance
+// to complete installation.
+func WaitForInstance(duration time.Duration) Option {
+	return func(o *Options) {
+		o.wait = &duration
+	}
+}
+
+// CreateNamespace creates the specified namespace before installation.
+// If available, a namespace manifest in the operator package is
+// rendered using the installation parameters.
+func CreateNamespace() Option {
+	return func(o *Options) {
+		o.createNamespace = true
+	}
+}
+
+// Package installs an operator package with parameters into a namespace.
+// Instance name, namespace and operator parameters are applied to the
+// operator package resources. These rendered resources are then created
+// on the Kubernetes cluster.
+func Package(

I want to avoid this reading as install.InstallPackage when called from other packages. OTOH, for the other functions the install prefix makes them more readable, because they're private and only called within this package.

nfnt

comment created time in 3 days

Pull request review comment kudobuilder/kudo

Refactor operator package installation

+// Package install provides function to install package resources
+// on a Kubernetes cluster.
+package install
+
+import (
+	"strings"
+	"time"
+
+	"github.com/kudobuilder/kudo/pkg/apis/kudo/v1beta1"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/clog"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/packages"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"
+)
+
+type Options struct {
+	skipInstance    bool
+	wait            *time.Duration
+	createNamespace bool
+}
+
+type Option func(*Options)
+
+// SkipInstance installs only Operator and OperatorVersion
+// of an operator package.
+func SkipInstance() Option {
+	return func(o *Options) {
+		o.skipInstance = true
+	}
+}
+
+// WaitForInstance waits an amount of time for the instance
+// to complete installation.
+func WaitForInstance(duration time.Duration) Option {
+	return func(o *Options) {
+		o.wait = &duration
+	}
+}
+
+// CreateNamespace creates the specified namespace before installation.
+// If available, a namespace manifest in the operator package is
+// rendered using the installation parameters.
+func CreateNamespace() Option {
+	return func(o *Options) {
+		o.createNamespace = true
+	}
+}
+
+// Package installs an operator package with parameters into a namespace.
+// Instance name, namespace and operator parameters are applied to the
+// operator package resources. These rendered resources are then created
+// on the Kubernetes cluster.
+func Package(
+	client *kudo.Client,
+	instanceName string,
+	namespace string,
+	resources packages.Resources,
+	parameters map[string]string,
+	opts ...Option) error {
+	clog.V(3).Printf("operator name: %v", resources.Operator.Name)
+	clog.V(3).Printf("operator version: %v", resources.OperatorVersion.Spec.Version)
+
+	options := Options{}
+	for _, o := range opts {
+		o(&options)
+	}
+
+	applyOverrides(&resources, instanceName, namespace, parameters)
+
+	if err := client.ValidateServerForOperator(resources.Operator); err != nil {
+		return err
+	}
+
+	if options.createNamespace {
+		if err := installNamespace(client, resources, parameters); err != nil {
+			return err
+		}
+	}
+
+	if err := installOperatorAndOperatorVersion(client, resources); err != nil {
+		return err
+	}
+
+	if options.skipInstance {
+		return nil
+	}
+
+	if err := validateParameters(

Parameter validation is only important for instances, that's why it's only done when actually installing an instance.

nfnt

comment created time in 3 days

Pull request review comment kudobuilder/kudo

Refactor operator package installation

+// Package install provides function to install package resources
+// on a Kubernetes cluster.
+package install
+
+import (
+	"strings"
+	"time"
+
+	"github.com/kudobuilder/kudo/pkg/apis/kudo/v1beta1"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/clog"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/packages"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"
+)
+
+type Options struct {
+	skipInstance    bool
+	wait            *time.Duration
+	createNamespace bool
+}
+
+type Option func(*Options)
+
+// SkipInstance installs only Operator and OperatorVersion
+// of an operator package.
+func SkipInstance() Option {
+	return func(o *Options) {
+		o.skipInstance = true
+	}
+}
+
+// WaitForInstance waits an amount of time for the instance
+// to complete installation.
+func WaitForInstance(duration time.Duration) Option {
+	return func(o *Options) {
+		o.wait = &duration
+	}
+}
+
+// CreateNamespace creates the specified namespace before installation.
+// If available, a namespace manifest in the operator package is
+// rendered using the installation parameters.
+func CreateNamespace() Option {
+	return func(o *Options) {
+		o.createNamespace = true
+	}
+}
+
+// Package installs an operator package with parameters into a namespace.
+// Instance name, namespace and operator parameters are applied to the
+// operator package resources. These rendered resources are then created
+// on the Kubernetes cluster.
+func Package(
+	client *kudo.Client,
+	instanceName string,
+	namespace string,
+	resources packages.Resources,
+	parameters map[string]string,
+	opts ...Option) error {
+	clog.V(3).Printf("operator name: %v", resources.Operator.Name)

Good point, I copied these logs. Happy to improve them here and in the other places. Great suggestions!

nfnt

comment created time in 3 days

Pull request review comment kudobuilder/kudo

Refactor operator package installation

+// Package install provides function to install package resources
+// on a Kubernetes cluster.
+package install
+
+import (
+	"strings"
+	"time"
+
+	"github.com/kudobuilder/kudo/pkg/apis/kudo/v1beta1"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/clog"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/packages"
+	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"
+)
+
+type Options struct {
+	skipInstance    bool
+	wait            *time.Duration
+	createNamespace bool
+}
+
+type Option func(*Options)
+
+// SkipInstance installs only Operator and OperatorVersion
+// of an operator package.
+func SkipInstance() Option {

To provide a backwards-compatible way of declaring options. E.g. the underlying option types could be changed without invalidating these functions -- only their implementations would change. But that's probably a bit too much, as it also complicates how the function is called in installOperator. I'll change this to use an install.Options.

nfnt

comment created time in 3 days

Pull request review comment kudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

 func newKudoOperator(task *v1beta1.Task) (Tasker, error) {
 		return nil, fmt.Errorf("task validation error: kudo operator task '%s' has an empty package name", task.Name)
 	}

+	if len(task.Spec.KudoOperatorTaskSpec.OperatorVersion) == 0 {

Then please fix the other occurrences as well; let's not spread bad patterns. Also, validPipeFile does the right comparisons. See line 144 and line 192.

zen-dog

comment created time in 3 days

Pull request review comment kudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

 func newKudoOperator(task *v1beta1.Task) (Tasker, error) {
 		return nil, fmt.Errorf("task validation error: kudo operator task '%s' has an empty package name", task.Name)
 	}

+	if len(task.Spec.KudoOperatorTaskSpec.OperatorVersion) == 0 {

That's a weird way to check for an empty string.

	if task.Spec.KudoOperatorTaskSpec.OperatorVersion == "" {
zen-dog

comment created time in 3 days

Pull request review commentkudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

```diff
 type KudoOperatorTask struct {
 	InstanceName    string
 	AppVersion      string
 	OperatorVersion string
+	ParameterFile   string
 }
 
 // Run method for the KudoOperatorTask. Not yet implemented
 func (dt KudoOperatorTask) Run(ctx Context) (bool, error) {
-	return false, errors.New("kudo-operator task is not yet implemented. Stay tuned though ;)")
+
+	// 0. - A few prerequisites -
+	// Note: ctx.Meta has Meta.OperatorName and Meta.OperatorVersion fields but these are of the **parent instance**
+	// However, since we don't support multiple namespaces yet, we can use the Meta.InstanceNamespace for the namespace
+	namespace := ctx.Meta.InstanceNamespace
+	operatorName := dt.Package
+	operatorVersion := dt.OperatorVersion
+	operatorVersionName := v1beta1.OperatorVersionName(operatorName, operatorVersion)
+	instanceName := dependencyInstanceName(ctx.Meta.InstanceName, dt.InstanceName, operatorName)
+
+	// 1. - Expand parameter file if exists -
+	params, err := instanceParameters(dt.ParameterFile, ctx.Templates, ctx.Meta, ctx.Parameters)
+	if err != nil {
+		return false, fatalExecutionError(err, taskRenderingError, ctx.Meta)
+	}
+
+	// 2. - Build the instance object -
+	instance, err := instanceResource(instanceName, operatorName, operatorVersionName, namespace, params, ctx.Meta.ResourcesOwner, ctx.Scheme)
+	if err != nil {
+		return false, fatalExecutionError(err, taskRenderingError, ctx.Meta)
+	}
+
+	// 3. - Apply the Instance object -
+	err = applyInstance(instance, namespace, ctx.Client)
+	if err != nil {
+		return false, err
+	}
+
+	// 4. - Check the Instance health -
+	if err := health.IsHealthy(instance); err != nil {
+		return false, nil
+	}
+
+	return true, nil
+}
+
+// dependencyInstanceName returns a name for the child instance in an operator with dependencies looking like
+// <parent-instance.<child-instance> if a child instance name is provided e.g. `kafka-instance.custom-name` or
+// <parent-instance.<child-operator> if not e.g. `kafka-instance.zookeeper`. This way we always have a valid child
+// instance name and user can install the same operator multiple times in the same namespace, because the instance
+// names will be unique thanks to the top-level instance name prefix.
+func dependencyInstanceName(parentInstanceName, instanceName, operatorName string) string {
+	if instanceName != "" {
+		return fmt.Sprintf("%s.%s", parentInstanceName, instanceName)
+	}
+	return fmt.Sprintf("%s.%s", parentInstanceName, operatorName)
+}
+
+// render method takes templated parameter file and a map of parameters and then renders passed template using kudo engine.
+func instanceParameters(pf string, templates map[string]string, meta renderer.Metadata, parameters map[string]interface{}) (map[string]string, error) {
+	if len(pf) != 0 {
```
```go
	if pf != "" {
```
zen-dog

comment created time in 3 days

Pull request review commentkudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

```diff
+// render method takes templated parameter file and a map of parameters and then renders passed template using kudo engine.
+func instanceParameters(pf string, templates map[string]string, meta renderer.Metadata, parameters map[string]interface{}) (map[string]string, error) {
+	if len(pf) != 0 {
+		pft, ok := templates[pf]
+		if !ok {
+			return nil, fmt.Errorf("error finding parameter file %s", pf)
```

Let's be a bit more clear here

```go
			return nil, fmt.Errorf("error finding parameter file %s in template parameters", pf)
```
zen-dog

comment created time in 3 days

Pull request review commentkudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

```diff
+	return fmt.Sprintf("%s.%s", parentInstanceName, operatorName)
+}
+
+// render method takes templated parameter file and a map of parameters and then renders passed template using kudo engine.
```
```go
// instanceParameters method takes templated parameter file and a map of parameters and then renders passed template using kudo engine.
```
zen-dog

comment created time in 3 days

Pull request review commentkudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

```diff
 func IsHealthy(obj runtime.Object) error {
 
 		return fmt.Errorf("job %q still running or failed", obj.Name)
 	case *kudov1beta1.Instance:
-		ps := obj.GetLastExecutedPlanStatus()
-		if ps == nil {
-			return fmt.Errorf("no plan has been executed for Instance %v", obj.Name)
-		}
-
-		if ps.Status.IsFinished() {
+		// if there is no scheduled plan, than we're done
```
```go
		// if there is no scheduled plan, then we're done
```
zen-dog

comment created time in 3 days

Pull request review commentkudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

```diff
 // Run method for the KudoOperatorTask. Not yet implemented
 func (dt KudoOperatorTask) Run(ctx Context) (bool, error) {
+	// 0. - A few prerequisites -
+	// Note: ctx.Meta has Meta.OperatorName and Meta.OperatorVersion fields but these are of the **parent instance**
+	// However, since we don't support multiple namespaces yet, we can use the Meta.InstanceNamespace for the namespace
+	namespace := ctx.Meta.InstanceNamespace
+	operatorName := dt.Package
```

"Package" could also mean a reference to a URL or local file? This isn't really clear and would break the OperatorVersionName call below. Please clarify by documenting the fields of KudoOperatorTask. If this is meant to always reference an operator name, let's rename the field to OperatorName.

zen-dog

comment created time in 3 days
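The child-instance naming scheme quoted in these diffs can be exercised on its own; the function body below is copied verbatim from the PR, wrapped in a runnable `main` for illustration:

```go
package main

import "fmt"

// dependencyInstanceName (from the quoted diff) prefixes the child
// instance with the parent instance name, so the result stays unique
// per top-level installation even when the same operator is installed
// multiple times in one namespace.
func dependencyInstanceName(parentInstanceName, instanceName, operatorName string) string {
	if instanceName != "" {
		return fmt.Sprintf("%s.%s", parentInstanceName, instanceName)
	}
	return fmt.Sprintf("%s.%s", parentInstanceName, operatorName)
}

func main() {
	// With an explicit child instance name:
	fmt.Println(dependencyInstanceName("kafka-instance", "custom-name", "zookeeper")) // kafka-instance.custom-name
	// Falling back to the child operator name:
	fmt.Println(dependencyInstanceName("kafka-instance", "", "zookeeper")) // kafka-instance.zookeeper
}
```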

Pull request review commentkudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

```diff
 status: provisional
 
   * [Non-Goals](#non-goals)
 * [Proposal](#proposal)
   * [Implementation Details](#implementation-details)
-    * [Operator Task](#operator-task)
+    * [Operator Task](#kudooperator-task)
```

Please update ToC name as well.

zen-dog

comment created time in 3 days

Pull request review commentkudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

```diff
+// render method takes templated parameter file and a map of parameters and then renders passed template using kudo engine.
+func instanceParameters(pf string, templates map[string]string, meta renderer.Metadata, parameters map[string]interface{}) (map[string]string, error) {
+	if len(pf) != 0 {
+		pft, ok := templates[pf]
+		if !ok {
+			return nil, fmt.Errorf("error finding parameter file %s", pf)
+		}
+
+		rendered, err := renderParametersFile(pf, pft, meta, parameters)
+		if err != nil {
+			return nil, fmt.Errorf("error expanding parameter file %s: %w", pf, err)
+		}
+
+		parameters := map[string]string{}
+		errs := []string{}
+		parser.GetParametersFromFile(pf, []byte(rendered), errs, parameters)
+		if len(errs) > 0 {
+			return nil, fmt.Errorf("failed to unmarshal parameter file %s: %s", pf, strings.Join(errs, ", "))
+		}
+
+		return parameters, nil
+	}
+
+	return nil, nil
```

Let's return an empty map instead. This will prevent issues with dependent functions not checking for a nil map

```go
	return map[string]string{}, nil
```

or move the definition of parameters to the beginning of the function and return it here.

zen-dog

comment created time in 3 days
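The reviewer's point can be demonstrated directly: in Go, reading a missing key from a nil map safely yields the zero value, but writing to a nil map panics, so returning an initialized map is the friendlier contract for callers. A standalone sketch (the helper names are illustrative, not from the PR):

```go
package main

import "fmt"

// paramsNil mimics the "return nil, nil" branch under discussion.
func paramsNil() map[string]string { return nil }

// paramsEmpty mimics the suggested "return map[string]string{}, nil".
func paramsEmpty() map[string]string { return map[string]string{} }

func main() {
	// Reading from a nil map is safe and yields the zero value ("").
	fmt.Printf("%q\n", paramsNil()["KEY"])
	// Writing to a nil map would panic at runtime; an initialized map
	// accepts writes, so downstream code needs no nil check.
	m := paramsEmpty()
	m["KEY"] = "value"
	fmt.Println(len(m)) // 1
}
```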

Pull request review commentkudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

```diff
+// render method takes templated parameter file and a map of parameters and then renders passed template using kudo engine.
+func instanceParameters(pf string, templates map[string]string, meta renderer.Metadata, parameters map[string]interface{}) (map[string]string, error) {
```

pf? parameterFile.

zen-dog

comment created time in 3 days

Pull request review commentkudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

```diff
 const (
 // 2. GetPlanInProgress goes through i.Spec.PlanStatus map and returns the first found plan that is running
 //
 // (1) is set directly when the user updates the instance and reset **after** the plan is terminal
-// (2) is updated **after** each time the instance controller executes the plan
+// (2) is updated **AFTER** the instance controller if done with the reconciliation call
```

s/if/is/

zen-dog

comment created time in 3 days

Pull request review commentkudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

```diff
+func renderParametersFile(pf string, pft string, meta renderer.Metadata, parameters map[string]interface{}) (string, error) {
+	vals := renderer.
```

Nit: vals for values? Let's use vars because we're creating a variable map here.

zen-dog

comment created time in 3 days

Pull request review commentkudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

```diff
+// applyInstance creates the passed instance if it doesn't exist or patches the existing one. Patch will override
+// current spec.parameters and Spec.operatorVersion the same way, kudoctl does it. If the was no error, then the passed
+// instance object is updated with the content returned by the server
+func applyInstance(new *v1beta1.Instance, ns string, c client.Client) error {
```

Not yours, but it would be great if you could unify instance creation/updates in KUDO's code base. There is already functionality for that in pkg/kudoctl/util/kudo: InstallInstanceObjToCluster and UpdateInstance. It would be great if that could get refactored into an API that's usable for this case as well.

zen-dog

comment created time in 3 days

Pull request review commentkudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

```diff
 func getParamsFromFiles(fs afero.Fs, filePaths []string, errs []string) (map[str
 			errs = append(errs, fmt.Sprintf("error reading from parameter file %s: %v", filePath, err))
 			continue
 		}
-		data := make(map[string]interface{})
-		err = yaml.Unmarshal(rawData, &data)
+
+		errs = GetParametersFromFile(filePath, rawData, errs, parameters)
+
+	}
+	return parameters, errs
+}
+
+func GetParametersFromFile(filePath string, bytes []byte, errs []string, parameters map[string]string) []string {
```

If `[]string` is used to return multiple errors, why not use `[]error` instead? Also, what's the reasoning behind the `errs` input value? It's only used for `append` calls and is hence unnecessary. If it should be appended to an existing list of errors, it's much clearer if callers run `errs = append(errs, GetParametersFromFile()...)`.

zen-dog

comment created time in 3 days
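The calling convention suggested above can be sketched as follows. `parseParameters` is a hypothetical stand-in for `GetParametersFromFile` that returns its errors as `[]error` and does not thread the caller's error slice through the call:

```go
package main

import "fmt"

// parseParameters is a hypothetical stand-in: it reports its own
// errors as []error instead of []string, leaving accumulation to the
// caller.
func parseParameters(file string, data []byte, params map[string]string) []error {
	var errs []error
	if len(data) == 0 {
		errs = append(errs, fmt.Errorf("parameter file %s is empty", file))
	}
	return errs
}

func main() {
	params := map[string]string{}
	var errs []error
	// The caller decides how to accumulate: append the returned slice
	// instead of passing a shared slice into every call.
	errs = append(errs, parseParameters("params.yaml", nil, params)...)
	errs = append(errs, parseParameters("other.yaml", []byte("KEY: value"), params)...)
	fmt.Println(len(errs)) // 1
}
```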

Pull request review commentkudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

```diff
 func GetPhaseStatus(phaseName string, planStatus *PlanStatus) *PhaseStatus {
 
 	return nil
 }
+
+func InstanceName(operatorName string) string {
+	return fmt.Sprintf("%s-instance", operatorName)
+}
+
+func OperatorVersionName(operatorName, operatorVersion string) string {
```

See comment above. Also, this shouldn't be in `instance_type_helpers` but `operatorversion_type_helpers` instead.

zen-dog

comment created time in 3 days

Pull request review commentkudobuilder/kudo

KEP-29: Add `KudoOperatorTask` implementation

```diff
 func GetPhaseStatus(phaseName string, planStatus *PlanStatus) *PhaseStatus {
 
 	return nil
 }
+
+func InstanceName(operatorName string) string {
```

Better would be to have something like

```go
func (o *Operator) InstanceName() string
```

because the string parameter makes this basically untyped. Then the calls below would become `files.Operator.InstanceName()` instead of `InstanceName(files.Operator.Name)`.

zen-dog

comment created time in 3 days
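The method-vs-free-function distinction suggested here can be sketched standalone; the `Operator` type and `InstanceName` body mirror the quoted diff, while the receiver form is the reviewer's proposal:

```go
package main

import "fmt"

// Operator is a minimal stand-in for KUDO's Operator type.
type Operator struct {
	Name string
}

// InstanceName as a method: the receiver ties the derived name to a
// concrete Operator value instead of accepting an arbitrary, untyped
// string that might not be an operator name at all.
func (o *Operator) InstanceName() string {
	return fmt.Sprintf("%s-instance", o.Name)
}

func main() {
	op := &Operator{Name: "kafka"}
	fmt.Println(op.InstanceName()) // kafka-instance
}
```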

push eventmesosphere/kudo-cassandra-operator

Jan Schlicht

commit sha 621296b9d081f8466ccfa902c6f14f601e040144

Run 'tools/format_files.sh' Signed-off-by: Jan Schlicht <jan@d2iq.com>

view details

push time in 3 days

Pull request review commentmesosphere/kudo-cassandra-operator

Disable the Prometheus exporter by default

```diff
 # Parameters
-
-| Name                                     | Description                                                                                | Default                                    |
-| ---------------------------------------- | ------------------------------------------------------------------------------------------ | ------------------------------------------ |
-| **NODE_COUNT**                           | Number of Cassandra nodes.                                                                 | 3                                          |
-| **NODE_CPU_MC**                          | CPU request (in millicores) for the Cassandra node containers.                             | 1000                                       |
-| **NODE_CPU_LIMIT_MC**                    | CPU limit (in millicores) for the Cassandra node containers.                               | 1000                                       |
-| **NODE_MEM_MIB**                         | Memory request (in MiB) for the Cassandra node containers.                                 | 4096                                       |
-| **NODE_MEM_LIMIT_MIB**                   | Memory limit (in MiB) for the Cassandra node containers.                                   | 4096                                       |
-| **NODE_DISK_SIZE_GIB**                   | Disk size (in GiB) for the Cassandra node containers.                                      | 20                                         |
-| **NODE_STORAGE_CLASS**                   | The storage class to be used in volumeClaimTemplates. By default, it is not required and the default storage class is used. |            |
-| **NODE_DOCKER_IMAGE**                    | Cassandra node Docker image.                                                               | mesosphere/cassandra:3.11.6-0.1.2-SNAPSHOT |
-| **NODE_DOCKER_IMAGE_PULL_POLICY**        | Cassandra node Docker image pull policy.                                                   | Always                                     |
-| **NODE_READINESS_PROBE_INITIAL_DELAY_S** | Number of seconds after the container has started before the readiness probe is initiated. | 0                                          |
-| **NODE_READINESS_PROBE_PERIOD_S**        | How often (in seconds) to perform the readiness probe.
```
| 5                                                                                                                                                                                                 |-| **NODE_READINESS_PROBE_TIMEOUT_S**                       | How long (in seconds) to wait for a readiness probe to succeed.                                                                                                                                                                                                                                                                           | 60                                                                                                                                                                                                |-| **NODE_READINESS_PROBE_SUCCESS_THRESHOLD**               | Minimum consecutive successes for the readiness probe to be considered successful after having failed.                                                                                                                                                                                                                                    | 1                                                                                                                                                                                                 |-| **NODE_READINESS_PROBE_FAILURE_THRESHOLD**               | When a pod starts and the readiness probe fails, `failure_threshold` attempts will be made before marking the pod as 'unready'.                                                                                                                                                                                                           
| 3                                                                                                                                                                                                 |-| **NODE_LIVENESS_PROBE_INITIAL_DELAY_S**                  | Number of seconds after the container has started before the liveness probe is initiated.                                                                                                                                                                                                                                                 | 15                                                                                                                                                                                                |-| **NODE_LIVENESS_PROBE_PERIOD_S**                         | How often (in seconds) to perform the liveness probe.                                                                                                                                                                                                                                                                                     | 20                                                                                                                                                                                                |-| **NODE_LIVENESS_PROBE_TIMEOUT_S**                        | How long (in seconds) to wait for a liveness probe to succeed.                                                                                                                                                                                                                                                                            
| 60                                                                                                                                                                                                |-| **NODE_LIVENESS_PROBE_SUCCESS_THRESHOLD**                | Minimum consecutive successes for the liveness probe to be considered successful after having failed.                                                                                                                                                                                                                                     | 1                                                                                                                                                                                                 |-| **NODE_LIVENESS_PROBE_FAILURE_THRESHOLD**                | When a pod starts and the liveness probe fails, `failure_threshold` attempts will be made before restarting the pod.                                                                                                                                                                                                                      | 3                                                                                                                                                                                                 |-| **OVERRIDE_CLUSTER_NAME**                                | Override the name of the Cassandra cluster set by the operator. This shouldn't be explicit set, unless you know what you're doing.                                                                                                                                                                                                        
|                                                                                                                                                                                                   |-| **EXTERNAL_SERVICE**                                     | Needs to be true for either EXTERNAL_NATIVE_TRANSPORT or EXTERNAL_RPC to work                                                                                                                                                                                                                                                             | False                                                                                                                                                                                             |-| **EXTERNAL_NATIVE_TRANSPORT**                            | This exposes the Cassandra cluster via an external service so it can be accessed from outside the Kubernetes cluster                                                                                                                                                                                                                      | False                                                                                                                                                                                             |-| **EXTERNAL_RPC**                                         | This exposes the Cassandra cluster via an external service so it can be accessed from outside the Kubernetes cluster. 
Works only if START_RPC is true                                                                                                                                                                                     | False                                                                                                                                                                                             |-| **EXTERNAL_NATIVE_TRANSPORT_PORT**                       | The external port to use for Cassandra native transport protocol.                                                                                                                                                                                                                                                                         | 9042                                                                                                                                                                                              |-| **EXTERNAL_RPC_PORT**                                    | The external port to use for Cassandra rpc protocol.                                                                                                                                                                                                                                                                                      
| 9160                                                                                                                                                                                              |-| **RECOVERY_CONTROLLER**                                  | Needs to be true for automatic failure recovery and node eviction                                                                                                                                                                                                                                                                         | False                                                                                                                                                                                             |-| **RECOVERY_CONTROLLER_DOCKER_IMAGE**                     | Recovery controller Docker image.                                                                                                                                                                                                                                                                                                         | mesosphere/kudo-cassandra-recovery:0.0.2-0.1.2-SNAPSHOT                                                                                                                                           |-| **RECOVERY_CONTROLLER_DOCKER_IMAGE_PULL_POLICY**         | Recovery controller Docker image pull policy.                                                                                                                                                                                                                                                                                             
| Always                                                                                                                                                                                            |-| **MAX_UNAVAILABLE_NODES**                                | Maximum number of nodes that are allowed to be down, either for restarts or from unscheduled outage. See PodDisruptionBudget                                                                                                                                                                                                              | 1                                                                                                                                                                                                 |-| **BOOTSTRAP_TIMEOUT**                                    | Timeout for the bootstrap binary to join the cluster with the new IP. Valid time units are 'ns', 'us', 'ms', 's', 'm', 'h'.                                                                                                                                                                                                               | 12h30m                                                                                                                                                                                            |-| **SHUTDOWN_OLD_REACHABLE_NODE**                          | When a node replace is done, try to connect to the old node and shut it down before starting up the old node                                                                                                                                                                                                                              | False                                                                                                                                                                                             |-| **BACKUP_RESTORE_ENABLED**       
                        | Global flag that enables the medusa sidecar for backups                                                                                                                                                                                                                                                                                   | False                                                                                                                                                                                             |-| **BACKUP_TRIGGER**                                       | Trigger parameter to start a backup. Simply needs to be changed from the current value to start a backup                                                                                                                                                                                                                                  | 1                                                                                                                                                                                                 |-| **BACKUP_AWS_CREDENTIALS_SECRET**                        | If set, can be used to provide the access_key, secret_key and security_token with a secret                                                                                                                                                                                                                                                |                                                                                                                                                                                                   |-| **BACKUP_AWS_S3_BUCKET_NAME**                            | The name of the AWS S3 bucket to store the backups                                                                                                                                                             
                                                                                                                           |                                                                                                                                                                                                   |-| **BACKUP_AWS_S3_STORAGE_PROVIDER**                       | Should be one of the s3\_\* values from https://github.com/apache/libcloud/blob/trunk/libcloud/storage/types.py                                                                                                                                                                                                                           | s3_us_west_oregon                                                                                                                                                                                 |-| **BACKUP_PREFIX**                                        | A prefix to be used inside the S3 bucket                                                                                                                                                                                                                                                                                                  |                                                                                                                                                                                                   |-| **BACKUP_MEDUSA_CPU_MC**                                 | CPU request (in millicores) for the Medusa backup containers.                                                                                                                                                                                                                                                                             
| 100                                                                                                                                                                                               |-| **BACKUP_MEDUSA_CPU_LIMIT_MC**                           | CPU limit (in millicores) for the Medusa backup containers.                                                                                                                                                                                                                                                                               | 500                                                                                                                                                                                               |-| **BACKUP_MEDUSA_MEM_MIB**                                | Memory request (in MiB) for the Medusa backup containers.                                                                                                                                                                                                                                                                                 | 256                                                                                                                                                                                               |-| **BACKUP_MEDUSA_MEM_LIMIT_MIB**                          | Memory limit (in MiB) for the Medusa backup containers.                                                                                                                                                                                                                                                                                   
| 512                                                                                                                                                                                               |-| **BACKUP_MEDUSA_DOCKER_IMAGE**                           | Medusa backup Docker image.                                                                                                                                                                                                                                                                                                               | mesosphere/kudo-cassandra-medusa:0.5.1-0.1.2-SNAPSHOT                                                                                                                                             |-| **BACKUP_MEDUSA_DOCKER_IMAGE_PULL_POLICY**               | Medusa backup Docker image pull policy.                                                                                                                                                                                                                                                                                                   | Always                                                                                                                                                                                            |-| **BACKUP_NAME**                                          | The name of the backup to create or restore                                                                                                                                                                                                                                                                                               |                                                                                                                                                                                                   |-| **RESTORE_FLAG**                 
                        | If true, a restore is done on installation                                                                                                                                                                                                                                                                                                | False                                                                                                                                                                                             |-| **RESTORE_OLD_NAMESPACE**                                | The namespace from the operator that was used to create the backup                                                                                                                                                                                                                                                                        |                                                                                                                                                                                                   |-| **RESTORE_OLD_NAME**                                     | The instance name from the operator that was used to create the backup                                                                                                                                                                                                                                                                    |                                                                                                                                                                                                   |-| **NODE_TOPOLOGY**                                        | This describes a multi-datacenter setup. When set it has precedence over NODE_COUNT. See docs/multidatacenter.md for more details.                                                                             
                                                                                                                           |                                                                                                                                                                                                   |-| **NODE_ANTI_AFFINITY**                                   | Ensure that every Cassandra node is deployed on separate hosts                                                                                                                                                                                                                                                                            | False                                                                                                                                                                                             |-| **SERVICE_ACCOUNT_INSTALL**                              | This flag can be set to true to automatic installation of a cluster role, service account and role binding                                                                                                                                                                                                                                | False                                                                                                                                                                                             |-| **EXTERNAL_SEED_NODES**                                  | List of seed nodes external to this instance to add to the cluster. This allows clusters spanning multiple Kubernetes clusters.                                                                                                                                                                                                           
|                                                                                                                                                                                                   |-| **PROMETHEUS_EXPORTER_ENABLED**                          |                                                                                                                                                                                                                                                                                                                                           | True                                                                                                                                                                                              |-| **PROMETHEUS_EXPORTER_PORT**                             | Prometheus exporter port.                                                                                                                                                                                                                                                                                                                 | 7200                                                                                                                                                                                              |-| **PROMETHEUS_EXPORTER_CPU_MC**                           | CPU request (in millicores) for the Prometheus exporter containers.                                                                                                                                                                                                                                                                       
| 500                                                                                                                                                                                               |-| **PROMETHEUS_EXPORTER_CPU_LIMIT_MC**                     | CPU limit (in millicores) for the Prometheus exporter containers.                                                                                                                                                                                                                                                                         | 1000                                                                                                                                                                                              |-| **PROMETHEUS_EXPORTER_MEM_MIB**                          | Memory request (in MiB) for the Prometheus exporter containers.                                                                                                                                                                                                                                                                           | 512                                                                                                                                                                                               |-| **PROMETHEUS_EXPORTER_MEM_LIMIT_MIB**                    | Memory limit (in MiB) for the Prometheus exporter containers.                                                                                                                                                                                                                                                                             
| 512 |
| **PROMETHEUS_EXPORTER_DOCKER_IMAGE** | Prometheus exporter Docker image. | mesosphere/cassandra-prometheus-exporter:2.3.4-0.1.2-SNAPSHOT |
| **PROMETHEUS_EXPORTER_DOCKER_IMAGE_PULL_POLICY** | Prometheus exporter Docker image pull policy. | Always |
| **PROMETHEUS_EXPORTER_CUSTOM_CONFIG_CM_NAME** | The properties present in this configmap will be appended to the Prometheus configuration properties. |  |
| **STORAGE_PORT** | The port for inter-node communication. | 7000 |
| **SSL_STORAGE_PORT** | The port for inter-node communication over SSL. | 7001 |
| **NATIVE_TRANSPORT_PORT** | The port for CQL communication. | 9042 |
| **RPC_PORT** | The port for Thrift RPC communication. | 9160 |
| **JMX_PORT** | The JMX port that will be used to interface with the Cassandra application. | 7199 |
| **RMI_PORT** | The RMI port that will be used to interface with the Cassandra application when TRANSPORT_ENCRYPTION_ENABLED is set. | 7299 |
| **JMX_LOCAL_ONLY** | If true, the JMX port will only be opened on localhost and not be available to the cluster. | True |
| **TRANSPORT_ENCRYPTION_ENABLED** | Enable node-to-node encryption. | False |
| **TRANSPORT_ENCRYPTION_CLIENT_ENABLED** | Enable client-to-node encryption. | False |
| **TRANSPORT_ENCRYPTION_CIPHERS** | Comma-separated list of JSSE cipher suite names. | TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA |
| **TRANSPORT_ENCRYPTION_CLIENT_ALLOW_PLAINTEXT** | Enable server-client plaintext communication alongside encrypted traffic. | False |
| **TRANSPORT_ENCRYPTION_REQUIRE_CLIENT_AUTH** | Enable client certificate authentication on node-to-node transport encryption. | True |
| **TRANSPORT_ENCRYPTION_CLIENT_REQUIRE_CLIENT_AUTH** | Enable client certificate authentication on client-to-node transport encryption. | True |
| **TLS_SECRET_NAME** | The TLS secret that contains the self-signed certificate (cassandra.crt) and the private key (cassandra.key). The secret will be mounted as a volume to make the artifacts available. | cassandra-tls |
| **NODE_MIN_HEAP_SIZE_MB** | The minimum JVM heap size in MB. This has a smart default and doesn't need to be explicitly set. |  |
| **NODE_MAX_HEAP_SIZE_MB** | The maximum JVM heap size in MB. This has a smart default and doesn't need to be explicitly set. |  |
| **NODE_NEW_GENERATION_HEAP_SIZE_MB** | The JVM new generation heap size in MB. |  |
| **SEED_PROVIDER_CLASS** | The class within Cassandra that handles the seed logic. | org.apache.cassandra.locator.SimpleSeedProvider |
| **NUM_TOKENS** | The number of tokens assigned to each node. | 256 |
| **HINTED_HANDOFF_ENABLED** | If true, hinted handoff is enabled for the cluster. | True |
| **MAX_HINT_WINDOW_IN_MS** | The maximum amount of time, in ms, that hints are generated for an unresponsive node. | 10800000 |
| **HINTED_HANDOFF_THROTTLE_IN_KB** | The maximum throttle per delivery thread in KB per second. | 1024 |
| **MAX_HINTS_DELIVERY_THREADS** | The maximum number of delivery threads for hinted handoff. | 2 |
| **BATCHLOG_REPLAY_THROTTLE_IN_KB** | The total maximum throttle for replaying failed logged batches in KB per second. | 1024 |
| **AUTHENTICATOR** | Authentication backend, implementing IAuthenticator; used to identify users. | AllowAllAuthenticator |
| **AUTHENTICATION_SECRET_NAME** | Name of the secret containing the credentials used by the operator when running 'nodetool'. Only relevant if AUTHENTICATOR is set to 'PasswordAuthenticator'. The secret needs to have a 'username' and a 'password' entry. |  |
| **AUTHORIZER** | Authorization backend, implementing IAuthorizer; used to limit access/provide permissions. | AllowAllAuthorizer |
| **ROLE_MANAGER** | Part of the authentication & authorization backend that implements IRoleManager to maintain grants and memberships between roles. By default, this is Apache Cassandra's out-of-the-box role manager, CassandraRoleManager. | CassandraRoleManager |
| **ROLES_VALIDITY_IN_MS** | Validity period for the roles cache; set to 0 to disable. | 2000 |
| **ROLES_UPDATE_INTERVAL_IN_MS** | After this interval, cache entries become eligible for refresh. Upon next access, Cassandra schedules an async reload and returns the old value until the reload completes. If roles_validity_in_ms is non-zero, this must be non-zero as well. |  |
| **CREDENTIALS_VALIDITY_IN_MS** | This cache is tightly coupled to the provided PasswordAuthenticator implementation of IAuthenticator. If another IAuthenticator implementation is configured, Cassandra does not use this cache, and these settings have no effect. Set to 0 to disable. | 2000 |
| **CREDENTIALS_UPDATE_INTERVAL_IN_MS** | After this interval, cache entries become eligible for refresh. The next time the cache is accessed, the system schedules an asynchronous reload of the cache. Until this reload is complete, the cache returns the old values. If credentials_validity_in_ms is non-zero, this property must also be non-zero. |  |
| **PERMISSIONS_VALIDITY_IN_MS** | How many milliseconds permissions remain valid in the cache. Fetching permissions can be resource intensive. To disable the cache, set this to 0. | 2000 |
| **PERMISSIONS_UPDATE_INTERVAL_IN_MS** | If enabled, sets the refresh interval for the permissions cache. After this interval, cache entries become eligible for refresh. On next access, Cassandra schedules an async reload and returns the old value until the reload completes. If permissions_validity_in_ms is non-zero, permissions_update_interval_in_ms must also be non-zero. |  |
| **PARTITIONER** | The partitioner used to distribute rows across the cluster. Murmur3Partitioner is the recommended setting. RandomPartitioner and ByteOrderedPartitioner are supported for legacy applications. | org.apache.cassandra.dht.Murmur3Partitioner |
| **KEY_CACHE_SAVE_PERIOD** | The duration in seconds that keys are saved in cache. Saved caches greatly improve cold-start speeds and have relatively little effect on I/O. | 14400 |
| **ROW_CACHE_SIZE_IN_MB** | Maximum size of the row cache in memory. The row cache can save more time than key_cache_size_in_mb, but is space-intensive because it contains the entire row. Use the row cache only for hot rows or static rows. 0 disables the row cache. | 0 |
| **ROW_CACHE_SAVE_PERIOD** | Duration in seconds that rows are saved in cache. 0 disables caching. | 0 |
| **COMMITLOG_SYNC_PERIOD_IN_MS** | The number of milliseconds between disk fsync calls. | 10000 |
| **COMMITLOG_SYNC_BATCH_WINDOW_IN_MS** | Time to wait between batch fsyncs. If commitlog_sync is in batch mode, the recommended value is 2. |  |
| **COMMITLOG_SEGMENT_SIZE_IN_MB** | The size of each commit log segment in MB. | 32 |
| **CONCURRENT_READS** | For workloads with more data than can fit in memory, the bottleneck is reads fetching data from disk. Setting this to (16 times the number of drives) allows operations to queue low enough in the stack so that the OS and drives can reorder them. | 16 |
| **CONCURRENT_WRITES** | Writes in Cassandra are rarely I/O bound, so the ideal number of concurrent writes depends on the number of CPU cores in your system. The recommended value is 8 times the number of CPU cores. | 32 |
| **CONCURRENT_COUNTER_WRITES** | Counter writes read the current values before incrementing and writing them back. The recommended value is (16 times the number of drives). | 16 |
| **MEMTABLE_ALLOCATION_TYPE** | The type of allocations for the Cassandra memtable. heap_buffers keeps all data on the JVM heap. offheap_buffers may reduce heap utilization for large string or binary values. offheap_objects may improve heap size for small integers or UUIDs as well. Both off-heap options will increase read latency. | heap_buffers |
| **INDEX_SUMMARY_RESIZE_INTERVAL_IN_MINUTES** | How frequently index summaries should be re-sampled, in minutes. This is done periodically to redistribute memory from the fixed-size pool to SSTables proportional to their recent read rates. | 60 |
| **START_NATIVE_TRANSPORT** | If true, CQL is enabled. | True |
| **START_RPC** | If true, Thrift RPC is enabled. This is deprecated but may be necessary for legacy applications. | False |
| **RPC_KEEPALIVE** | Enables or disables keepalive on client connections (RPC or native). | True |
| **THRIFT_FRAMED_TRANSPORT_SIZE_IN_MB** | Frame size (maximum field length) for Thrift. | 15 |
| **TOMBSTONE_WARN_THRESHOLD** | The maximum number of tombstones a query can scan before warning. | 1000 |
| **TOMBSTONE_FAILURE_THRESHOLD** | The maximum number of tombstones a query can scan before aborting. | 100000 |
| **COLUMN_INDEX_SIZE_IN_KB** | The granularity of the index of rows within a partition. For huge rows, decrease this setting to improve seek time. If you use the key cache, be careful not to make this setting too large because the key cache will be overwhelmed. | 64 |
| **BATCH_SIZE_WARN_THRESHOLD_IN_KB** | Warn the operator on a batch size exceeding this value in kilobytes. Caution should be taken when increasing this threshold, as it can lead to node instability. | 5 |
| **BATCH_SIZE_FAIL_THRESHOLD_IN_KB** | Fail batch sizes exceeding this value in kilobytes. Caution should be taken when increasing this threshold, as it can lead to node instability. | 50 |
| **COMPACTION_THROUGHPUT_MB_PER_SEC** | Throttles compaction to the specified total throughput across the node. Compaction frequency varies in direct proportion to write throughput and is necessary to limit SSTable size. The recommended value is 16 to 32 times the rate of write throughput (in MB/second). | 16 |
| **SSTABLE_PREEMPTIVE_OPEN_INTERVAL_IN_MB** | When compacting, the replacement opens SSTables before they are completely written and uses them in place of the prior SSTables for any range already written. This helps to smoothly transfer reads between the SSTables by reducing page cache churn and keeps hot rows hot. | 50 |
| **READ_REQUEST_TIMEOUT_IN_MS** | The time that the coordinator waits for read operations to complete, in ms. | 5000 |
| **RANGE_REQUEST_TIMEOUT_IN_MS** | The time that the coordinator waits for range scans to complete, in ms. | 10000 |
| **WRITE_REQUEST_TIMEOUT_IN_MS** | The time that the coordinator waits for write operations to complete, in ms. | 2000 |
| **COUNTER_WRITE_REQUEST_TIMEOUT_IN_MS** | The time that the coordinator waits for counter write operations to complete, in ms. | 5000 |
| **CAS_CONTENTION_TIMEOUT_IN_MS** | The time for which the coordinator will retry CAS operations on the same row, in ms. | 1000 |
| **TRUNCATE_REQUEST_TIMEOUT_IN_MS** | The time that the coordinator waits for truncate operations to complete, in ms.
| 60000                                                                                                                                                                                             |-| **REQUEST_TIMEOUT_IN_MS**                                | The default timeout for all other requests in ms.                                                                                                                                                                                                                                                                                         | 10000                                                                                                                                                                                             |-| **DYNAMIC_SNITCH_UPDATE_INTERVAL_IN_MS**                 | The time, in ms, the snitch will wait before updating node scores.                                                                                                                                                                                                                                                                        | 100                                                                                                                                                                                               |-| **DYNAMIC_SNITCH_RESET_INTERVAL_IN_MS**                  | The time, in ms, the snitch will wait before resetting node scores allowing bad nodes to recover.                                                                                                                                                                                                                                         
| 600000                                                                                                                                                                                            |-| **DYNAMIC_SNITCH_BADNESS_THRESHOLD**                     | Sets the performance threshold for dynamically routing client requests away from a poorly performing node.                                                                                                                                                                                                                                | 0.1                                                                                                                                                                                               |-| **INTERNODE_COMPRESSION**                                | Controls whether traffic between nodes is compressed. all compresses all traffic. none compresses no traffic. dc compresses between datacenters.                                                                                                                                                                                          | dc                                                                                                                                                                                                |-| **MAX_HINTS_FILE_SIZE_IN_MB**                            | The maximum size of the hints file in Mb.                                                                                                                                                                                                                                                                                                 
| 128                                                                                                                                                                                               |-| **HINTS_FLUSH_PERIOD_IN_MS**                             | The time, in ms, for the period in which hints are flushed to disk.                                                                                                                                                                                                                                                                       | 10000                                                                                                                                                                                             |-| **CONCURRENT_MATERIALIZED_VIEW_WRITES**                  | The maximum number of concurrent writes to materialized views.                                                                                                                                                                                                                                                                            | 32                                                                                                                                                                                                |-| **COMMITLOG_TOTAL_SPACE_IN_MB**                          | The total size of the commit log in Mb.                                                                                                                                                                                                                                                                                                   
|                                                                                                                                                                                                   |-| **AUTO_SNAPSHOT**                                        | Take a snapshot of the data before truncating a keyspace or dropping a table                                                                                                                                                                                                                                                              | True                                                                                                                                                                                              |-| **KEY_CACHE_KEYS_TO_SAVE**                               | The number of keys from the key cache to save                                                                                                                                                                                                                                                                                             |                                                                                                                                                                                                   |-| **ROW_CACHE_KEYS_TO_SAVE**                               | The number of keys from the row cache to save                                                                                                                                                                                                                                                                                             |                                                                                                                                                                                                   |-| **COUNTER_CACHE_KEYS_TO_SAVE**   
                        | The number of keys from the counter cache to save                                                                                                                                                                                                                                                                                         |                                                                                                                                                                                                   |-| **FILE_CACHE_SIZE_IN_MB**                                | The total memory to use for SSTable-reading buffers                                                                                                                                                                                                                                                                                       |                                                                                                                                                                                                   |-| **MEMTABLE_HEAP_SPACE_IN_MB**                            | The amount of on-heap memory allocated for memtables                                                                                                                                                                                                                                                                                      |                                                                                                                                                                                                   |-| **MEMTABLE_OFFHEAP_SPACE_IN_MB**                         | The total amount of off-heap memory allocated for memtables                                                                                                                                                    
                                                                                                                           |                                                                                                                                                                                                   |-| **MEMTABLE_CLEANUP_THRESHOLD**                           | The ratio used for automatic memtable flush                                                                                                                                                                                                                                                                                               |                                                                                                                                                                                                   |-| **MEMTABLE_FLUSH_WRITERS**                               | The number of memtable flush writer threads                                                                                                                                                                                                                                                                                               |                                                                                                                                                                                                   |-| **LISTEN_ON_BROADCAST_ADDRESS**                          | Listen on the address set in broadcast_address property                                                                                                                                                                                                                                                                                   |                                                                                                             
                                                                                      |-| **INTERNODE_AUTHENTICATOR**                              | The internode authentication backend                                                                                                                                                                                                                                                                                                      |                                                                                                                                                                                                   |-| **NATIVE_TRANSPORT_MAX_THREADS**                         | The maximum number of thread handling requests                                                                                                                                                                                                                                                                                            |                                                                                                                                                                                                   |-| **NATIVE_TRANSPORT_MAX_FRAME_SIZE_IN_MB**                | The maximum allowed size of a frame                                                                                                                                                                                                                                                                                                       |                                                                                                                                                                                                   |-| **NATIVE_TRANSPORT_MAX_CONCURRENT_CONNECTIONS**          | The maximum number of concurrent client connections                                 
                                                                                                                                                                                                                                                      |                                                                                                                                                                                                   |-| **NATIVE_TRANSPORT_MAX_CONCURRENT_CONNECTIONS_PER_IP**   | The maximum number of concurrent client connections per source IP address                                                                                                                                                                                                                                                                 |                                                                                                                                                                                                   |-| **RPC_MIN_THREADS**                                      | The minimum thread pool size for remote procedure calls                                                                                                                                                                                                                                                                                   |                                                                                                                                                                                                   |-| **RPC_MAX_THREADS**                                      | The maximum thread pool size for remote procedure calls                                                                                                                                                                                                                                                                      
             |                                                                                                                                                                                                   |-| **RPC_SEND_BUFF_SIZE_IN_BYTES**                          | The sending socket buffer size in bytes for remote procedure calls                                                                                                                                                                                                                                                                        |                                                                                                                                                                                                   |-| **RPC_RECV_BUFF_SIZE_IN_BYTES**                          | The receiving socket buffer size for remote procedure calls                                                                                                                                                                                                                                                                               |                                                                                                                                                                                                   |-| **CONCURRENT_COMPACTORS**                                | The number of concurrent compaction processes allowed to run simultaneously on a node                                                                                                                                                                                                                                                     |                                                                                                                                                                                                   |-| 
**STREAM_THROUGHPUT_OUTBOUND_MEGABITS_PER_SEC**          | The maximum throughput of all outbound streaming file transfers on a node                                                                                                                                                                                                                                                                 |                                                                                                                                                                                                   |-| **INTER_DC_STREAM_THROUGHPUT_OUTBOUND_MEGABITS_PER_SEC** | The maximum throughput of all streaming file transfers between datacenters                                                                                                                                                                                                                                                                |                                                                                                                                                                                                   |-| **STREAMING_KEEP_ALIVE_PERIOD_IN_SECS**                  | Interval to send keep-alive messages. The stream session fails when a keep-alive message is not received for 2 keep-alive cycles.                                                                                                                                                                                                         
|                                                                                                                                                                                                   |-| **PHI_CONVICT_THRESHOLD**                                | The sensitivity of the failure detector on an exponential scale                                                                                                                                                                                                                                                                           |                                                                                                                                                                                                   |-| **BUFFER_POOL_USE_HEAP_IF_EXHAUSTED**                    | Allocate on-heap memory when the SSTable buffer pool is exhausted                                                                                                                                                                                                                                                                         |                                                                                                                                                                                                   |-| **DISK_OPTIMIZATION_STRATEGY**                           | The strategy for optimizing disk reads                                                                                                                                                                                                                                                                                                    |                                                                                                                                                                                                   |-| **MAX_VALUE_SIZE_IN_MB**         
                        | The maximum size of any value in SSTables                                                                                                                                                                                                                                                                                                 |                                                                                                                                                                                                   |-| **OTC_COALESCING_STRATEGY**                              | The strategy to use for coalescing network messages. Values can be: fixed, movingaverage, timehorizon, disabled (default)                                                                                                                                                                                                                 |                                                                                                                                                                                                   |-| **UNLOGGED_BATCH_ACROSS_PARTITIONS_WARN_THRESHOLD**      | Causes Cassandra to log a WARN message on any batches not of type LOGGED that span across more partitions than this limit.                                                                                                                                                                                                                | 10                                                                                                                                                                                                |-| **COMPACTION_LARGE_PARTITION_WARNING_THRESHOLD_MB**      | Cassandra logs a warning when compacting partitions larger than the set value.                                                                                                                                 
                                                                                                                           | 100                                                                                                                                                                                               |-| **REQUEST_SCHEDULER**                                    | The scheduler to handle incoming client requests according to a defined policy. This scheduler is useful for throttling client requests in single clusters containing multiple keyspaces.                                                                                                                                                 | org.apache.cassandra.scheduler.NoScheduler                                                                                                                                                        |-| **INTER_DC_TCP_NODELAY**                                 | Enable this property for inter-datacenter communication.                                                                                                                                                                                                                                                                                  | False                                                                                                                                                                                             |-| **TRACETYPE_QUERY_TTL**                                  | TTL for different trace types used during logging of the query process.                                                                                                                                                                                                                                                                   
| 86400                                                                                                                                                                                             |-| **TRACETYPE_REPAIR_TTL**                                 | TTL for different trace types used during logging of the repair process.                                                                                                                                                                                                                                                                  | 604800                                                                                                                                                                                            |-| **GC_WARN_THRESHOLD_IN_MS**                              | Any GC pause longer than this interval is logged at the WARN level.                                                                                                                                                                                                                                                                       | 1000                                                                                                                                                                                              |-| **WINDOWS_TIMER_INTERVAL**                               | The default Windows kernel timer and scheduling resolution is 15.6ms for power conservation. Lowering this value on Windows can provide much tighter latency and better throughput, however some virtualized environments may see a negative performance impact from changing this setting below their system default.                    
| 1 |
| **COUNTER_CACHE_SAVE_PERIOD** | The amount of time after which Cassandra saves the counter cache (keys only). | 7200 |
| **TRICKLE_FSYNC_INTERVAL_IN_KB** | The size of the fsync in kilobytes. | 10240 |
| **TRICKLE_FSYNC** | When set to true, causes fsync to force the operating system to flush the dirty buffers at the set interval. | False |
| **INCREMENTAL_BACKUPS** | Backs up data updated since the last snapshot was taken. When enabled, Cassandra creates a hard link to each SSTable flushed or streamed locally in a backups subdirectory of the keyspace data. | False |
| **SNAPSHOT_BEFORE_COMPACTION** | Enables or disables taking a snapshot before each compaction. A snapshot is useful to back up data when there is a data format change. | False |
| **CROSS_NODE_TIMEOUT** | Enables operation timeout information exchange between nodes (to accurately measure request timeouts). | False |
| **COMMIT_FAILURE_POLICY** | Policy for commit disk failures. | stop |
| **KEY_CACHE_SIZE_IN_MB** | A global cache setting for the maximum size of the key cache in memory (for all tables). | |
| **COUNTER_CACHE_SIZE_IN_MB** | When no value is set, Cassandra uses the smaller of 2.5% of heap or 50 MB. | |
| **COMMITLOG_SYNC** | The method that Cassandra uses to acknowledge writes, in milliseconds. | periodic |
| **INDEX_SUMMARY_CAPACITY_IN_MB** | Fixed memory pool size in MB for SSTable index summaries. | |
| **RPC_SERVER_TYPE** | Cassandra provides three options for the RPC server. sync and hsha performance is about the same, but hsha uses less memory. | sync |
| **ENDPOINT_SNITCH** | Set to a class that implements the IEndpointSnitch interface. Cassandra uses the snitch to locate nodes and route requests. | SimpleSnitch |
| **DISK_FAILURE_POLICY** | The policy for how Cassandra responds to disk failure. | stop |
| **ENABLE_USER_DEFINED_FUNCTIONS** | User-defined functions (UDFs) present a security risk, since they are executed on the server side. UDFs are executed in a sandbox to contain the execution of malicious code. | False |
| **ENABLE_SCRIPTED_USER_DEFINED_FUNCTIONS** | Java UDFs are always enabled if enable_user_defined_functions is true. Enable this option to use UDFs with language javascript or any custom JSR-223 provider. This option has no effect if enable_user_defined_functions is false. | False |
| **ENABLE_MATERIALIZED_VIEWS** | Enables materialized view creation on this node. Materialized views are considered experimental and are not recommended for production use. | False |
| **CDC_ENABLED** | Enables or disables CDC functionality on a per-node basis. This modifies the logic used for write path allocation rejection. | False |
| **CDC_TOTAL_SPACE_IN_MB** | Total space to use for change-data-capture (CDC) logs on disk. | |
| **CDC_FREE_SPACE_CHECK_INTERVAL_MS** | Interval between checks for new available space for CDC-tracked tables when the cdc_total_space_in_mb threshold is reached and the CDCCompactor is running behind or experiencing back pressure. | |
| **PREPARED_STATEMENTS_CACHE_SIZE_MB** | Maximum size of the native protocol prepared statement cache. | |
| **THRIFT_PREPARED_STATEMENTS_CACHE_SIZE_MB** | Maximum size of the Thrift prepared statement cache. Leave empty if you do not use Thrift. | |
| **COLUMN_INDEX_CACHE_SIZE_IN_KB** | A threshold for the total size of all index entries for a partition that the database stores in the partition key cache. | 2 |
| **SLOW_QUERY_LOG_TIMEOUT_IN_MS** | How long before a node logs slow queries. Select queries that exceed this value generate an aggregated log message to identify slow queries. To disable, set to 0. | 500 |
| **BACK_PRESSURE_ENABLED** | Enable for the coordinator to apply the specified back pressure strategy to each mutation that is sent to replicas. | False |
| **BACK_PRESSURE_STRATEGY_CLASS_NAME** | The back-pressure strategy applied. The default implementation, RateBasedBackPressure, takes three arguments: high ratio, factor, and flow type, and uses the ratio between incoming mutation responses and outgoing mutation requests. | org.apache.cassandra.net.RateBasedBackPressure |
| **BACK_PRESSURE_STRATEGY_HIGH_RATIO** | When outgoing mutations are below this value, they are rate limited according to the incoming rate decreased by the factor. When above this value, the rate limiting is increased by the factor. | 0.9 |
| **BACK_PRESSURE_STRATEGY_FACTOR** | A number between 1 and 10. Increases or decreases rate limiting. | 5 |
| **BACK_PRESSURE_STRATEGY_FLOW** | The flow speed to apply rate limiting: FAST - rate limited to the speed of the fastest replica. SLOW - rate limited to the speed of the slowest replica. | FAST |
| **ALLOCATE_TOKENS_FOR_KEYSPACE** | Triggers automatic allocation of num_tokens tokens for this node. The allocation algorithm attempts to choose tokens in a way that optimizes replicated load over the nodes in the datacenter for the replication strategy used by the specified keyspace. | |
| **HINTS_DIRECTORY** | Directory where Cassandra should store hints. | |
| **COMMITLOG_DIRECTORY** | When running on magnetic HDD, this should be a separate spindle from the data directories. If not set, the default directory is \$CASSANDRA_HOME/data/commitlog. | |
| **CDC_RAW_DIRECTORY** | CommitLogSegments are moved to this directory on flush if cdc_enabled is true and the segment contains mutations for a CDC-enabled table. | |
| **ROW_CACHE_CLASS_NAME** | Row cache implementation class name. | |
| **SAVED_CACHES_DIRECTORY** | Directory where Cassandra stores saved caches. If not set, the default directory is \$CASSANDRA_HOME/data/saved_caches. | |
| **INTERNODE_SEND_BUFF_SIZE_IN_BYTES** | Set socket buffer size for internode communication. Note that when setting this, the buffer size is limited by net.core.wmem_max; when not set, it is defined by net.ipv4.tcp_wmem. | |
| **INTERNODE_RECV_BUFF_SIZE_IN_BYTES** | Set socket buffer size for internode communication. Note that when setting this, the buffer size is limited by net.core.wmem_max; when not set, it is defined by net.ipv4.tcp_wmem. | |
| **GC_LOG_THRESHOLD_IN_MS** | GC pauses greater than 200 ms will be logged at INFO level. This threshold can be adjusted to minimize logging if necessary. | |
| **OTC_COALESCING_WINDOW_US** | How many microseconds to wait for coalescing. | |
| **OTC_COALESCING_ENOUGH_COALESCED_MESSAGES** | Do not try to coalesce messages if we already got that many messages. This should be more than 2 and less than 128. | |
| **OTC_BACKLOG_EXPIRATION_INTERVAL_MS** | How many milliseconds to wait between two expiration runs on the backlog (queue) of the OutboundTcpConnection. | |
| **REPAIR_SESSION_MAX_TREE_DEPTH** | Limits the maximum Merkle tree depth to avoid consuming too much memory during repairs. | |
| **ENABLE_SASI_INDEXES** | Enables SASI index creation on this node. SASI indexes are considered experimental and are not recommended for production use. | |
| **CUSTOM_CASSANDRA_YAML_BASE64** | Base64-encoded Cassandra properties appended to cassandra.yaml. | |
| **KUBECTL_VERSION** | Version of the 'bitnami/kubectl' image. This image is used for some functionality of the operator. | 1.18.2 |
| **JVM_OPT_AVAILABLE_PROCESSORS** | In a multi-instance deployment, multiple Cassandra instances will independently assume that all CPU processors are available to them. This setting allows you to specify a smaller set of processors and perhaps have affinity. | |
| **JVM_OPT_JOIN_RING** | Set to false to start Cassandra on a node but not have the node join the cluster. | |
| **JVM_OPT_LOAD_RING_STATE** | Set to false to clear all gossip state for the node on restart. Use when you have changed node information in cassandra.yaml (such as listen_address). | |
| **JVM_OPT_REPLAYLIST** | Allows restoring specific tables from an archived commit log. | |
| **JVM_OPT_RING_DELAY_MS** | Allows overriding of the default RING_DELAY (30000 ms), which is the amount of time a node waits before joining the ring. | |
| **JVM_OPT_TRIGGERS_DIR** | Set the default location for the trigger JARs. (Default: conf/triggers) | |
| **JVM_OPT_WRITE_SURVEY** | For testing new compaction and compression strategies. It allows you to experiment with different strategies and benchmark write performance differences without affecting the production workload. | |
| **JVM_OPT_DISABLE_AUTH_CACHES_REMOTE_CONFIGURATION** | Disables configuration via JMX of auth caches (such as those for credentials, permissions, and roles). This means those config options can only be set (persistently) in cassandra.yaml and require a restart for new values to take effect. | |
| **JVM_OPT_FORCE_DEFAULT_INDEXING_PAGE_SIZE** | Disables dynamic calculation of the page size used when indexing an entire partition (during initial index build/rebuild). If set to true, the page size will be fixed to the default of 10000 rows per page. | |
| **JVM_OPT_PREFER_IPV4_STACK** | Prefer binding to IPv4 network interfaces (when net.ipv6.bindv6only=1). See http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6342561 (short version: comment out this entry to enable IPv6 support). | True |
| **JVM_OPT_EXPIRATION_DATE_OVERFLOW_POLICY** | Defines how to handle INSERT requests with TTL exceeding the maximum supported expiration date. | |
| **JVM_OPT_THREAD_PRIORITY_POLICY** | Allows lowering thread priority without being root on Linux; probably not necessary on Windows but doesn't harm anything. | 42 |
| **JVM_OPT_THREAD_STACK_SIZE** | Per-thread stack size. | 256k |
| **JVM_OPT_STRING_TABLE_SIZE** | Larger interned string table, for gossip's benefit (CASSANDRA-6410). | 1000003 |
| **JVM_OPT_SURVIVOR_RATIO** | CMS Settings: SurvivorRatio | 8 |
| **JVM_OPT_MAX_TENURING_THRESHOLD** | CMS Settings: MaxTenuringThreshold | 1 |
| **JVM_OPT_CMS_INITIATING_OCCUPANCY_FRACTION** | CMS Settings: CMSInitiatingOccupancyFraction | 75 |
| **JVM_OPT_CMS_WAIT_DURATION** | CMS Settings: CMSWaitDuration | 10000 |
| **JVM_OPT_NUMBER_OF_GC_LOG_FILES** | GC logging options: NumberOfGCLogFiles
                                                                                                                                                            | 10                                                                                                                                                                                                |-| **JVM_OPT_GC_LOG_FILE_SIZE**                             | GC logging options: GCLOGFILESIZE                                                                                                                                                                                                                                                                                                         | 10M                                                                                                                                                                                               |-| **JVM_OPT_GC_LOG_DIRECTORY**                             | GC logging options: GC_LOG_DIRECTORY                                                                                                                                                                                                                                                                                                      |                                                                                                                                                                                                   |-| **JVM_OPT_PRINT_FLS_STATISTICS**                         | GC logging options: PrintFLSStatistics                                                                                                                                                                                                                                                                                                    |                                                                            
                                                                                                                       |-| **JVM_OPT_CONC_GC_THREADS**                              | By default, ConcGCThreads is 1/4 of ParallelGCThreads. Setting both to the same value can reduce STW durations.                                                                                                                                                                                                                           |                                                                                                                                                                                                   |-| **JVM_OPT_INITIATING_HEAP_OCCUPANCY_PERCENT**            | Save CPU time on large (>= 16GB) heaps by delaying region scanning until the heap is 70% full. The default in Hotspot 8u40 is 40%.                                                                                                                                                                                                        |                                                                                                                                                                                                   |-| **JVM_OPT_MAX_GC_PAUSE_MILLIS**                          | Main G1GC tunable: lowering the pause target will lower throughput and vise versa.                                                                                                                                                                                                                                                        
|                                                                                                                                                                                                   |-| **JVM_OPT_G1R_SET_UPDATING_PAUSE_TIME_PERCENT**          | Have the JVM do less remembered set work during STW, instead preferring concurrent GC. Reduces p99.9 latency.                                                                                                                                                                                                                             |                                                                                                                                                                                                   |-| **CUSTOM_JVM_OPTIONS_BASE64**                            | Base64-encoded JVM options appended to jvm.options.                                                                                                                                                                                                                                                                                       |                                                                                                                                                                                                   |-| **POD_MANAGEMENT_POLICY**                                | podManagementPolicy of the Cassandra Statefulset                                                                                                                                                                                                                                                                                          | OrderedReady                                                                                                                                                                                      |-| **REPAIR_POD**                   
                        | Name of the pod on which 'nodetool repair' should be run.                                                                                                                                                                                                                                                                                 |                                                                                                                                                                                                   |+|                           Name                           |                                                                                                                                                                Description                                                                                                                                                                |                                                                                              Default                                                                                              |

The CI uses 0.51.0; I downgraded locally.

nfnt

comment created time in 3 days

push event mesosphere/kudo-cassandra-operator

Jan Schlicht

commit sha cd53a6a864bac3d04004c77bac7a120977502561

Generate parameters table using 'pytablewriter' 0.51.0 Signed-off-by: Jan Schlicht <jan@d2iq.com>

view details

push time in 3 days

push event mesosphere/kudo-cassandra-operator

Jan Schlicht

commit sha cc5dffd4cfe5822d6ddce3718711480735e23eea

Update documentation Signed-off-by: Jan Schlicht <jan@d2iq.com>

view details

push time in 3 days

push event mesosphere/kudo-cassandra-operator

Jan Schlicht

commit sha 0088e78f6ddba5c28842ad1a0d44d03ae2f10ac6

Update documentation Signed-off-by: Jan Schlicht <jan@d2iq.com>

view details

push time in 3 days

Pull request review comment mesosphere/kudo-cassandra-operator

Disable the Prometheus exporter by default

# Parameters

| Name | Description | Default |
| --- | --- | --- |
| **NODE_COUNT** | Number of Cassandra nodes. | 3 |
| **NODE_CPU_MC** | CPU request (in millicores) for the Cassandra node containers. | 1000 |
| **NODE_CPU_LIMIT_MC** | CPU limit (in millicores) for the Cassandra node containers. | 1000 |
| **NODE_MEM_MIB** | Memory request (in MiB) for the Cassandra node containers. | 4096 |
| **NODE_MEM_LIMIT_MIB** | Memory limit (in MiB) for the Cassandra node containers. | 4096 |
| **NODE_DISK_SIZE_GIB** | Disk size (in GiB) for the Cassandra node containers. | 20 |
| **NODE_STORAGE_CLASS** | The storage class to be used in volumeClaimTemplates. By default, it is not required and the default storage class is used. | |
| **NODE_DOCKER_IMAGE** | Cassandra node Docker image. | mesosphere/cassandra:3.11.6-0.1.2-SNAPSHOT |
| **NODE_DOCKER_IMAGE_PULL_POLICY** | Cassandra node Docker image pull policy. | Always |
| **NODE_READINESS_PROBE_INITIAL_DELAY_S** | Number of seconds after the container has started before the readiness probe is initiated. | 0 |
| **NODE_READINESS_PROBE_PERIOD_S** | How often (in seconds) to perform the readiness probe. | 5 |
| **NODE_READINESS_PROBE_TIMEOUT_S** | How long (in seconds) to wait for a readiness probe to succeed. | 60 |
| **NODE_READINESS_PROBE_SUCCESS_THRESHOLD** | Minimum consecutive successes for the readiness probe to be considered successful after having failed. | 1 |
| **NODE_READINESS_PROBE_FAILURE_THRESHOLD** | When a pod starts and the readiness probe fails, `failure_threshold` attempts will be made before marking the pod as 'unready'. | 3 |
| **NODE_LIVENESS_PROBE_INITIAL_DELAY_S** | Number of seconds after the container has started before the liveness probe is initiated. | 15 |
| **NODE_LIVENESS_PROBE_PERIOD_S** | How often (in seconds) to perform the liveness probe. | 20 |
| **NODE_LIVENESS_PROBE_TIMEOUT_S** | How long (in seconds) to wait for a liveness probe to succeed. | 60 |
| **NODE_LIVENESS_PROBE_SUCCESS_THRESHOLD** | Minimum consecutive successes for the liveness probe to be considered successful after having failed. | 1 |
| **NODE_LIVENESS_PROBE_FAILURE_THRESHOLD** | When a pod starts and the liveness probe fails, `failure_threshold` attempts will be made before restarting the pod. | 3 |
| **OVERRIDE_CLUSTER_NAME** | Override the name of the Cassandra cluster set by the operator. This shouldn't be explicitly set unless you know what you're doing. | |
| **EXTERNAL_SERVICE** | Needs to be true for either EXTERNAL_NATIVE_TRANSPORT or EXTERNAL_RPC to work. | False |
| **EXTERNAL_NATIVE_TRANSPORT** | Exposes the Cassandra cluster via an external service so it can be accessed from outside the Kubernetes cluster. | False |
| **EXTERNAL_RPC** | Exposes the Cassandra cluster via an external service so it can be accessed from outside the Kubernetes cluster. Works only if START_RPC is true. | False |
| **EXTERNAL_NATIVE_TRANSPORT_PORT** | The external port to use for the Cassandra native transport protocol. | 9042 |
| **EXTERNAL_RPC_PORT** | The external port to use for the Cassandra RPC protocol. | 9160 |
| **RECOVERY_CONTROLLER** | Needs to be true for automatic failure recovery and node eviction. | False |
| **RECOVERY_CONTROLLER_DOCKER_IMAGE** | Recovery controller Docker image. | mesosphere/kudo-cassandra-recovery:0.0.2-0.1.2-SNAPSHOT |
| **RECOVERY_CONTROLLER_DOCKER_IMAGE_PULL_POLICY** | Recovery controller Docker image pull policy. | Always |
| **MAX_UNAVAILABLE_NODES** | Maximum number of nodes that are allowed to be down, either for restarts or from an unscheduled outage. See PodDisruptionBudget. | 1 |
| **BOOTSTRAP_TIMEOUT** | Timeout for the bootstrap binary to join the cluster with the new IP. Valid time units are 'ns', 'us', 'ms', 's', 'm', 'h'. | 12h30m |
| **SHUTDOWN_OLD_REACHABLE_NODE** | When a node replace is done, try to connect to the old node and shut it down before starting up the new node. | False |
| **BACKUP_RESTORE_ENABLED** | Global flag that enables the Medusa sidecar for backups. | False |
| **BACKUP_TRIGGER** | Trigger parameter to start a backup. Simply needs to be changed from the current value to start a backup. | 1 |
| **BACKUP_AWS_CREDENTIALS_SECRET** | If set, can be used to provide the access_key, secret_key and security_token with a secret. | |
| **BACKUP_AWS_S3_BUCKET_NAME** | The name of the AWS S3 bucket to store the backups. | |
| **BACKUP_AWS_S3_STORAGE_PROVIDER** | Should be one of the s3\_\* values from https://github.com/apache/libcloud/blob/trunk/libcloud/storage/types.py | s3_us_west_oregon |
| **BACKUP_PREFIX** | A prefix to be used inside the S3 bucket. | |
| **BACKUP_MEDUSA_CPU_MC** | CPU request (in millicores) for the Medusa backup containers. | 100 |
| **BACKUP_MEDUSA_CPU_LIMIT_MC** | CPU limit (in millicores) for the Medusa backup containers. | 500 |
| **BACKUP_MEDUSA_MEM_MIB** | Memory request (in MiB) for the Medusa backup containers. | 256 |
| **BACKUP_MEDUSA_MEM_LIMIT_MIB** | Memory limit (in MiB) for the Medusa backup containers. | 512 |
| **BACKUP_MEDUSA_DOCKER_IMAGE** | Medusa backup Docker image. | mesosphere/kudo-cassandra-medusa:0.5.1-0.1.2-SNAPSHOT |
| **BACKUP_MEDUSA_DOCKER_IMAGE_PULL_POLICY** | Medusa backup Docker image pull policy. | Always |
| **BACKUP_NAME** | The name of the backup to create or restore. | |
| **RESTORE_FLAG**
                        | If true, a restore is done on installation                                                                                                                                                                                                                                                                                                | False                                                                                                                                                                                             |-| **RESTORE_OLD_NAMESPACE**                                | The namespace from the operator that was used to create the backup                                                                                                                                                                                                                                                                        |                                                                                                                                                                                                   |-| **RESTORE_OLD_NAME**                                     | The instance name from the operator that was used to create the backup                                                                                                                                                                                                                                                                    |                                                                                                                                                                                                   |-| **NODE_TOPOLOGY**                                        | This describes a multi-datacenter setup. When set it has precedence over NODE_COUNT. See docs/multidatacenter.md for more details.                                                                             
                                                                                                                           |                                                                                                                                                                                                   |-| **NODE_ANTI_AFFINITY**                                   | Ensure that every Cassandra node is deployed on separate hosts                                                                                                                                                                                                                                                                            | False                                                                                                                                                                                             |-| **SERVICE_ACCOUNT_INSTALL**                              | This flag can be set to true to automatic installation of a cluster role, service account and role binding                                                                                                                                                                                                                                | False                                                                                                                                                                                             |-| **EXTERNAL_SEED_NODES**                                  | List of seed nodes external to this instance to add to the cluster. This allows clusters spanning multiple Kubernetes clusters.                                                                                                                                                                                                           
|                                                                                                                                                                                                   |-| **PROMETHEUS_EXPORTER_ENABLED**                          |                                                                                                                                                                                                                                                                                                                                           | True                                                                                                                                                                                              |-| **PROMETHEUS_EXPORTER_PORT**                             | Prometheus exporter port.                                                                                                                                                                                                                                                                                                                 | 7200                                                                                                                                                                                              |-| **PROMETHEUS_EXPORTER_CPU_MC**                           | CPU request (in millicores) for the Prometheus exporter containers.                                                                                                                                                                                                                                                                       
| 500                                                                                                                                                                                               |-| **PROMETHEUS_EXPORTER_CPU_LIMIT_MC**                     | CPU limit (in millicores) for the Prometheus exporter containers.                                                                                                                                                                                                                                                                         | 1000                                                                                                                                                                                              |-| **PROMETHEUS_EXPORTER_MEM_MIB**                          | Memory request (in MiB) for the Prometheus exporter containers.                                                                                                                                                                                                                                                                           | 512                                                                                                                                                                                               |-| **PROMETHEUS_EXPORTER_MEM_LIMIT_MIB**                    | Memory limit (in MiB) for the Prometheus exporter containers.                                                                                                                                                                                                                                                                             
| 512                                                                                                                                                                                               |-| **PROMETHEUS_EXPORTER_DOCKER_IMAGE**                     | Prometheus exporter Docker image.                                                                                                                                                                                                                                                                                                         | mesosphere/cassandra-prometheus-exporter:2.3.4-0.1.2-SNAPSHOT                                                                                                                                     |-| **PROMETHEUS_EXPORTER_DOCKER_IMAGE_PULL_POLICY**         | Prometheus exporter Docker image pull policy.                                                                                                                                                                                                                                                                                             | Always                                                                                                                                                                                            |-| **PROMETHEUS_EXPORTER_CUSTOM_CONFIG_CM_NAME**            | The properties present in this configmap will be appended to the prometheus configuration properties                                                                                                                                                                                                                                      |                                                                                                                                                                                                   |-| **STORAGE_PORT**                 
                        | The port for inter-node communication.                                                                                                                                                                                                                                                                                                    | 7000                                                                                                                                                                                              |-| **SSL_STORAGE_PORT**                                     | The port for inter-node communication over SSL.                                                                                                                                                                                                                                                                                           | 7001                                                                                                                                                                                              |-| **NATIVE_TRANSPORT_PORT**                                | The port for CQL communication.                                                                                                                                                                                                                                                                                                           | 9042                                                                                                                                                                                              |-| **RPC_PORT**                                             | The port for Thrift RPC communication.                                                                                                                                                                         
                                                                                                                           | 9160                                                                                                                                                                                              |-| **JMX_PORT**                                             | The JMX port that will be used to interface with the Cassandra application.                                                                                                                                                                                                                                                               | 7199                                                                                                                                                                                              |-| **RMI_PORT**                                             | The RMI port that will be used to interface with the Cassandra application when TRANSPORT_ENCRYPTION_ENABLED is set.                                                                                                                                                                                                                      
| 7299                                                                                                                                                                                              |-| **JMX_LOCAL_ONLY**                                       | If true, the JMX port will only be opened on localhost and not be available to the cluster                                                                                                                                                                                                                                                | True                                                                                                                                                                                              |-| **TRANSPORT_ENCRYPTION_ENABLED**                         | Enable node-to-node encryption.                                                                                                                                                                                                                                                                                                           | False                                                                                                                                                                                             |-| **TRANSPORT_ENCRYPTION_CLIENT_ENABLED**                  | Enable client-to-node encryption.                                                                                                                                                                                                                                                                                                         
| False                                                                                                                                                                                             |-| **TRANSPORT_ENCRYPTION_CIPHERS**                         | Comma-separated list of JSSE Cipher Suite Names.                                                                                                                                                                                                                                                                                          | TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA |-| **TRANSPORT_ENCRYPTION_CLIENT_ALLOW_PLAINTEXT**          | Enable Server-Client plaintext communication alongside encrypted traffic.                                                                                                                                                                                                                                                                 | False                                                                                                                                                                                             |-| **TRANSPORT_ENCRYPTION_REQUIRE_CLIENT_AUTH**             | Enable client certificate authentication on node-to-node transport encryption.                                                                                                                                                                                                                                                            
| True                                                                                                                                                                                              |-| **TRANSPORT_ENCRYPTION_CLIENT_REQUIRE_CLIENT_AUTH**      | Enable client certificate authentication on client-to-node transport encryption.                                                                                                                                                                                                                                                          | True                                                                                                                                                                                              |-| **TLS_SECRET_NAME**                                      | The TLS secret that contains the self-signed certificate (cassandra.crt) and the private key (cassandra.key). The secret will be mounted as a volume to make the artifacts available.                                                                                                                                                     | cassandra-tls                                                                                                                                                                                     |-| **NODE_MIN_HEAP_SIZE_MB**                                | The minimum JVM heap size in MB. This has a smart default and doesn't need to be explicitly set.                                                                                                                                                                                                                                          
|                                                                                                                                                                                                   |-| **NODE_MAX_HEAP_SIZE_MB**                                | The maximum JVM heap size in MB. This has a smart default and doesn't need to be explicitly set.                                                                                                                                                                                                                                          |                                                                                                                                                                                                   |-| **NODE_NEW_GENERATION_HEAP_SIZE_MB**                     | The JVM new generation heap size in MB.                                                                                                                                                                                                                                                                                                   |                                                                                                                                                                                                   |-| **SEED_PROVIDER_CLASS**                                  | The class within Cassandra that handles the seed logic.                                                                                                                                                                                                                                                                                   
| org.apache.cassandra.locator.SimpleSeedProvider                                                                                                                                                   |-| **NUM_TOKENS**                                           | The number of tokens assigned to each node.                                                                                                                                                                                                                                                                                               | 256                                                                                                                                                                                               |-| **HINTED_HANDOFF_ENABLED**                               | If true, hinted handoff is enabled for the cluster.                                                                                                                                                                                                                                                                                       | True                                                                                                                                                                                              |-| **MAX_HINT_WINDOW_IN_MS**                                | The maximum amount of time, in ms, that hints are generated for an unresponsive node.                                                                                                                                                                                                                                                     
| 10800000                                                                                                                                                                                          |-| **HINTED_HANDOFF_THROTTLE_IN_KB**                        | The maximum throttle per delivery thread in KBs per second.                                                                                                                                                                                                                                                                               | 1024                                                                                                                                                                                              |-| **MAX_HINTS_DELIVERY_THREADS**                           | The maximum number of delivery threads for hinted handoff.                                                                                                                                                                                                                                                                                | 2                                                                                                                                                                                                 |-| **BATCHLOG_REPLAY_THROTTLE_IN_KB**                       | The total maximum throttle for replaying failed logged batches in KBs per second.                                                                                                                                                                                                                                                         
| 1024                                                                                                                                                                                              |-| **AUTHENTICATOR**                                        | Authentication backend, implementing IAuthenticator; used to identify users.                                                                                                                                                                                                                                                              | AllowAllAuthenticator                                                                                                                                                                             |-| **AUTHENTICATION_SECRET_NAME**                           | Name of the secret containing the credentials used by the operator when running 'nodetool' for its functionality. Only relevant if AUTHENTICATOR is set to 'PasswordAuthenticator'. The secret needs to have a 'username' and a 'password' entry.                                                                                         |                                                                                                                                                                                                   |-| **AUTHORIZER**                                           | Authorization backend, implementing IAuthorizer; used to limit access/provide permissions.                                                                                                                                                                                                                                                
| AllowAllAuthorizer |
| **ROLE_MANAGER** | Part of the Authentication & Authorization backend that implements IRoleManager to maintain grants and memberships between roles. By default, the value is Apache Cassandra's out-of-the-box role manager: CassandraRoleManager. | CassandraRoleManager |
| **ROLES_VALIDITY_IN_MS** | Validity period for the roles cache; set to 0 to disable. | 2000 |
| **ROLES_UPDATE_INTERVAL_IN_MS** | After this interval, cache entries become eligible for refresh. Upon next access, Cassandra schedules an async reload and returns the old value until the reload completes. If roles_validity_in_ms is non-zero, this must be non-zero as well. | |
| **CREDENTIALS_VALIDITY_IN_MS** | This cache is tightly coupled to the provided PasswordAuthenticator implementation of IAuthenticator. If another IAuthenticator implementation is configured, Cassandra does not use this cache and these settings have no effect. Set to 0 to disable. | 2000 |
| **CREDENTIALS_UPDATE_INTERVAL_IN_MS** | After this interval, cache entries become eligible for refresh. The next time the cache is accessed, the system schedules an asynchronous reload of the cache. Until this reload is complete, the cache returns the old values. If credentials_validity_in_ms is non-zero, this property must also be non-zero. | |
| **PERMISSIONS_VALIDITY_IN_MS** | How many milliseconds permissions in cache remain valid. Fetching permissions can be resource intensive. To disable the cache, set this to 0. | 2000 |
| **PERMISSIONS_UPDATE_INTERVAL_IN_MS** | If enabled, sets the refresh interval for the permissions cache. After this interval, cache entries become eligible for refresh. On next access, Cassandra schedules an async reload and returns the old value until the reload completes. If permissions_validity_in_ms is non-zero, permissions_update_interval_in_ms must also be non-zero. | |
| **PARTITIONER** | The partitioner used to distribute rows across the cluster. Murmur3Partitioner is the recommended setting. RandomPartitioner and ByteOrderedPartitioner are supported for legacy applications. | org.apache.cassandra.dht.Murmur3Partitioner |
| **KEY_CACHE_SAVE_PERIOD** | The duration in seconds that keys are saved in cache. Saved caches greatly improve cold-start speeds and have relatively little effect on I/O. | 14400 |
| **ROW_CACHE_SIZE_IN_MB** | Maximum size of the row cache in memory. The row cache can save more time than key_cache_size_in_mb, but is space-intensive because it contains the entire row. Use the row cache only for hot rows or static rows. 0 disables the row cache. | 0 |
| **ROW_CACHE_SAVE_PERIOD** | Duration in seconds that rows are saved in cache. 0 disables caching. | 0 |
| **COMMITLOG_SYNC_PERIOD_IN_MS** | The number of milliseconds between disk fsync calls. | 10000 |
| **COMMITLOG_SYNC_BATCH_WINDOW_IN_MS** | Time to wait between batch fsyncs. If commitlog_sync is in batch mode, the recommended value is 2. | |
| **COMMITLOG_SEGMENT_SIZE_IN_MB** | The size of each commit log segment in MB. | 32 |
| **CONCURRENT_READS** | For workloads with more data than can fit in memory, the bottleneck is reads fetching data from disk. Setting this to (16 times the number of drives) allows operations to queue low enough in the stack so that the OS and drives can reorder them. | 16 |
| **CONCURRENT_WRITES** | Writes in Cassandra are rarely I/O bound, so the ideal number of concurrent writes depends on the number of CPU cores in your system. The recommended value is 8 times the number of CPU cores. | 32 |
| **CONCURRENT_COUNTER_WRITES** | Counter writes read the current values before incrementing and writing them back. The recommended value is (16 times the number of drives). | 16 |
| **MEMTABLE_ALLOCATION_TYPE** | The type of allocations for the Cassandra memtable. heap_buffers keeps all data on the JVM heap. offheap_buffers may reduce heap utilization for large string or binary values. offheap_objects may improve heap size for small integers or UUIDs as well. Both off-heap options increase read latency. | heap_buffers |
| **INDEX_SUMMARY_RESIZE_INTERVAL_IN_MINUTES** | How frequently index summaries should be re-sampled, in minutes. This is done periodically to redistribute memory from the fixed-size pool to SSTables proportional to their recent read rates. | 60 |
| **START_NATIVE_TRANSPORT** | If true, CQL is enabled. | True |
| **START_RPC** | If true, Thrift RPC is enabled. This is deprecated but may be necessary for legacy applications. | False |
| **RPC_KEEPALIVE** | Enables or disables keepalive on client connections (RPC or native). | True |
| **THRIFT_FRAMED_TRANSPORT_SIZE_IN_MB** | Frame size (maximum field length) for Thrift. | 15 |
| **TOMBSTONE_WARN_THRESHOLD** | The maximum number of tombstones a query can scan before warning. | 1000 |
| **TOMBSTONE_FAILURE_THRESHOLD** | The maximum number of tombstones a query can scan before aborting. | 100000 |
| **COLUMN_INDEX_SIZE_IN_KB** | The granularity of the index of rows within a partition. For huge rows, decrease this setting to improve seek time. If you use the key cache, be careful not to make this setting too large, because the key cache will be overwhelmed. | 64 |
| **BATCH_SIZE_WARN_THRESHOLD_IN_KB** | Warn the operator on a batch size exceeding this value in kilobytes. Take caution when increasing this threshold, as it can lead to node instability. | 5 |
| **BATCH_SIZE_FAIL_THRESHOLD_IN_KB** | Fail batch sizes exceeding this value in kilobytes. Take caution when increasing this threshold, as it can lead to node instability. | 50 |
| **COMPACTION_THROUGHPUT_MB_PER_SEC** | Throttles compaction to the specified total throughput across the node. Compaction frequency varies in direct proportion to write throughput and is necessary to limit SSTable size. The recommended value is 16 to 32 times the rate of write throughput (in MB/second). | 16 |
| **SSTABLE_PREEMPTIVE_OPEN_INTERVAL_IN_MB** | When compacting, the replacement opens SSTables before they are completely written and uses them in place of the prior SSTables for any range previously written. This setting helps to smoothly transfer reads between the SSTables by reducing page cache churn, and keeps hot rows hot. | 50 |
| **READ_REQUEST_TIMEOUT_IN_MS** | The time, in ms, that the coordinator waits for read operations to complete. | 5000 |
| **RANGE_REQUEST_TIMEOUT_IN_MS** | The time, in ms, that the coordinator waits for range scans to complete. | 10000 |
| **WRITE_REQUEST_TIMEOUT_IN_MS** | The time, in ms, that the coordinator waits for write operations to complete. | 2000 |
| **COUNTER_WRITE_REQUEST_TIMEOUT_IN_MS** | The time, in ms, that the coordinator waits for counter write operations to complete. | 5000 |
| **CAS_CONTENTION_TIMEOUT_IN_MS** | The time, in ms, for which the coordinator will retry CAS operations on the same row. | 1000 |
| **TRUNCATE_REQUEST_TIMEOUT_IN_MS** | The time, in ms, that the coordinator waits for truncate operations to complete. | 60000 |
| **REQUEST_TIMEOUT_IN_MS** | The default timeout for all other requests, in ms. | 10000 |
| **DYNAMIC_SNITCH_UPDATE_INTERVAL_IN_MS** | The time, in ms, the snitch will wait before updating node scores. | 100 |
| **DYNAMIC_SNITCH_RESET_INTERVAL_IN_MS** | The time, in ms, the snitch will wait before resetting node scores, allowing bad nodes to recover. | 600000 |
| **DYNAMIC_SNITCH_BADNESS_THRESHOLD** | Sets the performance threshold for dynamically routing client requests away from a poorly performing node. | 0.1 |
| **INTERNODE_COMPRESSION** | Controls whether traffic between nodes is compressed. all compresses all traffic, none compresses no traffic, dc compresses traffic between datacenters. | dc |
| **MAX_HINTS_FILE_SIZE_IN_MB** | The maximum size of the hints file in MB. | 128 |
| **HINTS_FLUSH_PERIOD_IN_MS** | The period, in ms, at which hints are flushed to disk. | 10000 |
| **CONCURRENT_MATERIALIZED_VIEW_WRITES** | The maximum number of concurrent writes to materialized views. | 32 |
| **COMMITLOG_TOTAL_SPACE_IN_MB** | The total size of the commit log in MB. | |
| **AUTO_SNAPSHOT** | Take a snapshot of the data before truncating a keyspace or dropping a table. | True |
| **KEY_CACHE_KEYS_TO_SAVE** | The number of keys from the key cache to save. | |
| **ROW_CACHE_KEYS_TO_SAVE** | The number of keys from the row cache to save. | |
| **COUNTER_CACHE_KEYS_TO_SAVE** | The number of keys from the counter cache to save. | |
| **FILE_CACHE_SIZE_IN_MB** | The total memory to use for SSTable-reading buffers. | |
| **MEMTABLE_HEAP_SPACE_IN_MB** | The amount of on-heap memory allocated for memtables. | |
| **MEMTABLE_OFFHEAP_SPACE_IN_MB** | The total amount of off-heap memory allocated for memtables. | |
| **MEMTABLE_CLEANUP_THRESHOLD** | The ratio used for automatic memtable flush. | |
| **MEMTABLE_FLUSH_WRITERS** | The number of memtable flush writer threads. | |
| **LISTEN_ON_BROADCAST_ADDRESS** | Listen on the address set in the broadcast_address property. | |
| **INTERNODE_AUTHENTICATOR** | The internode authentication backend. | |
| **NATIVE_TRANSPORT_MAX_THREADS** | The maximum number of threads handling requests. | |
| **NATIVE_TRANSPORT_MAX_FRAME_SIZE_IN_MB** | The maximum allowed size of a frame. | |
| **NATIVE_TRANSPORT_MAX_CONCURRENT_CONNECTIONS** | The maximum number of concurrent client connections. | |
| **NATIVE_TRANSPORT_MAX_CONCURRENT_CONNECTIONS_PER_IP** | The maximum number of concurrent client connections per source IP address. | |
| **RPC_MIN_THREADS** | The minimum thread pool size for remote procedure calls. | |
| **RPC_MAX_THREADS** | The maximum thread pool size for remote procedure calls. | |
| **RPC_SEND_BUFF_SIZE_IN_BYTES** | The sending socket buffer size in bytes for remote procedure calls. | |
| **RPC_RECV_BUFF_SIZE_IN_BYTES** | The receiving socket buffer size for remote procedure calls. | |
| **CONCURRENT_COMPACTORS** | The number of concurrent compaction processes allowed to run simultaneously on a node. | |
| **STREAM_THROUGHPUT_OUTBOUND_MEGABITS_PER_SEC** | The maximum throughput of all outbound streaming file transfers on a node. | |
| **INTER_DC_STREAM_THROUGHPUT_OUTBOUND_MEGABITS_PER_SEC** | The maximum throughput of all streaming file transfers between datacenters. | |
| **STREAMING_KEEP_ALIVE_PERIOD_IN_SECS** | Interval to send keep-alive messages. The stream session fails when a keep-alive message is not received for 2 keep-alive cycles. |
|                                                                                                                                                                                                   |-| **PHI_CONVICT_THRESHOLD**                                | The sensitivity of the failure detector on an exponential scale                                                                                                                                                                                                                                                                           |                                                                                                                                                                                                   |-| **BUFFER_POOL_USE_HEAP_IF_EXHAUSTED**                    | Allocate on-heap memory when the SSTable buffer pool is exhausted                                                                                                                                                                                                                                                                         |                                                                                                                                                                                                   |-| **DISK_OPTIMIZATION_STRATEGY**                           | The strategy for optimizing disk reads                                                                                                                                                                                                                                                                                                    |                                                                                                                                                                                                   |-| **MAX_VALUE_SIZE_IN_MB**         
                        | The maximum size of any value in SSTables                                                                                                                                                                                                                                                                                                 |                                                                                                                                                                                                   |-| **OTC_COALESCING_STRATEGY**                              | The strategy to use for coalescing network messages. Values can be: fixed, movingaverage, timehorizon, disabled (default)                                                                                                                                                                                                                 |                                                                                                                                                                                                   |-| **UNLOGGED_BATCH_ACROSS_PARTITIONS_WARN_THRESHOLD**      | Causes Cassandra to log a WARN message on any batches not of type LOGGED that span across more partitions than this limit.                                                                                                                                                                                                                | 10                                                                                                                                                                                                |-| **COMPACTION_LARGE_PARTITION_WARNING_THRESHOLD_MB**      | Cassandra logs a warning when compacting partitions larger than the set value.                                                                                                                                 
                                                                                                                           | 100                                                                                                                                                                                               |-| **REQUEST_SCHEDULER**                                    | The scheduler to handle incoming client requests according to a defined policy. This scheduler is useful for throttling client requests in single clusters containing multiple keyspaces.                                                                                                                                                 | org.apache.cassandra.scheduler.NoScheduler                                                                                                                                                        |-| **INTER_DC_TCP_NODELAY**                                 | Enable this property for inter-datacenter communication.                                                                                                                                                                                                                                                                                  | False                                                                                                                                                                                             |-| **TRACETYPE_QUERY_TTL**                                  | TTL for different trace types used during logging of the query process.                                                                                                                                                                                                                                                                   
| 86400                                                                                                                                                                                             |-| **TRACETYPE_REPAIR_TTL**                                 | TTL for different trace types used during logging of the repair process.                                                                                                                                                                                                                                                                  | 604800                                                                                                                                                                                            |-| **GC_WARN_THRESHOLD_IN_MS**                              | Any GC pause longer than this interval is logged at the WARN level.                                                                                                                                                                                                                                                                       | 1000                                                                                                                                                                                              |-| **WINDOWS_TIMER_INTERVAL**                               | The default Windows kernel timer and scheduling resolution is 15.6ms for power conservation. Lowering this value on Windows can provide much tighter latency and better throughput, however some virtualized environments may see a negative performance impact from changing this setting below their system default.                    
| 1                                                                                                                                                                                                 |-| **COUNTER_CACHE_SAVE_PERIOD**                            | the amount of time after which Cassandra saves the counter cache (keys only).                                                                                                                                                                                                                                                             | 7200                                                                                                                                                                                              |-| **TRICKLE_FSYNC_INTERVAL_IN_KB**                         | The size of the fsync in kilobytes.                                                                                                                                                                                                                                                                                                       | 10240                                                                                                                                                                                             |-| **TRICKLE_FSYNC**                                        | When set to true, causes fsync to force the operating system to flush the dirty buffers at the set interval                                                                                                                                                                                                                               | False                                                                                                                                                                                             |-| **INCREMENTAL_BACKUPS**          
                        | Backs up data updated since the last snapshot was taken. When enabled, Cassandra creates a hard link to each SSTable flushed or streamed locally in a backups subdirectory of the keyspace data.                                                                                                                                          | False                                                                                                                                                                                             |-| **SNAPSHOT_BEFORE_COMPACTION**                           | Enables or disables taking a snapshot before each compaction. A snapshot is useful to back up data when there is a data format change.                                                                                                                                                                                                    | False                                                                                                                                                                                             |-| **CROSS_NODE_TIMEOUT**                                   | operation timeout information exchange between nodes (to accurately measure request timeouts).                                                                                                                                                                                                                                            | False                                                                                                                                                                                             |-| **COMMIT_FAILURE_POLICY**                                | Policy for commit disk failures.                                                                                                                                                                               
                                                                                                                           | stop                                                                                                                                                                                              |-| **KEY_CACHE_SIZE_IN_MB**                                 | A global cache setting for the maximum size of the key cache in memory (for all tables).                                                                                                                                                                                                                                                  |                                                                                                                                                                                                   |-| **COUNTER_CACHE_SIZE_IN_MB**                             | When no value is set, Cassandra uses the smaller of minimum of 2.5% of Heap or 50MB.                                                                                                                                                                                                                                                      
|                                                                                                                                                                                                   |-| **COMMITLOG_SYNC**                                       | The method that Cassandra uses to acknowledge writes in milliseconds                                                                                                                                                                                                                                                                      | periodic                                                                                                                                                                                          |-| **INDEX_SUMMARY_CAPACITY_IN_MB**                         | Fixed memory pool size in MB for SSTable index summaries.                                                                                                                                                                                                                                                                                 |                                                                                                                                                                                                   |-| **RPC_SERVER_TYPE**                                      | Cassandra provides three options for the RPC server. sync and hsha performance is about the same, but hsha uses less memory.                                                                                                                                                                                                              
| sync                                                                                                                                                                                              |-| **ENDPOINT_SNITCH**                                      | Set to a class that implements the IEndpointSnitch interface. Cassandra uses the snitch to locate nodes and route requests.                                                                                                                                                                                                               | SimpleSnitch                                                                                                                                                                                      |-| **DISK_FAILURE_POLICY**                                  | The policy for how Cassandra responds to disk failure                                                                                                                                                                                                                                                                                     | stop                                                                                                                                                                                              |-| **ENABLE_USER_DEFINED_FUNCTIONS**                        | User defined functions (UDFs) present a security risk, since they are executed on the server side. UDFs are executed in a sandbox to contain the execution of malicious code.                                                                                                                                                             
| False                                                                                                                                                                                             |-| **ENABLE_SCRIPTED_USER_DEFINED_FUNCTIONS**               | Java UDFs are always enabled, if enable_user_defined_functions is true. Enable this option to use UDFs with language javascript or any custom JSR-223 provider. This option has no effect if enable_user_defined_functions is false                                                                                                       | False                                                                                                                                                                                             |-| **ENABLE_MATERIALIZED_VIEWS**                            | Enables materialized view creation on this node. Materialized views are considered experimental and are not recommended for production use.                                                                                                                                                                                               | False                                                                                                                                                                                             |-| **CDC_ENABLED**                                          | Enable / disable CDC functionality on a per-node basis. 
This modifies the logic used for write path allocation rejection                                                                                                                                                                                                                  | False                                                                                                                                                                                             |-| **CDC_TOTAL_SPACE_IN_MB**                                | Total space to use for change-data-capture (CDC) logs on disk.                                                                                                                                                                                                                                                                            |                                                                                                                                                                                                   |-| **CDC_FREE_SPACE_CHECK_INTERVAL_MS**                     | Interval between checks for new available space for CDC-tracked tables when the cdc_total_space_in_mb threshold is reached and the CDCCompactor is running behind or experiencing back pressure.                                                                                                                                          
|                                                                                                                                                                                                   |-| **PREPARED_STATEMENTS_CACHE_SIZE_MB**                    | Maximum size of the native protocol prepared statement cache                                                                                                                                                                                                                                                                              |                                                                                                                                                                                                   |-| **THRIFT_PREPARED_STATEMENTS_CACHE_SIZE_MB**             | Maximum size of the Thrift prepared statement cache. Leave empty if you do not use Thrift.                                                                                                                                                                                                                                                |                                                                                                                                                                                                   |-| **COLUMN_INDEX_CACHE_SIZE_IN_KB**                        | A threshold for the total size of all index entries for a partition that the database stores in the partition key cache.                                                                                                                                                                                                                  
| 2                                                                                                                                                                                                 |-| **SLOW_QUERY_LOG_TIMEOUT_IN_MS**                         | How long before a node logs slow queries. Select queries that exceed this value generate an aggregated log message to identify slow queries. To disable, set to 0.                                                                                                                                                                        | 500                                                                                                                                                                                               |-| **BACK_PRESSURE_ENABLED**                                | Enable for the coordinator to apply the specified back pressure strategy to each mutation that is sent to replicas.                                                                                                                                                                                                                       | False                                                                                                                                                                                             |-| **BACK_PRESSURE_STRATEGY_CLASS_NAME**                    | The back-pressure strategy applied. The default implementation, RateBasedBackPressure, takes three arguments: high ratio, factor, and flow type, and uses the ratio between incoming mutation responses and outgoing mutation requests.                                                                                                   
| org.apache.cassandra.net.RateBasedBackPressure                                                                                                                                                    |-| **BACK_PRESSURE_STRATEGY_HIGH_RATIO**                    | When outgoing mutations are below this value, they are rate limited according to the incoming rate decreased by the factor. When above this value, the rate limiting is increased by the factor.                                                                                                                                          | 0.9                                                                                                                                                                                               |-| **BACK_PRESSURE_STRATEGY_FACTOR**                        | A number between 1 and 10. Increases or decreases rate limiting.                                                                                                                                                                                                                                                                          | 5                                                                                                                                                                                                 |-| **BACK_PRESSURE_STRATEGY_FLOW**                          | The flow speed to apply rate limiting: FAST - rate limited to the speed of the fastest replica. SLOW - rate limit to the speed of the slowest replica.                                                                                                                                                                                    
| FAST                                                                                                                                                                                              |-| **ALLOCATE_TOKENS_FOR_KEYSPACE**                         | Triggers automatic allocation of num_tokens tokens for this node. The allocation algorithm attempts to choose tokens in a way that optimizes replicated load over the nodes in the datacenter for the replication strategy used by the specified keyspace.                                                                                |                                                                                                                                                                                                   |-| **HINTS_DIRECTORY**                                      | Directory where Cassandra should store hints.                                                                                                                                                                                                                                                                                             |                                                                                                                                                                                                   |-| **COMMITLOG_DIRECTORY**                                  | When running on magnetic HDD, this should be a separate spindle than the data directories. If not set, the default directory is \$CASSANDRA_HOME/data/commitlog.                                                                                                                                                                          
|                                                                                                                                                                                                   |-| **CDC_RAW_DIRECTORY**                                    | CommitLogSegments are moved to this directory on flush if cdc_enabled: true and the segment contains mutations for a CDC-enabled table                                                                                                                                                                                                    |                                                                                                                                                                                                   |-| **ROW_CACHE_CLASS_NAME**                                 | Row cache implementation class name.                                                                                                                                                                                                                                                                                                      |                                                                                                                                                                                                   |-| **SAVED_CACHES_DIRECTORY**                               | saved caches If not set, the default directory is \$CASSANDRA_HOME/data/saved_caches.                                                                                                                                                                                                                                                     
| Name | Description | Default |
| --- | --- | --- |
| **INTERNODE_SEND_BUFF_SIZE_IN_BYTES** | Set socket buffer size for internode communication. Note that when setting this, the buffer size is limited by net.core.wmem_max and when not setting it it is defined by net.ipv4.tcp_wmem | |
| **INTERNODE_RECV_BUFF_SIZE_IN_BYTES** | Set socket buffer size for internode communication. Note that when setting this, the buffer size is limited by net.core.wmem_max and when not setting it it is defined by net.ipv4.tcp_wmem | |
| **GC_LOG_THRESHOLD_IN_MS** | GC pauses greater than 200 ms will be logged at INFO level. This threshold can be adjusted to minimize logging if necessary | |
| **OTC_COALESCING_WINDOW_US** | How many microseconds to wait for coalescing. | |
| **OTC_COALESCING_ENOUGH_COALESCED_MESSAGES** | Do not try to coalesce messages if we already got that many messages. This should be more than 2 and less than 128. | |
| **OTC_BACKLOG_EXPIRATION_INTERVAL_MS** | How many milliseconds to wait between two expiration runs on the backlog (queue) of the OutboundTcpConnection. | |
| **REPAIR_SESSION_MAX_TREE_DEPTH** | Limits the maximum Merkle tree depth to avoid consuming too much memory during repairs. | |
| **ENABLE_SASI_INDEXES** | Enables SASI index creation on this node. SASI indexes are considered experimental and are not recommended for production use. | |
| **CUSTOM_CASSANDRA_YAML_BASE64** | Base64-encoded Cassandra properties appended to cassandra.yaml. | |
| **KUBECTL_VERSION** | Version of 'bitnami/kubectl' image. This image is used for some functionality of the operator. | 1.18.2 |
| **JVM_OPT_AVAILABLE_PROCESSORS** | In a multi-instance deployment, multiple Cassandra instances will independently assume that all CPU processors are available to them. This setting allows you to specify a smaller set of processors and perhaps have affinity. | |
| **JVM_OPT_JOIN_RING** | Set to false to start Cassandra on a node but not have the node join the cluster. | |
| **JVM_OPT_LOAD_RING_STATE** | Set to false to clear all gossip state for the node on restart. Use when you have changed node information in cassandra.yaml (such as listen_address). | |
| **JVM_OPT_REPLAYLIST** | Allow restoring specific tables from an archived commit log. | |
| **JVM_OPT_RING_DELAY_MS** | Allows overriding of the default RING_DELAY (30000ms), which is the amount of time a node waits before joining the ring. | |
| **JVM_OPT_TRIGGERS_DIR** | Set the default location for the trigger JARs. (Default: conf/triggers) | |
| **JVM_OPT_WRITE_SURVEY** | For testing new compaction and compression strategies. It allows you to experiment with different strategies and benchmark write performance differences without affecting the production workload. | |
| **JVM_OPT_DISABLE_AUTH_CACHES_REMOTE_CONFIGURATION** | To disable configuration via JMX of auth caches (such as those for credentials, permissions and roles). This will mean those config options can only be set (persistently) in cassandra.yaml and will require a restart for new values to take effect. | |
| **JVM_OPT_FORCE_DEFAULT_INDEXING_PAGE_SIZE** | To disable dynamic calculation of the page size used when indexing an entire partition (during initial index build/rebuild). If set to true, the page size will be fixed to the default of 10000 rows per page. | |
| **JVM_OPT_PREFER_IPV4_STACK** | Prefer binding to IPv4 network interfaces (when net.ipv6.bindv6only=1). See http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6342561 (short version: comment out this entry to enable IPv6 support). | True |
| **JVM_OPT_EXPIRATION_DATE_OVERFLOW_POLICY** | Defines how to handle INSERT requests with TTL exceeding the maximum supported expiration date. | |
| **JVM_OPT_THREAD_PRIORITY_POLICY** | Allows lowering thread priority without being root on Linux - probably not necessary on Windows but doesn't harm anything. | 42 |
| **JVM_OPT_THREAD_STACK_SIZE** | Per-thread stack size. | 256k |
| **JVM_OPT_STRING_TABLE_SIZE** | Larger interned string table, for gossip's benefit (CASSANDRA-6410) | 1000003 |
| **JVM_OPT_SURVIVOR_RATIO** | CMS Settings: SurvivorRatio | 8 |
| **JVM_OPT_MAX_TENURING_THRESHOLD** | CMS Settings: MaxTenuringThreshold | 1 |
| **JVM_OPT_CMS_INITIATING_OCCUPANCY_FRACTION** | CMS Settings: CMSInitiatingOccupancyFraction | 75 |
| **JVM_OPT_CMS_WAIT_DURATION** | CMS Settings: CMSWaitDuration | 10000 |
| **JVM_OPT_NUMBER_OF_GC_LOG_FILES** | GC logging options: NumberOfGCLogFiles | 10 |
| **JVM_OPT_GC_LOG_FILE_SIZE** | GC logging options: GCLOGFILESIZE | 10M |
| **JVM_OPT_GC_LOG_DIRECTORY** | GC logging options: GC_LOG_DIRECTORY | |
| **JVM_OPT_PRINT_FLS_STATISTICS** | GC logging options: PrintFLSStatistics | |
| **JVM_OPT_CONC_GC_THREADS** | By default, ConcGCThreads is 1/4 of ParallelGCThreads. Setting both to the same value can reduce STW durations. | |
| **JVM_OPT_INITIATING_HEAP_OCCUPANCY_PERCENT** | Save CPU time on large (>= 16GB) heaps by delaying region scanning until the heap is 70% full. The default in Hotspot 8u40 is 40%. | |
| **JVM_OPT_MAX_GC_PAUSE_MILLIS** | Main G1GC tunable: lowering the pause target will lower throughput and vice versa. | |
| **JVM_OPT_G1R_SET_UPDATING_PAUSE_TIME_PERCENT** | Have the JVM do less remembered set work during STW, instead preferring concurrent GC. Reduces p99.9 latency. | |
| **CUSTOM_JVM_OPTIONS_BASE64** | Base64-encoded JVM options appended to jvm.options. | |
| **POD_MANAGEMENT_POLICY** | podManagementPolicy of the Cassandra StatefulSet | OrderedReady |
| **REPAIR_POD** | Name of the pod on which 'nodetool repair' should be run. | |

Looks like some additional formatting was added to the table by an updated pytablewriter. Not sure if this is what we want. My environment is using version 0.54.0.

nfnt

comment created time in 3 days

push eventmesosphere/kudo-cassandra-operator

Andreas Neumann

commit sha da9b3ad73b1eb6de6dc646382b31ef0a2fd01821

Add nodeAffinity label to cordon only cassandra from a k8s node, kudo v0.13.0 (#123) * Add nodeAffinity label to cordon only cassandra from a k8s node * Update KUDO dependency to 0.13.0 Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

view details

Zain Malik

commit sha 48bbbf7cca29df4877706460e80b1ca064ffe6ee

bump cassandra to 3.11.6 (#116) * bump cassandra to 3.11.6 * update CHANGELOG.md

view details

Marcin Owsiany

commit sha aeca8b61dd2ec01b6136dcd848429c9b38eeb611

Create a CODEOWNERS file (#128) Save us from having to select reviewers by hand.

view details

Marcin Owsiany

commit sha b977d9ec8495dcbec13f2b8dddd6f1632b1b3270

Bump the shared submodule to current master. (#129) This includes the following commits: Submodule shared f5116d5..361ff73: > Introduce a way of disabling $IMAGE_DISAMBIGUATION_SUFFIX. (#64) > Fix retry counter. (#62) > Bump to a recent konvoy version with better diagnostics. (#63)

view details

Jan Schlicht

commit sha 49840cb2e6e3cebee0cb637828a8dd3927870b04

Merge branch 'master' into nfnt/change-prometheus-default

view details

push time in 3 days

PR opened mesosphere/kudo-cassandra-operator

Reviewers
Disable the Prometheus exporter by default

The Prometheus exporter requires additional CRDs that aren't available in many default cluster environments. To avoid errors in this case, the PROMETHEUS_EXPORTER_ENABLED parameter has been changed from opt-out to opt-in. <!-- Thanks for sending a pull request! Here are some tips:

  1. Please make yourself familiar with the general development guidelines: https://github.com/mesosphere/kudo-cassandra-operator/blob/master/DEVELOPMENT.md#development

  2. Please make sure that the PR abides to the style guide: https://github.com/mesosphere/kudo-cassandra-operator/blob/master/DEVELOPMENT.md#style-guide

  3. Please make sure that git status looks clean after running ./tools/compile_templates.sh. If there's a diff, you might need to commit further changes. This script is currently Linux-only. To run on another platform, use the docker.sh script; example: ./tools/docker.sh ./tools/compile_templates.sh

  4. Please make sure that git status looks clean after running ./tools/generate_parameters_markdown.py. If there's a diff, you might need to commit further changes. This script is currently Linux-only. To run on another platform, use the docker.sh script; example: ./tools/docker.sh ./tools/generate_parameters_markdown.py

  5. Please make sure that git status looks clean after running ./tools/format_files.sh. If there's a diff, you might need to commit further changes.

  6. If it makes sense, please add an entry to the CHANGELOG: https://github.com/mesosphere/kudo-cassandra-operator/blob/master/CHANGELOG.md#unreleased

  7. If the PR is unfinished, please start it as a Draft PR: https://github.blog/2019-02-14-introducing-draft-pull-requests/ -->

+247 -249

0 comment

4 changed files

pr created time in 3 days

push eventkudobuilder/kudo

Jan Schlicht

commit sha 03edc91be048d6321ffefbb6e8d3135350686d04

Fix namespace create without manifest (#1543) If a namespace was created without a manifest, KUDO would fail with a segfault, because it tries to add an annotation to a nil map. This has been fixed and tests have been added for namespace creation. Signed-off-by: Jan Schlicht <jan@d2iq.com>

view details
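The segfault described in the commit above is the classic Go nil-map write: a nil map can be read but any write to it panics. A sketch of the kind of guard such a fix typically adds; the names here are illustrative, not the actual KUDO code:

```go
package main

import "fmt"

// ensureAnnotations returns a writable map, guarding against the nil map
// that results from e.g. an object created without a manifest.
func ensureAnnotations(annotations map[string]string) map[string]string {
	if annotations == nil {
		annotations = map[string]string{}
	}
	return annotations
}

func main() {
	var annotations map[string]string // nil: writing to it would panic

	annotations = ensureAnnotations(annotations)
	annotations["example.org/created-by"] = "kudo" // safe after the guard

	fmt.Println(len(annotations)) // 1
}
```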

Jan Schlicht

commit sha de69d7fe12ab85102b3e66bf941e775e219d309d

Refactor operator package installation The old 'InstallPackage' function has been extracted into a separate package. Its functionality has been split up into multiple functions handling different installation resources. The function signature of 'install.Package' introduces variadic option parameters to provide a backwards-compatible API. Signed-off-by: Jan Schlicht <jan@d2iq.com>

view details
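The variadic option parameters mentioned in the commit message follow Go's functional-options pattern: callers that pass no options keep the old defaults, so existing call sites stay source-compatible. A minimal sketch of the pattern; the names `Package`, `WithNamespace`, and `CreateNamespace` are illustrative, not the actual `install` package API:

```go
package main

import "fmt"

type options struct {
	namespace       string
	createNamespace bool
}

// Option mutates the defaults; new behavior is opt-in.
type Option func(*options)

func WithNamespace(ns string) Option {
	return func(o *options) { o.namespace = ns }
}

func CreateNamespace() Option {
	return func(o *options) { o.createNamespace = true }
}

// Package applies any given options on top of the old defaults, so calls
// written before the refactoring keep their behavior unchanged.
func Package(operator string, opts ...Option) string {
	o := options{namespace: "default"}
	for _, opt := range opts {
		opt(&o)
	}
	return fmt.Sprintf("install %s into %s (createNamespace=%v)",
		operator, o.namespace, o.createNamespace)
}

func main() {
	fmt.Println(Package("cassandra"))
	fmt.Println(Package("cassandra", WithNamespace("prod"), CreateNamespace()))
}
```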

push time in 3 days

Pull request review commentkudobuilder/kuttl

KEP 6 Proposal: KUTTL Test Phase Definition and Lifecycle Hooks

---
kep-number: 6
short-desc: KUTTL Test Phases and Lifecycle Definitions
title: KUTTL Test Phases and Lifecycle
authors:
  - "@kensipe"
owners:
  - "@kensipe"
editor: "@kensipe"
creation-date: 2020-05-27
last-updated: 2020-05-27
status: provisional
---

# Test Phases and Lifecycle

## Table of Contents

<!-- todo: add -->

## Summary

KUTTL would benefit from defining different phases of the test lifecycle along with setting expectations of lifecycle hooks for end users. This is similar to a large number of test frameworks. Using JUnit as an example, phases are well defined and there are meaningful lifecycle hooks such as `@BeforeAll`, `@AfterAll`, `@BeforeEach` and `@AfterEach`, along with setup and teardown. This KEP takes on the effort of defining the different test phases and their lifecycles, the lifecycle hooks and how to configure them.

### Goals

* Define test phases
* Define expected life-cycle and order of phases
* Define the expected life-cycle hooks
* Define how to configure life-cycle hooks
* Separate control plane configuration from TestSuite configuration

### Non-Goals

* Integration with other testing ideologies and technologies
* Cluster creation beyond current KinD integration

### Testing Focus

KUTTL is used for many different purposes. It can be used for "integration" tests with a mock control plane. It can be used to run a series of tests which are not portable and are not intended to be. It can be used to set up and verify a specific configuration of a cluster. The focus of this KEP is on End to End (e2e) testing in a portable way. While focusing on e2e, the functionality supporting the other uses should not be diminished.

### Current Challenges

#### Control Plane and TestSuite

Today KUTTL's TestSuite configuration file provides 2 discrete functions:

1. Test Harness control of KUTTL start with default configurations. This includes which control plane to use, the number of concurrent tests and the location of TestSuites.
2. TestSuite control, which includes applying manifests, CRDs, and running commands. This enables a "before" testsuite model, however it is limited to the full collection of TestSuites.

#### Flat TestSuites

Today, KUTTL treats all configured testsuites (listed with `testDirs`) as 1 suite of tests. While it makes sense for all tests to run under the established control plane, it is limiting for a configured testsuite to not be able to express a set of commands or manifests which are specific to that set of tests. It currently requires the test harness configuration to change based on intimate knowledge of the tests. It would be great to separate out these concerns, which will result in more portable tests.

## Proposal

### Proposal: Component Definition

![TestSuite Architecture](images/kuttl-testsuite.jpg)

* **TestSuite Collection** - This is a collection of TestSuites, which could be 1 or more. This traditionally has been called a TestSuite. The TestSuite Collection is configured with the test harness configuration file, `kuttl-test.yaml` by default.
* **TestSuite** - Is a folder which contains a collection of Tests defined by sub-folders.
* **Test** - Is a folder which contains a number of files which define the steps necessary to assert the correctness of a test. If a test folder has sub-folders they are not analyzed by KUTTL unless explicitly referenced through configuration or a teststep.
* **TestStep** - Is the smallest component of a test, which is a collection of indexed files which govern the order of evaluation. There are 4 basic types of TestStep files: 1) an apply manifest file (used for test setup), 2) an assert file, 3) an error file (used to assert the absence of a resource) and 4) a TestStep file which can run commands, update timeouts and delete resources.

Let's remove the "and" at the end of point 3, otherwise :+1: on this suggestion.

kensipe

comment created time in 7 days

Pull request review commentkudobuilder/kuttl

KEP 6 Proposal: KUTTL Test Phase Definition and Lifecycle Hooks

++### Proposal: Component Lifecycle++When a test harness is run (be default with no flags for single test runs), the test harness will analyze all configured TestSuites establishing a memory model for execution (without analyzing every test / testfile).  A change from the current KUTTL, KUTTL will run each TestSuite as a unit (in order to support additional lifecycle hooks detailed in the next section). The order of TestSuite execution will be determined by the order of configuation in the kuttl-test.yaml configuration file.  Each TestSuite will run the set of tests defined by that suite with no guarantees on order.  Each Test will have a series of test steps which will run in index order.++### Proposal: Component Lifecycle hooks++It is proposed that the following activities are useful before and after certain test phases:++1. apply manifests+2. CRD management (create/delete)+3. delete resources+4. run commands

Let's be clear that these activities are run in a specific order. Also, isn't CRD management included in the "apply manifests" and "delete resources" steps? Why does it have to be a separate step?

kensipe

comment created time in 7 days

Pull request review commentkudobuilder/kuttl

KEP 6 Proposal: KUTTL Test Phase Definition and Lifecycle Hooks

+---+kep-number: 6+short-desc: KUTTL Test Phases and Lifecycle Definitions+title: KUTTL Test Phases and Lifecycle+authors:+  - "@kensipe"+owners:+  - "@kensipe"+editor: "@kensipe"+creation-date: 2020-05-27+last-updated: 2020-05-27+status: provisional+---++# Test Phases and Lifecycle++## Table of Contents++<!-- todo: add  -->++## Summary++KUTTL would benefit from defining different phases of the test lifecycle along with setting expectations of lifecycle hooks for end users.  This is similar to a large number of test frameworks.  Using JUnit as an example, phases are well defined and there are meaningful lifecycle hooks such as `@BeforeAll`, `@AfterAll`, `@BeforeEach` and `@AfterEach`, along with setup and teardown.  This KEP takes on the effort of defining the different test phases and their lifecycles, the lifecycle hooks and how to configure them.+++### Goals++* Define test phases+* Define expected life-cycle and order of phases+* Define the expected life-cycle hooks+* Define how to configure life-cycle hooks+* Separate control plane configuration from TestSuite configuration++### Non-Goals++* Integration with other testing ideologies and technologies+* Cluster creation beyond current KinD integration++### Testing Focus++KUTTL is used for many different purposes.  It can be used for "integration" tests with a mock control plane.  It can be used to run a series of tests which are not portable and is not intended to be.  It can be used to setup and verify a specific configuration of a cluster.  The focus on this KEP is around End to End (e2e) testing in a portable way.  While focusing on e2e, the functionality supporting the other uses should not be diminished. ++### Current Challenges++#### Control Plane and TestSuite++Today KUTTL's TestSuite configuration file provides 2 discreet functions:++1. Test Harness control of KUTTL start with default configurations.  
This includes which control plane to use, the number of concurrent tests and the location of TestSuites.+2. TestSuite control which includes; applying manifests, CRDs, and running commands.  This enables a "before" testsuite model, however it is limited to the full collection of TestSuites.++#### Flat TestSuites++Today, KUTTL treats all configured testsuites (listed with `testDirs`) as 1 suite of tests.  While it makes senses for all tests to run under the given the established control plane, it is limiting for a configured testsuite to not be able to express a set of commands or manifests which are specific to that set of tests.  It currently requires the test harness configuration change based on intimate knowledge of the tests.  It would be great to separate out these concerns, which will result in more portable tests.+++## Proposal++### Proposal: Component Definition++![TestSuite Architecture](images/kuttl-testsuite.jpg)++* **TestSuite Collection** - This is a collection of TestSuites, which could be 1 or more.  This traditionaly has been called a TestSuite.  The TestSuite Collection is configured with the test harness configuration file, `kuttl-test.yaml` by default. +* **TestSuite** - Is a folder which contains a collection of Tests defined by sub-folders.+* ** Test** - Is a folder which contains a number of files which define the steps necessary to assert the correct of a test.  If a test folder has sub-folders they are not analyzed by KUTTL unless explicitly referenced through configuration or a teststep.+* **TestStep** - Is the smallest component of a test, which are a collection of indexed files which govern they order of evaluation.  There are 4 basic types of TestStep files; 1) an apply manifest file (used for test setup), 2) an assert file, 3) an error file (used to assert the absense of a resource) and 4) a TestStep file which can run commands, update timeouts and delete resources. 
++### Proposal: Component Lifecycle++When a test harness is run (be default with no flags for single test runs), the test harness will analyze all configured TestSuites establishing a memory model for execution (without analyzing every test / testfile).  A change from the current KUTTL, KUTTL will run each TestSuite as a unit (in order to support additional lifecycle hooks detailed in the next section). The order of TestSuite execution will be determined by the order of configuation in the kuttl-test.yaml configuration file.  Each TestSuite will run the set of tests defined by that suite with no guarantees on order.  Each Test will have a series of test steps which will run in index order.

Can you explain a bit more how "run each TestSuite as a unit" changes the existing behavior? I understand this in the context of the lifecycle hooks mentioned below, but am not sure how this behavior would be different from the existing one. Except, of course, that there aren't any lifecycle hooks right now. Or maybe just remove the "A change from the current KUTTL" part :)

kensipe

comment created time in 7 days

Pull request review commentkudobuilder/kuttl

KEP 6 Proposal: KUTTL Test Phase Definition and Lifecycle Hooks

+---+kep-number: 6+short-desc: KUTTL Test Phases and Lifecycle Definitions+title: KUTTL Test Phases and Lifecycle+authors:+  - "@kensipe"+owners:+  - "@kensipe"+editor: "@kensipe"+creation-date: 2020-05-27+last-updated: 2020-05-27+status: provisional+---++# Test Phases and Lifecycle++## Table of Contents++<!-- todo: add  -->

Please add :)

kensipe

comment created time in 7 days

Pull request review commentkudobuilder/kuttl

KEP 6 Proposal: KUTTL Test Phase Definition and Lifecycle Hooks

+---+kep-number: 6+short-desc: KUTTL Test Phases and Lifecycle Definitions+title: KUTTL Test Phases and Lifecycle+authors:+  - "@kensipe"+owners:+  - "@kensipe"+editor: "@kensipe"+creation-date: 2020-05-27+last-updated: 2020-05-27+status: provisional+---++# Test Phases and Lifecycle++## Table of Contents++<!-- todo: add  -->++## Summary++KUTTL would benefit from defining different phases of the test lifecycle along with setting expectations of lifecycle hooks for end users.  This is similar to a large number of test frameworks.  Using JUnit as an example, phases are well defined and there are meaningful lifecycle hooks such as `@BeforeAll`, `@AfterAll`, `@BeforeEach` and `@AfterEach`, along with setup and teardown.  This KEP takes on the effort of defining the different test phases and their lifecycles, the lifecycle hooks and how to configure them.+++### Goals++* Define test phases+* Define expected life-cycle and order of phases+* Define the expected life-cycle hooks+* Define how to configure life-cycle hooks+* Separate control plane configuration from TestSuite configuration++### Non-Goals++* Integration with other testing ideologies and technologies+* Cluster creation beyond current KinD integration++### Testing Focus++KUTTL is used for many different purposes.  It can be used for "integration" tests with a mock control plane.  It can be used to run a series of tests which are not portable and is not intended to be.  It can be used to setup and verify a specific configuration of a cluster.  The focus on this KEP is around End to End (e2e) testing in a portable way.  While focusing on e2e, the functionality supporting the other uses should not be diminished. ++### Current Challenges++#### Control Plane and TestSuite++Today KUTTL's TestSuite configuration file provides 2 discreet functions:++1. Test Harness control of KUTTL start with default configurations.  
This includes which control plane to use, the number of concurrent tests and the location of TestSuites.+2. TestSuite control which includes; applying manifests, CRDs, and running commands.  This enables a "before" testsuite model, however it is limited to the full collection of TestSuites.++#### Flat TestSuites++Today, KUTTL treats all configured testsuites (listed with `testDirs`) as 1 suite of tests.  While it makes senses for all tests to run under the given the established control plane, it is limiting for a configured testsuite to not be able to express a set of commands or manifests which are specific to that set of tests.  It currently requires the test harness configuration change based on intimate knowledge of the tests.  It would be great to separate out these concerns, which will result in more portable tests.+++## Proposal++### Proposal: Component Definition++![TestSuite Architecture](images/kuttl-testsuite.jpg)

Please include this image as part of this PR.

kensipe

comment created time in 7 days

PR opened kudobuilder/kudo

Reviewers
Fix namespace create without manifest

<!-- Thanks for sending a pull request! Here are some tips for you:

  1. If this is your first time, please read our contributor guidelines: https://github.com/kudobuilder/kudo/blob/master/CONTRIBUTING.md
  2. Make sure you have added and ran the tests before submitting your PR
  3. If the PR is unfinished, start it as a Draft PR: https://github.blog/2019-02-14-introducing-draft-pull-requests/ -->

What this PR does / why we need it: If a namespace was created without a manifest, KUDO would crash with a panic, because it tried to add an annotation to a nil map. This has been fixed, and tests have been added for namespace creation.
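For illustration, a minimal Go sketch of the failure mode described above: writing to a nil map panics at runtime, so the map has to be initialized first when the object carries no annotations. The annotation key below is hypothetical, not KUDO's actual key.

```go
package main

import "fmt"

// addAnnotation guards against the nil-map case: assigning to a nil map
// panics with "assignment to entry in nil map", so allocate one first.
func addAnnotation(annotations map[string]string, key, value string) map[string]string {
	if annotations == nil {
		annotations = map[string]string{}
	}
	annotations[key] = value
	return annotations
}

func main() {
	// A namespace created without a manifest would have a nil annotations map.
	var annotations map[string]string
	annotations = addAnnotation(annotations, "example.dev/annotation", "value")
	fmt.Println(annotations["example.dev/annotation"])
}
```

Without the nil check, the first assignment would panic exactly as the PR describes.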

<!-- *Automatically closes linked issue when PR is merged. Usage: Fixes #<issue number>, or Fixes (paste link of issue). -->

+59 -0

0 comment

2 changed files

pr created time in 7 days

create branchkudobuilder/kudo

branch : nfnt/test-namespace-create

created branch time in 7 days

PR opened kudobuilder/kudo

Reviewers
Refactor operator package installation

<!-- Thanks for sending a pull request! Here are some tips for you:

  1. If this is your first time, please read our contributor guidelines: https://github.com/kudobuilder/kudo/blob/master/CONTRIBUTING.md
  2. Make sure you have added and ran the tests before submitting your PR
  3. If the PR is unfinished, start it as a Draft PR: https://github.blog/2019-02-14-introducing-draft-pull-requests/ -->

What this PR does / why we need it: The old 'InstallPackage' function has been extracted into a separate package. Its functionality has been split up into multiple functions handling different installation resources. The function signature of 'install.Package' introduces variadic option parameters to provide a backwards-compatible API.

<!-- *Automatically closes linked issue when PR is merged. Usage: Fixes #<issue number>, or Fixes (paste link of issue). --> In preparation for #1514
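The variadic-options pattern mentioned above can be sketched as follows. All names here (Package, WithNamespace, the Options fields) are illustrative, not KUDO's actual API: the point is that callers passing no options keep the old behavior, so the signature stays backwards compatible.

```go
package main

import "fmt"

// Options holds hypothetical install settings.
type Options struct {
	Namespace  string
	SkipUpdate bool
}

// Option mutates Options; each With* constructor returns one.
type Option func(*Options)

func WithNamespace(ns string) Option { return func(o *Options) { o.Namespace = ns } }
func SkipUpdate() Option             { return func(o *Options) { o.SkipUpdate = true } }

// Package applies defaults first, then any caller-supplied options,
// so Package("name") with no options behaves like the old API.
func Package(name string, opts ...Option) Options {
	o := Options{Namespace: "default"}
	for _, opt := range opts {
		opt(&o)
	}
	fmt.Printf("installing %s into %s\n", name, o.Namespace)
	return o
}

func main() {
	Package("cassandra")                                 // old-style call, defaults apply
	Package("kafka", WithNamespace("ops"), SkipUpdate()) // new-style call with options
}
```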

+399 -220

0 comment

15 changed files

pr created time in 7 days

create branchkudobuilder/kudo

branch : nfnt/refactor-package-install

created branch time in 7 days

delete branch kudobuilder/kudo

delete branch : nfnt/separate-e2e-operator-tests

delete time in 8 days

push eventkudobuilder/kudo

Jan Schlicht

commit sha 5e6ccf2f754de83efa5762f4794ff7883f96f08d

Separate E2E and operator tests (#1540) This runs operator tests as a separate test in parallel with the other tests. It makes operator tests independent from E2E test results and their failures distinguishable. Signed-off-by: Jan Schlicht <jan@d2iq.com> Signed-off-by: Ken Sipe <kensipe@gmail.com> Co-authored-by: Ken Sipe <kensipe@gmail.com>

view details

push time in 8 days

PR merged kudobuilder/kudo

Reviewers
Separate E2E and operator tests

<!-- Thanks for sending a pull request! Here are some tips for you:

  1. If this is your first time, please read our contributor guidelines: https://github.com/kudobuilder/kudo/blob/master/CONTRIBUTING.md
  2. Make sure you have added and ran the tests before submitting your PR
  3. If the PR is unfinished, start it as a Draft PR: https://github.blog/2019-02-14-introducing-draft-pull-requests/ -->

What this PR does / why we need it: This runs operator tests as a separate test in parallel with the other tests. It makes operator tests independent from E2E test results and their failures distinguishable.

<!-- *Automatically closes linked issue when PR is merged. Usage: Fixes #<issue number>, or Fixes (paste link of issue). --> Fixes #1539

+61 -20

3 comments

6 changed files

nfnt

pr closed time in 8 days

issue closedkudobuilder/kudo

Separate E2E and operator tests

<!-- Please only use this template for submitting enhancement requests. Implementing your enhancement will follow the KEP process: https://github.com/kudobuilder/kudo/blob/master/keps/0001-kep-process.md -->

What would you like to be added: As part of the E2E tests, the github.com/kudobuilder/operator repository is checked out and its tests are run. This step should run separate from the E2E tests.

Why is this needed: Clearer distinction between failures in E2E and operator tests. Easier handling and parsing of diagnostics logs. Parallel execution of tests.

closed time in 8 days

nfnt

Pull request review commentkudobuilder/kudo

Separate E2E and operator tests

 docker build . \     --build-arg ldflags_arg="" \     -t "kudobuilder/controller:$KUDO_VERSION" +rm -rf operators+git clone https://github.com/kudobuilder/operators+mkdir operators/bin/+cp ./bin/kubectl-kudo operators/bin/

:clap: Oh, good point!

nfnt

comment created time in 9 days

Pull request review commentkudobuilder/kudo

Separate E2E and operator tests

 docker build . \     --build-arg ldflags_arg="" \     -t "kudobuilder/controller:$KUDO_VERSION" +rm -rf operators

Oh, good point! 👏

nfnt

comment created time in 9 days

push eventkudobuilder/kudo

Jan Schlicht

commit sha b5dab3de5b44bc4db8343eabb54fdeef712e0c11

Extract 'sed' call Signed-off-by: Jan Schlicht <jan@d2iq.com>

view details

push time in 9 days

Pull request review commentkudobuilder/kudo

Separate E2E and operator tests

+#!/usr/bin/env bash++set -o errexit+set -o nounset+set -o pipefail+set -o xtrace++INTEGRATION_OUTPUT_JUNIT=${INTEGRATION_OUTPUT_JUNIT:-false}+KUDO_VERSION=${KUDO_VERSION:-test}++docker build . \

This has to do with a test setup that works around limitations in CircleCI. run-tests.sh starts a Docker container that can run Docker-in-Docker. This container then runs a test script in hack/, e.g. hack/run-operator-tests.sh in this case. The script builds a Docker image of the controller containing the changes of the PR under test and injects it into the kind cluster. I don't like this complicated test setup and tried to refactor it to be simpler, but ran into CircleCI limitations: we have to run the tests in a VM environment to be able to start privileged containers in that VM. But we also need to install the prerequisites for our tests in this VM, and the easiest way is through Docker, using the test/Dockerfile. That's why we have this complicated Docker-in-Docker setup here.

nfnt

comment created time in 9 days

Pull request review commentkudobuilder/kudo

Separate E2E and operator tests

+#!/usr/bin/env bash++set -o errexit+set -o nounset+set -o pipefail+set -o xtrace++INTEGRATION_OUTPUT_JUNIT=${INTEGRATION_OUTPUT_JUNIT:-false}+KUDO_VERSION=${KUDO_VERSION:-test}

This is to allow users to override KUDO_VERSION. The default value test is used as a tag for the Docker image that is built and injected into the kind cluster. This avoids clashes with existing Docker images, as test is only used in this context.
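The override-with-default behavior comes from Bash's ':-' parameter expansion; a minimal sketch of the line in question:

```shell
#!/usr/bin/env bash
# ${KUDO_VERSION:-test} keeps a caller-provided KUDO_VERSION and
# falls back to "test" when the variable is unset or empty.
KUDO_VERSION=${KUDO_VERSION:-test}
echo "kudobuilder/controller:${KUDO_VERSION}"
```

Running the script as-is prints kudobuilder/controller:test; running it with KUDO_VERSION=v0.13.0 in the environment prints that tag instead.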

nfnt

comment created time in 9 days

pull request commentkudobuilder/kudo

Separate E2E and operator tests

Operator tests have the same setup but different concerns from E2E tests. The motivation is to get earlier feedback when these tests fail. In the past, the operator tests were only run if the E2E tests succeeded. This made it hard for larger PRs to distinguish between changes that fix operators and changes that fix E2E tests.

nfnt

comment created time in 9 days

pull request commentkudobuilder/kudo

Separate E2E and operator tests

I verified that test logs and artifacts are collected for E2E tests as well as operator tests.

nfnt

comment created time in 9 days

push eventkudobuilder/kudo

Jan Schlicht

commit sha 2a06d7a730cff601b4d66718118211bb0d57e151

Separate E2E and operator tests This runs operator tests as a separate test in parallel with the other tests. It makes operator tests independent from E2E test results and their failures distinguishable. Signed-off-by: Jan Schlicht <jan@d2iq.com>

view details

push time in 9 days

pull request commentkudobuilder/kudo

Separate E2E and operator tests

Something's wrong with artifact uploading. Checking.

nfnt

comment created time in 9 days

push eventkudobuilder/kudo

Jan Schlicht

commit sha d145b2852a27a987b6947e2df858e40582fbb38e

Separate E2E and operator tests This runs operator tests as a separate test in parallel with the other tests. It makes operator tests independent from E2E test results and their failures distinguishable. Signed-off-by: Jan Schlicht <jan@d2iq.com>

view details

push time in 9 days

push eventkudobuilder/kudo

Jan Schlicht

commit sha 9854fb2146743878eaaf71c5db35943352d62172

Separate E2E and operator tests This runs operator tests as a separate test in parallel with the other tests. It makes operator tests independent from E2E test results and their failures distinguishable. Signed-off-by: Jan Schlicht <jan@d2iq.com>

view details

push time in 9 days

PR opened kudobuilder/kudo

Separate E2E and operator tests

<!-- Thanks for sending a pull request! Here are some tips for you:

  1. If this is your first time, please read our contributor guidelines: https://github.com/kudobuilder/kudo/blob/master/CONTRIBUTING.md
  2. Make sure you have added and ran the tests before submitting your PR
  3. If the PR is unfinished, start it as a Draft PR: https://github.blog/2019-02-14-introducing-draft-pull-requests/ -->

What this PR does / why we need it: This runs operator tests as a separate test in parallel with the other tests. It makes operator tests independent from E2E test results and their failures distinguishable.

<!-- *Automatically closes linked issue when PR is merged. Usage: Fixes #<issue number>, or Fixes (paste link of issue). --> Fixes #1539

+55 -19

0 comment

4 changed files

pr created time in 9 days

create branchkudobuilder/kudo

branch : nfnt/separate-e2e-operator-tests

created branch time in 9 days

issue openedkudobuilder/kudo

Separate E2E and operator tests

<!-- Please only use this template for submitting enhancement requests. Implementing your enhancement will follow the KEP process: https://github.com/kudobuilder/kudo/blob/master/keps/0001-kep-process.md -->

What would you like to be added: As part of the E2E tests, the github.com/kudobuilder/operator repository is checked out and its tests are run. This step should run separate from the E2E tests.

Why is this needed: Clearer distinction between failures in E2E and operator tests. Easier handling and parsing of diagnostics logs. Parallel execution of tests.

created time in 9 days

push eventkudobuilder/test-tools

Jan Schlicht

commit sha 0ca90a056bde57cf5d8e79adc9fa98b53462f5f3

Update dependencies (#28) Signed-off-by: Jan Schlicht <jan@d2iq.com>

view details

push time in 10 days

delete branch kudobuilder/test-tools

delete branch : nfnt/bump-deps

delete time in 10 days

PR merged kudobuilder/test-tools

Chore: Update dependencies
+131 -31

0 comment

5 changed files

nfnt

pr closed time in 10 days

Pull request review commentkudobuilder/kudo

simplified diagnostics

+package diagnostics++import (+	"fmt"+	"strings"+	"time"++	"github.com/kudobuilder/kudo/pkg/kudoctl/env"+	"github.com/kudobuilder/kudo/pkg/version"++	"github.com/spf13/afero"+)++type Options struct {+	Instance string+	LogSince int64+}++func NewOptions(instance string, logSince time.Duration) *Options {+	opts := Options{Instance: instance}+	if logSince > 0 {+		sec := int64(logSince.Round(time.Second).Seconds())+		opts.LogSince = sec+	}+	return &opts+}++func Collect(fs afero.Fs, options *Options, s *env.Settings) error {+	ir, err := NewInstanceResources(options, s)+	if err != nil {+		return err+	}+	instanceDiagRunner := &Runner{}

Yes, I agree. Thought a bit more about this and how my approach could be done. Realized that your approach provides more readability by making resource collection more consistent.

vemelin-epm

comment created time in 10 days

Pull request review commentkudobuilder/kudo

simplified diagnostics

+package diagnostics++import (+	"fmt"++	"github.com/kudobuilder/kudo/pkg/apis/kudo/v1beta1"++	v1 "k8s.io/api/core/v1"+	"k8s.io/apimachinery/pkg/api/meta"+	"k8s.io/apimachinery/pkg/runtime"+)++// processingContext - shared data for the resource collectors+// provides property accessors allowing to define a collector before the data it needs is available+// provides update callback functions. callbacks panic if called on a wrong type of runtime.object+type processingContext struct {+	podNames      []string+	root          string+	opName        string+	opVersionName string+	instanceName  string+}++func (ctx *processingContext) attachToRoot() string {+	return ctx.root+}++func (ctx *processingContext) attachToOperator() string {+	return fmt.Sprintf("%s/operator_%s", ctx.root, ctx.opName)+}++func (ctx *processingContext) attachToInstance() string {+	return fmt.Sprintf("%s/instance_%s", ctx.attachToOperator(), ctx.instanceName)+}++func (ctx *processingContext) mustSetOperatorNameFromOperatorVersion(o runtime.Object) {+	ctx.opName = o.(*v1beta1.OperatorVersion).Spec.Operator.Name+}++func (ctx *processingContext) mustSetOperatorVersionNameFromInstance(o runtime.Object) {+	ctx.opVersionName = o.(*v1beta1.Instance).Spec.OperatorVersion.Name+}++func (ctx *processingContext) mustAddPodNames(o runtime.Object) {

Also, if this must do something, "or else", let's not swallow the error case here but panic if meta.EachListItem returns an error. If it's okay to ignore possible error here, that isn't a "must" function... at least in my book :)
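For context, the usual Go convention this comment refers to: a must-prefixed function panics on failure instead of returning (or silently ignoring) the error. A generic sketch, unrelated to KUDO's actual helpers:

```go
package main

import (
	"fmt"
	"strconv"
)

// mustAtoi follows the "must" convention: it panics on failure
// rather than swallowing the error or returning it.
func mustAtoi(s string) int {
	n, err := strconv.Atoi(s)
	if err != nil {
		panic(err)
	}
	return n
}

func main() {
	fmt.Println(mustAtoi("42"))
}
```

The standard library uses the same convention, e.g. regexp.MustCompile.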

vemelin-epm

comment created time in 10 days

Pull request review commentkudobuilder/kudo

simplified diagnostics

+package diagnostics++import (+	"github.com/kudobuilder/kudo/pkg/kudoctl/env"+	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"+	"github.com/kudobuilder/kudo/pkg/version"+)++type runnerHelper struct {+	p *nonFailingPrinter+}++func (rh *runnerHelper) runForInstance(instance string, options *Options, c *kudo.Client, info version.Info, s *env.Settings) error {+	ir, err := newInstanceResources(instance, options, c, s)+	if err != nil {+		return err+	}++	ctx := &processingContext{root: DiagDir, instanceName: instance}+	instanceDiagRunner := &runner{}+	instanceDiagRunner.+		run(resourceCollectorGroup{+			{+				loadResourceFn: ir.instance,+				errKind:        "instance",+				parentDir:      ctx.attachToOperator,+				failOnError:    true,+				callback:       ctx.mustSetOperatorVersionNameFromInstance,+				printer:        rh.p,+				printMode:      ObjectWithDir},+			{+				loadResourceFn: ir.operatorVersion(ctx.operatorVersionName),+				errKind:        "operatorversion",+				parentDir:      ctx.attachToOperator,+				failOnError:    true,+				callback:       ctx.mustSetOperatorNameFromOperatorVersion,+				printer:        rh.p,+				printMode:      ObjectWithDir},+			{+				loadResourceFn: ir.operator(ctx.operatorName),+				errKind:        "operator",+				parentDir:      ctx.attachToRoot,+				failOnError:    true,+				printer:        rh.p,+				printMode:      ObjectWithDir}}).+		run(&resourceCollector{+			loadResourceFn: ir.pods,+			errKind:        "pod",+			parentDir:      ctx.attachToInstance,+			callback:       ctx.mustAddPodNames,+			printer:        rh.p,+			printMode:      ObjectListWithDirs}).+		run(&resourceCollector{+			loadResourceFn: ir.services,+			errKind:        "service",+			parentDir:      ctx.attachToInstance,+			printer:        rh.p,+			printMode:      RuntimeObject}).

I guess it's the error propagation. Each subsequent run only executes if there weren't any errors before. So it's equivalent to

r := &runner{}
if err := r.run(...); err != nil {
  return ...
}
if err := r.run(...); err != nil {
 return ...
}
...

which would be more explicit about errors. Not a blocker for me as well, but I'd prefer the more explicit error handling instead of the builder pattern here.

vemelin-epm

comment created time in 10 days

Pull request review commentkudobuilder/kudo

simplified diagnostics

+package diagnostics++import (+	"log"+	"os"+	"testing"+	"time"++	"github.com/ghodss/yaml"+	"github.com/spf13/afero"+	"github.com/stretchr/testify/assert"++	appsv1 "k8s.io/api/apps/v1"+	corev1 "k8s.io/api/core/v1"+	rbacv1beta1 "k8s.io/api/rbac/v1beta1"+	"k8s.io/apimachinery/pkg/api/errors"+	"k8s.io/apimachinery/pkg/api/meta"+	"k8s.io/apimachinery/pkg/runtime"+	"k8s.io/apimachinery/pkg/runtime/schema"+	"k8s.io/apimachinery/pkg/util/json"+	kubefake "k8s.io/client-go/kubernetes/fake"+	clienttesting "k8s.io/client-go/testing"++	"github.com/kudobuilder/kudo/pkg/apis/kudo/v1beta1"+	"github.com/kudobuilder/kudo/pkg/client/clientset/versioned/fake"+	"github.com/kudobuilder/kudo/pkg/kudoctl/env"+	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"+)++const (+	fakeNamespace  = "my-namespace"+	fakeZkInstance = "zookeeper-instance"+)++const (+	zkOperatorFile        = "diag/operator_zookeeper/zookeeper.yaml"+	zkOperatorVersionFile = "diag/operator_zookeeper/operatorversion_zookeeper-0.3.0/zookeeper-0.3.0.yaml"+	zkPod2File            = "diag/operator_zookeeper/instance_zookeeper-instance/pod_zookeeper-instance-zookeeper-2/zookeeper-instance-zookeeper-2.yaml"+	zkLog2File            = "diag/operator_zookeeper/instance_zookeeper-instance/pod_zookeeper-instance-zookeeper-2/zookeeper-instance-zookeeper-2.log.gz"+	zkServicesFile        = "diag/operator_zookeeper/instance_zookeeper-instance/servicelist.yaml"+	zkPod0File            = "diag/operator_zookeeper/instance_zookeeper-instance/pod_zookeeper-instance-zookeeper-0/zookeeper-instance-zookeeper-0.yaml"+	zkLog0File            = "diag/operator_zookeeper/instance_zookeeper-instance/pod_zookeeper-instance-zookeeper-0/zookeeper-instance-zookeeper-0.log.gz"+	zkInstanceFile        = "diag/operator_zookeeper/instance_zookeeper-instance/zookeeper-instance.yaml"+	zkPod1File            = "diag/operator_zookeeper/instance_zookeeper-instance/pod_zookeeper-instance-zookeeper-1/zookeeper-instance-zookeeper-1.yaml"+	zkLog1File            = 
"diag/operator_zookeeper/instance_zookeeper-instance/pod_zookeeper-instance-zookeeper-1/zookeeper-instance-zookeeper-1.log.gz"+	zkStatefulSetsFile    = "diag/operator_zookeeper/instance_zookeeper-instance/statefulsetlist.yaml"+	versionFile           = "diag/version.yaml"+	kmServicesFile        = "diag/kudo/servicelist.yaml"+	kmPodFile             = "diag/kudo/pod_kudo-controller-manager-0/kudo-controller-manager-0.yaml"+	kmLogFile             = "diag/kudo/pod_kudo-controller-manager-0/kudo-controller-manager-0.log.gz"+	kmServiceAccountsFile = "diag/kudo/serviceaccountlist.yaml"+	kmStatefulSetsFile    = "diag/kudo/statefulsetlist.yaml"+	settingsFile          = "diag/settings.yaml"+)++// defaultFileNames - all the files that should be created if no error happens+func defaultFileNames() map[string]struct{} {+	return map[string]struct{}{+		zkOperatorFile:        {},+		zkOperatorVersionFile: {},+		zkPod2File:            {},+		zkLog2File:            {},+		zkServicesFile:        {},+		zkPod0File:            {},+		zkLog0File:            {},+		zkInstanceFile:        {},+		zkPod1File:            {},+		zkLog1File:            {},+		zkStatefulSetsFile:    {},+		versionFile:           {},+		kmServicesFile:        {},+		kmPodFile:             {},+		kmLogFile:             {},+		kmServiceAccountsFile: {},+		kmStatefulSetsFile:    {},+		settingsFile:          {},+	}+}++// resource to be loaded into fake clients+var (+	// resource of the instance for which diagnostics is run+	pods            corev1.PodList+	serviceAccounts corev1.ServiceAccountList+	services        corev1.ServiceList+	statefulsets    appsv1.StatefulSetList+	pvs             corev1.PersistentVolumeList+	pvcs            corev1.PersistentVolumeClaimList+	operator        v1beta1.Operator+	operatorVersion v1beta1.OperatorVersion+	instance        v1beta1.Instance++	// kudo-manager resources+	kmNs              corev1.Namespace+	kmPod             corev1.Pod+	kmServices        corev1.ServiceList+	kmServiceAccounts 
corev1.ServiceAccountList+	kmStatefulsets    appsv1.StatefulSetList++	// resources unrelated to the diagnosed instance or kudo-manager, should not be collected+	cowPod                corev1.Pod+	defaultServiceAccount corev1.ServiceAccount+	clusterRole           rbacv1beta1.ClusterRole+)++var (+	kubeObjects objectList+	kudoObjects objectList+)++func check(err error) {+	if err != nil {+		log.Fatalln(err)+	}+}++func assertNilError(t *testing.T) func(error) {+	return func(e error) {+		assert.Nil(t, e)+	}+}++func mustReadObjectFromYaml(fs afero.Fs, fname string, object runtime.Object, check func(error)) {

Nit: The check parameter is shadowing the check function defined in this file. Let's rename the parameter (checkFn maybe?) so that this is clearer.

vemelin-epm

comment created time in 10 days

Pull request review commentkudobuilder/kudo

simplified diagnostics

```go
package diagnostics

import (
	"fmt"
	"io"
	"path/filepath"
	"reflect"

	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/apimachinery/pkg/runtime"
)

// resourceCollector - collector interface implementation for Kubernetes resources (runtime objects)
type resourceCollector struct {
	loadResourceFn func() (runtime.Object, error)
	errKind        string                 // object kind used to describe the error
	parentDir      func() string          // parent dir to attach the printer's output
	failOnError    bool                   // define whether the collector should return the error
	callback       func(o runtime.Object) // should be used to update some shared context
	printer        *nonFailingPrinter
	printMode      printMode
}

// collect - load a resource and send either the resource or collection error to printer
// return error if failOnError field is set to true
// if failOnError is true, finding no object(s) is treated as an error
func (c *resourceCollector) collect() error {
	obj, err := c.loadResourceFn()
	switch {
	case err != nil:
		if c.failOnError {
			return fmt.Errorf("failed to retrieve object(s) of kind %s: %v", c.errKind, err)
		}
		c.printer.printError(err, c.parentDir(), c.errKind)
	case obj == nil || reflect.ValueOf(obj).IsNil() || meta.IsListType(obj) && meta.LenList(obj) == 0:
		if c.failOnError {
			return fmt.Errorf("no object(s) of kind %s retrieved", c.errKind)
		}
	default:
		if c.callback != nil {
			c.callback(obj)
		}
		c.printer.printObject(obj, c.parentDir(), c.printMode)
	}
	return nil
}

// resourceCollectorGroup - a composite collector for Kubernetes runtime objects whose loading and printing depend on
// each other's side-effects on the shared context
type resourceCollectorGroup []resourceCollector

// collect - collect resource and run callback for each collector, print all afterwards
// collection failures are treated as fatal regardless of the collectors failOnError flag setting
func (g resourceCollectorGroup) collect() error {
	objs := make([]runtime.Object, len(g))
	modes := make([]printMode, len(g))
	for i, c := range g {
		obj, err := c.loadResourceFn()
		if err != nil {
			return fmt.Errorf("failed to retrieve object(s) of kind %s: %v", c.errKind, err)
		}
		if obj == nil || reflect.ValueOf(obj).IsNil() || meta.IsListType(obj) && meta.LenList(obj) == 0 {
```

See operator precedence nit above.

vemelin-epm

comment created time in 10 days

Pull request review commentkudobuilder/kudo

simplified diagnostics

```go
package diagnostics

import (
	"fmt"

	"github.com/kudobuilder/kudo/pkg/apis/kudo/v1beta1"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/apimachinery/pkg/runtime"
)

// processingContext - shared data for the resource collectors
// provides property accessors allowing to define a collector before the data it needs is available
// provides update callback functions. callbacks panic if called on a wrong type of runtime.Object
type processingContext struct {
```

Yeah, I came to the same conclusion. This isn't an easy problem, and having a homogeneous collection with a dynamic context seems to be the most readable approach.

vemelin-epm

comment created time in 10 days

Pull request review commentkudobuilder/kudo

simplified diagnostics

```go
package diagnostics

// collector - generic interface for diagnostic data collection
// implementors are expected to return only fatal errors and handle non-fatal ones themselves
type collector interface {
	collect() error
}

// runner - sequential runner for Collectors reducing error checking boilerplate code
type runner struct {
	fatalErr error
}

func (r *runner) run(c collector) *runner {
	if r.fatalErr == nil {
		r.fatalErr = c.collect()
	}
	return r
}

func (r *runner) runForEach(names []string, fn func(string) collector) *runner {
```

This function is only used with logCollector.collect, which always returns nil, i.e. no error. As a result, no further error checking is performed in this function. Hence, logCollector.collect could be changed to return nothing instead of an error that is always nil, and this function could be changed to

```go
func (r *runner) runForEach(names []string, fn func(string)) *runner {
	if r.fatalErr == nil {
		for _, name := range names {
			fn(name)
		}
	}
	return r
}
```
vemelin-epm

comment created time in 10 days

Pull request review commentkudobuilder/kudo

simplified diagnostics

```go
package cmd

import (
	"fmt"
	"time"

	"github.com/spf13/afero"
	"github.com/spf13/cobra"

	"github.com/kudobuilder/kudo/pkg/kudoctl/cmd/diagnostics"
	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"
)

const (
	diagCollectExample = `  # collect diagnostics example
  kubectl kudo diagnostics collect --instance=%instance% --namespace=%namespace%
`
)

func newDiagnosticsCmd(fs afero.Fs) *cobra.Command {
	cmd := &cobra.Command{
		Use:   "diagnostics",
		Short: "collect diagnostics",
		Long:  "diagnostics command has sub-commands to collect and analyze diagnostics data",
	}
	cmd.AddCommand(newDiagnosticsCollectCmd(fs))
	return cmd
}

func newDiagnosticsCollectCmd(fs afero.Fs) *cobra.Command {
	var logSince time.Duration
	var instance string
	cmd := &cobra.Command{
		Use:     "collect",
		Short:   "collect diagnostics",
		Long:    "collect data relevant for diagnostics of the provided instance's state",
		Example: diagCollectExample,
		RunE: func(cmd *cobra.Command, args []string) error {
			c, err := kudo.NewClient(Settings.KubeConfig, Settings.RequestTimeout, Settings.Validate)
			if err != nil {
				return fmt.Errorf("failed to create kudo client: %v", err)
			}
			return diagnostics.Collect(fs, instance, diagnostics.NewOptions(logSince), c, &Settings)
		},
	}
	cmd.Flags().StringVar(&instance, "instance", "", "The instance name.")
	cmd.Flags().DurationVar(&logSince, "log-since", 0, "Only return logs newer than a relative duration like 5s, 2m, or 3h. Defaults to all logs. Only one of since-time / since may be used.")
```

IMO, it's clear that this is a single-use parameter. Also s/since-time/log-since/

```go
	cmd.Flags().DurationVar(&logSince, "log-since", 0, "Only return logs newer than a relative duration like 5s, 2m, or 3h. Defaults to all logs.")
```
vemelin-epm

comment created time in 10 days

Pull request review commentkudobuilder/kudo

simplified diagnostics

```go
package diagnostics

import (
	"fmt"
	"io"
	"path/filepath"
	"reflect"

	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/apimachinery/pkg/runtime"
)

// resourceCollector - collector interface implementation for Kubernetes resources (runtime objects)
type resourceCollector struct {
	loadResourceFn func() (runtime.Object, error)
	errKind        string                 // object kind used to describe the error
	parentDir      func() string          // parent dir to attach the printer's output
	failOnError    bool                   // define whether the collector should return the error
	callback       func(o runtime.Object) // should be used to update some shared context
	printer        *nonFailingPrinter
	printMode      printMode
}

// collect - load a resource and send either the resource or collection error to printer
// return error if failOnError field is set to true
// if failOnError is true, finding no object(s) is treated as an error
func (c *resourceCollector) collect() error {
	obj, err := c.loadResourceFn()
	switch {
	case err != nil:
		if c.failOnError {
			return fmt.Errorf("failed to retrieve object(s) of kind %s: %v", c.errKind, err)
		}
		c.printer.printError(err, c.parentDir(), c.errKind)
	case obj == nil || reflect.ValueOf(obj).IsNil() || meta.IsListType(obj) && meta.LenList(obj) == 0:
```

Nit, because operator precedence isn't always well-known:

```go
	case (obj == nil) || reflect.ValueOf(obj).IsNil() || (meta.IsListType(obj) && (meta.LenList(obj) == 0)):
```
vemelin-epm

comment created time in 10 days

Pull request review commentkudobuilder/kudo

simplified diagnostics

```go
package cmd

import (
	"fmt"
	"time"

	"github.com/spf13/afero"
	"github.com/spf13/cobra"

	"github.com/kudobuilder/kudo/pkg/kudoctl/cmd/diagnostics"
	"github.com/kudobuilder/kudo/pkg/kudoctl/util/kudo"
)

const (
	diagCollectExample = `  # collect diagnostics example
  kubectl kudo diagnostics collect --instance=%instance% --namespace=%namespace%
```

Consistency nitpick: Other command examples don't use "%instance%" and "%namespace%" placeholders. Looks like all examples are using "dev-flink" instead. Personally, I'd prefer "%instance%" here, but let's stay consistent and use "dev-flink" instead. Also, let's remove the namespace parameter.

vemelin-epm

comment created time in 10 days
