PR opened talos-systems/talos

feat: implement streaming mode of dmesg, parse messages

Fixes #1563

This implements dmesg reading via /dev/kmsg, with message parsing and formatting. Kernel log facility and severity are parsed, timestamp is calculated relative to boot time (it's accurate unless time jumps a lot during node lifetime).

New flags were added to follow dmesg; the tail flag allows streaming only new messages (ignoring old ones). We could try to implement tailing the last N messages, which is just a bit more work; open to suggestions (for symmetry with regular logs).

Signed-off-by: Andrey Smirnov smirnov.andrey@gmail.com

+728 -34

0 comment

14 changed files

pr created time in 12 hours

push event smira/talos

Andrey Smirnov

commit sha e199c4064c7af4ef6c739a14dd5ddc2b4d97c0b1

feat: implement streaming mode of dmesg, parse messages Fixes #1563 This implements dmesg reading via `/dev/kmsg`, with message parsing and formatting. Kernel log facility and severity are parsed, timestamp is calculated relative to boot time (it's accurate unless time jumps a lot during node lifetime). New flags were added to follow dmesg; the tail flag allows streaming only new messages (ignoring old ones). We could try to implement tailing the last N messages, which is just a bit more work; open to suggestions (for symmetry with regular logs). Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

push time in 12 hours

create branch smira/talos

branch : dmesg-real-streaming

created branch time in 12 hours

Pull request review comment talos-systems/talos

refactor: rename protobuf services, RPCs, and messages

 func main() {

 	// all existing streaming methods
 	for _, methodName := range []string{
-		"/machine.Machine/CopyOut",
+		"/machine.Machine/Copy",

:+1:

andrewrynhard

comment created time in 15 hours

Pull request review comment talos-systems/talos

feat: add config nodes command

 func Execute() {

 	rootCmd.PersistentFlags().StringVar(&talosconfig, "talosconfig", defaultTalosConfig, "The path to the Talos configuration file")
 	rootCmd.PersistentFlags().StringVar(&cmdcontext, "context", "", "Context to be used in command")
-	rootCmd.PersistentFlags().StringSliceVarP(&nodes, "nodes", "", []string{}, "target the specified nodes")
+	rootCmd.PersistentFlags().StringSliceVarP(&nodes, "nodes", "n", []string{}, "target the specified nodes")

I skipped that as I thought -n is usually shorthand for --dry-run, but probably that is not important

andrewrynhard

comment created time in 17 hours

Pull request review comment talos-systems/talos

feat: add config nodes command

 var configEndpointCmd = &cobra.Command{
 	},
 }

+// configNodeCmd represents the config node command.
+var configNodeCmd = &cobra.Command{
+	Use:     "node <endpoint>...",

should be "<node>" or "<address>"? as we don't allow port there?

andrewrynhard

comment created time in 17 hours

issue comment talos-systems/talos

test-framework/basic-integration failed to fail

I think Brad's suggestion makes perfect sense to use /bin/bash as an entrypoint, and launch hyperkube if we need to access kubectl? that way we get good error handling

bradbeam

comment created time in 17 hours

issue comment talos-systems/talos

test-framework/basic-integration failed to fail

I guess that's the reason we don't see the failure: https://github.com/kubernetes/kubernetes/blob/master/cluster/images/hyperkube/hyperkube#L56

bradbeam

comment created time in a day

create branch smira/go-kmsg-parser

branch : nofollow

created branch time in 2 days

Pull request review comment talos-systems/talos

feat: Upgrade kubernetes to 1.17.0

 A collection of commands for managing local docker-based clusters
       --context string       Context to be used in command
   -e, --endpoints strings    override default endpoints in Talos configuration
       --nodes strings        target the specified nodes
-      --talosconfig string   The path to the Talos configuration file (default "/home/smira/.talos/config")
+      --talosconfig string   The path to the Talos configuration file (default "/home/user/.talos/config")

:man_facepalming: missed that last time

bradbeam

comment created time in 2 days

Pull request review comment talos-systems/talos

feat: add create and overwrite file operations

 func (task *ExtraFiles) TaskFunc(mode runtime.Mode) phase.TaskFunc {
 	return task.runtime
 }

+// nolint: gocyclo
 func (task *ExtraFiles) runtime(r runtime.Runtime) (err error) {
-	var result *multierror.Error
+	var (
+		content string
+		result  *multierror.Error
+	)

 	for _, f := range r.Config().Machine().Files() {
-		// Slurp existing file if append is our op and add contents to it
-		if f.Op == "append" {
+		switch f.Op {
+		case "create":
+			if err = existsAndIsFile(f.Path); err == nil {
+				return fmt.Errorf("file must not exist: %q", f.Path)
+			}
+		case "overwrite":
+			if err = existsAndIsFile(f.Path); err != nil {
+				return err
+			}
+
+			if err = os.Remove(f.Path); err != nil {

yep, as it will bind mount overwriting it effectively I guess in any case

andrewrynhard

comment created time in 2 days

fork smira/go-kmsg-parser

A simpler parser for the /dev/kmsg format

fork in 2 days

Pull request review comment talos-systems/talos

fix: require file operation

 type MachineConfig struct {
 	MachineInstall *InstallConfig `yaml:"install,omitempty"`
 	//   description: |
 	//     Allows the addition of user specified files.
+	//     The value of `op` can be `create`, `overwrite`, or `append`.
+	//     In the case of `create`, `path` must not exist.
+	//     In the case of `overwrite`, and `append`, `path` must be a valid file.
+	//     If an `op` value of `append` is used, the existing file will be appended.
 	//     Note that the file contents are not required to be base64 encoded.
 	//   examples:
 	//     - |
-	//       kubelet:
-	//         contents: |
-	//           ...
-	//         permissions: 0666
-	//         path: /tmp/file.txt
+	//       files:
+	//         - contents: |

should be content now?

andrewrynhard

comment created time in 2 days

Pull request review comment talos-systems/talos

fix: require file operation

 func (task *ExtraFiles) TaskFunc(mode runtime.Mode) phase.TaskFunc {
 	return task.runtime
 }

+// nolint: gocyclo
 func (task *ExtraFiles) runtime(r runtime.Runtime) (err error) {
-	var result *multierror.Error
+	var (
+		content string
+		result  *multierror.Error
+	)

 	for _, f := range r.Config().Machine().Files() {
-		// Slurp existing file if append is our op and add contents to it
-		if f.Op == "append" {
+		switch f.Op {
+		case "create":
+			if err = existsAndIsFile(f.Path); err == nil {
+				return fmt.Errorf("file must not exist: %q", f.Path)
+			}
+		case "overwrite":
+			if err = existsAndIsFile(f.Path); err != nil {
+				return err
+			}
+
+			if err = os.Remove(f.Path); err != nil {

can we remove from readonly rootfs? or we can just bind mount on top of that?

andrewrynhard

comment created time in 2 days

Pull request review comment talos-systems/talos

fix: require file operation

 type MachineConfig struct {
 	MachineInstall *InstallConfig `yaml:"install,omitempty"`
 	//   description: |
 	//     Allows the addition of user specified files.
+	//     If an `op` value of `append` is used, the existing file will be appended.

I guess we need to document that op can be one of: write, append, as write is not in the docs?

andrewrynhard

comment created time in 2 days

Pull request review comment talos-systems/talos

fix: require file operation

 install: #### files  Allows the addition of user specified files.+If an `op` value of `append` is used, the existing file will be appended. Note that the file contents are not required to be base64 encoded.  Type: `array`  Examples:  ```yaml-kubelet:-  contents: |-    ...-  permissions: 0666-  path: /tmp/file.txt+files:+  - contents: |+      ...+    permissions: 0666+    path: /tmp/file.txt+    op: [write,append]

is this a list, or this is supposed to be single value?

andrewrynhard

comment created time in 2 days

push event talos-systems/talos

Andrey Smirnov

commit sha 399aeda0b9470e4d3c7b14d701fb9ecdc64bbaf0

feat: rename confusing target options, --endpoints, etc. Fixes #1610 1. In `talosconfig`, deprecate `Target` in favor of `Endpoints` (client-side LB to come next). 2. In `osctl`, use `--nodes` in place of `--target`. 3. In `osctl` add option `--endpoints` to override `Endpoints` for the call. Other changes are just updates to catch up with the changes. Most probably I missed something... And CAPI provider needs update. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

push time in 2 days

PR merged talos-systems/talos

feat: rename confusing target options, --endpoints, etc.

Fixes #1610

  1. In talosconfig, deprecate Target in favor of Endpoints (client-side LB to come next).

  2. In osctl, use --nodes in place of --target.

  3. In osctl add option --endpoints to override Endpoints for the call.

Other changes are just updates to catch up with the changes. Most probably I missed something... And CAPI provider needs update.

Signed-off-by: Andrey Smirnov smirnov.andrey@gmail.com

+304 -227

0 comment

66 changed files

smira

pr closed time in 2 days

issue closed talos-systems/talos

Rename target/targets in talosconfig/osctl

In talosconfig, get rid of Target field and command config target. In general, it's not safe to change target to any node, as some commands (e.g. kubeconfig) work only against master nodes. At the same time it's easy to forget that target was changed in config and try to debug failures coming from it.

Plus, config target is easily confused with the --target flag, although the two do completely different things.

Proposal: drop talosconfig.Target, introduce talosconfig.Endpoints []string which lists all the master endpoints. Talos client should pass them all to grpc.ClientConn for round-robin/failover.

Add flag -e/--endpoint <endpoint> to osctl to change the endpoint for the call (if we want to hit a specific node).

Rename --target flag to --nodes. (Probably makes sense to rename grpc metadata to nodes as well to stay consistent).
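A rough sketch of how an existing single target might be carried over into the proposed `Endpoints` list when an old config is loaded. The field names follow the proposal; the `upgrade` method body and the rule "only fill `Endpoints` when it is empty" are assumptions for illustration, not the behavior of the actual change.

```go
package main

import "fmt"

// Context is a simplified, hypothetical stand-in for a talosconfig context.
type Context struct {
	Target    string   // old single-target field, to be dropped
	Endpoints []string // proposed list of master endpoints
}

// upgrade moves a legacy Target into Endpoints (assumed migration rule).
func (c *Context) upgrade() {
	if c.Target == "" {
		return
	}

	// Keep explicitly configured endpoints; only fall back to the old target.
	if len(c.Endpoints) == 0 {
		c.Endpoints = []string{c.Target}
	}

	c.Target = ""
}

func main() {
	ctx := &Context{Target: "10.5.0.2"}
	ctx.upgrade()
	fmt.Printf("target=%q endpoints=%v\n", ctx.Target, ctx.Endpoints)
}
```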

closed time in 2 days

smira

pull request comment talos-systems/talos

feat: add security hardening settings

I guess it's https://github.com/talos-systems/bootkube/pull/2

andrewrynhard

comment created time in 2 days

Pull request review comment talos-systems/talos

feat: rename confusing target options, --endpoints, etc.

 type Config struct {
 	Contexts map[string]*Context `yaml:"contexts"`
 }

+func (c *Config) upgrade() {
+	for _, ctx := range c.Contexts {
+		ctx.upgrade()
+	}
+}
+
 // Context represents the set of credentials required to talk to a target.
 type Context struct {
-	Target string `yaml:"target"`
-	CA     string `yaml:"ca"`
-	Crt    string `yaml:"crt"`
-	Key    string `yaml:"key"`
+	DeprecatedTarget string   `yaml:"target,omitempty"` // Field deprecated in favor of Endpoints
+	Endpoints        []string `yaml:"endpoints"`

added, acts as default value for --nodes

smira

comment created time in 2 days

push event smira/talos

Andrey Smirnov

commit sha f45966c7976499eaa74aee51980cc955a18d1470

feat: rename confusing target options, --endpoints, etc. Fixes #1610 1. In `talosconfig`, deprecate `Target` in favor of `Endpoints` (client-side LB to come next). 2. In `osctl`, use `--nodes` in place of `--target`. 3. In `osctl` add option `--endpoints` to override `Endpoints` for the call. Other changes are just updates to catch up with the changes. Most probably I missed something... And CAPI provider needs update. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

push time in 2 days

Pull request review comment talos-systems/talos

feat: rename confusing target options, --endpoints, etc.

 type Config struct {
 	Contexts map[string]*Context `yaml:"contexts"`
 }

+func (c *Config) upgrade() {
+	for _, ctx := range c.Contexts {
+		ctx.upgrade()
+	}
+}
+
 // Context represents the set of credentials required to talk to a target.
 type Context struct {
-	Target string `yaml:"target"`
-	CA     string `yaml:"ca"`
-	Crt    string `yaml:"crt"`
-	Key    string `yaml:"key"`
+	DeprecatedTarget string   `yaml:"target,omitempty"` // Field deprecated in favor of Endpoints
+	Endpoints        []string `yaml:"endpoints"`

default to empty? and command to set them?

smira

comment created time in 2 days

issue opened talos-systems/talos

[osctl] client load-balancing

Implement client-side load-balancing across endpoints in talosconfig.

See https://github.com/grpc/grpc-go/tree/master/examples/features/load_balancing/client but need to figure out full story with TLS creds, DNS, etc.

created time in 2 days

push event smira/talos

Brad Beam

commit sha 9f69733d747dfdac9af7403c5446e3a4b2e523df

chore: Remove increased timeouts for dhcp addressing. These timeouts were initially increased to handle long times for links to be ready. I think with the updated link ready check in networkd these timers are unnecessary. Signed-off-by: Brad Beam <brad.beam@talos-systems.com>

Andrew Rynhard

commit sha fa515b81171059386ddff03280f2989e0ac1fd3b

fix: kill POD network mode pods first on upgrades When we upgrade a node, we kill off all pods before performing a fresh install. The issue with this is that we run the risk of killing the CNI pod before we finish killing all other pods, leaving the CRI unable to tear down the pod's networking. This works around that by first killing any pods running without host networking so that the CNI can do its job, and then removing the remaining pods. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>

Andrey Smirnov

commit sha b9d9569a004339d00a00486e83e7e10ca96146c8

feat: rename confusing target options, --endpoints, etc. Fixes #1610 1. In `talosconfig`, deprecate `Target` in favor of `Endpoints` (client-side LB to come next). 2. In `osctl`, use `--nodes` in place of `--target`. 3. In `osctl` add option `--endpoints` to override `Endpoints` for the call. Other changes are just updates to catch up with the changes. Most probably I missed something... And CAPI provider needs update. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

push time in 2 days

PR opened talos-systems/talos

feat: rename confusing target options, --endpoints, etc.

Fixes #1610

  1. In talosconfig, deprecate Target in favor of Endpoints (client-side LB to come next).

  2. In osctl, use --nodes in place of --target.

  3. In osctl add option --endpoints to override Endpoints for the call.

Other changes are just updates to catch up with the changes. Most probably I missed something... And CAPI provider needs update.

Signed-off-by: Andrey Smirnov smirnov.andrey@gmail.com

+280 -217

0 comment

64 changed files

pr created time in 2 days

create branch smira/talos

branch : targets-endpoints-node

created branch time in 2 days

push event talos-systems/talos

Andrey Smirnov

commit sha 3a93e65b5480a02c22397244284417d4ee5c5b46

feat: make osd.Dmesg API streaming This is to prepare for upcoming switch to reading `/dev/kmsg` which should allow following logs, doing some kind of tail, etc. The output is far from being perfect, as `dmesg` data is delivered as single chunk (not as lines), but once server side updates, client side should match it. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

push time in 3 days

PR merged talos-systems/talos

feat: make osd.Dmesg API streaming

This is to prepare for upcoming switch to reading /dev/kmsg which should allow following logs, doing some kind of tail, etc.

The output is far from being perfect, as dmesg data is delivered as single chunk (not as lines), but once server side updates, client side should match it.

Signed-off-by: Andrey Smirnov smirnov.andrey@gmail.com

+260 -153

0 comment

6 changed files

smira

pr closed time in 3 days

Pull request review comment talos-systems/talos

fix: kill POD network mode pods first on upgrades

 func (c *Client) PodSandboxStatus(ctx context.Context, podSandBoxID string) (*ru

 	return resp.Status, resp.Info, nil
 }
+
+// RemovePodSandboxesWithNetworkMode removes all pods with the specified network mode.
+func (c *Client) RemovePodSandboxesWithNetworkMode(mode runtimeapi.NamespaceMode) (err error) {
+	ctx := context.Background()
+
+	pods, err := c.ListPodSandbox(ctx, nil)
+	if err != nil {
+		return err
+	}
+
+	var g errgroup.Group
+
+	for _, pod := range pods {
+		pod := pod // https://golang.org/doc/faq#closures_and_goroutines
+
+		status, _, err := c.PodSandboxStatus(ctx, pod.GetId())
+		if err != nil {
+			return err
+		}
+
+		networkMode := status.GetLinux().GetNamespaces().GetOptions().GetNetwork()
+
+		if networkMode != mode {
+			continue
+		}
+
+		g.Go(func() error {
+			if err := remove(ctx, c, pod); err != nil {
+				return err

oh, missed that, sorry for the noise!

andrewrynhard

comment created time in 3 days

Pull request review comment talos-systems/talos

fix: kill POD network mode pods first on upgrades

 func (c *Client) PodSandboxStatus(ctx context.Context, podSandBoxID string) (*ru

 	return resp.Status, resp.Info, nil
 }
+
+// RemovePodSandboxesWithNetworkMode removes all pods with the specified network mode.
+func (c *Client) RemovePodSandboxesWithNetworkMode(mode runtimeapi.NamespaceMode) (err error) {
+	ctx := context.Background()
+
+	pods, err := c.ListPodSandbox(ctx, nil)
+	if err != nil {
+		return err
+	}
+
+	var g errgroup.Group
+
+	for _, pod := range pods {
+		pod := pod // https://golang.org/doc/faq#closures_and_goroutines
+
+		status, _, err := c.PodSandboxStatus(ctx, pod.GetId())
+		if err != nil {
+			return err
+		}
+
+		networkMode := status.GetLinux().GetNamespaces().GetOptions().GetNetwork()
+
+		if networkMode != mode {
+			continue
+		}
+
+		g.Go(func() error {
+			if err := remove(ctx, c, pod); err != nil {
+				return err

looks like could be return remove(ctx, c, pod)

plus it needs to wrap variables, as pod is for loop variable
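For reference, a small sketch of the loop-variable point. It uses `sync.WaitGroup` from the standard library instead of `errgroup` to stay dependency-free, and `removeAll` with its string "pods" is made up for the example. Before Go 1.22 the `range` variable was shared across iterations, so a goroutine started in the loop could observe a later value unless the variable was copied first; from Go 1.22 on each iteration gets its own variable, and the copy is harmless.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// removeAll pretends to remove every pod concurrently and returns the
// names that were handled, sorted for stable output.
func removeAll(pods []string) []string {
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		removed []string
	)

	for _, pod := range pods {
		pod := pod // per-iteration copy, so the goroutine below sees this pod

		wg.Add(1)

		go func() {
			defer wg.Done()

			mu.Lock()
			defer mu.Unlock()

			removed = append(removed, pod)
		}()
	}

	wg.Wait()
	sort.Strings(removed)

	return removed
}

func main() {
	fmt.Println(removeAll([]string{"coredns", "flannel", "kube-proxy"}))
}
```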

andrewrynhard

comment created time in 3 days

Pull request review comment talos-systems/talos

fix: kill POD network mode pods first on upgrades

 func (task *RemoveAllPods) standard() (err error) {
 	// nolint: errcheck
 	defer client.Close()

+	// We remove pods with POD network mode first so that the CNI can perform
+	// any cleanup tasks. If we don't do this, we run the risk of killing the
+	// CNI, preventing the CRI from cleaning up the pod's netwokring.
+
+	if err = RemovePodsWithNetworkMode(client, runtimeapi.NamespaceMode_POD); err != nil {
+		return err
+	}
+
+	// With the POD network mode pods out of the way, we kill the remaining
+	// pods.
+
+	if err = RemovePodsWithNetworkMode(client, runtimeapi.NamespaceMode_NODE); err != nil {

One easy way imho is to use pointer receiver and skip the check if value is nil, but probably passing pointer to value would be challenging

andrewrynhard

comment created time in 3 days

push event smira/talos

Tim Gerla

commit sha 9a2fd989c9243ae94401ee7681361cc05be468b3

fix: improve the project site meta description This change should slightly improve the search engine placement of our docs/project site by being a bit more descriptive. Signed-off-by: Tim Gerla <tim@gerla.net>

Andrey Smirnov

commit sha 0ba3980cc6a9d7491d648a4bce0434661d9eac49

feat: make osd.Dmesg API streaming This is to prepare for upcoming switch to reading `/dev/kmsg` which should allow following logs, doing some kind of tail, etc. The output is far from being perfect, as `dmesg` data is delivered as single chunk (not as lines), but once server side updates, client side should match it. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

push time in 3 days

PR opened talos-systems/talos

feat: make osd.Dmesg API streaming

This is to prepare for upcoming switch to reading /dev/kmsg which should allow following logs, doing some kind of tail, etc.

The output is far from being perfect, as dmesg data is delivered as single chunk (not as lines), but once server side updates, client side should match it.

Signed-off-by: Andrey Smirnov smirnov.andrey@gmail.com

+260 -153

0 comment

6 changed files

pr created time in 3 days

create branch smira/talos

branch : dmesg-streaming

created branch time in 3 days

Pull request review comment talos-systems/talos

fix: kill POD network mode pods first on upgrades

 func (task *RemoveAllPods) standard() (err error) {
 	// nolint: errcheck
 	defer client.Close()

+	// We remove pods with POD network mode first so that the CNI can perform
+	// any cleanup tasks. If we don't do this, we run the risk of killing the
+	// CNI, preventing the CRI from cleaning up the pod's netwokring.
+
+	if err = RemovePodsWithNetworkMode(client, runtimeapi.NamespaceMode_POD); err != nil {
+		return err
+	}
+
+	// With the POD network mode pods out of the way, we kill the remaining
+	// pods.
+
+	if err = RemovePodsWithNetworkMode(client, runtimeapi.NamespaceMode_NODE); err != nil {

I mean in the future there might be more modes available, or something else...

andrewrynhard

comment created time in 3 days

Pull request review comment talos-systems/talos

fix: kill POD network mode pods first on upgrades

 func (task *RemoveAllPods) standard() (err error) {
 	// nolint: errcheck
 	defer client.Close()

+	// We remove pods with POD network mode first so that the CNI can perform
+	// any cleanup tasks. If we don't do this, we run the risk of killing the
+	// CNI, preventing the CRI from cleaning up the pod's netwokring.
+
+	if err = RemovePodsWithNetworkMode(client, runtimeapi.NamespaceMode_POD); err != nil {
+		return err
+	}
+
+	// With the POD network mode pods out of the way, we kill the remaining
+	// pods.
+
+	if err = RemovePodsWithNetworkMode(client, runtimeapi.NamespaceMode_NODE); err != nil {

should we kill really all the remaining pods here? (ignoring network namespace altogether?)

andrewrynhard

comment created time in 3 days

push event smira/talos

Tim Gerla

commit sha 343cba04d3af8674a3250168543baff583cd3e0d

fix: update node dependencies for project website Update node dependencies on the project website to fix a GitHub-flagged vulnerability in serialize-javascript. Signed-off-by: Tim Gerla <tim@gerla.net>

Andrey Smirnov

commit sha 373b88ba1219002885cd6ec948a8e9fd6ee8f33a

fix: response filtering for client API, RunE for osctl There are several changes which clean up and address features of osctl, mostly for multi-node requests: * responses are filtered, so that client commands can print partial failures/success responses; * `RunE` is used in place of `Run` to propagate correct return sequence on failures; * cleaned up setting `targets` metadata on outgoing requests, it is set by default in `globalCtx` already Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

push time in 3 days

PR opened talos-systems/talos

fix: response filtering for client API, RunE for osctl

There are several changes which clean up and address features of osctl, mostly for multi-node requests:

  • responses are filtered, so that client commands can print partial failures/success responses;
  • RunE is used in place of Run to propagate correct return sequence on failures;
  • cleaned up setting targets metadata on outgoing requests, it is set by default in globalCtx already

Signed-off-by: Andrey Smirnov smirnov.andrey@gmail.com

+205 -110

0 comment

12 changed files

pr created time in 3 days

create branch smira/talos

branch : resp-filtering-rune

created branch time in 3 days

issue comment talos-systems/talos

`osctl dmesg` should support streaming

https://github.com/euank/go-kmsg-parser

smira

comment created time in 3 days

issue closed talos-systems/talos

`osctl logs` shouldn't be following logs by default

By default it should work like cat some.log, and with -f option it should do tail -f some.log.

closed time in 3 days

smira

issue comment talos-systems/talos

`osctl logs` shouldn't be following logs by default

Closed via #1597

smira

comment created time in 3 days

push event smira/talos

Spencer Smith

commit sha c3deb3e439cbd167232736701a97fa095f5a408a

docs: update with new cni abilities This PR updates the docs for cni Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>

Andrey Smirnov

commit sha 258ec167ab06084c0ff37c754485cce24598c21f

docs: update generated osctl documentation `--context`, `osctl logs -f`, `osctl read` Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

push time in 6 days

PR opened talos-systems/talos

docs: update generated osctl documentation

--context, osctl logs -f, osctl read

Signed-off-by: Andrey Smirnov smirnov.andrey@gmail.com

+119 -44

0 comment

39 changed files

pr created time in 6 days

create branch smira/talos

branch : osctl-docs-update

created branch time in 6 days

Pull request review comment talos-systems/talos

test: add retries to the test which verifies cluster version

 func (t ticker) Stop() {
 // ExpectedError error represents an error that is expected by the retrying
 // function. This error is ignored.
 func ExpectedError(err error) error {
+	if err == nil {

added this as shortcut, so that we could do return ExpectedError(someFunc()) without caring about return status in RetryableFunc

smira

comment created time in 6 days

push event smira/talos

Andrey Smirnov

commit sha 212f73700d2ab6cb03a7cb894a22b85f0e12bb0a

test: add retries to the test which verifies cluster version It fails on AWS, need to figure out if it's transient failure or not. While I was there, found lots of small bugs when endpoint is unresponsive, or target nodes are unresponsive and fixed them. In retry formatting added `\t` so that embedded errors are better aligned in the output (same as multierror). Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

push time in 6 days

Pull request review comment talos-systems/talos

test: add retries to the test which verifies cluster version

 func (e *ErrorSet) Error() string {

 	errString := fmt.Sprintf("%d error(s) occurred:", len(e.errs))
 	for _, err := range e.errs {
-		errString = fmt.Sprintf("%s\n%s", errString, err)
+		errString = fmt.Sprintf("%s\n\t%s", errString, err)

oh yeah, totally forgot, thanks, I'll get it updated

smira

comment created time in 6 days

PR opened talos-systems/talos

test: add retries to the test which verifies cluster version

It fails on AWS, need to figure out if it's transient failure or not.

While I was there, found lots of small bugs when endpoint is unresponsive, or target nodes are unresponsive and fixed them.

In retry formatting added \t so that embedded errors are better aligned in the output (same as multierror).

Signed-off-by: Andrey Smirnov smirnov.andrey@gmail.com

+48 -8

0 comment

6 changed files

pr created time in 6 days

create branch smira/talos

branch : integration-test-retries

created branch time in 6 days

Pull request review comment talos-systems/talos

feat: allow ability to specify custom CNIs

 func generateAssets(config runtime.Configurator) (err error) {
 		return err
 	}

+	// If "none" is the CNI, we expect the user to supply one or more urls that point to CNI yamls
+	if config.Cluster().Network().CNI().Name() == "none" {
+		ctx := context.Background()
+
+		for _, url := range config.Cluster().Network().CNI().URLs() {
+			urlExploded := strings.Split(url, "/")
+			fileName := urlExploded[len(urlExploded)-1]

plus it might be nice to filter out . and .. and other friends... or, after filepath.Clean(), fail if fileName becomes empty

rsmitty

comment created time in 6 days

Pull request review comment talos-systems/talos

feat: allow ability to specify custom CNIs

 func generateAssets(config runtime.Configurator) (err error) {
 		return err
 	}

+	// If "none" is the CNI, we expect the user to supply one or more urls that point to CNI yamls
+	if config.Cluster().Network().CNI().Name() == "none" {
+		ctx := context.Background()
+
+		for _, url := range config.Cluster().Network().CNI().URLs() {
+			urlExploded := strings.Split(url, "/")
+			fileName := urlExploded[len(urlExploded)-1]

there's a nice chance it ends up empty if URL looks like http://host/some/dir/

idk if path.Base() might do better job here
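One possible shape for that, as a sketch only: parse the URL, take `path.Base()` of its path, and reject results that are not usable file names. `fileNameFromURL` and the sample URLs are hypothetical. Note that for a URL ending in a slash, `path.Base()` returns the last directory name rather than an empty string, which may or may not be what is wanted here.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// fileNameFromURL derives a local file name from the last path element of
// rawURL and rejects ".", ".." and "/" (what path.Base yields for empty or
// root paths).
func fileNameFromURL(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}

	name := path.Base(u.Path)
	if name == "." || name == ".." || name == "/" {
		return "", fmt.Errorf("cannot derive a file name from %q", rawURL)
	}

	return name, nil
}

func main() {
	for _, raw := range []string{
		"http://host/manifests/cni.yaml",
		"http://host/some/dir/",
		"http://host/",
		"http://host/a/..",
	} {
		name, err := fileNameFromURL(raw)
		fmt.Printf("%s -> %q, err=%v\n", raw, name, err)
	}
}
```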

rsmitty

comment created time in 6 days

issue opened talos-systems/talos

Rename target/targets in talosconfig/osctl

In talosconfig, get rid of Target field and command config target. In general, it's not safe to change target to any node, as some commands (e.g. kubeconfig) work only against master nodes. At the same time it's easy to forget that target was changed in config and try to debug failures coming from it.

Plus, config target is easily confused with the --target flag, although the two do completely different things.

Proposal: drop talosconfig.Target, introduce talosconfig.Endpoints []string which lists all the master endpoints. Talos client should pass them all to grpc.ClientConn for round-robin/failover.

Add flag -e/--endpoint <endpoint> to osctl to change the endpoint for the call (if we want to hit a specific node).

Rename --target flag to --endpoints. (Probably makes sense to rename grpc metadata to nodes as well to stay consistent).

created time in 6 days

PR opened talos-systems/talos

test: fix flakey test on linear retries

Retry intervals 0 + 100 + 200 + 300 + 400 ms (1000ms) align perfectly with retry timeout (1s), so test might fire 4 or 5 retries depending on timing.

Signed-off-by: Andrey Smirnov smirnov.andrey@gmail.com

+2 -2

0 comment

1 changed file

pr created time in 6 days

create branch smira/talos

branch : fix-linear-retry

created branch time in 6 days

pull request comment talos-systems/talos

fix: error reporting in `osctl kubeconfig`

found a bug, please don't merge...

bug fixed, should be good now

smira

comment created time in 6 days

push event smira/talos

Andrey Smirnov

commit sha 059d9403a41ce0995a4799e34483fb88ffdb705d

fix: error reporting in `osctl kubeconfig` The problem seems to be on multiple levels, and a few changes got mixed in from another PR (same file changed). The core of the issue is that `helpers.Fatalf()` calls `os.Exit()`, which terminates execution and doesn't let `defer` and other handlers run. This uses the Cobra error propagation feature to pop errors through the stack back to the root command. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

push time in 6 days

Pull request review comment talos-systems/talos

fix: make retry errors ordered

 func (e *ErrorSet) Append(err error) error {
 	defer e.mu.Unlock()

 	if e.errs == nil {
-		e.errs = make(map[string]error)
+		e.errs = []error{}
 	}

-	if _, ok := e.errs[err.Error()]; !ok {
-		e.errs[err.Error()] = err
+	ok := false
+
+	for _, existingErr := range e.errs {
+		if err == existingErr {

should we compare err.Error() == existingErr.Error() ?

as errors might differ by pointer but be the same by value (if, say, generated with fmt.Errorf())

andrewrynhard

comment created time in 6 days

pull request comment talos-systems/talos

fix: error reporting in `osctl kubeconfig`

found a bug, please don't merge...

smira

comment created time in 6 days

push event smira/talos

Andrey Smirnov

commit sha 7a424e49273ae5241aceb720a0c81b3d5e967319

fix: error reporting in `osctl kubeconfig` The problem seems to be on multiple levels, and a few changes got mixed in from another PR (same file changed). The core of the issue is that `helpers.Fatalf()` calls `os.Exit()`, which terminates execution and doesn't let `defer` and other handlers run. This uses the Cobra error propagation feature to pop errors through the stack back to the root command. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

push time in 6 days

PR opened talos-systems/talos

fix: error reporting in `osctl kubeconfig`

The problem seems to be on multiple levels, and a few changes got mixed in from another PR (same file changed).

The core of the issue is that helpers.Fatalf() calls os.Exit(), which terminates execution and doesn't let defer and other handlers run. This uses the Cobra error propagation feature to pop errors through the stack back to the root command.
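The pattern in miniature, using only the standard library (the Cobra wiring is not shown, and `fetch` and `cleanedUp` are made up for the sketch): the worker returns an error so its deferred cleanup runs, and only `main` decides on the exit code.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// cleanedUp records that the deferred cleanup ran; it stands in for
// something like closing a client connection.
var cleanedUp bool

// fetch simulates a command body. It returns an error instead of exiting,
// so the deferred cleanup always runs.
func fetch(fail bool) error {
	defer func() { cleanedUp = true }()

	if fail {
		return errors.New("fetch failed")
	}

	return nil
}

func main() {
	// Pass "fail" as the first argument to exercise the error path.
	fail := len(os.Args) > 1 && os.Args[1] == "fail"

	if err := fetch(fail); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1) // the only exit point, reached after all defers in fetch ran
	}

	fmt.Println("ok")
}
```

Calling `os.Exit` inside `fetch` would end the process immediately and skip the deferred function.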

Signed-off-by: Andrey Smirnov smirnov.andrey@gmail.com

+63 -35

0 comment

4 changed files

pr created time in 6 days
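The pattern described in this PR can be sketched with the standard library alone: return errors up to a single exit point so that deferred calls still run. The names `run` and `cleanedUp` are assumptions for this illustration; Cobra's `RunE` hook supports the same style, though whether the PR uses it is not stated here.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// cleanedUp records whether the deferred cleanup ran; it exists only for this sketch.
var cleanedUp bool

// run reports failure by returning an error instead of exiting,
// so its deferred calls still execute.
func run(fail bool) error {
	defer func() { cleanedUp = true }()

	if fail {
		return errors.New("command failed")
	}

	return nil
}

func main() {
	// os.Exit is called in exactly one place, after every defer in run has finished.
	if err := run(false); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Calling os.Exit() inside `run` would terminate the process before the deferred function ran.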

create branch smira/talos

branch : fix-kubeconfig

created branch time in 6 days

Pull request review commenttalos-systems/talos

fix(networkd): Ignore link if carrier not detected

 func New(config runtime.Configurator) (*Networkd, error) { 			if strings.HasPrefix(device.Name, "bond") { 				netconf[device.Name] = append(netconf[device.Name], nic.WithIgnore()) 			}++			// Ignore links that do not have a carrier+			carrier, err := ioutil.ReadFile("/sys/class/net/" + device.Name + "/carrier")+			if err != nil {+				result = multierror.Append(result, err)+				continue+			}++			// note -- since we're only applying this to discovered interfaces+			// that we don't have an explicit user defined, it wont negatively+			// impact bond interfaces+			if string(carrier) == "0" {+				netconf[device.Name] = append(netconf[device.Name], nic.WithIgnore())

being totally dumb on networking config, but what if carrier appears later after the initial boot?

someone powers on the switch or fixes the cable?

bradbeam

comment created time in 6 days

push eventsmira/talos

Andrey Smirnov

commit sha 0909d0c5567d8d3130e39c312de62ca2d0ab3f7e

feat: add support for `osctl logs -f` Now default is not to follow the logs (which is similar to `kubectl logs`). Integration test was added for `Logs()` API and `osctl logs` command. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

view details

push time in 6 days

Pull request review commenttalos-systems/talos

feat: add support for `osctl logs -f`

 func Size(s int) Option { 	} } +// Follow file updates using inotify().+func Follow() Option {

I can rename both options, as above there's Size() as well

smira

comment created time in 7 days

push eventsmira/talos

Spencer Smith

commit sha 84354c59414b6795af94e7c62b7443a077064913

feat: add ability to append to existing files with extrafiles This PR introduces "op" to the extra files options. This allows for a user to specify "append" as the op, which will create a copy of the file specified, add the extra data provided, and bind mount over the existing file. Will close #1467 Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>

view details

Andrey Smirnov

commit sha 12cbbb12f9915e91c54cfb25c486ce5639fa5b50

feat: add support for `osctl logs -f` Now default is not to follow the logs (which is similar to `kubectl logs`). Integration test was added for `Logs()` API and `osctl logs` command. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

view details

push time in 7 days

Pull request review commenttalos-systems/talos

feat: add support for `osctl logs -f`

 func (c *File) Read(ctx context.Context) <-chan []byte { 					select { 					case <-ctx.Done(): 						return-					case event := <-watcher.Events:+					case event := <-watcherEvents:

a read from a nil channel blocks forever, so it works as if the watcher is not enabled

smira

comment created time in 7 days

Pull request review commenttalos-systems/talos

feat: add support for `osctl logs -f`

 func (c *File) Read(ctx context.Context) <-chan []byte { 	go func(ch chan []byte) { 		defer close(ch) -		watcher, err := fsnotify.NewWatcher()-		if err != nil {-			log.Printf("failed to watch: %v\n", err)-			return-		}-		// nolint: errcheck-		defer watcher.Close()+		var (+			watcherEvents chan fsnotify.Event+			watcherErrors chan error+		) -		if err = watcher.Add(filepath.Dir(filename)); err != nil {-			log.Printf("failed to watch add: %v\n", err)-			return-		}-		offset, err := c.source.Seek(0, io.SeekStart)-		if err != nil {-			log.Printf("failed to seek: %v\n", err)-			return+		if c.options.Follow {+			watcher, err := fsnotify.NewWatcher()+			if err != nil {+				log.Printf("failed to watch: %v\n", err)+				return+			}+			// nolint: errcheck+			defer watcher.Close()++			watcherEvents = watcher.Events+			watcherErrors = watcher.Errors++			if err = watcher.Add(filepath.Dir(filename)); err != nil {+				log.Printf("failed to watch add: %v\n", err)+				return+			} 		}  		buf := make([]byte, c.options.Size)  		for { 			for {-				n, err := c.source.ReadAt(buf, offset)+				n, err := c.source.Read(buf)

figured out ReadAt() was not really necessary, as there was a .Seek() which reset the file offset anyway.

So now the chunker doesn't .Seek() and uses regular .Read(); this way it will be easier to combine with any kind of "tail" implementation which sets the seek offset before passing the file to this method.

The way we call the chunker, we always use a separate *os.File instance for the duration of the API call.

smira

comment created time in 7 days

Pull request review commenttalos-systems/talos

feat: add support for `osctl logs -f`

 func (l *Log) Close() error { 	return l.source.Close() } -// Read implements chunker.Chunker.-func (l *Log) Read(ctx context.Context) <-chan []byte {

gc'ed an unused method; it was also dangerous, as it used the same *os.File that is used for writing to the logs, while the chunker might still Seek the stream

smira

comment created time in 7 days

PR opened talos-systems/talos

feat: add support for `osctl logs -f`

The default is now not to follow the logs (which is similar to kubectl logs).

An integration test was added for the Logs() API and the osctl logs command.

Signed-off-by: Andrey Smirnov smirnov.andrey@gmail.com

+438 -143

0 comment

13 changed files

pr created time in 7 days

create branch smira/talos

branch : logs-follow

created branch time in 7 days

Pull request review commenttalos-systems/talos

fix: return a unique set of errors on retry failure

 type Ticker interface { 	Stop() } +// ErrorSet represents a set of unique errors.+type ErrorSet struct {+	errs map[string]error++	mu sync.Mutex+}++func (e *ErrorSet) Error() string {+	if len(e.errs) == 0 {+		return ""+	}++	errString := fmt.Sprintf("%d error(s) occurred:", len(e.errs))+	for _, err := range e.errs {+		errString = fmt.Sprintf("%s\n%s", errString, err)+	}++	return errString+}++// Append adds the error to the set if the error is not already present.+func (e *ErrorSet) Append(err error) error {+	e.mu.Lock()+	defer e.mu.Unlock()++	if e.errs == nil {+		e.errs = make(map[string]error)+	}++	if _, ok := e.errs[err.Error()]; !ok {+		e.errs[err.Error()] = err+	}++	return e+}

I'm not sure if order matters after deduplication

andrewrynhard

comment created time in 7 days

Pull request review commenttalos-systems/talos

fix: return a unique set of errors on retry failure

 type Ticker interface { 	Stop() } +// ErrorSet represents a set of unique errors.+type ErrorSet struct {+	errs map[string]error++	mu sync.Mutex+}++func (e *ErrorSet) Error() string {+	if len(e.errs) == 0 {+		return ""+	}++	errString := fmt.Sprintf("%d error(s) occurred:", len(e.errs))

missing \n ?

missed that one as well below

andrewrynhard

comment created time in 7 days

Pull request review commenttalos-systems/talos

fix: return a unique set of errors on retry failure

 type Ticker interface { 	Stop() } +// ErrorSet represents a set of unique errors.+type ErrorSet struct {+	errs map[string]error++	mu sync.Mutex+}++func (e *ErrorSet) Error() string {+	if len(e.errs) == 0 {+		return ""+	}++	errString := fmt.Sprintf("%d error(s) occurred:", len(e.errs))+	for _, err := range e.errs {+		errString = fmt.Sprintf("%s\n%s", errString, err)

should be += ?

please disregard, being blind

andrewrynhard

comment created time in 7 days

Pull request review commenttalos-systems/talos

fix: return a unique set of errors on retry failure

 type Ticker interface { 	Stop() } +// ErrorSet represents a set of unique errors.+type ErrorSet struct {+	errs map[string]error++	mu sync.Mutex+}++func (e *ErrorSet) Error() string {+	if len(e.errs) == 0 {+		return ""+	}++	errString := fmt.Sprintf("%d error(s) occurred:", len(e.errs))

missing \n ?

andrewrynhard

comment created time in 7 days

Pull request review commenttalos-systems/talos

fix: return a unique set of errors on retry failure

 type Ticker interface { 	Stop() } +// ErrorSet represents a set of unique errors.+type ErrorSet struct {+	errs map[string]error++	mu sync.Mutex+}++func (e *ErrorSet) Error() string {+	if len(e.errs) == 0 {+		return ""+	}++	errString := fmt.Sprintf("%d error(s) occurred:", len(e.errs))+	for _, err := range e.errs {+		errString = fmt.Sprintf("%s\n%s", errString, err)

should be += ?

andrewrynhard

comment created time in 7 days

Pull request review commenttalos-systems/talos

feat: add ability to append to existing files with extrafiles

 func (task *ExtraFiles) runtime(r runtime.Runtime) (err error) { 	var result *multierror.Error  	for _, f := range r.Config().Machine().Files() {-		p := filepath.Join("/var", f.Path)+		// Slurp existing file if append is our op and add contents to it+		var existingFileContents []byte+		if f.Op == "append" {+			existingFileContents, err = slurpFile(f.Path)+			if err != nil {+				result = multierror.Append(result, err)+				continue+			}++			f.Contents = string(existingFileContents) + "\n" + f.Contents+		}++		// Determine if supplied path is in /var or not.+		// If not, we'll write it to /var anyways and bind mount below+		p := f.Path+		inVar := true+		explodedPath := strings.Split(+			strings.TrimLeft(f.Path, "/"),+			string(os.PathSeparator),+		)++		if explodedPath[0] != "var" {+			p = filepath.Join("/var", f.Path)+			inVar = false+		}+ 		if err = os.MkdirAll(filepath.Dir(p), os.ModeDir); err != nil { 			result = multierror.Append(result, err)+			continue 		}  		if err = ioutil.WriteFile(p, []byte(f.Contents), f.Permissions); err != nil { 			result = multierror.Append(result, err)+			continue+		}++		// File path was not /var/... so we assume a bind mount is wanted+		if !inVar {+			if err = unix.Mount(p, f.Path, "", unix.MS_BIND|unix.MS_RDONLY, ""); err != nil {+				result = multierror.Append(result, fmt.Errorf("failed to create bind mount for %s: %w", p, err))+			} 		} 	}  	return result.ErrorOrNil() }++// slurpFile simplyu reads in a file and returns a byte slice+func slurpFile(path string) ([]byte, error) {+	file, err := ioutil.ReadFile(path)

isn't this function just ioutil.ReadFile() - it does nothing else?

rsmitty

comment created time in 7 days

Pull request review commenttalos-systems/talos

chore: rewrite basic integration in go instead of bash

+// This Source Code Form is subject to the terms of the Mozilla Public+// License, v. 2.0. If a copy of the MPL was not distributed with this+// file, You can obtain one at http://mozilla.org/MPL/2.0/.++package runner++import (+	"context"+	"errors"+	"io"+	"log"+	"os"+	"os/exec"+	"strings"+	"time"++	"github.com/docker/docker/api/types"+	"github.com/docker/docker/api/types/container"+	"github.com/docker/docker/api/types/filters"+	"github.com/docker/docker/client"++	"github.com/talos-systems/talos/pkg/retry"+)++// ContainerConfigs hold the configs we use to launch our container+type ContainerConfigs struct {+	ContainerConfig *container.Config+	HostConfig      *container.HostConfig+}++// CommandLocal runs a local binary. Used for osctl cluster create and setup+func CommandLocal(command string) error {+	commandSplit := strings.Split(command, " ")+	log.Println("issuing local command : '" + command + "'")++	_, err := exec.Command(commandSplit[0], commandSplit[1:]...).Output()+	if err != nil {+		return err+	}++	return nil+}++// CommandInContainer simply runs a bash command in the hyperkube containers+// nolint: gocyclo+func CommandInContainer(ctx context.Context, client *client.Client, runnerConfig *ContainerConfigs, command string) error {

it's not about this specific PR, but rather a general question.

We do the equivalent of docker run many times, which adds considerable overhead.

Would it make more sense to do docker run once with a container which sleeps forever, and use docker exec to run each command?

rsmitty

comment created time in 7 days

push eventsmira/talos

Andrey Smirnov

commit sha 10a40a15d964902ad2b678a166bc19db2a7bf074

fix: extract errors from API response This PR only touches `Version` method, but I will expand it to other methods in the next PR. When proxying to many upstreams, errors are wrapped as responses as we can't return error and response from grpc call. Reflect-based function was introduced to filter out responses which contain errors as multierror. Reflection was used, as each response is a different Go type, and we can't write a generic function for it. osctl was updated to support having both resp & err not nil. One failed response shouldn't result in error. Re-enabled integration test for multiple targets and version consistency, need e2e validation. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

view details

Andrey Smirnov

commit sha e13dba6924a6d57913d6e182b6150688d7f5c2a5

refactor: extract TLS bits from apid main.go No functional changes, just moving code around. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

view details

push time in 7 days

Pull request review commenttalos-systems/talos

refactor: extract TLS bits from apid main.go

+// This Source Code Form is subject to the terms of the Mozilla Public+// License, v. 2.0. If a copy of the MPL was not distributed with this+// file, You can obtain one at http://mozilla.org/MPL/2.0/.++// Package provider provides TLS config for client & server+package provider++import (+	stdlibtls "crypto/tls"+	"fmt"+	stdlibnet "net"+	"os"++	"github.com/talos-systems/talos/internal/pkg/runtime"+	"github.com/talos-systems/talos/pkg/constants"+	"github.com/talos-systems/talos/pkg/grpc/tls"+	"github.com/talos-systems/talos/pkg/net"+)++// TLSConfig provides client & server TLS configs for apid.+type TLSConfig struct {+	certificateProvider tls.CertificateProvider+}++// NewTLSConfig builds provider from configuration and endpoints.+func NewTLSConfig(config runtime.Configurator, endpoints []string) (*TLSConfig, error) {+	ips, err := net.IPAddrs()+	if err != nil {+		return nil, fmt.Errorf("failed to discover IP addresses: %w", err)+	}+	// TODO(andrewrynhard): Allow for DNS names.+	for _, san := range config.Machine().Security().CertSANs() {+		if ip := stdlibnet.ParseIP(san); ip != nil {+			ips = append(ips, ip)+		}+	}++	hostname, err := os.Hostname()+	if err != nil {+		return nil, fmt.Errorf("failed to discover hostname: %w", err)+	}++	tlsConfig := &TLSConfig{}++	tlsConfig.certificateProvider, err = tls.NewRemoteRenewingFileCertificateProvider(+		config.Machine().Security().Token(),+		endpoints,+		constants.TrustdPort,+		hostname,+		ips,+	)+	if err != nil {+		return nil, err+	}

idk, I always feel uncomfortable returning a structure (tlsConfig) which won't work because it's not initialized properly.

not a big deal either way, but I tried to stick to returning either a value or an error

smira

comment created time in 7 days

PR opened talos-systems/talos

refactor: extract TLS bits from apid main.go

No functional changes, just moving code around.

Signed-off-by: Andrey Smirnov smirnov.andrey@gmail.com

+111 -43

0 comment

3 changed files

pr created time in 7 days

create branch smira/talos

branch : refactor-tls-apid

created branch time in 7 days

Pull request review commenttalos-systems/talos

fix: Add hostname setting to networkd

 func writeResolvConf(resolvers []string) error {  	return ioutil.WriteFile("/etc/resolv.conf", []byte(resolvconf.String()), 0644) }++const hostsTemplate = `+127.0.0.1       localhost+{{ .IP }}       {{ .Hostname }} {{ if ne .Hostname .Alias }}{{ .Alias }}{{ end }}+::1             localhost ip6-localhost ip6-loopback+ff02::1         ip6-allnodes+ff02::2         ip6-allrouters+`++func writeHosts(hostname string, address net.IP) (err error) {+	data := struct {+		IP       string+		Hostname string+		Alias    string+	}{+		IP:       address.String(),+		Hostname: hostname,+		Alias:    strings.Split(hostname, ".")[0],+	}++	tmpl, err := template.New("").Parse(hostsTemplate)+	if err != nil {+		return

same comment as above on var shadowing, idk, I would probably just remove named result variable

bradbeam

comment created time in 7 days

Pull request review commenttalos-systems/talos

fix: Add hostname setting to networkd

 func OSRelease() (err error) {  	err = tmpl.Execute(writer, data) 	if err != nil {-		return-	}--	if err = ioutil.WriteFile("/run/system/etc/os-release", writer.Bytes(), 0644); err != nil {-		return fmt.Errorf("write /run/system/etc/os-release: %w", err)-	}--	if err = unix.Mount("/run/system/etc/os-release", "/etc/os-release", "", unix.MS_BIND, ""); err != nil {-		return fmt.Errorf("failed to create bind mount for /etc/os-release: %w", err)+		return err 	} -	return nil+	return ioutil.WriteFile("/run/system/etc/os-release", writer.Bytes(), 0644) } -func ip() string {-	addrs, err := net.InterfaceAddrs()-	if err != nil {-		return ""+// createBindMount creates a common way to create a writable source file with a+// bind mounted destination. This is most commonly used for well known files+// under /etc that need to be adjusted during startup.+func createBindMount(src, dst string) (err error) {+	var f *os.File++	if f, err = os.OpenFile(src, os.O_WRONLY|os.O_CREATE, 0644); err != nil {+		return err 	} -	for _, address := range addrs {-		if ipnet, ok := address.(*net.IPNet); ok && !ipnet.IP.IsLoopback() {-			if ipnet.IP.To4() != nil {-				return ipnet.IP.String()-			}-		}+	// nolint: errcheck+	defer f.Close()

I believe we don't really need to defer here, as we can close immediately?

	if err = f.Close(); err != nil {
	  return
	}
bradbeam

comment created time in 7 days

Pull request review commenttalos-systems/talos

fix: Add hostname setting to networkd

 func OSRelease() (err error) {  	tmpl, err := template.New("").Parse(osReleaseTemplate) 	if err != nil {-		return+		return err

that's where it becomes confusing imho: if we have function result declared as (err error), what does tmpl, err := do ? I believe it creates another err which shadows previous err, so bare return returns wrong err

bradbeam

comment created time in 7 days
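As a hedged note on the scoping rules behind this comment: per the Go specification, `:=` in the outermost block of a function body reuses a named result declared in the signature, so shadowing only arises inside nested blocks. The functions below are illustrative assumptions, not code from the PR:

```go
package main

import (
	"errors"
	"fmt"
)

func fail() (int, error) { return 0, errors.New("boom") }

// topLevel uses := in the outermost block of the function body, which
// reuses the named result err rather than declaring a new variable.
func topLevel() (err error) {
	n, err := fail()
	_ = n

	if err != nil {
		return // bare return carries "boom"
	}

	return nil
}

// nested uses := inside an if statement, which declares a new err
// that shadows the named result for the extent of that statement.
func nested() (err error) {
	if n, err := fail(); err != nil {
		fmt.Println("inner:", n, err)
	}

	return // outer err was never assigned, so this returns nil
}

func main() {
	fmt.Println(topLevel()) // boom
	fmt.Println(nested())   // <nil>
}
```

The compiler rejects a bare return while the named result is shadowed, so the silent loss of an error happens in the `nested` form, where the return sits outside the inner scope.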

pull request commenttalos-systems/talos

fix: extract errors from API response

e2e pass with version tests for all cluster nodes enabled: https://ci.dev.talos-systems.io/talos-systems/talos/5557/1/30

smira

comment created time in 7 days

push eventsmira/talos

Andrey Smirnov

commit sha a6e80d8f410453d0b8fa07db4d0c30e95a9ba53a

fix: extract errors from API response This PR only touches `Version` method, but I will expand it to other methods in the next PR. When proxying to many upstreams, errors are wrapped as responses as we can't return error and response from grpc call. Reflect-based function was introduced to filter out responses which contain errors as multierror. Reflection was used, as each response is a different Go type, and we can't write a generic function for it. osctl was updated to support having both resp & err not nil. One failed response shouldn't result in error. Re-enabled integration test for multiple targets and version consistency, need e2e validation. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

view details

push time in 7 days

push eventsmira/talos

Andrew Rynhard

commit sha 1f4c17269d2116f19535edafdb834785071beda8

feat: add universal TUN/TAP device driver support This is required when doing anything with KVM. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>

view details

Seán C McCord

commit sha 9d9b958fba8c56dda640371fdc4441cb9a1d9cc1

fix: reverse preference order of network config Kernel config should always play second to a file-based config. Fixes #1588 Signed-off-by: Seán C McCord <ulexus@gmail.com>

view details

Andrew Rynhard

commit sha d4c202438ceca54bc9395d194e07e93995f4b3cc

refactor: set CRI config to /etc/cri/containerd.toml This changes the CRI specific containerd instance's config to a different path. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>

view details

Andrew Rynhard

commit sha 034728651156985b7732fbf41c11e14b9e16cf37

feat: upgrade Linux to v5.3.15 This brings in the latest 5.3 version of Linux. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>

view details

Andrew Rynhard

commit sha 7b6a1fdc94c4ccf90c8a7872313bea71ef390466

fix: update kernel version constant This is required to pass integration tests. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>

view details

Andrey Smirnov

commit sha 47c716dca91692517fb492f4d517c14c4b4eaeb4

fix: extract errors from API response This PR only touches `Version` method, but I will expand it to other methods in the next PR. When proxying to many upstreams, errors are wrapped as responses as we can't return error and response from grpc call. Reflect-based function was introduced to filter out responses which contain errors as multierror. Reflection was used, as each response is a different Go type, and we can't write a generic function for it. osctl was updated to support having both resp & err not nil. One failed response shouldn't result in error. Re-enabled integration test for multiple targets and version consistency, need e2e validation. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

view details

push time in 7 days

issue openedtalos-systems/grpc-proxy

Ignore some errors while proxying

10.5.0.2: 2019/12/05 13:37:16.186640 log.go:87: Unknown [/machine.Machine/Logs] 7.045771818s stream 2 errors occurred:
10.5.0.2: 	* error sending error back: rpc error: code = Canceled desc = context canceled
10.5.0.2: 	* error sending error back: rpc error: code = Canceled desc = context canceled
10.5.0.2: 
10.5.0.2:  (:authority=127.0.0.1;content-type=application/grpc;targets=10.5.0.2,10.5.0.3;user-agent=grpc-go/1.25.1)
10.5.0.2: 2019/12/05 13:37:58.747844 log.go:87: Unknown [/machine.Machine/Logs] 3.659807956s stream 2 errors occurred:
10.5.0.2: 	* error sending error back: rpc error: code = Unavailable desc = transport is closing
10.5.0.2: 	* error sending error back: rpc error: code = Unavailable desc = transport is closing
10.5.0.2: 

created time in 7 days

pull request commenttalos-systems/talos

fix: extract errors from API response

I thought I should also better handle the edge case when all the responses are errors, but I will leave it for tomorrow

smira

comment created time in 7 days

PR opened talos-systems/talos

fix: extract errors from API response

This PR only touches the Version method, but I will expand it to other methods in the next PR.

When proxying to many upstreams, errors are wrapped as responses, as we can't return both an error and a response from a gRPC call. A reflect-based function was introduced to filter out responses which contain errors and return those errors as a multierror. Reflection was used because each response is a different Go type, and we can't write a generic function for it.

osctl was updated to support having both resp & err non-nil. One failed response shouldn't result in an error.

Re-enabled the integration test for multiple targets and version consistency; needs e2e validation.

Signed-off-by: Andrey Smirnov smirnov.andrey@gmail.com

+198 -17

0 comment

7 changed files

pr created time in 7 days
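A rough sketch of what such reflection-based filtering could look like. Everything here is an assumption made for illustration: the types `Metadata`, `VersionMessage`, and `VersionResponse`, their field names, and the function `splitErrors` are hypothetical and may not match the actual Talos API or the code in this PR.

```go
package main

import (
	"fmt"
	"reflect"
)

// Hypothetical response shapes; the real generated types may differ.
type Metadata struct {
	Hostname string
	Error    string
}

type VersionMessage struct {
	Metadata *Metadata
	Tag      string
}

type VersionResponse struct {
	Messages []*VersionMessage
}

// splitErrors accepts a pointer to any struct that has a Messages slice of
// non-nil pointers to structs with a Metadata *Metadata field. It removes
// the messages that carry an error and returns those errors separately.
func splitErrors(resp interface{}) []error {
	msgs := reflect.ValueOf(resp).Elem().FieldByName("Messages")
	kept := reflect.MakeSlice(msgs.Type(), 0, msgs.Len())

	var errs []error

	for i := 0; i < msgs.Len(); i++ {
		msg := msgs.Index(i)

		md, _ := msg.Elem().FieldByName("Metadata").Interface().(*Metadata)
		if md != nil && md.Error != "" {
			errs = append(errs, fmt.Errorf("%s: %s", md.Hostname, md.Error))
			continue
		}

		kept = reflect.Append(kept, msg)
	}

	msgs.Set(kept)

	return errs
}

func main() {
	resp := &VersionResponse{Messages: []*VersionMessage{
		{Metadata: &Metadata{Hostname: "node-a"}, Tag: "v0.3"},
		{Metadata: &Metadata{Hostname: "node-b", Error: "connection refused"}},
	}}

	errs := splitErrors(resp)
	fmt.Println(len(resp.Messages), errs)
}
```

Because the function looks fields up by name, it works for any response type that follows the same layout, which is the reason the PR description gives for using reflection.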

create branch smira/talos

branch : filter-response

created branch time in 7 days

Pull request review commenttalos-systems/talos

feat: use containerd-shim-runc-v2

 COPY images/networkd.tar /rootfs/usr/images/ # symlinks to avoid accidentally cleaning them up. COPY ./hack/cleanup.sh /toolchain/bin/cleanup.sh RUN cleanup.sh /rootfs-COPY hack/containerd.toml /etc/containerd.toml-COPY hack/containerd.toml /etc/containerd-system.toml+COPY hack/containerd.toml /rootfs/etc/containerd/cri.toml

I think because we had reaper before system/cri containerd split

andrewrynhard

comment created time in 7 days

push eventtalos-systems/talos

Andrey Smirnov

commit sha fc52025490d357e79c38a7bfefcb02f3a193b7f6

fix: provide peer remote address for 'NODE': as default in osctl This change is pretty mechanical, just wrap every API so that remote peer address is used as default for `resp.Metadata.Hostname`. This makes `NODE:` non-empty in all the API calls. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

view details

push time in 7 days

PR merged talos-systems/talos

fix: provide peer remote address for 'NODE': as default in osctl

This change is pretty mechanical: it wraps every API call so that the remote peer address is used as the default for resp.Metadata.Hostname.

This makes NODE: non-empty in all the API calls.

Signed-off-by: Andrey Smirnov smirnov.andrey@gmail.com

+325 -116

5 comments

13 changed files

smira

pr closed time in 8 days

push eventsmira/talos

Andrey Smirnov

commit sha cb436e591be829194d22dcaee802d9966dbe6371

fix: provide peer remote address for 'NODE': as default in osctl This change is pretty mechanical, just wrap every API so that remote peer address is used as default for `resp.Metadata.Hostname`. This makes `NODE:` non-empty in all the API calls. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

view details

push time in 8 days

pull request commenttalos-systems/talos

fix: provide peer remote address for 'NODE': as default in osctl

After I made all the changes, I thought that there might be a better way to change the client API. Instead of returning remotePeer from each API call, we could allow passing options ...grpc.CallOption to all the client package methods, and this way the caller (osctl) could use it to pass the grpc.Peer() option. On one hand, osctl needs peer info for every call; on the other hand, allowing options is more generic. Thoughts?

decision in the channel: use generic callOptions ...grpc.CallOption

PR updated

smira

comment created time in 8 days

push eventsmira/talos

Andrey Smirnov

commit sha b589ae2ac6efd8c0dc01891fc0b39a78b68d78d6

fix: provide peer remote address for 'NODE': as default in osctl This change is pretty mechanical, just wrap every API so that remote peer address is used as default for `resp.Metadata.Hostname`. This makes `NODE:` non-empty in all the API calls. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

view details

push time in 8 days

push eventsmira/talos

Andrey Smirnov

commit sha 4dcd53d32b6213fb6ae4c6bd9f771e0f760e4136

fix: provide peer remote address for 'NODE': as default in osctl This change is pretty mechanical, just wrap every API so that remote peer address is used as default for `resp.Metadata.Hostname`. This makes `NODE:` non-empty in all the API calls. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>

view details

push time in 8 days
