aws/eks-charts 240

Amazon EKS Helm chart repository

bwagner5/Dynamic-IP-Route53 11

Updates a Route53 Zone with your computer's public IP

bwagner5/docker-forensics-tools 3

Forensics Tool Collection with Docker Containers to replace a "Live CD" Toolset.

bwagner5/docker-the-sleuth-kit 2

Docker Container which builds and runs The Sleuth Kit

bwagner5/docker-mozilla-investigator 1

Docker Container for Mozilla InvestiGator (MIG) and dependencies including PostGres and RabbitMQ

bwagner5/acl-search 0

acl-search is a Python utility to search through a Juniper ACL file to find intersecting destination IPs and return the full term.

bwagner5/amazon-ec2-instance-qualifier 0

A CLI tool that automates benchmarking on a range of EC2 instance types.

bwagner5/amazon-ec2-instance-selector 0

A CLI tool and go library which recommends instance types based on resource criteria like vcpus and memory

bwagner5/amazon-ec2-metadata-mock 0

A tool to simulate Amazon EC2 instance metadata

bwagner5/arlo-go 0

WIP - (BETA) - Go package for interacting with Netgear's Arlo camera system.

push event aws/aws-node-termination-handler

Nithish Murcherla

commit sha c9813c2796a98a9c94f6017769481ec52dc7e6eb

Retry uncordon workflow to handle transient network issues (#257)

push time in 16 hours

PR merged aws/aws-node-termination-handler

Retry uncordon workflow to handle transient network issues

Issue #, if available: N/A

Description of changes:

Recently, we hit an edge case where the NTH daemon failed to uncordon the node with the error message below:

`Unable to complete the uncordon after reboot workflow on startup: Unable to fetch kubernetes node from API: Get "https://172.20.0.1:443/api/v1/nodes/ip-10-79-29-243.us-east-2.compute.internal": dial tcp 172.20.0.1:443: i/o timeout`

Debugging further, we could not find any issue with the kube-proxy or aws-node daemonsets, both of which constantly talk to the API server, so we suspect this was a transient network issue that caused NTH to fail to uncordon the node.

That said, from a distributed-systems standpoint, we should be able to handle such network partition events for at least a short period of time without evicting the whole cluster. Node termination handler should not need to worry about such issues, since the kubelet sends heartbeats to the API server, which will automatically mark the node unschedulable if that is the case.

This change introduces retries on such transient network issues: 4 attempts over an 8-second period, at an interval of 2 seconds.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

+16 -8

1 comment

3 changed files

nithu0115

pr closed time in 16 hours

Pull request review comment awslabs/amazon-ec2-instance-qualifier

integrate cloudwatch agent

```json
{
	"agent": {
		"metrics_collection_interval": 30,
		"run_as_user": "root"
	},
	"metrics": {
		"append_dimensions": {
			"AutoScalingGroupName": "${aws:AutoScalingGroupName}",
			"InstanceId": "${aws:InstanceId}",
			"InstanceType": "${aws:InstanceType}"
```

maybe a dimension for the test run would be good too

brycahta

comment created time in 4 days

Pull request review comment awslabs/amazon-ec2-instance-qualifier

integrate cloudwatch agent

```json
{
	"agent": {
		"metrics_collection_interval": 30,
		"run_as_user": "root"
```

this should run as cwagent

brycahta

comment created time in 4 days

Pull request review comment awslabs/amazon-ec2-instance-qualifier

integrate cloudwatch agent

```sh
for file in *; do
		chmod u+x "$file"
	fi
done

wget https://s3.amazonaws.com/amazoncloudwatch-agent/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
```

Will need to support Debian-based OSs as well as Red Hat ones.

brycahta

comment created time in 4 days

Pull request review comment awslabs/amazon-ec2-instance-qualifier

integrate cloudwatch agent

+{

A basic setup like this seems good to me. If a user is bringing their own AMI, they may very well already have the CloudWatch Agent installed and configured; in that case the user should just disable CloudWatch agent installation and rely on their own. Wdyt?

brycahta

comment created time in 4 days

push event aws/aws-node-termination-handler

Brandon Wagner

commit sha ae454a8be3d46ab93ee180f32b0d6dd2c6cd5bd3

upgrade test dependency versions (#248)
* upgrade test dependency versions
* Update provision-cluster

push time in 5 days

PR merged aws/aws-node-termination-handler

upgrade test dependency versions

Issue #, if available: N/A

Description of changes:

  • Upgrade helm to latest versions
  • Add k8s 1.19 kind docker image to e2e tests
  • patch version upgrade of k8s 1.18 kind docker image
  • Change default k8s version to 1.17
  • Remove k8s 1.12 and 1.13 since Kind no longer supports those versions (the earliest supported EKS version is 1.14, so this seems fine)
  • Enable rbac on AEMM

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

+15 -18

4 comments

6 changed files

bwagner5

pr closed time in 5 days

Pull request review comment aws/aws-node-termination-handler

upgrade test dependency versions

```sh
aemm_helm_args=(
  "$AEMM_DL_URL"
  --namespace default
  --set servicePort="$IMDS_PORT"
  --set 'rbac.pspEnabled=true'
```

not needed, I just needed to update the cluster PSP since k8s 1.19 seems to default seccomp to runtime/default rather than docker/default

bwagner5

comment created time in 5 days

push event bwagner5/aws-node-termination-handler

Brandon Wagner

commit sha 2f5a44ef87a6157f35e5098cff3f85e198d40206

Update provision-cluster

push time in 5 days

push event bwagner5/aws-node-termination-handler

Brandon Wagner

commit sha 3210c51dcf8b4d1abf2911ba494fb0607b4f5a92

Update provision-cluster

push time in 5 days

push event bwagner5/aws-node-termination-handler

Brandon Wagner

commit sha 7091a312df2b96413302079636d3efa84b453b00

upgrade test dependency versions

push time in 5 days

push event bwagner5/aws-node-termination-handler

Brandon Wagner

commit sha 7cffe511d4a87c368b6035036a1d4ef0c095d784

upgrade test dependency versions

push time in 5 days

push event bwagner5/aws-node-termination-handler

Supasteevo

commit sha dad76703aa4741cb9e78b89c026a37a31ebd60b8

Allow users to configure webhook message content with a template file (#253)
* added new config variable to customize webhook template from file
* added new webhookTemplateFile variable in helm chart
* fix gofmt & ineffassign errors
* set template file as configmap in helm chart
* check webhook template file in ValidateWebhookConfig function + set template content message as debug + typo nthConfig
Co-authored-by: Steven Bressey <steven.bressey@mediakeys.com>

Jason Haugen

commit sha f90e90cc8f0ed5762e1c1d66e87bff4ab29f714e

Add AEMM mock interruption documentation (#256)
* Add AEMM mock interruption documentation
* fix misspelling

Brandon Wagner

commit sha 0264d483b4c7434dd91caf23cf14b2862f18eb45

upgrade test dependency versions

push time in 5 days

push event aws/aws-node-termination-handler

Jason Haugen

commit sha f90e90cc8f0ed5762e1c1d66e87bff4ab29f714e

Add AEMM mock interruption documentation (#256)
* Add AEMM mock interruption documentation
* fix misspelling

push time in 5 days

PR merged aws/aws-node-termination-handler

Add AEMM mock interruption documentation

Issue #, if available: #208

Description of changes: Moves the information about using AEMM to simulate a spot interruption from this issue into a more permanent location

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

+90 -0

1 comment

1 changed file

haugenj

pr closed time in 5 days

push event aws/aws-node-termination-handler

Supasteevo

commit sha dad76703aa4741cb9e78b89c026a37a31ebd60b8

Allow users to configure webhook message content with a template file (#253)
* added new config variable to customize webhook template from file
* added new webhookTemplateFile variable in helm chart
* fix gofmt & ineffassign errors
* set template file as configmap in helm chart
* check webhook template file in ValidateWebhookConfig function + set template content message as debug + typo nthConfig
Co-authored-by: Steven Bressey <steven.bressey@mediakeys.com>

push time in 5 days

Pull request review comment aws/aws-node-termination-handler

Allow users to configure webhook message content with a template file

```go
type combinedDrainData struct {

// Post makes a http post to send drain event data to webhook url
func Post(additionalInfo ec2metadata.NodeMetadata, event *monitor.InterruptionEvent, nthconfig config.Config) {
	var webhookTemplateContent string

	if nthconfig.WebhookTemplateFile != "" {
		content, err := ioutil.ReadFile(nthconfig.WebhookTemplateFile)
		if err != nil {
			log.Log().Msgf("Webhook Error: Could not read template file %s - %s", nthconfig.WebhookTemplateFile, err)
			return
		}
		webhookTemplateContent = string(content)
		log.Log().Msgf("Template file content - %s", webhookTemplateContent)
```

can you move this to `log.Debug().Msgf(`

supasteev0

comment created time in 6 days

push event bwagner5/aws-node-termination-handler

Brandon Wagner

commit sha a40cfb61ad6229b2a099ae4c003bf80327c09bcf

upgrade test dependency versions

push time in 6 days

push event bwagner5/aws-node-termination-handler

Bryan™

commit sha ac6f0e79c3493c7ac621dc1aac6e7f642f9b4af7

upgrade aemm to 1.6 (#251)

Jason Haugen

commit sha ae0676e404e5313d35d3620e8834ba18e5f9ece7

Reduce event logging to only new events (#252)
* Reduce event logging to only new events
* Disable logging for benchmark test

Brandon Wagner

commit sha 927fc254cb87147376887779feb6669031bf020d

upgrade test dependency versions

push time in 6 days

Pull request review comment aws/aws-node-termination-handler

Add AEMM mock interruption documentation

# AWS Node Termination Handler & Amazon EC2 Metadata Mock

We have open sourced a tool called the [amazon-ec2-metadata-mock](https://github.com/aws/amazon-ec2-metadata-mock) (AEMM) that simulates spot interruption notices and more by starting a real webserver that serves data similar to the EC2 Instance Metadata Service. The tool is easily deployed to Kubernetes with a Helm chart.

Below is a short guide on how to set up AEMM with your Node Termination Handler cluster in case you'd like to verify the behavior yourself.

## Triggering AWS Node Termination Handler with Amazon EC2 Metadata Mock

Start by installing AEMM on your cluster. For full and up-to-date installation instructions, reference the AEMM repository. Here's just one way to do it.

Download the latest tarball from the releases page; at the time of writing that was v1.6.0. Then install it using Helm:
```
helm install amazon-ec2-metadata-mock amazon-ec2-metadata-mock-1.6.0.tgz \
  --namespace default
```

Once AEMM is installed, you need to change the instance metadata URL of Node Termination Handler to point to the location AEMM is serving from. If you use the default values of AEMM, the installation will look similar to this:
```
helm upgrade --install aws-node-termination-handler \
  --namespace kube-system \
  --set instanceMetadataURL="http://amazon-ec2-metadata-mock-service.default.svc.cluster.local:1338" \
  eks/aws-node-termination-handler
```

That's it! Instead of polling the real IMDS endpoint, AWS Node Termination Handler will instead poll AEMM. If you open the logs of an AWS Node Termination Handler pod you should see that it receives (mock) interruption events from AEMM and that the nodes are cordoned and drained. Keep in mind that these nodes won't actually get terminated, so you might need to manually uncordon the nodes if you want to reset your test cluster.

### AEMM Advanced Configuration

If you run the example above you might notice that the logs are heavily populated. Here's an example output:
```
2020/09/15 21:13:41 Sending interruption event to the interruption channel
2020/09/15 21:13:41 Got interruption event from channel {InstanceID:i-1234567890abcdef0 InstanceType:m4.xlarge PublicHostname:ec2-192-0-2-54.compute-1.amazonaws.com PublicIP:192.0.2.54 LocalHostname:ip-172-16-34-43.ec2.internal LocalIP:172.16.34.43 AvailabilityZone:us-east-1a} {EventID:spot-itn-47ddfb5e39791606bec3e91fea4cdfa86f86a60ddaf014c8b4af8e008f134b19 Kind:SPOT_ITN Description:Spot ITN received. Instance will be interrupted at 2020-09-15T21:15:41Z
 State: NodeName:ip-192-168-123-456.us-east-1.compute.internal StartTime:2020-09-15 21:15:41 +0000 UTC EndTime:0001-01-01 00:00:00 +0000 UTC Drained:false PreDrainTask:0x113c8a0 PostDrainTask:<nil>}
WARNING: ignoring DaemonSet-managed Pods: default/amazon-ec2-metadata-mock-pszj2, kube-system/aws-node-bl2bj, kube-system/aws-node-termination-handler-2pvjr, kube-system/kube-proxy-fct9f
evicting pod "coredns-67bfd975c5-rgkh7"
evicting pod "coredns-67bfd975c5-6g88n"
2020/09/15 21:13:42 Node "ip-192-168-123-456.us-east-1.compute.internal" successfully cordoned and drained.
2020/09/15 21:13:43 Sending interruption event to the interruption channel
2020/09/15 21:13:43 Got interruption event from channel {InstanceID:i-1234567890abcdef0 InstanceType:m4.xlarge PublicHostname:ec2-192-0-2-54.compute-1.amazonaws.com PublicIP:192.0.2.54 LocalHostname:ip-172-16-34-43.ec2.internal LocalIP:172.16.34.43 AvailabilityZone:us-east-1a} {EventID:spot-itn-97be476b6246aba6401ba36e54437719bfdf987773e9c83fe30336eb7fea9704 Kind:SPOT_ITN Description:Spot ITN received. Instance will be interrupted at 2020-09-15T21:15:43Z
 State: NodeName:ip-192-168-123-456.us-east-1.compute.internal StartTime:2020-09-15 21:15:43 +0000 UTC EndTime:0001-01-01 00:00:00 +0000 UTC Drained:false PreDrainTask:0x113c8a0 PostDrainTask:<nil>}
WARNING: ignoring DaemonSet-managed Pods: default/amazon-ec2-metadata-mock-pszj2, kube-system/aws-node-bl2bj, kube-system/aws-node-termination-handler-2pvjr, kube-system/kube-proxy-fct9f
2020/09/15 21:13:44 Node "ip-192-168-123-456.us-east-1.compute.internal" successfully cordoned and drained.
2020/09/15 21:13:45 Sending interruption event to the interruption channel
2020/09/15 21:13:45 Got interruption event from channel...
```

This isn't a mistake; by default AEMM will respond to any request for metadata with a spot interruption occurring 2 minutes later than the request time.\* AWS Node Termination Handler polls for events every 2 seconds by default, so the effect is that new interruption events are found and processed every 2 seconds.

In reality there will only be a single interruption event, and you can mock this by setting the `spot.time` parameter of AEMM when installing it:
```
helm install amazon-ec2-metadata-mock amazon-ec2-metadata-mock-1.6.0.tgz \
  --set aemm.spot.time="2020-09-09T22:40:47Z" \
  --namespace default
```

Now when you check the logs you should only see a single event get processed.

For more ways of configuring AEMM check out the [Helm configuration page](https://github.com/aws/amazon-ec2-metadata-mock/tree/master/helm/amazon-ec2-metadata-mock).

## Node Termination Handler E2E Tests

AEMM started out as a test server for aws-node-termination-handler's end-to-end tests in this repo. We use AEMM throughout our end-to-end tests to create interruption notices.

The e2e tests install aws-node-termination-handler using Helm and set the metadata URL [here](https://github.com/aws/aws-node-termination-handler/blob/master/test/e2e/spot-interruption-test#L36). This becomes where aws-node-termination-handler looks for metadata; other applications on the node still look at the real EC2 metadata service.

We set the metadata URL environment variable [here](https://github.com/aws/aws-node-termination-handler/blob/master/test/k8s-local-cluster-test/run-test#L18) for the local tests that use a kind cluster, and [here](https://github.com/aws/aws-node-termination-handler/blob/master/test/eks-cluster-test/run-test#L117) for the eks-cluster e2e tests.

Check out the [ReadMe](https://github.com/aws/aws-node-termination-handler/tree/master/test) in our test folder for more info on the e2e tests.

---

\* Only the first two unique IPs to request data from AEMM receive spot ITN information in the response. This was introduced in AEMM v1.6.0 and can be overridden with a configuration parameter. For previous versions there is no unique IP restriction.

🤖

haugenj

comment created time in 6 days

issue comment aws/eks-charts

aws-node-termination-handler upgrade failed from v1.5.0 to v1.7.0

Hi @prikesh-patel , I believe if you set the values as follows, it will upgrade cleanly:

```yaml
nodeSelectorTermsOs: "beta.kubernetes.io/os"
nodeSelectorTermsArch: "beta.kubernetes.io/arch"
```

K8s will stop supporting the beta key in 1.19 I believe, so you will need to switch to the non-beta keys before upgrading to k8s 1.19.

prikesh-patel

comment created time in 6 days

pull request comment awslabs/amazon-ec2-instance-qualifier

pass args via config file

Generally, CLI args have precedence over config files.

Usually this is the order:

  1. CLI Args
  2. ENV VARS
  3. Config File

brycahta

comment created time in 8 days

issue comment aws/amazon-ec2-instance-selector

parameter --memory-min seems not working as expected

I still think it would be helpful to log on stderr. I wrote the tool, and I've definitely stared at the output asking why an instance type wasn't there, only to realize it was truncated :D

chimerab

comment created time in 8 days

issue comment aws/amazon-ec2-instance-selector

parameter --memory-min seems not working as expected

Hey @chimerab , I believe the problem is that the results have been truncated. The default of --max-results is 20 so that you could use all instance types returned for a fleet (LaunchTemplateOverrides have a limit of 20 https://docs.aws.amazon.com/autoscaling/ec2/APIReference/API_LaunchTemplate.html).

If you do:

```
$ ./ec2-instance-selector-linux-amd64 --vcpus 4 --memory-min 8 -a x86_64 --max-results 300
c5.xlarge
c5a.xlarge
c5ad.xlarge
c5d.xlarge
c5n.xlarge
d2.xlarge
g3s.xlarge
g4dn.xlarge
i2.xlarge
i3.xlarge
i3en.xlarge
inf1.xlarge
m1.xlarge
m2.2xlarge
m3.xlarge
m4.xlarge
m5.xlarge
m5a.xlarge
m5ad.xlarge
m5d.xlarge
m5dn.xlarge
m5n.xlarge
p2.xlarge
r3.xlarge
r4.xlarge
r5.xlarge
r5a.xlarge
r5ad.xlarge
r5d.xlarge
r5dn.xlarge
r5n.xlarge
t2.xlarge
t3.xlarge
t3a.xlarge
x1e.xlarge
z1d.xlarge
```

It might be a good idea for the tool to print a notice on stderr if the results were truncated to max-results. What do you think?

chimerab

comment created time in 10 days
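The suggestion above, keeping stdout clean for the instance-type list while putting the truncation notice on stderr, could look like this sketch (hypothetical helper names, not the actual instance-selector code):

```go
package main

import (
	"fmt"
	"os"
)

// truncate caps the list at maxResults and reports whether anything was cut.
func truncate(types []string, maxResults int) ([]string, bool) {
	if len(types) > maxResults {
		return types[:maxResults], true
	}
	return types, false
}

// printResults writes instance types to stdout; the truncation notice
// goes to stderr so scripts consuming stdout are unaffected.
func printResults(types []string, maxResults int) {
	shown, truncated := truncate(types, maxResults)
	if truncated {
		fmt.Fprintf(os.Stderr, "NOTE: %d results were truncated to --max-results=%d\n", len(types), maxResults)
	}
	for _, t := range shown {
		fmt.Println(t)
	}
}

func main() {
	printResults([]string{"c5.xlarge", "m5.xlarge", "r5.xlarge"}, 2)
}
```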

Pull request review comment awslabs/amazon-ec2-instance-qualifier

update session-creation logic

```
go 1.14

require (
	github.com/aws/aws-sdk-go v1.31.5
	github.com/mitchellh/go-homedir v1.1.0
```

can you add these new pkgs to the third party licenses file

brycahta

comment created time in 11 days

issue closed aws/amazon-ec2-instance-selector

Sluggishness of t3.a EC2 instance

Hi, I launched a t3a.large instance to deploy our microservices. The available RAM of the EC2 instance is around 4GB and CPU usage is not more than 40% most of the time (as per CloudWatch monitoring). But the deployed microservices are providing delayed responses compared to microservices running on other instances. Even SSH to the server takes more than 10 seconds.

Can anyone help me identify the root cause of the sluggishness?

closed time in 11 days

deekshith-elear

issue comment aws/amazon-ec2-instance-selector

Sluggishness of t3.a EC2 instance

@deekshith-elear

t* instance types are burstable, so that may be your issue. But please note that issues you open on this repo should be about the amazon-ec2-instance-selector CLI/SDK. If you're having problems with EC2, please open an AWS support ticket from your AWS account.

deekshith-elear

comment created time in 11 days

push event aws/homebrew-tap

ec2-bot 🤖

commit sha 2cc535ec09047def9285ed0fce156f0881264505

ec2-metadata-mock update to version 1.6.0

Brandon Wagner

commit sha 1ff416820fdf84ea15f8d70e9b1d28706b1f7215

Merge pull request #133 from ec2-bot/ec2-metadata-mock-v1.6.0-0bcb1158 🥳 ec2-metadata-mock v1.6.0 Automated Release! 🥑

push time in 14 days

PR merged aws/homebrew-tap

🥳 ec2-metadata-mock v1.6.0 Automated Release! 🥑

ec2-metadata-mock v1.6.0 Automated Release! 🤖🤖

Release Notes 📝:

New Features

  • Add MockIPCount flag to return spot interrupts (Spot ITN) and events to a set number of IPs within a cluster
    • By default, 2 IPs will be eligible for Spot ITN and 2 IPs will be eligible for scheduled events (separate cache)
  • Add placement/* paths that were added to IMDS Aug 2020:
    • placement/availability-zone-id
    • placement/group-name
    • placement/host-id
    • placement/partition-number
    • placement/region
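These new placement paths can be exercised with a plain HTTP GET once AEMM is running. A sketch assuming AEMM is serving on its default port 1338 (`fetchMetadata` is an illustrative helper, not part of AEMM):

```go
package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
)

// fetchMetadata GETs an IMDS-style path from the given base URL.
// Against AEMM's defaults the base URL would be http://localhost:1338.
func fetchMetadata(baseURL, path string) (string, error) {
	resp, err := http.Get(baseURL + path)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	// One of the placement paths added in v1.6.0.
	region, err := fetchMetadata("http://localhost:1338", "/latest/meta-data/placement/region")
	if err != nil {
		fmt.Println("is AEMM running?", err)
		return
	}
	fmt.Println(region)
}
```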

Bug Fixes

N/A

Improvements

  • Updated ReadMes
  • Shell scripts have been linted with shellcheck
  • AEMM is now configured as a Deployment and no longer a DaemonSet. Replicas defaults to 1

Breaking Changes

N/A

+5 -5

0 comments

1 changed file

ec2-bot

pr closed time in 14 days

Pull request review comment aws/amazon-ec2-metadata-mock

Adding mock-ip-count feature

Parameter | Description | Default in Helm | Default AEMM configuration
`aemm.server.hostname` | hostname to run AEMM on | `""`, in order to listen on all available interfaces e.g. ClusterIP | `0.0.0.0`
`aemm.mockDelaySec` | mock delay in seconds, relative to the start time of AEMM | `0` | `0`
`aemm.mockTriggerTime` | mock trigger time in RFC3339 format | `""` | `""`
`aemm.mockIPCount` | number of IPs that can receive spot interrupts and/or scheduled events; subsequent requests will return 404 | `""` | `99`

default should be 2

brycahta

comment created time in 14 days

Pull request review comment aws/amazon-ec2-metadata-mock

Adding mock-ip-count feature

```go
var (
		gf.ConfigFileFlag:       cfg.GetDefaultCfgFileName(),
		gf.MockDelayInSecFlag:   0,
		gf.MockTriggerTimeFlag:  "",
		gf.MockIPCountFlag:      99,
```

I'm not sure what this is used for since a default is specified in cobra, but should this be -1?

brycahta

comment created time in 14 days

Pull request review comment aws/amazon-ec2-metadata-mock

Define number of nodes to receive Spot interrupts

```sh
#! /usr/bin/env bash

set -euo pipefail

SCRIPTPATH="$(
  cd "$(dirname "$0")"
  pwd -P
)"
EXIT_CODE_TO_RETURN=0

function get_status() {
  pod=$1
  status=$(kubectl describe pod $pod | grep "Status:")
  status=${status//"Status:"/}
  status=$(echo $status | xargs)
  echo $status
}

function assert_value() {
  # assert actual == expected
  if [[ $1 == "$2" ]]; then
    echo "✅ Verified $3"
  else
    echo "❌ Failed $3 verification. Actual: $1 Expected: $2"
    EXIT_CODE_TO_RETURN=1
  fi
}

function clean_up() {
  kubectl delete pods/test-pod pods/test-pod-404
}

function test() {
  num_nodes=$1
  expected_test_pod_status=$2
  expected_test_pod_404_status=$3
  echo "executing term-nodes test with terminationNodes=$num_nodes"

  helm upgrade --install "$CLUSTER_NAME-aemm" \
    $AEMM_HELM_REPO \
    --wait \
    --namespace default \
    --values $AEMM_HELM_REPO/ci/local-image-values.yaml \
    --set aemm.spot.time="1994-05-15T00:00:00Z" \
    --set aemm.terminationNodes=$num_nodes

    # Deploy pods
    kubectl apply -f "$SCRIPTPATH/test-pod.yaml"
    sleep 1
    kubectl apply -f "$SCRIPTPATH/test-pod-404.yaml"

    # Proceed with copying only after pod is Running
    test_pod_404_status=$(get_status test-pod-404)
    while [ "$test_pod_404_status" != "Running" ]; do
      echo "test_pod_404_status: $test_pod_404_status"
      sleep 1
      test_pod_404_status=$(get_status test-pod-404)
    done

    echo "test_pod_404_status: $test_pod_404_status"
    echo "copying to 404-pod..."
    kubectl cp "$SCRIPTPATH/../../e2e/golden/404_response.golden" test-pod-404:/tmp/404_response.golden

    test_pod_status=$(get_status test-pod)
    test_pod_404_status=$(get_status test-pod-404)

    # Keep querying status until tests succeed or fail
    while [ "$test_pod_status" == "Running" ] || [ "$test_pod_404_status" == "Running" ]; do
```

consider a bounded for-loop with a break-out here to prevent an infinite loop

brycahta

comment created time in 18 days

Pull request review comment aws/amazon-ec2-metadata-mock

Define number of nodes to receive Spot interrupts

```sh
get_chart_test_config() {
    echo "$config"
}

test_termination_nodes() {
    if [[ $REUSE_ENV == false ]]; then
        mkdir -p $TMP_DIR
        install_kind
        create_kind_cluster
    fi

    build_and_load_image
    install_helm
```

looks like you might need an install_kubectl here too

brycahta

comment created time in 18 days

Pull request review comment aws/amazon-ec2-metadata-mock

Define number of nodes to receive Spot interrupts

```go
var (
		gf.ConfigFileFlag:       cfg.GetDefaultCfgFileName(),
		gf.MockDelayInSecFlag:   0,
		gf.MockTriggerTimeFlag:  "",
		gf.TerminationNodesFlag: 99,
```

do you think -1 would be more intuitive to be unlimited? 99 seems pretty arbitrary.

I'm also not sure we should default to unlimited. I understand why you did, since it maintains backwards compatibility. It may also be difficult to find a good number that doesn't surprise some users, "Why did some of my nodes not receive ITNs?" But I think that's a better problem than causing a whole cluster to die. What would you think about setting it at 2?

brycahta

comment created time in 18 days

Pull request review comment aws/amazon-ec2-metadata-mock

Define number of nodes to receive Spot interrupts

Parameter | Description | Default in Helm | Default AEMM configuration
`aemm.server.hostname` | hostname to run AEMM on | `""`, in order to listen on all available interfaces e.g. ClusterIP | `0.0.0.0`
`aemm.mockDelaySec` | mock delay in seconds, relative to the start time of AEMM | `0` | `0`
`aemm.mockTriggerTime` | mock trigger time in RFC3339 format | `""` | `""`
`aemm.terminationNodes` | number of nodes that can receive spot interrupts; subsequent requests will return 404 | `""` | `99`

add replicaCount to readme too

brycahta

comment created time in 18 days

Pull request review comment aws/amazon-ec2-metadata-mock

Define number of nodes to receive Spot interrupts

```yaml
image:
  tag: "v1.5.0"
  pullPolicy: "IfNotPresent"

replicaCount: 1

# nameOverride overrides the name of the helm chart
nameOverride: ""
# fullnameOverride overrides the name of the application
fullnameOverride: ""

# Create node OS specific deployment(s). (e.g. "linux", "windows", "linux windows")
targetNodeOs: "linux"
```

the windows and linux vars don't appear to be in the readme

brycahta

comment created time in 18 days

Pull request review comment aws/amazon-ec2-metadata-mock

Define number of nodes to receive Spot interrupts

```
Flags:
      --mock-trigger-time string   mock trigger time in RFC3339 format. This takes priority over mock-delay-sec (default: none)
  -p, --port string                the HTTP port where the mock runs (default: 1338)
  -s, --save-config-to-file        whether to save processed config from all input sources in .ec2-metadata-mock/.aemm-config-used.json in $HOME or working dir, if homedir is not found (default: false)
  -x, --termination-nodes int      number of nodes in a cluster that can receive Spot interrupt notice (default: 99, meaning unlimited nodes) (default 99)
```

Should this apply to scheduled maintenance events as well with a more generic name?

brycahta

comment created time in 18 days

push event awslabs/aws-simple-ec2-cli

Jason Haugen

commit sha 3fa1f54edc0e64465e51a6d132bb06beb78a0f59

Change 'ez' to 'simple' to align with repo name

Jason Haugen

commit sha 31256a98cbd9fe671b1fbc0993232b188b13f35e

Add 'make fmt' target. Format project with it

Brandon Wagner

commit sha 68fb8107bf2d8ffca8a7ab5a98c324b44a59515d

Merge pull request #7 from haugenj/master Change 'ez' to 'simple' to align with repo name

push time in 20 days

PR merged awslabs/aws-simple-ec2-cli

Change 'ez' to 'simple' to align with repo name

Issue #, if available: none

Description of changes: change 'ez' to 'simple' everywhere to align the code with the repository name. Done in preparation for adding this to the aws-homebrew tap.

Depending on the context, the format could be slightly different now than before. Examples:

```
'ez-ec2' -> 'simple-ec2'
'ezec2'  -> 'simpleEc2'
'Ezec2'  -> 'SimpleEc2'
'EZEC2'  -> 'SIMPLE_EC2'
```

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

+366 -360

0 comments

38 changed files

haugenj

pr closed time in 20 days

push event aws/amazon-ec2-instance-selector

Brandon Wagner

commit sha 6cba8af522d053a402b6e17a9e02cd53eda0ecc8

Update .travis.yml (#51)

push time in 21 days

delete branch aws/amazon-ec2-instance-selector

delete branch : bwagner5-patch-1

delete time in 21 days

PR merged aws/amazon-ec2-instance-selector

Update .travis.yml

Issue #, if available: N/A

Description of changes: Don't run tests requiring creds

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

+3 -3

0 comments

1 changed file

bwagner5

pr closed time in 21 days

issue comment aws/amazon-ec2-instance-selector

Support AutoSpotting use cases

Hey @cristim thanks for opening this issue! I definitely think there is room for further discussion on most of these, feel free to create separate issues for them.

  • uses an instance ID(assumed to be from the current region) as comparison baseline instead of the region-agnostic instance type supported currently
  • considers actually used resources on the baseline running instance instead of the maximum available in the specs, for example currently we consider the attached instance store volumes in the block device mapping not the total available, but there may be others.

It seems to me that these two would have to go together to make this useful. I don't think just allowing instance-id as the base would be useful since it would basically just look up the instance type and then pass that into instance-selector. If instance-selector does look up actual resource usage, then that might make sense.

  • consider spot pricing of the target instance type, returning the better-priced compatible spot instance type in the same region or AZ(would be nice to be configurable between Region and AZ).
  • sorts the result in increased order of the spot price.

These also seem related, but since spot prices change and there are already features in ASG to handle dynamic pricing (lowest-price allocation strategy and spot max price), I don't quite see the need to figure out which one is cheaper. It would be better to pass all compatible instance types into the launch template and then let ASG figure out which one to use based on the configuration of max-price or the allocation strategy.

  • efficient when it comes to API calls and network traffic by storing a lot of static data(having some of it cached for a day or so may be acceptable though).

Yeah, this data would be easy to cache since it doesn't change very often. It hasn't made sense so far in the project because this tool is usually executed in an ad-hoc manner, so the next run would usually require the cache to be evicted anyway to account for potentially new instance types. But I can see the usefulness if you want to run this for a bunch of workloads within an account as a system component.
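A rough illustration of that kind of day-scale caching (the cache path, TTL, and the fetch function are all hypothetical stand-ins; the fetch function substitutes for a slow API call like `aws ec2 describe-instance-types`):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical cache location and TTL.
CACHE_FILE="${TMPDIR:-/tmp}/instance-types.cache"
TTL_SECONDS=86400  # ~1 day

fetch_instance_types() {
  # Stand-in for an expensive, mostly-static API call.
  echo "m5.large m5.xlarge c5.large"
}

rm -f "$CACHE_FILE"  # start clean for the demo

now=$(date +%s)
if [[ -f "$CACHE_FILE" ]]; then
  # mtime lookup: GNU stat first, BSD stat as fallback
  mtime=$(stat -c %Y "$CACHE_FILE" 2>/dev/null || stat -f %m "$CACHE_FILE")
  age=$(( now - mtime ))
else
  age=$(( TTL_SECONDS + 1 ))  # no cache yet: force a fetch
fi

if (( age > TTL_SECONDS )); then
  fetch_instance_types > "$CACHE_FILE"  # miss or stale: refetch and store
fi
cat "$CACHE_FILE"
```

The trade-off mentioned above shows up in the TTL: ad-hoc runs want `TTL_SECONDS=0` behavior, a long-running system component wants the day-long window.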

cristim

comment created time in 22 days

PullRequestReviewEvent

create branch bwagner5/aws-service-operator-k8s

branch : r53r

created branch time in 24 days

PR opened aws/amazon-ec2-instance-selector

Update .travis.yml

Issue #, if available: N/A

Description of changes: Don't run tests requiring creds

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

+3 -3

0 comments

1 changed file

pr created time in 25 days

create branch aws/amazon-ec2-instance-selector

branch : bwagner5-patch-1

created branch time in 25 days

delete branch awslabs/amazon-ec2-instance-qualifier

delete branch : brycahta-patch-1

delete time in 25 days

push event awslabs/amazon-ec2-instance-qualifier

Bryan™

commit sha ce69874bb51d51326d807d5cc6dec74e691e6ec9

Update travis.yml (#4) Removed AWS_ACCESS_KEY_ID from Travis. Will add back following approvals

view details

push time in 25 days

PR merged awslabs/amazon-ec2-instance-qualifier

Update travis.yml

Removed AWS_ACCESS_KEY_ID from Travis. Will add back following approvals

Issue #, if available: N/A

Description of changes:

  • disable e2e tests for now

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

+1 -1

0 comments

1 changed file

brycahta

pr closed time in 25 days

PullRequestReviewEvent

pull request comment aws/aws-node-termination-handler

upgrade test dependency versions

Will wait for kind to officially support k8s 1.19 before merging

bwagner5

comment created time in a month

PullRequestReviewEvent

push event bwagner5/aws-node-termination-handler

Brandon Wagner

commit sha 4cc64fbd3a275e791f44310f886604d395b3d1de

upgrade test dependency versions

view details

push time in a month

PR opened aws/aws-node-termination-handler

upgrade test dependency versions

Issue #, if available: N/A

Description of changes:

  • Upgrade helm to latest versions
  • Add k8s 1.19 kind docker image to e2e tests
  • Patch version upgrade of k8s 1.18 kind docker image
  • Change default k8s version to 1.17

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

+7 -5

0 comments

2 changed files

pr created time in a month

create branch bwagner5/aws-node-termination-handler

branch : test-upgrades

created branch time in a month

Pull request review comment awslabs/amazon-ec2-instance-qualifier

Minor changes & enhancements

```diff
 BUCKET_ROOT_DIR=
 TARGET_UTIL=0
```

`set -euo pipefail`

brycahta

comment created time in a month

Pull request review comment awslabs/amazon-ec2-instance-qualifier

Minor changes & enhancements

```diff
 ALL_INSTANCE_TYPES=$SUPPORTED_INSTANCE_TYPES,$UNSUPPORTED_INSTANCE_TYPE_AMI,$UNS

 function init_test_resources() {
   # create resources used for e2e tests
-  AWS_DEFAULT_REGION=$DEFAULT_REGION
```

Rather than specifying `--region` on every command, you could `export AWS_REGION=$DEFAULT_REGION` here and unset it at the bottom of the function. Or, if the region should always be the same (which I think it should be, now that the non-default test has been removed), you could just specify it at the top of the file and not worry about setting and unsetting in the functions.
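The first suggestion, sketched with a placeholder function body (the region value and the `echo` stand in for the real resource-creation commands):

```shell
#!/usr/bin/env bash
set -euo pipefail

DEFAULT_REGION="us-east-2"

function init_test_resources() {
  # Export once so every aws command in this function inherits the region,
  # then unset so later functions are unaffected by a leftover AWS_REGION.
  export AWS_REGION=$DEFAULT_REGION
  echo "creating e2e resources in ${AWS_REGION}"  # aws cli calls would go here
  unset AWS_REGION
}

init_test_resources
```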

brycahta

comment created time in a month

Pull request review comment awslabs/amazon-ec2-instance-qualifier

Minor changes & enhancements

```diff
+#!/usr/bin/env bash
+
+set -euo pipefail
+
+SCRIPTPATH="$( cd "$(dirname "$0")" ; pwd -P )"
+BUILD_DIR="${SCRIPTPATH}/../../build"
+
+KERNEL=$(uname -s | tr '[:upper:]' '[:lower:]')
+SHELLCHECK_VERSION="0.7.1"
+
+function exit_and_fail() {
+   echo "❌ Test Failed! Found a shell script with errors."
+   exit 1
+}
+trap exit_and_fail INT ERR TERM
+
+curl -Lo ${BUILD_DIR}/shellcheck.tar.xz "https://github.com/koalaman/shellcheck/releases/download/v${SHELLCHECK_VERSION}/shellcheck-v${SHELLCHECK_VERSION}.${KERNEL}.x86_64.tar.xz"
+tar -C ${BUILD_DIR} -xvf "${BUILD_DIR}/shellcheck.tar.xz"
+export PATH="${BUILD_DIR}/shellcheck-v${SHELLCHECK_VERSION}:$PATH"
+
+# setup.go, .template and binaries have embedded user scripts; therefore, omit from grep
+# shellcheck disable=SC2207
+shell_files=($(grep -RInl --exclude=\*.{go,template} -e '#!.*/bin/bash' -e '#!.*/usr/bin/env bash' ${SCRIPTPATH}/../../))
+shellcheck -S warning "${shell_files[@]}"
```

nice 🚀

brycahta

comment created time in a month

PullRequestReviewEvent

Pull request review comment awslabs/amazon-ec2-instance-qualifier

Minor changes & enhancements

```diff
 BUCKET_ROOT_DIR=%s
 TARGET_UTIL=%d

 adduser qualifier
-cd /home/qualifier
+cd /home/qualifier || :
```

Was this something shellcheck nagged about? I'm not sure whether userdata scripts are executed with bash strict mode, but we should probably do a `set -euo pipefail` at the top anyway. In the future, we should move this script outside the Go code.
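For context on why strict mode matters here (the directory below is an illustrative stand-in for `/home/qualifier`): without `set -e`, a failed `cd` lets the rest of a userdata script run in the wrong directory; with it, the script aborts unless the failure is explicitly handled, which is the decision `|| :` (ignore) or `|| exit 1` (fail fast) forces you to make.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir="${TMPDIR:-/tmp}/qualifier-demo"  # hypothetical stand-in directory
mkdir -p "$workdir"

# Under `set -e`, an unguarded failing `cd` would abort the script here;
# `|| exit 1` makes the intent explicit, whereas `|| :` would swallow the error.
cd "$workdir" || exit 1
echo "working in $PWD"
```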

brycahta

comment created time in a month

PullRequestReviewEvent

issue comment aws/aws-node-termination-handler

Question about Kiam and hostNetwork

NTH defaults to IMDSv2 and falls back to IMDSv1 if it can't reach v2. The metadata service is the same in both versions (same paths and responses), but IMDSv2 requires a session token and, by default, limits responses to 1 hop at the IP level. I'm not sure how the kiam agent proxy works with IMDSv2; it may work, but I haven't tried it.

"By default, the response to PUT requests has a response hop limit (time to live) of 1 at the IP protocol level. You can adjust the hop limit using the modify-instance-metadata-options command if you need to make it larger. For example, you might need a larger hop limit for backward compatibility with container services running on the instance. For more information, see modify-instance-metadata-options in the AWS CLI Command Reference."

  • https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html

Here's a good blog post on the reasoning behind IMDSv2: https://aws.amazon.com/blogs/security/defense-in-depth-open-firewalls-reverse-proxies-ssrf-vulnerabilities-ec2-instance-metadata-service/
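For reference, the IMDSv2 request pattern looks like this (it only works from an EC2 instance, and the token TTL value is illustrative):

```
# Fetch a session token (PUT), then present it on metadata requests (GET).
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  "http://169.254.169.254/latest/meta-data/instance-id"
```

Raising the hop limit for container workloads, per the docs quoted above, would look like `aws ec2 modify-instance-metadata-options --instance-id <id> --http-put-response-hop-limit 2`.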

gillg

comment created time in a month

issue comment aws/aws-node-termination-handler

Question about Kiam and hostNetwork

I'm pretty sure we can remove the kiam steps from the readme, thanks for bringing this up @gillg

gillg

comment created time in a month

issue comment aws/aws-node-termination-handler

Question about Kiam and hostNetwork

@gillg The NTH pods need hostNetwork to work with IMDSv2, which by default only allows 1 network hop.

You may be correct that kiam does not intercept calls when the pod is using hostNetwork.

Looks like the default setup is to put the iptables intercept rule on the Docker0 bridge interface:
https://github.com/uswitch/kiam/blob/0fe1a077e3a408cf47dea25ac83e14e367fb0bb9/cmd/kiam/agent.go#L64

@leosunmo do you have any insight on this? Is it common for users to override the interface to eth0 or similar, which would intercept metadata requests from hostNetwork pods?
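A hedged approximation of the kind of rule that agent flag produces (the rule form, port, and `KIAM_AGENT_ADDR` are illustrative, not taken from kiam's source): metadata traffic arriving on docker0 is DNAT'd to the agent, so traffic from hostNetwork pods, which leaves via eth0/lo and never crosses docker0, is not intercepted.

```
# Sketch only: intercept metadata traffic on the docker0 bridge.
iptables -t nat -A PREROUTING \
  -d 169.254.169.254/32 -p tcp --dport 80 \
  -i docker0 \
  -j DNAT --to-destination "${KIAM_AGENT_ADDR}"
```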

gillg

comment created time in a month

more