Ashley Jeffs (Jeffail)
Reading, UK
https://www.jeffail.uk

If you want to get in touch please find me in person, I'm not good with computers.

Jeffail/benthos 2403

A stream processor for mundane tasks written in Go

Jeffail/gabs 1859

For parsing, creating and editing unknown or dynamic JSON in Go

Jeffail/tunny 1669

A goroutine pool for Go

Jeffail/leaps 681

A pair programming service using operational transforms

benthosdev/benthos-plugin-example 17

Benthos plugin examples

Jeffail/spiril 11

Rust library for genetic algorithms

Jeffail/tokesies 7

A string tokenizer library for Rust

Jeffail/util 5

A collection of basic utilities for slapping golang services together

Jeffail/jc 2

Prints the cardinalities of value paths in a stream of JSON blobs.

Jeffail/brew 0

🍺 The missing package manager for macOS (or Linux)

push event Jeffail/benthos

Ashley Jeffs

commit sha cd3326f2f2adabf50011a7921df1257c7ffc06f6

Add errored function and sync response status

view details

push time in 8 minutes

issue comment Jeffail/benthos

http processor doesn't respect keep-alives

Hey @nicktelford, I tried this out and confirmed the http processor is negotiating and using keep-alives automatically on a local server. Are you able to provide more information on how the HTTP traffic is being routed to/from Benthos?

nicktelford

comment created time in 2 hours

push event Jeffail/benthos

Ashley Jeffs

commit sha 9a0953fa6003460a2acfb2215373c6046708dbbb

fix changelog

view details

push time in 2 hours

push event Jeffail/benthos

Ashley Jeffs

commit sha f99fb7490aca8101448e426ad4f219bf65283d34

Add beta tag to azure blob storage output

view details

push time in 2 hours

push event Jeffail/benthos

Marco Amador

commit sha 861ee95cea37fc5a3b1bd44b604986f71c3f16da

Azure Blob Storage output (#466)

* azure blob storage output
* Use master tools version
* Remove not used dependency
* Add comment about wrapped internal storageError
* Rename vars
* Validate if StorageAccount is empty within NewAzureBlobStorage
* Store credentials in struct
* Create Anonymous credential if StorageAccessKey is empty
* Retry to upload blob after container creation
* Add retry comment
* Check error in right place
* Use lowercase in error messages

view details

push time in 2 hours

PR merged Jeffail/benthos

Azure Blob Storage output

Hi! I've created an Azure Blob Storage output that I've been running smoothly for a few days now. It allows the container to be specified dynamically, and the container is created if it does not exist.

Can you guys please have a look?

Example:

output:
  switch:
    outputs:
    - condition:
        bloblang: meta("msg_type") == "event"
      fallthrough: false
      output:
        blob_storage:
          storage_account: ${AZURE_STORAGE_ACCOUNT}
          storage_access_key: ${AZURE_STORAGE_ACCESS_KEY}
          container: 'test-events-${!timestamp("2006")}'
          path: '${!timestamp("2006-01-02")}/event-${!timestamp_unix_nano()}.json.gz'
          max_in_flight: 5
    - output:
        blob_storage:
          storage_account: ${AZURE_STORAGE_ACCOUNT}
          storage_access_key: ${AZURE_STORAGE_ACCESS_KEY}
          container: 'test-readings-${!timestamp("2006")}'
          path: '${!timestamp("2006-01-02")}/reading-${!timestamp_unix_nano()}.json.gz'
          max_in_flight: 5

Thanks!

+510 -46

1 comment

8 changed files

mfamador

pr closed time in 2 hours

pull request comment Jeffail/benthos

Azure Blob Storage output

Thanks for the updates @mfamador, this looks great. I'll include this in the next release but I'm going to add a BETA tag to the docs for now just in case we want to adjust the config fields eventually.

One thing I'd be interested in is whether there's a common use case for uploading to the storage of one account whilst using the credentials of another. Do you know if that's possible? If so it might be worth working that into the config structure.

mfamador

comment created time in 2 hours

push event Jeffail/benthos

Ashley Jeffs

commit sha 4d1e8b045cd2393b35fb2fd7490588b103ca6855

Remove warn log for dynamodb get miss

view details

push time in 2 hours

push event Jeffail/benthos

Marco Amador

commit sha 2397e96f875752c123183acba613317e1427daf9

Set correct case for property fields within jsonschema validation errors description (#469)

* Do not use lowercase in error description
* Add failing test
* Tests passing
* Little refactoring
* Fix field description
* Remove unnecessary schema checks

view details

push time in 2 hours

PR merged Jeffail/benthos

Set correct case for property fields within jsonschema validation errors description

Hi! Please confirm that this would resolve #468. Thanks!

+111 -1

1 comment

2 changed files

mfamador

pr closed time in 2 hours

issue closed Jeffail/benthos

jsonschema validation errors description in lowercase

The jsonschema validation error descriptions are in lowercase, changing the real names of the fields.

Ex: myarray.0 issomething is required

when in fact the field is called isSomething

closed time in 2 hours

mfamador

pull request comment Jeffail/benthos

Set correct case for property fields within jsonschema validation errors description

Thanks @mfamador, looks great. I'll aim to kick off a release this weekend.

mfamador

comment created time in 2 hours

push event Jeffail/benthos

Ashley Jeffs

commit sha 867f69a78ff11eee88f1e3066409b3481a15643a

Add bloblang input

view details

push time in 2 hours

issue comment Jeffail/benthos

http processor doesn't respect keep-alives

Hey @nicktelford, can you give me the exact version of Benthos you're running?

nicktelford

comment created time in 4 days

issue comment Jeffail/benthos

Is a 400 http response code on json_schema failures possible?

Hey @dz-at-tc, unfortunately setting a custom status code isn't currently possible. Your options right now are to set a header, and the payload can be in whatever format you want. I'll update this issue to cover adding custom status codes to sync responses, as I believe it'll be possible.

dz-at-tc

comment created time in 4 days

Pull request review comment Jeffail/benthos

Azure Blob Storage output

// WriteWithContext attempts to write message contents to a target storage account as files.
func (a *AzureBlobStorage) WriteWithContext(wctx context.Context, msg types.Message) error {
	ctx, cancel := context.WithTimeout(
		wctx, a.timeout,
	)
	defer cancel()

	return msg.Iter(func(i int, p types.Part) error {
		c, err := a.getContainer(a.container.String(i, msg))
		if err != nil {
			return err
		}
		if err := a.uploadToBlob(ctx, p.Get(), a.path.String(i, msg), a.blobType.String(i, msg), c); err != nil {
			a.log.Errorf("Error uploading blob: %v.", err)
			if containerNotFound(err) {
				if _, cerr := c.Create(ctx, azblob.Metadata{}, azblob.PublicAccessNone); cerr != nil {

If this is successful, would it make sense to attempt uploadToBlob once more?

mfamador

comment created time in 4 days

Pull request review comment Jeffail/benthos

Azure Blob Storage output

func (a *AzureBlobStorage) getContainer(name string) (*azblob.ContainerURL, error) {
	accountName, accountKey := a.conf.StorageAccount, a.conf.StorageAccessKey
	if len(accountName) == 0 || len(accountKey) == 0 {
		return nil, fmt.Errorf("invalid azure storage account credentials")
	}
	credential, err := azblob.NewSharedKeyCredential(accountName, accountKey)

Would it make sense to initialize this once during construction and reuse the credential struct? Also I'm wondering if we should actually allow users to leave the access key empty in order to use an anonymous credential.

mfamador

comment created time in 4 days

Pull request review comment Jeffail/benthos

Azure Blob Storage output

func (a *AzureBlobStorage) getContainer(name string) (*azblob.ContainerURL, error) {
	accountName, accountKey := a.conf.StorageAccount, a.conf.StorageAccessKey
	if len(accountName) == 0 || len(accountKey) == 0 {

This check ought to be done within NewAzureBlobStorage so that users can see the error during start up.

mfamador

comment created time in 4 days

push event Jeffail/benthos

Ashley Jeffs

commit sha 3912af76d05002a2a3a1ceec10ac59634d16179a

Fix title

view details

push time in 6 days

push event Jeffail/benthos

Ashley Jeffs

commit sha 3912af76d05002a2a3a1ceec10ac59634d16179a

Fix title

view details

push time in 6 days

push event Jeffail/benthos

Ashley Jeffs

commit sha 9512688a0fd13a44492039fdba21e1a8fd2e8542

Short circuit boolean operands

Closes #464

view details

Ashley Jeffs

commit sha 0252314e195ac0ceb427a4ea21b65ce31b9cd7de

Fix timestamp unix docs

view details

Ashley Jeffs

commit sha 5d879e55eea9e94fd33e27d82eb15c01a879f627

Add SASL fields to AMQP 1 input/output

view details

Ashley Jeffs

commit sha a91eb3842db6caeaf40a0de8b9fba761ebed65ef

Add config linting to test

view details

Ashley Jeffs

commit sha 16d55ecb2d02fe1297cdb999088a6433d17d72c0

Support triple dot wildcard paths in lint subcmd

view details

Ashley Jeffs

commit sha 712a56e9e88785f70353ff8c06e36406a82d07d6

Support unit tests within the target config file

view details

Ashley Jeffs

commit sha 782a328820597fb09ec6cf5a1d8be93d183c9c24

Update CHANGELOG

view details

push time in 6 days

created tag benthosdev/benthos-lab

tag v0.9.9

A web app for writing, executing and sharing Benthos pipeline configurations

created time in 6 days

push event benthosdev/benthos-lab

Ashley Jeffs

commit sha 3a38f435f01ee70aab9e79ec4499e9c3b76082db

Update Benthos

view details

push time in 6 days

issue comment Jeffail/benthos

Question - 'try' output with batches

Been thinking through this over the weekend and I'm pretty sure I can implement automatic try into batching without you needing to configure around it, so with the following:

output:
  try:
  - foo:
      batching:
        count: 100
  - bar: {}

If 93 out of 100 messages in the batch succeed, only the 7 failed messages would be routed to the bar output. There will be some edge cases where it would retry the whole batch depending on the output type (sometimes it's simply not possible to know which messages in particular failed), but dynamo should be possible.

I'll try and take a first attempt at this next weekend.

bojtirmw

comment created time in 6 days

release Jeffail/benthos

v3.20.0

released time in 6 days

created tag Jeffail/benthos

tag v3.20.0

A stream processor for mundane tasks written in Go

created time in 6 days

push event Jeffail/benthos

Ashley Jeffs

commit sha 782a328820597fb09ec6cf5a1d8be93d183c9c24

Update CHANGELOG

view details

push time in 6 days

push event Jeffail/benthos

Ashley Jeffs

commit sha 712a56e9e88785f70353ff8c06e36406a82d07d6

Support unit tests within the target config file

view details

push time in 6 days

push event Jeffail/benthos

Ashley Jeffs

commit sha 16d55ecb2d02fe1297cdb999088a6433d17d72c0

Support triple dot wildcard paths in lint subcmd

view details

push time in 6 days

push event Jeffail/benthos

Ashley Jeffs

commit sha a91eb3842db6caeaf40a0de8b9fba761ebed65ef

Add config linting to test

view details

push time in 6 days

push event Jeffail/benthos

Ashley Jeffs

commit sha 5d879e55eea9e94fd33e27d82eb15c01a879f627

Add SASL fields to AMQP 1 input/output

view details

push time in 7 days

issue closed Jeffail/benthos

Bloblang timestamp functions appear to not respect precision params

* As `bloblang` section of the docs says about `timestamp_unix` function: "Resolves to the current unix timestamp in seconds. You can add fractional precision up to the nanosecond by specifying the precision as an argument, e.g. `timestamp_unix(3)` for millisecond precision." I have tried using `timestamp_unix(3)` expecting millis precision but it still returns seconds, or am I missing something? https://lab.benthos.dev/l/FX2Kd5LccHP

closed time in 7 days

Jeffail

issue comment Jeffail/benthos

Bloblang timestamp functions appear to not respect precision params

I've updated the docs, as the precision params don't really serve a purpose anyway. In order to have a fractional timestamp the currently correct way would be to use timestamp_unix_nano() / 1000000000. It's a little gross, so it might be worth adding an optional boolean param to timestamp_unix(true) which adds fractional seconds, but I'll wait until there's more demand.
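For example, as a processor mapping (the ts field name here is just for illustration):

pipeline:
  processors:
    - bloblang: 'root.ts = timestamp_unix_nano() / 1000000000'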

Jeffail

comment created time in 7 days

push event Jeffail/benthos

Ashley Jeffs

commit sha 0252314e195ac0ceb427a4ea21b65ce31b9cd7de

Fix timestamp unix docs

view details

push time in 7 days

push event Jeffail/benthos

Ashley Jeffs

commit sha 9512688a0fd13a44492039fdba21e1a8fd2e8542

Short circuit boolean operands

Closes #464

view details

push time in 7 days

issue closed Jeffail/benthos

Bloblang boolean operands should short circuit

Performing foo() && bar() in Bloblang should short circuit and not execute bar() if foo() returns false. Currently both are evaluated before the operator is applied.

closed time in 7 days

Jeffail

issue comment Jeffail/benthos

Question - 'try' output with batches

Am I right suspecting that try output passes the whole batch down to drop?

Yeah, unfortunately a limitation of Benthos is that when a broker type receives a batch it attempts and acknowledges it as a whole. Therefore when one payload fails they all do.

Can I achieve that only the failed message parts are passed to drop and error() is filled properly?

The error() of a message is currently only populated during processor errors, but I can look into expanding it so that the try broker also populates it.

As for breaking the messages down, there is a pattern for this where you try into a new try that breaks the batch back down with split, which allows you to reattempt each payload individually. The cons of this approach are that payloads which were sent successfully in the first batched attempt will be resent, and it also slows down your pipeline by sending messages solo; if the errors are common and therefore in a warm path this might be a problem.
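For reference, a rough sketch of the shape of that pattern (foo and bar are placeholder outputs, and the exact nesting may need adjusting for your own config):

output:
  try:
  - foo:
      batching:
        count: 100
  - processors:
    - split: {}
    try:
    - foo: {}
    - bar: {}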

I've actually used a pattern similar to this with Kafka in the past, so I can help out with the configuration if you want to try it out. We can then also look into ways of baking the pattern into something easier to configure.

Is the way I do the batching right? Do you have anything to add/advise or is this the right way to do it atm? (also asking because I just saw there is a new release of 3.19.0 containing some switch improvements).

This looks fine, it's exactly how I'd recommend doing it.

bojtirmw

comment created time in 9 days

issue comment Jeffail/benthos

FEEDBACK WANTED: New `bloblang` processor

Thanks @zlozano, that should be quick and easy to fix, I'll look at it this weekend: https://github.com/Jeffail/benthos/issues/464

Jeffail

comment created time in 9 days

issue opened Jeffail/benthos

Bloblang boolean operands should short circuit

Performing foo() && bar() in Bloblang should short circuit and not execute bar() if foo() returns false. Currently both are evaluated before the operator is applied.

created time in 9 days

issue comment Jeffail/benthos

Cache expects JSON data

Hey @patrobinson, the problem is with the postmap `tmp.principal: .`. In order to insert the result of the process_map processors, Benthos requires the result itself to be valid JSON (an object, array, number, boolean or string). Since your goal is to set tmp.principal to a string result, you just need to quote the result so that it's a valid JSON string:

pipeline:
  threads: 1
  processors:
    - cache:
        cache: assume_role_cache
        operator: set
        key: '${!json_field:principal_id}'
        value: ${!json_field:arn}
    - process_map:
        processors:
          - cache:
              cache: assume_role_cache
              operator: get
              key: ${!json_field:userIdentity.principalId}
          - bloblang: 'root = content().quote()'
        postmap:
          tmp.principal: .
patrobinson

comment created time in 9 days

issue comment Jeffail/benthos

Question - 'try' output with batches

First of all, thanks for Benthos! It's a really great tool, and I find it amazing how you maintain it! It's awesome to see how frequently there is a new release with some really good features and quick fixes! (and I can say, bloblang is a huge hit ;) )

❤️

* As `bloblang` section of the docs says about `timestamp_unix` function: "Resolves to the current unix timestamp in seconds. You can add fractional precision up to the nanosecond by specifying the precision as an argument, e.g. `timestamp_unix(3)` for millisecond precision." I have tried using `timestamp_unix(3)` expecting millis precision but it still returns seconds, or am I missing something? https://lab.benthos.dev/l/FX2Kd5LccHP

This looks like a bug. It should be a quick fix, so I'll try and get it in the next release; I've opened https://github.com/Jeffail/benthos/issues/463 to track it.

* dynamo cache processor: it produces warning log ("key not found:") if there is no value found for a key when using `get`. It's not a big deal, just wondering if there's a possibility to eliminate that log entry, since in my pipeline it is okay not having a value for a key when calling get, and I handle the error properly. This log entry creates a bit of noise in the logs which I can eliminate by filtering out in our log visualisation but that would be great if there would be no such log entry at all. Or is there any other way to check if the dynamo cache has a value under a specific key?

Agreed, that's annoying; we shouldn't really be logging that from the cache client, so I can get that removed.

I need to pop out so I'll have to tackle the batching question later.

bojtirmw

comment created time in 11 days

issue opened Jeffail/benthos

Bloblang timestamp functions appear to not respect precision params

* As `bloblang` section of the docs says about `timestamp_unix` function: "Resolves to the current unix timestamp in seconds. You can add fractional precision up to the nanosecond by specifying the precision as an argument, e.g. `timestamp_unix(3)` for millisecond precision." I have tried using `timestamp_unix(3)` expecting millis precision but it still returns seconds, or am I missing something? https://lab.benthos.dev/l/FX2Kd5LccHP

created time in 11 days

issue comment Jeffail/benthos

[Bug] - Balanced-Kinesis plugin fails to start checkpointer

Interesting, thanks @kamaroyl, that should help narrow it down. I suspect the setting isn't being propagated but I'll need to dig into the library a little bit.

kamaroyl

comment created time in 11 days

issue comment Jeffail/benthos

Support for AMQP 1.0

Hey @JohnRoesler, thanks for trying it out. I'm assuming this is due to not having SASL auth set up yet; it should be quick to add.

mintbridge

comment created time in 11 days

push event Jeffail/benthos

Ashley Jeffs

commit sha d1987f21d1dd8de112df126f82e4a6f116216a6b

Add service opt func for defining string flags

view details

Zachary Lozano

commit sha cf949af668459e58b41a514c55602aa589576520

add strict mode for switch conditions

view details

Ashley Jeffs

commit sha 94ce02c10f1c031642f29bcc820b142276e8ae63

Merge pull request #458 from zlozano/switch-output

add strict mode for switch conditions

view details

Ashley Jeffs

commit sha 93519fc7a933e8b22fc18bf5ae5635909f24816b

Update CHANGELOG, docs

view details

Zachary Lozano

commit sha a4ec416a3931a0fe91f8022f0474c4ce8395acfc

expose max_in_flight for switch

view details

Ashley Jeffs

commit sha 0be22f77f6556f619c4d267a8285b1ec1b5e576c

Merge pull request #459 from zlozano/switch-max-in-flight

expose max_in_flight for switch

view details

Ashley Jeffs

commit sha c0f998e46ebfbfd33a031e236c30fff321b03d02

Update CHANGELOG

view details

Ashley Jeffs

commit sha 4f129983dc64e82ea6a8fadcf9144e59ac7bf6ea

Add AMQP 1.0 input and output betas

view details

Ashley Jeffs

commit sha 3ed21c14e6ebaa506128e89a54293a6df066e576

Update release notes script

view details

push time in 13 days

create branch Jeffail/homebrew-core

branch: benthos-3.19.0

created branch time in 13 days

PR opened Homebrew/homebrew-core

benthos 3.19.0

Created with brew bump-formula-pr.

+2 -2

0 comments

1 changed file

pr created time in 13 days

release Jeffail/benthos

v3.19.0

released time in 13 days

created tag benthosdev/benthos-lab

tag v0.9.8

A web app for writing, executing and sharing Benthos pipeline configurations

created time in 13 days

push event benthosdev/benthos-lab

Ashley Jeffs

commit sha 58e94a4230dc7c305ac2ea8b7035dac0e919fe6e

Update benthos version

view details

push time in 13 days

created tag Jeffail/benthos

tag v3.19.0

A stream processor for mundane tasks written in Go

created time in 13 days

push event Jeffail/benthos

Ashley Jeffs

commit sha 3ed21c14e6ebaa506128e89a54293a6df066e576

Update release notes script

view details

push time in 13 days

issue comment Jeffail/benthos

Support for AMQP 1.0

@mintbridge, @holmanb, @JohnRoesler, I finally got around to this https://github.com/Jeffail/benthos/commit/4f129983dc64e82ea6a8fadcf9144e59ac7bf6ea

I've added some very basic components and labelled them BETA; they're currently tested against ActiveMQ. There's work needed for supporting metadata/headers. I'll probably put a release out this evening. Is anyone interested in being a guinea pig and trying them out?

mintbridge

comment created time in 13 days

push event Jeffail/benthos

Ashley Jeffs

commit sha 4f129983dc64e82ea6a8fadcf9144e59ac7bf6ea

Add AMQP 1.0 input and output betas

view details

push time in 13 days

issue comment Jeffail/benthos

Question - Http Processor

Hey @mallikvala, yeah there are two ways you can do this. If you're processing JSON documents then you can perform the HTTP request and insert the result into the original payload with the process_map processor:

pipeline:
  processors:
    - process_map:
        processors:
          - http: {} # TODO
        postmap:
          http_result: .

The other way is to create a clone of the message (creating a batch of two identical messages), only apply the http processor to the second message and then use split to break the batch into two individual messages:

pipeline:
  processors:
    - select_parts:
        parts: [0,0]
    - process_map:
        parts: [1]
        processors:
          - http: {} # TODO
    - split: {}
mallikvala

comment created time in 14 days

push event Jeffail/benthos

Ashley Jeffs

commit sha c0f998e46ebfbfd33a031e236c30fff321b03d02

Update CHANGELOG

view details

push time in 14 days

issue comment Jeffail/benthos

Question - Kafka Output

Hey @kozelok,

  1. The functionality is the same; the difference is that doing the batching at the input level allows the kafka input to batch messages individually per topic partition, allowing them to be batch processed in order (see the sketch below this list).
  2. It's not possible to set a global limit, but you can indirectly control the number of goroutines by limiting the number of inputs, processing threads and outputs. When you increase the max_in_flight field of an output you'll increase the number of goroutines.
  3. Unfortunately, using sarama to asynchronously batch and dispatch messages prevents Benthos from propagating acknowledgements up to the input level for ensuring delivery guarantees.
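To illustrate points 1 and 2, here's a rough sketch of batching at the kafka input versus raising max_in_flight on the output; the addresses, topics and field placement are placeholders and should be checked against the docs:

input:
  kafka_balanced:
    addresses: [ localhost:9092 ]
    topics: [ foo ]
    batching:
      count: 50

output:
  kafka:
    addresses: [ localhost:9092 ]
    topic: bar
    max_in_flight: 10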
kozelok

comment created time in 14 days

push event Jeffail/benthos

Zachary Lozano

commit sha a4ec416a3931a0fe91f8022f0474c4ce8395acfc

expose max_in_flight for switch

view details

Ashley Jeffs

commit sha 0be22f77f6556f619c4d267a8285b1ec1b5e576c

Merge pull request #459 from zlozano/switch-max-in-flight

expose max_in_flight for switch

view details

push time in 14 days

PR merged Jeffail/benthos

expose max_in_flight for switch

👋

me again.

This PR is short, sweet, and hopefully makes sense. I think there are a couple of use cases where exposing this makes sense.

  1. Where the child outputs have max_in_flight > 1
  2. Where different outputs are mutually exclusive

I have a use case where I am routing multiple inputs to multiple outputs. If one of those outputs fails, and has an aggressive retry policy, then other messages are blocked and/or contending for the transaction channel until that head of line message is dropped, skipped (which may not be possible depending on the input), or succeeds. If we have the ability to spin up additional consumer loops here then we can mitigate this problem quite a bit I believe for distinct, mutually exclusive outputs/conditions.
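As a rough illustration of the kind of config this enables (the output and condition names here are placeholders):

output:
  switch:
    max_in_flight: 3
    outputs:
    - condition:
        bloblang: meta("route") == "a"
      output:
        foo: {}
    - condition:
        bloblang: meta("route") == "b"
      output:
        bar: {}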

+18 -1

1 comment

3 changed files

zlozano

pr closed time in 14 days

pull request comment Jeffail/benthos

expose max_in_flight for switch

Thanks @zlozano, looks good. I was a little hesitant to expose this at first but I don't see any valid reason not to.

zlozano

comment created time in 14 days

push event Jeffail/benthos

Ashley Jeffs

commit sha 93519fc7a933e8b22fc18bf5ae5635909f24816b

Update CHANGELOG, docs

view details

push time in 17 days

push event Jeffail/benthos

Zachary Lozano

commit sha cf949af668459e58b41a514c55602aa589576520

add strict mode for switch conditions

view details

Ashley Jeffs

commit sha 94ce02c10f1c031642f29bcc820b142276e8ae63

Merge pull request #458 from zlozano/switch-output

add strict mode for switch conditions

view details

push time in 17 days

PR merged Jeffail/benthos

add strict mode for switch conditions

We have a use case where our switch outputs are an exhaustive list, and the condition is derived from user input. If the input is incorrect, we want to communicate that back to the input, and ultimately the caller. This PR is a proposal to satisfy this need. Happy to pivot if there is a better way to go about this.

+79 -0

1 comment

2 changed files

zlozano

pr closed time in 17 days

pull request comment Jeffail/benthos

add strict mode for switch conditions

Thanks @zlozano, this looks great! I'm planning to put a release out this weekend.

zlozano

comment created time in 17 days

push event Jeffail/benthos

Ashley Jeffs

commit sha d1987f21d1dd8de112df126f82e4a6f116216a6b

Add service opt func for defining string flags

view details

push time in 20 days

push event Jeffail/benthos

Ashley Jeffs

commit sha f7655970d7901a40bc5cb1c1c5d17bfcd09143a6

Fix title

view details

push time in 20 days

push event Jeffail/benthos

Ashley Jeffs

commit sha f7655970d7901a40bc5cb1c1c5d17bfcd09143a6

Fix title

view details

push time in 20 days

issue comment Jeffail/benthos

[Bug] - Balanced-Kinesis plugin fails to start checkpointer

Hey @kamaroyl, are you using the endpoint field in order to connect to a third party service? If so are you able to share which one?

kamaroyl

comment created time in 20 days

Pull request review comment timberio/vector

feat(new transform): add transaction transform

    fn transform_stream(
        self: Box<Self>,
        input_rx: Box<dyn Stream01<Item = Event, Error = ()> + Send>,
    ) -> Box<dyn Stream01<Item = Event, Error = ()> + Send>
    where
        Self: 'static,
    {
        let mut me = self;

        let poll_period = me.flush_period.clone();

        let mut flush_stream = tokio::time::interval(poll_period);
        let mut input_stream = Compat01As03::new(input_rx);

        let stream = stream! {

Nice! I love this.

lukesteensen

comment created time in 22 days

Pull request review comment timberio/vector

feat(new transform): add transaction transform

 pub mod is_metric;
 
 pub use check_fields::CheckFieldsConfig;
 
+/// A condition enum that can be optionally parsed without a `type` field, and
+/// defaults to a `check_fields` condition.
+#[derive(Serialize, Deserialize, Debug)]
+#[serde(untagged)]
+pub enum DefaultedCondition {
+    FromType(Box<dyn ConditionConfig>),
+    NoTypeCondition(CheckFieldsConfig),
+}
+
+impl DefaultedCondition {
+    pub fn build(&self) -> crate::Result<Box<dyn Condition>> {
+        Ok(match self {
+            DefaultedCondition::FromType(c) => c.build()?,
+            DefaultedCondition::NoTypeCondition(c) => c.build()?,
+        })
+    }
+}
+

And this can be removed; it looks like it has since been implemented as https://github.com/timberio/vector/blob/master/src/conditions/mod.rs#L37

lukesteensen

comment created time in 22 days

Pull request review comment timberio/vector

feat(new transform): add transaction transform

 def initialize(hash)
     end
 
     if wildcard? && !object?
-      if !@examples.any? { |example| example.is_a?(Hash) }
+      if @examples.any? { |example| !example.is_a?(Hash) }

I'm pretty sure this isn't wanted

lukesteensen

comment created time in 22 days

Pull request review comment timberio/vector

feat(new transform): add transaction transform

 impl Transform for Swimlane {
 #[derive(Deserialize, Serialize, Debug)]
 #[serde(deny_unknown_fields)]
 pub struct SwimlanesConfig {
-    lanes: IndexMap<String, AnyCondition>,
+    lanes: IndexMap<String, DefaultedCondition>,

Undo all of these back to AnyCondition

lukesteensen

comment created time in 22 days

issue comment timberio/vector

Live documentation vs released code confusion

I had this issue with benthos and chose to maintain a separate branch that deploys the docs site; each time you release you rebase the docs branch onto the tag. The main motivation for that is it allows you to occasionally cherry-pick docs fixes from master that are relevant to the current release whilst omitting docs for new features.

bruceg

comment created time in 23 days

create branch Jeffail/homebrew-core

branch : benthos-3.18.0

created branch time in a month

PR opened Homebrew/homebrew-core

benthos 3.18.0

Created with brew bump-formula-pr.

+2 -2

0 comments

1 changed file

pr created time in a month

issue closed Jeffail/benthos

Allow http_client input to process empty messages

Currently, if the http_client input receives an empty message it will drop it. Sometimes the status code and headers are all the user wants to process and so we should allow these messages to come through.

Some thought needs to be put into whether this should be considered a "breaking change" as it's not obvious. If it is then we'll need to expose this as a config field, maybe something like drop_empty_bodies: true which can be set to false.
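For illustration, a minimal sketch of how such a field might look on the input once exposed (the URL here is hypothetical, and the field is assumed to default to true so that existing configs keep their current behaviour):

input:
  http_client:
    url: https://example.com/status # hypothetical endpoint
    verb: GET
    drop_empty_bodies: false # set to false to keep empty responses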

closed time in a month

Jeffail

issue comment Jeffail/benthos

Allow http_client input to process empty messages

Released: https://github.com/Jeffail/benthos/releases/tag/v3.18.0

Jeffail

comment created time in a month

release Jeffail/benthos

v3.18.0

released time in a month

push event Jeffail/benthos

Ashley Jeffs

commit sha 44513973122c20a06f8739078d10e28d435dfcd0

Add drop_empty_bodies field to http_client

view details

Ashley Jeffs

commit sha d779837b23be80a762ac168e24fe73dac6fafc19

fix lambda build command

view details

Ashley Jeffs

commit sha 1672a83fbed275693b2595b09a279a20ee2f5148

Update CHANGELOG

view details

push time in a month

created tag benthosdev/benthos-lab

tag v0.9.7

A web app for writing, executing and sharing Benthos pipeline configurations

created time in a month

push event benthosdev/benthos-lab

Ashley Jeffs

commit sha 8b722e0d712c493564016e958cb803517ad4c2f9

Upgrade benthos

view details

push time in a month

created tag Jeffail/benthos

tag v3.18.0

A stream processor for mundane tasks written in Go

created time in a month

push event Jeffail/benthos

Ashley Jeffs

commit sha 1672a83fbed275693b2595b09a279a20ee2f5148

Update CHANGELOG

view details

push time in a month

PR closed Jeffail/benthos

Use Go Modules for lambda package

When trying to install the lambda package as per the documentation, it fails with:

09:53 $ go get github.com/go-redis/redis/v7
package github.com/go-redis/redis/v7: cannot find package "github.com/go-redis/redis/v7" in any of:
/usr/local/Cellar/go/1.14.2_1/libexec/src/github.com/go-redis/redis/v7 (from $GOROOT)
/Users/patrickrobinson/go/src/github.com/go-redis/redis/v7 (from $GOPATH)

The suggested solution is to use Go Modules.

+5 -0

1 comment

1 changed file

patrobinson

pr closed time in a month

pull request comment Jeffail/benthos

Use Go Modules for lambda package

Hey @patrobinson, you can get away with simply including the version number in the path:

go build github.com/Jeffail/benthos/v3/cmd/serverless/benthos-lambda

That way we don't need to maintain multiple go.mod files. I've fixed the docs with the new URL.

patrobinson

comment created time in a month

push event Jeffail/benthos

Ashley Jeffs

commit sha d779837b23be80a762ac168e24fe73dac6fafc19

fix lambda build command

view details

push time in a month

push event Jeffail/benthos

Ashley Jeffs

commit sha 44513973122c20a06f8739078d10e28d435dfcd0

Add drop_empty_bodies field to http_client

view details

push time in a month

issue comment Jeffail/benthos

Question - Kafka Output

Hey @kozelok, it sounds like you're bottlenecked by the output, which will limit how much you can utilize processing threads due to the back pressure. With kafka you'll generally see the best performance gains by sending messages as batches.

My advice would be to remove the broker and have a single kafka output with both batching and a max in flight:

output:
  kafka:
    addresses:
      - localhost:9092
    topic: foo
    max_in_flight: 10 # Try tuning this second
    batching:
      count: 20 # Try tuning this first

I'd first try increasing/decreasing the batch size, and then once you've found the best fit, move on to increasing/decreasing the max in flight parameter. The max_in_flight parameter is a more efficient way of getting the async message delivery gains you'd see with a greedy broker, which is why I'd recommend taking the broker out.
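For contrast, a rough sketch of the sort of greedy broker arrangement this advice replaces (assumed for illustration, since the original config isn't shown here):

output:
  broker:
    pattern: greedy
    copies: 2 # two parallel kafka outputs competing for messages
    outputs:
      - kafka:
          addresses:
            - localhost:9092
          topic: foo

A single kafka output with max_in_flight gets similar in-flight concurrency from one producer client, which is why the broker can be dropped.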

kozelok

comment created time in a month

issue opened Jeffail/benthos

Allow http_client input to process empty messages

Currently, if the http_client input receives an empty message it will drop it. Sometimes the status code and headers are all the user wants to process and so we should allow these messages to come through.

Some thought needs to be put into whether this should be considered a "breaking change" as it's not obvious. If it is then we'll need to expose this as a config field, maybe something like drop_empty_bodies: true which can be set to false.

created time in a month

Pull request review comment timberio/vector

enhancement(cli): Consolidate and beautify `validate`

+use crate::{
+    config_paths, event,
+    runtime::Runtime,
+    topology::{self, builder::Pieces, Config, ConfigDiff},
+};
+use colored::*;
+use exitcode::ExitCode;
+use futures::compat::Future01CompatExt;
+use std::{fmt, fs::File, path::PathBuf};
+use structopt::StructOpt;
+
+#[derive(StructOpt, Debug)]
+#[structopt(rename_all = "kebab-case")]
+pub struct Opts {
+    /// Disables topology check
+    #[structopt(long)]
+    no_topology: bool,
+
+    /// Disables environment checks. That includes component checks and health checks.
+    #[structopt(long)]
+    no_environment: bool,
+
+    /// Shorthand for `--no_topology` and `--no_environment` flags. Just `-n` won't disable anything,
+    /// it needs to be used with `t` for `--no_topology`, and or `e` for `--no_environment` in any order.
+    /// Example:
+    /// `-nte` and `-net` both mean `--no_topology` and `--no_environment`

Nit: the actual flags are --no-topology and --no-environment.

ktff

comment created time in a month

push event timberio/go-datemath

Ashley Jeffs

commit sha 85899cbcaa93928307db1492018258c9532df8e1

Prevent lexer crash on `no[^w]` (#4)

For some reason a string matching `no[^w]` would cause the lexer to crash rather than returning an invalid char error. This doesn't seem to happen with `n[^o]`, which doesn't make any sense to me, so I can't say that this same crash won't occur with other combinations of partially matched tokens.

view details

push time in a month

PR merged timberio/go-datemath

fix: Prevent lexer crash on `/no[^w]/`

For some reason a string matching /no[^w]/ would cause the lexer to crash rather than returning an invalid char error. This doesn't seem to happen with /n[^o]/ which doesn't make any sense to me, so I can't say that this same crash won't occur with other combinations of partially matched tokens.

+42 -26

1 comment

3 changed files

Jeffail

pr closed time in a month

PR opened timberio/go-datemath

fix: Prevent lexer crash on `/no[^w]/`

For some reason a string matching /no[^w]/ would cause the lexer to crash rather than returning an invalid char error. This doesn't seem to happen with /n[^o]/ which doesn't make any sense to me, so I can't say that this same crash won't occur with other combinations of partially matched tokens.

+42 -26

0 comments

3 changed files

pr created time in a month

create branch Jeffail/go-datemath

branch : fix-lexer-crash

created branch time in a month

push event Jeffail/benthos

Jim Gustafsson

commit sha 7b7705e1d88f88715a1a8a228b769cb17aa58c09

Add type property for AMQP output

view details

Ashley Jeffs

commit sha 1d0b646be1184d9f0387332f8cb6455f4a82e691

Merge pull request #448 from jimgus/add_type_property

Add type property for AMQP_0.9 output

view details

Ashley Jeffs

commit sha ccf08ad1cddfc8753fb8f15b7a5058874512c64b

Update CHANGELOG

view details

ollystephens

commit sha 0eb9b08cbea7a553a6cbe21efd537be92bce4d35

feat: add base64url variant to decode and encode (#452)

* feat: add base64url variant to decode and encode

* fix: update the docs

Co-authored-by: Olly Stephens <olly.stephens@adarga.ai>

view details

Ashley Jeffs

commit sha f8373fb90f8a0edaa3c862f49cdb05fe89ff6d21

Add methods explode and without

view details

Ashley Jeffs

commit sha 22f84fc691cf18062cc3d271912f0c8284a4040b

Fix message functions in blobl subcommand

view details

Ashley Jeffs

commit sha 0c550aaa02a880be01eeeb2369bb63fd11b0b634

Add opt func OptSetRoundTripper

view details

Ashley Jeffs

commit sha c9a3a6f67fcab1468165fd7bb862017231a84cbe

Update CHANGELOG

view details

Ashley Jeffs

commit sha 9ede24895dee9e8e9a2c1dd887264648a8e5002d

Fix deleting and skipping maps with blobl subcmd

view details

Ashley Jeffs

commit sha 9a742a5d366ee689a5555fb3d2e58726bf591746

Update filtering cookbook to use if

view details

push time in a month

push event Jeffail/benthos

Ashley Jeffs

commit sha 9a742a5d366ee689a5555fb3d2e58726bf591746

Update filtering cookbook to use if

view details

push time in a month

push event Jeffail/benthos

Ashley Jeffs

commit sha 9ede24895dee9e8e9a2c1dd887264648a8e5002d

Fix deleting and skipping maps with blobl subcmd

view details

push time in a month

create branch Jeffail/homebrew-core

branch : benthos-3.17.0

created branch time in a month

PR opened Homebrew/homebrew-core

benthos 3.17.0

Created with brew bump-formula-pr.

+2 -2

0 comment

1 changed file

pr created time in a month

created tag benthosdev/benthos-lab

tag v0.9.6

A web app for writing, executing and sharing Benthos pipeline configurations

created time in a month
