
klauspost/compress 1438

Optimized compression packages

klauspost/cpuid 333

CPU feature identification for Go

klauspost/dedup 164

Streaming Deduplication Package for Go

klauspost/asmfmt 158

Go Assembler Formatter

klauspost/crc32 74

CRC32 hash with x64 optimizations

klauspost/geoip-service 68

A fast in-memory http microservice for looking up MaxMind GeoIP2 and GeoLite2 database

klauspost/gad 36

Go After Dark

klauspost/doproxy 30

Reverse Proxy for managing multiple Digital Ocean backends.

klauspost/gfx 7

Graphic Drawing Library

klauspost/cld2 6

CLD2 (Compact Language Detector 2) bindings for Go (golang)

push event klauspost/minio

Klaus Post

commit sha dc1ebd24180eb56439c069a8d1ea5226af1e74e8

Just use init to initialize the tracker.

view details

push time in 16 hours

issue opened minio/warp

Add automatic time segmenting

Analysis of large runs can be very slow.

Add automatic bumping of the time segmentation length based on the time range.
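A rough sketch of how such auto-bumping could work (the function name and the 1000-segment cap are illustrative assumptions, not warp's actual code):

package main

import (
	"fmt"
	"time"
)

// segmentLength grows the segment duration with the analyzed time range so
// that large runs produce a bounded number of segments (hypothetical policy).
func segmentLength(analyzed time.Duration) time.Duration {
	seg := time.Second
	for analyzed/seg > 1000 { // keep analysis to at most ~1000 segments
		seg *= 10
	}
	return seg
}

func main() {
	fmt.Println(segmentLength(5 * time.Minute)) // short run: 1s segments
	fmt.Println(segmentLength(48 * time.Hour))  // long run: much coarser segments
}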

created time in 2 days

push event klauspost/minio

Klaus Post

commit sha 03b1c4032607a5adea7c8c12df91b782041032f6

Init tracker for test.

view details

push time in 2 days

push event klauspost/minio

Anis Elleuch

commit sha b207520d98689616fd8eb07f61dfc74a49fce4a3

Fix lifecycle GET: AWS SDK complaints on empty config (#9201)

view details

Robert Thomas

commit sha 27779565816d0e0e15547f28417ab7cde4a2102d

Improve YAML download links listed in K8s doc (#9213)

view details

Sidhartha Mani

commit sha 0c80bf45d00530e28e874c4bac0cec80b0cb1c72

Implement oboard diagnostics admin API (#9024)

- Implement a graph algorithm to test network bandwidth from every node to every other node
- Saturate any network bandwidth adaptively, accounting for slow and fast network capacity
- Implement parallel drive OBD tests
- Implement a paging mechanism for OBD test to provide periodic updates to client
- Implement Sys, Process, Host, Mem OBD Infos

view details

Klaus Post

commit sha 3c9c492d91e1b171b142bf752481e365f56dafae

Merge branch 'master' into data-usage-tracker

view details

push time in 3 days

push event klauspost/minio

Klaus Post

commit sha 6ab680721c49914c37c4743b84b5c7fa5d054d46

Debug info

view details

push time in 3 days

push event klauspost/minio

Klaus Post

commit sha 027f0284c8b42d1bced3d07b6f7503d09fb5a611

Who's annoying?

view details

push time in 3 days

push event klauspost/minio

Klaus Post

commit sha 6528d7ab5e0252cceb7dbd9609f766881002209a

Make global cycle counter.

view details

push time in 3 days

Pull request review comment klauspost/compress

zstd: If first block and 'final', encode direct.

 func (e *Encoder) nextBlock(final bool) error {
 		return fmt.Errorf("block > maxStoreBlockSize")
 	}
 	if !s.headerWritten {
+		// If we have a single block encode, do a sync compression.
+		if final && len(s.filling) > 0 {
+			s.current = e.EncodeAll(s.filling, s.current[:0])
+			var n2 int
+			n2, s.err = s.w.Write(s.current)
+			if s.err != nil {
+				return s.err
+			}
+			s.nWritten += int64(n2)
+			s.current = s.current[:0]
+			s.filling = s.filling[:0]
+			return nil
+		}

You must be doing something strange. Here is a bench that doesn't allocate (as much): https://gist.github.com/klauspost/446e9ab7aeae0b75d7974339b65df815

λ go test -bench=. -benchtime=10s
BenchmarkCZstd-32          13430            889105 ns/op         253.44 MB/s          25 B/op          0 allocs/op
BenchmarkGoZstd-32          7183           1545090 ns/op         145.84 MB/s      251130 B/op          1 allocs/op
BenchmarkGoZstd2-32         5437           2187727 ns/op         103.00 MB/s      112531 B/op          0 allocs/op
BenchmarkS2-32             22752            528732 ns/op         426.19 MB/s         975 B/op          8 allocs/op
klauspost

comment created time in 3 days

Pull request review comment klauspost/compress

zstd: If first block and 'final', encode direct.

 func (e *Encoder) nextBlock(final bool) error {
 		return fmt.Errorf("block > maxStoreBlockSize")
 	}
 	if !s.headerWritten {
+		// If we have a single block encode, do a sync compression.
+		if final && len(s.filling) > 0 {
+			s.current = e.EncodeAll(s.filling, s.current[:0])
+			var n2 int
+			n2, s.err = s.w.Write(s.current)
+			if s.err != nil {
+				return s.err
+			}
+			s.nWritten += int64(n2)
+			s.current = s.current[:0]
+			s.filling = s.filling[:0]
+			return nil
+		}

I respectfully disagree, but will also repeat that this will likely be the case when full parallel compression is added.

First of all Go has very advanced preemption and handles this nicely.

Secondly you are just postponing the problem. If you have a 2 CPU system, the problem will occur as soon as you have 3 users and so on.

If you just make compression slower by using 1 of 2 cores, it will take (in theory) 2x as long, meaning you are strained for resources 2x as long instead of just getting it done faster AND giving a better user experience when you are not limited for resources.

Assuming things scale linearly, you can only do x MB/s on your system, and there is no reason to slow down every transaction, since the system is likely to behave the same when you reach that total limit anyway, i.e. splitting the resources between the users.

klauspost

comment created time in 3 days

issue closed klauspost/compress

zstd performance and Reader/Writer reuse

I have a benchmark that produces the following results:

BenchmarkCZstd-12      	     664	   1793805 ns/op	 125.62 MB/s	  522419 B/op	       9 allocs/op
BenchmarkGoZstd-12    	     374	   3247447 ns/op	  69.39 MB/s	 1646825 B/op	      12 allocs/op

That is roughly 2 times slower (and results can be worse on smaller payloads) and allocs are rather big (1646825 to encode & decode 225339 bytes?). I know that I can achieve better results by using EncodeAll and DecodeAll, but I would like to use Encoder/Decoder as wrappers over bytes.Buffer. So my question is - am I doing anything wrong here with Encoder/Decoder? I've disabled CRC for more fair comparison based on our previous discussion - I plan to use it in production.

closed time in 3 days

vmihailenco

PR merged klauspost/compress

zstd: If first block and 'final', encode direct.

If writing the header (i.e. first block) and it is the final block, use block compression instead of async.

Addition for #248

+32 -11

0 comment

1 changed file

klauspost

pr closed time in 3 days

delete branch klauspost/compress

delete branch : zstd-small-blocks-encode-directly

delete time in 3 days

push event klauspost/compress

Klaus Post

commit sha e8d1c04622e241e72a5603ea8fba355a7dfef89e

zstd: If first block and 'final', encode direct. (#251)

* zstd: If first block and 'final', encode direct.

If writing the header (i.e. first block) and it is the final block, use block compression instead of async.

Addition for #248

view details

push time in 3 days

push event klauspost/compress-fuzz

Klaus Post

commit sha 0bf74cabd62c983c3bfe2182e9e91414f8452867

Add zstd CRC and more corpus.

view details

push time in 3 days

Pull request review comment klauspost/compress

zstd: If first block and 'final', encode direct.

 func (e *Encoder) nextBlock(final bool) error {
 		return fmt.Errorf("block > maxStoreBlockSize")
 	}
 	if !s.headerWritten {
+		// If we have a single block encode, do a sync compression.
+		if final && len(s.filling) > 0 {
+			s.current = e.EncodeAll(s.filling, s.current[:0])
+			var n2 int
+			n2, s.err = s.w.Write(s.current)
+			if s.err != nil {
+				return s.err
+			}
+			s.nWritten += int64(n2)
+			s.current = s.current[:0]
+			s.filling = s.filling[:0]
+			return nil
+		}

But when we have enough for 1 block we don't know how much more will come, so we start compressing the first block.

klauspost

comment created time in 3 days

Pull request review comment klauspost/compress

zstd: If first block and 'final', encode direct.

 func (e *Encoder) nextBlock(final bool) error {
 		return fmt.Errorf("block > maxStoreBlockSize")
 	}
 	if !s.headerWritten {
+		// If we have a single block encode, do a sync compression.
+		if final && len(s.filling) > 0 {
+			s.current = e.EncodeAll(s.filling, s.current[:0])
+			var n2 int
+			n2, s.err = s.w.Write(s.current)
+			if s.err != nil {
+				return s.err
+			}
+			s.nWritten += int64(n2)
+			s.current = s.current[:0]
+			s.filling = s.filling[:0]
+			return nil
+		}

When you are doing multiple blocks you will get the benefit of the concurrency. There will of course be a small dropoff as you get just above 1 block, but overall speed will be faster.

klauspost

comment created time in 3 days

issue comment klauspost/compress

zstd performance and Reader/Writer reuse

This will be addressed when I add full concurrent compression.

S2 has Writer.EncodeBuffer.

vmihailenco

comment created time in 3 days

push event klauspost/compress

Klaus Post

commit sha 56a69dbf7dfdacdaf139c6f1ee37226de6da966f

Keep track of full written frames.

view details

push time in 3 days

delete branch klauspost/compress

delete branch : zstd-readfrom-skip-crc

delete time in 3 days

push event klauspost/compress

Klaus Post

commit sha d8b029aceed3967823d48317ebd6227d5d83f289

zstd: Skip CRC if not requested. (#250)

view details

push time in 3 days


issue comment klauspost/compress

zstd performance and Reader/Writer reuse

My totally uneducated guess based on the code I've read is that you split the stream into blocks and each block is handled by separate goroutine + it looks like you have separate encoder goroutines that encode those blocks that are coming from the block goroutines. Synchronizing that is expensive, but it is very likely I got all that wrong. Sorry if that is the case.

Exactly. In streaming mode we collect until we have enough for the block size we want, by default 64KB.

When there is enough for a block we start compressing it in a separate goroutine. This is first compressing it into sequences which is then handed to another goroutine that encodes the block output. When the block is handed over the next block can start.

This means that long streams are quite fast, but small blocks of course suffer from the overhead of having to hand over data. #251 simply skips all the goroutines and compresses it at once if we know we only have to compress a single block (final == true and no frame header written).
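A minimal sketch of that two-stage handoff, with illustrative names rather than the library's internals: stage 1 match-finds a block and hands it over, stage 2 encodes the output, so the next block can start matching immediately.

package main

import "fmt"

// block stands in for the library's internal per-block state (illustrative).
type block struct {
	id   int
	data []byte
}

func main() {
	matched := make(chan block, 1) // hand-over point between the two stages
	done := make(chan struct{})

	go func() { // stage 2: encode finished blocks, in order
		for b := range matched {
			fmt.Printf("encoded block %d (%d bytes)\n", b.id, len(b.data))
		}
		close(done)
	}()

	for i := 0; i < 4; i++ { // stage 1: match each block, then hand it over
		matched <- block{id: i, data: make([]byte, 64<<10)} // 64KB default block
	}
	close(matched)
	<-done
}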

vmihailenco

comment created time in 3 days

Pull request review comment klauspost/compress

zstd: If first block and 'final', encode direct.

 func (e *Encoder) nextBlock(final bool) error {
 		return fmt.Errorf("block > maxStoreBlockSize")
 	}
 	if !s.headerWritten {
+		// If we have a single block encode, do a sync compression.
+		if final && len(s.filling) > 0 {
+			s.current = e.EncodeAll(s.filling, s.current[:0])
+			var n2 int
+			n2, s.err = s.w.Write(s.current)
+			if s.err != nil {
+				return s.err
+			}
+			s.nWritten += int64(n2)
+			s.current = s.current[:0]
+			s.filling = s.filling[:0]
+			return nil
+		}

This is not hiding any problems, but handles small < 1 block streaming encodes.

klauspost

comment created time in 3 days

Pull request review comment klauspost/compress

zstd: If first block and 'final', encode direct.

 func (e *Encoder) nextBlock(final bool) error {
 		return fmt.Errorf("block > maxStoreBlockSize")
 	}
 	if !s.headerWritten {
+		// If we have a single block encode, do a sync compression.
+		if final && len(s.filling) > 0 {
+			s.current = e.EncodeAll(s.filling, s.current[:0])
+			var n2 int
+			n2, s.err = s.w.Write(s.current)
+			if s.err != nil {
+				return s.err
+			}
+			s.nWritten += int64(n2)
+			s.current = s.current[:0]
+			s.filling = s.filling[:0]
+			return nil
+		}

What?

klauspost

comment created time in 3 days

PR opened klauspost/compress

zstd: If first block and 'final', encode direct.

If writing the header (i.e. first block) and it is the final block, use block compression instead of async.

Addition for #248

+14 -0

0 comment

1 changed file

pr created time in 3 days

create branch klauspost/compress

branch : zstd-small-blocks-encode-directly

created branch time in 3 days


delete branch klauspost/compress

delete branch : zstd-decode-bytes-buffer-reduce-allocs

delete time in 3 days

push event klauspost/compress

Klaus Post

commit sha 6fa181d0a3609d15da9d606057d1fd1f7f3ebd15

zstd: Reduce allocations when decoding from a bytes.Buffer (#249)

Fixes #248

view details

push time in 3 days

issue closed klauspost/compress

zstd performance and Reader/Writer reuse

I have a benchmark that produces the following results:

BenchmarkCZstd-12      	     664	   1793805 ns/op	 125.62 MB/s	  522419 B/op	       9 allocs/op
BenchmarkGoZstd-12    	     374	   3247447 ns/op	  69.39 MB/s	 1646825 B/op	      12 allocs/op

That is roughly 2 times slower (and results can be worse on smaller payloads) and allocs are rather big (1646825 to encode & decode 225339 bytes?). I know that I can achieve better results by using EncodeAll and DecodeAll, but I would like to use Encoder/Decoder as wrappers over bytes.Buffer. So my question is - am I doing anything wrong here with Encoder/Decoder? I've disabled CRC for more fair comparison based on our previous discussion - I plan to use it in production.

closed time in 3 days

vmihailenco

issue comment klauspost/compress

zstd performance and Reader/Writer reuse

@vmihailenco But there is no way to know that you will not be writing more until you close.

I did add another 'short circuit' though. PR in a minute.

vmihailenco

comment created time in 3 days

PR opened klauspost/compress

zstd: Skip CRC if not requested.
+3 -1

0 comment

1 changed file

pr created time in 3 days

create branch klauspost/compress

branch : zstd-readfrom-skip-crc

created branch time in 3 days

issue comment klauspost/compress

zstd performance and Reader/Writer reuse

after all Encoder.Writer can easily be a wrapper over EncodeAll.

No. With EncodeAll you always know how much you are compressing. Using a stream you don't know if the user will write more, so you need to buffer input and cannot start compressing until the stream is closed.

With a longer stream you can start compressing as soon as you have enough input for a block.
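For illustration, a minimal sketch contrasting the two modes with the klauspost/compress/zstd API (error handling omitted):

package main

import (
	"bytes"
	"fmt"

	"github.com/klauspost/compress/zstd"
)

func main() {
	payload := []byte("a payload whose full size is known up front")

	// EncodeAll: the whole input is visible, so compression can start at once.
	enc, _ := zstd.NewWriter(nil)
	compressed := enc.EncodeAll(payload, nil)

	// Streaming: the encoder cannot know whether more Writes will follow,
	// so it buffers until a block is full or Close marks the stream done.
	var buf bytes.Buffer
	w, _ := zstd.NewWriter(&buf)
	w.Write(payload)
	w.Close() // only now is the final (possibly partial) block flushed

	fmt.Println(len(compressed), buf.Len())
}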

vmihailenco

comment created time in 3 days

issue comment klauspost/compress

zstd performance and Reader/Writer reuse

Thanks for the detailed report 👍

vmihailenco

comment created time in 4 days

create branch klauspost/compress

branch : zstd-decode-bytes-buffer-reduce-allocs

created branch time in 4 days

issue comment klauspost/compress

zstd performance and Reader/Writer reuse

Yes. There is some special handling when you decode a *bytes.Buffer, and it does indeed allocate more than it should. I will send a PR shortly.

As you note, the streaming encoders/decoders are not for payloads this small.

Also, if you want to eliminate more allocations you can copy data to a hash function or something so you avoid the ioutil.ReadAll.
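A sketch of that suggestion, hashing the decompressed stream directly instead of buffering it with ioutil.ReadAll (the compressed input here is a placeholder):

package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"io"

	"github.com/klauspost/compress/zstd"
)

func main() {
	enc, _ := zstd.NewWriter(nil)
	compressed := enc.EncodeAll([]byte("hello world"), nil)

	dec, _ := zstd.NewReader(bytes.NewReader(compressed))
	defer dec.Close()

	h := sha256.New()
	if _, err := io.Copy(h, dec); err != nil { // no intermediate buffer allocated
		panic(err)
	}
	fmt.Printf("%x\n", h.Sum(nil))
}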

vmihailenco

comment created time in 4 days

push event klauspost/minio

Klaus Post

commit sha 2d56c96b978c0a864f445a1b90a66bd5ce77f831

Semi working state.

view details

push time in 4 days

push event klauspost/minio

Klaus Post

commit sha 087dc9a621abac0a91847e9a1fbb73f4e8c2e3e4

Add update tracking to storage layer.

view details

push time in 4 days

push event klauspost/minio

Klaus Post

commit sha 9292203b8453f2160097fa9b8f372e172583815c

Send bloom filter if available.

view details

push time in 5 days

push event klauspost/minio

Klaus Post

commit sha 0827e229db22e13a2387523952119193c4e0caa8

Don't replace locks and stuff

view details

push time in 5 days

PR opened minio/minio

Data usage tracker

Description

WIP.

Motivation and Context

How to test this PR?

Types of changes

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • [ ] Fixes a regression (If yes, please add commit-id or PR # here)
  • [ ] Documentation needed
  • [ ] Unit tests needed
  • [ ] Functional tests needed (If yes, add mint PR # here: )
+679 -13

0 comment

15 changed files

pr created time in 5 days

push event klauspost/minio

Klaus Post

commit sha ad766f748ac103572af8f549ed3eb192761e8a03

Add data update tracking

view details

push time in 5 days

push event klauspost/minio

Harshavardhana

commit sha b1a2169dcc5b4febc4edf1f82407e28b6f695475

fix: data usage crawler env handling, usage-cache.bin location (#9163)

canonicalize the ENVs such that we can bring these ENVs as part of the config values, as a subsequent change.

- fix location of per bucket usage to `.minio.sys/buckets/<bucket_name>/usage-cache.bin`
- fix location of the overall usage in `json` at `.minio.sys/buckets/.usage.json` (avoid conflicts with a bucket named `usage.json`)
- fix location of the overall usage in `msgp` at `.minio.sys/buckets/.usage.bin` (avoid conflicts with a bucket named `usage.bin`)

view details

Minio Trusted

commit sha c5b87f93dd3bd441bc20c96f6bcc158de69734bc

Update yaml files to latest version RELEASE.2020-03-19T21-49-00Z

view details

Nitish Tiwari

commit sha ecf156626664b846c5ab2aaaa0d8ff5481e5a5cf

Add an option to allow plaintext connection to LDAP/AD Server (#9151)

view details

Stephen N

commit sha 1ffa983a9d8efac1194fbba2d8365e40b9db41aa

added support for SASL/SCRAM on Kafka bucket notifications. (#9168)

fixes #9167

view details

Harshavardhana

commit sha ae654831aa6e676639d75e2e0c4e946dbdbd22ba

Add madmin package context support (#9172)

This is to improve responsiveness for all admin API operations, allowing callers to cancel any on-going admin operations if they happen to be waiting too long.

view details

Harshavardhana

commit sha b4bfdc92ccd7ea1de347dcd658846a595dea1b39

fix: admin console logger changes to log.Info

view details

stefan-work

commit sha f001e99fcd37b7271cc3e843d21958b460e87f93

create the final file with mode 0666 for multipart-uploads (#9173)

NAS gateway creates non-multipart-uploads with mode 0666, but multipart-uploads are created with a differing mode of 0644. Both modes should be equal! Otherwise it leads to files with different permissions based on their file size. This patch solves that by using 0666 for both cases.

view details

Harshavardhana

commit sha bf545dc3203bbc8c452e3a14a7a9aedb563ccb63

migrate to new minio-go with latest changes (#9176)

- extract userTags from Get/Head request (#1249)
- fix: Context cancellation not handled (#1250)
- Check for correct http status in remove object tagging (#1248)
- simplify extracting metadata in Head/Get object (#1245)
- fix: close and remove .minio.part file on errors (#1243)

view details

Klaus Post

commit sha e944f76914ff24fda0ca11d0893b8ac8768a9093

Merge branch 'master' of https://github.com/minio/minio

view details

poornas

commit sha 27b8f18cce9b63b9119995d7454a41a757b03bc4

Fix storage info message on startup (#9177)

view details

Harshavardhana

commit sha 3d3beb6a9d95f0f1a7ca0c0a19979c290639f9b0

Add response header timeouts (#9170)

- Add conservative timeouts up to 3 minutes for internode communication
- Add aggressive timeouts of 30 seconds for gateway communication

Fixes #9105 Fixes #8732 Fixes #8881 Fixes #8376 Fixes #9028

view details

Harshavardhana

commit sha ea18e51f4def222dc6c7ca86889e5082eec9e9ea

Support multiple LDAP OU's, smAccountName support (#9139)

Fixes #8532

view details

Harshavardhana

commit sha cfc9cfd84a0064ad9a7bb7eefe4d0415a783e209

fix: various optimizations, idiomatic changes (#9179)

- acquire since leader lock for all background operations - healing, crawling and applying lifecycle policies.
- simplify lifecycle to avoid network calls, which was a bug in implementation - we should hold a leader and do everything from there, we have access to entire name space.
- make listing, walking not interfere by slowing itself down like the crawler.
- effectively use global context everywhere to ensure proper shutdown, in cache, lifecycle, healing
- don't read `format.json` for prometheus metrics in StorageInfo() call.

view details

Harshavardhana

commit sha da04cb91ce36b21aa7c4ce3e8774ab3d41aa8c58

optimize listObjects to list only from 3 random disks (#9184)

view details

Krishna Srinivas

commit sha 45b1c66195e806c62a4497672614061b8cc11d4b

fix: implement splunk specific listObjects when delimiter=guidSplunk (#9186)

view details

poornas

commit sha 818d3bcaf58a6e68a8211bf286344cce2a99fbcf

fix: deprecate TestDiskCache test from unit tests (#9187)

view details

Harshavardhana

commit sha ff932ca2a0e429300fff8b7d36f3a6a11403b242

fix: log only catastrophic errors in prepare storage (#9189)

view details

Praveen raj Mani

commit sha e7a0be5bd356c39e5bd91b2baad7b2bb5ad42f5b

fix: throttling of events during their replay (#9188)

view details

Harshavardhana

commit sha 9a951da881a4c45ea3cee0295d6af919d7a489fe

honor the credentials of user admin for encrypt/decrypt (#9194)

Fixes #9193

view details

Anis Elleuch

commit sha 791821d59053956b98297817513288f12c518fa5

sa: Allow empty policy to indicate parent user's policy is inherited (#9185)

view details

push time in 5 days

push event klauspost/minio

push time in 5 days

push event klauspost/minio

Klaus Post

commit sha 781b4a5d8f4f4360be895eccaaefc14571ce28e0

Refactor data usage crawler

view details

Klaus Post

commit sha 28cb4c4dbe8b59a434100b2084b07fc7b8e13232

Update/refactor.

view details

Klaus Post

commit sha 0050264f6d1b00f4eb0e07127eb19e9021ec8277

Start

view details

Klaus Post

commit sha 6d67f09222a1ae512c06b735934ef64458aadbd7

Add CrawlAndGetDataUsage for zones and below until we reach the object api.

view details

Klaus Post

commit sha ab4b137aaf2b73b82d087ab58a3c39a26ae56394

Add rest support for new types.

view details

Klaus Post

commit sha 24a1adc548bc83d8a1bcc845ae51894901c68d42

Use xxhash for consistent hashing.

view details

Klaus Post

commit sha f5c9ff54b5e9126162974df600409f2c715aa7df

Refactor sendWhiteSpaceToHTTPResponse

view details

Klaus Post

commit sha 18b89ac313c004a9c952216839648c0a6cee4ae1

Simplify loop.

view details

Klaus Post

commit sha d05d2e0402c6f71d4101422810e171f08c1d8644

Merge branch 'master' into data-usage-crawler-recfactor

# Conflicts:
#	cmd/object-api-interface.go

view details

Klaus Post

commit sha 1697981a311c74cf6d3ec4b12e98a282f503d2b8

Add FS crawling.

view details

Klaus Post

commit sha a2c9e83706df5e7e61b636daea4e203315a54503

Start scanner

view details

Klaus Post

commit sha 64b002578bfda7c5a6f085289eaff004e18a8aae

Throw more code at it.

view details

Klaus Post

commit sha 3fa82ed342218f4071793bb1bcf53ee63bef82a6

Add scan loop.

view details

Klaus Post

commit sha 4d1e2126f398015c5edd925abb25d591044239b1

Start adding test.

view details

Klaus Post

commit sha 277d7b33280b374533a1cda33d3f692f94270aff

Finish tests.

view details

Klaus Post

commit sha cc39c1464bb1c7b161206440a7cd0a477e345816

Add serialize tests

view details

Klaus Post

commit sha 82b7aa141931ea4ea881178eb8d4e0fa5f0959a1

Remove unrelated code

view details

Klaus Post

commit sha 3acf51c5033a8f17cf5618481fd8c2980c3ecba2

Merge branch 'master' into data-usage-crawler-recfactor

view details

Klaus Post

commit sha a0f13279962cb0a885736f0063e594f73e749527

Add global cancellable context

Adjust inherently racy tests.

view details

Klaus Post

commit sha f57b54af9bcee4d3504178087a5e7b87037089a5

Comments and keep running.

view details

push time in 5 days

push event klauspost/minio

Klaus Post

commit sha 46536a4bdcacd7e329063876852f0760b4e92a91

Add data update tracking

view details

push time in 5 days

push event klauspost/minio

Klaus Post

commit sha ae433c281e16d5e00812368f58a5d09f9109c53d

Squashed commit of the following:

commit a7debe79b4b75844742289dc91fe1d3fd571a70d
Merge: e944f769 2196fd9c
Author: Klaus Post <klauspost@gmail.com>
Date: Wed Mar 25 11:54:49 2020 +0100
Merge branch 'master' of https://github.com/minio/minio

commit 2196fd9cd5d9b64c6a0630f0713839f7ac423b75
Author: Minio Trusted <trusted@minio.io>
Date: Wed Mar 25 07:11:33 2020 +0000
Update yaml files to latest version RELEASE.2020-03-25T07-03-04Z

commit ef6304c5c2f0e962c4f75cf83aae6ac7b1073b5f
Author: Krishna Srinivas <634494+krishnasrinivas@users.noreply.github.com>
Date: Tue Mar 24 23:26:13 2020 -0700
Improve connectDisks() performance (#9203)

commit 6b984410d566c7032133f797a679e485d8b80eda
Author: Nitish Tiwari <nitish@minio.io>
Date: Wed Mar 25 11:10:45 2020 +0530
Add support for self-healing related metrics in Prometheus (#9079)
Fixes #8988
Co-authored-by: Anis Elleuch <vadmeste@users.noreply.github.com>
Co-authored-by: Harshavardhana <harsha@minio.io>

commit 813e0fc1a86359a4b1a2c93b046f730cb13b27d3
Author: Harshavardhana <harsha@minio.io>
Date: Tue Mar 24 18:53:24 2020 -0700
fix: optimize isConnected to avoid url.String() conversions (#9202)
Stringifying in a loop can tax the system, avoid this and convert the endpoints to strings early on and remember them for the lifetime of the server.

commit 38cf263409e7ea29265965beac999524677f87c5
Author: Harshavardhana <harsha@minio.io>
Date: Tue Mar 24 14:51:06 2020 -0700
fix: docs remove goreportcard, its deprecated

commit 6f6a2214fc0294569a8e2c992d23db1c4f3c5851
Author: Harshavardhana <harsha@minio.io>
Date: Tue Mar 24 12:43:40 2020 -0700
Add rate limiter for S3 API layer (#9196)
- total number of S3 API calls per server
- maximum wait duration for any S3 API call
This implementation is primarily meant for situations where HDDs are not capable enough to handle the incoming workload and there is no way to throttle the client. This feature allows MinIO server to throttle itself such that we do not overwhelm the HDDs.

commit 791821d59053956b98297817513288f12c518fa5
Author: Anis Elleuch <vadmeste@users.noreply.github.com>
Date: Mon Mar 23 22:17:18 2020 +0100
sa: Allow empty policy to indicate parent user's policy is inherited (#9185)

commit 9a951da881a4c45ea3cee0295d6af919d7a489fe
Author: Harshavardhana <harsha@minio.io>
Date: Mon Mar 23 14:06:00 2020 -0700
honor the credentials of user admin for encrypt/decrypt (#9194)
Fixes #9193

commit e7a0be5bd356c39e5bd91b2baad7b2bb5ad42f5b
Author: Praveen raj Mani <praveen@minio.io>
Date: Tue Mar 24 01:04:39 2020 +0530
fix: throttling of events during their replay (#9188)

commit ff932ca2a0e429300fff8b7d36f3a6a11403b242
Author: Harshavardhana <harsha@minio.io>
Date: Mon Mar 23 07:32:18 2020 -0700
fix: log only catastrophic errors in prepare storage (#9189)

commit 818d3bcaf58a6e68a8211bf286344cce2a99fbcf
Author: poornas <poornas@users.noreply.github.com>
Date: Sun Mar 22 23:46:36 2020 -0700
fix: deprecate TestDiskCache test from unit tests (#9187)

commit 45b1c66195e806c62a4497672614061b8cc11d4b
Author: Krishna Srinivas <634494+krishnasrinivas@users.noreply.github.com>
Date: Sun Mar 22 19:23:47 2020 -0700
fix: implement splunk specific listObjects when delimiter=guidSplunk (#9186)

commit da04cb91ce36b21aa7c4ce3e8774ab3d41aa8c58
Author: Harshavardhana <harsha@minio.io>
Date: Sun Mar 22 16:33:49 2020 -0700
optimize listObjects to list only from 3 random disks (#9184)

commit cfc9cfd84a0064ad9a7bb7eefe4d0415a783e209
Author: Harshavardhana <harsha@minio.io>
Date: Sun Mar 22 12:16:36 2020 -0700
fix: various optimizations, idiomatic changes (#9179)
- acquire since leader lock for all background operations - healing, crawling and applying lifecycle policies.
- simplify lifecycle to avoid network calls, which was a bug in implementation - we should hold a leader and do everything from there, we have access to entire name space.
- make listing, walking not interfere by slowing itself down like the crawler.
- effectively use global context everywhere to ensure proper shutdown, in cache, lifecycle, healing
- don't read `format.json` for prometheus metrics in StorageInfo() call.

commit ea18e51f4def222dc6c7ca86889e5082eec9e9ea
Author: Harshavardhana <harsha@minio.io>
Date: Sat Mar 21 22:47:26 2020 -0700
Support multiple LDAP OU's, smAccountName support (#9139)
Fixes #8532

commit 3d3beb6a9d95f0f1a7ca0c0a19979c290639f9b0
Author: Harshavardhana <harsha@minio.io>
Date: Sat Mar 21 22:10:13 2020 -0700
Add response header timeouts (#9170)
- Add conservative timeouts up to 3 minutes for internode communication
- Add aggressive timeouts of 30 seconds for gateway communication
Fixes #9105 Fixes #8732 Fixes #8881 Fixes #8376 Fixes #9028

commit 27b8f18cce9b63b9119995d7454a41a757b03bc4
Author: poornas <poornas@users.noreply.github.com>
Date: Sat Mar 21 10:02:20 2020 -0700
Fix storage info message on startup (#9177)

commit e944f76914ff24fda0ca11d0893b8ac8768a9093
Merge: 2bb7ea99 bf545dc3
Author: Klaus Post <klauspost@gmail.com>
Date: Sat Mar 21 16:54:25 2020 +0100
Merge branch 'master' of https://github.com/minio/minio

commit bf545dc3203bbc8c452e3a14a7a9aedb563ccb63
Author: Harshavardhana <harsha@minio.io>
Date: Fri Mar 20 17:28:36 2020 -0700
migrate to new minio-go with latest changes (#9176)
- extract userTags from Get/Head request (#1249)
- fix: Context cancellation not handled (#1250)
- Check for correct http status in remove object tagging (#1248)
- simplify extracting metadata in Head/Get object (#1245)
- fix: close and remove .minio.part file on errors (#1243)

commit f001e99fcd37b7271cc3e843d21958b460e87f93
Author: stefan-work <51439505+stefan-work@users.noreply.github.com>
Date: Fri Mar 20 23:32:15 2020 +0100
create the final file with mode 0666 for multipart-uploads (#9173)
NAS gateway creates non-multipart-uploads with mode 0666. But multipart-uploads are created with a differing mode of 0644. Both modes should be equal! Else it leads to files with different permissions based on its file-size. This patch solves that by using 0666 for both cases.

commit b4bfdc92ccd7ea1de347dcd658846a595dea1b39
Author: Harshavardhana <harsha@minio.io>
Date: Fri Mar 20 15:13:41 2020 -0700
fix: admin console logger changes to log.Info

commit ae654831aa6e676639d75e2e0c4e946dbdbd22ba
Author: Harshavardhana <harsha@minio.io>
Date: Fri Mar 20 15:00:44 2020 -0700
Add madmin package context support (#9172)
This is to improve responsiveness for all admin API operations and allowing callers to cancel any on-going admin operations, if they happen to be waiting too long.

commit 1ffa983a9d8efac1194fbba2d8365e40b9db41aa
Author: Stephen N <stephen.nickson@gmail.com>
Date: Fri Mar 20 18:10:27 2020 +0000
added support for SASL/SCRAM on Kafka bucket notifications. (#9168)
fixes #9167

commit ecf156626664b846c5ab2aaaa0d8ff5481e5a5cf
Author: Nitish Tiwari <nitish@minio.io>
Date: Fri Mar 20 07:50:51 2020 +0530
Add an option to allow plaintext connection to LDAP/AD Server (#9151)

commit c5b87f93dd3bd441bc20c96f6bcc158de69734bc
Author: Minio Trusted <trusted@minio.io>
Date: Thu Mar 19 21:57:16 2020 +0000
Update yaml files to latest version RELEASE.2020-03-19T21-49-00Z

commit b1a2169dcc5b4febc4edf1f82407e28b6f695475
Author: Harshavardhana <harsha@minio.io>
Date: Thu Mar 19 09:47:47 2020 -0700
fix: data usage crawler env handling, usage-cache.bin location (#9163)
canonicalize the ENVs such that we can bring these ENVs as part of the config values, as a subsequent change.
- fix location of per bucket usage to `.minio.sys/buckets/<bucket_name>/usage-cache.bin`
- fix location of the overall usage in `json` at `.minio.sys/buckets/.usage.json` (avoid conflicts with a bucket named `usage.json`)
- fix location of the overall usage in `msgp` at `.minio.sys/buckets/.usage.bin` (avoid conflicts with a bucket named `usage.bin`)

# Conflicts:
#	go.mod
#	go.sum

view details

push time in 5 days

push eventklauspost/minio

Klaus Post

commit sha d3e4e97982172233c8101755b2e3671c1b7229ab

Regen

view details

push time in 5 days

issue comment minio/minio

two PNG files with same md5 but different name, it displays diffrent content-type in chrome.

@PoplarYang The client uploading the content is responsible for setting the content type. The server accepts the content-type as sent by the client.
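For example, with the minio-go v6 SDK the uploader sets the stored content type explicitly (endpoint, credentials and bucket below are placeholders):

package main

import (
	"log"
	"os"

	minio "github.com/minio/minio-go/v6"
)

func main() {
	client, err := minio.New("play.min.io", "ACCESS-KEY", "SECRET-KEY", true)
	if err != nil {
		log.Fatal(err)
	}

	f, err := os.Open("15-31-41-ZYGRZAAM.png")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	st, _ := f.Stat()

	// The server stores whatever content type the client sends here.
	_, err = client.PutObject("mybucket", "15-31-41-ZYGRZAAM.png", f, st.Size(),
		minio.PutObjectOptions{ContentType: "image/png"})
	if err != nil {
		log.Fatal(err)
	}
}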

PoplarYang

comment created time in 5 days

delete branch klauspost/compress

delete branch : zstd-adjust-window-sizes

delete time in 5 days

push event klauspost/compress

Klaus Post

commit sha 8f4a8f1c5ea44f1e9d5ee5119025eeb9fe56ae45

zstd: Adjust default window sizes (#247)

Double window size for default (8 MB), 4x for Better (16 MB).

view details

push time in 5 days

PR merged klauspost/compress

zstd: Adjust default window sizes

Double window size for default (8 MB), 4x for Better (16 MB).

+22 -10

0 comment

1 changed file

klauspost

pr closed time in 5 days

issue comment minio/minio

Distributed minio low read performance

@olddanmer Have you considered something like restic, which AFAIK doesn't require crawling the remote file system? rclone basically downloads the entire directory structure, which will be resource intensive.

You can use warp to get some idea of the expected speed/throughput of various operations.

rvadim

comment created time in 5 days

issue closed minio/minio

Distributed MinIO requirements for drivers

It's not a bug report, just to ask a question:

MinIO creates erasure-coding sets of 4, 6, 8, 10, 12, 14 or 16 drives. The number of drives you provide must be a multiple of one of those numbers.

When I deploy distributed minio, to ensure full data protection, is the number of drives above the total across all nodes or the number of drives per node? E.g. with 6 nodes, is 1 drive per node enough, or do I need at least 4 drives per node?

Thanks.

closed time in 5 days

NeoyeElf

issue comment minio/minio

Distributed MinIO requirements for drivers

Please ask questions on slack: https://slack.min.io/

You can see information on erasure setup here: https://docs.min.io/docs/minio-erasure-code-quickstart-guide.html and https://docs.min.io/docs/distributed-minio-quickstart-guide.html

NeoyeElf

comment created time in 5 days

issue closed minio/minio

two PNG files with same md5 but different name, it displays diffrent content-type in chrome.

Two PNG files with the same md5 but different names display different content-types in Chrome.

Expected Behavior

In fact, the two PNG files should have the same content-type.

Current Behavior

content-type in chrome

  • 15-31-41-ZYGRZAAm.png image/png
  • 15-31-41-ZYGRZAAM.png application/octet-stream
[root@iZuf64cw1gup8aagpfo2nuZ 25]# md5sum 15-31-41-ZYGRZAAm.png 15-31-41-ZYGRZAAM.png
0156b61a1b8ae18d597729cb95c57f8d  15-31-41-ZYGRZAAm.png
0156b61a1b8ae18d597729cb95c57f8d  15-31-41-ZYGRZAAM.png
[root@iZuf64cw1gup8aagpfo2nuZ 25]# stat 15-31-41-ZYGRZAAm.png 15-31-41-ZYGRZAAM.png
  File: "15-31-41-ZYGRZAAm.png"
  Size: 10703     	Blocks: 24         IO Block: 4096   regular file
Device: fd01h/64769d	Inode: 1575187     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-03-25 15:32:41.254553524 +0800
Modify: 2020-03-25 15:31:46.371290144 +0800
Change: 2020-03-25 15:32:33.381515742 +0800
 Birth: -
  File: "15-31-41-ZYGRZAAM.png"
  Size: 10703     	Blocks: 24         IO Block: 4096   regular file
Device: fd01h/64769d	Inode: 1575188     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-03-25 15:37:07.488831159 +0800
Modify: 2020-03-25 15:32:41.254553524 +0800
Change: 2020-03-25 15:32:41.254553524 +0800
 Birth: -
[root@iZuf64cw1gup8aagpfo2nuZ 25]# file 15-31-41-ZYGRZAAm.png 15-31-41-ZYGRZAAM.png
15-31-41-ZYGRZAAm.png: PNG image data, 574 x 288, 8-bit colormap, non-interlaced
15-31-41-ZYGRZAAM.png: PNG image data, 574 x 288, 8-bit colormap, non-interlaced

Possible Solution

Unknow

Steps to Reproduce (for bugs)


  1. upload 15-31-41-ZYGRZAAM.png to minio by AWS s3 go SDK, its content-type is application/octet-stream
  2. In minio backend, cp 15-31-41-ZYGRZAAM.png to 15-31-41-ZYGRZAAm.png; its content-type changes to image/png.
  3. when using the web to upload 15-31-41-ZYGRZAAM.png, its content-type is application/octet-stream

Context

15-31-41-ZYGRZAAM.png can be downloaded here in 7 days.

Regression

minio version RELEASE.2020-03-19T21-49-00Z

Your Environment


  • Version used (minio version): minio version RELEASE.2020-03-19T21-49-00Z
  • Environment name and version (e.g. Nginx 1.9.1): nginx version: nginx/1.16.1
  • Server type and version: CentOS Linux release 7.7.1908 (Core)
  • Operating System and version (uname -a): Linux iZuf64cw1gup8aagpfo2nuZ 3.10.0-1062.9.1.el7.x86_64 #1 SMP Fri Dec 6 15:49:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Link to your project:

closed time in 5 days

PoplarYang

issue comment minio/minio

two PNG files with same md5 but different name, it displays diffrent content-type in chrome.

We do not support manually changing data in the back end. If you do, you are on your own.

You don't provide information on how you run your server. I suppose you are running it with a single disk. In that case metadata is stored in /disk/.minio.sys/buckets/bucket/15-31-41-ZYGRZAAm.png 15-31-41-ZYGRZAAM.pn/fs.json.

This may change with no notice, so use that information at your own risk.

PoplarYang

comment created time in 5 days

push event klauspost/minio

Klaus Post

commit sha f35e384dc8ca93223440f18154ddf832afc0636d

Hook up more stuff.

view details

push time in 5 days

Pull request review comment minio/minio

deep heal object when bitrot detected

 func healErasureSet(ctx context.Context, setIndex int, xlObj *xlObjects) error {
 	// Heal all buckets with all objects
 	for _, bucket := range buckets {
 		// Heal current bucket
-		bgSeq.sourceCh <- bucket.Name
+		bgSeq.sourceCh <- healSource{
+			path: bucket.Name,
+		}
 
 		// List all objects in the current bucket and heal them
 		listDir := listDirFactory(ctx, xlObj.getLoadBalancedDisks()...)
 		walkResultCh := startTreeWalk(ctx, bucket.Name, "", "", true, listDir, nil)
 		for walkEntry := range walkResultCh {
-			bgSeq.sourceCh <- pathJoin(bucket.Name, walkEntry.entry)
+			bgSeq.sourceCh <- healSource{
+				path: pathJoin(bucket.Name, walkEntry.entry),
+			}
 		}
 	}
 
 	return nil
 }
 
+// deepHealObject heals given object path in deep to fix bitrot.
+func deepHealObject(ctx context.Context, objectPath string) {
+	// Get background heal sequence to send elements to heal
+	var bgSeq *healSequence
+	var ok bool
+	for {
+		bgSeq, ok = globalBackgroundHealState.getHealSequenceByToken(bgHealingUUID)
+		if ok {
+			break
+		}
+		time.Sleep(time.Second)

@balamurugana The main difference is that this is called from active requests which we don't want to hold up.

balamurugana

comment created time in 6 days

Pull request review comment minio/minio

Onboard Diagnostics

 func (client *peerRESTClient) NetworkInfo() (info madmin.ServerNetworkHardwareIn
 	return info, err
 }
 
+type networkOverloadedErr struct{}
+
+var networkOverloaded networkOverloadedErr
+
+func (n networkOverloadedErr) Error() string {
+	return "network overloaded"
+}
+
+type progressReader struct {
+	r            io.Reader
+	progressChan chan int64
+}
+
+func (p *progressReader) Read(b []byte) (int, error) {
+	n, err := p.r.Read(b)
+	if err != nil && err != io.EOF {
+		return n, err
+	}
+	p.progressChan <- int64(n)
+	return n, err
+}
+
+func (client *peerRESTClient) doNetOBDTest(ctx context.Context, dataSize int64, threadCount uint) (info madmin.NetOBDInfo, err error) {
+	latencies := []float64{}
+	throughputs := []float64{}
+
+	buf := make([]byte, dataSize)
+
+	buflimiter := make(chan bool, threadCount)
+	errChan := make(chan error, threadCount)
+
+	totalTransferred := int64(0)

totalTransferred is not an ahead-of-time counter. totalTransferred is only incremented after the buffer has been emptied of n bytes.

Correct, didn't read it correctly.

However, get rid of the lag imposed by the transferChan channel. Right now your before/after relies on everything in transferChan to have been applied. But you can just get rid of it.

In contrast, my approach measures throughput in terms of total bytes transferred between two points in time.

There is simply no need to count the number of active threads.

My main objection is that this assumes that 1 connection can saturate the NIC. We know that isn't true, which is why you have several goroutines running. So we should consider the number of requests, or at least reject the ones at the end where fewer are running.
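A minimal sketch of the "total bytes between two points in time" measurement under discussion; totalTransferred stands in for the PR's counter and the numbers are illustrative:

package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

var totalTransferred int64

func main() {
	go func() { // simulated transfers bumping the shared counter
		for {
			atomic.AddInt64(&totalTransferred, 64<<10)
			time.Sleep(time.Millisecond)
		}
	}()

	before, start := atomic.LoadInt64(&totalTransferred), time.Now()
	time.Sleep(time.Second) // measurement window
	after := atomic.LoadInt64(&totalTransferred)

	// Throughput is independent of how many goroutines happen to be active.
	fmt.Printf("%.0f bytes/sec\n", float64(after-before)/time.Since(start).Seconds())
}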

wlan0

comment created time in 6 days

Pull request review comment minio/minio

Onboard Diagnostics

+/*
+ * MinIO Cloud Storage, (C) 2020 MinIO, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package disk
+
+import (
+	"context"
+	"fmt"
+	"os"
+	"path/filepath"
+	"runtime"
+	"time"
+
+	"github.com/montanaflynn/stats"
+)
+
+const kb = uint64(1 << 10)
+const mb = uint64(kb << 10)
+const gb = uint64(mb << 10)
+
+var globalLatency = map[string]Latency{}
+var globalThroughput = map[string]Throughput{}
+
+// Latency holds latency information for write operations to the drive
+type Latency struct {
+	Avg          float64 `json:"avg_secs,omitempty"`
+	Percentile50 float64 `json:"percentile50_secs,omitempty"`
+	Percentile90 float64 `json:"percentile90_secs,omitempty"`
+	Percentile99 float64 `json:"percentile99_secs,omitempty"`
+	Min          float64 `json:"min_secs,omitempty"`
+	Max          float64 `json:"max_secs,omitempty"`
+}
+
+// Throughput holds throughput information for write operations to the drive
+type Throughput struct {
+	Avg          float64 `json:"avg_bytes/s,omitempty"`

@wlan0 avg_bytes_per_sec or something like it then. No / pls.

wlan0

comment created time in 6 days

issue comment minio/minio

memory keep rising as objects increase

It is quite common that buffers, etc. are kept in memory and that Go releases these lazily.

If you are interested in digging into the details, you can run the profiler to get before/after using mc admin profile start --type=mem myminio && run-test && mc admin profile stop myminio.

This will download server profiles you can inspect with Go tools like go tool pprof -base=profile-172.31.91.126_9000-mem-before.pprof profile-172.31.91.126_9000-mem.pprof. But this of course requires some insight into the Go development tools.

NeoyeElf

comment created time in 6 days

Pull request review comment minio/minio

deep heal object when bitrot detected

 func healErasureSet(ctx context.Context, setIndex int, xlObj *xlObjects) error {
 	// Heal all buckets with all objects
 	for _, bucket := range buckets {
 		// Heal current bucket
-		bgSeq.sourceCh <- bucket.Name
+		bgSeq.sourceCh <- healSource{
+			path: bucket.Name,
+		}
 
 		// List all objects in the current bucket and heal them
 		listDir := listDirFactory(ctx, xlObj.getLoadBalancedDisks()...)
 		walkResultCh := startTreeWalk(ctx, bucket.Name, "", "", true, listDir, nil)
 		for walkEntry := range walkResultCh {
-			bgSeq.sourceCh <- pathJoin(bucket.Name, walkEntry.entry)
+			bgSeq.sourceCh <- healSource{
+				path: pathJoin(bucket.Name, walkEntry.entry),
+			}
 		}
 	}
 
 	return nil
 }
 
+// deepHealObject heals given object path in deep to fix bitrot.
+func deepHealObject(ctx context.Context, objectPath string) {
+	// Get background heal sequence to send elements to heal
+	var bgSeq *healSequence
+	var ok bool
+	for {
+		bgSeq, ok = globalBackgroundHealState.getHealSequenceByToken(bgHealingUUID)
+		if ok {
+			break
+		}
+		time.Sleep(time.Second)

This shouldn't block if it cannot queue the heal. Just return (and remove the for loop).

Maybe log an error, since I assume this case isn't expected.

balamurugana

comment created time in 6 days

Pull request review comment minio/minio

deep heal object when bitrot detected

 func healErasureSet(ctx context.Context, setIndex int, xlObj *xlObjects) error {
 	// Heal all buckets with all objects
 	for _, bucket := range buckets {
 		// Heal current bucket
-		bgSeq.sourceCh <- bucket.Name
+		bgSeq.sourceCh <- healSource{
+			path: bucket.Name,
+		}
 
 		// List all objects in the current bucket and heal them
 		listDir := listDirFactory(ctx, xlObj.getLoadBalancedDisks()...)
 		walkResultCh := startTreeWalk(ctx, bucket.Name, "", "", true, listDir, nil)
 		for walkEntry := range walkResultCh {
-			bgSeq.sourceCh <- pathJoin(bucket.Name, walkEntry.entry)
+			bgSeq.sourceCh <- healSource{
+				path: pathJoin(bucket.Name, walkEntry.entry),
+			}
 		}
 	}
 
 	return nil
 }
 
+// deepHealObject heals given object path in deep to fix bitrot.
+func deepHealObject(ctx context.Context, objectPath string) {
+	// Get background heal sequence to send elements to heal
+	var bgSeq *healSequence
+	var ok bool
+	for {
+		bgSeq, ok = globalBackgroundHealState.getHealSequenceByToken(bgHealingUUID)
+		if ok {
+			break
+		}
+		time.Sleep(time.Second)
+	}
+
+	bgSeq.sourceCh <- healSource{

Will this block until it is picked up?

Maybe add a size to the channel and do:

select {
case bgSeq.sourceCh <- healSource{....}:
default:
}

We don't want reads blocking to kick off healing.

balamurugana

comment created time in 6 days

issue comment minio/minio

Distributed minio low read performance

This doesn't seem completely unreasonable. For very small files like this a regular http server will likely outperform it. MinIO has to coordinate between all your servers on each request, so small payloads will have a considerable overhead.

'ab' is generally not considered a good benchmark, even for HTTP servers. You can give warp a try. In particular you can see request times for different payload sizes using the --obj.randsize option.

Maybe try warp get --obj.randsize --autoterm --concurrent=10 --obj.size=50MB --requests to get an idea of the performance. It will test up to 50MiB objects. You can set host and credentials with env variables.

rvadim

comment created time in 6 days

Pull request review comment minio/minio

Add rate limiter for S3 API layer

+# MinIO Server Throttling Guide [![Slack](https://slack.min.io/slack?type=svg)](https://slack.min.io) [![Docker Pulls](https://img.shields.io/docker/pulls/minio/minio.svg?maxAge=604800)](https://hub.docker.com/r/minio/minio/)
+
+MinIO server allows two ways to throttle incoming requests
+
+- limiting the number of active requests allowed across the cluster
+- limiting the total wait duration for reach request in the queue
+
+These values are enabled using environment variables only.
+
+## Configure connection limit
+If you have traditional spinning (hdd) drives, some applications with high concurrency might require MinIO cluster to be tuned such that to avoid random I/O on the drives. The way to convert high concurrent I/O into a sequential I/O is by reducing the number of concurrent operations allowed per cluster. This allows MinIO cluster to be operationally resilient to such workloads, while also making sure the drives are at optimal efficiency and responsive.
+
+Example: Limit MinIO cluster to accept at max 1600 simultaneous S3 API requests.
+```sh
+export MINIO_API_CONN_MAX=1600
+export MINIO_ACCESS_KEY=your-access-key
+export MINIO_SECRET_KEY=your-secret-key
+minio server http://server{1...8}/mnt/hdd{1...16}
+```
+
+> NOTE: Setting MINIO_API_CONN_MAX=0 means unlimited and that is the default behavior. These values need to be set based on your deployment requirement and application.
+
+## Configure connection (wait) deadline
+This value works in conjunction with max connection setting, setting this value allows for long waiting requests to quickly timeout whenever there is no slot available to perform the request. Allows for reducing pile up of waiting connections when clients are not configured with better timeouts. Default wait time is *3 minutes* if MINIO_API_CONN_MAX is enabled this needs to be tuned as per applications needs, in our testing *3 minutes* wait time is sufficient for most workloads.
This value works in conjunction with max connection setting, setting this value allows for long waiting requests to quickly time out when there is no slot available to perform the request. 

This will reduce the pileup of waiting requests when clients are not configured with timeouts. Default wait time is *3 minutes* if MINIO_API_CONN_MAX is enabled. This may need to be tuned to your application needs.
harshavardhana

comment created time in 6 days

Pull request review comment minio/minio

Add rate limiter for S3 API layer

+# MinIO Server Throttling Guide [![Slack](https://slack.min.io/slack?type=svg)](https://slack.min.io) [![Docker Pulls](https://img.shields.io/docker/pulls/minio/minio.svg?maxAge=604800)](https://hub.docker.com/r/minio/minio/)
+
+MinIO server allows two ways to throttle incoming requests
+
+- limiting the number of active requests allowed across the cluster
+- limiting the total wait duration for reach request in the queue
+
+These values are enabled using environment variables only.
+
+## Configure connection limit
## Configuring connection limit
harshavardhana

comment created time in 6 days

Pull request review comment minio/minio

Add rate limiter for S3 API layer

+# MinIO Server Throttling Guide [![Slack](https://slack.min.io/slack?type=svg)](https://slack.min.io) [![Docker Pulls](https://img.shields.io/docker/pulls/minio/minio.svg?maxAge=604800)](https://hub.docker.com/r/minio/minio/)
+
+MinIO server allows two ways to throttle incoming requests
+
+- limiting the number of active requests allowed across the cluster
- limit the number of active requests allowed.

Across the cluster? Isn't this per server?

harshavardhana

comment created time in 6 days

Pull request review comment minio/minio

Add rate limiter for S3 API layer

+# MinIO Server Throttling Guide [![Slack](https://slack.min.io/slack?type=svg)](https://slack.min.io) [![Docker Pulls](https://img.shields.io/docker/pulls/minio/minio.svg?maxAge=604800)](https://hub.docker.com/r/minio/minio/)
+
+MinIO server allows two ways to throttle incoming requests
+
+- limiting the number of active requests allowed across the cluster
+- limiting the total wait duration for reach request in the queue
- limit the wait duration for reach request in the queue.
harshavardhana

comment created time in 6 days

Pull request review comment minio/minio

Add rate limiter for S3 API layer

+# MinIO Server Throttling Guide [![Slack](https://slack.min.io/slack?type=svg)](https://slack.min.io) [![Docker Pulls](https://img.shields.io/docker/pulls/minio/minio.svg?maxAge=604800)](https://hub.docker.com/r/minio/minio/)
+
+MinIO server allows two ways to throttle incoming requests
MinIO server allows throttling incoming requests:
harshavardhana

comment created time in 6 days

Pull request review comment minio/minio

Add rate limiter for S3 API layer

+# MinIO Server Throttling Guide [![Slack](https://slack.min.io/slack?type=svg)](https://slack.min.io) [![Docker Pulls](https://img.shields.io/docker/pulls/minio/minio.svg?maxAge=604800)](https://hub.docker.com/r/minio/minio/)
+
+MinIO server allows two ways to throttle incoming requests
+
+- limiting the number of active requests allowed across the cluster
+- limiting the total wait duration for reach request in the queue
+
+These values are enabled using environment variables only.
+
+## Configure connection limit
+If you have traditional spinning (hdd) drives, some applications with high concurrency might require MinIO cluster to be tuned such that to avoid random I/O on the drives. The way to convert high concurrent I/O into a sequential I/O is by reducing the number of concurrent operations allowed per cluster. This allows MinIO cluster to be operationally resilient to such workloads, while also making sure the drives are at optimal efficiency and responsive.
+
+Example: Limit MinIO cluster to accept at max 1600 simultaneous S3 API requests.

1600 is very, very, very high and the server is probably dead. Maybe the example should be a more realistic number, like 100?

harshavardhana

comment created time in 6 days

Pull request review comment minio/minio

Add rate limiter for S3 API layer

 const (
 	EnvPublicIPs    = "MINIO_PUBLIC_IPS"
 	EnvEndpoints    = "MINIO_ENDPOINTS"
 
+	// API sub-system
+	EnvAPIConnMax      = "MINIO_API_CONN_MAX"
+	EnvAPIConnDeadline = "MINIO_API_CONN_DEADLINE"

Since you are already discussing, IMO it should be 'requests' and not 'connections'. People may think this is somehow related to the number of open TCP connections, which it is not.

	EnvAPIConnMax      = "MINIO_API_REQUESTS_MAX"
	EnvAPIConnDeadline = "MINIO_API_REQUESTS_DEADLINE"
harshavardhana

comment created time in 6 days

Pull request review comment minio/minio

Add rate limiter for S3 API layer

 func httpTraceHdrs(f http.HandlerFunc) http.HandlerFunc {
 	}
 }
 
+// maxClients throttles the S3 API calls
+func maxClients(f http.HandlerFunc, enabled bool, connMaxCh chan struct{}, connDeadline time.Duration) http.HandlerFunc {
+	return func(w http.ResponseWriter, r *http.Request) {
+		if !enabled {
+			f.ServeHTTP(w, r)
+			return
+		}
+
+		select {
+		case connMaxCh <- struct{}{}:
+			f.ServeHTTP(w, r)
+			<-connMaxCh

defer this, so if a panic happens we always release this.

			defer func() {<-connMaxCh}()
			f.ServeHTTP(w, r)
harshavardhana

comment created time in 6 days

issue closed facebook/zstd

Q: Concurrent stream compression feedback

I am gearing up to add concurrent stream compression to the Go zstandard library.

In terms of implementation, my thoughts mainly go to splitting the input into chunks (term used to not confuse it with other stuff) bigger than blocks and compressing these chunks separately, but providing a history for the first blocks of each chunk to initialize each chunk with the previous content. Everything is still emitted as a single frame.

That way a number of chunks could be 'in flight' on different cores and the output should just be appended in the correct order.

Implementation notes: https://gist.github.com/klauspost/87644c642e92b0acda3c7968c4786b79

With your experience would that be a reasonable approach? The only real downside is the higher memory usage which seems pretty unavoidable.
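A toy sketch of my reading of that proposal (compressChunk is a placeholder, not real compression): chunks are compressed in parallel, each seeded with the previous chunk as history, and the outputs are appended in order.

package main

import (
	"fmt"
	"sync"
)

// compressChunk stands in for real compression seeded with history bytes.
func compressChunk(history, chunk []byte) []byte {
	out := make([]byte, len(chunk)) // pretend-compressed output
	copy(out, chunk)
	return out
}

func main() {
	const chunkSize = 4
	input := []byte("abcdefghijklmnop")

	var chunks [][]byte
	for i := 0; i < len(input); i += chunkSize {
		chunks = append(chunks, input[i:i+chunkSize])
	}

	results := make([][]byte, len(chunks))
	var wg sync.WaitGroup
	for i := range chunks {
		var history []byte
		if i > 0 {
			history = chunks[i-1] // seed match-finding with previous content
		}
		wg.Add(1)
		go func(i int, history, chunk []byte) {
			defer wg.Done()
			results[i] = compressChunk(history, chunk) // chunks run in parallel
		}(i, history, chunks[i])
	}
	wg.Wait()

	var frame []byte
	for _, r := range results { // append in order: still a single frame
		frame = append(frame, r...)
	}
	fmt.Println(len(frame))
}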

Apologies if this is not the proper forum to ask. I looked around for something more suitable.

closed time in 6 days

klauspost

issue comment facebook/zstd

Q: Concurrent stream compression feedback

Makes sense. Thanks for the information!

klauspost

comment created time in 6 days

issue comment facebook/zstd

Q: Concurrent stream compression feedback

How big are your 'chunks' and how much history do you feed it?

klauspost

comment created time in 6 days

issue comment facebook/zstd

Q: Concurrent stream compression feedback

You will have to disable them for the first 3 sequences of each chunk.

I already do, since I compress finished sequences/literal values separately and want to be able to just switch to an uncompressed block in case it doesn't compress.

That allows me to start matching the next block without having to wait for output block compression.

Thanks. Just wanted to confirm I didn't do anything silly :)

klauspost

comment created time in 6 days

push event klauspost/pgzip

Qais Patankar

commit sha 3fc4d32681e68458b92fa9b199d8953b1511d73f

Fix LICENSE header (#31)

The previous format was confusing GitHub.

view details

push time in 7 days

PR merged klauspost/pgzip

Fix LICENSE header

The previous format was confusing GitHub

+1 -2

1 comment

1 changed file

qaisjp

pr closed time in 7 days

pull request comment klauspost/pgzip

Fix LICENSE header

Thanks!

qaisjp

comment created time in 7 days

Pull request review comment minio/minio-go

Add Object Retention related flags for CopyObject API

 func (c Client) CopyObjectWithProgress(dst DestinationInfo, src SourceInfo, prog
 		header.Set(amzLegalHoldHeader, dst.opts.LegalHold.String())
 	}
 
+	if dst.opts.Mode != RetentionMode("") && dst.opts.RetainUntilDate != timeSentinel {

time.Time values should not be compared with ==. Use a.Equal(b), or a.IsZero() to check whether the value was never set.
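A small demonstration of why == is unreliable for time.Time (the two values below are the same instant but different structs):

package main

import (
	"fmt"
	"time"
)

func main() {
	a := time.Date(2020, 3, 25, 0, 0, 0, 0, time.UTC)
	b := a.In(time.FixedZone("CET", 3600)) // same instant, different Location

	fmt.Println(a == b)     // false: struct fields differ
	fmt.Println(a.Equal(b)) // true: same instant

	var unset time.Time
	fmt.Println(unset.IsZero()) // true: zero value means "never set"
}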

nitisht

comment created time in 7 days

Pull request review comment minio/minio-go

Add Object Retention related flags for CopyObject API

 import (
 	"io"
 	"io/ioutil"
 	"net/http"
+	"time"
 
 	"github.com/minio/minio-go/v6/pkg/encrypt"
 	"github.com/minio/minio-go/v6/pkg/s3utils"
 )
 
+var timeSentinel = time.Unix(0, 0).UTC()

So this converts unix time 0 at (server) local time to UTC... which will not be the same for all servers.

nitisht

comment created time in 7 days

issue opened facebook/zstd

Q: Concurrent stream compression feedback

I am gearing up to add concurrent stream compression to the Go zstandard library.

In terms of implementation, my plan is mainly to split the input into chunks (a term chosen to avoid confusion with other concepts) bigger than blocks and compress these chunks separately, but provide a history for the first blocks of each chunk so each chunk is initialized with the previous content. Everything is still emitted as a single frame.

That way a number of chunks could be 'in flight' on different cores and the output would just be appended in the correct order.

Implementation notes: https://gist.github.com/klauspost/87644c642e92b0acda3c7968c4786b79

With your experience, would that be a reasonable approach? The only real downside is the higher memory usage, which seems pretty unavoidable.

Apologies if this is not the proper forum to ask. I looked around for something more suitable.

created time in 7 days

Pull request review commentminio/mc

[DO NOT MERGE] Onboard Diagnostics

 package cmd
 
 import (
-	"encoding/json"
+	json "github.com/minio/mc/pkg/colorjson"

Move import down.

wlan0

comment created time in 7 days

Pull request review commentminio/mc

[DO NOT MERGE] Onboard Diagnostics

 import (
 	"bytes"
 	"context"
 	"crypto/tls"
-	"encoding/json"
+	json "github.com/minio/mc/pkg/colorjson"

Not a big fan of these blanket replacements.

  1. Output is not JSON and should only be used for printing.
  2. Unmarshal has horrible error reporting. Even worse than stdlib.

Could you instead import it as cjson "github.com/minio/mc/pkg/colorjson" and only replace it where it makes sense?
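
i.e. keep both side by side, roughly:

import (
	"encoding/json" // marshal/unmarshal with proper errors

	cjson "github.com/minio/mc/pkg/colorjson" // colored output, for printing only
)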

wlan0

comment created time in 7 days

Pull request review commentminio/mc

[DO NOT MERGE] Onboard Diagnostics

+/*
+ * MinIO Client (C) 2020 MinIO, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package cmd
+
+import (
+	"compress/gzip"
	"github.com/klauspost/compress/gzip"

(+ fmt)

wlan0

comment created time in 7 days

create branchklauspost/compress

branch : zstd-concurrent-compression

created branch time in 8 days

Pull request review commentminio/minio

Onboard Diagnostics

+/*
+ * MinIO Cloud Storage, (C) 2020 MinIO, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package net
+
+import (
+	"github.com/montanaflynn/stats"
+)
+
+// Latency holds latency information for read/write operations to the drive
+type Latency struct {
+	Avg          float64 `json:"avg_secs,omitempty"`
+	Percentile50 float64 `json:"percentile50_secs,omitempty"`
+	Percentile90 float64 `json:"percentile90_secs,omitempty"`
+	Percentile99 float64 `json:"percentile99_secs,omitempty"`
+	Min          float64 `json:"min_secs,omitempty"`
+	Max          float64 `json:"max_secs,omitempty"`
+}
+
+// Throughput holds throughput information for read/write operations to the drive
+type Throughput struct {
+	Avg          float64 `json:"avg_bytes/s,omitempty"`

Same as disk.

wlan0

comment created time in 8 days

Pull request review commentminio/minio

Onboard Diagnostics

+/*
+ * MinIO Cloud Storage, (C) 2020 MinIO, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package disk
+
+import (
+	"context"
+	"fmt"
+	"os"
+	"path/filepath"
+	"runtime"
+	"time"
+
+	"github.com/montanaflynn/stats"
+)
+
+const kb = uint64(1 << 10)
+const mb = uint64(kb << 10)
+const gb = uint64(mb << 10)
+
+var globalLatency = map[string]Latency{}
+var globalThroughput = map[string]Throughput{}
+
+// Latency holds latency information for write operations to the drive
+type Latency struct {
+	Avg          float64 `json:"avg_secs,omitempty"`
+	Percentile50 float64 `json:"percentile50_secs,omitempty"`
+	Percentile90 float64 `json:"percentile90_secs,omitempty"`
+	Percentile99 float64 `json:"percentile99_secs,omitempty"`
+	Min          float64 `json:"min_secs,omitempty"`
+	Max          float64 `json:"max_secs,omitempty"`
+}
+
+// Throughput holds throughput information for write operations to the drive
+type Throughput struct {
+	Avg          float64 `json:"avg_bytes/s,omitempty"`
+	Percentile50 float64 `json:"percentile50_bytes/s,omitempty"`
+	Percentile90 float64 `json:"percentile90_bytes/s,omitempty"`
+	Percentile99 float64 `json:"percentile99_bytes/s,omitempty"`
+	Min          float64 `json:"min_bytes/s,omitempty"`
+	Max          float64 `json:"max_bytes/s,omitempty"`
+}
+
+// GetOBDInfo about the drive
+func GetOBDInfo(ctx context.Context, endpoint string) (Latency, Throughput, error) {
+	runtime.LockOSThread()

Why?

This doesn't avoid preemption; AFAICT it makes it even more likely. But if there is a reason, please document it for future reference.

wlan0

comment created time in 8 days

Pull request review commentminio/minio

Onboard Diagnostics

+/*
+ * MinIO Cloud Storage, (C) 2020 MinIO, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package disk
+
+import (
+	"context"
+	"fmt"
+	"os"
+	"path/filepath"
+	"runtime"
+	"time"
+
+	"github.com/montanaflynn/stats"
+)
+
+const kb = uint64(1 << 10)
+const mb = uint64(kb << 10)
+const gb = uint64(mb << 10)
+
+var globalLatency = map[string]Latency{}
+var globalThroughput = map[string]Throughput{}
+
+// Latency holds latency information for write operations to the drive
+type Latency struct {
+	Avg          float64 `json:"avg_secs,omitempty"`

For warp I used milliseconds, since it is usually a bit easier to read. Maybe do that?
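
e.g. (hypothetical tag names):

// Latency reported in milliseconds instead of seconds.
type Latency struct {
	Avg          float64 `json:"avg_ms,omitempty"`
	Percentile50 float64 `json:"percentile50_ms,omitempty"`
	Percentile90 float64 `json:"percentile90_ms,omitempty"`
	Percentile99 float64 `json:"percentile99_ms,omitempty"`
	Min          float64 `json:"min_ms,omitempty"`
	Max          float64 `json:"max_ms,omitempty"`
}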

wlan0

comment created time in 8 days

Pull request review commentminio/minio

Onboard Diagnostics

+/*
+ * MinIO Cloud Storage, (C) 2020 MinIO, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package disk
+
+import (
+	"context"
+	"fmt"
+	"os"
+	"path/filepath"
+	"runtime"
+	"time"
+
+	"github.com/montanaflynn/stats"
+)
+
+const kb = uint64(1 << 10)
+const mb = uint64(kb << 10)
+const gb = uint64(mb << 10)
+
+var globalLatency = map[string]Latency{}
+var globalThroughput = map[string]Throughput{}
+
+// Latency holds latency information for write operations to the drive
+type Latency struct {
+	Avg          float64 `json:"avg_secs,omitempty"`
+	Percentile50 float64 `json:"percentile50_secs,omitempty"`
+	Percentile90 float64 `json:"percentile90_secs,omitempty"`
+	Percentile99 float64 `json:"percentile99_secs,omitempty"`
+	Min          float64 `json:"min_secs,omitempty"`
+	Max          float64 `json:"max_secs,omitempty"`
+}
+
+// Throughput holds throughput information for write operations to the drive
+type Throughput struct {
+	Avg          float64 `json:"avg_bytes/s,omitempty"`
+	Percentile50 float64 `json:"percentile50_bytes/s,omitempty"`
+	Percentile90 float64 `json:"percentile90_bytes/s,omitempty"`
+	Percentile99 float64 `json:"percentile99_bytes/s,omitempty"`

Again, likely to be just noise or just equal to Min.

wlan0

comment created time in 8 days

Pull request review commentminio/minio

Onboard Diagnostics

+/*
+ * MinIO Cloud Storage, (C) 2020 MinIO, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package disk
+
+import (
+	"context"
+	"fmt"
+	"os"
+	"path/filepath"
+	"runtime"
+	"time"
+
+	"github.com/montanaflynn/stats"
+)
+
+const kb = uint64(1 << 10)
+const mb = uint64(kb << 10)
+const gb = uint64(mb << 10)
+
+var globalLatency = map[string]Latency{}
+var globalThroughput = map[string]Throughput{}
+
+// Latency holds latency information for write operations to the drive
+type Latency struct {
+	Avg          float64 `json:"avg_secs,omitempty"`
+	Percentile50 float64 `json:"percentile50_secs,omitempty"`
+	Percentile90 float64 `json:"percentile90_secs,omitempty"`
+	Percentile99 float64 `json:"percentile99_secs,omitempty"`
+	Min          float64 `json:"min_secs,omitempty"`
+	Max          float64 `json:"max_secs,omitempty"`
+}
+
+// Throughput holds throughput information for write operations to the drive
+type Throughput struct {
+	Avg          float64 `json:"avg_bytes/s,omitempty"`
	Avg          float64 `json:"avg_bps,omitempty"`

Pretty sure slashes in a JSON name is a bad idea.

wlan0

comment created time in 8 days

Pull request review commentminio/minio

Onboard Diagnostics

+/*
+ * MinIO Cloud Storage, (C) 2020 MinIO, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package disk
+
+import (
+	"context"
+	"fmt"
+	"os"
+	"path/filepath"
+	"runtime"
+	"time"
+
+	"github.com/montanaflynn/stats"
+)
+
+const kb = uint64(1 << 10)
+const mb = uint64(kb << 10)
+const gb = uint64(mb << 10)
+
+var globalLatency = map[string]Latency{}
+var globalThroughput = map[string]Throughput{}
+
+// Latency holds latency information for write operations to the drive
+type Latency struct {
+	Avg          float64 `json:"avg_secs,omitempty"`
+	Percentile50 float64 `json:"percentile50_secs,omitempty"`
+	Percentile90 float64 `json:"percentile90_secs,omitempty"`
+	Percentile99 float64 `json:"percentile99_secs,omitempty"`

Do you have enough information for a 99 percentile?

wlan0

comment created time in 8 days

Pull request review commentminio/minio

Onboard Diagnostics

+/*
+ * MinIO Cloud Storage, (C) 2020 MinIO, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package disk
+
+import (
+	"context"
+	"fmt"
+	"os"
+	"path/filepath"
+	"runtime"
+	"time"
+
+	"github.com/montanaflynn/stats"
+)
+
+const kb = uint64(1 << 10)
+const mb = uint64(kb << 10)
+const gb = uint64(mb << 10)
+
const (
	kb = 1 << 10
	mb = kb << 10
	gb = mb << 10
)
wlan0

comment created time in 8 days

Pull request review commentminio/minio

Onboard Diagnostics

 To trace entire HTTP request and also internode communication
 mc admin trace --all --verbose myminio
 ```
 
+### On-board Diagnostics
+On-board diagnostics help ensure that the underlying infrastructure that runs MinIO is configured correctly, and is functioning properly. This test is one-shot long running one, that is recommended to be run as soon as the cluster is first provisioned, and each time a failure scenrio is encountered. Note that the test incurs majority of the available resources on the system. Care must be taken when using this to debug failure scenario, so as to prevent larger outages. OBD tests can be triggered using [`mc admin obd`](https://github.com/minio/mc/blob/master/docs/minio-admin-complete-guide.md#command-obd---display-minio-server-obd) command.
+
+Example:
+```sh
+minio server /data
+```
+
+The command takes no flags
+```sh
+mc admin obd myminio
+```
+
+The output printed will be of the form
+```sh
+● Admin Info ... ✔ 
+● CPU ... ✔ 
+● Disk Hardware ... ✔ 
+● Os Info ... ✔ 
+● Mem Info ... ✔ 
+● Process Info ... ✔ 
+● Config ... ✔ 
+● Drive ... ✔ 
+● Net ... ✔ 
+*********************************************************************************
+                                   WARNING!!
+     ** THIS FILE MAY CONTAIN SENSITIVE INFORMATION ABOUT YOUR ENVIRONMENT ** 
+     ** PLEASE INSPECT CONTENTS BEFORE SHARING IT ON ANY PUBLIC FORUM **
+*********************************************************************************
+OBD data saved to dc-11-obd_20200321053323.json.gz
+```
+
+The gunzipped output contains debugging information for your system
The gzipped output contains debugging information for your system.
wlan0

comment created time in 8 days
