
runtime: unaligned jumps causing performance regression on Intel

What version of Go are you using (go version)?

<pre>
λ go version
go version go1.13 windows/amd64
</pre>

And Go 1.14-RC1.

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

<details><summary><code>go env</code> Output</summary><br><pre>
λ go env
set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\klaus\AppData\Local\go-build
set GOENV=C:\Users\klaus\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=c:\gopath
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=c:\go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=c:\temp\wintemp\go-build453787042=/tmp/go-build -gno-record-gcc-switches
</pre></details>

What did you do?

Isolated code: reproducer.zip

go test -bench=. -test.benchtime=10s

Most of the code is needed for the test setup; only (*tokens).EstimatedBits and mFastLog2 are run during the benchmark.
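For readers without the zip at hand, here is a minimal sketch of the general shape such a benchmark takes. It is not the code from reproducer.zip: `estimatedBits` is a hypothetical stand-in (a plain Shannon-entropy estimate over a histogram), and `b.SetBytes(1)` is an assumption chosen only because 1 byte per 663 ns works out to about 1.51 MB/s, which matches the reported figures.

```go
package main

import (
	"fmt"
	"math"
	"testing"
)

// estimatedBits is a hypothetical stand-in for the method under test, NOT the
// code from reproducer.zip. It returns the Shannon-entropy estimate, in bits,
// of encoding the symbols counted in hist.
func estimatedBits(hist []uint32) float64 {
	var total float64
	for _, n := range hist {
		total += float64(n)
	}
	var bits float64
	for _, n := range hist {
		if n == 0 {
			continue // symbols that never occur cost nothing
		}
		f := float64(n)
		bits += f * math.Log2(total/f)
	}
	return bits
}

// sink keeps the compiler from discarding the benchmarked call.
var sink float64

// benchmarkEstimatedBits does the setup once, then times only the hot loop.
func benchmarkEstimatedBits(b *testing.B) {
	hist := make([]uint32, 256)
	for i := range hist {
		hist[i] = uint32(i%7 + 1)
	}
	b.SetBytes(1) // assumption: one byte per op, so an MB/s column is reported
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		sink = estimatedBits(hist)
	}
}

func main() {
	// testing.Benchmark lets the sketch run as a plain program; in a real
	// _test.go file the function would be named Benchmark... and run by go test.
	fmt.Println(testing.Benchmark(benchmarkEstimatedBits))
}
```

Because the timed loop is this small, the placement of its jump instructions in memory can plausibly show up as a measurable difference in ns/op.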

λ benchcmp go113.txt go114.txt
benchmark                             old ns/op     new ns/op     delta
Benchmark_tokens_EstimatedBits-12     663           716           +7.99%

benchmark                             old MB/s     new MB/s     speedup
Benchmark_tokens_EstimatedBits-12     1.51         1.40         0.93x
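The delta and speedup columns can be rechecked from the raw numbers. The sketch below assumes the usual definitions, relative change for ns/op and new/old for MB/s; the helper names `deltaPercent` and `speedup` are made up for illustration.

```go
package main

import "fmt"

// deltaPercent returns the relative change from oldVal to newVal in percent.
func deltaPercent(oldVal, newVal float64) float64 {
	return (newVal - oldVal) / oldVal * 100
}

// speedup returns the throughput ratio new/old; below 1 means slower.
func speedup(oldVal, newVal float64) float64 {
	return newVal / oldVal
}

func main() {
	fmt.Printf("%+.2f%%\n", deltaPercent(663, 716)) // ns/op from the table: +7.99%
	fmt.Printf("%.2fx\n", speedup(1.51, 1.40))      // MB/s from the table: 0.93x
}
```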

What did you expect to see?

Equivalent performance.

What did you see instead?

8% performance regression.

golang/go

Answer from dr2chase

I did an as-careful-as-possible laptop benchmark, and with asynchronous preemption OFF, the branch alignment fix makes 1.13 and 1.14 effectively equal (on my laptop).
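To try the preemption half of this comparison yourself, Go 1.14 documents the setting `GODEBUG=asyncpreemptoff=1`. The sketch below is one way to launch the benchmark command from the report with that variable set, without depending on shell syntax (the report is from Windows). The helper name `withAsyncPreemptOff` is hypothetical, and the sketch does not apply the branch alignment fix, which is a separate toolchain change.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// withAsyncPreemptOff returns a copy of env with asynchronous preemption
// disabled. os/exec uses the last value for a duplicated key, so any GODEBUG
// value already present in env is replaced, not merged.
func withAsyncPreemptOff(env []string) []string {
	out := append([]string{}, env...)
	return append(out, "GODEBUG=asyncpreemptoff=1")
}

func main() {
	// Same command line as in the report; run it from the package directory.
	cmd := exec.Command("go", "test", "-bench=.", "-test.benchtime=10s")
	cmd.Env = withAsyncPreemptOff(os.Environ())
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "benchmark run failed:", err)
	}
}
```

Running the benchmark once with and once without the variable, on the same Go version, separates the cost attributed to preemption from the cost attributed to jump alignment.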

We still don't entirely understand what's wrong with preemption in this benchmark. We will keep looking, and will also see how the latest version of the branch alignment batch compares.
