dr2chase/bent 18

Benchmark and test getter + runner

dr2chase/solve_inlines 4

Runs modified Go compiler with randomized set of inline sites and uses that to derive best inline sites

dr2chase/gossahash 3

Searches for the function that the SSA phase of the Go compiler is doing wrong.

dr2chase/benchmarks 2

Collection of benchmarks that caused problematic regressions in Go development

dr2chase/bikelight 1

Software for power/light management from a bicycle hub to lights and an optional USB charger.

dr2chase/delve 0

Delve is a debugger for the Go programming language.

dr2chase/dwarf-goodness 0

Tool(s) for measuring the quality of the debugging information produced by the Go compiler

dr2chase/go-ethereum 0

Official Go implementation of the Ethereum protocol

dr2chase/go-metrics 0

Go port of Coda Hale's Metrics library

dr2chase/nostmt 0

nostmt

issue commentgolang/go

cmd/internal/obj/x86: pad jumps to avoid Intel erratum

Just so everyone knows, the penalty can be quite bad, as reported in #37190.

I reproduced this,

  • comparing 1.13 (which was just lucky),
  • two versions of aligned,
  • and unpadded
name \ time/op    Go-1.13     Go-1.14-vzu-align  Go-1.14-vzu-nopalign  Go-1.14-vzu
FastTest2KB-4     141ns ± 2%  115ns ± 2%         112ns ± 0%            269ns ± 1%

Note that alignment improves the best case, so the best-to-worst slowdown exceeds 100% when things line up just so.
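To make the arithmetic behind that claim concrete, here is a minimal sketch that computes the relative slowdown from the ns/op figures in the table above. The helper name `slowdownPercent` is made up for illustration.

```go
package main

import "fmt"

// slowdownPercent reports how much slower `slow` is than `fast`, as a
// percentage of `fast`. Both arguments are times per operation in the
// same unit.
func slowdownPercent(fast, slow float64) float64 {
	return (slow - fast) / fast * 100
}

func main() {
	// ns/op values copied from the FastTest2KB-4 row above.
	const (
		go113    = 141.0 // Go-1.13
		nopAlign = 112.0 // Go-1.14-vzu-nopalign (best case)
		unpadded = 269.0 // Go-1.14-vzu (worst case)
	)
	fmt.Printf("unpadded vs 1.13:     %+.0f%%\n", slowdownPercent(go113, unpadded))
	fmt.Printf("unpadded vs nopalign: %+.0f%%\n", slowdownPercent(nopAlign, unpadded))
}
```

The first figure (roughly 91%) matches the "about 90%" reported in #37190; the second (roughly 140%) is the best-to-worst gap.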

For that set of benchmarks (excluding those affected by a not-fully-mitigated MacOS bug):

name \ time/op    Go-1.13     Go-1.14-vzu-align  Go-1.14-vzu-nopalign  Go-1.14-vzu
[Geo mean]        54.9µs      53.1µs             53.3µs                55.1µs

The benchmarks were those in https://github.com/dr2chase/bent

In another benchmark run, I also checked the size and performance costs of 16- vs 32-byte alignment. We want 32-byte alignment: 16 gives 0.7% bigger text and 0.82% slower geomean execution, with almost no winners in the run-time column.

For reference, the two benchmark configurations:

[[Configurations]]
  Name = "Go-1.14-vzeroupper-nopalign-32-lessf2i-nopreempt"
  Root = "$HOME/work/go-quick/"
  GcEnv = ["GOAMD64=alignedjumps"]
  RunEnv = ["GODEBUG=asyncpreemptoff=1"]

[[Configurations]]
  Name = "Go-1.14-vzeroupper-nopalign-16-lessf2i-nopreempt"
  Root = "$HOME/work/go/"
  GcEnv = ["GOAMD64=alignedjumps"]
  RunEnv = ["GODEBUG=asyncpreemptoff=1"]

and git diff in go:

diff --git a/src/cmd/internal/obj/x86/asm6.go b/src/cmd/internal/obj/x86/asm6.go
index 16e73fad44..21d254d1e2 100644
--- a/src/cmd/internal/obj/x86/asm6.go
+++ b/src/cmd/internal/obj/x86/asm6.go
@@ -1982,7 +1982,7 @@ func makePjc(ctxt *obj.Link) *padJumpsCtx {
                return &padJumpsCtx{}
        }
        return &padJumpsCtx{
-               jumpAlignment: 32,
+               jumpAlignment: 16,
        }
 }
 
diff --git a/src/cmd/link/internal/amd64/obj.go b/src/cmd/link/internal/amd64/obj.go
index 3239c61864..f1f2e3e11c 100644
--- a/src/cmd/link/internal/amd64/obj.go
+++ b/src/cmd/link/internal/amd64/obj.go
@@ -40,9 +40,9 @@ func Init() (*sys.Arch, ld.Arch) {
        arch := sys.ArchAMD64
 
        fa := funcAlign
-       if objabi.GOAMD64 == "alignedjumps" {
-               fa = 32
-       }
+       //if objabi.GOAMD64 == "alignedjumps" {
+       //      fa = 32
+       //}
 
        theArch := ld.Arch{
                Funcalign:  fa,

I think we should be looking at the NOP-only patch and probably just have it turned on all the time. This seems like the least-risk way of avoiding this sometimes-terrible slowdown that will also interfere with performance-tuning work on updated-microcode Intel processors.
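For readers unfamiliar with what "NOP-only" padding means here, a rough sketch follows. It assumes (my understanding of the erratum, not something stated above) that the penalty applies to jumps that cross or end on a 32-byte boundary. The function `nopPadding` is a hypothetical illustration, not the code in asm6.go.

```go
package main

import "fmt"

// nopPadding returns how many NOP bytes to emit before an instruction of
// the given size at offset pc so that it neither crosses nor ends on an
// align-byte boundary. align must be a power of two.
// Hypothetical helper for illustration; not the code in asm6.go.
func nopPadding(pc, size, align int) int {
	mask := align - 1
	if (pc&mask)+size >= align {
		// Pad up to the next boundary so the jump starts a fresh block.
		return (-pc) & mask
	}
	return 0
}

func main() {
	for _, pc := range []int{0, 27, 28, 30} {
		fmt.Printf("pc=%2d size=4: pad %d (32-byte), pad %d (16-byte)\n",
			pc, nopPadding(pc, 4, 32), nopPadding(pc, 4, 16))
	}
}
```

A smaller alignment triggers the padding condition more often, which is at least consistent with 16-byte alignment producing bigger text in the run above.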

rsc

comment created time in 5 days

issue closedgolang/go

Performance regression using go1.14rc1

What version of Go are you using (go version)?

<pre> $ go version 1.14rc1 and 1.13.7

</pre>

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

<details><summary><code>go env</code> Output</summary><br><pre>
$ go env (anonymised)
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/--/.cache/go-build"
GOENV="/home/--/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/--/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/--/sdk/go1.13.7"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/--/sdk/go1.13.7/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build174863045=/tmp/go-build -gno-record-gcc-switches"

</pre></details>

What did you do?

Run the benchmarks recommended by https://github.com/golang/go/wiki/Benchmarks (some of them are out of date btw.). And specifically:

github.com/ethereum/go-ethereum/common/bitutil BenchmarkFastTest2KB

using v1.9.10.

What did you expect to see?

About the same performance or better when using go1.14rc1 compared to 1.13.7.

What did you see instead?

About 90% regression.

Let me clarify that I am not involved in the ethereum project at all. I did just run some benchmarks and this popped out. Still I did a little bit of investigation and this benchmark uses unsafe code.

The patch notes of 1.14 mention:

This release adds -d=checkptr as a compile-time option for adding instrumentation to check that Go code is following unsafe.Pointer safety rules dynamically. This option is enabled by default (except on Windows) with the -race or -msan flags, and can be disabled with -gcflags=all=-d=checkptr=0.

So I did use -gcflags=all=-d=checkptr=0 but with no effect. Trying to use -msan for comparison resulted in the error gcc: error: unrecognized argument to '-fsanitize=' option: 'memory' and I think this should be address instead. Using -race however worked fine, with the expected severe slowdown and with 1.14 and 1.13 about equal in absolute numbers.

closed time in 6 days

D1CED

issue commentgolang/go

Performance regression using go1.14rc1

Also, this is a duplicate of #37121, it just happens to be much more painful.

D1CED

comment created time in 6 days

push eventdr2chase/bent

David Chase

commit sha 5c2979d5be44b1f4ae29ddcb39d8c19a5180e2bd

Tweaked the recursive-chmod-before gopath/pkg cleanup.

push time in 6 days

issue commentgolang/go

Performance regression using go1.14rc1

Reproducible, I'm going to guess that you have the new Intel microcode which penalizes certain unaligned jumps severely.

Till we align those branches (I don't think for 1.14.0) this is a luck-of-the-draw problem -- sometimes 1.13 picks wrong, sometimes 1.14 picks wrong, sometimes neither, sometimes both. There's a CL for that from Intel ( https://go-review.googlesource.com/c/go/+/206837 ), and another, simpler version that I am trying to finish, which only pads with NOPs.

D1CED

comment created time in 6 days

issue commentgolang/go

runtime: Darwin slow when using signals + int->float instructions

We have an office full of people playing musical VZEROUPPERs trying to figure out what the problem is. A whole bunch of things don't help, we think the bug is in Darwin....

randall77

comment created time in 7 days

issue commentgolang/go

runtime: Darwin slow when using signals + int->float instructions

@cherrymui reports that gratuitously inserting the VZEROUPPER before the conversion instruction also works. Wheeee!

randall77

comment created time in 7 days

issue commentgolang/go

runtime: Darwin slow when using signals + int->float instructions

@jyknight suggests lack of a vzeroupper, perhaps in the Darwin signal handler.

randall77

comment created time in 7 days

issue commentgolang/go

runtime: go1.14rc1 performance regression

The alignment issue depends on your CPU's microcode version, which Apple has updated in the last few months. No idea how that works for Linux users. For example:

sysctl -a | egrep machdep.cpu.'(family|model|extfamily|stepping|microcode)'
machdep.cpu.family: 6
machdep.cpu.model: 142
machdep.cpu.extfamily: 0
machdep.cpu.stepping: 9
machdep.cpu.microcode_version: 202
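
For Linux, a possible equivalent (an assumption on my part, not something verified in this thread): x86 kernels typically expose a "microcode" field per CPU in /proc/cpuinfo. A minimal sketch that extracts it, with a made-up helper name:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// microcodeRevisions extracts the distinct values of the "microcode"
// field from text in /proc/cpuinfo format, in order of first appearance.
func microcodeRevisions(cpuinfo string) []string {
	var revs []string
	seen := map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(cpuinfo))
	for sc.Scan() {
		key, val, ok := strings.Cut(sc.Text(), ":")
		if !ok || strings.TrimSpace(key) != "microcode" {
			continue
		}
		val = strings.TrimSpace(val)
		if !seen[val] {
			seen[val] = true
			revs = append(revs, val)
		}
	}
	return revs
}

func main() {
	data, err := os.ReadFile("/proc/cpuinfo")
	if err != nil {
		fmt.Println("cannot read /proc/cpuinfo:", err)
		return
	}
	fmt.Println("microcode:", microcodeRevisions(string(data)))
}
```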
klauspost

comment created time in 7 days

issue commentgolang/go

runtime: go1.14rc1 performance regression

I did an as-careful-as-possible laptop benchmark, and with asynchronous preemption OFF, the branch alignment fix makes 1.13 and 1.14 effectively equal (on my laptop).

We still don't entirely understand what's wrong with preemption in this benchmark, will keep looking, will also see how the latest version of the branch alignment patch compares.

klauspost

comment created time in 8 days

issue commentgolang/go

runtime: go1.14rc1 performance regression

I have to quit for the night, what I find in 1.14 vs 1.13 (ignoring likely interactions with preemption) is an extra instruction in the GOSSAFUNC output, comparing final output for

GOSSAFUNC='(*tokens).EstimatedBits' go build token.go

I doubt we're going to fix that extra instruction, but the other slowdown seems worth further look.

The inner loops are 1.13:

v106	00017 (+25) INCQ CX
v18	00018 (+25) CMPQ CX, $256
b6	00019 (25) JGE 55

v41	00020 (25) MOVWLZX 136(AX)(CX*2), DX
v178	00021 (+26) TESTW DX, DX
b7	00022 (26) JLS 53

v46	00023 (+27) XORPS X2, X2
v46	00024 (27) CVTSL2SS DX, X2
v289	00025 (+28) MOVUPS X2, X3
v48	00026 (28) MULSS X1, X2

# $GOROOT/src/math/unsafe.go
v60	00027 (+12) MOVL X2, DX

# /Users/drchase/work/gocode/src/github.com/dr2chase/benchmarks/klauspost/token.go
v301	00028 (+58) MOVL DX, BX
v65	00029 (58) SARL $23, DX
v67	00030 (58) MOVBLZX DX, DX
v69	00031 (58) ADDL $-128, DX
v70	00032 (58) XORPS X2, X2
v70	00033 (58) CVTSL2SS DX, X2
v72	00034 (+59) ANDL $-2139095041, BX
v74	00035 (+60) LEAL 1065353216(BX), DX

# $GOROOT/src/math/unsafe.go
v85	00036 (+18) MOVL DX, X4

# /Users/drchase/work/gocode/src/github.com/dr2chase/benchmarks/klauspost/token.go
v276	00037 (+62) MOVSS $(-0.34484842419624329), X5
v90	00038 (62) MULSS X4, X5
v274	00039 (62) MOVSS $(2.0246658325195312), X6
v92	00040 (62) ADDSS X5, X6
v93	00041 (62) MULSS X4, X6
v273	00042 (62) MOVSS $(0.6748775839805603), X4
v95	00043 (62) SUBSS X4, X6
v96	00044 (62) ADDSS X6, X2
v272	00045 (+28) MOVSS $(-0.0), X5
v100	00046 (28) PXOR X5, X2
v102	00047 (28) MULSS X2, X3
v103	00048 (28) ADDSS X3, X0
v50	00049 (?) NOP
v55	00050 (+57) XCHGL AX, AX
v80	00051 (+61) XCHGL AX, AX
b14	00052 (28) JMP 17

v172	00053 (28) MOVSS $(0.6748775839805603), X4
b40	00054 (26) JMP 17

and 1.14

1.14x

v106	00019 (+25) INCQ CX
v117	00020 (+25) CMPQ CX, $256
b6	00021 (25) JGE 58

v41	00022 (25) MOVWLZX 136(AX)(CX*2), DX
v139	00023 (+26) TESTW DX, DX
b7	00024 (26) JLS 55

v46	00025 (+27) XORPS X2, X2
v46	00026 (27) CVTSL2SS DX, X2
v252	00027 (+28) MOVUPS X2, X3
v48	00028 (28) MULSS X1, X2

# $GOROOT/src/math/unsafe.go
v60	00029 (+12) MOVL X2, DX

# /Users/drchase/work/gocode/src/github.com/dr2chase/benchmarks/klauspost/token.go
v173	00030 (+58) MOVL DX, BX
v65	00031 (58) SARL $23, DX
v67	00032 (58) MOVBLZX DX, DX
v69	00033 (58) ADDL $-128, DX
v70	00034 (58) XORPS X2, X2
v70	00035 (58) CVTSL2SS DX, X2
v72	00036 (+59) ANDL $-2139095041, BX
v74	00037 (+60) LEAL 1065353216(BX), DX

# $GOROOT/src/math/unsafe.go
v85	00038 (+18) MOVL DX, X4

# /Users/drchase/work/gocode/src/github.com/dr2chase/benchmarks/klauspost/token.go
v171	00039 (+62) MOVSS $(-0.34484842419624329), X5
v90	00040 (62) MULSS X4, X5
v153	00041 (62) MOVSS $(2.0246658325195312), X6
v92	00042 (62) ADDSS X6, X5
v93	00043 (62) MULSS X5, X4
v151	00044 (62) MOVSS $(0.6748775839805603), X5
v95	00045 (62) SUBSS X5, X4
v96	00046 (62) ADDSS X4, X2
v79	00047 (+28) MOVSS $(-0.0), X4
v100	00048 (28) PXOR X4, X2
v102	00049 (28) MULSS X2, X3
v103	00050 (28) ADDSS X3, X0
v50	00051 (?) NOP
v55	00052 (+57) XCHGL AX, AX
v80	00053 (+61) XCHGL AX, AX
b14	00054 (28) JMP 19

v221	00055 (28) MOVSS $(0.6748775839805603), X5
v263	00056 (28) MOVSS $(2.0246658325195312), X6
b29	00057 (26) JMP 19
klauspost

comment created time in 11 days

issue commentgolang/go

runtime: go1.14rc1 performance regression

Charming. Turning off asynchronous preemption eliminates a whole lot of the regression, but not all of it. (Why did I try this? Because attempting to collect profiles made both of them equally, terribly slow.)

benchstat 20200207T165050.Go-1.13.stdout 20200207T165050.Go-1.14.stdout
name \ time/op           20200207T165050.Go-1.13  20200207T165050.Go-1.14  delta
_tokens_EstimatedBits-4               791ns ± 9%               911ns ± 9%  +15.06%  (p=0.016 n=5+5)
klauspost

comment created time in 11 days

push eventdr2chase/bent

David Chase

commit sha 8d1dbb652c87d66769631ddf52ef43bd50a691ca

Added a new benchmark (a 1.14 regression) and a flag setter for benchtime

push time in 11 days

issue commentgolang/go

Go 1.14-rc1 performance regression

I get a startling regression on a Mac laptop -- 70% slowdown -- will look further to see if I am making some obvious mistake.

klauspost

comment created time in 11 days

push eventdr2chase/benchmarks

David Chase

commit sha 767488d585705652362a7f139ba8f0a7f86eb3c8

Added klauspost https://github.com/golang/go/issues/37121

push time in 11 days

create branchdr2chase/benchmarks

branch : master

created branch time in 11 days

created repositorydr2chase/benchmarks

Collection of benchmarks that caused problematic regressions in Go development

created time in 11 days

issue commentgolang/go

cmd/compile/internal/ssa: test TestNexting failing

@bcmills I added dlv to the containers that run longtests, so it should be there.

Using gdb is a recipe for flaky awfulness, different versions behave differently, it can be sensitive to your Python installation, it's not an option on Macs anymore (as in, I have been unable to build and install it correctly despite multiple tries and many searches for how-to recipes -- I follow them, I do not end up debugging code as a non-root user).

cagedmantis

comment created time in 12 days

issue commentgolang/go

cmd/compile/internal/ssa: test TestNexting failing

Works for me. I removed my build copy of dlv, discovered "/usr/bin/dlv", that worked okay. I updated dlv to their tip, that worked okay. Tried it on Mac with dlv completely removed, got expected skip.

I'd assume a flake, unless you can repeat this (this test can flake, worked pretty hard to deflake it but it's not zero). Maybe dlv has some extra behavior that I don't know about, depending on environment, hard to say.

cagedmantis

comment created time in 13 days

push eventdr2chase/bent

David Chase

commit sha deb44449806756c56eb36b8fa37bcdf4cea521a8

Updated configurations-cronjob

push time in 14 days

push eventdr2chase/bent

David Chase

commit sha 881d0f156e527598f059d074eb5a6c80f9c656de

Fixed typo in setting "ROOT" in scripts

David Chase

commit sha 8e8ea2bf20d29cf9fe221aaa5fce6cfe2a641da2

Tweaks to improve quality, also to not overwrite daily reference benchmarks

push time in 14 days

issue commentgolang/go

proposal: cmd/compile: make 64-bit fields be 64-bit aligned on 32-bit systems, add //go:packed directive on structs

@ianlancetaylor Or declaring that such alignment upconversions are unsafe -- and perhaps they are checked when they occur there (i.e., semantic change to a pointer cast that adds checking, but this can only occur in unsafe code).

My Great Idea may be a Great Idea™. (As a wise former colleague once remarked, the ™ means it isn't what it says it is. Would you rather have cheese, or Cheese™?)

danscales

comment created time in 20 days

issue commentgolang/go

proposal: cmd/compile: make 64-bit fields be 64-bit aligned on 32-bit systems, add //go:packed directive on structs

Statically, it would be unsafe. Dynamically, we could check. Maybe we call it unsafe for now, wait to see if anyone ever needs to do it. It wouldn't surprise me if "old" code mechanically ported forward would trip over this (in the static case).

danscales

comment created time in 21 days

issue commentgolang/go

proposal: cmd/compile: make 64-bit fields be 64-bit aligned on 32-bit systems, add //go:packed directive on structs

Not sure yet another "great idea" is welcome, but perhaps we can specify field ordering and type alignment separately. Before I go too far with this, I ran into a problem, this might be a common problem, so I'll lead with that.

<hr> IF type alignment is specified separately, there will be a problem with conversions from pointer-to-less-aligned-thing to pointer-to-more-aligned-thing. Pointer conversion currently is unchecked -- if the compiler allows it, it happens. If we allow such pointers (and we might want them for atomics, if we like static checking) then we need an answer for this. I think the least-surprise option would be to enhance the semantics of a plain pointer cast to include an alignment check where this would be needed. The advantage there is that old and new code would interoperate; the disadvantage is that an operation that previously could not panic, now can panic. <hr>

Returning to the presentation of the great idea...

The two magic comments would be

//go:align(<somevalue>)
//go:orderedfields

Here, <somevalue> specifies the byte alignment required for a type, and is one of

  • a power of two (not sure how large these should be allowed; can be higher or lower than default, subject to the compiler whining about lack of a load instruction.)
  • ptr to specify the native pointer alignment
  • the name of a language (C, Java, Fortran, ???) to specify the native alignment of the corresponding type in the other programming language, with an error if we believe the other language has no such corresponding type (Fortran has complex, Java does not).
  • Possibly "*" preceding an alignment to indicate a pointer to an aligned quantity; I am not convinced this is necessary, because a pointer to an aligned type can inherit the alignment. If this is allowed, there can be multiple alignment attributes for both the pointer itself and the referent.

Thus

type cacheLine       [64]uint64 //go:align(64)
type atomicComplex64 complex64  //go:align(8)
type floatPtrPun     float64    //go:align(ptr)

type cacheLinePtr *cacheLine // implicitly, a pointer to something that is aligned
type alignedPtrToAligned8Uint8 *[8]uint8 //go:align(8,*8)

and

type INPUT_keymap_entry struct { //go:orderedfields
	Flags    uint8
	Len      uint8
	Index    uint16
	Keycode  uint32
	Scancode [32]uint8
}

and if you are not a trusting soul or wish to be explicit about your expectations

type INPUT_keymap_entry struct { //go:orderedfields
	Flags    uint8     //go:align(1)
	Len      uint8     //go:align(1)
	Index    uint16    //go:align(2)
	Keycode  uint32    //go:align(4)
	Scancode [32]uint8 //go:align(1)
}
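
As a point of comparison, the layout the current gc compiler happens to pick for this struct can be inspected with the unsafe package. This is only a sketch of today's behavior (gc currently keeps declaration order, though the spec does not promise that); the proposed `//go:orderedfields` and `//go:align` comments do not exist, so the struct below omits them. The helper `layout` is made up for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// INPUT_keymap_entry mirrors the struct above, without the proposed
// annotations (which are not implemented anywhere).
type INPUT_keymap_entry struct {
	Flags    uint8
	Len      uint8
	Index    uint16
	Keycode  uint32
	Scancode [32]uint8
}

// layout returns the byte offset of each field, in declaration order,
// and the total size of the struct.
func layout() (offsets [5]uintptr, size uintptr) {
	var e INPUT_keymap_entry
	offsets = [5]uintptr{
		unsafe.Offsetof(e.Flags),
		unsafe.Offsetof(e.Len),
		unsafe.Offsetof(e.Index),
		unsafe.Offsetof(e.Keycode),
		unsafe.Offsetof(e.Scancode),
	}
	return offsets, unsafe.Sizeof(e)
}

func main() {
	offsets, size := layout()
	fmt.Println("field offsets:", offsets)
	fmt.Println("struct size:  ", size)
}
```

With the per-field alignments written out in the second version above (1, 1, 2, 4, 1), no padding is needed, so an ordered layout and the natural one coincide for this particular struct.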

I expect the following uses to be common:

  • simple type, with alignment, gets that alignment wherever it is allocated, and is part of minimum alignment of any containing struct. Within an otherwise undecorated struct, field may be reordered to satisfy whims of compiler writers.
  • structures for interchange are ordered, if there's no ambiguity about field alignments (this is up to convention and expectations of code authors, vet might help define "ambiguity") then they don't need alignment annotations. But useless alignment annotations are fine.
  • Some types have non-negotiable alignment restrictions and it is a compile-time error to break those.

There is a problem with conversions from pointer-to-less-aligned-thing to pointer-to-more-aligned-thing. Pointer conversion currently is unchecked -- if the compiler allows it, it happens. If we allow such pointers (and we might want them for atomics, if we like static checking) then we need an answer for this. I think the least-surprise option would be to enhance the semantics of a plain pointer cast to include an alignment check where this would be needed. The advantage there is that old and new code would interoperate; the disadvantage is that an operation that previously could not panic, now can panic.
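
A minimal sketch of what such a checked conversion could look like if written by hand today. The names `isAligned`, `asUint64`, and `alignedIndex` are hypothetical; nothing here is an existing or proposed API, it only imitates the "conversion that can now panic" semantics.

```go
package main

import (
	"fmt"
	"unsafe"
)

// isAligned reports whether p is a multiple of align, which must be a
// power of two.
func isAligned(p unsafe.Pointer, align uintptr) bool {
	return uintptr(p)&(align-1) == 0
}

// asUint64 converts a byte pointer to a *uint64, panicking when the
// address is not 8-byte aligned. The caller must ensure at least 8
// bytes are addressable at p.
func asUint64(p *byte) *uint64 {
	if !isAligned(unsafe.Pointer(p), 8) {
		panic("misaligned conversion from *byte to *uint64")
	}
	return (*uint64)(unsafe.Pointer(p))
}

// alignedIndex returns the index of the first 8-byte-aligned byte in buf.
// It is always at most 7, so 8 bytes remain in buf from that index.
func alignedIndex(buf *[16]byte) int {
	i := 0
	for !isAligned(unsafe.Pointer(&buf[i]), 8) {
		i++
	}
	return i
}

func main() {
	var buf [16]byte
	i := alignedIndex(&buf)
	*asUint64(&buf[i]) = 42 // aligned: the check passes
	fmt.Println(*asUint64(&buf[i]))
	fmt.Println("next byte aligned?", isAligned(unsafe.Pointer(&buf[i+1]), 8))
}
```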

We could provide a tool that would pre-annotate structs/types for easy compatibility into the future, and the annotations are backwards compatible in the sense that old code, annotated, then run in an old system, would behave as it always did. However, old code, annotated, might not run in a new system because of referent alignment issues in pointer conversions.

danscales

comment created time in 21 days

issue commentgolang/go

proposal: cmd/go: add a way to query for non-defaults in the env

I think this is a good idea, both because it helps us and because it might help the user with "bugs" that are really misconfigurations. Would it make sense to simply enhance the existing output with a trailing shell comment for the modified ones, that way it still copy-pastes as environment variable settings? E.g.

$ go env
GO111MODULE="on"                   # instead of ""
GOARCH="amd64"
GOBIN="/home/mvdan/go/bin"
GOCACHE="/home/mvdan/go/cache"
GOENV="/home/mvdan/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GONOPROXY="brank.as/*"             # instead of ""
GONOSUMDB="brank.as/*"             # instead of ""
GOPROXY="https://proxy.golang.org" # instead of "https://proxy.golang.org,direct"
(etc)
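
A rough sketch of the formatting rule, assuming the defaults are known to the caller. The helper `annotate` is made up for illustration and is not part of cmd/go.

```go
package main

import "fmt"

// annotate formats one `go env`-style line. When value differs from def
// it appends a shell comment naming the default, so the line can still
// be pasted into a shell. Hypothetical helper; not part of cmd/go.
func annotate(name, value, def string) string {
	line := fmt.Sprintf("%s=%q", name, value)
	if value == def {
		return line
	}
	return fmt.Sprintf("%s # instead of %q", line, def)
}

func main() {
	fmt.Println(annotate("GO111MODULE", "on", ""))
	fmt.Println(annotate("GOARCH", "amd64", "amd64"))
	fmt.Println(annotate("GOPROXY", "https://proxy.golang.org", "https://proxy.golang.org,direct"))
}
```

Note that %q applies Go quoting rules, which agree with shell double quotes only for simple values; a real implementation would need proper shell quoting.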
mvdan

comment created time in 21 days

issue commentgolang/go

cmd/cover: (html output) UI accessibility issues

See also, for the screen reader version of html UI accessibility issues, #36685.

cznic

comment created time in a month

issue openedgolang/go

cmd/cover: (html output) UI accessibility issues, unfriendly to screen reader

What version of Go are you using (go version)?

go 1.13, go 1.14

Recently on golang-nuts, "Is it possible to get code coverage information in a way that does not assume you can see color?"

https://groups.google.com/g/golang-nuts/c/DY4O9UXMr9M?pli=1

In this case, "see color" refers to a screen reader for a totally blind person.

This is a bug, not an enhancement, because accessibility is important. It does need someone who knows something about UI accessibility to look at it.

created time in a month

issue closedgolang/go

cmd/cover: (html output) UI accessibility issues

<pre>The color scheme has IMO some troubles:

  1. It's inverted, i.e. light text on dark background. Please be aware that some people are not able to read it comfortably, or read it at all for more than a brief period, without physical troubles. I suggest never defaulting to inverted colors anywhere whatsoever.

  2. To make things worse, the light gray text on black background has such low contrast that people already having trouble with the inverted scheme now have one more source of pain. (Also a big sin of many Android UI screens.)

  3. People can select default fonts they're comfortable with in the browser. Please respect that and do not force another font where it's completely unnecessary. It's a tooling report, not a design studio website front page or similar.

Suggestions:

  • Select a font-family only, eg. 'monospace', not 'Menlo, monospace'.
  • Remove 'font-weight: bold;'
  • Support as many report styles as you wish, but keep the default one plain black text on white background (like this issues page). The green/red emphasis on covered/not covered source lines is IMO ok.
  • If a B/W scheme is considered too "plain", then I suggest adopting the color scheme already used for source code by the godoc tool. It has no accessibility problems I'm aware of. The typographical consistency would be a nice bonus, I believe.

My apologies for I haven't followed the design of the cover tool before so I could provide such feedback earlier.</pre>

closed time in a month

cznic

issue openedgolang/go

cmd/objdump: panic for source code annotation with too-large line directives

What version of Go are you using (go version)?

Go almost-14 <pre> $ go version go version devel +71239b4f49 Mon Jan 20 15:06:42 2020 +0000 darwin/amd64 </pre>

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

<details><summary><code>go env</code> Output</summary><br><pre>
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/drchase/Library/Caches/go-build"
GOENV="/Users/drchase/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/drchase/work/gocode"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/drchase/work/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/drchase/work/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/gr/vvb66dqx6jl6lh8wckfd5p9w0095tn/T/go-build189264983=/tmp/go-build -gno-record-gcc-switches -fno-common"
</pre></details>

What did you do?

I compiled this program, call it bogo.go, with go build bogo.go. Then I ran go tool objdump -S ./bogo > bogo.s

package main

func main() {
	println("Try 'go tool objdump -S ./bogo > bogo.s'")
	loop()
}

var x int64

func loop() {
//line bogo.go:9999999
	for x <= 1024*1024*1024 {
		x = x + 1
	}
}

What did you expect to see?

Not a panic

What did you see instead?

panic: runtime error: index out of range [9999998] with length 37

goroutine 1 [running]:
cmd/internal/objfile.(*FileCache).Line(0xc000141d70, 0xc00019f678, 0x7, 0x98967f, 0xc000141ab8, 0x0, 0x0, 0xc000141b18, 0x10dd412)
        /Users/drchase/work/go/src/cmd/internal/objfile/disasm.go:178 +0x602
cmd/internal/objfile.(*Disasm).Print.func1(0x10b5350, 0x2, 0xc00019f678, 0x7, 0x98967f, 0xc000090ce0, 0xd)
        /Users/drchase/work/go/src/cmd/internal/objfile/disasm.go:232 +0xd8
cmd/internal/objfile.(*Disasm).Decode(0xc00012e000, 0x10b5350, 0x10b536c, 0x0, 0x0, 0x0, 0xc000141e40)
        /Users/drchase/work/go/src/cmd/internal/objfile/disasm.go:283 +0x27b
cmd/internal/objfile.(*Disasm).Print(0xc00012e000, 0x11fa440, 0xc00000e018, 0x0, 0x1001000, 0xffffffffffffffff, 0x1)
        /Users/drchase/work/go/src/cmd/internal/objfile/disasm.go:227 +0x4e2
main.main()
        /Users/drchase/work/go/src/cmd/objdump/main.go:90 +0x615

This is related to #36570 (which is fixed). Using similar methods I tried to crash both test coverage and pprof, but was unable to provoke a panic.
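
The panic is a plain out-of-range index on the cached source lines (line 9999999 in a 37-line file). A minimal sketch of the kind of guard that would avoid it follows; `sourceLine` is a hypothetical helper, not the actual cmd/internal/objfile code or its eventual fix.

```go
package main

import "fmt"

// sourceLine returns the 1-based line n of a file already split into
// lines. It reports ok=false instead of indexing out of range when n is
// not a valid line number. Hypothetical helper; not the objfile code.
func sourceLine(lines []string, n int) (text string, ok bool) {
	if n < 1 || n > len(lines) {
		return "", false
	}
	return lines[n-1], true
}

func main() {
	lines := make([]string, 37) // the panic above reports length 37
	lines[8] = "func loop() {"
	fmt.Println(sourceLine(lines, 9))
	fmt.Println(sourceLine(lines, 9999999)) // bogus //line directive
}
```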

created time in a month

issue commentgolang/go

cmd/cover: (html output) UI accessibility issues

Recently on golang-nuts, "Is it possible to get code coverage information in a way that does not assume you can see color?"

https://groups.google.com/g/golang-nuts/c/DY4O9UXMr9M?pli=1

In this case, "see color" refers to a screen reader for a totally blind person. That's not what this bug was originally exactly about, but the bug title is very general and includes this problem.

Reopening, so this can get some attention. This is a bug, not an enhancement, because accessibility is important. It does need someone who knows something about UI accessibility to look at it.

cznic

comment created time in a month

issue commentgolang/go

panic: runtime error: slice bounds out of range

Might not be closed, was a gaffe in the commit wording.

javasgl

comment created time in a month

issue commentgolang/go

panic: runtime error: slice bounds out of range

I tried to reproduce this several ways, and failed. We need more information. Test program ("bogo.go"):

package main

import (
	"io"
	"os"
	"runtime/pprof"
	"time"
)

var file io.WriteCloser

func main() {
	file, _ = os.Create("bogo.prof")
	pprof.StartCPUProfile(file)
	go func() {
		time.Sleep(5 * time.Second)
		pprof.StopCPUProfile()
		file.Close()
		os.Exit(0)
	}()

	loop()
}

//go:noinline
func loop() {
	for {
	}
}

I tried web, weblist, and disasm within go tool pprof bogo.prof and I also tried poking at the UI in go tool pprof -http=localhost:9090 ./bogo bogo.prof. I definitely saw the bogus line number in the disassembly, but nothing crashed.

javasgl

comment created time in a month

issue commentgolang/go

cmd/objdump: panic: runtime error: index out of range [1048574] when disassembling empty infinite loop

To be fair, this is a flaw in objdump, though we can certainly tweak the compiler to avoid triggering it:

package main

func main() {
//line bogo.go:9999999
	for { }
}

yields

go tool objdump -S ./bogo > bogo.s
panic: runtime error: index out of range [9999998] with length 7
bigwhite

comment created time in a month

issue commentgolang/go

cmd/objdump: panic: runtime error: index out of range [1048574] when disassembling empty infinite loop

But for mysterious reasons that didn't work for ssa/debug_test.go, despite the generated line being present, like so:

  infloop.go:10		0x1057005		90			NOPL					
  <infiniteloop>:1	0x1057006		ebfd			JMP 0x1057005				

The debuggers could start looking for <infiniteloop> before it is present, though that might complicate their testing. I think they need a more general solution for this anyway (record PC every 2-to-the-N instructions into stepping, start looking for a repeat after half a second?) because the compiler certainly does not detect all single-line infinite loops, just the (very) easy ones.
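
One way to read the "record PC every 2-to-the-N instructions" idea is as a checkpointing cycle detector. The sketch below simulates it over a made-up `next` function standing in for single-stepping; it is my interpretation, not how any debugger actually works, and a repeated PC is only a hint, since program state may still differ between visits.

```go
package main

import "fmt"

// detectRepeat single-steps a simulated program from start and reports
// the step count at which the PC returns to a recorded checkpoint.
// A new checkpoint is recorded whenever the step count hits a power of
// two, so memory use stays constant.
func detectRepeat(next func(pc uint64) uint64, start uint64, maxSteps int) (steps int, found bool) {
	checkpoint := start
	nextMark := 1
	pc := start
	for step := 1; step <= maxSteps; step++ {
		pc = next(pc)
		if pc == checkpoint {
			return step, true
		}
		if step == nextMark {
			checkpoint = pc
			nextMark *= 2
		}
	}
	return 0, false
}

func main() {
	// The two-instruction loop from the disassembly above:
	// NOPL at 0x1057005, JMP 0x1057005 at 0x1057006.
	loop := func(pc uint64) uint64 {
		if pc == 0x1057005 {
			return 0x1057006
		}
		return 0x1057005
	}
	fmt.Println(detectRepeat(loop, 0x1057005, 1000))

	// Straight-line code never revisits a PC.
	straight := func(pc uint64) uint64 { return pc + 1 }
	fmt.Println(detectRepeat(straight, 0, 1000))
}
```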

bigwhite

comment created time in a month

issue commentgolang/go

cmd/objdump: panic: runtime error: index out of range [1048574] when disassembling empty infinite loop

How do you feel about

%  dlv exec ./bogo
Type 'help' for list of commands.
(dlv) b main.main
Breakpoint 1 set at 0x1056fd0 for main.main() ./bogo.go:8
(dlv) c
> main.main() ./bogo.go:8 (hits goroutine(1):1 total:1) (PC: 0x1056fd0)
Warning: debugging optimized function
     3:     func call() {
     4:          println("called!")
     5:     }
     6:     
     7:     func main() {
=>   8:          for { }
     9:     }
(dlv) n
> main.main() <infiniteloop>:1 (PC: 0x1056fd1)
Warning: debugging optimized function
(dlv) n
> main.main() ./bogo.go:8 (hits goroutine(1):2 total:2) (PC: 0x1056fd0)
...
bigwhite

comment created time in a month

issue commentgolang/go

cmd/objdump: panic: runtime error: index out of range [1048574] when disassembling empty infinite loop

My first stab at this turned into a mess.

bigwhite

comment created time in a month

issue commentgolang/go

cmd/objdump: panic: runtime error: index out of range [1048574] when disassembling empty infinite loop

I had not thought of the line number in a different file hack, I should see how that behaves. Would work for step, not sure about "next".

bigwhite

comment created time in a month

issue commentgolang/go

cmd/objdump: panic: runtime error: index out of range [1048574] when disassembling empty infinite loop

Problem with the infinite loop answer is that (1) we're not super excited about large changes right now and (2) I am not exactly sure it will work anyway for the debugger. I just tried "debugging"

package main

func call() {
	println("called!")
}

func main() {
	for { call(); }
}

and "n" was not happy-making.

The better bogus line solves the problem for the naive empty loop, and solves the crash for this bug. However, debugging the program for this bug is still not good -- it hangs (in deadloop), unless I set a breakpoint on line 1 (!).

If the debuggers could be convinced to all set a breakpoint in runtime.InfiniteLoop, that would work, also.

bigwhite

comment created time in a month

issue commentgolang/go

cmd/objdump: panic: runtime error: index out of range [1048574] when disassembling empty infinite loop

There are few programs with interesting code on line 1, but most programs have a line 1.

We have to play this game because typing "n" in a debugger single steps instructions until the line number changes. For a "B ." loop, that takes a long time. So, insert a nop with a bogus line number to break up the monotony.

I tried zero for a bogus line number, one of our tools got the vapors. I tried a large number for a bogus line number, one of our tools got the vapors.

So, I am trying 1.

Programs that are entirely on line 1 will behave poorly in a debugger, but they did that already.
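
The stepping behaviour described above can be modelled with a toy program. This is only a sketch: `instr` and `stepNext` are hypothetical names, and real debuggers are more elaborate. It shows why a single self-jump never finishes a "next", while a second instruction attributed to a different line ends it after one step.

```go
package main

import "fmt"

// instr is a toy instruction: the source line it is attributed to and
// the index of the instruction executed after it.
type instr struct {
	line int
	next int
}

// stepNext mimics a debugger's "next": single-step from pc until the
// source line changes, giving up after maxSteps instructions. It reports
// the number of steps taken and whether a different line was reached.
func stepNext(prog []instr, pc, maxSteps int) (steps int, stopped bool) {
	start := prog[pc].line
	for steps < maxSteps {
		pc = prog[pc].next
		steps++
		if prog[pc].line != start {
			return steps, true
		}
	}
	return steps, false
}

func main() {
	// "B ." on line 8: one instruction that jumps to itself.
	tight := []instr{{line: 8, next: 0}}
	// The same loop split in two, with one instruction attributed to
	// the bogus line 1.
	padded := []instr{{line: 8, next: 1}, {line: 1, next: 0}}

	fmt.Println(stepNext(tight, 0, 1000000))  // never sees a new line
	fmt.Println(stepNext(padded, 0, 1000000)) // stops after one step
}
```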

bigwhite

comment created time in a month

issue commentgolang/go

cmd/objdump: panic: runtime error: index out of range [1048574] when disassembling empty infinite loop

Cherry has made an excellent argument for bogus line = 1 (in person).

bigwhite

comment created time in a month

issue commentgolang/go

cmd/objdump: panic: runtime error: index out of range [1048574] when disassembling empty infinite loop

That bad line number is intentional; it's to prevent an infinite loop stepping in the debugger. See https://go-review.googlesource.com/c/go/+/168477

Perhaps the debuggers are smarter now. @cherrymui had an idea that we should just write "runtime.InfiniteLoop()" and for any effect-free infinite loops that we detect (like this one), replace the code with a call to that. It will of course deschedule the goroutine, and the name speaks for itself to anyone debugging their code, and we can get rid of these line number shenanigans.

Saves energy, too.

bigwhite

comment created time in a month

issue commentgolang/go

runtime: GC causes latency spikes

Which version of Go? There are several bugs that cause few-millisecond pauses for one or more threads. Some fixes are planned for 1.15; one bug, the failure to preempt call-free long-running loops, got fixed in 1.14, so if you test with tip and the latency gets better, that was your problem. We also fixed one of the others in 1.14, but it's limited to just a few milliseconds. See https://github.com/golang/go/issues/27732 for a discussion of those bugs, and traces showing what they look like; it would be nice to know if what's causing your problems is new or known.

A quick summary of the known bugs (from 27732):

  1. Stack allocations are not divisible work; can take a long time to process. (Workaround for 1.14 added https://go-review.googlesource.com/c/go/+/180817/ -- use go build -gcflags 'all=-smallframes' hello.go)
  2. Mark assist applied to global and stack allocations (generally, roots) doesn't generate any credit.
  3. Sweeping problem (#18155 - experimental fix here: https://go-review.googlesource.com/c/go/+/187817/ )
  4. Reschedule (i.e., quantum expires) that discovers a need for GC does not wake other Ps (or, does not check global run queue) (i.e., the work to do expanded by 1; if any Ps are idle, one should be wakened to take over doing "real work"). (Fixed in 1.14: https://go-review.googlesource.com/c/go/+/146817/ )
  5. (Related to 4?) Dedicated worker doesn't kick goroutines out of its local run queue.

The workaround for bug 1 will also address bug 2 -- if there's no oversized frames to mark, the size of the accounting error is small, also.

Bug 3 is rare, usually only occurs during program startup or if your code is very very good at not generating garbage; if the heap is completely live, scanning for a free slot can run very long. The workaround is somewhat nutty -- add a counter to the allocator for the problem, and every 64 allocations (counter&63 == 0) drop the allocation on the floor and do it again. This leaves holes in your heap that prevent long stalls for scanning.
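
As a rough illustration of that workaround, here is a minimal sketch at the application level. The names `node`, `holeyAllocator`, and `sink` are hypothetical, and the package-level `sink` is there on the assumption that it keeps the compiler from optimizing the wasted allocation away.

```go
package main

import "fmt"

// node stands in for whatever object type dominates the fully live heap.
type node struct {
	payload [64]byte
}

// sink forces the throwaway allocation onto the heap.
var sink *node

// holeyAllocator wraps allocation of *node and wastes every 64th one.
type holeyAllocator struct {
	counter uint64 // allocations requested so far
	dropped uint64 // allocations deliberately discarded
}

// newNode returns a fresh node. When counter&63 == 0 it first allocates
// an extra node and drops it, so the collector can later reclaim that
// slot and leave a hole among otherwise live objects.
func (a *holeyAllocator) newNode() *node {
	a.counter++
	if a.counter&63 == 0 {
		sink = new(node) // allocate...
		sink = nil       // ...and drop it on the floor
		a.dropped++
	}
	return new(node)
}

func main() {
	a := &holeyAllocator{}
	live := make([]*node, 0, 640)
	for i := 0; i < 640; i++ {
		live = append(live, a.newNode())
	}
	fmt.Println("live:", len(live), "dropped:", a.dropped)
}
```

The cost is roughly 1/64 of the allocation volume in wasted memory, traded for free slots that bound how long a scan for one can run.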

If you try these workarounds and one of them works, that's really useful information about which bug is the bigger problem and needs attention first.

It's also helpful to exclude startup from delay stats for long-running servers; those are a separate category of bug, and (our opinion) not nearly as important as steady-state glitches. They do matter; minor changes in startup heap growth heuristics have a noticeable effect on build speed, so we care, just not as much.

Dieterbe

comment created time in a month

issue commentgolang/go

runtime: GC causes latency spikes

Sorry for the huge delay in replying. The short answer to your problem is "don't call memstat that often and don't put the call someplace you would mind waiting for an answer". There was an old problem with memstat blocking till the GC finished; we thought we had fixed it, but apparently you have found a new problem with memstat blocking. When I remove the calls to memstat from your test program, all is well.

Even when we fix this new bug, frequent calls to memstat are not a great idea, because it actually stops the world. Not for long, but everything. It has overheads.

Dieterbe

comment created time in a month

issue commentgolang/go

runtime: system experienced long pauses during GC marking

Crud, I just now saw this. Is this still a problem? And how many processors are there, and is the Go process the only significant load on the box? (I.e., is it possible that there is competition with other processes?)

fmstephe

comment created time in a month

issue commentgolang/go

runtime: memory corruption on Linux 5.2+

@lmb How low is "pretty low"? Expected number of pages locked is O(threads) (not goroutines), since it is one page per thread. Unless you have a lot of goroutines tied to threads, ought to be GOMAXPROCS pages, plus a few for bad luck.

aclements

comment created time in 2 months

issue commentgolang/go

runtime: memory corruption on Linux 5.2+

Prefer touching the signal stack because it reduces the need to explain things to users, though I hope we're talking about a small number of people anyway (latency-sensitive Go users on latest-N-greatest Linux for the next few months). And, also, it will be trivial to remove if/when we get around to doing that.

aclements

comment created time in 2 months

issue commentgolang/go

runtime: memory corruption on Linux 5.2+

How bad is it to mlock the first page of each signal stack?

It seems to me that disabling signal preemption is strictly worse than pre-poking the page with a CAS. Both leave us equally vulnerable to this bug triggered by not-our-signals (an apparently tiny risk, since none have been reported), but disabling signal preemption means that we'll have OS-linked latency-glitches, for a few versions of Linux (5.2, 5.3, 5.4). The cost of the CAS compared to sending the signal is small.

But mlock still seems preferable -- maybe we only mlock on the affected versions of Linux? If anyone complains, we have an obvious workaround, and we eliminate all that risk without OS-linked latency weirdness.
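
For readers unfamiliar with the "pre-poke with a CAS" idea, here is a minimal user-level sketch of the operation itself. It is not the runtime's code: `touchPage` is a hypothetical helper and the slice merely stands in for the first page of a signal stack. The intent is a write access that leaves the contents unchanged.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// touchPage performs a compare-and-swap of 0 for 0 on *word. Whether or
// not the comparison succeeds, the stored value is left as it was; the
// point is only to access the page as a write before a signal arrives.
func touchPage(word *uint32) {
	atomic.CompareAndSwapUint32(word, 0, 0)
}

func main() {
	// Stand-in for the first page of a signal stack (4096 bytes).
	page := make([]uint32, 1024)
	page[0] = 42

	touchPage(&page[0])
	fmt.Println("contents unchanged:", page[0] == 42)
}
```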

aclements

comment created time in 3 months

issue commentgolang/go

runtime: memory corruption on Linux 5.2+

Austin confirms a fix, I tested it at Monday's Linux tip and it also works for me.

aclements

comment created time in 3 months

issue commentgolang/go

runtime: memory corruption on Linux 5.2+

Don't know if Austin is verifying this already or not, but I'm about to fire off a build, go to lunch, then test.

aclements

comment created time in 3 months

issue commentgolang/go

runtime: corrupt binary export data seen after signal preemption CL

Austin said he'd copy my remarks over there (I don't think I've got an account) and this is pretty much the end of it for me. I'm pretty sure we gave them a nice start. Ian's observation has a good smell to it.

mvdan

comment created time in 3 months

issue commentgolang/go

runtime: corrupt binary export data seen after signal preemption CL

FYI, I reproduced, at Linux tip (5.4 + a little), the dependence of the bug on how arch/x86/kernel/fpu/signal.o is compiled --

  • if that is the only file compiled with gcc9, the rest with gcc8, then the bug appears;
  • if that is the only file compiled with gcc8, the rest with gcc9 then the bug does not appear.

The exact gcc versions are

gcc-8 --version
gcc-8 (Ubuntu 8.3.0-23ubuntu2) 8.3.0

and

gcc-9 --version
gcc-9 (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008
mvdan

comment created time in 3 months

issue commentgolang/go

runtime: corrupt binary export data seen after signal preemption CL

For your nerd-sniping amusement (and not mine), the bug is somewhere in the differences in these two disassembled files. Link Linux with the gcc8 one, all is well, link with the other, it goes bad.

signal.8.dis.txt signal.9.dis.txt

There's a ridiculous amount of inlining going on, not sure it helps to have the source annotation in the disassembly, but here it is:

signal.8.il.dis.txt signal.9.il.dis.txt

The source file in question, arch/x86/kernel/fpu/signal.c, is from this commit: https://github.com/torvalds/linux/commit/d9c9ce34ed5c892323cbf5b4f9a4c498e036316a

I verified this by building two kernels. One is compiled entirely by gcc8 except for arch/x86/kernel/fpu/signal.o, which is compiled by gcc9; it fails. The other is built entirely by gcc9 except for arch/x86/kernel/fpu/signal.o, which is built by gcc8; it does not fail.

Enjoy!

mvdan

comment created time in 3 months

issue commentgolang/go

runtime: memory corruption on Linux 5.3.x from async preemption

Not sure where Austin's reporting this or if he had time today, but:

  • he has a C program demonstrating the bug in Linux 5.3 (built with gcc 9) for purposes of filing a bug soonish;
  • there is a workaround on the Go implementation side (be sure the signal stack is mapped);
  • I managed to create a failing Linux 5.3 where the entire kernel is compiled with gcc 8, except for arch/x86/kernel/fpu/signal.c.
aclements

comment created time in 3 months

issue commentgolang/go

runtime: memory corruption on Linux 5.3.x from async preemption

To recap experiments last Friday (and I rechecked the test for the more mystifying of these Sunday afternoon), Cherry and I tried the following:

Doubled the size of the sigaltstack, just in case. Also sanity-checked the bounds within gdb; they were okay.

Modified the definition of fpstate to conform to what is defined in the linux header files. Modified sigcontext to use the new Xstate:

fpstate *Xstate // *fpstate1

Wrote a method to allow us to store the ymm registers that were supplied (as registers) to the signal handler,

  1. tried an experiment in the assembly language handler to trash the YMM registers (not the data structures) before return. We never saw any sign of the trash but this seemed to raise the rate of the failures (running "go vet all"). The trashing string stored was "This_is_a_test. "

  2. tried printing the saved and current ymm registers in sigtrampgo. The saved ones looked like memmove artifacts (source code while running vet all), and the current ones were always zero. The memmove artifacts stayed unchanged, a lot, between signals. I rechecked the code that did this earlier today, just in case we got it wrong.

  3. made a copy of the saved xmm and ymm registers on sigtrampgo entry, then checked the copy against the saved registers, to see if our code ever somehow modified them. That never fired.

I spent some time Saturday looking for "interesting" comments in the Linux git log, I have some to review. What I am wondering is if there was some attempt to optimize saving of the ymm registers and that got fouled up. One thing I wonder a little about was what they are doing for power management with AVX use, I saw some mention of that. (I.e., what triggers AVX use, can they "save power" if they don't touch the registers, if they believe AVX is not being used? Suppose they rely on some hardware bit that isn't set under exactly the expected conditions?)

type Xstate struct {
   Fpstate Fpstate
   Hdr Header
   Ymmh Ymmh_state
}

type Fpstate struct {
   Cwd uint16
   Swd uint16
   Twd uint16
   Fop uint16
   Rip uint64
   Rdp uint64
   Mxcsr uint32
   Mxcsr_mask uint32
   St_space [32]uint32
   Xmm_space [64]uint32
   Reserved2 [12]uint32
   Reserved3 [12]uint32
}

type Header struct {
   Xfeatures uint64
   Reserved1 [2]uint64
   Reserved2 [5]uint64
}

type Ymmh_state struct {
   Space [64]uint32
}

TEXT runtime·getymm(SB),NOSPLIT,$0
    MOVQ    0(FP), AX
    VMOVDQU Y0,0(AX)
    VMOVDQU Y1,(1*32)(AX)
    VMOVDQU Y2,(2*32)(AX)
    VMOVDQU Y3,(3*32)(AX)
    VMOVDQU Y4,(4*32)(AX)
    VMOVDQU Y5,(5*32)(AX)
    VMOVDQU Y6,(6*32)(AX)
    VMOVDQU Y7,(7*32)(AX)
    VMOVDQU Y8,(8*32)(AX)
    VMOVDQU Y9,(9*32)(AX)
    VMOVDQU Y10,(10*32)(AX)
    VMOVDQU Y11,(11*32)(AX)
    VMOVDQU Y12,(12*32)(AX)
    VMOVDQU Y13,(13*32)(AX)
    VMOVDQU Y14,(14*32)(AX)
    VMOVDQU Y15,(15*32)(AX)
    RET
aclements

comment created time in 3 months

push eventdr2chase/bent

David Chase

commit sha 64ad67c97030591931d04e1b39825b11cf0613b7

Added configurations-cmpjob.toml


push time in 3 months

push eventdr2chase/bent

David Chase

commit sha 16806fcebedc0d0eb3354478636afcc081bce362

Added two scripts to the initialization-copy list


push time in 3 months

issue commentgolang/go

cmd/compile: additional internal debugging support for escape.go

Your second one is good (the old version and your first one are both not good), except that I don't quite see why I need the lines that are indented under "flow" (i.e., the intermediate lines). The lines marked "flow" would be even better with some parenthetical mention of the line/column. Users are going to be a little mystified by ~r0 and ~R0.

mdempsky

comment created time in 5 months

push eventdr2chase/bent

David Chase

commit sha 41c8c7e73c378888fd3ea86c0b5c9055808721fc

Added two useful scripts; removed some noisy benchmarks.


push time in 5 months
