Description
What version of Go are you using (go version)?
$ go version
go version go1.15.1 linux/amd64
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
OS: Fedora/YUM based OS (RHEL, CentOS, OEL, etc)
Architecture: amd64
Kernel: Reproduced on anything from 4.1 to 5.8
go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/root/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/root/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/root/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/root/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build513552128=/tmp/go-build -gno-record-gcc-switches"
What did you do?
While writing an extremely basic HTTP load tool, I noticed some funky behavior when re-using the *http.Client (and its embedded *http.Transport).
The tool prepares a *http.Client and a *http.Request, then spins up multiple goroutines that each clone the *http.Request (since a *http.Request isn't supposed to be used in this concurrent fashion) and use the shared *http.Client (which is safe for concurrent use) to send as many requests per second as possible to a server of the user's choice.
Here is a hacky, simplified and shortened but functioning example that showcases my observation. It sends requests to a URL specified on the command line and goes through two passes of 30 seconds each: the first pass uses a shared client/transport, while the second pass uses a new client/transport for each worker.
Code Example
package main

import (
	"context"
	"crypto/tls"
	"flag"
	"fmt"
	"io"
	"io/ioutil"
	"net/http"
	"sync"
	"time"
)

func fire(ctx context.Context, wg *sync.WaitGroup, r *http.Request, c *http.Client, chGo chan struct{}, chResult chan bool) {
	defer wg.Done()
	<-chGo
	for {
		select {
		case <-ctx.Done():
			return
		default:
			chResult <- true
			resp, err := c.Do(r)
			if err != nil {
				chResult <- false
				continue
			}
			_, err = io.Copy(ioutil.Discard, resp.Body)
			if err != nil {
				chResult <- false
			}
			resp.Body.Close()
		}
	}
}

func main() {
	flagURL := flag.String("url", "", "URL for the request")
	flag.Parse()

	if *flagURL == "" {
		panic(fmt.Errorf("URL can not be empty"))
	}

	req, err := http.NewRequest("GET", *flagURL, nil)
	if err != nil {
		panic(err)
	}

	tp := http.DefaultTransport.(*http.Transport).Clone()
	// In the case of HTTPS, disable verification for testing purposes
	tp.TLSClientConfig = &tls.Config{InsecureSkipVerify: true}
	// Bump up connection limits
	tp.MaxConnsPerHost = 200
	tp.MaxIdleConns = 200
	tp.MaxIdleConnsPerHost = 200
	// Disable HTTP/2
	tp.TLSNextProto = make(map[string]func(string, *tls.Conn) http.RoundTripper)

	// First pass is with a shared transport
	hClt := &http.Client{Transport: tp}

	chGo := make(chan struct{})
	chResult := make(chan bool, 200)
	ctx, cancel := context.WithCancel(context.Background())
	d := 30 * time.Second

	var wgFire sync.WaitGroup
	for i := 1; i <= 200; i++ {
		wgFire.Add(1)
		// Need to clone request
		//
		// See https://github.com/golang/go/issues/21780
		go fire(ctx, &wgFire, req.Clone(context.Background()), hClt, chGo, chResult)
	}

	var wgStat sync.WaitGroup
	wgStat.Add(1)
	go func() {
		defer wgStat.Done()
		var reqs uint64
		var errs uint64
		for {
			v, more := <-chResult
			if !more {
				break
			}
			switch v {
			case true:
				reqs++
			case false:
				errs++
			}
		}
		fmt.Println("\nWorker sharing HTTP transport")
		fmt.Printf("Duration: %v\nRequests: %v\nRequests per second: %v\nErrors: %v\n\n", d, reqs, reqs/uint64(d.Seconds()), errs)
	}()

	close(chGo)
	time.Sleep(d)
	cancel()
	wgFire.Wait()
	close(chResult)
	wgStat.Wait()

	// Let's go again but with non-shared transports
	chGo = make(chan struct{})
	chResult = make(chan bool, 200)
	ctx, cancel = context.WithCancel(context.Background())

	for i := 1; i <= 200; i++ {
		wgFire.Add(1)
		// Here each worker gets their own transport
		hClt := &http.Client{Transport: tp.Clone()}
		// Need to clone request
		//
		// See https://github.com/golang/go/issues/21780
		go fire(ctx, &wgFire, req.Clone(context.Background()), hClt, chGo, chResult)
	}

	wgStat.Add(1)
	go func() {
		defer wgStat.Done()
		var reqs uint64
		var errs uint64
		for {
			v, more := <-chResult
			if !more {
				break
			}
			switch v {
			case true:
				reqs++
			case false:
				errs++
			}
		}
		fmt.Println("Each worker getting a HTTP transport copy")
		fmt.Printf("Duration: %v\nRequests: %v\nRequests per second: %v\nErrors: %v\n", d, reqs, reqs/uint64(d.Seconds()), errs)
	}()

	close(chGo)
	time.Sleep(d)
	cancel()
	wgFire.Wait()
	close(chResult)
	wgStat.Wait()
}
What did you expect to see?
The same RPS throughput for both scenarios, in any environment.
What did you see instead?
When running this on my local DEV machine (Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz) against a server (Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz), I get the expected results:
$ ./http_perf_compare -url http://perf.domain.local
Worker sharing HTTP transport
Duration: 30s
Requests: 1640557
Requests per second: 54685
Errors: 0
Each worker getting a HTTP transport copy
Duration: 30s
Requests: 1553561
Requests per second: 51785
Errors: 0
Both approaches result in a similar RPS.
However, when I run the same code on a client machine with the same CPU as the server (Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz), I see a ~90% increase in RPS throughput when using a separate client/transport for each worker goroutine:
$ ./http_perf_compare -url http://perf.domain.local
Worker sharing HTTP transport
Duration: 30s
Requests: 2292353
Requests per second: 76411
Errors: 0
Each worker getting a HTTP transport copy
Duration: 30s
Requests: 4442219
Requests per second: 148073
Errors: 0
This happens independently of whether plain HTTP or HTTP-over-TLS (HTTPS) is used.
Is this expected?
Thank you in advance for the time anyone takes to look into this.