Remove invalid UTF-8 data.#117
Remove invalid UTF-8 data.#117lyda wants to merge 2 commits intolabstack:masterfrom lyda:utf-8-issue
Conversation
|
Please provide some executable example for this problem |
|
and are you sure this is the right place for this fix? any middleware or handler could do something with that "bad UTF8" url value. and why statuscode needs to be replaced? |
// Package main sends an invalid http request
package main
import (
"fmt"
"net"
)
func main() {
conn, _ := net.Dial("tcp", "127.0.0.1:8080")
data := []byte("GET /\xc0 HTTP/1.1\r\nHost: example.com\r\n\r\n")
conn.Write(data)
buf := make([]byte, 5000)
conn.Read(buf)
fmt.Println(string(buf[:]))
}This will result in a panic when this middleware is used. So far I've only seen this for the path so I could change it to only include that. I've seen such requests in random probes of sites I look after. This is the only middleware that crashes that I've seen (but I don't use a lot of them). |
|
Seems that https://github.com/prometheus/client_golang/blob/853c5de11e8a0ee0742689859f1642996517b479/prometheus/histogram.go#L1197 this method panics because strings are validated for being proper utf8 here https://github.com/prometheus/client_golang/blob/853c5de11e8a0ee0742689859f1642996517b479/prometheus/labels.go#L178 so I propose we change these lines echo-contrib/echoprometheus/prometheus.go Lines 279 to 282 in 415ca6f to if obs, tmpErr := requestDuration.GetMetricWithLabelValues(values...); tmpErr == nil {
obs.Observe(elapsed)
} else {
return fmt.Errorf("requestDuration observation failed, err: %w", tmpErr)
}
if obs, tmpErr := requestCount.GetMetricWithLabelValues(values...); tmpErr == nil {
obs.Inc()
} else {
return fmt.Errorf("requestCount observation failed, err: %w", tmpErr)
}
if obs, tmpErr := requestSize.GetMetricWithLabelValues(values...); tmpErr == nil {
obs.Observe(float64(reqSz))
} else {
return fmt.Errorf("requestSize observation failed, err: %w", tmpErr)
}
if obs, tmpErr := responseSize.GetMetricWithLabelValues(values...); tmpErr == nil {
obs.Observe(float64(c.Response().Size))
} else {
return fmt.Errorf("responseSize observation failed, err: %w", tmpErr)
} |
This will catch and log UTF-8 without causing the middleware to panic.
|
OK, logging the error would be good. You only need to do the first one though. After that you know the values are ok. Edit: See the latest commit for that. |
|
Ping? |
aldas
left a comment
There was a problem hiding this comment.
Please add at least one test that hows this problem is fixed.
| values[3] = url | ||
| for _, cv := range customValuers { | ||
| values[cv.index] = cv.valueFunc(c, err) | ||
| values[cv.index] = strings.ToValidUTF8(cv.valueFunc(c, err), "�") |
There was a problem hiding this comment.
I do not think this is necessary. It should be up to valueFunc creator to handle this.
|
In the end I made a number of changes to this to make it "safe by default" in a kubernetes context. It handles broken utf-8 in all fields and it avoids remote-driven cardinality explosions by default. I also simplified the code a bit. I have a bunch of other things to do so if you'd like some or all of those changes feel free to take them. It meets my needs now so I'll just use that. Thanks! |
Without this it panics with bad UTF-8 data which means we lose those hits.