-
-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Race on echo.maxParams, unable to use some context methods safely concurrently #1705
Comments
Hi @iaburton. Yes you are right, the use of maxParam is not thread safe, and changing it from a context could lead to have Context instances on the sync.Pool that don't have the proper length for pvalues (panics when maxParam is changed to a bigger value, the other way around it could lead to a memory leak) I'll continue trying to dig into the Git history to find out what is the intended purpose of Context#SetParamNames, because at a first glance it looks like as a "testing helper". If that is true, that could be the reason of why Echo's Context/Router assumes that maxParams shouldn't be protected (in production maxParam always is set by the Router and don't change). Please don't interpret this as me saying that maxParams shouldn't be protected, I'm just trying to discover what is the design behind the code, to then try to propose the best solution (not breaking change/keep performance) |
Hey @pafuent , nice to get some confirmation on what I was observing. I noticed after posting that the original issue was related to #1492 that effected several people. I have a workaround right now that might be of help in terms of a partial solution. Note this workaround below assumes you've pinned a certain echo version where the context is described below, and compiling for a 64bit system. //inside custom setParamNamesUnsafe function
eptr := unsafe.Pointer(reflect.ValueOf(ectx).Pointer())
/*
type context struct {
request *http.Request // 64bit pointer
response *Response // 64bit pointer
path string // size of string header
pnames []string // value we want to change
pvalues []string
query url.Values
handler HandlerFunc
store Map
echo *Echo
logger Logger
lock sync.RWMutex
}
Use unsafe pointer arithmetic to access and change pnames.
stringSize := reflect.TypeOf(reflect.StringHeader{}).Size()
*/
pnames := (*[]string)(unsafe.Pointer(uintptr(eptr) + 8 + 8 + stringSize))
copy(*pnames, vals) Basically I use unsafe to set pnames myself using I'm still not sure what is causing the 404's I mentioned prior though, I'm still doing more testing as I cannot easily replicate the issue (leading me to believe it is a concurrency issue). It's like certain groups or routes lose their ability to parse out params (or rather params look like part of a path so it becomes 404), which is what led me to this code in the first place, a hot fix for echo that lets me set the name and params on a context before going into the final handler. Btw putting a mutex or using atomics on |
We need to look at the original concept behind maxParam. I'd like to avoid introducing unsafe pointer arithmetics for working around a conceptional issue. |
Hi @lammel, my apologizes for the ambiguity. I didn't mean to imply that unsafe should be added to echo, on the contrary that was meant as a workaround (for others if their code is also panicing) until something better was found. The echo version of this unsafe code is simply: func (c *context) SetParamNames(names ...string) {
copy(c.pnames, names)
} Your restating the need to look at maxParam is correct, as the above is not enough on its own. |
@iaburton No offense taken, just wanted to clarify we are not finished yet ;-) |
Bit of an interesting development; my 404 issue has gone away this this, but instead of 404s now I get invalid contexts essentially. My unsafe code is honoring the original len of pnames, but some other code seems to be changing that len. I'm guessing it's Reset again, like before where the panic was occuring func (c *context) Reset(r *http.Request, w http.ResponseWriter) {
c.request = r
c.response.reset(w)
c.query = nil
c.handler = NotFoundHandler
c.store = nil
c.path = ""
c.pnames = nil
c.logger = nil
// NOTE: Don't reset because it has to have length c.echo.maxParam at all times
for i := 0; i < *c.echo.maxParam; i++ {
c.pvalues[i] = ""
}
} Once pnames becomes nil, my code with the copy doesn't actually copy anymore so I get requests without params in the context. 🤔 EDIT: Since the issue is nondeterministic let me get back to this after trying another patch. EDIT 2: Added: if len(*pnames) == 0 {
*pnames = make([]string, len(vals))
} To my unsafe code, and that seems to have stopped the params from "disappearing", but its ultimately the same issue, bad contexts getting into the sync.Pool and messing up internal state in the router.Find and other related code. |
We'll use this issue for a general improvement for the current quite messy handling with echo.maxParam. So tagged v5 now. |
Maybe if we decide to return errors from our router (see #1762 (comment)), we can implement the solution already proposed by @lammel. Any thoughts? |
Basically yes. As we have benchmarking in place can check for negative impact on allocations or processing time. You are right, that the router can be decoupled from the echo instance, it just needs to know the maximum number of parameters that will be processed. |
We just ran into this, and the panic in In other words, if you execute a request and a context is created and pooled, then that Here's a quick unit test to explain and reproduce the problem: func Test_show_context_pvalues_vs_echo_maxParam_panic_during_pooled_context_reset(t *testing.T) {
// Start a fresh echo server.
e := echo.New()
port := 9999
go func() {
_ = e.Start(fmt.Sprintf("localhost:%d", port))
}()
time.Sleep(100)
// Register a no-params endpoint - at this point e.maxParam is 0.
noParamsPath := "/foo"
e.GET(noParamsPath, func(c echo.Context) error {
return c.String(200, "doesnotmatter")
})
// Execute a request. This will create a context that gets put into the e.pool pool of contexts,
// and that pooled context will be created with len(c.pvalues) of 0 to match the
// _current_ e.maxParam value.
_, _ = http.Get(fmt.Sprintf("http://localhost:%d%s", port, noParamsPath))
// Now register an endpoint that does have a param. This will cause e.maxParam to change to 1.
withPathParamBase := "/bar"
withParamPathTmplt := fmt.Sprintf("%s/:stuff", withPathParamBase)
e.GET(withParamPathTmplt, func(c echo.Context) error {
return c.String(200, "doesnotmatter")
})
// Execute some more requests. Eventually this will pull the pooled context with len(c.pvalues) of 0,
// but now e.maxParam is 1, leading to a panic in context.Reset() due to the for loop logic in
// context.Reset(...) expecting len(c.pvalues) >= e.maxParam.
urlWithParams := fmt.Sprintf("http://localhost:%d%s%s", port, withPathParamBase, "/stuff-param")
for i := 0; i < 1000; i++ {
_, err := http.Get(urlWithParams)
if err != nil {
t.Errorf("call failed due to %v", err)
}
}
} If there's a design decision that you should never add endpoints after any requests have been processed (in order to guarantee that all pooled But assuming you do want people to be able to add to the Echo server's endpoints after it's started processing requests, then IMO you have a few options:
Assuming that func (c *context) Reset(r *http.Request, w http.ResponseWriter) {
c.request = r
... etc
// c.echo.maxParam might have changed since this context was used last. Make sure our len(c.pvalues)
// is big enough to handle c.echo.maxParams, otherwise we end up with a panic.
if len(c.pvalues) < *c.echo.maxParam {
c.pvalues = make([]string, *c.echo.maxParam)
}
// NOTE: Don't reset because it has to have length c.echo.maxParam at all times
for i := 0; i < *c.echo.maxParam; i++ {
c.pvalues[i] = ""
}
} There would still be a potential for a race condition if |
At this point - routes (adding a route, note: there is no way to remove a route) or any of the echo fields is not meant to be mutated after server has started. This is not limited to |
Is this an official and intentional design decision (and if so is it documented somewhere)? Or just an implicit assumption in the codebase? If it's official and intentional, then IMO any of the mutation methods should fail-fast if they're called after the server has started. If the intention is that you should be able to mutate after the server is started, then it looks like a significant overhaul is needed to clean up all the race condition potential (not just this one in |
I really can not say what intentions @vishr had but Basically - trade-off is that for ease of use you must not mutate anything related to running Echo and Router instance. This is not something that is specific to Echo. As Echo and Gin are quite similar both have similar limitation (having public fields which are accessed or accessible from context inside handler coroutines) - they are not meant to be mutated after server has been started. Personally - router not being safe after server has started has been bothering me from design perspective but addressing everything related to potentially dataracey parts is questionable, as there are not much demand for it (adding routes etc after startup) and this would change the API quite drastically (no public fields anymore) - which would rub people, who do not need this feature, wrong way and they seem to be vast majority. |
There is trick to avoid it: We can register an empty route with many param to set an enough
|
Hello all 👋 long time I got the notification that this had been closed @aldas , but don't see a commit that references a fix. Was this addressed in the upcoming v5? |
This will not be changed in For See e := echo.New()
// workaround to get echo.maxParam value to high enough before server starts to routing requests
{
tmp := e.NewContext(nil, nil)
tmp.SetParamNames(make([]string, 256)...) // echo.maxParam will be now 256
e.ReleaseContext(tmp)
}
e.Use(middleware.Logger()) but there is nothing for guarding you from races if you start adding/registering routes after HTTP server has been started. |
Ah okay, so the suggestion is to use the workaround until v5. Thanks for the info! Edit: Though I do wonder, for those who don't know about this, if |
@iaburton could you first describe the use-case where it is necessary mutate params count? It sound like some kind of dynamic routing like thing. I do not understand why it is necessary. For example dynamically changing routes is pretty much dead end at the moment because router does not support removing routes at all. |
The original issue was simply about using the context method |
But is there actual use case when you want to
|
And if routes needs to be added after server has started. something like that could be created to provide locking for router type safeServe struct {
mu *sync.RWMutex
e *echo.Echo
}
func newSafeServe(e *echo.Echo) *safeServe {
// workaround to get echo.maxParam value to high enough before server starts to routing requests
{
tmp := e.NewContext(nil, nil)
tmp.SetParamNames(make([]string, 256)...) // echo.maxParam will be now 256
e.ReleaseContext(tmp)
}
return &safeServe{
mu: new(sync.RWMutex),
e: e,
}
}
func (s *safeServe) ServeHTTP(w http.ResponseWriter, r *http.Request) {
c := s.e.NewContext(r, w)
defer s.e.ReleaseContext(c)
s.find(r, c)
if err := c.Handler()(c); err != nil {
s.e.HTTPErrorHandler(err, c)
}
}
func (s *safeServe) find(r *http.Request, c echo.Context) {
s.mu.RLock()
defer s.mu.RUnlock()
s.e.Router().Find(r.Method, echo.GetPath(r), c)
}
func (s *safeServe) Add(method, path string, handler echo.HandlerFunc, middleware ...echo.MiddlewareFunc) {
for i := len(middleware) - 1; i >= 0; i-- {
handler = middleware[i](handler)
}
s.mu.Lock()
defer s.mu.Unlock()
s.e.Router().Add(method, path, handler)
}
func main() {
e := echo.New()
safe := newSafeServe(e)
//e.Use(middleware.Logger()) // DO NOT USE e.Use
handler := func(c echo.Context) error {
return c.String(http.StatusOK, c.Param("param"))
}
safe.Add(http.MethodGet, "/:param", handler, middleware.Logger()) // add middlewares to route here
s := http.Server{
Addr: ":8080",
Handler: safe,
}
if err := s.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
log.Fatal(err)
}
} but this looks silly. + you can not still remove these added routes which is probably something that you want if dynamic routing is something that is required. |
coming back to assuming it can only grow we do not need to access |
#2611 addresses this issue. middlewares using |
Hey @aldas thanks for the replies I get what you're asking now. So originally (a few years ago now so the details are a bit fuzzy) I was updating an echo service from 3 to 4 and ran into an issue regarding routes disappearing, as mentioned in the OP
I think I ended up trying the Anyways, I see your #2611 addresses it nicely though, fixing the race caused by the original MR that added it. |
Issue Description
Hi there!
I've been using echo for a relatively large api service, upgrading from 3.x to the latest 4.x recently and fixing some issues along the way.
While the initial issue I ran into may be related to #1678 (some groups or routes randomly "forget" their handler and start returning 404's) I discovered some other issues in echo.
First I'd like to thank the maintainers for the project, I don't mind discussing this issue or attempting to contribute a fix.
Anyways, I noticed here a significant race condition introduced via #1535 .
echo.maxParam
cannot be changed safely, even with atomics (which would fix the data race), as the algorithm itself is racy.Overwriting that value via the context I realized was causing panics here for me (not sure why that isn't just using the len of pvalues to begin with).
Further, manipulation of the context/echo types seems to lead to issues with the Find algorithm in the Router which is already hard to follow, and may be leading to my initial issue above.
Perhaps unrelated, is there a particular usage for the context type and methods I'm missing? It's defined as an interface but cannot be (re)implemented due to unsafe assertions such as this. I was hoping to override/shadow some methods for debugging purposes but cannot.
Thanks again, as for the initial issue I mentioned above regarding 404's I'd like to discuss/tackle after this one.
Checklist
Expected behaviour
Safely use context methods
Actual behaviour
Panic from incorrect len / race.
Steps to reproduce
Noted above.
Version/commit
Latest echo release, v4.1.17. Note I haven't tried with master, I noticed ~50 commits to master since the last release, I briefly looked at them but didn't see anything that would fix this.
The text was updated successfully, but these errors were encountered: