Review min[Request|Response]DataRate behavior #4014
One additional problem with the read side of this implementation (i.e. the minRequestDataRate handling) is that the rate is only measured on the data content. This gives some protection against a slow POST DoS attack, but no protection against a classical Slowloris attack where the client sends the header data extremely slowly. It would be much better if the data rate could be measured for all network reads for the request, but that seems like it would require some significant rework.
It is possible for us to capture a timestamp of the very first select that relates to a request (i.e. as parsing is started). The HTTP parser could then check the data rate easily enough, as it is already tracking the header size to protect against large headers. @sbordet could h2 do the same?
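A minimal sketch of how such a header-rate check might look, assuming a hypothetical parser hook that knows the timestamp of the first select and the number of header bytes parsed so far (none of these names are Jetty's actual API):

```java
// Hypothetical sketch: reject requests whose header bytes arrive too slowly.
// None of these names correspond to Jetty's actual classes or fields.
class HeaderRateChecker
{
    private final long minHeaderBytesPerSecond;
    private final long firstSelectNanos; // captured when parsing of this request started

    HeaderRateChecker(long minHeaderBytesPerSecond, long firstSelectNanos)
    {
        this.minHeaderBytesPerSecond = minHeaderBytesPerSecond;
        this.firstSelectNanos = firstSelectNanos;
    }

    // Called by the parser, which already tracks headerBytesSoFar for the max header size check.
    void check(long headerBytesSoFar)
    {
        long elapsedNanos = System.nanoTime() - firstSelectNanos;
        if (elapsedNanos < 1_000_000_000L)
            return; // too early to judge: very short intervals give chaotic rates
        double rate = headerBytesSoFar * 1_000_000_000.0 / elapsedNanos;
        if (rate < minHeaderBytesPerSecond)
            throw new IllegalStateException("Header data rate too low: " + rate + " bytes/s");
    }
}
```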
@gregw an HTTP/2 client that wants to send headers very slowly would have to use a …
@mattiaslundstrom @gregw note also that a client sending the headers slowly is not a big problem, since we don't allocate a thread until we have all the headers. The content, however, is a problem, because an application may read it with blocking APIs and keep threads blocked.
For reads, the current behavior is that we record the first timestamp. In 9.4.x HTTP/1.1 we only trigger content arrival every time we read.
In 9.4.x HTTP/2, content arrives at any time and it's queued.
In Jetty 10.0.x the two behaviors will be coalesced into the 9.4.x HTTP/1.1 one (see #4318). Therefore, as with writes, reads suffer from the fact that the application idle time is taken into account, while it should not be. An alternative approach is to record how long the reads really take.
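As a rough illustration of that alternative (not Jetty's actual HttpInput code; all names below are hypothetical), the idea is to accumulate only the time actually spent inside reads and divide the bytes read by that time:

```java
// Hypothetical sketch: measure the data rate over the time spent *inside* reads,
// so that application pauses between reads do not count against the client.
class ReadRateTracker
{
    private long bytesRead;
    private long nanosSpentReading;

    int read(java.io.InputStream in, byte[] buffer) throws java.io.IOException
    {
        long begin = System.nanoTime();
        int n = in.read(buffer); // only the (possibly blocking) read itself is timed
        nanosSpentReading += System.nanoTime() - begin;
        if (n > 0)
            bytesRead += n;
        return n;
    }

    double bytesPerSecond()
    {
        return nanosSpentReading == 0 ? Double.MAX_VALUE : bytesRead * 1_000_000_000.0 / nanosSpentReading;
    }
}
```

With this accounting, a pause between two reads (a GC, a DB query) adds nothing to nanosSpentReading, so it cannot lower the measured rate.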
I'm thinking this new behavior should be implemented in Jetty 10.
So you're basically suggesting to take both timestamps and check them in the … This sounds like the most straightforward and reasonable method to me.
I can see a problem with this new approach... ah, but it also affects our current approach. However, my idea for a fix will solve both problems:
If we block for some time waiting for this raw content, then the initial small chunk will be added and we'll read only 5 bytes, but the read could have taken a long time because the second chunk is really rather large. So if we check the data rate in either addContent or in the read, both approaches can falsely report a too-low rate. The fix is to only check the data rate before we block (or schedule an async callback), i.e. if there is content to be consumed, we don't even look at the data rate. Only once we are about to block (or schedule an async callback) should we look at the data rate and reject the request if it is below the minimum.
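A sketch of that fix under the same caveats (hypothetical names, not the real HttpInput): content that is already available is handed out without any check, and the raw_bytes/total_time rate is examined only when we are about to block or schedule an async callback:

```java
// Hypothetical sketch: only check the overall data rate when we are about to block
// (or schedule an async read callback); if content is already queued, never check.
class BlockingRateGuard
{
    private final double minBytesPerSecond;
    private final long startNanos = System.nanoTime();
    private long totalRawBytes;

    BlockingRateGuard(double minBytesPerSecond)
    {
        this.minBytesPerSecond = minBytesPerSecond;
    }

    void onRawContent(int rawBytes)
    {
        totalRawBytes += rawBytes; // raw (possibly gzipped) bytes, before any transformation
    }

    // Call this only when no content is available and we are about to block or go async.
    void checkBeforeBlocking()
    {
        long elapsed = System.nanoTime() - startNanos;
        if (elapsed < 1_000_000_000L)
            return; // protect the very first reads: short intervals are too noisy
        double rate = totalRawBytes * 1_000_000_000.0 / elapsed;
        if (rate < minBytesPerSecond)
            throw new IllegalStateException("Request data rate too low: " + rate + " bytes/s");
    }
}
```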
@gregw I don't see the problem.
I don't think reads take time if there is content, no matter how large it is. Checking for the rate should be done in …
Also, if we have gzipped content, what do we measure as data rate? The gzipped content, or the ungzipped content?
@sbordet Consider the following:
With the current algorithm, on the first read parseAndFill will add the 1-byte chunk, we divide 1 byte by the 10s block, and the data rate check fails. With my proposal the check only happens before we block, so this case would pass.
With regard to gzip/transformations, I think we should check raw bytes. So I propose that we check the data rate as raw_bytes/total_time only before we block or schedule an async callback.
@gregw I don't understand. If an application initiates a read where it blocks for 10s and after 10s it receives 1 byte only, that's exactly the case where we want the min data rate to apply, as it is evidently an attack.
@sbordet but it didn't receive only 1 byte. The client sent 1001 bytes plus chunking framing. It is just an artefact of the transport framing and of our implementation (which does not aggregate chunks) that we see those 1001 bytes sent over 10 seconds as 1 byte after 10 seconds and 1000 bytes instantly after that.
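Spelling out the arithmetic for this example (the byte counts and the 10-second figure are from the comments above; the 100 bytes/s minimum is an assumed value for illustration only):

```java
// Worked numbers for the example above (a 1-byte chunk followed almost immediately by a
// 1000-byte chunk, seen by the server after a ~10s block), assuming a hypothetical
// minimum data rate of 100 bytes/s.
public class DataRateExample
{
    public static void main(String[] args)
    {
        double minRate = 100.0;

        // Per-read check at the first addContent: only 1 byte is visible after the 10s block.
        double perReadRate = 1 / 10.0;              // 0.1 bytes/s
        System.out.println(perReadRate < minRate);  // true -> the check falsely fails

        // Check before blocking, over total raw bytes and total time:
        // the client actually sent 1001 bytes (plus chunk framing) in ~10 seconds.
        double totalRate = 1001 / 10.0;             // ~100.1 bytes/s
        System.out.println(totalRate < minRate);    // false -> the check passes
    }
}
```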
Also note that we need to protect the initial read from short measurement times, as the rate will be very chaotic.
@gregw let me see if I understand; this is what you're proposing:
We never check when we have data to return to the application. We may need to reset the time in case of Expect: 100-Continue when the application calls … If the application goes into a long pause (e.g. GC), when it comes back it will see data to read, and we don't do the check as long as there is data to read.

@gregw however, if the network buffers are small, we may have the client blocked, and when the server resumes, the server may consume all the content before a TCP (or HTTP/2) message tells the client to send more content.
@sbordet That is indeed a problem, but I think it is a small one. Essentially, if the application is not reading fast enough, so the pipe congests and the connection doesn't achieve its data rate target, then it hasn't achieved its data rate target. Now this could be because of an evil client... or simply because the app was slow, so the TCP/IP and/or h2 flow control windows were never grown (or were shrunk) and ended up too small to support the required data rate. But I'm not sure there is anything we can do to prevent that, because we can't tell the difference between an evil client and a small flow control window. Something that helps is that if the application is slow on read N, then it will often also be slow on read N+1, so there will be time for more data to arrive.
I still don't like this. It's a mechanism to protect the server against bad clients, and basing the algorithm on what the server application does feels incorrect. Having said that, I'm not sure we can do better than what is proposed here. Perhaps we can adjust for application idleness? I.e. rather than counting time …
I believe I convinced Simone on a hangout of the approach I'm suggesting (checking total_data/total_time only when about to block), but for completeness I'll respond to his comment above. This mechanism is to protect against a bad client, but that doesn't mean we have to check before every read. The badness we are protecting against is a deliberately slow client blocking lots of application threads; thus the badness is the blocking, and we only need to check if the client is bad before doing the blocking. Using total_data/total_time means that we measure the achieved data rate, which may be affected by a slow server, but if the client is faster than the server then the server will never block and the check will never be made. The chance for false positives exists only if the server is bursty: running slow for a while so that the effective data rate becomes very low, and then running fast enough to exhaust all the buffered data. If that does become a problem, then perhaps we need to check some kind of sliding average rather than the total, but I'm dubious it will be a significant problem (above the problem the server will be having anyway if it regularly runs slowly enough to cause this kind of behaviour).
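If a sliding average were ever needed, it could look roughly like this (purely illustrative; the window size and names are assumptions, not anything that exists in Jetty):

```java
// Hypothetical sketch of a sliding-window data rate, as an alternative to
// total_data/total_time if bursty servers cause false positives.
import java.util.ArrayDeque;
import java.util.Deque;

class SlidingWindowRate
{
    private static final long WINDOW_NANOS = 30_000_000_000L; // 30s window (arbitrary choice)
    private final Deque<long[]> samples = new ArrayDeque<>(); // {timestampNanos, bytes}
    private long bytesInWindow;

    void record(long bytes)
    {
        long now = System.nanoTime();
        samples.addLast(new long[]{now, bytes});
        bytesInWindow += bytes;
        // Drop samples that have fallen out of the window.
        while (!samples.isEmpty() && now - samples.peekFirst()[0] > WINDOW_NANOS)
            bytesInWindow -= samples.removeFirst()[1];
    }

    double bytesPerSecond()
    {
        if (samples.isEmpty())
            return 0.0;
        long elapsed = Math.max(1, System.nanoTime() - samples.peekFirst()[0]);
        return bytesInWindow * 1_000_000_000.0 / elapsed;
    }
}
```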
Changing the HttpConfiguration.min[Request|Response]DataRate value from 0 to non-zero causes many tests to fail, some of them unexpectedly.

The current behavior for reads differs between HTTP/1.1 and HTTP/2, as HTTP/1.1 calls HttpInput.addContent() from read(...), while HTTP/2 calls it from parsing (and the read() may happen way later).

The current behavior for writes is that the content production must be greater than the min rate. An application that writes small chunks of content separated by pauses (e.g. GC, sleeps or DB queries) risks not producing content fast enough (Ws=Write start, We=Write end): now-t0 may be large (and trip the min threshold) because the application was stalled, not because the writes were stalled (in fact, the writes were fast).

An alternative approach for writes is to just measure how long the writes take (from write begin to write end), and how many bytes they write. This measures whether the writes themselves are stalled, not content production, which may be better with respect to protecting the server from slow clients.
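A minimal sketch of that alternative for writes (hypothetical names, not Jetty's HttpOutput API), mirroring the read-side idea of only counting the time between write begin and write end:

```java
// Hypothetical sketch: measure the write data rate over the time spent inside writes,
// so that application pauses between writes (GC, sleeps, DB queries) do not count.
class WriteRateTracker
{
    private long bytesWritten;
    private long nanosSpentWriting;

    void write(java.io.OutputStream out, byte[] buffer) throws java.io.IOException
    {
        long begin = System.nanoTime();
        out.write(buffer); // only the write itself (including any blocking on a slow client) is timed
        nanosSpentWriting += System.nanoTime() - begin;
        bytesWritten += buffer.length;
    }

    double bytesPerSecond()
    {
        return nanosSpentWriting == 0 ? Double.MAX_VALUE : bytesWritten * 1_000_000_000.0 / nanosSpentWriting;
    }
}
```

A genuinely slow client eventually makes out.write(...) block once the connection buffers fill, so nanosSpentWriting grows while bytesWritten stays flat, which is exactly the stall this measurement is meant to catch.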