Use from_reader_eager instead of from_reader when deserializing Json #547
Conversation
contrib/src/json.rs
Outdated
    .take(size_limit)
    .read_to_string(&mut buf)
    .map_err(|e| { error_!("IO Error: {:?}", e); e })
    .unwrap();
I am not sure how to deal with this unwrap().
Unfortunately it doesn't look like Serde's error type can be created from an io::Error, so we'll have to change the error type here so that it returns an io::Error instead. We can then use impl From<serde::Error> for io::Error. Alternatively, @dtolnay: would it be possible to add a From<io::Error> for serde_json::error::Error implementation?
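
For reference, a minimal sketch of the wrapping direction described above, assuming no dedicated From impl is available (the helper name is hypothetical; it simply reuses io::Error::new with InvalidData):

use std::io;

// Hypothetical helper: wrap a serde_json parse error in an io::Error so a
// function whose error type is io::Error can report both I/O and JSON failures.
fn json_to_io_error(e: serde_json::Error) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, e)
}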
I would prefer not to have From<io::Error> for serde_json::Error. JSON errors are for Serde to create.
Switched to io::Error
@messense what library are you using to profile?
@Dowwie Relevant post: http://carol-nichols.com/2015/12/09/rust-profiling-on-osx-cpu-time/
Just did a profile on using BufReader:

diff --git a/contrib/src/json.rs b/contrib/src/json.rs
index a461127..ff2e441 100644
--- a/contrib/src/json.rs
+++ b/contrib/src/json.rs
@@ -1,5 +1,5 @@
 use std::ops::{Deref, DerefMut};
-use std::io::Read;
+use std::io::{Read, BufReader};
 
 use rocket::outcome::{Outcome, IntoOutcome};
 use rocket::request::Request;
@@ -98,7 +98,8 @@ impl<T: DeserializeOwned> FromData for Json<T> {
         }
 
         let size_limit = request.limits().get("json").unwrap_or(LIMIT);
-        serde_json::from_reader(data.open().take(size_limit))
+        let reader = BufReader::new(data.open().take(size_limit));
+        serde_json::from_reader(reader)
             .map(|val| Json(val))
             .map_err(|e| { error_!("Couldn't parse JSON body: {:?}", e); e })
             .into_outcome(Status::BadRequest)

Much slower than
That is...surprising. Have you investigated why
We optimize
Force-pushed from 89c83ae to 203a26e
I filed serde-rs/json#407 to support this in serde_json. Let me know if anyone has API suggestions.
contrib/src/json.rs
Outdated
         if !request.content_type().map_or(false, |ct| ct.is_json()) {
             error_!("Content-Type is not JSON.");
             return Outcome::Forward(data);
         }
 
         let size_limit = request.limits().get("json").unwrap_or(LIMIT);
-        serde_json::from_reader(data.open().take(size_limit))
+        let mut buf = String::new();
Could you use with_capacity here with size_limit to optimize this a little more?
We don't want to over-allocate memory. Perhaps we could choose a safe lower bound: something like 1k or 4k.
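
A sketch of what that could look like (the 4096-byte floor is an assumed placeholder, not a measured value):

// Sketch: pre-allocate the buffer, but cap the initial capacity at a small
// constant so a large configured data limit doesn't over-allocate up front.
let capacity = std::cmp::min(size_limit, 4096) as usize;
let mut buf = String::with_capacity(capacity);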
@messense Let's copy the code from serde-rs/json#407 (https://docs.rs/eager_json/0.1.0/src/eager_json/lib.rs.html#13-24) into Rocket. Then we could do this correctly. I'd be fine with using
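
The linked helper amounts to buffering the whole input and then using the string-based parser. A rough, self-contained sketch of that pattern (not the verbatim eager_json source; the io::Error wrapping of the parse error is an assumption):

use std::io::{self, Read};

use serde::de::DeserializeOwned;

// Read everything from the reader first, then deserialize from the in-memory
// string so serde_json can use its faster from_str path.
fn from_reader_eager<R: Read, T: DeserializeOwned>(mut reader: R) -> io::Result<T> {
    let mut buf = String::new();
    reader.read_to_string(&mut buf)?;
    serde_json::from_str(&buf)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}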
`serde_json::from_reader` is considerably slow for some reason. serde-rs/json#160
Force-pushed from 203a26e to 4aed87b
Issue #547 identified a performance issue when serde's 'from_reader' is used to deserialize incoming data. Using 'from_str' resolves the issue. This commit swaps a use of 'from_reader' in favor of 'from_str' in rocket_contrib's 'Json' implementation. Additionally, this commit ensures that un-deserialized JSON data is discarded as long as it is within the JSON data limit. Closes #562.
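
For illustration, the swapped body could look roughly like this: a sketch against the 0.3-era API shown earlier in this thread, not the exact committed code; the io::Error wrapping of the parse error is an assumption, and the step that discards leftover data within the limit is omitted.

        // Sketch: read at most `size_limit` bytes eagerly, then parse with from_str.
        let size_limit = request.limits().get("json").unwrap_or(LIMIT);
        let mut buf = String::new();
        if let Err(e) = data.open().take(size_limit).read_to_string(&mut buf) {
            error_!("IO Error: {:?}", e);
            return Outcome::Failure((Status::BadRequest, e));
        }

        serde_json::from_str(&buf)
            .map(|val| Json(val))
            .map_err(|e| {
                error_!("Couldn't parse JSON body: {:?}", e);
                io::Error::new(io::ErrorKind::InvalidData, e)
            })
            .into_outcome(Status::BadRequest)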
I will measure the performance implications with different lower bounds when I have the time.
Would be great to have this patch backported to the 0.3.x version!
Regarding the buffer capacity question, I just did a simple instrument. I am using the following Python script to send some requests to the server:

# -*- coding: utf-8 -*-
from concurrent.futures import ThreadPoolExecutor

from requests import Session

session = Session()


def post_json(index):
    return session.post(
        'http://localhost:8000/message/{}'.format(index),
        json={
            'contents': 'Hello World!' * 1000
        }
    )


def bench():
    with ThreadPoolExecutor() as executor:
        fs = [executor.submit(post_json, i) for i in range(10000)]
        _rs = [r.result() for r in fs]


if __name__ == '__main__':
    bench()

Results were gathered for String::new(), String::with_capacity(1024), and String::with_capacity(4096).

Though I am not sure this is the correct way to measure its performance. To me it seems that use