ureq 3.x #762

algesten · 2024-07-03T12:54:34Z

This PR/branch is a complete, from the ground up rewrite of ureq. For Rust there clearly is a role for a sync, minimal dependency HTTP/1.1 client. The ureq 2.x code has gradually been iterated to its current state, but some early design decisions means we are struggling to make some changes we would like to have (such as being transport agnostic). While ureq's surface API is largely okay (with some warts), internally there is a lack of good separation of concerns and clear contracts between the parts of the library.

This PR starts from a complete clean slate, but the intention is to gradually bring back code from the 2.x branch where it makes sense to do so. Especially I'd like to bring back the majority of the tests to ensure backwards compatibility. The intention is also to preserve as much of the API surface as possible, only making changes in areas where there currently are clear problems (such as making unnecessary allocations). The intention is to also have a 1-1 feature parity with 2.x and the MSRV will not move much.

This PR/effort might not work out. I might discover things that are irreconcilable with the current ureq 2.x, or I might run out of steam to complete the project. There is also no timeline for completion, might take months, might take years. Ideally there are more people wanting to engage in the project – please talk to me here or send an email!

Here are some things I been thinking about fixing:

Separation between HTTP/1.1 logic and transport – this effort has started in hoot – a project which aims to provide a Sans-IO HTTP/1.1 protocol underpinning.
Bring your own transport. The transport should be completely plugin based and possible for the user to write their own.
http crate. While I will try to retain a lot of ureq's API, adhering to a common Rust HTTP API is ultimately a better idea.
Less allocations – less Strings, Vec. Try to work more with borrows/lifetimes.
Potentially webasm. As a consequence of making the transport possible to swap out, we can maybe make ureq work in a webasm context (needs more thought and input!)
Improve middleware support. The current middleware chains are OK, but we want them to be even better.
Improve the Error type and error-handling.

Please engage in discussion below if there are things I have omitted or ideas of where ureq could go!

algesten · 2024-07-06T10:33:03Z

For a while I was considering the main API entry point would be taking the request http::Request<Body> by reference, i.e. the most central signature would be something like:

pub fn run(request: &Request<impl IntoBody>) -> Result<Response<RecvBody>, Error> { ... }

However &Request precludes sending any body would require &mut, such as those implementing std::io::Read (File comes to mind). Therefore I'm changing to not taking by ref. The internals of hoot still runs by ref, which means we should be able to avoid ever needing to clone it.

pub fn run(request: Request<impl IntoBody>) -> Result<Response<RecvBody>, Error> { ... }

amkillam · 2024-07-06T16:24:57Z

Just wanted to comment to say that I am very excited about the potential to use ureq with WASM! I managed to make ureq 2.x compile for WASM with some conditional compilation of rustls things, but am currently facing implementing further conditional compilation of TCPStream and some other things to make requests execute successfully. This rewrite was a great idea and will be really useful for my projects.

algesten · 2024-07-06T16:36:32Z

@amkillam I haven't done any WASM myself, so I don't know what do's and dont's there are. I'm not sure I can go full no_std, because living without Box<dyn X> could prove a challenge.

I just tried to compile this branch with cargo build --target wasm32-unknown-unknown – and that's fine. Does that mean it's ok?

If I add that to the CI, are there more lints/things to check for?

amkillam · 2024-07-06T18:01:47Z

If compilation works then it should be most of the way there. Could you provide an example of how ureq usage would change in 3.x? I can try to make a working example of ureq usage in WASM based on that, and share my results after.

amkillam · 2024-07-06T18:03:47Z

If compilation works then it should be most of the way there. Could you provide an example of how ureq usage would change in 3.x? I can try to make a working example of ureq usage in WASM based on that, and share my results after.

Also I'm not totally sure about tests specific to WASM yet, but I think once I've (hopefully) created a working example of usage in WASM, I'll be in a better position to comment or maybe contribute if you'd like!

algesten · 2024-07-06T19:09:45Z

If compilation works then it should be most of the way there. Could you provide an example of how ureq usage would change in 3.x? I can try to make a working example of ureq usage in WASM based on that, and share my results after.

Also I'm not totally sure about tests specific to WASM yet, but I think once I've (hopefully) created a working example of usage in WASM, I'll be in a better position to comment or maybe contribute if you'd like!

I will pick you up on that, but it's a bit too early still. I can't even make ureq 3.x do a regular request just yet. Hopefully in a couple of days.

rmg-x · 2024-07-08T13:25:35Z

and the MSRV will stay the same.

I really really think the MSRV policy should at least be relaxed and communicated clearly with a major version bump. My personal opinion is that users shouldn't be relying on this in the first place but such is life.

I'm not saying we should bump MSRV anytime a new Rust version is released but allow for some modernization when it makes sense.
E.g. #725

algesten · 2024-07-08T14:52:38Z

and the MSRV will stay the same.

I really really think the MSRV policy should at least be relaxed and communicated clearly with a major version bump. My personal opinion is that users shouldn't be relying on this in the first place but such is life.

I'm not saying we should bump MSRV anytime a new Rust version is released but allow for some modernization when it makes sense. E.g. #725

The problem is developer environments installed via Linux (or similar) package managers instead of rustup. They tend to have a much slower upgrade cadence. I want ureq to be available to as many users as possible, and I personally don't think the code get that much worse by not having "modern" constructs like let-else.

You're right about once cell – there we can avoid a dependency. So far I've managed to avoid using it, we'll see how that goes when I get more into the tests. I've been forced to bump MSRV to 1.63 because rustls bumped, so right now I'm leaning towards 1.63.

But nothing is set in stone.

rmg-x · 2024-07-08T15:09:15Z

installed via Linux (or similar) package managers instead of rustup. They tend to have a much slower upgrade cadence

That's valid. Just as an example, the creator of once_cell supports at least the previous eight Rust versions which is roughly ~1 year behind current. To me, that seems reasonable.

Also as an aside, compatibility for crates and Rust versions is tracked here if you are interested: https://lib.rs/stats#rustc

algesten · 2024-07-09T07:16:02Z

                         │                                                  
                                                                            
                         │                                                  
                                        ┌─────────┐                         
                         │              │         │                         
                                        │ CONNECT │                         
                         │              │         │                         
                                        └─────────┘                         
            ┌─────────┐  │  ┌─────────┐ ┌─────────┐             ┌─────────┐ 
            │         │     │         │ │         │             │         │ 
            │   TLS   │  │  │ CONNECT │ │   TLS   │             │   TLS   │ 
            │         │     │         │ │         │             │         │ 
            └─────────┘  │  └─────────┘ └─────────┘             └─────────┘ 
┌─────────┐ ┌─────────┐     ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ 
│         │ │         │  │  │         │ │         │ │         │ │         │ 
│TcpStream│ │TcpStream│     │TcpStream│ │TcpStream│ │  SOCKS  │ │  SOCKS  │ 
│         │ │         │  │  │         │ │         │ │         │ │         │ 
└─────────┘ └─────────┘     └─────────┘ └─────────┘ └─────────┘ └─────────┘ 
                         │     Proxy       Proxy       Proxy       Proxy    
    HTTP       HTTPS          CONNECT     CONNECT      SOCKS       SOCKS    
                         │      HTTP       HTTPS        HTTP       HTTPS    
                                                                            
                         │

Actually. I need TLS also for SOCKS.

algesten · 2024-07-09T13:01:53Z

I've been debating in my head how to think about Send/Sync. ureq 2.x is very liberal, Agent, Request and Response are all thread safe.

In ureq 3.x, I gone from thinking "let's just stay on the thread" (i.e. nothing is Send/Sync), to thinking, that maybe some parts should be. One core idea is to avoid allocations and thus not require ownership of any request body – that can be orthogonal with being Send/Sync. However I think we can get around it.

This is how the types map from ureq 2.x -> 3.x.

ureq2::Agent -> ureq3::Agent
ureq2::Request -> ureq3::RequestBuilder
ureq2::Response -> http::Response<RecvBody> (naming TBD)

ureq 3.x deliberately drops the type names Request and Response, leaving them to the http-crate. Thus http::Request becomes a central type giving us a new method on Agent that is:

    pub fn run(&self, request: http::Request<impl AsBody>) -> Result<http::Response<RecvBody>, Error> {

This lives side by side with the existing ureq2 API. These are equivalent:

New http-crate centric API:

   let data = vec![0_u8, 1, 2, 3, 4];
   let request = http::Request::post("https://yaz").body(&data).unwrap();
   let response = ureq::run(request).unwrap();

Old ureq2 API. Notice that ureq::post(...) produces a RequestBuilder, and we will make sure this is Send/Sync since that makes it possible to partially constructs requests with both 2 and 3 API.

   let data = vec![0_u8, 1, 2, 3, 4];
   let response = ureq::post("https://yaz").send_bytes(&data).unwrap();

impl AsBody above is so far implemented for: &[u8], &str, String, Vec<u8>, &String, &Vec<u8>, &File, &TcpStream, &Stdin, File, TcpStream, Stdin, RecvBody and Response<RecvBody>. Those are not all Send/Sync. However ureq only needs to know about them when it's time to send a request, which means the type Body<'a> – which ISN'T Send/Sync – only exists while calling Agent::run or RequestBuilder::send_bytes. Once those functions return a response, the Body<'a> is not used anymore, meaning ureq doesn't actually enforce any constraints on the user wrt Send/Sync.

algesten · 2024-07-10T07:57:29Z

Considering building out a TLS API in ureq3.x to not be in the situation we have in #765

Instead of exposing/re-exporting rustls config (or native-tls), we would abstract over both. The initial abstraction would need support a reasonable amount of minimal config:

client certs
custom roots
sni enable/disable
TLS12/13
disable certificate verifier (maybe under a dangerous feature flag)

Even if we default to rustls, native-tls might be a better starting point for looking at which config options to include, since rustls can do more than native-tls.

https://docs.rs/native-tls/0.2.12/native_tls/struct.TlsConnectorBuilder.html
https://docs.rs/rustls/0.23.11/rustls/struct.ConfigBuilder.html

algesten · 2024-07-11T08:15:22Z

Got TLS working with both rustls and native-tls. For now I decided to not use the webpki-roots crate instead opting for using rustls-native-certs for both rustls and native-tls. webpki-roots is structured such that it's only compatible with rustls, thus it breaks symmetry to use it.

Question is how many ureq users want the webpki-roots instead?

algesten · 2024-07-11T12:52:24Z

@amkillam i think we're at a place where we can try webasm now.

What I'm not clear on is whether webasm can give me a TCP socket style interface and if TLS should be in-or-outside the scope ureq. Let me explain what we got and see if this is a good starting point for tinkering.

Agent::new constructor takes a config, a connector and a resolver. https://github.com/algesten/ureq/blob/feature/ureq3/src/agent.rs#L148-L161 . The config is for stuff like timeouts and TLS. The connector is a composable thing that provides a transport per request (which I get back to below), and the resolver is something that translates hostnames to SocketAddr.

The DefaultConnector has a chain of subconnectors that are run one after another. Each can chose to pass-through or build on top of the previous Transport. https://github.com/algesten/ureq/blob/feature/ureq3/src/transport/mod.rs#L72-L93
Simplified we get a chain of connectors: SOCKS -> TcpStream -> Rustls. If we don't use SOCKS, we get just TcpStream -> Rustls and the scheme, https decides whether there Rustls connector is used.

Connector and Transport are public traits that we can implement to provide some completely different transport. The Transport trait https://github.com/algesten/ureq/blob/feature/ureq3/src/transport/mod.rs#L55-L63 is quite different to io::Read/io::Write – rather working on shared buffers. This is because I have io_uring in mind for the future.

TcpTransport is maybe the cleanest example: https://github.com/algesten/ureq/blob/feature/ureq3/src/transport/tcp.rs#L78

Let's take a concrete example. Let's say webasm transport does not need an explicit resolving, instead you give it some combo (hostname, port, is_tls) – then we can write a dummy Resolver that does nothing and a corresponding Transport opening the connection.

If I remove all default features, it compiles on wasm32-unknown-unknown. If we need to bring in some feature like rustls, we can maybe make it compole through feature flags – needs investigating.

cpu · 2024-07-11T14:36:14Z

For now I decided to not use the webpki-roots crate instead opting for using rustls-native-certs for both rustls and native-tls. webpki-roots is structured such that it's only compatible with rustls, thus it breaks symmetry to use it.

For unrelated reasons we've been thinking about extending webpki-roots with an optional feature that includes the DER of the self-signed root certificates in addition to the more minimal (and webpki crate focused) trust anchor representation. I think if that were to happen you could use webpki-roots for both TLS backends similar to rustls-native-certs. The main downside (and why it would be opt-in) is the increase in binary size from all the extra (and for rustls, superfluous) data.

If you were open to breaking symmetry between backends the rustls-platform-verifier crate is often the best choice for rustls in particular. The README table breaks it down, but it works like a better rustls-native-certs on most platforms and degrades to rustls-native-certs and webpki-roots as appropriate for other platforms (like Linux/BSD/WASM).

What I'm not clear on is whether webasm can give me a TCP socket style interface and if TLS should be in-or-outside the scope ureq.

I'm very out of the loop with wasm, but have heard folks say they've gotten rustls working there. Certainly you need webpki-roots in this context because there is no platform trust store to load certificates from in that environment (e.g. rustls-native-certs would never be supported in WASM AIUI).

In case it helps colour your decision for support rustls is interested in adding first-class wasm support but it hasn't been done yet and isn't something tested in CI. I've heard anecdotally it works, but it could break :-)

algesten · 2024-07-11T16:30:05Z

For unrelated reasons we've been thinking about extending webpki-roots with an optional feature that includes the DER of the self-signed root certificates in addition to the more minimal (and webpki crate focused) trust anchor representation. I think if that were to happen you could use webpki-roots for both TLS backends similar to rustls-native-certs. The main downside (and why it would be opt-in) is the increase in binary size from all the extra (and for rustls, superfluous) data.

That would be a most welcome feature indeed. In a previous life I hugely preferred webpki-roots to the platform, since enterprise sec and others want to MITM by installing their own roots. That's why ureq2.x used the webpki-roots as primary. However I'm older and wiser now, and maybe it's more pragmatic to just go with the local root store as the default choice.

If you were open to breaking symmetry between backends the rustls-platform-verifier crate is often the best choice for rustls in particular. The README table breaks it down, but it works like a better rustls-native-certs on most platforms and degrades to rustls-native-certs and webpki-roots as appropriate for other platforms (like Linux/BSD/WASM).

Looks great! Revocations support is nice, and using the OS-default verifiers probably means even better protection in case there are 0-day updates etc. I think it is worth breaking symmetry for, yes.

I'm very out of the loop with wasm, but have heard folks say they've gotten rustls working there. Certainly you need webpki-roots in this context because there is no platform trust store to load certificates from in that environment (e.g. rustls-native-certs would never be supported in WASM AIUI).

In case it helps colour your decision for support rustls is interested in adding first-class wasm support but it hasn't been done yet and isn't something tested in CI. I've heard anecdotally it works, but it could break :-)

Interesting. I was hoping that by making the transport truly plug-in in ureq3.x this would open for these possibilities. But I'm not sure I'll tinker with it much myself until I have a real need.

algesten · 2024-07-13T13:36:41Z

With that last push, ureq 3.x is now feature complete in terms of feature flags.

There are a couple of TODOs I want to deal with and I need to glance through the 2.x code to make sure there aren't functionality areas that I've forgotten about.

Next up is porting all the tests over.

cpu · 2024-08-07T15:37:27Z

For unrelated reasons we've been thinking about extending webpki-roots with an optional feature that includes the DER of the self-signed root certificates in addition to the more minimal (and webpki crate focused) trust anchor representation.

That would be a most welcome feature indeed.

In case it ends up being helpful I wanted to mention I finished implementing this. It ended up being a separate crate instead of an opt-in feature: https://crates.io/crates/webpki-root-certs

algesten · 2024-08-13T17:58:38Z

Moving this to the main branch with individual PRs now.

algesten force-pushed the feature/ureq3 branch 3 times, most recently from a4233f3 to 1b9cb3b Compare July 6, 2024 10:29

algesten force-pushed the feature/ureq3 branch from 0a9172b to 3dc865e Compare July 8, 2024 07:07

algesten mentioned this pull request Jul 8, 2024

rustls upgrade in 2.10 is a breaking change? #765

Open

This comment was marked as outdated.

Sign in to view

algesten force-pushed the feature/ureq3 branch 2 times, most recently from f365d90 to 965fdff Compare July 11, 2024 08:13

algesten force-pushed the feature/ureq3 branch 2 times, most recently from b55ed8f to 6fc839c Compare July 12, 2024 17:14

algesten force-pushed the feature/ureq3 branch from 657b41c to db1e1aa Compare July 27, 2024 20:40

cpu mentioned this pull request Aug 5, 2024

Add webpki-root-certs crate rustls/webpki-roots#75

Merged

algesten and others added 26 commits August 13, 2024 18:59

Fix redact user-agent

a90a1c6

Lossy reader

3c7ef60

Fix broken compilation

f0fe113

Add missing doc

a6f436b

Clarify error for gzip/brotli decompression

0d0835d

BodyReaderConfig and rework body API

244efd2

Start of lib.rs main doc

cc07ade

json feature

e8b16f6

Improve doc

37ae0c9

Rejig the body API (again)

e1dee1c

Remove meaningless conf

9e80d9b

Fix broken test

b5d355a

Document error handling

825319f

Fix tests

69e021e

Document features

a809281

Document JSON, sending body data, charset and proxying

05398e9

RequestBuilder::content_type helper function

b571f8f

Fix panic with chunked 0 size bodies

f3737c2

Fix panic in cookie url parsing

d78c14c

Test HEAD request

5600ec0

Log incorrect Unit::State name

f2867f1

Clippy fixes

e86fc89

Reworks native-roots, use rustls-platform-verifier

72bbc29

RootCerts::WebPki config feature

08287de

CONNECT proxy support

3dbd8c8

Use process-wide default rustls backend if set

01ee307

algesten force-pushed the feature/ureq3 branch from 5f2a503 to 01ee307 Compare August 13, 2024 16:59

algesten merged commit 0e3cbb9 into main Aug 13, 2024
48 checks passed

algesten deleted the feature/ureq3 branch August 13, 2024 17:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ureq 3.x #762

ureq 3.x #762

algesten commented Jul 3, 2024 •

edited

Loading

algesten commented Jul 6, 2024

amkillam commented Jul 6, 2024

algesten commented Jul 6, 2024 •

edited

Loading

amkillam commented Jul 6, 2024

amkillam commented Jul 6, 2024

algesten commented Jul 6, 2024

rmg-x commented Jul 8, 2024

algesten commented Jul 8, 2024

rmg-x commented Jul 8, 2024 •

edited

Loading

This comment was marked as outdated.

algesten commented Jul 9, 2024

algesten commented Jul 9, 2024

algesten commented Jul 10, 2024

algesten commented Jul 11, 2024

algesten commented Jul 11, 2024

cpu commented Jul 11, 2024 •

edited

Loading

algesten commented Jul 11, 2024

algesten commented Jul 13, 2024

cpu commented Aug 7, 2024

algesten commented Aug 13, 2024

ureq 3.x #762

ureq 3.x #762

Conversation

algesten commented Jul 3, 2024 • edited Loading

algesten commented Jul 6, 2024

amkillam commented Jul 6, 2024

algesten commented Jul 6, 2024 • edited Loading

amkillam commented Jul 6, 2024

amkillam commented Jul 6, 2024

algesten commented Jul 6, 2024

rmg-x commented Jul 8, 2024

algesten commented Jul 8, 2024

rmg-x commented Jul 8, 2024 • edited Loading

This comment was marked as outdated.

algesten commented Jul 9, 2024

algesten commented Jul 9, 2024

algesten commented Jul 10, 2024

algesten commented Jul 11, 2024

algesten commented Jul 11, 2024

cpu commented Jul 11, 2024 • edited Loading

algesten commented Jul 11, 2024

algesten commented Jul 13, 2024

cpu commented Aug 7, 2024

algesten commented Aug 13, 2024

algesten commented Jul 3, 2024 •

edited

Loading

algesten commented Jul 6, 2024 •

edited

Loading

rmg-x commented Jul 8, 2024 •

edited

Loading

cpu commented Jul 11, 2024 •

edited

Loading