Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Movable HTTPClient and fixing WiFiClient copy #8237

Merged
merged 10 commits into from
Oct 13, 2021

Conversation

mcspr
Copy link
Collaborator

@mcspr mcspr commented Jul 24, 2021

  • =default for default ctor, destructor, move ctor and the assignment move
  • use std::unique_ptr instead of raw pointer
  • implement clone() for the WiFiClient so it's safe to copy the WiFiClientSecure, and not accidentally call the WiFiClient copy instead
  • replace headers pointer array with unique_ptr<T[]> to simplify the move operations
  • substitute userAgent with the default one when it is empty

Allow HTTPClient to be placed inside of movable classes (e.g. std::optional) or to be returned from functions. Class logic stays as-is, only the underlying member types are slightly changed.

replaces #8236
resolves #8231
@paulocsanz

edit:
also should resolve #5734 with the WiFiClient changes

- =default for default ctor, move ctor and the assignment move
- proxy wificlient through a helper class that handles the pointer
- replace headers pointer array with unique_ptr<T[]> to simplify the move operations
- substitue userAgent with the default one when it is empty
@paulocsanz
Copy link
Contributor

This seems the perfect solution!

@paulocsanz
Copy link
Contributor

paulocsanz commented Jul 24, 2021

There is a tiny breaking change, that setting userAgent to an empty string doesn't work anymore, the only way to ensure seems std::optional. But it is overkill and I don't even know if a empty userAgent is supported in HTTP spec.

@mcspr
Copy link
Collaborator Author

mcspr commented Jul 24, 2021

There is a tiny breaking change, that setting userAgent to an empty string doesn't work anymore, the only way to ensure seems std::optional. But it is overkill and I don't even know if a empty userAgent is supported in HTTP spec.

Technically... it is, but depends on the client / server reading those. Replaced with a static String that is default value for userAgent.

@paulocsanz
Copy link
Contributor

paulocsanz commented Jul 24, 2021

Yeah it's not a big deal, I don't feel confident enough to give a position on the tradeoff empty user agent after move vs changing empty user agents for the default. Both seem to have issues when taking into account backwards compatibility. But technically move operations allow that in cpp, it's a valid but indeterminate value. I don't like taking decisions only on the "tecnically" side of things. But it's good to have that on the table before the "last word" on the subject.

Again, I'm not really a person with reputation in the community and knowledge on the subject to give a strong position, it was just my 2 cents.

But as the person that created the issue, this solves all of my problems.

@mcspr
Copy link
Collaborator Author

mcspr commented Jul 25, 2021

I did state that the code did not change the (observable) behaviour, and it did. Note the line with String::reserve above that uses userAgent length. Plus, setting userAgent to something with length, then resetting back to empty user agent set it to default string and not the empty one as before.

"technically" referred to the http rfc(s) and some discussions I found related to the question you raised - could user-agent be empty, or is it allowed for the header field-value to be empty in browsers, http servers, standalone http clients, etc.

@paulocsanz
Copy link
Contributor

paulocsanz commented Jul 25, 2021

In the worst case, wrapping _userAgent with a class that sets to default when moved out would solve it. If we are trying to avoid depending on manually writing the HTTPClient move operations to make extensibility easier. But maybe we are creating so many independent changes to have something that ends up being equally complex in the long run.

But the boilerplate here could be generalized for all libraries in the ecosystem to use. This NonOwningPtr<T> certainly is frequently needed, and a DefaultOnMove<T, const T value> (or getting a function pointer to generate T on construction) could be used here too and generalized. My dream would also be a internal class that is basically a std::variant<String, __FlashStringHelper*> with String API so we can avoid needless "persistent" allocations where we know the value at compile time. And the usage of StringView where String API is needed instead of forcing an allocation. But I digress.

This PR seems super OK in the current state, I'm just nitpicking.

@mcspr
Copy link
Collaborator Author

mcspr commented Jul 27, 2021

Once again with the user-agent, this may be slightly safer approach

diff --git a/libraries/ESP8266HTTPClient/src/ESP8266HTTPClient.h b/libraries/ESP8266HTTPClient/src/ESP8266HTTPClient.h
index 3f6ea010..b8885176 100644
--- a/libraries/ESP8266HTTPClient/src/ESP8266HTTPClient.h
+++ b/libraries/ESP8266HTTPClient/src/ESP8266HTTPClient.h
@@ -284,6 +284,35 @@ protected:
         WiFiClient* _ptr = nullptr;
     };

+    struct UserAgent {
+        UserAgent() = delete;
+        explicit UserAgent(const char* ptr, size_t length) :
+            _ptr(ptr),
+            _length(length)
+        {}
+
+        template <typename T>
+        UserAgent& operator=(T&& value) {
+            _string = std::forward<T>(value);
+            _ptr = _string.c_str();
+            _length = _string.length();
+            return *this;
+        }
+
+        size_t length() const {
+            return _length;
+        }
+
+        const char* c_str() const {
+            return _ptr;
+        }
+
+    private:
+        String _string;
+        const char* _ptr;
+        size_t _length;
+    };
+
     bool beginInternal(const String& url, const char* expectedProtocol);
     void disconnect(bool preserveClient = false);
     void clear();
@@ -307,8 +336,9 @@ protected:
     String _headers;
     String _base64Authorization;

-    static const String defaultUserAgent;
-    String _userAgent = defaultUserAgent;
+    static const char defaultUserAgent[];
+    static const size_t defaultUserAgentLength;
+    UserAgent _userAgent { defaultUserAgent, defaultUserAgentLength };

     /// Response handling
     std::unique_ptr<RequestArgument[]> _currentHeaders;

And setup like this

diff --git a/libraries/ESP8266HTTPClient/src/ESP8266HTTPClient.cpp b/libraries/ESP8266HTTPClient/src/ESP8266HTTPClient.cpp
index 5aca23a8..8a3a6252 100644
--- a/libraries/ESP8266HTTPClient/src/ESP8266HTTPClient.cpp
+++ b/libraries/ESP8266HTTPClient/src/ESP8266HTTPClient.cpp
@@ -35,8 +35,8 @@ static_assert(!std::is_copy_constructible_v<HTTPClient>, "");
 static_assert(std::is_move_constructible_v<HTTPClient>, "");
 static_assert(std::is_move_assignable_v<HTTPClient>, "");

-static const char defaultUserAgentPstr[] PROGMEM = "ESP8266HTTPClient";
-const String HTTPClient::defaultUserAgent = defaultUserAgentPstr;
+const char HTTPClient::defaultUserAgent[] PROGMEM = "ESP8266HTTPClient";
+const size_t HTTPClient::defaultUserAgentLength = sizeof(HTTPClient::defaultUserAgentLength) - 1;

 static int StreamReportToHttpClientReport (Stream::Report streamSendError)
 {
@@ -901,7 +901,7 @@ bool HTTPClient::sendHeader(const char * type)
         header += String(_port);
     }
     header += F("\r\nUser-Agent: ");
-    header += _userAgent;
+    header.concat(_userAgent.c_str(), _userAgent.length());

     if (!_useHTTP10) {
         header += F("\r\nAccept-Encoding: identity;q=1,chunked;q=0.1,*;q=0");

Since this does not involve global defaultUserAgent string and directly concats from the pointer with a known length. Notably, this also raises the question why flashstrings were not something like struct fstr { const char* ptr; size_t length; } in the first place so we don't have to strlen_P needlessly. edit: Like:

struct PgmString {
    const char* const ptr;
    size_t size;
};

#define PSTR(X) (__extension__({\
    static const char __pstr__[] = (X);\
    PgmString{&__pstr__[0], sizeof(__pstr__)};}))

void func(PgmString&& string) {
    std::cout << string.ptr << " (" << string.size << ")" << '\n';
}

int main() {
    func(PSTR("hello world"));
}

@earlephilhower earlephilhower added this to the 3.1 milestone Sep 5, 2021
Copy link
Collaborator

@d-a-v d-a-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, Nothing jumps at me.
I see this change as an encapsulating class for WiFiClient which forbids copy constructor, allows and implements fake move constructor (of a pointer) and calling ->stop() on destruction.
Thanks for the time spent on this !

@mcspr
Copy link
Collaborator Author

mcspr commented Sep 6, 2021

I see this change as an encapsulating class for WiFiClient which forbids copy constructor, allows and implements fake move constructor (of a pointer) and calling ->stop() on destruction.

Pretty much, yes. Looking at this again though...

  • httpclient's _client becomes unique_ptr<WiFiClient>
  • instead of _client = client in the begin(WiFiClient& client), it becomes _client = std::make_unique<WiFiClient>(client);, also solving the issue with the order of object declaration that is mentioned in the TODO
  • NonOwningClientPtr is removed as unneeded piece of code

edit:

consider that WiFiClient implements WiFiClient(WiFiClient&&) and operator=(WiFiClient&&), in a way that the underlying connection is simply transferred from one object to another

...not really needed, since it's copying, not moving the pointer here. Still a pretty straightforward addition, too

@mcspr
Copy link
Collaborator Author

mcspr commented Sep 6, 2021

instead of _client = client in the begin(WiFiClient& client), it becomes _client = std::make_unique(client);

Scratch that. My original thought was that WiFiClient already handles the virtual copies... it does not, and it 'slices' when secure connection object is copied. Meaning, it should have something like virtual std::unique_ptr<WiFiClient> T::clone() const for each T implementing WiFiClient stuff - currently that is WiFiClientSecureCtx and the WiFiClientSecure itself (that wraps the ctx).

imo that would make more sense in the current implementation, and possible changes further to simplify the wificlient<->wificlientsecure<->clientcontext relations. Will clean up the wip and push here
(...to also possibly fix the #5734)

so wificlientsecure : public wificlient copy does not slice us back
to a basic wificlient
@mcspr mcspr requested a review from d-a-v September 6, 2021 13:23
Copy link
Collaborator

@d-a-v d-a-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the same PR since last approval.
https://github.com/esp8266/Arduino/pull/8237/files#diff-15722e0b5cf418bd34281d1ee61ad5b3e59c1ed22d9c33317ea98bc84b1a20abR104 is now the core of the changes.

Maybe more comments on what's left to do around these new clone methods would be useful to help refactor allocation for both ssl and non-ssl context api.

(edit tested to work with BasicHttpsClient example)

@paulocsanz
Copy link
Contributor

It seemed everybody was satisfied with the old changes, are the new ones that need review and have a few TODOs the best way? I'm asking as someone with interest in merging this as fast as we can, but I understand the need for being thorough before stabilizing things.

@mcspr mcspr changed the title HTTPClient: refactor constructors and operator= HTTPClient & WiFiClient: refactor constructors and operator=, introduce clone() Sep 18, 2021
@mcspr mcspr changed the title HTTPClient & WiFiClient: refactor constructors and operator=, introduce clone() Movable HTTPClient and fixing WiFiClient copy Sep 18, 2021
@d-a-v d-a-v added the alpha included in alpha release label Sep 23, 2021
@paulocsanz
Copy link
Contributor

paulocsanz commented Oct 9, 2021

Sorry I think I'm missing the point. This seems too complex for a simple change to implement move, emptying the moved from value. Which was implemented and approved. We now are always allocating a client on constructor, instead of referencing one managed by the caller. I do dislike c++ traditional approach of holding a reference, because it's asking for UB, but I do get why, and it's how it works today. To change that maybe we should use a different PR? And it's not a trivial change, it has brought other questions to the table.

When I have some time I'm thinking about reopening my last merge request and update it to be simple, work, and be released fast, with the only tradeoff being, you can't use a moved from client, but doing it loudly and without UB. And here the community can then evolve this idea of cloning the inner client.

Anyway, thanks for the work, I'm just tired of waiting.

@mcspr
Copy link
Collaborator Author

mcspr commented Oct 9, 2021

@paulocsanz The delay is not intentional, but idk what answer you expect for the comment above :/ There is a solution for the issue you reported and confirmed this worked for your use-case, and this is definitely will be a part of the next release, so no need to worry about that. Also note that I am in charge of merging this PR, so the whole blame is on me atm.

And it's not a trivial change, it has brought other questions to the table.

I'd ask for specifics here, not the general feeling.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
alpha included in alpha release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unable to set HTTPClient to a std::optional Crash when Debug port is Disabled - HTTPClient destructor issue?
4 participants