Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WString: direct operator overloads instead of StringSumHelper #7781

Merged
merged 23 commits into from
Mar 21, 2021

Conversation

mcspr
Copy link
Collaborator

@mcspr mcspr commented Dec 19, 2020

As noticed in the #7758 by @jjsuwa-sys3175, StringSumHelper is redundant. This PR reworks operator+ using the modern C++ features.
WIP as operators may need some more work tweaking for size, quick test of FSBrowser shows ~200 more bytes in the rom (but, it is not a direct port of the old class, so that was kind of expected). And I also wanted to see what the CI says about it first.

  • Try to use String directly when overloading String operator+. I initially tried overloading like String::operator+(T) && / String::operator+(T) const, but that did not work as expected. Will need to verify that that actually does not work though, it may have been some other issue
  • Avoids really weird const_cast<StringSumHelper&> when we want to reuse the buffer via operator+=
  • Provide String::insert(position, source) for inner methods. This also technically allows to re-use source buffer if capacity is larger than the current buffer, so in the chained + operation it may replace the current buffer
  • Provide String::operator=(char). Right now, with a construct like:
String x;
x = ','; // this is char

C++ decides to cast , to a StringSumHelper first, since the most obvious constructor String(char) is marked as explicit and requires x = String(','); instead of the former. This became apparent after building FSBrowser example with the new operators in place, where it may choose any numeric type from the SumHelper class despite String declaring everything as explicit. I very much doubt that was the intended behaviour

  • template String &String::operator+=(T&&) forwarding to the String::concat(T)
  • template String operator+(String&& / const String&, T&&) forwarding to the String::concat(T)
    Note that T&& is allowed to be optimized to a basic copy via -fipa-sra (included in -O s, 2 and 3) when dealing with pod arguments.
    Any errors show up as error: call of overloaded ‘concat(some unknown type)’ is ambiguous
  • Tweaked header inclusion to use <c...> variants. Those are not real C-lib headers anyway with the current G++
  • StringSumHelper name is still present b/c at least ArduinoJson expects it for template specializations

Comment on lines +420 to +422
inline String operator +(const __FlashStringHelper *lhs, const String &rhs) {
return reinterpret_cast<const char*>(lhs) + rhs;
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As to why - the rest of the .cpp, const char* is expected to sometimes be in flash, meaning it is also valid to do PSTR("123") + String(456)

cores/esp8266/WString.h Outdated Show resolved Hide resolved
@mcspr
Copy link
Collaborator Author

mcspr commented Dec 23, 2020

Just to compare with something more that just fsbrowser, testing letscontrolit/ESPEasy@c530b08 via pio run -e custom_beta_ESP8266_4M1M and running sizes.py script show +976 bytes in the flash

Original WString.{h,cpp}

ICACHE : 32768           - flash instruction cache
IROM   : 898300          - code in flash         (default or ICACHE_FLASH_ATTR)
IRAM   : 31672   / 32768 - code in IRAM          (ICACHE_RAM_ATTR, ISRs...)
DATA   : 1688  )         - initialized variables (global, static) in RAM/HEAP
RODATA : 4860  ) / 81920 - constants             (global, static) in RAM/HEAP
BSS    : 37880 )         - zeroed variables      (global, static) in RAM/HEAP

PR

ICACHE : 32768           - flash instruction cache
IROM   : 899276          - code in flash         (default or ICACHE_FLASH_ATTR)
IRAM   : 31672   / 32768 - code in IRAM          (ICACHE_RAM_ATTR, ISRs...)
DATA   : 1688  )         - initialized variables (global, static) in RAM/HEAP
RODATA : 4860  ) / 81920 - constants             (global, static) in RAM/HEAP
BSS    : 37880 )         - zeroed variables      (global, static) in RAM/HEAP

For example, moving from inline to a function String operator +(const String &lhs, const String &rhs); does -80 bytes
(but, not sure where to look for more atm)

@devyte
Copy link
Collaborator

devyte commented Dec 26, 2020

Beware removing inline, there are cases where it is warranted due to potential uses from IRAM code, i. e. : inlining from an IRAM function will make the inlined function IRAM, while inlining from a flash function will make the same inlined function flash code.
Removing the inline makes the function flash code, and therefore it can't be used from an ISR anymore. That means that your commit that removes inline can have a very subtle impact.

In the general case, containers that have contiguous memory, such as String, std::string, std::vector, may do a realloc when appending. The realloc operation has very limited functionality from an ISR, because realloc is not interrupt safe.
The conditions under which such use is ok are very specific and limited, and most users should steer away from it. Therefore, I think it is ok to make operator+() flash-only by removing inline in this particular case, but please ask before adding/removing the inline keyword from function signatures, so that it can be discussed in each case.

`cannot bind bit-field '...' to 'signed char&'
`cannot bind bit-field '...' to 'unssigned char&'

noticed in both tasmota and espeasy, where this generates a webpage
content from some status struct containing bitfields
@mcspr mcspr changed the title [WIP] WString: direct operator overloads instead of StringSumHelper WString: direct operator overloads instead of StringSumHelper Jan 6, 2021
@mcspr
Copy link
Collaborator Author

mcspr commented Jan 6, 2021

In the general case, containers that have contiguous memory, such as String, std::string, std::vector, may do a realloc when appending. The realloc operation has very limited functionality from an ISR, because realloc is not interrupt safe.
The conditions under which such use is ok are very specific and limited, and most users should steer away from it. Therefore, I think it is ok to make operator+() flash-only by removing inline in this particular case, but please ask before adding/removing the inline keyword from function signatures, so that it can be discussed in each case.

Every inline is still as-is, comment above meant to check the impact of used operations across String-heavy apps. It is varying though, so it's not really measuring anything useful. As-is, did not it already call the flash-only concat?

Removing WIP btw, please take a look!

mcspr added a commit to mcspr/esp8266-Arduino that referenced this pull request Feb 3, 2021
@d-a-v d-a-v added this to the 3.0.0 milestone Feb 19, 2021
@d-a-v
Copy link
Collaborator

d-a-v commented Feb 19, 2021

Sorry for stepping on your toes @mcspr with #7863 and #7888

@mcspr
Copy link
Collaborator Author

mcspr commented Feb 19, 2021

@d-a-v Not a big problem. Only patch required here is for the new long long concat, operator+ code is automatically generated by the template

@d-a-v d-a-v added the merge-conflict PR has a merge conflict that needs manual correction label Mar 5, 2021
Copy link
Collaborator

@devyte devyte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing jumps out at me, approved!

@d-a-v d-a-v removed the merge-conflict PR has a merge conflict that needs manual correction label Mar 21, 2021
@d-a-v d-a-v merged commit 0894b51 into esp8266:master Mar 21, 2021
@mcspr
Copy link
Collaborator Author

mcspr commented Mar 21, 2021

Also sorry for sleeping on this :)

On a related note - can we further break Arduino String? e.g.

  • unsigned char concat() -> bool concat()
  • unsigned char reserve() -> bool reserve()
  • StringIfType can probably (iiutc) be replaced with explicit operator bool() const

@mcspr mcspr deleted the string/no-sum-helper branch March 21, 2021 13:55
@d-a-v
Copy link
Collaborator

d-a-v commented Mar 21, 2021

can we further break Arduino String? e.g.

Do you mean "fix" it ?
I guess we can

@mcspr
Copy link
Collaborator Author

mcspr commented Mar 22, 2021

Do you mean "fix" it ?

Indeed, s/break/break API, but fix to make a better implementation/. Breakage would be with the API though, this this is what is expected to ship with the Arduino original Cores currently:
https://github.com/arduino/ArduinoCore-API/blob/master/api/String.h

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants