-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce static footprint #34316
base: master
Are you sure you want to change the base?
Reduce static footprint #34316
Conversation
71e33da
to
695b6b4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only looked at Global.h for now, in case changes there affect how the other bits work, but I tried this PR on Mac with #define CHIP_CONFIG_DYNAMIC_GLOBALS 1
and it fails unit tests with ASAN failures.... So that's worth looking into, perhaps. But maybe after we settle on how Global.h should work.
PR #34316: Size comparison from 8ebace8 to 695b6b4 Increases above 0.2%:
Full report (82 builds for bl602, bl702, bl702l, cc13x4_26x4, cc32xx, cyw30739, efr32, esp32, linux, mbed, nxp, psoc6, qpg, stm32, telink, tizen)
|
847d093
to
7a71686
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking the Global<>
path :)
PR #34316: Size comparison from 34462f1 to 7a71686 Increases above 0.2%:
Full report (82 builds for bl602, bl702, bl702l, cc13x4_26x4, cc32xx, cyw30739, efr32, esp32, linux, mbed, nxp, psoc6, qpg, stm32, telink, tizen)
|
7a71686
to
5bb635a
Compare
How can I reproduce this? |
5bb635a
to
a405bfd
Compare
dd9b1f9
to
384e040
Compare
384e040
to
b314b5c
Compare
{ &sDebugEventBuffer[0], sizeof(sDebugEventBuffer), ::chip::app::PriorityLevel::Debug }, | ||
{ &sInfoEventBuffer[0], sizeof(sInfoEventBuffer), ::chip::app::PriorityLevel::Info }, | ||
{ &sCritEventBuffer[0], sizeof(sCritEventBuffer), ::chip::app::PriorityLevel::Critical } | ||
{ sDebugEventBuffer.get(), CHIP_DEVICE_CONFIG_EVENT_LOGGING_DEBUG_BUFFER_SIZE, ::chip::app::PriorityLevel::Debug }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can easily get out of sync with the definition of sDebugEventBuffer (and similar in MessageDefHelper). The size()
bit on Global was a good idea; it was just operator[]
that I didn't think we needed...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having the arrays inside a struct also means sizeof
continues to work on the member as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still getting ASAN failures on Mac with this PR if I set
#define CHIP_CONFIG_DYNAMICALLY_ALLOCATED_GLOBALS 1
and then do:
./gn_build.sh is_asan=true enable_host_clang_build=false enable_host_gcc_mbedtls_crypto_tests=false enable_host_clang_boringssl_crypto_tests=false chip_enable_dnssd_tests=false
I prodded at this, and the issue is that at least PlatformManager has the following places where it does not properly initialize members, but that was being covered up by 0-init of static things:
- In the Darwin
PlatformManagerImpl
,mRunLoopSem
should be getting initialized to nullptr and isn't. This was the proximate cause of the test failure. - In
GenericPlatformManagerImpl
,mAppEventHandlerList
should be getting initialized to nullptr and isn't. I'm surprised anything real actually passed with that problem....mMsgLayerWasActive
is also not initialized in the constructor, but looks like it gets initialized in_InitChipStack
, so that seems OK.
The question is: do any of the other things that this behavior change applies to have these sorts of problems? Could probably use an audit.
@@ -62,7 +62,8 @@ class ReportSchedulerImpl : public ReportScheduler | |||
public: | |||
using Timeout = System::Clock::Timeout; | |||
|
|||
ReportSchedulerImpl(TimerDelegate * aTimerDelegate); | |||
ReportSchedulerImpl() = default; | |||
ReportSchedulerImpl(TimerDelegate * aTimerDelegate) : ReportScheduler(aTimerDelegate) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why remove the nullptr check for this constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a default constructor now and an Init method to set the timer delegate ptr. This constructor is probably only used in some tests. The idea that once on object is constructed there will be a non-null pointer to a timer delegate in it is not valid anymore, so why keep that check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How and where would the Init method get called is the constructor using a TimerDelegate is used?
So far, it doesn't seem it happens. Also, if the constructor accepts a TimerDelegate, how is the Init method supposed to be used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Init() method is only used right above your other comment (#34316) and only for the sReportScheduler in CommonCaseDeviceServerInitParams, which is always default constructed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which is why the nullptr check should remain in this constructor right? Otherwise the other places in the code using something else than the default constructor just lost their nullptr check.
} | ||
|
||
// Injection of report scheduler WILL lead to two schedulers being allocated. As recommended above, this should only be used | ||
// for IN-TREE examples. If a default scheduler is desired, the basic ServerInitParams should be used by the application and | ||
// CommonCaseDeviceServerInitParams should not be allocated. | ||
if (this->reportScheduler == nullptr) | ||
{ | ||
reportScheduler = &sReportScheduler; | ||
sReportScheduler->Init(&sTimerDelegate.get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will not get called on injection of a report scheduler, we can either keep the nullptr check in the constructor that takes a TimerDelegate as a param, or have an EnsureInit
that takes no parameter and that gets called in an else
condition here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I get your point. Why initialise that global at all, especially when it's not going to be used? The whole point of this PR is to allow us to not even allocate the memory for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this->reportScheduler == nullptr
only checks if there were no report Scheduler injected. If this check fails, that means the reportScheduler was injected somewhere else, so why not check that it is initialized?
something like:
reportScheduler.EnsureInit()
to make sure that whatever is injected has been properly generated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't the same logic apply to all other things that can be injected? My impression was you need to Init() them before injecting (in this case you can alternatively construct it with a pointer). I don't see EnsureInit() used for any of the other ones...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted above, I think we should avoid changing how ReportScheduler is initialized.
ReportScheduler(TimerDelegate * aTimerDelegate) : mTimerDelegate(aTimerDelegate) {} | ||
|
||
virtual ~ReportScheduler() = default; | ||
|
||
void Init(TimerDelegate * aTimerDelegate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this method supposed to be used with this constructor:
ReportScheduler(TimerDelegate * aTimerDelegate) : mTimerDelegate(aTimerDelegate) {}
?
It would overwrite the TimerDelegate previously used, which defeats the purpose of having a constructor that accepts a TimerDelegate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, right, Global doesn't do anything except default constructors, so I added this Init method instead. I didn't delete the constructor with the pointer as I suspect it's already used by people. The intention was you can use the default constructor and then Init() later, or you can use the constructor with a pointer, in which case you don't need to Init().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the better pattern is to have the Global be of a sub-class or wrapper struct with the default constructor we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think overall this is looking really good, just a few comments.
uint32_t gPrettyPrintingDepthLevel = 0; | ||
char gLineBuffer[CHIP_CONFIG_LOG_MESSAGE_MAX_SIZE]; | ||
Global<char[CHIP_CONFIG_LOG_MESSAGE_MAX_SIZE]> gLineBuffer; | ||
size_t gCurLineBufferSize = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nicer to do something like
struct PrettyPrintState {
uint32_t mDepthLevel = 0;
char mLineBuffer[CHIP_CONFIG_LOG_MESSAGE_MAX_SIZE];
size_t mCurLineBufferSize = 0;
};
Global<PrettyPrintState> gPrettyPrint;
We can probably avoid adding support for arrays to Global that way too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pattern also avoids the overhead for the extra Global
s in the lazy / dynamic cases.
static uint8_t sCritEventBuffer[CHIP_DEVICE_CONFIG_EVENT_LOGGING_CRIT_BUFFER_SIZE]; | ||
static ::chip::PersistedCounter<chip::EventNumber> sGlobalEventIdCounter; | ||
static ::chip::app::CircularEventBuffer sLoggingBuffer[CHIP_NUM_EVENT_LOGGING_BUFFERS]; | ||
static Global<uint8_t[CHIP_DEVICE_CONFIG_EVENT_LOGGING_INFO_BUFFER_SIZE]> sInfoEventBuffer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, maybe these should be grouped into a struct constructed by a single Global
?
{ &sDebugEventBuffer[0], sizeof(sDebugEventBuffer), ::chip::app::PriorityLevel::Debug }, | ||
{ &sInfoEventBuffer[0], sizeof(sInfoEventBuffer), ::chip::app::PriorityLevel::Info }, | ||
{ &sCritEventBuffer[0], sizeof(sCritEventBuffer), ::chip::app::PriorityLevel::Critical } | ||
{ sDebugEventBuffer.get(), CHIP_DEVICE_CONFIG_EVENT_LOGGING_DEBUG_BUFFER_SIZE, ::chip::app::PriorityLevel::Debug }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having the arrays inside a struct also means sizeof
continues to work on the member as expected.
app::DefaultTimerDelegate CommonCaseDeviceServerInitParams::sTimerDelegate; | ||
app::reporting::ReportSchedulerImpl | ||
CommonCaseDeviceServerInitParams::sReportScheduler(&CommonCaseDeviceServerInitParams::sTimerDelegate); | ||
Global<KvsPersistentStorageDelegate> CommonCaseDeviceServerInitParams::sKvsPersistenStorageDelegate; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of all these separate globals, you can create a DefaultCommonCaseDeviceServerInitParams
class inside Server.cpp
(could be a sub-class of CommonCaseDeviceServerInitParams
I think) with an appropriate default constructor, and a single Global
constructing that. Then reference that global in InitializeStaticResourcesBeforeServerInit
(which should probably move into the .cpp file) instead of those individual globals.
This also avoids having to add an Init
method to ReportScheduler
.
(A simpler existing example of this pattern is struct ControllerAccessControl
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this more, it may be worthwhile to look at some of those individual default implementation objects and see how large they are, to determine if it makes sense to keep them as separate Globals. However the pattern of wrapping or sub-classing the ReportScheduler should still be used to avoid having to change the initialization.
} | ||
|
||
// Injection of report scheduler WILL lead to two schedulers being allocated. As recommended above, this should only be used | ||
// for IN-TREE examples. If a default scheduler is desired, the basic ServerInitParams should be used by the application and | ||
// CommonCaseDeviceServerInitParams should not be allocated. | ||
if (this->reportScheduler == nullptr) | ||
{ | ||
reportScheduler = &sReportScheduler; | ||
sReportScheduler->Init(&sTimerDelegate.get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted above, I think we should avoid changing how ReportScheduler is initialized.
@@ -33,8 +34,8 @@ namespace Internal { | |||
template <class ImplClass> | |||
chip::Inet::EndPointManager<Inet::UDPEndPoint> & GenericConnectivityManagerImpl_UDP<ImplClass>::_UDPEndPointManager() | |||
{ | |||
static chip::Inet::UDPEndPointManagerImpl sUDPEndPointManagerImpl; | |||
return sUDPEndPointManagerImpl; | |||
static Global<chip::Inet::UDPEndPointManagerImpl> sUDPEndPointManagerImpl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move the Global out of the method, otherwise we end up with the compiler-generated lazy init + whatever Global does.
* | ||
* The default is to use static allocation. | ||
*/ | ||
#ifndef CHIP_CONFIG_DYNAMICALLY_ALLOCATED_GLOBALS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we call rename this to CHIP_CONFIG_GLOBALS_DYNAMIC
, or something that starts with CHIP_CONFIG_GLOBALS_
anyway?
This also semantically implies CHIP_CONFIG_GLOBALS_LAZY_INIT
and/or conflicts with that setting if we take CHIP_CONFIG_GLOBALS_LAZY_INIT
to mean "static but lazy". Maybe the simplest thing is to just check if CHIP_CONFIG_GLOBALS_DYNAMIC
and CHIP_CONFIG_GLOBALS_LAZY_INIT
are both 1 and #error
if so.
@@ -97,25 +97,70 @@ class Global | |||
~Global() = default; | |||
|
|||
private: | |||
#if CHIP_CONFIG_DYNAMICALLY_ALLOCATED_GLOBALS | |||
std::remove_extent_t<T> * mStorage = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There don't seem to be any strong use cases for Global
with arrays, so I would prefer to just static_assert
here that T isn't an array and have the code be simpler.
These changes reduce memory usage in our firmware by about 30K when matter is disabled.
This is important to us, as matter is optional and not the only stack we support, making large static and global variables too expensive.