Feature: mark a Func as no_profiling, to prevent injection of profiling. (2nd implementation) #8143
c030bb1 to 25f9990
src/Profiling.cpp
Outdated
        }
    }

    const Function *lookup_function(const string &name) const {
Function is already a pointer type, so this should probably just return a Function by value. Alternatively you could return a const Function &, but I'm not sure how to satisfy the compiler in the case where you need to return something valid after the internal_error. Maybe just return env.begin()->second?
I first had a reference, but in case the lookup fails, I need to satisfy the compiler and give it a nullptr. Idk if we can easily create a null Function& instead of a Function* being nullptr?
I suggest satisfying the compiler by returning the first Function in the environment: env.begin()->second.
There is a default constructor, I see. So for this unreachable code, a simple return {}; looks a bit nicer, I think?
Isn't that returning a reference to a local? Given that it's unreachable, whatever makes the compiler shut up is fine.
Isn't that returning a reference to a local?
AFAICT, no. It's invoking the default constructor (the one literally defined with = default;), which just initializes the FunctionPtr inside to be default-constructed, meaning that some pointers are just set to nullptr, I assume.
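The by-value option discussed in this thread can be sketched as follows. This is only an illustration under stated assumptions: FuncHandle and FuncContents are hypothetical stand-ins for a pointer-like handle type, not Halide's actual Function class, and where the real code would call internal_error and never reach the final return, this sketch simply returns the empty handle so the fallback can be observed.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical payload; stands in for whatever the real handle points at.
struct FuncContents {
    std::string name;
    bool no_profiling = false;
};

// Hypothetical pointer-like handle: cheap to copy, default-constructs to "null".
class FuncHandle {
    std::shared_ptr<FuncContents> contents;

public:
    FuncHandle() = default;
    explicit FuncHandle(std::shared_ptr<FuncContents> c)
        : contents(std::move(c)) {}

    bool defined() const { return contents != nullptr; }
    const std::string &name() const { return contents->name; }
    bool no_profiling() const { return contents->no_profiling; }
};

// Returns the handle by value. Because the return type is a value and not a
// reference, `return {};` creates a fresh default-constructed (null) handle
// for the caller; nothing refers to a local that has gone out of scope.
inline FuncHandle lookup_function(const std::map<std::string, FuncHandle> &env,
                                  const std::string &name) {
    auto it = env.find(name);
    if (it != env.end()) {
        return it->second;
    }
    // Assumption: the real code reports an internal error here, which makes
    // the line below unreachable. The sketch just returns the null handle.
    return {};
}
```

Note that the same `return {};` inside a function declared to return const FuncHandle & would bind the reference to a temporary, which is the concern raised above; returning by value sidesteps that.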
@@ -713,6 +713,7 @@ table Func {
    trace_stores: bool = false;
    trace_realizations: bool = false;
    trace_tags: [string];
    no_profiling: bool = false;
Probably need to bump the serialization minor version number
I'm actually clueless where to find that number.
I'm a bit confused too. @TH3CHARLie ?
It was refactored earlier to here:
Line 9 in 8cc4f02
enum SerializationVersionMajor: int {
@TH3CHARLie Those inline comments suggest that I shouldn't touch anything, as we are still at "unstable"? Do I bump the patch version instead?
yeah I believe so
@abadams @TH3CHARLie It all looks suspicious, as the patch version was at 0 (zero). Serialization version 18.0.0 seems odd, but maybe it is what we are after? I just bumped it to 18.0.1.
It was at zero because we recently had a release, introduced this version numbering system, and haven't added anything new to the format since. I don't understand what's suspicious here.
Makes sense! Thanks!
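To make the version bump in this thread concrete, here is a minimal sketch of a major.minor.patch triple and an ordering check. The FormatVersion struct and both helper functions are hypothetical names invented for illustration; the real version enums live in the serialization schema, and only the values 18.0.0 and 18.0.1 come from the discussion above.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Hypothetical version triple; not the actual Halide serialization types.
struct FormatVersion {
    int major_version;
    int minor_version;
    int patch_version;
};

// Formats the triple as "major.minor.patch".
inline std::string version_string(const FormatVersion &v) {
    return std::to_string(v.major_version) + "." +
           std::to_string(v.minor_version) + "." +
           std::to_string(v.patch_version);
}

// True if `a` is strictly newer than `b`, comparing major first, then minor,
// then patch.
inline bool is_newer(const FormatVersion &a, const FormatVersion &b) {
    return std::tie(a.major_version, a.minor_version, a.patch_version) >
           std::tie(b.major_version, b.minor_version, b.patch_version);
}
```

With this, FormatVersion{18, 0, 1} formats as "18.0.1" and compares as newer than FormatVersion{18, 0, 0}, matching the patch bump made for the new no_profiling field.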
…of the serialization.
AFAICT the failing build is unrelated, right? This might conflict with abadams/per_instance_profiling, in which case I'm happy to adapt this PR after we merge in Andrew's work.
Yes, the failure is unrelated (fixed by #8148)
Replaces #8136 with a cleaner implementation.
@abadams I took your suggestion, and this implementation is much better. Very small changes, and the issue I reported in the previous one is gone. The trick with the "environment" was a good lesson!
Additionally, I managed to also clean up the report by having the stack-allocated Allocate/Free nodes also have their allocation contribute to the enclosing Func. Doing that changed my profiling report from (w_0, w_1, w_2, max_channel, lum, lum_pow3, r, g, b all marked with no_profiling(), but still producing lines in the report):

to this (note how the stack size contributions (132 + 8 * 64 = 644) moved up into the enclosing Funcs: min_field_hpass: 1412 - 1280 = 132 bytes extra, ratio_map: 384 - 0 = 384 bytes extra. The total difference is less than 644 bytes, because the lifetimes of some of the variables were actually fully disjoint, so their peak stack usage never reached that much):
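The remark about disjoint lifetimes can be illustrated with a small sketch that computes peak stack usage from allocation lifetimes. This is not how the profiler itself is implemented; StackAlloc, peak_stack_bytes, and any sizes or lifetimes fed into it are assumptions made up for the example. It only shows why the peak can be smaller than the sum of all allocation sizes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical record of one stack allocation, live over [begin, end).
struct StackAlloc {
    std::size_t bytes;
    int begin;
    int end;
};

// Sweeps over allocation/free events in time order and tracks the largest
// number of bytes live at once.
inline std::size_t peak_stack_bytes(const std::vector<StackAlloc> &allocs) {
    std::vector<std::pair<int, long long>> events;
    for (const StackAlloc &a : allocs) {
        events.emplace_back(a.begin, static_cast<long long>(a.bytes));
        events.emplace_back(a.end, -static_cast<long long>(a.bytes));
    }
    // Pairs sort by time, then by delta, so a free at time t is processed
    // before an allocation at the same t (intervals are half-open).
    std::sort(events.begin(), events.end());

    long long live = 0;
    long long peak = 0;
    for (const auto &e : events) {
        live += e.second;
        peak = std::max(peak, live);
    }
    return static_cast<std::size_t>(peak);
}
```

For instance, three made-up 64-byte buffers with lifetimes [0, 2), [2, 4), and [1, 3) sum to 192 bytes, but at most two of them are ever live together, so the peak is 128 bytes.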