-
Notifications
You must be signed in to change notification settings - Fork 3.4k
preprocess_pthread_main #5494
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
preprocess_pthread_main #5494
Conversation
return filter(lambda x: '#if' in x or '#elif' in x, open(file, 'r').readlines()) | ||
|
||
def run_c_preprocessor_on_file(src, dst): | ||
# Run LLVM's C preprocessor on the given file, expanding #includes and variables found in src/setting.js. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like this overlaps with part of #5296 (the preprocessing bit from python). As discussed there, how about doing this in python - just iterate over lines, find lines with #if
and their matching #endif
, and compute their condition, etc.?
I don't think we need the full c preprocessor here, and the complexity (as discussed in the comment here) worries me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't read #5296 yet (I guess I should) but your suggestion alarms me. We really need to be decreasing rather than increasing the number of places and ways that we munge the code. e.g. we have the CPP already in lots of places, I'd rather use that than put yet another custom ad-hoc bit if python code that parses the input and spits it out again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about doing this in python
I did not want to create a second implementation of the partial C preprocessor in src/parseTools.js
. Using LLVM to preprocess can be considered Correct out of the box, so that won't need to care about adding tests, especially tests to check that python preprocessor and src/parseTools.js
would agree. Writing a preprocessor is hard (#5457, #5458, [3]). I took a pass long time ago at fixing these types of issues with a more complete preprocessor in JS (#1969), but its complexity was too high that we did not want to land it.
I don't think I'm able to implement a preprocessor in Python that I would feel good about, that would either have less complexity than the approach at #1969 had, or if settling for less supported features, not have various similar kinds of limits that people eventually run into with the current JS preprocessor. That is why I felt really glad to notice I can just reuse LLVM's preprocessor here, albeit with a slight change of behavior. Getting the ability to do #define
, #if A || B
and #if !A && B
for "free" are big improvements, so I'd strongly like to use LLVM to be able to get to use those.
Before implementing the whitelist heuristic, I wrote a blacklisting heuristic, with manually listed identifiers that should not get expanded, which only contained TOTAL_MEMORY
for the purposes of pthread-main.js
and Fetch.js
. That felt a bit more dirty than this.
Some alternatives to be able to drop the whitelisting mechanism:
- expand the
src/settings.js
defines to code with an alias. For example, we could give all defines there aS_
prefix, so thatTOTAL_MEMORY
would become-DS_TOTAL_MEMORY=1
for the preprocessor. - Don't expand
src/settings.js
defines to code automatically, but have a directive insrc/Settings.js
, with something likeexportToPreprocessor('TOTAL_MEMORY', (optional)'S_TOTAL_MEMORY');
which would tell this machinery to take theTOTAL_MEMORY
variable and preprocess with it, possibly using the specified optional alias. - Expand
src/settings.js
defines automatically, but add ahideFromPreprocessor('TOTAL_MEMORY');
directive or similar to allow removing some fields, or aexportToPreprocessor('TOTAL_MEMORY', (optional)'S_TOTAL_MEMORY');
as well to allow routing the defines under other names.
Overall, running C preprocessor on our .js files is the best way we have to offer as efficient as possible form of "don't pay for what you don't use" type of dead code elimination, to be able to get to tiny compiled applications that a lot of people are looking for. Having the "full" LLVM preprocessor available to do that would be quite helpful here. Is there some variant above that you think would work better than the current whitelist mechanism in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you're saying about the benefits of the LLVM preprocessor. It would be nice to use it, but I think we should only do so if (1) we can do it without hacks, and (2) we have one preprocessor for everything (or at least if we have 2 then they are simple and have identical behavior).
So I'm saying that if we want to use the LLVM preprocessor, let's replace the js one too. I'm not sure how that would work in terms of calling it from inside the js compiler though. Aside from that are the issues you mentioned with making it work without hacks, might need some refactoring of things (like no longer using {{{ }}}
stuff, etc.). But maybe this could be done.
Alternatively, I still think a limited preprocessor - basically the one we have in js right now - is simple and gives 95% of the benefits. Instead of adding another preprocessor, we can run the existing one in js for our purposes here, we would just need a little refactoring to make it usable standalone so it is callable from python (this was proposed in #5296). This seems like very little work, but I admit it isn't as good a final result as using the LLVM preprocessor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(1) we can do it without hacks,
I am not sure the "preprocess only src/settings.js defines that are present on lines starting with #if
and #elif
" warrants to be called a hack over the JS one, since that was added to implement exactly the same semantic that the existing JS one gives so that these two align.
(2) we have one preprocessor for everything
This sounds like a good direction, although it is too much to work on right now, so let me pause here, and splice the fix to the multithreading stack cookie setting to a separate PR, and pick this up when I have the cycles to start this kind of refactoring.
4d78742
to
9a0b8cc
Compare
… pthread-main.js. Add writeStackCookie(); directive to pthread-main when STACK_OVERFLOW_CHECK is defined.
9a0b8cc
to
d857b26
Compare
PR #7667 adds preprocessing support to pthreads, that supercedes this. |
Fix hardcoded preprocessing of Fetch worker, and add preprocessing to pthread-main.js. Add writeStackCookie(); directive to pthread-main when STACK_OVERFLOW_CHECK is defined.