mark FPs while writing subgraph files #379

Open
3 tasks
yakra opened this issue Dec 12, 2020 · 0 comments

yakra commented Dec 12, 2020

Single-threaded datacheck FP matching is already pretty efficient:

  • Currently roughly 2/3 of datacheck entries are FPs.
  • datacheckerrors->entries is sorted asciibetically, and datacheckfps.csv is sorted pretty darn close to asciibetically.
  • Thus when an FP is matched, it's found & erased right off the beginning of the FP list, very little searching necessary.
  • When there's no FP entry to match, we do search the whole list, but at least it's continually shrinking.
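
The matching loop described above can be sketched roughly as follows. This is a simplified illustration, not the actual siteupdate code: `Entry`, its `key` field, and `mark_fps` are hypothetical names, and the real entries are compared on several fields rather than one string.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Hypothetical, simplified datacheck entry. `key` stands in for the
// fields that are compared against a line of the FP list.
struct Entry {
    std::string key;
    bool fp;
};

// Marks every entry that has a matching FP and erases that FP from the list.
// Returns the number of list nodes inspected, to make the search cost visible.
std::size_t mark_fps(std::vector<Entry>& entries, std::list<std::string>& fps) {
    std::size_t inspected = 0;
    for (Entry& e : entries) {
        assert(!e.fp);  // precondition for this sketch: nothing is marked yet
        for (auto it = fps.begin(); it != fps.end(); ++it) {
            ++inspected;
            if (*it == e.key) {
                e.fp = true;
                fps.erase(it);  // the list shrinks with every match
                break;
            }
        }
    }
    return inspected;
}
```

When both inputs are in nearly the same order, a matched entry is found at or near the front of `fps`, so only unmatched entries pay for a full scan of what remains.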

Still, FP matching takes about 5% of total execution time on lab2.
It doesn't lend itself well to multithreading:

  • How to avoid data races when reading & erasing from the datacheckfps list?
  • We could use a mutex, but the performance penalty would probably be a big one.
  • Splitting datacheckerrors->entries and datacheckfps into n chunks to feed to n threads would avoid the mutex, but hurts our ability to "cross off" items right away.
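
To make the mutex option concrete, here is a minimal sketch of what it might look like. `LockedEntry` and `mark_fps_locked` are hypothetical names, and the single-string `key` is a simplification of the real comparison.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <list>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical, simplified datacheck entry.
struct LockedEntry {
    std::string key;
    bool fp;
};

// Matches entries against a shared FP list from several threads.
// Every search-and-erase holds the same mutex. Returns the match count.
std::size_t mark_fps_locked(std::vector<LockedEntry>& entries,
                            std::list<std::string>& fps,
                            unsigned num_threads) {
    assert(num_threads > 0);
    std::mutex fps_mutex;
    std::atomic<std::size_t> next{0};     // index of the next unclaimed entry
    std::atomic<std::size_t> matched{0};

    auto worker = [&] {
        for (std::size_t i = next.fetch_add(1); i < entries.size();
             i = next.fetch_add(1)) {
            // The whole scan is inside the critical section.
            std::lock_guard<std::mutex> lock(fps_mutex);
            for (auto it = fps.begin(); it != fps.end(); ++it) {
                if (*it == entries[i].key) {
                    entries[i].fp = true;
                    fps.erase(it);
                    ++matched;
                    break;
                }
            }
        }
    };

    std::vector<std::thread> threads;
    for (unsigned t = 0; t < num_threads; ++t) threads.emplace_back(worker);
    for (std::thread& th : threads) th.join();
    return matched.load();
}
```

Because the scan itself is the critical section, the threads take turns rather than run in parallel, and entries are no longer guaranteed to be processed in list order.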

A better approach?
Some datacheck errors are flagged during HighwayGraph construction, but once the structure is built out, the list of datacheck entries is complete and FP matching can begin.

  • Rather than lose efficiency with any of the above approaches, we can process datacheck FPs in the background as we begin to write subgraphs.
  • Spawn one less SubgraphThread to start, and have our datacheck FP thread spawn one when it finishes up, just like MasterTmgThread does.
  • This will require getting a bit more clever when threading is enabled but -t 1 is specified.
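
The proposal might be sketched along these lines. All names here (`subgraph_worker`, `write_subgraphs_with_background_fps`, the two callbacks) are hypothetical and do not reflect the existing SubgraphThread or MasterTmgThread code. The single-thread fallback, which simply runs FP matching first, is only one assumed way to handle `-t 1`.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical subgraph worker: claims task indices until none remain.
void subgraph_worker(std::atomic<std::size_t>& next, std::size_t total,
                     const std::function<void(std::size_t)>& write_subgraph) {
    for (std::size_t i = next.fetch_add(1); i < total; i = next.fetch_add(1))
        write_subgraph(i);
}

// Runs FP matching in the background while subgraphs are being written.
void write_subgraphs_with_background_fps(
        unsigned num_threads, std::size_t total,
        const std::function<void()>& match_fps,
        const std::function<void(std::size_t)>& write_subgraph) {
    std::atomic<std::size_t> next{0};
    auto run_worker = [&] { subgraph_worker(next, total, write_subgraph); };

    if (num_threads < 2) {
        // Assumed fallback for -t 1: no background thread, just do both in turn.
        match_fps();
        run_worker();
    } else {
        // Start one fewer subgraph worker than requested.
        std::vector<std::thread> workers;
        for (unsigned t = 0; t + 1 < num_threads; ++t)
            workers.emplace_back(run_worker);

        // The FP thread takes the remaining slot, then spawns the last worker.
        std::thread fp_thread([&] {
            match_fps();
            std::thread late_worker(run_worker);
            late_worker.join();
        });

        for (std::thread& w : workers) w.join();
        fp_thread.join();
    }
    assert(next.load() >= total);  // every subgraph task was claimed
}
```

The thread count never exceeds `num_threads`: the FP thread only waits on `late_worker` after its own work is done, so at most `num_threads` threads are doing work at once.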