forked from TravelMapping/DataProcessing
TMBitset checklist #245
Closed
42 of 44 tasks
find crashes via
for file in sulogs/2023-04-06/siteupdate_{a423,4448,8fba,8d55,7828,96e9,2d1b,5c76}*; do
echo -e "$file\t" `tail -n 1 $file \
| sed 's~\[[0-9.]*\]~~' \
| tr -d .`; done \
| grep -v 'Total run time:' \
| sed -r 's~.*siteupdate_(....)-.*~\1~' \
| uniq
dead branches
commits
Functions, operators, etc.
HTML
hidden/deleted branches
Alpha ToDo list
Solo items
traveled_tmg_line & traveler_lists cleanup
contig RAM group
TMB group
t_l via TMB group
partial specialization experiments
Benchmark commits:
deleted branches (BiggaTomato)
Benchmark:
Route::clinched_by_traveler: branching vs indirection
Branching is faster.
units
Bigger is faster.
matching_vertices_and_edges (Change name? It finds travelers too)
Slightly faster on all machines except lab2. (Noise?)
eaa6: eaa6; apply 4e05 directly to prev commit
For Computing Stats, 96e9...
• is THE top performer so far for lab1, lab3, lab4.
• outperforms 4e05 on BiggaTomato.
• on lab2 lags behind top performer 4e05 by only 3.8 ms. Just noise? A very tight race here.
5c76 inconclusive, but does appear slightly slower for userlogs, very consistently. Try: 6922 for a more apples-to-apples comparison.
4a64). traveler_lists is nixed altogether?
As we iterate thru traveler_set, just keep a count instead of creating a vector.
The trade-off is doing one more iteration thru traveler_set at the end of the traveled graph for traveler names.
Is this outweighed by not having to construct the vector and do a modest number of allocations/reallocations?
Preliminary results: helps on BiggaTomato; hurts on lab1. Be interesting to see results on bsdlab, at higher thread counts, and after doing more work on the RAM bandwidth bottleneck.
Final: No speed advantage. Leaving as-is. Maybe re-examine after more RAM bandwidth improvements are implemented.
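For reference, the trade-off weighed above can be sketched roughly as follows. This is only an illustration under assumptions, not the siteupdate code: `Traveler`, `collect_then_name` and `count_then_name` are hypothetical names, and a `std::vector<bool>` stands in for `traveler_set`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a traveler record.
struct Traveler { std::string name; };

// Variant 1: build a vector of matching travelers during the first pass,
// then read the names from that vector.
std::vector<std::string> collect_then_name(const std::vector<Traveler>& all,
                                           const std::vector<bool>& in_set) {
    assert(all.size() == in_set.size());
    std::vector<const Traveler*> lists;  // may reallocate as it grows
    for (std::size_t i = 0; i < all.size(); ++i)
        if (in_set[i]) lists.push_back(&all[i]);
    std::vector<std::string> names;
    names.reserve(lists.size());
    for (const Traveler* t : lists) names.push_back(t->name);
    return names;
}

// Variant 2: only keep a count during the first pass,
// then iterate the set one more time at the end for the names.
std::vector<std::string> count_then_name(const std::vector<Traveler>& all,
                                         const std::vector<bool>& in_set) {
    assert(all.size() == in_set.size());
    std::size_t count = 0;
    for (std::size_t i = 0; i < all.size(); ++i)
        if (in_set[i]) ++count;
    std::vector<std::string> names;
    names.reserve(count);
    for (std::size_t i = 0; i < all.size(); ++i)  // the extra iteration
        if (in_set[i]) names.push_back(all[i].name);
    return names;
}
```

Both variants return the same names; the only difference is whether the intermediate vector is built or the set is walked a second time.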
Try out:
HGEdge construction slow down if I force an active/preview canonical HighwaySegment?
~ 0.01 - 0.02 s on BiggaTomato -- 0.9 - 1.9 % more time.
units)
The trade-off: Larger units = skip back farther but less often.
|= performance.
f749 underperforms 8-32-bit simple iteration on lab{1..4}; slight lead on BiggaTomato.
ec4e is 1st place for BiggaTomato & lab1; underperforms f749 & even 64-bit f8cf on lab2.
Other machines TBD.
[bits >> 1] solution
Both underperform ec4e on BiggaTomato & lab1. Successively better on lab2 though, with ternaries 1st place overall & [bits >> 1] falling between 16 & 32-bit SIO2. Other machines TBD.
LOL what if I constexpr the damn thing by brute force?
Never mind. Nothing to be gained by doing this.
ec4e perform poorly on Epoch? No. Performs well; outperforms SIO2. Similar to BiggaTomato.
|=
Try simplified versions for larger units:
TravelerList::traveler_num with one
unsigned int* traveler_num = new unsigned int[TravelerList::allusers.size()];
per thread.
Index via
for (TravelerList *t : traveler_lists) traveler_nums[t-TravelerList::allusers.data()] = travnum++;
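The pointer-subtraction indexing in the loop above might look like this in isolation. A minimal sketch under assumptions: `TravelerList` here is an empty stand-in, `number_travelers` is a hypothetical helper, and a `std::vector` replaces the per-thread `new unsigned int[...]` array for brevity. It only works when every pointer in the subset points into the one contiguous `allusers` array.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the real class.
struct TravelerList { int placeholder = 0; };

// Number the travelers in `subset` 0, 1, 2, ... and store each number at
// that traveler's index within `allusers`, recovered by pointer subtraction.
std::vector<unsigned int> number_travelers(
        const std::vector<TravelerList>& allusers,
        const std::vector<const TravelerList*>& subset) {
    std::vector<unsigned int> traveler_nums(allusers.size(), 0);
    unsigned int travnum = 0;
    for (const TravelerList* t : subset) {
        // t must point into allusers' storage for the subtraction to be valid.
        assert(t >= allusers.data() && t < allusers.data() + allusers.size());
        traveler_nums[t - allusers.data()] = travnum++;
    }
    return traveler_nums;
}
```

Entries for travelers outside the subset keep their initial value, so they are only meaningful for travelers that were actually numbered.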
A variant of eaa6 with TMBitset [] operator?
LOLNOPE. Different vectors (everything vs subset); indices don't match up.
Kinda both:
Init TravelerLists at beginning; init segments with size not capacity
Or better yet, init segments with TravelerList::ids.size()
Segments set via TMArray::size, set via TravelerList::ids.size()
TMArray<TravelerList> means threaded construction in place (via placement new) without separate read_list function.
Clean up:
ecff, cmath no longer explicitly needed in HighwayGraph.cpp
else after continue; parens can be clarified
DataProcessing/siteupdate/cplusplus/classes/GraphGeneration/HighwayGraph.cpp
Lines 112 to 116 in 80f2724
visibility == 1 check just before that
4db4) maxbits less useful in SimItOpt2. Delete it; replace with 8*sizeof(unit); let -1 and +1 cancel out.
bits should be unsigned char in pun branch. Dumb luck that it ran without errors. Fixed for ComItOpt.
Never mind. Using a branch that retains unit instead.
uint8_t etc.
2d1b) means #include "../../templates/TMBitset.cpp" not needed in HighwayGraph.h. Can lose the include guard too LOL. 🤠
segments SQL table: only iterate clinched_by for active/preview systems
c8b9) Check for diffs due to constant folding: 8/sizeof(unit)
!= and |= operators
(unit)1 before <<, lest the unsigned long bug make a reappearance. See f8cf on SimItOpt2
branch.
4db4) Switch: are additions in for loop reordered? Make it look pretty!
() and add_value until needed