concurrency detection bug #645

Closed
yakra opened this issue Oct 1, 2024 · 11 comments · Fixed by #647

yakra commented Oct 1, 2024

Uncovered a concurrency detection problem while trying to expand the HIDDEN_JUNCTION datacheck to catch cases that'd produce a segment name mismatch error (#91, #178, #603) in datacheck mode, before they make it to the graph generation process.

  • RailwayData (eeea42f): ME DE +DIV_Por1 Por is not flagged as concurrent with ME DE Por +DIV_Por2.
  • Meanwhile, PA US19TrkPit I-376(69B)_S I-376(69A) is concurrent with PA US19TrkPit I-376(69A) I-376(69B)_N as expected (7a955f2).

The fact that the DIV points are hidden does not affect this.

The first thing that comes to mind is that ME DE is not concurrent with any other route, whereas PA US19TrkPit has 4 others in the mix. IIRC this fact caused wacky hijinks in my first epic concurrency debugathon back in 2018. OK yeah, that's it. Thank God, I didn't wanna do another one.

{ if (!s.concurrent && p->colocated && p[1].colocated)

In this case p[1] is not colocated. It's a single point on a single route that's doing a 180° turn.
Something like this would be overkill & less efficient; I should be able to move an if and bung in an else block.
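
For illustration only, here is a minimal sketch of the U-turn case described above. It is not the siteupdate code: `find_uturns` is a hypothetical helper, the input format (one `label lat lon` line per waypoint, in route order) is an assumption, and the coordinates in the usage note are made up. It reports any point whose two neighbors on the same route share coordinates, which is exactly when the segments on either side of it overlap each other.

```shell
# Hypothetical helper: read "label lat lon" lines for ONE route, in order,
# and print the label of every point where the route doubles back on itself.
find_uturns() {
  awk '{ label[NR] = $1; coord[NR] = $2 "," $3 }
       END {
         # Point i is a U-turn if the points before and after it are colocated.
         for (i = 2; i < NR; i++)
           if (coord[i-1] == coord[i+1]) print label[i]
       }'
}
```

Usage, with invented coordinates for the Downeaster example: `printf '%s\n' '+DIV_Por1 43.65 -70.29' 'Por 43.66 -70.26' '+DIV_Por2 43.65 -70.29' | find_uturns` prints `Por`, the single non-colocated point between the two overlapping segments.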

yakra added the bug label Oct 1, 2024

yakra commented Oct 1, 2024

Fix is implemented. 3 new lines of code, not counting a comment and one that's just a }.
Looks like these cases are quite common in RailwayData; 47 of them. Still checking my work.


yakra commented Oct 2, 2024

checklist

  • stats CSVs
  • userlogs
    Overall mileages in systems & regions have gone down due to more multiplexes detected. Makes sense.
    Do spot checks of some changed stats. Do regional & overall a/p miles check out too?
    • bartpetat on belic in NLD: IC35
    • bejacob on GlaDis (Glacier Discovery) (up 0.66 via augment)
    • communityrailpartnerships on gbngw in ENG: LooeVlyLn
    • communityrailpartnerships on gbnle (ENG only): BitLn
    • communityrailpartnerships on gbnnr (ENG only): multiple routes
    • hotdogPi on Downeaster, ME & ConnectedRoute (augmented 0.68 connected, 0.69 chopped)
    • M3200 on FarNorLn (Far North Line) (up 1.6 via augment)
    • M3200 on Downeaster, ME & ConnectedRoute (augmented 0.68 connected, 0.69 chopped)
    • M3200 on Silver Star, FL & ConnectedRoute (up 14.15 via augment)
    • M3200 on GlaDis (Glacier Discovery) (up 0.66 via augment)
    • michih on deuhhs in DEU-HH: S1
    • michih on deuulrs in DEU-BW: RS21
    • michih on deuhalt (DEU-ST only): 5Hal
    • mojavenc has unchanged travels on belic in NLD. Yep, that's fine. :)
    • neptun on czedukt in CZE: multiple routes
    • neptun on czeep (CZE only): multiple routes
    • oscar on WinChu (Winnipeg-Churchill): MB & ConnectedRoute (up 0.74 via augment)
    • scenicrailbritain.log on gbngw in ENG: LooeVlyLn
    • scenicrailbritain.log on gbnle (ENG only): BitLn
    • scenicrailbritain.log on gbnnr (ENG only): GloLn
    • scenicrailbritain.log on gbnsr in SCT: FarNorLn
    • selectric on usaamtk in ME: Downeaster
  • concurrencies.log
  • DB
    • clinched
      Line counts mismatch 212997 <-> 213005
    • overallMileageByRegion
    • systemMileageByRegion
    • clinchedOverallMileageByRegion
    • clinchedSystemMileageByRegion
    • clinchedConnectedRoutes
    • clinchedRoutes
    • graphs
  • graphs
    should have fewer segments. Same # vertices & travelers, right?


yakra commented Oct 2, 2024

stats CSVs

#diff size for each file not including final TOTAL row
e=`echo -en '\e'` # ESC character, used to strip color codes from the diff log
files=$( \
for f in $(tail -n +3 d4a3_conc.l0g | head -n 34 \
           | sed -r -e "s~$e\[[0-9]+m~~" -e 's~.*_d4a3/stats/(.+) and .*~\1~'); do
  printf "%02i $f\n" $(diff <(head -n -1 _d4a3/stats/$f) <(head -n -1 _conc/stats/$f) | wc -l);
done)

#get deltas from the last lines of those without diffs, formatted to paste into the smbr spreadsheet tab
for f in $(echo "$files" | grep ^00 | cut -f2 -d' '); do
  # get column numbers
  cols=$(diff <(tail -n 1 _d4a3/stats/$f | sed -e 's~,~\n\n~g' -e 's~TOTAL~\n0~') \
              <(tail -n 1 _conc/stats/$f | sed -e 's~,~\n\n~g' -e 's~TOTAL~\n0~') \
         | egrep '([0-9]+)c\1' | sed -r 's~([0-9]+)c\1~\1/2~' | bc | tail -n +2) # skip trav total column
  for col in $cols; do
    rg=`head -n 1 _d4a3/stats/$f | cut -f$col -d,`
    d=`paste -d- <(tail -n 1 _d4a3/stats/$f | cut -f$col -d,) \
                 <(tail -n 1 _conc/stats/$f | cut -f$col -d,) | bc`
    echo $f | sed "s~-all.csv~\t$rg\t$d~"
  done
done

#do those with traveler diffs
for f in $(echo "$files" | grep -v ^00 | cut -f2 -d' '); do
  #get row numbers
  rows=$(diff _d4a3/stats/$f _conc/stats/$f \
         | egrep '([0-9]+,[0-9]+)c\1|([0-9]+)c\2' \
         | sed -e 's~c.*~~' -e 's~,~ ~')
  for row in $rows; do
    l1=`tail -n +$row _d4a3/stats/$f | head -n 1`
    l2=`tail -n +$row _conc/stats/$f | head -n 1`
    t=`echo $l1 | cut -f1 -d,`
    # get column numbers
    cols=$(diff <(echo; echo $l1 | sed -e 's~,~\n\n~g') \
                <(echo; echo $l2 | sed -e 's~,~\n\n~g') \
           | egrep '([0-9]+)c\1' | sed -r 's~([0-9]+)c\1~\1/2~' | bc)
    for col in $cols; do
      rg=`head -n 1 _d4a3/stats/$f | cut -f$col -d,`
      d=`paste -d- <(echo $l1 | cut -f$col -d,) <(echo $l2 | cut -f$col -d,) | bc`
      echo $f | sed "s~[-.].*~\t$rg\t$t\t$d~"
    done
  done
done
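
As a cross-check on the column-number juggling above, the TOTAL-row deltas can also be sketched in a single awk pass. This is a simplified alternative, not the script that produced the numbers in this thread: `csv_total_deltas` is a hypothetical name, it assumes the header is the first line and the TOTAL row is the last line of each CSV, and it compares every column after the first rather than skipping the traveler-total column.

```shell
# Hypothetical helper: $1 = old stats CSV, $2 = new stats CSV.
# Prints "region<TAB>old-minus-new" for each column whose last-row value changed.
csv_total_deltas() {
  awk -F, '
    NR == 1   { for (i = 1; i <= NF; i++) head[i] = $i }  # header of the old file
    NR == FNR { n = split($0, old, ","); next }           # ends up holding the old last row
              { split($0, cur, ",") }                     # ends up holding the new last row
    END {
      for (i = 2; i <= n; i++)
        if (old[i] != cur[i]) printf "%s\t%.2f\n", head[i], old[i] - cur[i]
    }
  ' "$1" "$2"
}
```

A positive delta means the new total is lower, matching the `paste -d- ... | bc` direction used above.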

grep & compare...

  • | grep Total.*TOTAL per-system & site-wide totals all match routedatastats.log
  • #set $table to the result of that last big shell script block, and then...
    users=`echo "$table" | cut -f3 | sort | uniq`
    for u in $users; do clear; echo "$table" | grep $u; pluma _{d4a3,conc}/logs/users/$u.log; read OK; done
    # ...until getting to "TOTAL". This all matches userlogs. Then...
    echo "$table" | grep -v Total | grep $TOTAL


yakra commented Oct 3, 2024

LOL clinched table

#find new concurrencies in concurrencies.log & search for augments based on them
cl=$( \
fgrep -f \
  <(diff <(grep -v augment _d4a3/logs/concurrencies.log) \
         <(grep -v augment _conc/logs/concurrencies.log) \
    | grep '^>' | sed -r 's~^> New concurrency (\[.+\])(\[.+\]) \(2\)$~\1 based on \2\n\2 based on \1~') \
  _conc/logs/concurrencies.log \
| sed -r 's~Concurrency augment for traveler (.+) based on .+~\1~')

#convert DB lines to the same style
db=$( \
for e in $(diff <(tail -n +354277 _d4a3/TravelMapping.sql | head -n 212997 | sed s/^,// | sort) \
                <(tail -n +354277 _conc/TravelMapping.sql | head -n 213005 | sed s/^,// | sort) \
           | grep '^>' | cut -f2 -d" "); do
  t=`echo "$e" | cut -f4 -d"'"`
  id=`echo "$e" | cut -f2 -d"'"`
  s_line=`tail -n +186429 _d4a3/TravelMapping.sql | head -n 167848 | egrep "^,?\('$id'"`
  r=`echo "$s_line" | cut -f8 -d"'"`
  r=`tail -n +1494 _d4a3/TravelMapping.sql | head -n 5695 | grep "$r'"`
  r=`(echo "$r" | cut -f4 -d"'"; echo "$r" | cut -f6 -d"'") | tr '\n' ' '`
  i=`echo "$s_line" | cut -f4 -d"'"`
  w_pair=`tail -n +12887 _d4a3/TravelMapping.sql | head -n 173542 | grep -A 1 "^,\?('$i'" | tr -d '\n'`
  w=`echo "$w_pair" | cut -f4 -d"'"`
  p=`echo "$w_pair" | cut -f14 -d"'"`
  echo "$t: [$r$w $p]"
done)

#now sort & diff them
diff <(echo "$cl" | sort) <(echo "$db" | sort)
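
The first step of the `db=` block, finding rows that exist only in the new clinched table, can be isolated as a small sketch. `rows_only_in_new` is a hypothetical helper, and it assumes each input file already holds just the table's value rows (the `tail | head` extraction above), one row per line, with an optional leading comma.

```shell
# Hypothetical helper: $1 = old rows file, $2 = new rows file.
# Prints rows present in the new file but not in the old one.
rows_only_in_new() {
  roin_old=$(mktemp)
  roin_new=$(mktemp)
  # Strip the leading comma so the first row sorts and compares like the rest.
  sed 's/^,//' "$1" | LC_ALL=C sort > "$roin_old"
  sed 's/^,//' "$2" | LC_ALL=C sort > "$roin_new"
  LC_ALL=C comm -13 "$roin_old" "$roin_new"
  rm -f "$roin_old" "$roin_new"
}
```

With the line counts noted in the checklist (212997 vs 213005), this would be expected to list at least 8 rows, more if any old rows were also replaced.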


yakra commented Oct 3, 2024

graphs=$(diff -qr _d4a3/graphs/ _conc/graphs/ | cut -f3 -d/ | sed -e 's~\.tmg.*~~' -e 's~-simple$\|-traveled$~~' | uniq)
allgraphs=$(diff -qr _d4a3/graphs/ _conc/graphs/ | cut -f3 -d/ | sed -e 's~\.tmg.*~~')

Simple graphs

  • vertices: all same number confirmed
    for g in $graphs; do
      diff <(head -n 2 _d4a3/graphs/$g-simple.tmg | tail -n 1 | cut -f1 -d' ') \
           <(head -n 2 _conc/graphs/$g-simple.tmg | tail -n 1 | cut -f1 -d' ')
    done
  • edges (67)
    • region (29): all down by number of new concurrencies in that region per concurrencies.log
      for g in $graphs; do
        printf "%4i $g\n" \
               $(paste -d- <(head -n 2 _d4a3/graphs/$g-simple.tmg | tail -n 1 | cut -f2 -d' ') \
                           <(head -n 2 _conc/graphs/$g-simple.tmg | tail -n 1 | cut -f2 -d' ') \
                 | bc)
      done | grep -e -region
    • country (10): all match totals of constituent regions
      same script but | grep -e -country instead
    • continent (5): all match totals of constituent regions
      same script but | grep -e -continent instead
    • tm-master
      47, matches total of all continents and total new concurrencies in concurrencies.log
    • siena100-area down 1 for VT EA Rut
    • multiregion (6): all match totals of constituent regions
    • multisystem (15): all match totals of constituent systems
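
The edge checks above all reduce to one operation: read the edge count from line 2 of each .tmg file and subtract. A minimal sketch, assuming only what the scripts above already rely on (line 2 holds the vertex count and edge count separated by a space); `tmg_edge_delta` is a hypothetical name.

```shell
# Hypothetical helper: $1 = old .tmg file, $2 = new .tmg file.
# Prints how many edges the new graph lost (negative if it gained edges).
tmg_edge_delta() {
  tmg_old=$(sed -n '2p' "$1" | cut -d' ' -f2)
  tmg_new=$(sed -n '2p' "$2" | cut -d' ' -f2)
  echo $((tmg_old - tmg_new))
}
```

For tm-master this would be expected to print 47, the total number of new concurrencies.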


yakra commented Oct 4, 2024

Collapsed graph vertices

The number of vertices differs in some collapsed graphs. Makes sense though.
The difference is in whether or not a vertex's incident edges are collapsed around it, causing it to go from a bona fide visible vertex to one of the intermediate_points along a collapsed edge.

e=`echo -en '\e'`
for g in $graphs; do
  oldv=`head -n 2 _d4a3/graphs/$g.tmg | tail -n 1 | cut -f1 -d' '`
  newv=`head -n 2 _conc/graphs/$g.tmg | tail -n 1 | cut -f1 -d' '`
  d=`diff <(tail -n +3 _d4a3/graphs/$g.tmg | head -n $oldv) \
          <(tail -n +3 _conc/graphs/$g.tmg | head -n $newv) | grep '^[<>]'`
  if [ `echo "$d" | wc -w` -gt 0 ]; then echo -e "$g\n$d"; fi
done \
| sed -e "s~^<.*~$e[31m&$e[0m~" \
      -e "s~^>.*~$e[32m&$e[0m~"

Each line item diff falls into 1 of 2 categories:

  1. One less vertex. Adjacent to a U-turn. 8 of these worldwide.
    Before: 2 segments not flagged as concurrent cause 2 edges & a hidden junction. Vertex unhidden.
    After: Segments flagged as concurrent get 1 edge. Vertex has 2 incident edges, stays hidden, gets collapsed.
  2. One more vertex. Is a hidden U-turn. 2 of these, both in CZE.
    Before: 2 incident edges for an undetected concurrency collapse around the U-turn to 1 edge starting/ending @ same point.
    After: 1 incident edge. Vertex unhidden. No collapse occurs.
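
Both categories follow from one rule, as I read the description above: a hidden vertex is collapsed only when it has exactly 2 incident edges; with 1 edge or 3+ edges it becomes visible. A minimal sketch of that rule, with `vertex_fate` as a hypothetical name and the rule itself a simplification inferred from this comment rather than taken from the siteupdate source:

```shell
# Hypothetical helper: $1 = "hidden" or "visible", $2 = number of incident edges.
# Prints whether the vertex ends up "collapsed" or "visible" in a collapsed graph.
vertex_fate() {
  if [ "$1" = hidden ] && [ "$2" -eq 2 ]; then
    echo collapsed
  else
    echo visible
  fi
}
```

Category 1: the adjacent point goes from `vertex_fate hidden 3` (visible) to `vertex_fate hidden 2` (collapsed). Category 2: the U-turn point goes from `vertex_fate hidden 2` (collapsed) to `vertex_fate hidden 1` (visible).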


yakra commented Oct 4, 2024

Collapsed graph edges

Looking at the diffs between graph edges shows that diffs don't just occur one-by-one. They come in clusters, in one of three patterns (plus a 4th that I could only produce artificially). These patterns can be mixed & matched in a graph, and occur however many times.

  1. Scenario 1 above, when the graph loses a vertex.
    < Vertex adjacent to the U-turn point goes from incorrectly being considered a hidden junction to getting collapsed
    < Simple edge from the U-turn point to the deleted adjacent point
    < Simple edge connecting the same points in the opposite direction
    < Edge connecting the deleted point to some other point farther down the line
    > Collapsed edge connecting the U-turn to the faraway point
    Happens when the U-turn vertex is visible & the adjacent vertex is collapsible.
  2. Scenario 2 above, when the graph gains a vertex.
    > U-turn vertex goes from being incorrectly collapsed to visible per having 1 incident edge
    > Simple edge from the U-turn to the adjacent point
    < Collapsed edge connecting the adjacent point to itself via the U-turn as an intermediate
    Happens when the U-turn vertex is hidden & the adjacent vertex is visible or a junction.
  3. Most commonly, a newly detected concurrency won't add or remove any vertices.
    < Simple edge from the U-turn point to the adjacent point
    < Simple edge connecting the same points in the opposite direction
    > A new simple edge from the U-turn to the adjacent point again. Only this time, label has a doubled route name, e.g. "DE,DE"
    Happens when the U-turn & adjacent vertices are both visible.
  4. A 4th possibility, which I didn't observe in the wild and had to create "in the lab": 1 vertex added & 1 removed, for the same total.
    < Shaping point adjacent to the U-turn point goes from incorrectly being considered a hidden junction to getting collapsed
    > U-turn vertex goes from being incorrectly collapsed to visible per having 1 incident edge
    < Edge connecting the deleted point to some other point farther down the line
    < Collapsed edge connecting the adjacent point to itself via the U-turn as an intermediate
    > Collapsed edge connecting the U-turn to the faraway point
    Happens when the U-turn & adjacent vertices are both hidden.
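
Pattern 3 is the easiest to reproduce on paper: two simple edges between the same pair of vertices become one edge whose label lists the route twice. A rough sketch of that merge, assuming a simplified three-field edge line (`v1 v2 label`) rather than the full .tmg edge format; `merge_parallel_edges` is a hypothetical name and this is not how siteupdate builds its edges.

```shell
# Hypothetical helper: read "v1 v2 label" edge lines and merge edges that join
# the same two vertices (in either direction), comma-joining their labels.
merge_parallel_edges() {
  awk '{
         a = $1; b = $2
         if (a > b) { t = a; a = b; b = t }   # normalize direction
         key = a " " b
         if (key in label) label[key] = label[key] "," $3
         else { order[++n] = key; label[key] = $3 }
       }
       END { for (i = 1; i <= n; i++) print order[i], label[order[i]] }'
}
```

Feeding it `0 1 DE` and `1 0 DE` yields the single edge `0 1 DE,DE`.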


yakra commented Oct 5, 2024

Traveled graphs

If collapsed graphs make sense, traveled graphs will be fine too.
The decision whether or not to collapse has nothing to do with the new concurrencies at this point; no more new info to be gained. Nonetheless, this...

trav=$( \
for g in $graphs; do \
  d=$(diff <(tail -n +2 _conc/graphs/$g.tmg | head -n 1) \
           <(tail -n +2 _conc/graphs/$g-traveled.tmg | head -n 1 | sed -r 's~(.+ .+) .*~\1~') \
      | grep '^[<>]')
  if [ `echo "$d" | wc -w` -gt 0 ]; then echo "$g"; fi; done)

e=`echo -en '\e'`
for g in $trav; do
  cv=$(tail -n +2 _conc/graphs/$g.tmg | head -n 1 | cut -f1 -d' ')
  tv=$(tail -n +2 _conc/graphs/$g-traveled.tmg | head -n 1 | cut -f1 -d' ')
  echo "$e[1m$g$e[0m"
  d=$(diff <(tail -n +3 _conc/graphs/$g.tmg | head -n $cv) \
           <(tail -n +3 _conc/graphs/$g-traveled.tmg | head -n $tv) | grep '^[<>]')
  for l in $(echo "$d" | tr ' ' '%'); do
    echo "$l" | tr '%' ' '
    fgrep -ir -f <(echo "$l" | cut -f2 -d@ | cut -f1 -d% | sed 's~^+~~') ~/tm/UserData/rlist_files
  done
done

...gives strong (though not foolproof) evidence that all uncollapsed vertices are used in .lists.
Only 2 points gave no results; their uses were found manually:

  • Oce/OttQue@+DIV_Cha_W in use as +DIV_Cha on the other route that didn't get the pick for the hidden vertex label
  • AirTrn@+SKIP_EmpOnly used via an AltRouteName, TA IIRC
