I'm digging into nightly benchy and concurrent HNSW merging, trying to extend segment traces to record details of HNSW concurrency.
InfoStream has these useful lines:
HNSW 0 [2026-04-15T14:58:00.633261871Z; hnsw-merge-1-thread-91]: built 950000 in 11447/11447 ms
But the in X/X ms is always the same X. Here's the source:
private long printGraphBuildStatus(int node, long start, long t) {
long now = System.nanoTime();
infoStream.message(
HNSW_COMPONENT,
String.format(
Locale.ROOT,
"built %d in %d/%d ms",
node,
TimeUnit.NANOSECONDS.toMillis(now - t),
TimeUnit.NANOSECONDS.toMillis(now - start)));
return now;
}
And it's called from addVectors in the same class:
/** add vectors in range [minOrd, maxOrd) */
protected void addVectors(int minOrd, int maxOrd) throws IOException {
if (frozen) {
throw new IllegalStateException("This HnswGraphBuilder is frozen and cannot be updated");
}
long start = System.nanoTime(), t = start;
if (infoStream.isEnabled(HNSW_COMPONENT)) {
infoStream.message(HNSW_COMPONENT, "addVectors [" + minOrd + " " + maxOrd + ")");
}
for (int node = minOrd; node < maxOrd; node++) {
addGraphNode(node);
if ((node % 10000 == 0) && infoStream.isEnabled(HNSW_COMPONENT)) {
t = printGraphBuildStatus(node, start, t);
}
}
}
Since the chunk size is always 2048 (I think?), a worker working on a chunk could print 0 or 1 times ... which means the incoming t is always the same as start.
I suspect this was factored out from a place where start was the true start time (to create the whole graph), and this InfoStream message is trying to show "time since we started" and "time since the last progress print" or so. It's useful to fix this so segment traces can help debug HNSW merge concurrency...
I'm digging into nightly benchy and concurrent HNSW merging, trying to extend segment traces to record details of HNSW concurrency.
InfoStream has these useful lines:
But the
in X/X msis always the sameX. Here's the source:And it's called from
addVectorsin the same class:Since the chunk size is always 2048 (I think?), a worker working on a chunk could print 0 or 1 times ... which means the incoming
tis always the same asstart.I suspect this was factored out from a place where
startwas the true start time (to create the whole graph), and this InfoStream message is trying to show "time since we started" and "time since the last progress print" or so. It's useful to fix this so segment traces can help debug HNSW merge concurrency...