Fix #263: Add box_id to ORDER BY for deterministic sorting #274
+2
−2
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fix #263: Add box_id to ORDER BY for deterministic sorting
🎯 Summary
Fixes Issue #263 - sortDirection parameter not working correctly
This PR fixes a PostgreSQL query bug that caused non-deterministic results when querying unspent boxes by address. The issue was a mismatch between
DISTINCT ONcolumns andORDER BYcolumns, which violates PostgreSQL requirements and leads to inconsistent query results.🐛 Problem Description
What Was Broken
The API endpoint
/api/v1/boxes/byAddress/{address}was returning different results on repeated queries with the same parameters. This affected:sortDirectionparameter (asc/desc) didn't work correctlyRoot Cause
The queries in
OutputQuerySet.scalaused:PostgreSQL Requirement: When using
DISTINCT ON (col1, col2, ...), theORDER BYclause must start with the same columns in the same order.What we had:
DISTINCT ON (o.box_id, o.global_index)butORDER BY o.global_index❌What we need:
DISTINCT ON (o.box_id, o.global_index)andORDER BY o.box_id, o.global_index✅Impact
✅ Solution
Added
o.box_idto theORDER BYclause in two query methods to match theDISTINCT ONclause.Changes Made
File:
modules/explorer-core/src/main/scala/org/ergoplatform/explorer/db/queries/OutputQuerySet.scalaChange 1: Line 253 (getMainUnspentByErgoTree)
Change 2: Line 289 (getMainUnspentByErgoTreeFiltered)
🧪 Testing
Manual Testing
Test with an address that has many boxes:
Before Fix: Different boxes returned on each iteration ❌
After Fix: Identical boxes returned on every iteration ✅
Expected Behavior After Fix
📊 Performance Impact
Performance: ✅ No negative impact
box_idcolumn is already part of the index used byDISTINCT ONORDER BYdoesn't require additional sorting🔍 Code Quality
📝 Checklist
🔗 References
💡 Technical Details
Why This Works
PostgreSQL's
DISTINCT ONremoves duplicate rows based on the specified columns, but it needs to know which duplicate to keep. TheORDER BYclause tells PostgreSQL how to sort the rows before selecting the first one from each group.When the
ORDER BYdoesn't start with the same columns asDISTINCT ON, PostgreSQL can't determine which row to keep deterministically, leading to non-deterministic results across query executions.Query Flow
WHERE o.main_chain = true AND i.box_id IS NULL AND o.ergo_tree = ?ORDER BY o.box_id, o.global_index DESCDISTINCT ON (o.box_id, o.global_index)- keeps first row per groupOFFSET ? LIMIT ?🎁 Additional Benefits
🚀 Deployment
This fix can be deployed immediately: