BUG: Hybrid RRF fusion dilutes vector scores with weak FTS scores #577

@bm-clawd

Problem

Hybrid search, which fuses FTS and vector results with Reciprocal Rank Fusion (RRF), produces worse results than vector-only search. The fusion step appears to pull a strong vector score down toward a weak FTS score, so the combined relevance no longer reflects how good the match is.

Evidence

Query: arscontexta skill graphs tweet viral

  • Vector-only: 65.4% relevance for correct match ✅
  • FTS-only: 1.6% ❌
  • Hybrid (RRF): 3.2% — worse than vector alone ❌

The correct note (which contains an entire section about arscontexta's skill graphs tweet) scores 65.4% on semantic similarity alone, but drops to 3.2% once it is fused with the FTS result.
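
For reference, here is a minimal sketch of textbook RRF. It assumes the smoothing constant `k = 60` that is commonly used for RRF; the constant and the exact implementation in BM are not shown in this issue, and the note IDs are made up. It shows that a raw RRF score is built from ranks only and is capped at `n / (k + 1)` for `n` result lists, so with two lists it can never exceed roughly 3.3%, even for a note ranked first in both. If BM reports the raw RRF value as "relevance", part of the gap above may come from comparing two different scales rather than from weak FTS matches alone.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Textbook Reciprocal Rank Fusion: sum 1 / (k + rank) over each ranked list.

    `rankings` is a list of ranked ID lists (best first). `k=60` is an
    assumption; BM's actual constant is not given in this issue.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return dict(scores)


# Hypothetical rankings: "note-a" is the top hit in both lists.
vector_ranking = ["note-a", "note-b", "note-c"]
fts_ranking = ["note-a", "note-c"]

fused = rrf_fuse([vector_ranking, fts_ranking])

# Upper bound for two lists: rank 1 in both, i.e. 2 / 61.
print(f"{fused['note-a']:.4f}")  # 0.0328
```

Note that the original similarity values never enter the formula, so a 65.4% cosine similarity and a 20% one produce the same fused score if they hold the same rank.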

Expected Behavior

Hybrid should produce a score of at least max(vector, FTS), not one far below the strongest component. When vector returns a strong match and FTS doesn't, the hybrid score should still reflect the vector confidence.

Impact

High. The OpenClaw plugin uses default search (not --vector), so users get the diluted hybrid scores. This makes memory_search unreliable for semantic recall.

Possible Fixes

  1. Weighted RRF: give vector results higher weight than FTS
  2. Max-score fusion instead of averaging
  3. Fall through: if vector score > threshold, use it directly
  4. Let callers specify search mode preference
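
A rough sketch of options 1 to 3, to make the trade-offs concrete. All function names, weights, and the threshold are hypothetical and not taken from the BM codebase; the scores are the ones reported under Evidence, and both inputs to the max-score variant are assumed to be normalized to the range 0 to 1.

```python
def weighted_rrf(vector_ranking, fts_ranking, w_vector=0.7, w_fts=0.3, k=60):
    """Option 1: RRF where each list's contribution is scaled by a weight."""
    scores = {}
    for weight, ranking in ((w_vector, vector_ranking), (w_fts, fts_ranking)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return scores


def max_score_fusion(vector_scores, fts_scores):
    """Option 2: keep the larger of the two scores for each document."""
    doc_ids = set(vector_scores) | set(fts_scores)
    return {
        doc_id: max(vector_scores.get(doc_id, 0.0), fts_scores.get(doc_id, 0.0))
        for doc_id in doc_ids
    }


def fall_through(vector_scores, fused_scores, threshold=0.5):
    """Option 3: report the vector score directly when it clears a threshold."""
    result = dict(fused_scores)
    for doc_id, score in vector_scores.items():
        if score > threshold:
            result[doc_id] = score
    return result


# Scores from the Evidence section, as fractions; the note ID is made up.
vector_scores = {"note": 0.654}
fts_scores = {"note": 0.016}
hybrid_scores = {"note": 0.032}

print(max_score_fusion(vector_scores, fts_scores)["note"])  # 0.654
print(fall_through(vector_scores, hybrid_scores)["note"])   # 0.654
```

Options 2 and 3 satisfy the expected behavior for this query by construction. Option 1 changes the ordering in favor of vector hits, but the absolute score stays on the small RRF scale, so it would probably need to be paired with normalization if callers filter on a relevance threshold. Option 3 mixes two scales in one result list, which may affect ordering between documents above and below the threshold.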

Environment

  • BM v0.18.3, SQLite backend, fastembed bge-small-en-v1.5
  • 66 entities in claw project, all with embeddings

Labels

  • bug