A Datalog parser. This parser is used by Datahike and follows the Datalog dialect of Datomic.
Note: This repository has moved from the lambdaforge organization to replikativ, so older releases of the parser are on the lambdaforge Clojars page.
Add the current release of io.replikativ/datalog-parser to your project.clj. Start a REPL and run:
(require '[datalog.parser :as parser])
(parser/parse '[:find ?x :in $ ?y :where [?x :z ?y]])
;;=> (namespaces omitted for brevity)
;; #Query{:qfind  #FindRel{:elements [#Variable{:symbol ?x}]}
;;        :qwith  nil
;;        :qin    [#BindScalar{:variable #SrcVar{:symbol $}}
;;                 #BindScalar{:variable #Variable{:symbol ?y}}]
;;        :qwhere [#Pattern{:source  #DefaultSrc{}
;;                          :pattern [#Variable{:symbol ?x}
;;                                    #Constant{:value :z}
;;                                    #Variable{:symbol ?y}]}]}

For more examples, look at the tests.
This fork adds support for vector similarity search operations, designed for querying vector stores with metadata and vector columns.
;; Basic vector similarity search
[:find ?e ?score
 :where [?e :embedding ?v]
        [(vector-search ?v [0.1 0.2 0.3] {} ?score)]]

;; Vector search with options
[:find ?doc ?score
 :where [?doc :vector ?v]
        [(vector-search ?v ?query-vec {:metric :cosine :top-k 10} ?score)]]

;; Combined with metadata filtering
[:find ?e ?score
 :where [?e :category "science"]
        [?e :created_at ?date]
        [(> ?date "2024-01-01")]
        [?e :embedding ?v]
        [(vector-search ?v ?query-vec {:metric :cosine :threshold 0.8} ?score)]]

The options map supports:
- :metric - Similarity metric (:cosine, :euclidean, :dot, :l2) - default: :cosine
- :threshold - Minimum similarity score (0.0 to 1.0)
- :top-k - Return top K results
- :index - Index type hint (:hnsw, :ivf, :flat)
- :text-query - Text for hybrid search
- :text-field - Field name for text search
- :alpha - Weight for hybrid search (0 = pure text, 1 = pure vector)
- :rerank - Reranking strategy (:mmr, :cross-encoder)
- :diversity - Diversity parameter for MMR
- Additional options can be added for extensibility
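The documented defaults and value ranges can be checked before a query is built. The following is a minimal sketch in plain Clojure, not part of datalog-parser: `normalize-options`, `default-options` and `allowed-values` are hypothetical names, and the rules that :alpha must lie between 0 and 1 and that :top-k must be a positive integer are assumptions read off the descriptions in the list. Unknown keys are passed through unchanged, because the option set is described as extensible.

```clojure
;; Hypothetical helper, not part of datalog-parser.

(def default-options
  "Only :metric has a documented default."
  {:metric :cosine})

(def allowed-values
  "Enumerated values copied from the options list."
  {:metric #{:cosine :euclidean :dot :l2}
   :index  #{:hnsw :ivf :flat}
   :rerank #{:mmr :cross-encoder}})

(defn- unit-interval?
  "True when x is a number in the closed range 0.0 to 1.0."
  [x]
  (and (number? x) (<= 0.0 x 1.0)))

(defn normalize-options
  "Merges opts over the defaults and throws ex-info on an invalid value.
  Keys that are not checked here (for example :ef-search) pass through."
  [opts]
  (let [merged (merge default-options opts)]
    ;; Enumerated options must use one of the listed keywords.
    (doseq [[k allowed] allowed-values
            :let [v (get merged k)]
            :when (and (contains? merged k) (not (contains? allowed v)))]
      (throw (ex-info (str "Unsupported value for " k)
                      {:option k :value v :allowed allowed})))
    ;; :threshold is documented as 0.0 to 1.0; the same range for :alpha
    ;; is an assumption based on "0 = pure text, 1 = pure vector".
    (doseq [k [:threshold :alpha]
            :let [v (get merged k)]
            :when (and (contains? merged k) (not (unit-interval? v)))]
      (throw (ex-info (str k " must be a number between 0.0 and 1.0")
                      {:option k :value v})))
    ;; Assumption: :top-k only makes sense as a positive integer.
    (when (and (contains? merged :top-k)
               (not (pos-int? (:top-k merged))))
      (throw (ex-info ":top-k must be a positive integer"
                      {:option :top-k :value (:top-k merged)})))
    merged))

(normalize-options {:top-k 10})
;; => {:metric :cosine, :top-k 10}
```

Calling `(normalize-options {:metric :manhattan})` throws an `ExceptionInfo` whose data names the offending option, which can be easier to debug than a failure deep inside a query.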
;; Hybrid search combining vector and text
[:find ?doc ?title ?score
 :where [?doc :title ?title]
        [?doc :vector ?v]
        [(vector-search ?v ?query-vec {:metric :cosine
                                       :text-query "machine learning"
                                       :text-field :content
                                       :alpha 0.7}) ?score]]

;; With MMR diversity reranking
[:find ?doc ?score
 :where [?doc :vector ?v]
        [(vector-search ?v ?query-vec {:metric :cosine
                                       :top-k 20
                                       :rerank :mmr
                                       :diversity 0.3}) ?score]]

;; Index-specific parameters
[:find ?doc ?score
 :where [?doc :vector ?v]
        [(vector-search ?v ?query-vec {:metric :cosine
                                       :index :hnsw
                                       :ef-search 100}) ?score]]

To benchmark or profile the parser, change to the parser-perf namespace or require it:
(in-ns 'datalog.parser-perf)

Then run the parse benchmark or profiler:
(parse)
(profile)

To see the produced flame graph, you can start a web server on a port of your choice:
(in-ns 'datalog.parser-perf)
(prof/server-files port)

Unparsing support is still missing for these types:
- PullSpec
- PullAttrName
- PullReverseAttrName
- PullLimitExpr
- PullDefaultExpr
- PullWildcard
- PullRecursionLimit
- PullMapSpecEntry
- PullAttrWithOpts
Copyright © 2020 lambdaforge UG (haftungsbeschränkt), Nikita Prokopov
This program and the accompanying materials are made available under the terms of the Eclipse Public License 1.0.