-
Query Languages, Visualization
-
FCS-QL Details
-
Query Mapping
-
More resources in Awesome FCS List > Query Parsers
-
CQL (Contextual Query Language)
-
BNF grammar: www.loc.gov/standards/sru/cql/spec.html#bnf
-
Hand-written parser implementation in Java, Python, JS, …
-
Documentation: Java
-
Visualization in demo of JS parser
-
Validation for Text+ LexCQL
-
-
-
FCS-QL (Federated Content Search Query Language)
-
EBNF grammar: github.com/clarin-eric/fcs-misc (FCS Core 2.0)
-
Grammar visualization with ANTLR4 tools
-
-
Installation
pip install antlr4-tools
git clone https://github.com/clarin-eric/fcs-ql.git
cd fcs-ql/src/main/antlr4/eu/clarin/sru/fcs/qlparser
-
Visualization according to ANTLR4 > Getting Started
antlr4-parse src/fcsql/FCSParser.g4 src/fcsql/FCSLexer.g4 query -gui
[ word = "her.*" ] [ lemma = "Artznei" ] [ pos = "VERB" ]
^D
QueryNode (with child node “children”)
-
Expression (layer identifier, layer identifier qualifier, operator, regular expression + flags)
-
Wildcard
-
Group → 1 QueryNode; “(” … “)”
-
NOT → 1 QueryNode
-
AND, OR → list of QueryNodes
-
-
QueryDisjunction → list of QueryNodes
-
QuerySequence → list of QueryNodes → “list of QuerySegments”
-
QuerySegment (min, max) → Expression → “a single token”
-
QueryGroup (min, max) → QueryNode
-
Within-Query (SimpleWithin, QueryWithWithin) (Scope: sentence, utterance, paragraph, turn, text, session) (unused)
-
grayed out: currently not supported by the FCS Aggregator for searching (in visual query builder)
Parsed Query:
-
Query Sequence → list of Query Segments
[ word = ".*her" ] [ lemma = "Artznei" ] [ pos = "VERB" ]
-
Query Segment → a single token (may be repeated)
[ word = "her.*" & ( word = "test" | word = "Apfel" ) ] [ pos = "ADV" ]{1,3}
-
Expression AND
[ word = "her.*" & word = "test" ]
-
Expression Group
-
Expression
-
-
Expression Group → Expression OR → list of Expression
[ ( word = "her.*" | word = "Test" ) ]
-
Expression → Layer Identifier, Operator, Regex (value)
[ word = "her.*" ]
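The node types and the example queries above can be mirrored in a few small data classes. This is only a sketch for illustration: the class and field names below are hypothetical and are not the API of the FCS-QL parser library, and Wildcard, QueryGroup, QueryDisjunction and Within are left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Expression:
    """A single constraint such as word = "her.*"."""
    layer: str                       # layer identifier, e.g. "word", "lemma", "pos"
    operator: str                    # "=" or "!="
    regex: str                       # regular expression (value)
    qualifier: Optional[str] = None  # layer identifier qualifier
    flags: str = ""                  # regex flags


@dataclass
class ExpressionGroup:
    child: "ExprNode"                # "(" ... ")" around exactly one node


@dataclass
class ExpressionNot:
    child: "ExprNode"                # exactly one node


@dataclass
class ExpressionAnd:
    children: List["ExprNode"]


@dataclass
class ExpressionOr:
    children: List["ExprNode"]


ExprNode = Union[Expression, ExpressionGroup, ExpressionNot,
                 ExpressionAnd, ExpressionOr]


@dataclass
class QuerySegment:
    """A single token, optionally repeated min..max times."""
    expression: ExprNode
    min: int = 1
    max: int = 1


@dataclass
class QuerySequence:
    children: List[QuerySegment]


def count_expressions(node) -> int:
    """Depth-first count of the leaf Expression nodes below `node`."""
    if isinstance(node, Expression):
        return 1
    if isinstance(node, QuerySegment):
        return count_expressions(node.expression)
    if isinstance(node, (ExpressionGroup, ExpressionNot)):
        return count_expressions(node.child)
    return sum(count_expressions(child) for child in node.children)


# [ word = "her.*" & ( word = "test" | word = "Apfel" ) ] [ pos = "ADV" ]{1,3}
EXAMPLE = QuerySequence([
    QuerySegment(ExpressionAnd([
        Expression("word", "=", "her.*"),
        ExpressionGroup(ExpressionOr([
            Expression("word", "=", "test"),
            Expression("word", "=", "Apfel"),
        ])),
    ])),
    QuerySegment(Expression("pos", "=", "ADV"), min=1, max=3),
])
```

Here `count_expressions(EXAMPLE)` yields 4, one for each `layer = "regex"` constraint in the query.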
-
-
Currently (Aggregator v3.9.1), FCS-QL features are only partially supported
→ partly due to the Visual Query Builder
-
Free text input / improved query builder planned for the future
-
Use appropriate diagnostics if query features are not supported
-
SRU: info:srw/diagnostic/1/48 - Query feature unsupported.
-
FCS: http://clarin.eu/fcs/diagnostic/10 - General query syntax error. (Should be intercepted by the FCS-QL parser library.)
-
FCS: http://clarin.eu/fcs/diagnostic/11 - Query too complex. Cannot perform Query.
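A minimal sketch of how an endpoint might pick one of these diagnostics before running a query. The diagnostic URIs and messages are the ones listed above. Everything else is an assumption for illustration: the `QueryDiagnostic` class, the flattened `(layer, regex, min, max)` input, the set of supported layers and the segment limit are hypothetical and would depend on the target system.

```python
SRU_QUERY_FEATURE_UNSUPPORTED = "info:srw/diagnostic/1/48"
FCS_GENERAL_QUERY_SYNTAX_ERROR = "http://clarin.eu/fcs/diagnostic/10"
FCS_QUERY_TOO_COMPLEX = "http://clarin.eu/fcs/diagnostic/11"

# Hypothetical capabilities of the target search engine.
SUPPORTED_LAYERS = {"text", "word", "lemma", "pos"}
MAX_SEGMENTS = 5


class QueryDiagnostic(Exception):
    """Carries the diagnostic URI that the endpoint should report."""

    def __init__(self, uri, message, details=""):
        super().__init__(message)
        self.uri = uri
        self.message = message
        self.details = details


def check_query(segments):
    """Raise a diagnostic if the query uses features the target lacks.

    `segments` is a simplified view of a parsed query sequence:
    one (layer, regex, min, max) tuple per token.
    """
    if len(segments) > MAX_SEGMENTS:
        raise QueryDiagnostic(
            FCS_QUERY_TOO_COMPLEX,
            "Query too complex. Cannot perform Query.",
            "more than %d segments" % MAX_SEGMENTS,
        )
    for layer, _regex, lo, hi in segments:
        if layer not in SUPPORTED_LAYERS:
            raise QueryDiagnostic(
                SRU_QUERY_FEATURE_UNSUPPORTED,
                "Query feature unsupported.",
                "layer: " + layer,
            )
        if (lo, hi) != (1, 1):
            raise QueryDiagnostic(
                SRU_QUERY_FEATURE_UNSUPPORTED,
                "Query feature unsupported.",
                "quantifier on segment",
            )
```

Syntax errors are not checked here, since they should already surface from the parser library. The endpoint would catch `QueryDiagnostic` and report its `uri` in the SRU response.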
-
-
Idea:
-
Let libraries parse raw queries (CQL, FCS-QL)
-
Recursively walk through the parsed query tree, “depth first”
-
Successively generate the transformed query (for the target system), e.g. with a StringBuilder in Java
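The idea can be sketched as a recursive, depth-first walk that appends fragments to a buffer. In Python, a list of parts joined at the end plays the role of Java's StringBuilder. The node classes, the layer-to-attribute mapping and the CQP-like target syntax below are assumptions for illustration only; a real endpoint would walk the tree produced by the parser library and emit whatever its search engine expects.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Expression:
    layer: str
    operator: str
    regex: str


@dataclass
class And:
    children: List[object]


@dataclass
class Or:
    children: List[object]


@dataclass
class Not:
    child: object


@dataclass
class Segment:
    expression: object
    min: int = 1
    max: int = 1


@dataclass
class Sequence:
    children: List[Segment]


# Hypothetical mapping from FCS layer identifiers to target attributes.
LAYER_TO_ATTRIBUTE = {"word": "word", "lemma": "lemma", "pos": "tag"}


class UnsupportedFeature(Exception):
    pass


def transform(node, out: List[str]) -> None:
    """Walk the tree depth first and append target-query fragments to `out`."""
    if isinstance(node, Sequence):
        for i, segment in enumerate(node.children):
            if i:
                out.append(" ")
            transform(segment, out)
    elif isinstance(node, Segment):
        out.append("[")
        transform(node.expression, out)
        out.append("]")
        if (node.min, node.max) != (1, 1):
            out.append("{%d,%d}" % (node.min, node.max))
    elif isinstance(node, (And, Or)):
        glue = " & " if isinstance(node, And) else " | "
        out.append("(")
        for i, child in enumerate(node.children):
            if i:
                out.append(glue)
            transform(child, out)
        out.append(")")
    elif isinstance(node, Not):
        out.append("!")
        transform(node.child, out)
    elif isinstance(node, Expression):
        if node.layer not in LAYER_TO_ATTRIBUTE:
            raise UnsupportedFeature("layer: " + node.layer)
        # Escaping of quotes inside the regex is omitted in this sketch.
        out.append('%s%s"%s"' % (
            LAYER_TO_ATTRIBUTE[node.layer], node.operator, node.regex))
    else:
        raise UnsupportedFeature(type(node).__name__)


def to_target_query(root) -> str:
    parts: List[str] = []
    transform(root, parts)
    return "".join(parts)


query = Sequence([
    Segment(Expression("word", "=", "her.*")),
    Segment(Expression("lemma", "=", "Artznei")),
    Segment(Expression("pos", "=", "VERB")),
])
print(to_target_query(query))
# prints: [word="her.*"] [lemma="Artznei"] [tag="VERB"]
```

Raising on unknown node types or layers keeps unsupported features from being silently dropped, so the endpoint can turn them into a diagnostic.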
-
-
Examples:
-
NoSketchEngine: CQL → CQL (Java), FCS-QL → CQL (Java)
-
Solr: CQL → Solr (Java), LexCQL → Solr (Java)
-
SolrQuery with highlighting, Custom hit prefixes/postfixes, use Solr result as pre-formatted Data View content (Code)
-
-
CQI Bridge: CQL → CQP (Java)
-
Elasticsearch
-
Only BASIC Search with full-text queries, e.g. with Simple Query String
-
-
Solr
-
Only BASIC Search
-
ADVANCED Search with e.g. MTAS (“Multi Tier Annotation Search”)
-
-
In general: use actual Corpus Search Engine for ADVANCED Search
→ otherwise at most a single annotation layer (“text”) can be searched
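For the Elasticsearch BASIC Search case mentioned above, a request body using the Simple Query String query might look roughly like the following sketch. The field name `text` and the function and parameter names are assumptions. The search term is expected to have been extracted from the parsed CQL query beforehand.

```python
import json


def basic_search_body(term: str, start_record: int = 1,
                      maximum_records: int = 10, field: str = "text") -> dict:
    """Build an Elasticsearch search body for a BASIC full-text search."""
    return {
        "from": start_record - 1,  # SRU startRecord is 1-based, "from" is 0-based
        "size": maximum_records,
        "query": {
            "simple_query_string": {
                "query": term,
                "fields": [field],          # the single searchable "text" layer
                "default_operator": "and",
            }
        },
        # Ask for highlighted fragments so that hits can be marked in the result.
        "highlight": {"fields": {field: {}}},
    }


print(json.dumps(basic_search_body("Apfel", start_record=11), indent=2))
```

Only one field is queried here, which reflects the limitation noted above: without a corpus search engine, a single annotation layer is all that can be searched.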