Query Construction and Query Language
Being a recommender, EEXCESS does not directly use a query language or queries themselves. Instead, the information need of the user is represented as a "profile" containing:
- Demographic and user information like name, address, languages, and interests
- Context information, i.e. information on the user's current situation. This includes:
- Keywords
- Named Entities like persons, organisations, locations, topics and miscellaneous
- Content and reason of the information need
- The page the information need originated from
The exact fields filled out depend on the privacy policy.
Context information consists of a set of keywords and/or a set of entities:
Keywords = {Keyword1, Keyword2, ....}
Entities = {<Entity1, TypeOfEntity1>, <Entity2, TypeOfEntity2>, ....}
Every keyword or entity consists of a list of terms, where a term
is defined as a sequence of non-space characters:
Keyword = <term1, term2, term3....>
Entity = <entity1, entity2, entity3....>
term = sequence of non-space characters
entity = sequence of non-space characters
Each Keyword or Entity also corresponds to one context entry in the JSON format, for example:
"contextKeywords":[
{
"text":"term1 term2 ...",
"weight":0.1,
"reason":"manual"
}
.
.
.
]
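As a minimal sketch of how such context entries could be assembled, the following Python snippet joins the terms of each keyword into the "text" field. The field names "text", "weight" and "reason" come from the format above; the helper names `to_context_entry` and `build_context_keywords`, the default weight, and the sample keywords are assumptions made for illustration only.

```python
import json


def to_context_entry(terms, weight=1.0, reason="manual"):
    """Build one context entry from the terms of a single keyword.

    Hypothetical helper: a term is a sequence of non-space characters,
    so terms that are empty or contain whitespace are rejected.
    """
    for term in terms:
        if not term or any(ch.isspace() for ch in term):
            raise ValueError(f"invalid term: {term!r}")
    return {"text": " ".join(terms), "weight": weight, "reason": reason}


def build_context_keywords(keywords):
    """keywords: list of (terms, weight) pairs, one pair per keyword."""
    return {"contextKeywords": [to_context_entry(t, w) for t, w in keywords]}


# Sample data (invented for this example).
profile_fragment = build_context_keywords(
    [(["graz", "university"], 0.1), (["recommender"], 0.5)]
)
print(json.dumps(profile_fragment, indent=2))
```

Each keyword becomes exactly one entry, so a multi-term keyword stays together in a single "text" value rather than being split into separate entries.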
Since all partner systems of the recommender are search-based, the profile has to be translated into a query language. While the individual translation depends on the partner system, the recommender retains the following semantics:
1. The terms of a single entity or keyword are combined with AND semantics, i.e. a document must contain all of them. Term order and proximity do not matter.
2. Multiple entities or keywords are combined with OR semantics, i.e. a document must contain at least one entity or keyword to be returned. Documents containing more entities should be ranked higher.
The resulting query therefore has the following form:
term1Entity1 AND term2Entity1 ... AND termkEntity1 OR term1Entity2 AND term2Entity2 ... AND termk2Entity2 ..
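The AND/OR semantics described here can be sketched in Python as follows. This is an illustration under stated assumptions, not the recommender's actual translation code: the function names (`build_query`, `score`, `rank`), the whitespace tokenization, the case-insensitive matching, and the ranking by number of matched entries are all choices made for this example.

```python
def build_query(entries):
    """Render the generic query form: AND within an entry, OR between entries.

    entries: list of term lists, one list per keyword or entity.
    """
    clauses = [" AND ".join(terms) for terms in entries if terms]
    return " OR ".join(clauses)


def score(document, entries):
    """Count how many entries are fully contained in the document.

    An entry matches only if all of its terms occur (AND semantics);
    term order and proximity are ignored.
    """
    tokens = set(document.lower().split())
    return sum(
        1
        for terms in entries
        if terms and all(term.lower() in tokens for term in terms)
    )


def rank(documents, entries):
    """Return documents matching at least one entry (OR semantics),
    with documents matching more entries first."""
    scored = [(score(doc, entries), doc) for doc in documents]
    hits = [(s, doc) for s, doc in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)  # stable sort
    return [doc for _, doc in hits]


# Sample data (invented for this example).
entries = [["alan", "turing"], ["enigma"]]
documents = ["enigma machine", "turing alan broke enigma", "alan smith"]
query = build_query(entries)
ranked = rank(documents, entries)
```

Here "alan smith" is dropped because it contains only one of the two terms of the first entry, while the document matching both entries is ranked ahead of the one matching a single entry.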
The Chrome Plugin (C4) has the following search syntax:
entityType:"term1 term2" "term3 term4"
If the entity type is not provided, the terms are treated as keywords (as with "term3 term4").
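To illustrate this syntax, the following Python sketch splits a search string into typed entities and untyped keywords. It is not the plugin's own parser: the function name `parse_search`, the regular expression, and the decision to handle only double-quoted groups are assumptions, and the entity type "person" in the sample input is merely an example.

```python
import re

# Optional "entityType:" prefix followed by a double-quoted group of terms.
GROUP = re.compile(r'(?:(\w+):)?"([^"]*)"')


def parse_search(text):
    """Split a search string into entries.

    Each entry is a dict with the entity type (None for plain keywords)
    and the list of terms found inside the quotes.
    """
    entries = []
    for match in GROUP.finditer(text):
        entity_type, quoted = match.group(1), match.group(2)
        entries.append({"type": entity_type, "terms": quoted.split()})
    return entries


# Sample input (invented for this example).
parsed = parse_search('person:"alan turing" "enigma machine"')
for entry in parsed:
    kind = entry["type"] or "keyword"
    print(kind, entry["terms"])
```

Groups without a type prefix come back with `type` set to `None`, which corresponds to the keyword case described above.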