Specification “Relations Query Language” (Hercules)¶
Introduction¶
Goals RQL¶
The goal is to have a language emphasizing the way of browsing relations. As such, attributes will be regarded as cases of special relations (in terms of implementation, the user language not to see virtually no difference between an attribute and a relation).
RQL is inspired by SQL but is the highest level. A knowledge of the CubicWeb schema defining the application is necessary.
Comparison with existing languages¶
SQL¶
RQL builds on the features of SQL but is at a higher level (the current implementation of RQL generates SQL). For that it is limited to the way of browsing relations and introduces variables. The user does not need to know the model underlying SQL, but the CubicWeb scheam defining the application.
Versa¶
Should I look in more detail, but here is already some ideas for the moment … Versa is the language most similar to what we wanted to do, but the model underlying data being RDF, there is some number of things such as namespaces or handling of the RDF types which does not interest us. On the functionality level, Versa is very comprehensive including through many functions of conversion and basic types manipulation, which may need to be guided at one time or another. Finally, the syntax is a little esoteric.
See also¶
The different types of queries¶
- Search ( Any)
This type of query can extract entities and attributes of entities.
- Inserting entities ( INSERT)
This type of query is used to insert new entities in the database. It will also create direct relationships entities newly created.
- Update entities, relations creation( SET)
This type of query updates existing entities in the database, or create relations between existing entities.
- Deletion of entities or relationship ( DELETE)
This type of query allows for the removal of entities and relations existing in the database.
Examples¶
(see the tutorial: ref: tutorielRQL for more examples)
Search Query¶
[ DISTINCT] <entity type> V1 (V2) * [ GROUPBY V1 (V2) *] [ ORDERBY <orderterms>] [ WHERE <restriction>] [ LIMIT <value>] [ OFFSET <value>]
- entity type:
Type of selected variables. The special type Any is equivalent to not specify a type.
- restriction:
- list of relations to go through whic follow the pattern
V1 relation V2 | <static value>
- orderterms:
Definition of the selection order: variable or column number followed by sorting method ( ASC, DESC), ASC is the default.
- note for grouped queries:
For grouped queries (e.g., a clause GROUPBY), all selected variables must be aggregated or grouped.
- Search for the object of identifier 53
Any WHERE X X eid 53
- Search material such as comics, owned by syt and available
WHERE X Document X occurence_of F, F class C, C name 'Comics' X owned_by U, U login 'syt' X available true
- Looking for people working for eurocopter interested in training
Person P WHERE P work_for P, S name 'Eurocopter' P interested_by T, T name 'training'
- Search note less than 10 days old written by jphc or ocy
Note N WHERE N written_on D, D day> (today -10), N written_by P, P name 'jphc' or P name 'ocy'
- Looking for people interested in training or living in Paris
Person P WHERE (P interested_by T, T name 'training') or (P city 'Paris')
- The name and surname of all people
Any N, P WHERE X is Person, X name N, X first_name P
Note that the selection of several entities generally force the use of “Any” because the type specification applies otherwise to all the selected variables. We could write here
String N, P WHERE X is Person, X name N, X first_name P
Insertion query¶
INSERT <entity type> V1 (, <entity type> V2) * : <assignments> [ WHERE <restriction>]
- : assignments:
list of relations to assign in the form V1 relationship V2 | <static value>
The restriction can define variables used in assignments.
Caution, if a restriction is specified, the insertion is done for each line results returned by the restriction.
- Insert a new person named ‘foo’
INSERT Person X: X name 'widget'
Insert a new person named ‘foo’, another called ‘nice’ and a ‘friend’ relation between them
INSERT Person X, Person Y: X name 'foo', Y name 'nice', X friend Y
Insert a new person named ‘foo’ and a ‘friend’ relation with an existing person called ‘nice’
INSERT Person X: X name 'foo', X friend Y WHERE name 'nice'
Update and relation creation queries¶
SET <assignements> [ WHERE <restriction>]
Caution, if a restriction is specified, the update is done for each line results returned by the restriction.
Renaming of the person named ‘foo’ to ‘bar’ with the first name changed
SET X name 'bar', X first_name 'original' where X is Person X name 'foo'
Insert a relation of type ‘know’ between objects linked by the relation of type ‘friend’
SET X know Y WHERE X friend Y
Deletion query¶
DELETE (<entity type> V) | (V1 relation v2 ),… [ WHERE <restriction>]
Caution, if a restriction is specified, the deletion is made for each line results returned by the restriction.
Deletion of the person named ‘foo’
DELETE Person X WHERE X name 'foo'
Removal of all relations of type ‘friend’ from the person named ‘foo’
DELETE X friend Y WHERE X is Person, X name 'foo'
Language definition¶
Reserved keywords¶
The keywords are not case sensitive.
DISTINCT, INSERT, SET, DELETE,
WHERE, AND, OR, NOT
IN, LIKE, ILIKE,
TRUE, FALSE, NULL, TODAY, NOW
GROUPBY, ORDERBY, ASC, DESC
Variables and Typing¶
With RQL, we do not distinguish between entities and attributes. The value of an attribute is considered an entity of a particular type (see below), linked to one (real) entity by a relation called the name of the attribute.
Entities and values to browse and/or select are represented in the query by variables that must be written in capital letters.
There is a special type Any, referring to a non specific type.
We can restrict the possible types for a variable using the special relation is. The possible type(s) for each variable is derived from the schema according to the constraints expressed above and thanks to the relations between each variable.
Built-in types¶
The base types supported are string (between double or single quotes), integers or floats (the separator is the’.’), dates and boolean. We expect to receive a schema in which types String, Int, Float, Date and Boolean are defined.
String (literal: between double or single quotes).
Int, Float (separator being’.’).
- Date, Datetime, Time (literal: string YYYY/MM/DD [hh:mm] or keywords
TODAY and NOW).
Boolean (keywords TRUE and FALSE).
Keyword NULL.
Operators¶
Logical Operators¶
AND, OR, ','
‘,’ is equivalent to ‘AND’ but with the smallest among the priority of logical operators (see Operators priority).
Mathematical Operators¶
+, -, *, /
Comparison operators¶
=, <, <=, >=, > = ~, IN, LIKE, ILIKE
The operator = is the default operator.
The operator LIKE equivalent to ~= can be used with the special character % in a string to indicate that the chain must start or finish by a prefix/suffix:
Any X WHERE X name =~ 'Th%' Any X WHERE X name LIKE '%lt'
The operator ILIKE is a case-insensitive version of LIKE.
The operator IN provides a list of possible values:
Any X WHERE X name IN ( 'chauvat', 'fayolle', 'di mascio', 'thenault')
XXX nico: A trick <> ‘bar’ would not it be more convenient than NOT A trick ‘bar’?
Operators priority¶
‘*’, ‘/’
‘+’, ‘-’
‘and’
‘or’
‘,’
Advanced Features¶
Functions aggregates¶
COUNT, MIN, MAX, AVG, SUM
Functions on string¶
UPPER, LOWER
Optional relations¶
They allow you to select entities related or not to another.
You must use the ? behind the variable to specify that the relation toward it is optional:
Anomalies of a project attached or not to a version
Any X, V WHERE X concerns P, P eid 42, X corrected_in V?
All cards and the project they document if necessary
Any C, P WHERE C is Card, P? documented_by C
BNF grammar¶
The terminal elements are in capital letters, non-terminal in lowercase. The value of the terminal elements (between quotes) is a Python regular expression.
statement:: = (select | delete | insert | update) ';'
# select specific rules
select ::= 'DISTINCT'? E_TYPE selected_terms restriction? group? sort?
selected_terms ::= expression ( ',' expression)*
group ::= 'GROUPBY' VARIABLE ( ',' VARIABLE)*
sort ::= 'ORDERBY' sort_term ( ',' sort_term)*
sort_term ::= VARIABLE sort_method =?
sort_method ::= 'ASC' | 'DESC'
# delete specific rules
delete ::= 'DELETE' (variables_declaration | relations_declaration) restriction?
# insert specific rules
insert ::= 'INSERT' variables_declaration ( ':' relations_declaration)? restriction?
# update specific rules
update ::= 'SET' relations_declaration restriction
# common rules
variables_declaration ::= E_TYPE VARIABLE (',' E_TYPE VARIABLE)*
relations_declaration ::= simple_relation (',' simple_relation)*
simple_relation ::= VARIABLE R_TYPE expression
restriction ::= 'WHERE' relations
relations ::= relation (LOGIC_OP relation)*
| '(' relations')'
relation ::= 'NOT'? VARIABLE R_TYPE COMP_OP? expression
| 'NOT'? R_TYPE VARIABLE 'IN' '(' expression (',' expression)* ')'
expression ::= var_or_func_or_const (MATH_OP var_or_func_or_const) *
| '(' expression ')'
var_or_func_or_const ::= VARIABLE | function | constant
function ::= FUNCTION '(' expression ( ',' expression) * ')'
constant ::= KEYWORD | STRING | FLOAT | INT
# tokens
LOGIC_OP ::= ',' | 'GOLD' | 'AND'
MATH_OP ::= '+' | '-' | '/' | '*'
COMP_OP ::= '>' | '>=' | '=' | '<=' | '<' | '~=' | 'LIKE' | 'ILIKE'
FUNCTION ::= 'MIN' | 'MAX' | 'SUM' | 'AVG' | 'COUNT' | 'upper' | 'LOWER'
VARIABLE ::= '[A-Z][A-Z0-9]*'
E_TYPE ::= '[A-Z]\w*'
R_TYPE ::= '[a-z_]+'
KEYWORD ::= 'TRUE' | 'FALSE' | 'NULL' | 'TODAY' | 'NOW'
STRING ::= "'([^'\]|\\.)*'" |'"([^\"]|\\.)*\"'
FLOAT ::= '\d+\.\d*'
INT ::= '\d+'
Remarks¶
Sorting and groups¶
For grouped queries (e.g. with a GROUPBY clause), all selected variables should be grouped.
To group and/or sort by attributes, we can do: “X,L user U, U login L GROUPBY L, X ORDERBY L”
If the sorting method (SORT_METHOD) is not specified, then the sorting is ascendant.
Negation¶
A query such as Document X WHERE NOT X owned_by U means “the documents have no relation owned_by”.
But the query Document X WHERE NOT X owned_by U, U login “syt” means “the documents have no relation owned_by with the user syt”. They may have a relation “owned_by” with another user.
Identity¶
You can use the special relation identity in a query to
add an identity constraint between two variables. This is equivalent
to is
in python:
Any A WHERE A comments B, A identity B
return all objects that comment themselves. The relation identity is especially useful when defining the rules for securities with RQLExpressions.
Implementation¶
Internal representation (syntactic tree)¶
The tree research does not contain the selected variables (e.g. there is only what follows “WHERE”).
The insertion tree does not contain the variables inserted or relations defined on these variables (e.g. there is only what follows “WHERE”).
The removal tree does not contain the deleted variables and relations (e.g. there is only what follows the “WHERE”).
The update tree does not contain the variables and relations updated (e.g. there is only what follows the “WHERE”).
Select ((Relationship | And | Gold)?, Group?, Sort?)
Insert (Relations | And | Gold)?
Delete (Relationship | And | Gold)?
Update (Relations | And | Gold)?
And ((Relationship | And | Gold), (Relationship | And | Gold))
Or ((Relationship | And | Gold), (Relationship | And | Gold))
Relationship ((VariableRef, Comparison))
Comparison ((Function | MathExpression | Keyword | Constant | VariableRef) +)
Function (())
MathExpression ((MathExpression | Keyword | Constant | VariableRef), (MathExpression | Keyword | Constant | VariableRef))
Group (VariableRef +)
Sort (SortTerm +)
SortTerm (VariableRef +)
VariableRef ()
Variable ()
Keyword ()
Constant ()
Remarks¶
The current implementation does not support linking two relations of type ‘is’ with a OR. I do not think that the negation is supported on this type of relation (XXX FIXME to be confirmed).
Relations defining the variables must be left to those using them. For example:
Point P where P abs X, P ord Y, P value X+Y is valid, but:: Point P where P abs X, P value X+Y, P ord Y is not.
Conclusion¶
Limitations¶
It lacks at the moment:
COALESCE
restrictions on groups (HAVING)
and certainly other things …
A disadvantage is that to use this language we must know the format used (with real relation names and entities, not those viewing in the user interface). On the other hand, we can not really bypass that, and it is the job of a user interface to hide the RQL.
Topics¶
It would be convenient to express the schema matching relations (non-recursive rules):
Document class Type <-> Document occurence_of Fiche class Type
Sheet class Type <-> Form collection Collection class Type
Therefore 1. becomes:
Document X where
X class C, C name 'Cartoon'
X owned_by U, U login 'syt'
X available true
I’m not sure that we should handle this at RQL level …
There should also be a special relation ‘anonymous’.