Querying: RQL - Raven Query Language
Queries in RavenDB use an SQL-like language called RQL (Raven Query Language).
Overview
RQL is designed to expose externally the RavenDB query pipeline in a way that is easy to understand, easy to use, and not overwhelming to the user.
Query Optimizer
As soon as a query reaches a RavenDB instance, the instance calls its query optimizer to analyze the query and determine which indexes should be used to retrieve the requested data.
Dynamic and Indexed Queries
RavenDB has two types of queries:
- A dynamic query, e.g.
from Orders where ...
, which gives the query optimizer full freedom to choose which index the query will use. - An indexed query, that specifies the index it would use.
E.g.from index 'Orders/ByCompany' where ...
, which instructs RavenDB to use theOrders/ByCompany
index.
Queries Usage of Indexes
In other databases, the query optimizer may fail to find a suitable index and fall back into querying using a full scan.
RavenDB doesn't include support for full scans. If an index cannot be found for a query, the query optimizer will create a new index for the query.
RavenDB queries will use the index they have created or found, to minimize response time and return results at the same speed regardless of the size of the data.
You can read more about indexes here.
Exceptions: Time-Series and Counters
Queries in RavenDB are indexed automatically (unless they are already indexed).
Distributed Time-Series and
Distributed Counters are an exception
to this rule: they are not indexed automatically, though they can be indexed
manually.
Time series and counters that haven't been indexed manually, will be queried without index.
Indexing and queries in RavenDB
Indexing in RavenDB is a background operation.
An yet-unindexed query will wait for the indexing process to complete (or timeout).
A query that can be answered using an existing index will proceed normally
using this index.
When the creation of a new index has caught up, RavenDB will remove all the
old indexes that are now superseded by the new index.
RQL Keywords and Methods
declare
The keyword declare
gives you the ability to create a JavaScript function that can be reused in select
(when projection is done). You can read more about it here.
from
The keyword from
is used to determine the source data that will be used when a query is executed.
You have two options:
-
from <collection>
This option is used to perform:- Collection queries that perform just basic ID filtering.
When this is the case, there is no need to query an index and the required document is returned directly from the storage.
E.g.
from Companies where id() == 'companies/1-A'
- Dynamic queries that are executed using auto indexes.
All Documents
In order to query all documents, the
@all_docs
keyword can be used:- `from @all_docs where FirstName = 'Laura'` - `from @all_docs where id() = 'companies/1-A'`.
- Collection queries that perform just basic ID filtering.
-
from INDEX <index-name>
This option is used to perform RQL operations with a specific index.
group by
The keyword group by
is used to create an aggregation query. Please refer to the article about dynamic group by queries to find out more.
where
You can use the where
keyword with various operators
to filter chosen documents from the final result-set.
Operator: >=
, <=
, <>
, !=
, <
, >
, =
, ==
These basic operators can be used with all value types, including 'numbers' and 'strings'.
You can, for example, return every document from the companies collection whose field value = a given input.
from Companies
where Name = 'The Big Cheese'
Filtering on nested properties is also supported, so in order to return all companies from 'Albuquerque' we need to execute following query:
from Companies
where Address.City = 'Albuquerque'
Operator: between
The operator between
returns results inclusively, and the type of border values used must match. It works on both 'numbers' and 'strings' and can be substituted with the >=
and <=
operators (see the example below).
from Products
where PricePerUnit between 10.5 and 13.0
or
from Products
where PricePerUnit >= 10.5 and PricePerUnit <= 13.0
Operator: in
The operator in
is validating if a given field contains passed values. It will return results if a given field matches any of the passed values.
from Companies
where Name IN ('The Big Cheese', 'Unknown company name')
from Orders
where Lines[].ProductName in ('Chang', 'Spegesild', 'Unknown product name')
Operator: all in
This operator checks if all passes values are matching a given field. Due to its mechanics, it is only useful when used on array fields.
The following query will yield no results in contrast to an 'in' operator.
from Orders
where Lines[].ProductName all in ('Chang', 'Spegesild', 'Unknown product name')
but removing the 'Unknown product name' will give you orders that only contains products with both 'Chang' and 'Spegesild' names
from Orders
where Lines[].ProductName all in ('Chang', 'Spegesild')
Binary Operators: AND
, OR
, NOT
Binary operators can be used to build more complex statements. The NOT
operator can only be used with one of the other binary operators creating OR NOT
or AND NOT
ones.
from Companies
where Name = 'The Big Cheese' OR Name = 'Richter Supermarkt'
from Orders
where Freight > 500 AND ShippedAt > '1998-01-01'
from Orders
where Freight > 500 AND ShippedAt > '1998-01-01' AND NOT Freight = 830.75
Subclauses: (
, )
Subclauses can be used along with binary operators to build even more complex logical statements. They are self-explanatory so no example will be given.
order by
To perform sorting, order by
must be used.
Read more about this subject in the article dedicated to sorting.
load
When there is a need to use data from an external document in projection, load
can be used. Please refer to the following projection article to find out more about it.
select
Projections can be performed by using select
. Please read our dedicated projection article here.
update
To patch documents on the server-side, use update
with the desired JavaScript script that will be applied to any documents matching the query criteria. For more information, please refer to our patching article.
include
The keyword include
has been introduced to support:
- including additional documents or counters to the query response
- highlighting results
- query timings
- explanations
with
The keyword with
is used to determine the data source of a graph query.
There are two types of with
clauses, regular with
and with edges
.
-
with:
with {from Orders} as o
The above statement means that the data source referred to by the aliaso
is the result of thefrom Orders
query -
with edges:
with edges (Lines) { where Discount >= 0.25 select Product } as cheap
The above statement means that our data source is the propertyLines
of the source documents and we filter all lines that matchDiscount >= 0.25
query the destination referred to by thecheap
alias is the product pointed by theProduct
property of the order line
match
The keyword match
is used to determine the pattern of a graph query.
match (Orders as o)-[Lines as cheap where Discount >= 0.25 select Product]->(Products as p)
The above statement means that we are searching for a pattern that starts with an order and traverse using the
order lines referred to by the Lines
property where their Discount
property is larger than 25% and the destination is the product referred to by the Product
property.
A match may contain an edge in both direction, a right edge would look like so (node1)-[right]->(node2)
and a left one would look like so (node1)<-[left]-(node2)
.
Any combination of edges is allowed in a match clause e.g.
(node1)-[right]->(node2)<-[left]-(node3)
The above match will actually be translated to:
(node1)-[right]->(node2)
and
(node3)-[left]->(node2)
where the and
is a set intersection between the two patterns.
Keywords and Methods
The following keywords and methods are available in RQL:
- DECLARE
-
FROM
- INDEX
- GROUP BY
-
WHERE
- id()
- search()
- cmpxchg()
- boost()
- regex()
- startsWith()
- endsWith()
- lucene()
- exists()
- exact()
- intersect()
- spatial.within()
- spatial.contains()
- spatial.disjoint()
- spatial.intersects()
- moreLikeThis()
- ORDER BY
- LOAD
-
SELECT
- DISTINCT
- key()
- sum()
- count()
- facet()
- timeseries()
- counter()
- UPDATE
- INCLUDE
- WITH
- MATCH
With the following operators:
- >=
- <=
- <> or !=
- <
- >
- = or ==
- BETWEEN
- IN
- ALL IN
- OR
- AND
- NOT
- (
- )
And the following values:
- true
- false
- null
- string e.g. 'John' or "John"
- number (long and double) e.g. 17
- parameter e.g. $param1