Exploration Queries



filter

Exploration queries can be applied using -

  • query.filter
  • RQL's filter keyword

The added filtering is parsed and executed by RavenDB's Javascript engine.

The provided filtering operations resemble those implemented by where and can be further enhanced by Javascript functions of your own. Read here about creating and using your own Javascript function in your filters.

When Should Exploration Queries Be Used

filter can be applied to a Full-Collection query, for example:

// A full-collection query:
// ========================

from Employees as e
filter e.Address.Country == "USA"

// Results include only employees from USA
// No auto-index is created

It can also be applied to queries handled by an index:

// A dynamic query:
// ================

from Employees as e 
where e.Title == "Sales Representative" // This triggers auto-index creation
filter e.Address.Country == "USA"

// Results include only employees from USA having the specified title
// The auto-index created only indexes the 'Title' field

// A static-index query:
// =====================

from index "Orders/ByCompany" 
filter Count > 10

// Results include orders with Count > 10 using the specified static-index

Both in a collection query and in a query handled by an index, all the results that are retrieved by the query are scanned and filtered by filter.

When to use

Use filter for an ad-hoc exploration of the retrieved dataset, that matches no existing index and is not expected to be repeated much.

  • You gain the ability to filter post-query results on the server side, for both collection queries and when an index was used.
  • The dataset will be filtered without creating an unrequired index that the cluster would continue updating from now on.

Limit the query, and prefer where for recurring queries

Be aware that when a large dataset is retrieved, like the whole collection in the case of a collection query,
exploring it using filter would tax the server in memory and CPU usage while it checks the filter condition for each query result, and cost the user a substantial waiting time. Therefore:

  • Limit the number of records that an exploration query filters, e.g.:

    from Orders
    // Apply filter
    filter ShipTo.Country == "UK"
    // Limit the number of records that will be scanned by the filter operation
    filter_limit 100 
    
    // While there are 830 records in the Orders collection, 
    // only the first 100 records that are retrieved by the query are scanned by 'filter'
    // Running this RQL on the sample data returns 4 matching results out of the 100 scanned.
  • Use where rather than filter for recurring filtering.
    where will use an index, creating it if necessary, to accelerate the filtering in subsequent queries.

Usage examples

With collection queries:

Use filter with a full-collection query to scan and filter the entire collection.

const filteredCompanies = await session
     // Make a full-collection query
    .query({ collection: "companies" })
     // Apply a filter, scan only first 50 records from query results
    .filter(x => x.equals("Address.Country", "USA"), 50)
    .all();

// Results: 
// ========

// * While a full-collection query on the 'companies' colletion yields 91 results
//   only the first 50 records are scanned by the filter operation -
//   resulting in 5 matching documents.
//
// * No auto-index is created.
const filteredCompanies = await session
    .advanced
    .rawQuery("from Companies filter Address.Country == 'USA' filter_limit 50")
    .all();
from "Companies" 
filter Address.Country == "USA"
filter_limit 50

Filtering a sizable collection will burden the server and prolong user waiting time.
It is recommended to set a filter_limit to restrict the number of filtered records.


With queries that use an index:

Use filter after a whereEquals clause to filter the results retrieved by the query.

const filteredCompanies = await session
     // Make a dynamic query on a collection
    .query({ collection: "companies" })
     // Apply some condition - this will trigger auto-index creation
    .whereEquals("Contact.Title", "Sales Representative")
     // Apply a filter 
    .filter(x => x.equals("Address.Country", "Germany"))
    .all();

// Results: 
// ========

// * The dynamic query results (before applying the filter) contain 17 results.
//   Applying the filter results in 4 matching documents. 
//
// * Since a query predicate was applied (using 'whereEquals') 
//   an auto-index that is indexing field 'Contact.Title' is created.
//
// * Field 'Address.Country' is Not included in the auto-index
//   since it is part of the filter operation. 
const filteredCompanies = await session
    .advanced
    .rawQuery(`from Companies 
               where Contact.Title == 'Sales Representative'
               filter Address.Country == 'Germany'`)
    .all();
from "Companies"
where Contact.Title == "Sales Representative"
filter Address.Country == "Germany"

With projections:

The filtered results can be projected using selectFields, like those of any other query.

const filteredCompanies = await session
     // Make a collection query
    .query({ collection: "companies" })
     // Apply a filter 
    .filter(x => x.equals("Address.Country", "Germany"))
     // Any fields can be projected in the results
    .selectFields([ "Name", "Address.City", "Address.Country"])
    .all();

 // Results: 
 // ========

 // * Results include all companies with country = 'Germany'
 //   Each resluting object contains only the selected fields.
 //
 // * No auto-index is created.
const filteredCompanies = await session
    .advanced
    .rawQuery(`from Companies 
               filter Address.Country == 'Germany'
               select Name, Address.City, Address.Count`)
    .all();
from "Companies"
filter Address.Country == "Germany"
select Name, Address.City, Address.Country

With user-defined JavaScript functions (declare):

When using RQL, you can define a JavaScript function using the declare keyword.
This function can then be used as part of your filter condition to further customize the results.
For example:

// Declare a Javascript function:
// ==============================

declare function filterByTitlePrefix(employee, prefix) 
{ 
    // Include any filtering logic that suits your needs
    return employee.Title.startsWith(prefix)
} 

// Use the function in your RQL:
// =============================

from Employees as employee
// Filter using the declared function
filter filterByTitlePrefix(employee, "Sales")
filter_limit 10

Syntax

filter(builder);
filter(builder, limit);
Parameter Type Description
builder (factory) => void The filtering method
limit number The number of records from the query results that filter should scan.
Default: all retrieved records.