Searching

Search is one of the most common features of real-world applications. In many cases a where clause with a simple condition is enough. For example, to get all users whose name equals John, use the following code:

QSearching_User u = QSearching_User.user;
List<User> users = session
  .query(User.class, Users_ByName.class)
  .where(u.name.eq("John"))
  .toList();

QSearching_User u = QSearching_User.user;
List<User> users = session
  .advanced()
  .documentQuery(User.class, Users_ByName.class)
  .whereEquals(u.name, "John")
  .toList();

QueryResult result = store
  .getDatabaseCommands()
  .query("Users/ByName",
    new IndexQuery("Name:John"));

public static class Users_ByName extends AbstractIndexCreationTask {
  public Users_ByName() {
    QSearching_User u = QSearching_User.user;
    map =
     " from user in docs.Users " +
     " select new              " +
     " {                       " +
     "     user.Name           " +
     " }; ";

    index(u.name, FieldIndexing.ANALYZED);
  }
}

where the User class is defined as follows:

@QueryEntity
public static class User {
  private String id;
  private String name;
  private byte age;
  private List<String> hobbies;

  public String getId() {
    return id;
  }
  public void setId(String id) {
    this.id = id;
  }
  public String getName() {
    return name;
  }
  public void setName(String name) {
    this.name = name;
  }
  public byte getAge() {
    return age;
  }
  public void setAge(byte age) {
    this.age = age;
  }
  public List<String> getHobbies() {
    return hobbies;
  }
  public void setHobbies(List<String> hobbies) {
    this.hobbies = hobbies;
  }
}

The where clause is also suitable for a very simple text field search. For example, the following query retrieves users whose name starts with Jo:

QSearching_User u = QSearching_User.user;
List<User> users = session
  .query(User.class, Users_ByName.class)
  .where(u.name.startsWith("Jo"))
  .toList();

QSearching_User u = QSearching_User.user;
List<User> users = session
  .advanced()
  .documentQuery(User.class, Users_ByName.class)
  .whereStartsWith(u.name, "Jo")
  .toList();

store
  .getDatabaseCommands()
  .query("Users/ByName",
    new IndexQuery("Name:Jo*"));

public static class Users_ByName extends AbstractIndexCreationTask {
  public Users_ByName() {
    QSearching_User u = QSearching_User.user;
    map =
     " from user in docs.Users " +
     " select new              " +
     " {                       " +
     "     user.Name           " +
     " }; ";

    index(u.name, FieldIndexing.ANALYZED);
  }
}

All queries are ultimately translated into Lucene queries. The query above is translated into Name:Jo*.

Safe By Default

An attempt to use the contains() method as a condition of the where clause throws an IllegalStateException. A search term such as *term* (note the wildcards at both the beginning and the end) can cause performance issues, so RavenDB's safe-by-default paradigm forbids this operation. If you really need this behavior, see the Query escaping section below.
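
The translation and the restriction described above can be pictured with a small, self-contained sketch. The class LuceneTextSketch and its methods are hypothetical names written for illustration; they are not part of the RavenDB client, which performs the translation internally and may do so differently.

```java
/** Hypothetical helper for illustration; not part of the RavenDB client API. */
final class LuceneTextSketch {

  /** Exact match on a field, e.g. Name:John. */
  static String equalTo(String field, String value) {
    return field + ":" + value;
  }

  /** Prefix match: only a trailing wildcard is added, e.g. Name:Jo*. */
  static String startsWith(String field, String prefix) {
    return field + ":" + prefix + "*";
  }

  /** A substring match would need *term*, so this sketch rejects it outright. */
  static String contains(String field, String value) {
    throw new IllegalStateException(
        "Leading wildcards are not allowed in this sketch: " + field + ":*" + value + "*");
  }

  public static void main(String[] args) {
    System.out.println(equalTo("Name", "John"));   // prints Name:John
    System.out.println(startsWith("Name", "Jo"));  // prints Name:Jo*
  }
}
```

A prefix query only has to scan the terms that share the given prefix, which is why it is allowed, while a leading wildcard forces a look at every term of the field.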

Information

Note that the results of a query may differ depending on the analyzer that was applied.
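
To see why the analyzer matters, consider the following simplified sketch. It does not use Lucene; the two methods are hypothetical stand-ins that only imitate the general idea of a keyword-style analyzer (the whole value is one token) and a standard-style analyzer (lower-cased words). Real analyzers apply further rules, such as stop-word removal.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Hypothetical, simplified analyzers; real Lucene analyzers do more work. */
final class AnalyzerSketch {

  /** Keeps the whole value as a single, unmodified token. */
  static List<String> keywordLike(String value) {
    return Collections.singletonList(value);
  }

  /** Lower-cases the value and splits it on anything that is not a letter or digit. */
  static List<String> standardLike(String value) {
    List<String> tokens = new ArrayList<>();
    for (String part : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
      if (!part.isEmpty()) {
        tokens.add(part);
      }
    }
    return tokens;
  }

  public static void main(String[] args) {
    System.out.println(keywordLike("John Doe"));   // prints [John Doe]
    System.out.println(standardLike("John Doe"));  // prints [john, doe]
  }
}
```

In this sketch a search for the term john finds the document only when the second tokenization was used, because the first one stores John Doe as one indivisible token.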


Multiple terms

When you need more complex text searching, use the search method. It lets you pass several search terms that are matched against a particular field. The following sample uses search to get users whose name is John or Adam:

QSearching_User u = QSearching_User.user;
List<User> users = session
  .query(User.class, Users_ByName.class)
  .search(u.name, "John Adam")
  .toList();

QSearching_User u = QSearching_User.user;
List<User> users = session
  .advanced()
  .documentQuery(User.class, Users_ByName.class)
  .search(u.name, "John Adam")
  .toList();

QueryResult result = store
  .getDatabaseCommands()
  .query("Users/ByName",
    new IndexQuery("Name:(John Adam)"));

public static class Users_ByName extends AbstractIndexCreationTask {
  public Users_ByName() {
    QSearching_User u = QSearching_User.user;
    map =
     " from user in docs.Users " +
     " select new              " +
     " {                       " +
     "     user.Name           " +
     " }; ";

    index(u.name, FieldIndexing.ANALYZED);
  }
}

Each search term (separated by a space character) is checked independently. A result document must match at least one of the passed terms.
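
The following sketch illustrates this "any term matches" behavior outside of RavenDB. SearchTermsSketch is a hypothetical class, and the field tokens are assumed to be already lower-cased by an analyzer; the actual matching is done by Lucene on the server.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical illustration of OR semantics between search terms. */
final class SearchTermsSketch {

  /** Renders the Lucene text for a search over one field, e.g. Name:(John Adam). */
  static String toLucene(String field, String searchTerms) {
    return field + ":(" + searchTerms.trim() + ")";
  }

  /** Returns true when at least one space-separated term equals one of the field tokens. */
  static boolean matchesAny(List<String> fieldTokens, String searchTerms) {
    for (String term : searchTerms.trim().split("\\s+")) {
      if (fieldTokens.contains(term.toLowerCase(Locale.ROOT))) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(toLucene("Name", "John Adam"));                         // prints Name:(John Adam)
    System.out.println(matchesAny(Arrays.asList("john", "doe"), "John Adam"));  // prints true
    System.out.println(matchesAny(Arrays.asList("jane", "roe"), "John Adam"));  // prints false
  }
}
```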

In the same way, you can look for users who have certain hobbies:

QSearching_User u = QSearching_User.user;
List<User> users = session
  .query(User.class, Users_ByHobbies.class)
  .search(u.hobbies, "looking for someone who likes sport books computers")
  .toList();

QSearching_User u = QSearching_User.user;
List<User> users = session
  .advanced()
  .documentQuery(User.class, Users_ByHobbies.class)
  .search(u.hobbies, "looking for someone who likes sport books computers")
  .toList();

QueryResult result = store
  .getDatabaseCommands()
  .query("Users/ByHobbies",
    new IndexQuery("Hobbies:(looking for someone who likes sport books computers)"));

public static class Users_ByHobbies extends AbstractIndexCreationTask {
  public Users_ByHobbies() {
    QSearching_User u = QSearching_User.user;
    map =
     " from user in docs.Users " +
     " select new              " +
     " {                       " +
     "     user.Hobbies        " +
     " }; ";

    index(u.hobbies, FieldIndexing.ANALYZED);
  }
}

As a result, you will get users who are interested in sport, books, or computers.


Multiple fields

With search you can also query multiple indexed fields. First, let's introduce the index:

public static class Users_ByNameAndHobbies extends AbstractIndexCreationTask {
  public Users_ByNameAndHobbies() {
    QSearching_User u = QSearching_User.user;
    map =
     " from user in docs.Users " +
     " select new              " +
     " {                       " +
     "     user.Name,          " +
     "     user.Hobbies        " +
     " }; ";

    index(u.name, FieldIndexing.ANALYZED);
    index(u.hobbies, FieldIndexing.ANALYZED);
  }
}

Now we can search using the Name and Hobbies fields:

QSearching_User u = QSearching_User.user;
List<User> users = session
  .query(User.class, Users_ByNameAndHobbies.class)
  .search(u.name, "Adam")
  .search(u.hobbies, "sport")
  .toList();

QSearching_User u = QSearching_User.user;
List<User> users = session
  .advanced()
  .documentQuery(User.class, Users_ByNameAndHobbies.class)
  .search(u.name, "Adam")
  .search(u.hobbies, "sport")
  .toList();

QueryResult result = store
  .getDatabaseCommands()
  .query("Users/ByNameAndHobbies",
    new IndexQuery("Name:(Adam) OR Hobbies:(sport)"));

Boosting

Indexing in RavenDB is built on the Lucene engine, which provides a term boosting mechanism. Boosting influences the relevance of matching documents based on the terms found: each search term can be associated with a boost factor, and the higher the boost factor, the more relevant the term is. RavenDB supports this feature, so to give users more accurate results you can specify the boost argument. Let's look at an example:

QSearching_User u = QSearching_User.user;
List<User> users = session
  .query(User.class, Users_ByHobbies.class)
  .search(u.hobbies, "I love sport", 10)
  .search(u.hobbies, "but also like reading books", 5)
  .toList();

QSearching_User u = QSearching_User.user;
List<User> users = session
  .advanced()
  .documentQuery(User.class, Users_ByHobbies.class)
  .search(u.hobbies, "I love sport")
  .boost(10.0)
  .search(u.hobbies, "but also like reading books")
  .boost(5.0)
  .toList();

QueryResult result = store
  .getDatabaseCommands()
  .query("Users/ByHobbies",
    new IndexQuery("Hobbies:(I love sport)^10 OR Hobbies:(but also like reading books)^5"));

public static class Users_ByHobbies extends AbstractIndexCreationTask {
  public Users_ByHobbies() {
    QSearching_User u = QSearching_User.user;
    map =
     " from user in docs.Users " +
     " select new              " +
     " {                       " +
     "     user.Hobbies        " +
     " }; ";

    index(u.hobbies, FieldIndexing.ANALYZED);
  }
}

The search above ranks users who do sports above book readers, placing them at the top of the result list.
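
The effect of the boost factors on ordering can be illustrated with a deliberately simplified scoring sketch. BoostSketch is hypothetical and assumes that a matching clause simply adds its boost to the score; Lucene's real scoring takes additional factors into account, so only the relative ordering is meant to carry over.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical, simplified scoring: each matching clause adds its boost. */
final class BoostSketch {

  /** Adds a clause's boost to the score when any of its terms appears in the tokens. */
  static double score(List<String> tokens, String[] clauses, double[] boosts) {
    double total = 0;
    for (int i = 0; i < clauses.length; i++) {
      for (String term : clauses[i].toLowerCase(Locale.ROOT).split("\\s+")) {
        if (tokens.contains(term)) {
          total += boosts[i];
          break; // count each clause at most once
        }
      }
    }
    return total;
  }

  public static void main(String[] args) {
    String[] clauses = {"I love sport", "but also like reading books"};
    double[] boosts = {10, 5};
    System.out.println(score(Arrays.asList("sport"), clauses, boosts));  // prints 10.0
    System.out.println(score(Arrays.asList("books"), clauses, boosts));  // prints 5.0
  }
}
```

Sorting users by this score in descending order puts the sport enthusiasts first, which mirrors the ordering described above.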


Search options

To specify the logic of a search expression, pass the options argument to the search method. It is a SearchOptions enum with the following values:

  • OR,
  • AND,
  • NOT,
  • GUESS (default).

By default, RavenDB attempts to guess the semantics between terms. Consecutive searches are combined with OR; otherwise AND semantics are used.

The following query:

QSearching_User u = QSearching_User.user;
List<User> users = session
  .query(User.class, Users_ByNameAgeAndHobbies.class)
  .search(u.hobbies, "computer")
  .search(u.name, "James")
  .where(u.age.eq((byte)20))
  .toList();

QSearching_User u = QSearching_User.user;
List<User> users = session
  .advanced()
  .documentQuery(User.class, Users_ByNameAgeAndHobbies.class)
  .search(u.hobbies, "computer")
  .search(u.name, "James")
  .whereEquals(u.age, (byte)20)
  .toList();

QueryResult result = store
  .getDatabaseCommands()
  .query("Users/ByNameAgeAndHobbies",
    new IndexQuery("(Hobbies:(computer) OR Name:(James)) AND Age:20"));

public static class Users_ByNameAgeAndHobbies extends AbstractIndexCreationTask {
  public Users_ByNameAgeAndHobbies() {
    QSearching_User u = QSearching_User.user;
    map =
      " from user in docs.Users " +
      " select new              " +
      " {                       " +
      "     user.Name,          " +
      "     user.Age,           " +
      "     user.Hobbies        " +
      " }; ";

    index(u.name, FieldIndexing.ANALYZED);
    index(u.hobbies, FieldIndexing.ANALYZED);
  }
}

will be translated into (Hobbies:(computer) Name:(James)) AND (Age:20) (if there is no boolean operator, OR is used).
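
The guessing rule can be sketched as follows. GuessSketch and its Clause type are hypothetical and only model the rule as it is stated here (runs of consecutive search clauses are joined with OR, everything else with AND); the client's actual implementation may differ in details such as parenthesization.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of the default (GUESS) combination rule. */
final class GuessSketch {

  /** One query clause: either a search(...) call or a where(...) condition. */
  static final class Clause {
    final boolean isSearch;
    final String lucene;

    Clause(boolean isSearch, String lucene) {
      this.isSearch = isSearch;
      this.lucene = lucene;
    }
  }

  /** Joins consecutive search clauses with OR and all resulting groups with AND. */
  static String combine(List<Clause> clauses) {
    List<String> groups = new ArrayList<>();
    List<String> run = new ArrayList<>(); // current run of consecutive searches
    for (Clause clause : clauses) {
      if (clause.isSearch) {
        run.add(clause.lucene);
      } else {
        flush(run, groups);
        groups.add(clause.lucene);
      }
    }
    flush(run, groups);
    return String.join(" AND ", groups);
  }

  /** Moves the pending run of searches into the group list as one OR group. */
  private static void flush(List<String> run, List<String> groups) {
    if (run.isEmpty()) {
      return;
    }
    groups.add(run.size() == 1 ? run.get(0) : "(" + String.join(" OR ", run) + ")");
    run.clear();
  }

  public static void main(String[] args) {
    System.out.println(combine(Arrays.asList(
        new Clause(true, "Hobbies:(computer)"),
        new Clause(true, "Name:(James)"),
        new Clause(false, "Age:20"))));
    // prints (Hobbies:(computer) OR Name:(James)) AND Age:20
  }
}
```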

You can also specify exactly what the query logic should be. The applied option affects the query term on which it is used. The following query:

QSearching_User u = QSearching_User.user;
List<User> users = session
  .query(User.class, Users_ByNameAndHobbies.class)
  .search(u.name, "Adam")
  .search(u.hobbies, "sport", SearchOptionsSet.of(SearchOptions.AND))
  .toList();

QSearching_User u = QSearching_User.user;
List<User> users = session
  .advanced()
  .documentQuery(User.class, Users_ByNameAndHobbies.class)
  .search(u.name, "Adam")
  .andAlso()
  .search(u.hobbies, "sport")
  .toList();

QueryResult result = store
  .getDatabaseCommands()
  .query("Users/ByNameAndHobbies",
    new IndexQuery("Name:(Adam) AND Hobbies:(sport)"));

public static class Users_ByNameAndHobbies extends AbstractIndexCreationTask {
  public Users_ByNameAndHobbies() {
    QSearching_User u = QSearching_User.user;
    map =
     " from user in docs.Users " +
     " select new              " +
     " {                       " +
     "     user.Name,          " +
     "     user.Hobbies        " +
     " }; ";

    index(u.name, FieldIndexing.ANALYZED);
    index(u.hobbies, FieldIndexing.ANALYZED);
  }
}

will result in the following Lucene query: Name:(Adam) AND Hobbies:(sport)

If you want to negate a term, use SearchOptions.NOT:

QSearching_User u = QSearching_User.user;
List<User> users = session
  .query(User.class, Users_ByName.class)
  .search(u.name, "James", SearchOptionsSet.of(SearchOptions.NOT))
  .toList();

QSearching_User u = QSearching_User.user;
List<User> users = session
  .advanced()
  .documentQuery(User.class, Users_ByName.class)
  .not()
  .search(u.name, "James")
  .toList();

QueryResult result = store
  .getDatabaseCommands()
  .query("Users/ByName",
    new IndexQuery("-Name:(James)"));

public static class Users_ByName extends AbstractIndexCreationTask {
  public Users_ByName() {
    QSearching_User u = QSearching_User.user;
    map =
     " from user in docs.Users " +
     " select new              " +
     " {                       " +
     "     user.Name           " +
     " }; ";

    index(u.name, FieldIndexing.ANALYZED);
  }
}

According to Lucene syntax it will be transformed to the query: -Name:(James).

You can also combine SearchOptions values, e.g.:

QSearching_User u = QSearching_User.user;
List<User> users = session
  .query(User.class, Users_ByNameAndHobbies.class)
  .search(u.name, "Adam")
  .search(u.hobbies, "sport", SearchOptionsSet.of(SearchOptions.NOT, SearchOptions.AND))
  .toList();

QSearching_User u = QSearching_User.user;
List<User> users = session
  .advanced()
  .documentQuery(User.class, Users_ByNameAndHobbies.class)
  .search(u.name, "Adam")
  .andAlso()
  .not()
  .search(u.hobbies, "sport")
  .toList();

QueryResult result = store
  .getDatabaseCommands()
  .query("Users/ByNameAndHobbies",
    new IndexQuery("Name:(Adam) AND -Hobbies:(sport)"));

public static class Users_ByNameAndHobbies extends AbstractIndexCreationTask {
  public Users_ByNameAndHobbies() {
    QSearching_User u = QSearching_User.user;
    map =
     " from user in docs.Users " +
     " select new              " +
     " {                       " +
     "     user.Name,          " +
     "     user.Hobbies        " +
     " }; ";

    index(u.name, FieldIndexing.ANALYZED);
    index(u.hobbies, FieldIndexing.ANALYZED);
  }
}

It will produce the following Lucene query: Name:(Adam) AND -Hobbies:(sport).
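
The way the options shape the resulting Lucene text can be sketched as below. SearchOptionsSketch and its Option enum are hypothetical and merely mirror the names of the AND, OR and NOT values; the sketch assumes OR when AND is not requested and is not the client's implementation.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical rendering of search options into Lucene text. */
final class SearchOptionsSketch {

  /** Mirrors the option names used in this section; not the client's enum. */
  enum Option { OR, AND, NOT }

  /** Appends a clause to the query text built so far, honoring the given options. */
  static String append(String query, String clause, Set<Option> options) {
    String negated = options.contains(Option.NOT) ? "-" + clause : clause;
    if (query.isEmpty()) {
      return negated; // first clause needs no operator
    }
    String operator = options.contains(Option.AND) ? " AND " : " OR ";
    return query + operator + negated;
  }

  public static void main(String[] args) {
    System.out.println(append("", "Name:(James)", EnumSet.of(Option.NOT)));
    // prints -Name:(James)
    System.out.println(append("Name:(Adam)", "Hobbies:(sport)", EnumSet.of(Option.NOT, Option.AND)));
    // prints Name:(Adam) AND -Hobbies:(sport)
  }
}
```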


Query escaping

The code examples presented in this section use hard-coded search terms. In a real use case, however, the user will supply the terms. You can control the escaping strategy of the provided query by specifying the EscapeQueryOptions parameter. It is an enum with the following values:

  • ESCAPE_ALL (default),
  • ALLOW_POSTFIX_WILDCARD,
  • ALLOW_ALL_WILDCARDS,
  • RAW_QUERY.

By default, all special characters contained in the query are escaped (ESCAPE_ALL) when a session query is used. However, you can make your search mechanism more flexible. EscapeQueryOptions.ALLOW_POSTFIX_WILDCARD enables searching a field with search terms that end with a wildcard character:

QSearching_User u = QSearching_User.user;
List<User> users = session
  .query(User.class, Users_ByName.class)
  .search(u.name, "Jo* Ad*", EscapeQueryOptions.ALLOW_POSTFIX_WILDCARD)
  .toList();

QSearching_User u = QSearching_User.user;
List<User> users = session
  .advanced()
  .documentQuery(User.class, Users_ByName.class)
  .search(u.name, "Jo* Ad*", EscapeQueryOptions.ALLOW_POSTFIX_WILDCARD)
  .toList();

QueryResult result = store
  .getDatabaseCommands()
  .query("Users/ByName",
    new IndexQuery("Name:(Jo* Ad*)"));

public static class Users_ByName extends AbstractIndexCreationTask {
  public Users_ByName() {
    QSearching_User u = QSearching_User.user;
    map =
     " from user in docs.Users " +
     " select new              " +
     " {                       " +
     "     user.Name           " +
     " }; ";

    index(u.name, FieldIndexing.ANALYZED);
  }
}

The next option, EscapeQueryOptions.ALLOW_ALL_WILDCARDS, extends the previous one by allowing the wildcard character at the beginning as well as at the end of the search term:

QSearching_User u = QSearching_User.user;
List<User> users = session
  .query(User.class, Users_ByName.class)
  .search(u.name, "*oh* *da*", EscapeQueryOptions.ALLOW_ALL_WILDCARDS)
  .toList();

QSearching_User u = QSearching_User.user;
List<User> users = session
  .advanced()
  .documentQuery(User.class, Users_ByName.class)
  .search(u.name, "*oh* *da*", EscapeQueryOptions.ALLOW_ALL_WILDCARDS)
  .toList();

QueryResult result = store
  .getDatabaseCommands()
  .query("Users/ByName",
    new IndexQuery("Name:(*oh* *da*)"));

public static class Users_ByName extends AbstractIndexCreationTask {
  public Users_ByName() {
    QSearching_User u = QSearching_User.user;
    map =
     " from user in docs.Users " +
     " select new              " +
     " {                       " +
     "     user.Name           " +
     " }; ";

    index(u.name, FieldIndexing.ANALYZED);
  }
}

Warning

RavenDB allows such queries, but be aware that leading wildcards drastically slow down searches.

Consider whether you really need to find substrings; in most cases, searching for whole words is enough. There are also alternatives that avoid expensive wildcard matches, e.g. indexing a reversed version of the text field or creating a custom analyzer.
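
The reversed-field idea can be illustrated with plain strings. In the sketch below, ReversedFieldSketch and the field name NameReversed are hypothetical; the point is only that an "ends with" search on the original text becomes an inexpensive prefix search on the reversed text. It assumes that the indexed value and the search term are normalized in the same way.

```java
/** Hypothetical illustration of searching suffixes through a reversed field. */
final class ReversedFieldSketch {

  /** Reverses the text, as an index would do when storing the extra field. */
  static String reverse(String text) {
    return new StringBuilder(text).reverse().toString();
  }

  /** Builds a prefix query on the reversed field instead of a leading-wildcard query. */
  static String endsWithQuery(String reversedField, String suffix) {
    return reversedField + ":" + reverse(suffix) + "*";
  }

  /** Checks a stored reversed value against a suffix using only a prefix comparison. */
  static boolean endsWith(String storedReversedValue, String suffix) {
    return storedReversedValue.startsWith(reverse(suffix));
  }

  public static void main(String[] args) {
    String stored = reverse("Johnson");                        // "nosnhoJ"
    System.out.println(endsWithQuery("NameReversed", "son"));  // prints NameReversed:nos*
    System.out.println(endsWith(stored, "son"));               // prints true
  }
}
```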

The last option, EscapeQueryOptions.RAW_QUERY, disables escaping, so the raw term is passed to Lucene:

QSearching_User u = QSearching_User.user;
List<User> users = session
  .query(User.class, Users_ByName.class)
  .search(u.name, "*J?n*", EscapeQueryOptions.RAW_QUERY)
  .toList();

QSearching_User u = QSearching_User.user;
List<User> users = session
  .advanced()
  .documentQuery(User.class, Users_ByName.class)
  .search(u.name, "*J?n*", EscapeQueryOptions.RAW_QUERY)
  .toList();

QueryResult result = store
  .getDatabaseCommands()
  .query("Users/ByName",
    new IndexQuery("Name:(*J?n*)"));

public static class Users_ByName extends AbstractIndexCreationTask {
  public Users_ByName() {
    QSearching_User u = QSearching_User.user;
    map =
     " from user in docs.Users " +
     " select new              " +
     " {                       " +
     "     user.Name           " +
     " }; ";

    index(u.name, FieldIndexing.ANALYZED);
  }
}
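
The four escaping strategies can be compared side by side with a small sketch. EscapeSketch and its Mode enum are hypothetical and only approximate the behavior described in this section by backslash-escaping Lucene's special characters term by term; the client's own escaping routine may treat edge cases differently.

```java
/** Hypothetical approximation of the four escaping strategies. */
final class EscapeSketch {

  /** Mirrors the option names used in this section; not the client's enum. */
  enum Mode { ESCAPE_ALL, ALLOW_POSTFIX_WILDCARD, ALLOW_ALL_WILDCARDS, RAW_QUERY }

  /** Characters with a special meaning in the Lucene query syntax. */
  private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

  /** Prefixes every special character with a backslash. */
  static String escapeChars(String text) {
    StringBuilder escaped = new StringBuilder();
    for (char c : text.toCharArray()) {
      if (SPECIAL.indexOf(c) >= 0) {
        escaped.append('\\');
      }
      escaped.append(c);
    }
    return escaped.toString();
  }

  /** Escapes one term, keeping a leading or trailing '*' when the mode allows it. */
  static String escapeTerm(String term, Mode mode) {
    if (mode == Mode.RAW_QUERY) {
      return term;
    }
    boolean keepLeading = mode == Mode.ALLOW_ALL_WILDCARDS
        && term.length() > 1 && term.startsWith("*");
    boolean keepTrailing = mode != Mode.ESCAPE_ALL
        && term.length() > 1 && term.endsWith("*");
    int start = keepLeading ? 1 : 0;
    int end = keepTrailing ? term.length() - 1 : term.length();
    return (keepLeading ? "*" : "")
        + escapeChars(term.substring(start, end))
        + (keepTrailing ? "*" : "");
  }

  /** Escapes every space-separated term of the search text. */
  static String escape(String terms, Mode mode) {
    if (mode == Mode.RAW_QUERY) {
      return terms;
    }
    String[] parts = terms.trim().split("\\s+");
    for (int i = 0; i < parts.length; i++) {
      parts[i] = escapeTerm(parts[i], mode);
    }
    return String.join(" ", parts);
  }

  public static void main(String[] args) {
    System.out.println(escape("Jo* Ad*", Mode.ESCAPE_ALL));              // prints Jo\* Ad\*
    System.out.println(escape("Jo* Ad*", Mode.ALLOW_POSTFIX_WILDCARD));  // prints Jo* Ad*
    System.out.println(escape("*oh* *da*", Mode.ALLOW_ALL_WILDCARDS));   // prints *oh* *da*
    System.out.println(escape("*J?n*", Mode.RAW_QUERY));                 // prints *J?n*
  }
}
```

In this sketch only RAW_QUERY lets the single-character wildcard ? through unescaped, which is why the *J?n* example needs it.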

EscapeQueryOptions

Default EscapeQueryOptions value for Query is EscapeQueryOptions.ESCAPE_ALL.

Default EscapeQueryOptions value for DocumentQuery is EscapeQueryOptions.RAW_QUERY.