SkippedResults
When querying RavenDB, you sometimes get a response that reports skipped results. What are those, and why do we care? Let us assume that we have the following index:
from img in docs.Images
from tag in img.Tags
select new { tag }
And you issue the following query:
/indexes/ImagesByTag?query=tag:NoSQL
Each image may have multiple tags, so it may have multiple results in the index. Here is an example of the actual physical index structure:
{ "__document_id": "imgs/1", "tag": "RavenDB" }
{ "__document_id": "imgs/1", "tag": "NoSql" }
{ "__document_id": "imgs/2", "tag": "NoSQL" }
{ "__document_id": "imgs/2", "tag": "NoSql" }
{ "__document_id": "imgs/3", "tag": "Databases" }
As you can see, the index contains multiple entries for the same document.
Now, the query above is going to return the following results from the index:
{ "__document_id": "imgs/1", "tag": "NoSql" }
{ "__document_id": "imgs/2", "tag": "NoSQL" }
{ "__document_id": "imgs/2", "tag": "NoSql" }
Note that imgs/2 appears twice in the result set. However, when we are querying for documents, there isn't really a point in returning the same document twice (and it would drastically increase the response size), so we filter out the duplicates and return each document only once.
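The filtering step can be sketched as a small in-memory simulation. This is not RavenDB code: the function name query_by_tag and the list INDEX_ENTRIES are hypothetical, and the case-insensitive tag comparison is an assumption based on the example above, where both "NoSql" and "NoSQL" match the query.

```python
# Hypothetical in-memory copy of the index entries shown earlier.
INDEX_ENTRIES = [
    {"__document_id": "imgs/1", "tag": "RavenDB"},
    {"__document_id": "imgs/1", "tag": "NoSql"},
    {"__document_id": "imgs/2", "tag": "NoSQL"},
    {"__document_id": "imgs/2", "tag": "NoSql"},
    {"__document_id": "imgs/3", "tag": "Databases"},
]


def query_by_tag(entries, tag):
    """Return (document_ids, skipped_results) for the given tag.

    Assumption: tags are compared case-insensitively.
    """
    seen = set()
    document_ids = []
    skipped_results = 0
    for entry in entries:
        if entry["tag"].lower() != tag.lower():
            continue  # entry does not match the query at all
        doc_id = entry["__document_id"]
        if doc_id in seen:
            # Matching entry for a document we already returned: skip it.
            skipped_results += 1
            continue
        seen.add(doc_id)
        document_ids.append(doc_id)
    return document_ids, skipped_results


document_ids, skipped_results = query_by_tag(INDEX_ENTRIES, "NoSQL")
print(document_ids, skipped_results)
```

Three index entries match the query, but only two distinct documents are returned, so one entry is counted as skipped.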
When SkippedResults is greater than 0, it means that we skipped over some results in the index because they represent a document that we have already loaded. We have to report this information to the client, because it is an important factor when paging. Your starting point is (pageSize * currentPage + SkippedResults), not just (pageSize * currentPage).
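To see why the offset matters, here is a rough sketch of a paging loop against a simulated server. Everything in it is hypothetical: run_query stands in for the server, MATCHING_ENTRIES is made-up data listing the document id of each matching index entry, and the sketch assumes that the SkippedResults in the formula is the running total reported over the pages fetched so far.

```python
# Hypothetical data: document id of each index entry that matches the query.
MATCHING_ENTRIES = [
    "imgs/1", "imgs/2", "imgs/2", "imgs/3", "imgs/4", "imgs/4", "imgs/5",
]


def run_query(matching_entries, start, page_size):
    """Simulated server: scan index entries from `start` and return
    (page_of_unique_document_ids, skipped_results)."""
    seen, page, skipped = set(), [], 0
    for doc_id in matching_entries[start:]:
        if doc_id in seen:
            skipped += 1  # duplicate of a document already in this page
            continue
        if len(page) == page_size:
            break  # page is full and the next entry is a new document
        seen.add(doc_id)
        page.append(doc_id)
    return page, skipped


def fetch_all_pages(matching_entries, page_size):
    """Client loop: start = pageSize * currentPage + SkippedResults."""
    pages, current_page, total_skipped = [], 0, 0
    while True:
        start = page_size * current_page + total_skipped
        page, skipped = run_query(matching_entries, start, page_size)
        if not page:
            break
        pages.append(page)
        total_skipped += skipped  # carry the skipped count to later pages
        current_page += 1
    return pages


print(fetch_all_pages(MATCHING_ENTRIES, 2))
# Ignoring SkippedResults for the second page (start = 2 * 1):
print(run_query(MATCHING_ENTRIES, 2, 2)[0])
```

With the skipped count included, each document shows up on exactly one page. Without it, the second page starts at the duplicate entry for imgs/2 and returns that document again.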