RavenDB is an amazing document database, born out of the NoSQL approach and designed specifically to play nice with .NET. I have been using RavenDB for the last year, and it has certainly made development a breeze. I highly recommend it. That said, there is still a learning curve, and if you approach a document database like RavenDB with a SQL mindset you will most likely suffer for it. In this post, I will show you two ways to page data in RavenDB and explain the pros and cons of each. The first approach uses an awesome library called PagedList, and the second uses the built-in facilities of RavenDB.
This post assumes you have some prior knowledge of RavenDB, but don't worry if you don't, because the same ideas apply to most LINQ-based providers (Entity Framework, LINQ to SQL, etc.).
To start, you will need RavenDB and PagedList from NuGet. For this post we will use the embedded version of RavenDB. You can install both using the following commands.
PM> Install-Package RavenDB.Embedded
PM> Install-Package PagedList
So let us assume you have a collection of students and you would like to page them based on their email addresses. What would that look like?
    var results = session.Query<Students_ByEmail.Result, Students_ByEmail>()
        // always have an order
        .OrderBy(x => x.Email)
        .OfType<Student>()
        // call ToPagedList with page # and size
        .ToPagedList(1, 10);
There you have it, you just got the first page with 10 results of students ordered by email address. It couldn't get easier, but keep these things in mind when paging this way:
- PagedList calls Count() to get the total number of results from the server, which costs an extra web request.
- An OrderBy is highly recommended. Without it, results are ordered by the Lucene score, which seems to reflect the last time a document was updated.
- PagedList will give you some very helpful stats, like TotalItemCount, PageCount, PageNumber, PageSize, and more.
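To make those stats a bit more concrete, here is a rough sketch of how values like the page count and the next/previous flags follow from the total item count and the page size. This only illustrates the arithmetic; it is not PagedList's actual source, and `PageInfo` is a hypothetical type invented for this example.

```csharp
using System;

// Hypothetical type for illustration only; not part of PagedList or RavenDB.
public sealed class PageInfo
{
    public int PageNumber { get; }
    public int PageSize { get; }
    public int TotalItemCount { get; }

    public PageInfo(int pageNumber, int pageSize, int totalItemCount)
    {
        if (pageNumber < 1) throw new ArgumentOutOfRangeException(nameof(pageNumber));
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));
        if (totalItemCount < 0) throw new ArgumentOutOfRangeException(nameof(totalItemCount));

        PageNumber = pageNumber;
        PageSize = pageSize;
        TotalItemCount = totalItemCount;
    }

    // Round up so a partially filled last page still counts as a page.
    public int PageCount => (TotalItemCount + PageSize - 1) / PageSize;

    public bool HasPreviousPage => PageNumber > 1;

    public bool HasNextPage => PageNumber < PageCount;
}
```

With 25 students and a page size of 10, `new PageInfo(1, 10, 25)` reports three pages, a next page, and no previous page.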
So that is the first approach. Let's take a look at the native and more optimized approach.
Let's take the same scenario of students ordered by their emails and paged properly. This time we will only make one request and get all the data we need.
    var page = 1;
    var size = 10;
    RavenQueryStatistics stats;

    var results = session.Query<Students_ByEmail.Result, Students_ByEmail>()
        // get statistics from index
        .Statistics(out stats)
        // order by email
        .OrderBy(x => x.Email)
        .OfType<Student>()
        // skip the results from earlier pages;
        // Math.Max keeps the offset from going negative
        .Skip(Math.Max(0, page - 1) * size)
        // we take based on the page size
        .Take(size)
        .ToList();
You'll notice that there is a lot more happening here than there was in the example using PagedList. We have to apply the Skip and Take ourselves, which adds a few extra lines of code. We could move the paging into an extension method and mimic what PagedList does, but I wanted to show you the details.
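If you do want to tuck the paging away, an extension method along these lines is one way to do it. Treat it as a minimal sketch: `PagingExtensions` and `Page` are hypothetical names, not part of RavenDB or PagedList, and it is written against plain `IQueryable<T>` so it only relies on standard LINQ.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of RavenDB or PagedList.
public static class PagingExtensions
{
    // Applies Skip/Take for a 1-based page number.
    public static IQueryable<T> Page<T>(this IQueryable<T> source, int page, int size)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (size < 1) throw new ArgumentOutOfRangeException(nameof(size));

        // Clamp so that page 0 or a negative page behaves like page 1.
        var skip = Math.Max(0, page - 1) * size;

        return source.Skip(skip).Take(size);
    }
}
```

In the query above, the Skip and Take calls would then collapse into a single `.Page(page, size)` before `.ToList()`. The `.Statistics(out stats)` call stays at the call site, since it belongs to RavenDB's query API rather than to standard LINQ.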
Things to keep in mind with this second approach:
- It does add some code bloat, which could be alleviated with an extension method.
- It does require the use of an out parameter, which might make you hate yourself a little.
- You don't get the nice properties and values that PagedList offers.
- It is more efficient since you are making one request rather than two.
I usually start with PagedList because the extra internal call to Count() rarely hurts the final response time much, but when I start optimizing I switch over to the LINQ-based approach to reduce the number of requests hitting my RavenDB instance. Feel free to try both and see which one feels better for you, but understand the advantages and disadvantages of each.
As a bonus you can use both PagedList and the LINQ approach together. Let's have a look.
    var page = 1;
    var size = 10;
    RavenQueryStatistics stats;

    var results = session.Query<Students_ByEmail.Result, Students_ByEmail>()
        // get statistics from index
        .Statistics(out stats)
        // order by email
        .OrderBy(x => x.Email)
        .OfType<Student>()
        // skip the results from earlier pages;
        // Math.Max keeps the offset from going negative
        .Skip(Math.Max(0, page - 1) * size)
        // we take based on the page size
        .Take(size)
        .ToList();

    // wrap the already-paged results, passing the total from the index statistics
    var pagedList = new StaticPagedList<Student>(results, page, size, stats.TotalResults);
This approach lets you use all the helpful properties on a PagedList while still only making one request. How awesome is that?