We are now working on proper modeling scenarios for RavenDB’s time series as part of our release cycle. We are trying to consider as many scenarios as possible and make sure that we have good answers to them. As part of this, we looked at applying time series in RavenDB to problems that customers raised in the past.
The scenario in question is storing data from a traffic camera. The idea is that we have a traffic camera that will report [Time, Car License Number, Speed] for each car that it captures. The camera will report all cars, not just those that are speeding. Obviously, we don’t want to store a document for each and every car registered by the camera. At the same time, we are interested in knowing the speed on the roads over time.
Therefore, we are going to handle this in the following manner:
This allows us to handle both the ticket issuance and recording the traffic on the road over time. This works great, but it does leave one thing undone. How do I correlate the measurement to the ticket?
In this case, let’s assume that I have some additional information about the measurement that I record in the time series (for example, the confidence level of the camera in its speed report) and that I need to be able to go from the ticket to the actual measurement and vice versa.
The question is how to do this. The whole point of time series is that we are able to compress the data we record significantly. We use about 4 bits per entry, and that is before we apply actual compression here. That means that if we want to use the minimal amount of disk space, we need to think carefully about where the correlation data goes.
One way of handling this is to first create the ticket and attach the ticket’s Id to the measurement. That is where the tag on the entry comes into play. This works, but it isn’t ideal. The idea behind the tag on the entry is that we expect there to be a lot of common values. For example, if we have a camera that uses two separate sensors, we’ll use the tag to denote which sensor took the measurement. Or maybe it will hold the make & model of the sensor, etc. The set of values for the tag is expected to be small and to repeat often, while a ticket Id is unique to a single entry. If the number of tickets issued is very small, of course, we probably wouldn’t mind. But let’s assume that we can’t make that determination.
So we need to correlate the measurement to the ticket, and the simplest way to handle that is to record the time of the measurement in the ticket, as well as which camera generated the report. With this information, you can load the relevant measurement easily enough. But there is one thing to consider. RavenDB’s timestamps use millisecond precision, while .NET’s DateTime has 100-nanosecond precision. You’ll need to account for that when you store the value, for example by truncating the timestamp to whole milliseconds before writing it to the ticket, so that both sides hold exactly the same value.
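To make the precision point concrete, here is a minimal sketch in Python. It is not RavenDB client code: the dictionary stands in for a per-camera time series, and the names `truncate_to_milliseconds`, `series`, `ticket`, and the camera Id are all made up for illustration. Python’s datetime carries microseconds rather than 100-nanosecond ticks, but the problem is the same: the timestamp kept on the ticket has to be cut down to whole milliseconds, or it will not match the stored entry.

```python
from datetime import datetime, timezone


def truncate_to_milliseconds(ts: datetime) -> datetime:
    """Drop the sub-millisecond part of a timestamp."""
    return ts.replace(microsecond=(ts.microsecond // 1000) * 1000)


# What the camera reported, with more precision than the store keeps.
measured_at = datetime(2020, 1, 15, 8, 30, 12, 345678, tzinfo=timezone.utc)
stored_at = truncate_to_milliseconds(measured_at)

# Hypothetical stand-in for a time series, keyed by (camera id, timestamp).
series = {("cameras/17", stored_at): {"speed": 104.0, "confidence": 0.97}}

# The ticket keeps only the camera and the truncated time of the measurement.
ticket = {"camera": "cameras/17", "measured_at": stored_at, "license": "ABC-123"}

# Going from the ticket back to the measurement is an exact key lookup.
measurement = series[(ticket["camera"], ticket["measured_at"])]

# The raw, untruncated timestamp would not find anything.
missed = series.get((ticket["camera"], measured_at))
```

With the truncated value the lookup returns the measurement, and with the raw value it returns nothing, which is exactly the kind of mismatch you would otherwise be chasing in production.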
With that in place, you can do all sorts of interesting things. For example, consider the following query.
This will allow us to show the ticket as well as the road conditions around the time of the ticket. You can use it to say “but everyone does it”, which I am assured is a valid legal defense strategy.
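As a rough illustration of what “road conditions around the time of the ticket” means, here is a small in-memory sketch in Python. It is not the RavenDB query itself. The function `speeds_around`, the five-minute window, and the sample data are all assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean


def speeds_around(measurements, ticket_time, window=timedelta(minutes=5)):
    """Return the speeds measured within `window` of the ticket time."""
    return [speed for ts, speed in measurements if abs(ts - ticket_time) <= window]


ticket_time = datetime(2020, 1, 15, 8, 30, tzinfo=timezone.utc)

# Hypothetical (timestamp, speed) entries from a single camera.
measurements = [
    (ticket_time - timedelta(minutes=10), 62.0),
    (ticket_time - timedelta(minutes=3), 95.0),
    (ticket_time, 104.0),
    (ticket_time + timedelta(minutes=4), 98.0),
    (ticket_time + timedelta(minutes=9), 60.0),
]

nearby = speeds_around(measurements, ticket_time)
average_nearby = mean(nearby)
```

Here the ticketed car did 104 while the traffic around it averaged 99, which is about as good as the “everyone does it” defense is going to get.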