Producing multiple images
Sometimes you may have a very large data set, but not every application needs all of it. Zeno provides a facility to efficiently produce multiple images from the objects added to one state engine.
One way you might want to split your data, for example, is by geographical region. Some applications might require data from across the globe, and you can create an image for that. Some application deployments may require only data from North America, and others only data from South America, and you can create two more images for these.
Separating objects into images in this way is valuable because it helps minimize the resources your client machines need to hold the data set in memory and keep it up to date.
Let’s imagine we want to produce two images, each containing different data. Let’s further assume we have only three objects to distribute across these images, and we want this data distributed in the following way:
We need to assign each of these images a zero-based index, then use that index to keep the images separate. We do this in the following ways:
- When we create a state engine, we tell it how many images we want to create:
- When we add object instances to the state engine, we tell it which images we want the object instances to be assigned to:
- When we create a FastBlobWriter, we tell it the index of the image for which we want to produce blobs:
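The three steps above can be illustrated with a small, self-contained model. This is a sketch only and does not use Zeno's API: the class `MultiImageStore`, its `add` and `write` methods, the object names, and the particular assignment of three objects to two images are all hypothetical, chosen just to show how an image index keeps the images separate.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical model of multi-image membership; not part of Zeno.
class MultiImageStore {
    private final int numberOfImages;
    private final List<String> objects = new ArrayList<>();
    // One BitSet per image; bit i is set when objects.get(i) belongs to that image.
    private final List<BitSet> membership = new ArrayList<>();

    // Step 1: fix the number of images when the store is created.
    MultiImageStore(int numberOfImages) {
        this.numberOfImages = numberOfImages;
        for (int i = 0; i < numberOfImages; i++) {
            membership.add(new BitSet());
        }
    }

    // Step 2: state, per image index, whether the object belongs to that image.
    void add(String object, boolean... images) {
        if (images.length != numberOfImages) {
            throw new IllegalArgumentException("expected one flag per image");
        }
        int ordinal = objects.size();
        objects.add(object);
        for (int i = 0; i < numberOfImages; i++) {
            if (images[i]) {
                membership.get(i).set(ordinal);
            }
        }
    }

    // Step 3: produce the contents of a single image, selected by its index.
    List<String> write(int imageIndex) {
        List<String> out = new ArrayList<>();
        BitSet bits = membership.get(imageIndex);
        for (int i = bits.nextSetBit(0); i >= 0; i = bits.nextSetBit(i + 1)) {
            out.add(objects.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        MultiImageStore store = new MultiImageStore(2);
        // Assumed distribution, for illustration only.
        store.add("object-1", true, false);  // image 0 only
        store.add("object-2", true, true);   // both images
        store.add("object-3", false, true);  // image 1 only
        System.out.println(store.write(0));  // [object-1, object-2]
        System.out.println(store.write(1));  // [object-2, object-3]
    }
}
```

In this model, an object shared by both images is stored once and only its membership flags differ, which is the property that keeps the cost of extra images low.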
The FastBlobStateEngine lifecycle remains the same. When the data origination process writes data to disk, it now creates one FastBlobWriter per image and writes one stream for each. Each image produces an independent delta chain for consumption by blob consumers. The only additional requirement is to namespace your images in blob persistence storage, so that you can tell each blob consumer which image to load.
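One possible way to namespace images in storage is to prefix every blob key with an image name. The sketch below assumes a simple key-based store; the class `BlobKeys`, the key layout, and the image and version names are hypothetical and not prescribed by Zeno.

```java
// Hypothetical key-naming helper; the layout is an assumption, not a Zeno convention.
final class BlobKeys {
    private BlobKeys() {}

    // Key for a full snapshot of one image at a given version.
    static String snapshotKey(String imageName, String version) {
        return imageName + "/snapshot/" + version;
    }

    // Key for a delta that moves one image from one version to the next.
    static String deltaKey(String imageName, String fromVersion, String toVersion) {
        return imageName + "/delta/" + fromVersion + "-" + toVersion;
    }

    public static void main(String[] args) {
        // Each image gets its own key prefix, so its delta chain stays independent.
        System.out.println(snapshotKey("global", "100"));
        System.out.println(deltaKey("north-america", "100", "101"));
    }
}
```

A consumer configured with the image name `north-america` would then read only keys under that prefix and never see blobs from the other images.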
The resource overhead of producing these additional images is negligible. In memory, it requires a single additional bit per image for each object instance. The CPU overhead of segregating instances into multiple images is also negligible. The only significant difference when producing multiple images is the I/O required to stream the additional data from the state engine.
