Sharding
The file system client supports sharding. The basic principles are the same as for sharding documents. If you aren't familiar with the sharding concept, please read this article first.
Sharded client creation
Three steps are necessary to create the RavenFS client with sharding support:
- Specify the URLs of the servers and the names of the file systems that you want to shard on.
- Create a ShardStrategy (you can use its default behavior or override options such as ShardAccessStrategy, ShardResolutionStrategy, and ModifyFileName).
- Create an instance of AsyncShardedFilesServerClient and pass it the configured sharding strategy.
var shards = new Dictionary<string, IAsyncFilesCommands>
{
    {"europe", new AsyncFilesServerClient("http://localhost:8080", "NorthwindFS")},
    {"asia", new AsyncFilesServerClient("http://localhost:8081", "NorthwindFS")},
};

var shardStrategy = new ShardStrategy(shards)
{
    /*
    ShardAccessStrategy = ...
    ShardResolutionStrategy = ...
    ModifyFileName = ...
    */
};

var shardedCommands = new AsyncShardedFilesServerClient(shardStrategy);
Usage
AsyncShardedFilesServerClient is the sharded equivalent of IAsyncFilesCommands and provides the file management and search functionality.
File names in the sharded environment
The sharding strategy relies on the names of the shards specified during setup. To work with files properly, keep in mind that the upload operation returns the new name of the file, created by the ModifyFileName function. Its default formula is:
(convention, shardId, fileName) => convention.IdentityPartsSeparator + shardId + convention.IdentityPartsSeparator + fileName
This means that the file doc.txt stored on the shard named europe gets the following name: /europe/doc.txt.
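The naming rule can be tried out without a server. The following is a minimal standalone sketch, not the client's own code: the class ShardedFileNameSketch and its methods are hypothetical, and the separator is assumed to be "/" because that is what the /europe/doc.txt example implies.

```csharp
using System;

public static class ShardedFileNameSketch
{
    // Assumption: the identity parts separator is "/", as implied by /europe/doc.txt.
    private const string Separator = "/";

    // Mirrors the default formula: separator + shardId + separator + fileName.
    public static string Compose(string shardId, string fileName)
    {
        return Separator + shardId + Separator + fileName;
    }

    // Recovers the shard id from a name produced by Compose.
    public static string ExtractShardId(string shardedName)
    {
        var trimmed = shardedName.TrimStart('/');
        var end = trimmed.IndexOf(Separator, StringComparison.Ordinal);
        if (end == -1)
            throw new FormatException("No shard id prefix in: " + shardedName);
        return trimmed.Substring(0, end);
    }
}
```

Under these assumptions, Compose("europe", "doc.txt") produces /europe/doc.txt, and ExtractShardId applied to that name gives back europe. This is why later operations need the name returned by the upload rather than the original one.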
File operations
The examples below show the basic usage of the CRUD methods.
// returns either /europe/test.bin or /asia/test.bin
string fileName = await shardedCommands.UploadAsync("test.bin", new RavenJObject()
{
    { "Owner", "Admin" }
}, new MemoryStream());

// pass the returned file name here to let the client know on which shard the file exists
using (var content = await shardedCommands.DownloadAsync(fileName))
{
}

string renamed = await shardedCommands.RenameAsync(fileName, "new.bin");

await shardedCommands.DeleteAsync(renamed);
Browsing / searching files
Browsing and searching files works exactly as in a non-sharded environment. The results from the sharded file systems are merged by the ApplyAsync method of IShardAccessStrategy. The default implementation is SequentialShardAccessStrategy, which combines the results sequentially, in the order of the shards passed to the ShardStrategy object.
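To illustrate what sequential merging means, here is a simplified sketch. It is not the library's implementation: SequentialMergeSketch and its generic signature are hypothetical and only model the idea of querying one shard at a time and keeping the results in shard order.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SequentialMergeSketch
{
    // Runs the operation against one shard at a time, in list order,
    // and returns the per-shard results in that same order.
    public static async Task<T[]> ApplyAsync<TShard, T>(
        IList<TShard> shards, Func<TShard, Task<T>> operation)
    {
        var results = new List<T>(shards.Count);
        foreach (var shard in shards)
        {
            // Awaiting inside the loop keeps the access strictly sequential.
            results.Add(await operation(shard));
        }
        return results.ToArray();
    }
}
```

A custom access strategy could instead start all shard requests at once and await them together. That trades the predictable ordering of requests for lower overall latency.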
FileHeader[] fileHeaders = await shardedCommands.BrowseAsync();
SearchResults searchResults = await shardedCommands.SearchAsync("__fileName:test*");
Custom IShardResolutionStrategy
The default implementation of IShardResolutionStrategy alternates uploads between the shards. However, you can override that and, for instance, use metadata to select the appropriate shard server.
RegionMetadataBasedResolutionStrategy
The actual decision is made in the GenerateShardIdFor method.
public class RegionMetadataBasedResolutionStrategy : IShardResolutionStrategy
{
    private int counter;
    private readonly IList<string> shardIds;
    private readonly ShardStrategy.ModifyFileNameFunc modifyFileName;
    private readonly FilesConvention conventions;

    public RegionMetadataBasedResolutionStrategy(IList<string> shardIds, ShardStrategy.ModifyFileNameFunc modifyFileName, FilesConvention conventions)
    {
        this.shardIds = shardIds;
        this.modifyFileName = modifyFileName;
        this.conventions = conventions;
    }

    public ShardResolutionResult GetShardIdForUpload(string filename, RavenJObject metadata)
    {
        var shardId = GenerateShardIdFor(filename, metadata);

        return new ShardResolutionResult
        {
            ShardId = shardId,
            NewFileName = modifyFileName(conventions, shardId, filename)
        };
    }

    public string GetShardIdFromFileName(string filename)
    {
        if (filename.StartsWith("/"))
            filename = filename.TrimStart(new[] { '/' });

        var start = filename.IndexOf(conventions.IdentityPartsSeparator, StringComparison.OrdinalIgnoreCase);
        if (start == -1)
            throw new InvalidDataException("file name does not have the required file name");

        var maybeShardId = filename.Substring(0, start);
        if (shardIds.Any(x => string.Equals(maybeShardId, x, StringComparison.OrdinalIgnoreCase)))
            return maybeShardId;

        throw new InvalidDataException("could not find a shard with the id: " + maybeShardId);
    }

    public string GenerateShardIdFor(string filename, RavenJObject metadata)
    {
        // choose shard based on the region
        var region = metadata.Value<string>("Region");

        string shardId = null;
        if (string.IsNullOrEmpty(region) == false)
            shardId = shardIds.FirstOrDefault(x => x.Equals(region, StringComparison.OrdinalIgnoreCase));

        return shardId ?? shardIds[Interlocked.Increment(ref counter) % shardIds.Count];
    }

    public IList<string> PotentialShardsFor(ShardRequestData requestData)
    {
        // for future use
        throw new NotImplementedException();
    }
}
Use the following code for initialization with this strategy:
var strategy = new ShardStrategy(shards);
strategy.ShardResolutionStrategy = new RegionMetadataBasedResolutionStrategy(shards.Keys.ToList(), strategy.ModifyFileName, strategy.Conventions);
var client = new AsyncShardedFilesServerClient(strategy);
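To see how the region-based selection behaves without a running server, the sketch below re-creates the logic of GenerateShardIdFor in isolation. It is a simplified stand-in: RegionPickerSketch is a hypothetical class, and a plain dictionary takes the place of RavenJObject metadata.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public class RegionPickerSketch
{
    private int counter;
    private readonly IList<string> shardIds;

    public RegionPickerSketch(IList<string> shardIds)
    {
        this.shardIds = shardIds;
    }

    // Assumption: metadata is modeled as a string dictionary instead of RavenJObject.
    public string Pick(IDictionary<string, string> metadata)
    {
        string region;
        metadata.TryGetValue("Region", out region); // region stays null when the key is missing

        string shardId = null;
        if (string.IsNullOrEmpty(region) == false)
            shardId = shardIds.FirstOrDefault(x => x.Equals(region, StringComparison.OrdinalIgnoreCase));

        // No matching region: fall back to rotating through the shards.
        return shardId ?? shardIds[Interlocked.Increment(ref counter) % shardIds.Count];
    }
}
```

With the shard list europe, asia, metadata containing Region = "EUROPE" selects europe because the comparison ignores case. Files without a Region value (or with a region that matches no shard) are spread across the shards by the counter.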