Currently, for remote shards (uploaded to S3), only an empty file <shard-name>.remote exists locally. All .info data for such shards is retrieved either from .frac_cache (if cached) or directly from S3.
Problems
- Cannot determine shard format without an S3 request
We cannot tell whether a shard is a single index or split into multiple files without making a request to S3.
- Losing the only local source of .info when removing .frac_cache
If we remove .frac_cache, every store initialization will require fetching .info from S3 for each remote shard. This is unacceptable in terms of speed and cost.
Solution
For each remote shard, store a local file <shard-name>.remote-info – in exactly the same format as .info for regular shards.
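A minimal sketch of how a store might look up the local copy at initialization, assuming the .remote-info file sits next to the .remote marker. The helper names `remoteInfoPath` and `readRemoteInfo` are hypothetical, and the .info payload is treated as opaque bytes because its encoding is not described here.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// remoteInfoPath builds the path of the local info copy for a remote shard.
// Assumption: the file lives in the same directory as <shard-name>.remote.
func remoteInfoPath(dir, shard string) string {
	return filepath.Join(dir, shard+".remote-info")
}

// readRemoteInfo returns the raw .remote-info bytes.
// found is false (with a nil error) when the file does not exist,
// in which case the caller would have to fall back to S3.
func readRemoteInfo(dir, shard string) (data []byte, found bool, err error) {
	data, err = os.ReadFile(remoteInfoPath(dir, shard))
	if errors.Is(err, fs.ErrNotExist) {
		return nil, false, nil
	}
	if err != nil {
		return nil, false, err
	}
	return data, true, nil
}

func main() {
	dir, err := os.MkdirTemp("", "shards")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Placeholder payload; a real file would hold the same bytes as .info.
	if err := os.WriteFile(remoteInfoPath(dir, "shard-a"), []byte("placeholder"), 0o644); err != nil {
		panic(err)
	}

	for _, shard := range []string{"shard-a", "shard-b"} {
		_, found, err := readRemoteInfo(dir, shard)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: local info found = %v\n", shard, found)
	}
}
```

Reporting a missing file as `found == false` rather than as an error keeps the S3 fallback an explicit decision of the caller.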
Legacy shard format indicator:
info.BinaryDataVer < config.BinaryDataV3
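The check above could be wrapped in a small helper, sketched below. The `Info` struct, the `BinaryDataVersion` type and the numeric values of the constants are assumptions made so the example compiles on its own; in the real code the constant comes from the `config` package.

```go
package main

import "fmt"

// BinaryDataVersion is a stand-in for the real version type (assumption).
type BinaryDataVersion uint16

// Hypothetical values; only their ordering matters for the check.
const (
	BinaryDataV1 BinaryDataVersion = iota + 1
	BinaryDataV2
	BinaryDataV3
)

// Info is a reduced stand-in for the parsed .info / .remote-info contents.
type Info struct {
	BinaryDataVer BinaryDataVersion
}

// isLegacy reports whether the shard uses the legacy format,
// i.e. its binary data version predates BinaryDataV3.
func isLegacy(info Info) bool {
	return info.BinaryDataVer < BinaryDataV3
}

func main() {
	fmt.Println(isLegacy(Info{BinaryDataVer: BinaryDataV2})) // true
	fmt.Println(isLegacy(Info{BinaryDataVer: BinaryDataV3})) // false
}
```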
Disk space estimate
- Average .info size: 450–500 bytes
- One shard every 5 minutes (after compaction)
Yearly volume:
500 bytes * (525,600 min / 5 min) = 500 bytes * 105,120 shards ≈ 52.6 MB (about 50 MiB)
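The estimate can be reproduced with a few lines of Go, using the upper bound of 500 bytes per .info file and one shard every 5 minutes. The function names are illustrative only.

```go
package main

import "fmt"

// Minutes in a non-leap year: 365 * 24 * 60 = 525,600.
const minutesPerYear = 365 * 24 * 60

// shardsPerYear returns how many shards appear in a year
// when one shard is produced every intervalMin minutes.
func shardsPerYear(intervalMin int) int {
	return minutesPerYear / intervalMin
}

// yearlyBytes returns the yearly disk usage of the .remote-info files.
func yearlyBytes(infoSize, intervalMin int) int {
	return infoSize * shardsPerYear(intervalMin)
}

func main() {
	total := yearlyBytes(500, 5)
	fmt.Printf("shards per year: %d\n", shardsPerYear(5)) // 105120
	fmt.Printf("bytes per year: %d (%.1f MB, %.1f MiB)\n",
		total, float64(total)/1e6, float64(total)/(1<<20)) // 52560000 (52.6 MB, 50.1 MiB)
}
```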