
Conversation

@xinyangge

Please ensure your PR title follows the format:

type(scope): subject

Example:
feat(api): add user login endpoint

Available types:

  • feat: A new feature
  • fix: A bug fix
  • docs: Documentation only changes
  • style: Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc)
  • refactor: A code change that neither fixes a bug nor adds a feature
  • perf: A code change that improves performance
  • test: Adding missing tests or correcting existing tests
  • build: Changes that affect the build system or external dependencies (example scopes: gulp, broccoli, npm)
  • ci: Changes to our CI configuration files and scripts (example scopes: Travis, Circle, BrowserStack, SauceLabs)
  • chore: Other changes that don't modify src or test files
  • revert: Reverts a previous commit

Description

Link to the issue in case of a bug fix.

Testing details

  1. Manual - NA
  2. Unit tests - NA
  3. Integration tests - NA

Any backward incompatible change? If so, please explain.

claude and others added 13 commits November 7, 2025 19:12
This commit adds the foundational infrastructure for sparse file support
in file cache mode to optimize random I/O by downloading only requested
chunks instead of entire files.

Changes:
- Add ByteRangeMap data structure to track downloaded byte ranges
- Extend FileInfo with SparseMode flag and DownloadedRanges tracking
- Update FileInfo.Size() to return actual downloaded bytes for sparse files
- Add config options: enable-sparse-file and sparse-file-chunk-size-mb
- Update CacheHandler to initialize FileInfo with sparse mode when configured
- Update all NewCacheHandler callsites to pass file cache config

The sparse file feature is controlled by the enable-sparse-file config
option and defaults to 1MB chunk size for partial downloads.

This commit implements the core sparse file functionality to support
partial file downloads in file cache mode, optimizing for random I/O
workloads by downloading only requested chunks instead of entire files.

Key changes:
- Add Job.DownloadRange() method to download specific byte ranges
- Modify CacheHandle.Read() to check downloaded ranges and trigger
  partial downloads for sparse files
- Align chunk downloads to chunk boundaries for better cache efficiency
- Support chunk size configuration via sparse-file-chunk-size-mb
- Handle sparse files in both active job and completed job scenarios
- Fall back to GCS when requested range is not cached

Implementation details:
- For sparse mode files, reads check if the requested byte range is
  already downloaded using ByteRangeMap
- If not downloaded, calculate chunk boundaries and download the chunk
- Chunk size defaults to 1MB and aligns downloads to chunk boundaries
- Downloaded ranges are tracked in FileInfo.DownloadedRanges
- Cache size accounting uses actual downloaded bytes via FileInfo.Size()
- Eviction and cleanup work automatically with sparse files

The sparse file feature enables efficient random reads without downloading
entire files, significantly reducing bandwidth and storage for workloads
that only access portions of large files.
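
The chunk-boundary alignment described above can be sketched as follows. This is a minimal illustration of rounding a requested byte range out to enclosing chunk boundaries; the function and parameter names are illustrative, not the actual gcsfuse identifiers.

```go
package main

import "fmt"

// chunkAlignedRange widens a requested byte range [offset, offset+length)
// to the enclosing chunk boundaries, clamped to the object size.
// Illustrative sketch only; the real gcsfuse code differs in names and details.
func chunkAlignedRange(offset, length, chunkSize, objectSize int64) (start, end int64) {
	start = (offset / chunkSize) * chunkSize // round down to a chunk boundary
	end = ((offset + length + chunkSize - 1) / chunkSize) * chunkSize // round up
	if end > objectSize {
		end = objectSize
	}
	return start, end
}

func main() {
	// A 100-byte read at offset ~3.3 MiB with 1 MiB chunks downloads chunk 3.
	start, end := chunkAlignedRange(3_500_000, 100, 1<<20, 10<<20)
	fmt.Println(start, end) // 3145728 4194304
}
```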

Replace the complex arbitrary range tracking implementation with a simpler
chunk-based approach that leverages the 1MB-aligned download strategy.

Changes:
- Use map[chunkID]bool instead of sorted slice of arbitrary ranges
- Remove complex range merging and overlap detection logic
- Simplify all operations to O(chunks_in_range) complexity
- Track full chunks only, even for partial byte range requests
- Add explicit chunk alignment test case

Benefits:
- Simpler, more maintainable code (~50% fewer lines in core logic)
- Faster operations with O(1) chunk lookup instead of binary search
- Better alignment with the existing chunk-aligned download strategy
- Lower memory overhead per operation
- Easier to reason about and debug

The chunk-based approach is ideal since downloads are already aligned to
chunk boundaries in cache_handle.go, eliminating the need for arbitrary
byte range tracking.
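
The chunk-based approach above can be sketched like this. It is a simplified stand-in for the described `map[chunkID]bool` design; the type and method names mirror the commit message but are not the actual gcsfuse implementation.

```go
package main

import "fmt"

// ByteRangeMap tracks downloaded data at chunk granularity, as described
// above: full chunks only, with O(1) lookup per chunk. Illustrative sketch.
type ByteRangeMap struct {
	chunkSize int64
	chunks    map[int64]bool // chunkID -> downloaded
}

func NewByteRangeMap(chunkSize int64) *ByteRangeMap {
	return &ByteRangeMap{chunkSize: chunkSize, chunks: make(map[int64]bool)}
}

// AddRange marks every chunk overlapping [start, end) as downloaded.
func (m *ByteRangeMap) AddRange(start, end int64) {
	for id := start / m.chunkSize; id*m.chunkSize < end; id++ {
		m.chunks[id] = true
	}
}

// ContainsRange reports whether every chunk overlapping [start, end) has
// been downloaded; O(chunks_in_range) total, each lookup O(1).
func (m *ByteRangeMap) ContainsRange(start, end int64) bool {
	for id := start / m.chunkSize; id*m.chunkSize < end; id++ {
		if !m.chunks[id] {
			return false
		}
	}
	return true
}

func main() {
	m := NewByteRangeMap(1 << 20)
	m.AddRange(3_145_728, 4_194_304) // chunk 3
	fmt.Println(m.ContainsRange(3_500_000, 3_500_100)) // true
	fmt.Println(m.ContainsRange(0, 100))               // false
}
```

A partial read inside a downloaded chunk hits the map without any range merging or binary search, which is where the simplification comes from.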

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Previously, sparse files would stop growing once the download job was
completed or removed. This meant that subsequent reads to uncached regions
would always fall back to GCS without populating the cache.

This commit fixes the bug by recreating the download job when needed and
downloading missing chunks on-demand, regardless of whether the file was
accessed sequentially or randomly.

Changes:
- Add jobManager, bucket, object, and fileCacheConfig to CacheHandle
  to enable job recreation
- Update NewCacheHandle signature to accept these new parameters
- Modify sparse file handling logic to recreate jobs when fileDownloadJob
  is nil and download missing chunks on-demand
- Update cache_handler.go and cache_handle_test.go callsites

Benefits:
- Sparse file cache continues to grow with access patterns
- Random I/O workloads benefit from incremental caching
- No need for pre-warming or sequential access
- Cache naturally adapts to actual usage patterns

The fix ensures that sparse files remain useful for random I/O workloads
by continuously populating the cache as needed, rather than becoming
read-only after the initial job completes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Added two new configuration fields to the file-cache section:
- enable-sparse-file: Enables sparse file mode for random I/O optimization
- sparse-file-chunk-size-mb: Configures chunk size for sparse downloads (default 1MB)

Regenerated config.go from params.yaml using go generate.

This fixes the Linux compilation error where SparseFileChunkSizeMb field
was manually added to config.go but not present in the schema, causing
the field to be missing when config.go was regenerated on Linux builds.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Fixed two compilation issues:
1. cache_handle.go: Changed to use fch.fileCacheConfig instead of
   fch.fileDownloadJob.fileCacheConfig (unexported field access)

2. cache_handler.go: Renamed fileInfo to newFileInfo in addEntryToCache
   block to avoid type conflict with lru.ValueType from outer scope

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

…osure

The sparse file cache was closing cache handles prematurely after successful
random reads because validateEntryInFileInfoCache was checking the Offset
field, which only tracks the highest contiguous range from offset 0. For
sparse files with random access patterns, downloaded chunks at high offsets
would fail this validation even though they were successfully cached.

Changes:
- Modified validateEntryInFileInfoCache to skip Offset check for sparse files
- Changed sparse download error handling to return ErrFallbackToGCS instead
  of wrapped ErrInvalidFileInfoCache to prevent unnecessary handle closure
- Added debug logging for sparse file cache hit checks

This fix enables proper cache reuse for sparse file random reads, improving
performance from milliseconds to microseconds for cached chunks.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

The LRU cache was not properly tracking size increases when sparse file
chunks were downloaded, causing the cache to grow unbounded beyond the
configured size limit.

Root cause: When updating FileInfo after downloading a chunk, the code
mutated the ByteRangeMap pointer in place, then called Insert. Since both
the old cached entry and the new entry pointed to the same mutated
ByteRangeMap, the LRU cache's size accounting logic would calculate:
  currentSize -= oldEntry.Size()  // Returns NEW size (already mutated)
  currentSize += newEntry.Size()  // Returns NEW size
  // Net effect: no change to currentSize

This prevented eviction from triggering even when files exceeded the limit.

Fix: Call Erase before mutating the ByteRangeMap, so the old entry's
original size is properly subtracted before the new (larger) size is added.
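
The aliasing bug can be reproduced with a toy delta-accounting cache. This is a hedged sketch: `entry` stands in for a FileInfo holding a shared `*ByteRangeMap` pointer, and the names are illustrative, not the gcsfuse LRU API.

```go
package main

import "fmt"

// entry simulates an LRU value whose Size() reads shared mutable state,
// like FileInfo holding a *ByteRangeMap. Illustrative only.
type entry struct{ ranges *[]int }

func (e entry) Size() int { return len(*e.ranges) }

// cache does delta-based size accounting like the file cache's LRU.
type cache struct {
	m    map[string]entry
	size int
}

func (c *cache) Insert(k string, e entry) {
	if old, ok := c.m[k]; ok {
		// Subtract the old entry's size first. If the shared ranges were
		// already mutated, old.Size() returns the NEW size and the growth
		// is silently lost.
		c.size -= old.Size()
	}
	c.m[k] = e
	c.size += e.Size()
}

func (c *cache) Erase(k string) {
	if old, ok := c.m[k]; ok {
		c.size -= old.Size()
		delete(c.m, k)
	}
}

func main() {
	ranges := []int{1} // one downloaded chunk
	e := entry{&ranges}

	// Buggy order: mutate the shared state, then Insert -> size never grows.
	buggy := &cache{m: map[string]entry{}}
	buggy.Insert("f", e)
	ranges = append(ranges, 2) // download another chunk
	buggy.Insert("f", e)
	fmt.Println(buggy.size) // 1 (wrong; two chunks are actually cached)

	// Fixed order: Erase before mutating, then Insert.
	ranges = []int{1}
	fixed := &cache{m: map[string]entry{}}
	fixed.Insert("f", e)
	fixed.Erase("f")
	ranges = append(ranges, 2)
	fixed.Insert("f", e)
	fmt.Println(fixed.size) // 2 (correct)
}
```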

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

When a sparse file grows beyond the cache size limit, we now:
1. Read the just-downloaded chunk back into memory
2. Delete the entire sparse file to reclaim disk space
3. Recreate the file with only the current chunk
4. Update FileInfo to track only this chunk

This prevents unbounded growth for single-file workloads where the file
exceeds the cache size limit. The most recently accessed chunk is preserved
while old unused chunks are discarded.

The overhead of reading back 1MB into memory is acceptable since hitting
the cache size limit should be infrequent.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Problem: Previously, sparse file reads caused double page caching:
1. Cache file on disk had its own page cache
2. FUSE mount had its own page cache
This wasted memory with duplicate data.

Solution:
- Use O_DIRECT flag when writing cache files (bypasses page cache)
- Download chunks to memory-aligned buffers
- Return downloaded data from DownloadRange
- Store in-memory chunk in CacheHandle (sparseChunkData)
- Serve reads from in-memory data when available

Data flow now:
  GCS → aligned buffer → disk (O_DIRECT) → memory (sparseChunkData)
  On read: sparseChunkData → FUSE (no disk I/O for cached chunk)

Benefits:
- Eliminates duplicate page cache
- Only FUSE mount uses page cache
- Faster reads (no disk I/O for just-downloaded chunks)
- Memory usage proportional to active chunks, not total cached data

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

The Job is now always created eagerly in GetCacheHandle, eliminating the
need for lazy initialization logic. This removes unnecessary fields from
CacheHandle (jobManager, bucket, object) and simplifies the code.

Changes:
- Eagerly create Job in cache_handler.GetCacheHandle if nil
- Remove jobManager, bucket, object fields from CacheHandle
- Simplify NewCacheHandle signature (removed 3 parameters)
- Remove complex lazy Job recreation logic in cache_handle Read path
- Update tests to match new signature

This makes the code cleaner and easier to understand - CacheHandle no
longer needs to recreate Jobs on-demand since they're guaranteed to exist
at creation time for sparse files.

For sparse files, the Offset field is now set to MaxUint64 as a sentinel
value instead of tracking the highest contiguous offset from 0. This allows
us to simplify the code by removing SparseMode checks when validating Offset.

Changes:
- Set Offset to MaxUint64 (^uint64(0)) for sparse files in cache_handler
- Remove !fileInfoData.SparseMode check in validateEntryInFileInfoCache
- Remove complex Offset calculation logic in DownloadRange
- Update comments to document the sentinel value approach

Benefits:
- Simpler code: fileInfoData.Offset < requiredOffset works for both modes
- No need for SparseMode branching in validation logic
- Offset field for sparse files was unused anyway - DownloadedRanges is authoritative

@google-cla

google-cla bot commented Nov 8, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@github-actions

github-actions bot commented Nov 8, 2025

Hey there and thank you for opening this pull request! 👋🏼

We require pull request titles to follow the Conventional Commits specification and it looks like your proposed title needs to be adjusted.

Details:

No release type found in pull request title "Sparse file range read". Add a prefix to indicate what kind of release this pull request corresponds to. For reference, see https://www.conventionalcommits.org/

Available types:
 - feat: A new feature
 - fix: A bug fix
 - docs: Documentation only changes
 - style: Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc)
 - refactor: A code change that neither fixes a bug nor adds a feature
 - perf: A code change that improves performance
 - test: Adding missing tests or correcting existing tests
 - build: Changes that affect the build system or external dependencies (example scopes: gulp, broccoli, npm)
 - ci: Changes to our CI configuration files and scripts (example scopes: Travis, Circle, BrowserStack, SauceLabs)
 - chore: Other changes that don't modify src or test files
 - revert: Reverts a previous commit

xinyangge and others added 15 commits November 8, 2025 09:00
This change eliminates lazy initialization of fileDownloadJob for sparse
files by eagerly creating the job when CacheHandle is constructed.

Key changes:
- Removed jobManager, bucket, and object fields from CacheHandle struct
- Changed NewCacheHandle to no longer accept these three parameters
- Updated cache_handler.go to call CreateJobIfNotExists instead of GetJob,
  ensuring the job is created upfront
- Simplified sparse file download logic by assuming fileDownloadJob is
  always available (no longer needs to recreate it on-demand)
- Updated test to match new function signature

This reduces CacheHandle's field count and eliminates the complexity of
lazy job recreation for sparse file on-demand downloads.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

The comment about MaxUint64 sentinel for sparse files is no longer
accurate after recent simplifications to the sparse file handling logic.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Instead of storing the entire *cfg.FileCacheConfig, CacheHandle now stores
only the specific value it needs: sparseFileChunkSizeMb (int64).

Changes:
- Replaced fileCacheConfig field with sparseFileChunkSizeMb field
- Updated NewCacheHandle to accept int64 instead of *cfg.FileCacheConfig
- Extract sparseFileChunkSizeMb at call sites (cache_handler.go and test)
- Removed unused cfg import from cache_handle.go

This simplifies CacheHandle by storing only what's actually needed,
reducing coupling to the config package.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

…e config

Replaced the sparse-file-chunk-size-mb configuration with sequential-read-size-mb
to simplify configuration and reuse existing parameters.

Changes:
- Added SequentialReadSizeMb() getter to JobManager
- Updated cache_handler.go to use jobManager.SequentialReadSizeMb()
- Updated cache_handle_test.go to use DefaultSequentialReadSizeMb directly
- Removed SparseFileChunkSizeMb field from FileCacheConfig
- Removed sparse-file-chunk-size-mb from params.yaml
- Cleaned up resolveSparseFileConfig() in rationalize.go

Now sparse file downloads use the same chunk size as sequential reads,
eliminating the need for a separate configuration parameter.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Removed the now-empty resolveSparseFileConfig function and its call site
after eliminating the SparseFileChunkSizeMb configuration field.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

…uentialReadSizeMb

This change eliminates redundant storage of the chunk size by retrieving it
directly from the fileDownloadJob when needed. This maintains a single source
of truth for the configuration value.

Changes:
- Removed sparseFileChunkSizeMb field from CacheHandle struct
- Updated NewCacheHandle to not accept sparseFileChunkSizeMb parameter
- Modified Read method to call fileDownloadJob.SequentialReadSizeMb() instead
- Added SequentialReadSizeMb() getter method to Job
- Updated all call sites to use new signature

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

…eConfig

Replace fileCacheConfig field with isSparse boolean to simplify the struct.
The isSparse value is computed once in the constructor from the config parameter.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

This change simplifies the NewCacheHandler function signature by passing
only the isSparse boolean flag instead of the entire FileCacheConfig struct.
This reduces coupling and makes the dependency more explicit.

Changes:
- Updated NewCacheHandler to accept isSparse bool parameter
- Removed cfg import from cache_handler.go
- Updated all call sites to pass isSparse directly
- In production code, extract isSparse from config: EnableSparseFile
- In test code, pass false for isSparse

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

This change eliminates duplicate code by consolidating the sparse file
download handling into a single code path. Previously, the sparse file
DownloadRange logic appeared in two locations, which was redundant since
fileDownloadJob is guaranteed to be non-nil in sparse mode.

Changes:
- Removed fileDownloadJob != nil check from sparse mode condition
- Removed redundant DownloadedRanges != nil check (simplified to direct call)
- Eliminated duplicate sparse file handling in the else block
- Restructured control flow: sparse → non-sparse with job → completed/no job
- Removed nested if condition that was always checking SparseMode inside sparse block

This simplifies the code while maintaining the same functionality.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Updated comment to be more accurate: when fileDownloadJob is nil, it means
either the job successfully completed OR it failed/was never created.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

This change removes the custom oDirectFlag constant (0x4000) and uses
the standard syscall.O_DIRECT instead. This is cleaner and more idiomatic
since the codebase is Linux-only.

Changes:
- Removed oDirectFlag constant definition
- Replaced oDirectFlag with syscall.O_DIRECT in os.OpenFile calls
- syscall package is already imported, so no import changes needed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

This change removes the in-memory caching of sparse file chunks that was
intended to avoid reading back from disk after O_DIRECT writes. The logic
added complexity and the sparse chunk data fields are now unused.

Simplified the read path to always read from the file handle, which makes
the code cleaner and easier to maintain. The O_DIRECT optimization in
DownloadRange already bypasses the page cache for writes.

Changes:
- Removed sparseChunkData in-memory buffer logic from Read method
- Simplified read path to always use fileHandle.ReadAt
- Removed 17 lines of conditional logic for sparse chunk handling

Note: sparseChunkData and sparseChunkStart fields remain in the struct
but are no longer used in Read. They can be removed in a future cleanup.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

The check for whether a range is already downloaded belongs in the caller
(CacheHandle.Read), not in DownloadRange. The caller already checks
DownloadedRanges.ContainsRange before calling DownloadRange, making this
check redundant.

Removing this simplifies the function and clarifies responsibilities:
- Caller decides when to download
- DownloadRange performs the download

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

The check for whether a file is in sparse mode is already performed by
the caller (CacheHandle.Read) before calling DownloadRange. This redundant
validation in DownloadRange is unnecessary and can be removed.

Simplifies the function by removing defensive programming that duplicates
caller responsibilities.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

DownloadedRanges is always initialized when FileInfo is created for sparse
files (in addFileInfoEntryAndCreateDownloadJob). This nil check is
unnecessary and can be removed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

xinyangge and others added 2 commits November 8, 2025 11:30
…rn only error

Since in-memory chunk caching was removed, the sparseChunkData and
sparseChunkStart fields in CacheHandle are no longer used. Additionally,
DownloadRange no longer needs to return the downloaded bytes.

Changes:
- Removed sparseChunkData and sparseChunkStart fields from CacheHandle
- Changed DownloadRange signature from ([]byte, error) to error
- Updated all return statements in DownloadRange to return only errors
- Updated caller in CacheHandle.Read to not expect bytes return value
- Simplified comment about avoiding double page cache

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Simplified DownloadRange to directly mutate DownloadedRanges without
Erase+Insert pattern. This removes the complexity of maintaining accurate
LRU cache size accounting for incrementally growing sparse files.

The LRU cache will no longer track the actual downloaded bytes for sparse
files, but this is acceptable as the primary goal is to enable partial
downloads for large files rather than perfect cache accounting.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>