[BUG]: cuFile errors out when using kvbm disk cache #4149

@wangqia0309

Describe the Bug

Hi, I integrated vLLM with the latest Dynamo and enabled kvbm disk offload. I started the vLLM container with the following options:

--mount-workspace --use-nixl-gds

I also set these environment variables:

export DYN_KVBM_CPU_CACHE_GB=50
export DYN_KVBM_DISK_CACHE_GB=100

After launching vLLM, a file named cufile.log appeared and contained the following two lines:

06-11-2025 03:44:55:695 [pid=424 tid=488] NOTICE  cufio-drv:830 running in compatible mode
06-11-2025 03:45:10:764 [pid=424 tid=488] ERROR  cufio-fs:79 mount option not found in mount table data device: /dev/vda1

When I check with df -Th, I get:

/dev/vda1      ext4       697G  658G   39G  95% /tmp
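To see which mount options are actually recorded for the device, the mount table can be read directly. The following is a minimal sketch, not part of Dynamo or cuFile: `mount_options` is a hypothetical helper name, and it makes no assumption about which option cuFile is looking for. It only lists what `/proc/mounts` reports for a given device.

```python
def mount_options(device, mounts_text):
    """Return {mount_point: (fs_type, [options])} for every entry of `device`.

    `mounts_text` is the content of /proc/mounts, whose fields are:
    device, mount point, filesystem type, options, dump, pass.
    """
    found = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip blank/malformed lines and entries for other devices.
        if len(fields) < 4 or fields[0] != device:
            continue
        found[fields[1]] = (fields[2], fields[3].split(","))
    return found


def read_mount_options(device, mounts_path="/proc/mounts"):
    """Convenience wrapper that reads the live mount table (Linux only)."""
    with open(mounts_path) as f:
        return mount_options(device, f.read())
```

Inside the container, `read_mount_options("/dev/vda1")` would return the options for each mount point backed by that device, which can then be compared against whatever cuFile expects for ext4.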

I’m not sure why cuFile cannot find the mount configuration.
The 100GB disk space seems to have been allocated, but I don’t see any corresponding file under /tmp.
When I run lsof to inspect open files, it shows that the disk cache files have been deleted while still in use:

VLLM::Wor 424 root 127u REG 253,1 99998760960 8670 /tmp/dynamo-kvbm-disk-cache-d2VwbN (deleted)
VLLM::Wor 424 root 128u REG 253,1 99998760960 8670 /tmp/dynamo-kvbm-disk-cache-d2VwbN (deleted)
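The same `(deleted)` symptom can be reproduced on Linux by unlinking a file while a descriptor to it is still open. The sketch below is only an illustration of that general filesystem behavior. Whether kvbm creates its disk cache file this way is an assumption I have not confirmed from the source, and the `demo-disk-cache-` prefix and the `open_then_unlink` helper are made up for the example.

```python
import os
import tempfile


def open_then_unlink(directory=None):
    """Create a temp file, then unlink it while the descriptor stays open.

    Returns (fd, path). The caller is responsible for os.close(fd).
    """
    fd, path = tempfile.mkstemp(prefix="demo-disk-cache-", dir=directory)
    os.unlink(path)  # removes the directory entry, not the open file
    return fd, path


fd, path = open_then_unlink()
try:
    # The file is still readable and writable through the descriptor.
    os.write(fd, b"still writable")
    os.lseek(fd, 0, os.SEEK_SET)
    data = os.read(fd, 64)
    # No directory entry refers to the file any more.
    link_count = os.fstat(fd).st_nlink
    visible = os.path.exists(path)
finally:
    os.close(fd)  # the disk space is released once the last descriptor closes
```

While the descriptor is open, `lsof` lists such a file with the `(deleted)` suffix, and no entry for it appears in the directory listing.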

Could you please help explain:

  1. Why does cuFile report "mount option not found in mount table" for /dev/vda1?
  2. Why is the disk cache file deleted while it is still in use (shown as (deleted) in lsof)?

Thanks a lot for your help!

Steps to Reproduce

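  1. Launch the container with run.sh, passing --mount-workspace --use-nixl-gds.
  2. Inside the container, set DYN_KVBM_CPU_CACHE_GB=50 and DYN_KVBM_DISK_CACHE_GB=100, then launch vLLM with disk offload enabled.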

Expected Behavior

Disk offload works and cufile.log contains no errors.

Actual Behavior

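cuFile writes the "mount option not found in mount table" error to cufile.log, and the disk cache file appears as (deleted) in lsof.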

Environment

ai-dynamo 0.6.0
vllm 0.11.0

