vllm.v1.attention.ops.chunked_prefill_paged_decode
has_native_kv_cache_layout
Return whether KV cache blocks can use the native ROCm pairing.
The C++ ops.paged_attention_rocm custom kernel requires each block to be contiguous in memory. This function returns False for stride-padded hybrid layouts and for the unified KV cache (RFC #42082; see PagedAttention.split_kv_cache), so those cases are routed to the Triton kernel instead.
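
To illustrate what "contiguous in memory" means for a paged KV cache, here is a minimal sketch of a stride-based layout check. It is not the vLLM implementation: the helper names `dense_strides` and `blocks_are_contiguous`, the 4-D shape `(num_blocks, num_kv_heads, block_size, head_size)`, and the rule that the block stride must equal the number of elements per block are all assumptions made for illustration. Strides are measured in elements, not bytes.

```python
from math import prod


def dense_strides(shape):
    """Return row-major (C-contiguous) strides, in elements, for `shape`."""
    strides = []
    running = 1
    for dim in reversed(shape):
        strides.append(running)
        running *= dim
    return tuple(reversed(strides))


def blocks_are_contiguous(shape, strides):
    """Hypothetical check, assuming dim 0 indexes the cache blocks.

    Returns True only if every block is laid out densely and the stride
    between consecutive blocks equals the number of elements in one block
    (i.e. no padding is inserted between blocks).
    """
    block_shape = tuple(shape[1:])
    # The dimensions inside a block must follow the dense row-major layout.
    if tuple(strides[1:]) != dense_strides(block_shape):
        return False
    # A padded layout has a block stride larger than the block itself.
    return strides[0] == prod(block_shape)


# Assumed example shape: (num_blocks, num_kv_heads, block_size, head_size).
shape = (4, 2, 16, 8)
packed = blocks_are_contiguous(shape, (256, 128, 8, 1))  # dense layout
padded = blocks_are_contiguous(shape, (320, 128, 8, 1))  # padded block stride
```

Under these assumptions `packed` is True and `padded` is False; a caller would send the padded case to a stride-aware fallback kernel rather than to one that expects densely packed blocks.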