Releases: ROCm/flash-attention

v2.6.3-cktile

17 Sep 18:52
e2182cc

This release contains the changes we submitted upstream in this PR:

  1. Update the ROCm backend (CK) and adjust the CK call sites to match changes in the CK API.
  2. Improve backward performance by updating CK (1)
  3. Implement mha_fwd_kvcache().
  4. Change the compile flags to support ROCm 6.2.
  5. Change bf16 rounding to RTN (round to nearest).
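
To make the bf16 rounding change in item 5 concrete, here is a minimal pure-Python sketch of converting a float32 value to bf16 with round-to-nearest. It is an illustration only, not the CK implementation. The helper names are hypothetical, resolving ties to even is an assumption (the note only says "round to nearest"), and truncation is shown purely for contrast, since the note does not say which mode was used before.

```python
import struct


def f32_bits(x: float) -> int:
    """Return the IEEE-754 float32 bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bf16_truncate(x: float) -> int:
    """bf16 bits obtained by simply dropping the low 16 bits (shown for contrast)."""
    return f32_bits(x) >> 16


def bf16_round_nearest(x: float) -> int:
    """bf16 bits obtained by rounding to nearest; ties go to even (assumption)."""
    bits = f32_bits(x)
    if (bits & 0x7FFFFFFF) > 0x7F800000:
        # NaN: keep it a NaN by forcing a mantissa bit instead of rounding.
        return (bits >> 16) | 0x0040
    lsb = (bits >> 16) & 1          # lowest bit that survives in bf16
    bits += 0x7FFF + lsb            # rounding bias; the lsb term breaks ties to even
    return (bits >> 16) & 0xFFFF


def bf16_to_float(b: int) -> float:
    """Expand a 16-bit bf16 pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


x = 1.005859375  # 1 + 2**-8 + 2**-9, lies between bf16 values 1.0 and 1.0078125
print(bf16_to_float(bf16_truncate(x)))       # truncation keeps the lower neighbour
print(bf16_to_float(bf16_round_nearest(x)))  # rounding picks the closer neighbour
```

For this input, truncation yields 1.0 while round-to-nearest yields 1.0078125, which is the closer of the two representable bf16 values.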

v2.6.2-cktile

14 Aug 10:02

This is the first release to support the Composable Kernel tile (CK tile) backend.

vllm-v2.5.9post1-90a.942-240719

19 Jul 14:18
23a2b1c
Pre-release

This release is created solely for convenient installation by vLLM. The attached wheel is created from the ck_tile branch as of 07/19/2024, with commit hash 23a2b1c2f21, for architectures gfx90a;gfx942, and is designed for use with torch==2.5.0.dev20240710 (this requirement is not strict) and ROCm 6.1.

To install the matching version of torch:

python3 -m pip install --no-cache-dir --pre \
    torch==2.5.0.dev20240710 torchvision==0.20.0.dev20240710 \
    --index-url https://download.pytorch.org/whl/nightly/rocm6.1
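
After installing, it can be useful to confirm that the environment matches what the wheel was built for. The sketch below is one way to do that, not part of the release. The helper names `check_env` and `check_installed` are hypothetical. It assumes that nightly builds report a version such as `2.5.0.dev20240710+rocm6.1` in `torch.__version__`, and that `torch.version.hip` holds the ROCm/HIP version string (or `None` on non-ROCm builds). Because the torch requirement is not strict, mismatches are reported as messages rather than raised as errors.

```python
from typing import List, Optional

EXPECTED_TORCH = "2.5.0.dev20240710"  # from the release note
EXPECTED_ROCM = "6.1"                 # from the release note


def check_env(torch_version: str, hip_version: Optional[str]) -> List[str]:
    """Return a list of mismatch messages; an empty list means everything matches."""
    problems = []
    # Drop the local build suffix, e.g. "+rocm6.1", before comparing.
    base_version = torch_version.split("+")[0]
    if base_version != EXPECTED_TORCH:
        problems.append(
            "torch is %s, wheel was built against %s" % (base_version, EXPECTED_TORCH)
        )
    if hip_version is None:
        problems.append("this torch build has no ROCm/HIP support")
    elif not (hip_version == EXPECTED_ROCM or hip_version.startswith(EXPECTED_ROCM + ".")):
        problems.append(
            "ROCm/HIP is %s, wheel targets ROCm %s" % (hip_version, EXPECTED_ROCM)
        )
    return problems


def check_installed() -> List[str]:
    """Run check_env against the torch package installed in this environment."""
    import torch  # imported lazily so check_env stays usable without torch

    return check_env(torch.__version__, torch.version.hip)
```

Calling `check_installed()` in the target environment returns an empty list when both versions match, and a list of human-readable mismatches otherwise.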