@LoserCheems
Collaborator

Update module references, import statements, and function names to use the more descriptive flash_dmattn naming convention. Enhance the kernel generator by changing the description and adding a default output directory for better usability. Reorganize source file paths for improved code organization. Maintain consistency across CUDA and Triton implementations.

Updates import statements and function references to use the more descriptive
flash_dmattn naming convention instead of flash_dma.

Ensures consistency across CUDA and Triton implementations while maintaining
the same functionality and API structure.

Changes the description from "flash_attention" to "flash_dmattn" to reflect the correct kernel type being generated.

Adds a default value of "instantiations" for the output directory parameter to improve usability when no specific directory is provided.

Updates package name throughout the build configuration to reflect the new flash_dmattn naming convention. Changes include package name, environment variable prefixes, CUDA extension name, and build exclusions.

Reorganizes source file paths by moving instantiation files into a dedicated instantiations subdirectory for better code organization.
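
For illustration only, here is a minimal sketch of how a build script could pick up sources from the dedicated subdirectory. The helper name `collect_instantiations` and the `.cu` extension are assumptions, not taken from this PR's `setup.py`.

```python
from pathlib import Path


def collect_instantiations(src_root):
    """Return the sorted .cu files under <src_root>/instantiations.

    Hypothetical helper: the directory name follows this PR, but the
    function name and the .cu extension are assumptions.
    """
    inst_dir = Path(src_root) / "instantiations"
    # glob() yields nothing when the directory does not exist,
    # so a missing folder gives an empty list rather than an error.
    return sorted(str(path) for path in inst_dir.glob("*.cu"))


if __name__ == "__main__":
    for source in collect_instantiations("csrc/src"):
        print(source)
```

Sorting keeps the source list stable between builds, which makes build logs easier to compare.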
Contributor

Copilot AI left a comment

Pull Request Overview

The PR renames the core module and files from flash_dma to flash_dmattn, reorganizes CUDA source paths into an instantiations folder, and improves the kernel generator CLI by updating its description and adding a default output directory.

  • Rename all occurrences of flash_dma to flash_dmattn, including package name, env vars, imports, and extension names
  • Move CUDA kernel source files into csrc/src/instantiations/ for better organization
  • Update generate_kernels.py to default --output_dir to instantiations and adjust its description

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| setup.py | Updated package/env var names and extension identifiers |
| csrc/src/generate_kernels.py | Changed CLI description and added default output directory |
| benchmarks/benchmark_forward_equivalence.py | Renamed imports and function references for CUDA/Triton |

Comments suppressed due to low confidence (2)

benchmarks/benchmark_forward_equivalence.py:36

  • In the Triton import block, flash_dmattn_func is only defined in the except clause; on successful import you should alias or assign the imported function/module to flash_dmattn_func so it's always defined before it's used.
    print(f"❌ Failed to import flash_dmattn_triton: {e}")
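
One way to address this, sketched under assumptions: resolve the optional function through a small helper so that `flash_dmattn_func` is bound on both the success and the failure path. The helper `load_optional` and the module path `flash_dmattn.flash_dmattn_triton` are hypothetical and may not match the benchmark's actual imports.

```python
import importlib


def load_optional(module_name, attr_name):
    """Return module_name.attr_name, or None if the import or lookup fails."""
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr_name)
    except (ImportError, AttributeError) as e:
        print(f"Failed to import {attr_name} from {module_name}: {e}")
        return None


# Hypothetical module path and attribute name; adjust to the real package layout.
flash_dmattn_func = load_optional(
    "flash_dmattn.flash_dmattn_triton", "flash_dmattn_func"
)

if flash_dmattn_func is None:
    # The name is always defined, so callers can test it before use.
    print("Triton implementation unavailable; skipping the Triton comparison.")
```

With this shape, later code can check `flash_dmattn_func is None` instead of risking a `NameError`.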

csrc/src/generate_kernels.py:108

  • The help text says it will default to the current directory, but the actual default is now instantiations; update this message to reflect the new default folder.
        help="Where to generate the kernels "
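
A possible fix, as a sketch rather than the file's actual code: let argparse interpolate the default into the help string with `%(default)s`, so the message cannot drift from the real value. The description string and the `build_parser` wrapper are assumptions; only the `--output_dir` flag and the `instantiations` default come from this PR.

```python
import argparse


def build_parser():
    """Build a parser resembling the one described for generate_kernels.py."""
    parser = argparse.ArgumentParser(
        prog="generate_kernels",
        # Assumed wording; the PR only says the description now names flash_dmattn.
        description="Generate the flash_dmattn kernel template instantiations",
    )
    parser.add_argument(
        "--output_dir",
        default="instantiations",
        # %(default)s is expanded by argparse when the help text is rendered.
        help="Where to generate the kernels (default: %(default)s)",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.output_dir)
```

Because the help text reads the default from the argument itself, changing `default=` later updates the message automatically.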

@LoserCheems LoserCheems merged commit 01d046b into main Jul 7, 2025