GPU Batch 11 #1196

Merged: 1 commit, Sep 16, 2024

Commits on Sep 12, 2024

  1. Add device only impl

    Add function for device only impl
    
    Fix function signatures
    
    Fix arrays
    
    Add basic test for compilation
    
    Allow serial implementation to be run on host under NVCC
    
    Add verification steps
    
    Add arrays of per-level coefficient sizes
    
    Cleanup test set
    
    Add double test set
    
    Add structure for the doubles support
    
    Save space by using pointer to different size arrays rather than 2d
    
    Separate the double precision weights into their own arrays
    
    Remove stray call to std::abs
    
    Add NVRTC testing
    
    Add documentation section
    
    Add device function signature for sinh_sinh_integrate
    
    Add float coefficients
    
    Add double coeffs
    
    Add device specific impl
    
    Add sinh_sinh CUDA testing
    
    Add sinh_sinh NVRTC testing
    mborland committed Sep 12, 2024
    b5214b5
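
The commit message mentions saving space "by using pointer to different size arrays rather than 2d", together with arrays of coefficient sizes. A minimal sketch of that general idea in plain C++ follows. It is not the code from this PR: the names `level0`, `level_rows`, `level_sizes`, `sum_level`, and the byte-count helpers are hypothetical, and the values are placeholders rather than real quadrature weights. In a CUDA build the tables would presumably also need device-side storage qualifiers, which this host-only sketch omits.

```cpp
#include <cstddef>

// Placeholder per-level rows of different lengths (NOT the PR's coefficients).
static const float level0[] = {1.0f, 0.5f, 0.25f, 0.125f};
static const float level1[] = {0.75f, 0.375f};
static const float level2[] = {0.0625f};

// One pointer per level, instead of a padded float[3][4] table.
static const float* const level_rows[] = {level0, level1, level2};

// Companion array recording how many entries each level actually holds.
static const std::size_t level_sizes[] = {4, 2, 1};

constexpr std::size_t num_levels = sizeof(level_sizes) / sizeof(level_sizes[0]);
constexpr std::size_t max_level_size = 4; // width a rectangular 2D table would need

// Sum the entries of one level, using the size array as the loop bound.
float sum_level(std::size_t level)
{
    float total = 0.0f;
    for (std::size_t i = 0; i < level_sizes[level]; ++i)
    {
        total += level_rows[level][i];
    }
    return total;
}

// Bytes of coefficient data stored by the jagged layout (pointer table excluded).
std::size_t jagged_coefficient_bytes()
{
    std::size_t entries = 0;
    for (std::size_t level = 0; level < num_levels; ++level)
    {
        entries += level_sizes[level];
    }
    return entries * sizeof(float);
}

// Bytes a rectangular table padded to the longest row would occupy.
std::size_t padded_coefficient_bytes()
{
    return num_levels * max_level_size * sizeof(float);
}
```

With these placeholder rows the jagged layout stores 7 coefficients where a padded table would store 12. The pointer table adds its own overhead, so the saving only matters when the rows differ substantially in length.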