
[inductor] Added non-integer expr support for floordiv in triton codegen #115751

Closed
wants to merge 1 commit

Conversation

@vfdev-5 (Collaborator) commented Dec 13, 2023

Description:

This PR fixes a compilation error with the following code:

```python
import torch

def func(x, a):
    n = (a * 1.234) // 8.234
    y = x + n
    return y

cfunc = torch.compile(func, dynamic=True, fullgraph=True)

device = "cuda"
x = torch.tensor(0, dtype=torch.float32, device=device)
a = 33

out = cfunc(x, a)
expected = func(x, a)
torch.testing.assert_close(out, expected)
```

Error message on Nightly:

  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result                                                                           
    raise self._exception                                                                                                                           
torch._dynamo.exc.BackendCompilerFailed: backend='compile_fx_wrapper' raised:                                                                                                                                                
CompilationError: at 7:38:def triton_(in_ptr0, out_ptr0, ks0, xnumel, XBLOCK : tl.constexpr):                                                                                                                                
    xoffset = tl.program_id(0) * XBLOCK                                                                                                                                                                                                                           
    xindex = xoffset + tl.arange(0, XBLOCK)[:]                                                                          
    xmask = xindex < xnumel                                                                                                                                                                                                                                       
    x0 = xindex                                                                                                                                                                                                                                                   
    tmp0 = tl.load(in_ptr0 + (x0), xmask)                                                                                               
    tmp1 = ((1.23400000000000*ks0) // 8.23400000000000)                                                                                         
                                      ^                                                                                                                                                                                                                          
AssertionError() 
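
For reference (not part of the PR), Python defines float floor division as the floor of the true quotient, which is the value the generated kernel has to produce for the symbolic expression above. Checking the repro inputs in plain Python:

```python
import math

a = 33
print((a * 1.234) // 8.234)           # 4.0 -- float floor division
print(math.floor(a * 1.234 / 8.234))  # 4   -- floor of the true quotient
```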

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @ColinPeppler


pytorch-bot bot commented Dec 13, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/115751

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 8f2c7ca with merge base 310f6ab:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@vfdev-5 vfdev-5 changed the title [inductor] Added float input support for floordiv in triton codegen [inductor] Added non-integer expr support for floordiv in triton codegen Dec 13, 2023
@vfdev-5 (Collaborator, Author) commented Dec 13, 2023

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Dec 13, 2023
@pytorchmergebot (Collaborator) commented

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

guilhermeleobas pushed a commit to guilhermeleobas/pytorch that referenced this pull request Dec 18, 2023
[inductor] Added non-integer expr support for floordiv in triton codegen (pytorch#115751)

Description:
- Added non-integer expr support for floordiv in triton codegen
- Added a test
  - the cpp test is skipped because it currently fails; pytorch#115647 may fix it

This PR fixes a compilation error with the following code:
```python
import torch

def func(x, a):
    n = (a * 1.234) // 8.234
    y = x + n
    return y

cfunc = torch.compile(func, dynamic=True, fullgraph=True)

device = "cuda"
x = torch.tensor(0, dtype=torch.float32, device=device)
a = 33

out = cfunc(x, a)
expected = func(x, a)
torch.testing.assert_close(out, expected)
```
Error message on Nightly:
```
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
torch._dynamo.exc.BackendCompilerFailed: backend='compile_fx_wrapper' raised:
CompilationError: at 7:38:def triton_(in_ptr0, out_ptr0, ks0, xnumel, XBLOCK : tl.constexpr):
    xoffset = tl.program_id(0) * XBLOCK
    xindex = xoffset + tl.arange(0, XBLOCK)[:]
    xmask = xindex < xnumel
    x0 = xindex
    tmp0 = tl.load(in_ptr0 + (x0), xmask)
    tmp1 = ((1.23400000000000*ks0) // 8.23400000000000)
                                      ^
AssertionError()
```

Pull Request resolved: pytorch#115751
Approved by: https://github.com/peterbell10
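
For intuition about the change described above, the printer for floordiv needs an extra branch: keep Triton's `//` for integer expressions, and emit an explicit floor of the true division otherwise. A minimal sketch of that idea, with illustrative names only (this is not the actual PyTorch source; the real change lives in Inductor's Triton expression printer):

```python
import sympy

def print_floordiv(x: sympy.Expr, div: sympy.Expr) -> str:
    """Render floor(x / div) as a Triton expression string."""
    if x.is_integer and div.is_integer:
        # Integer floordiv was already supported and can use Triton's `//`.
        return f"(({x}) // ({div}))"
    # Non-integer expressions need an explicit floor of the true division;
    # `libdevice.floor` is one plausible lowering target (assumption).
    return f"libdevice.floor(({x}) / ({div}))"

ks0 = sympy.Symbol("ks0", integer=True)  # symbolic size from the failing kernel
print(print_floordiv(sympy.Float(1.234) * ks0, sympy.Float(8.234)))
# -> libdevice.floor(((1.23400000000000*ks0)) / ((8.23400000000000)))
```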
dmenig pushed a commit to dmenig/pytorch that referenced this pull request Dec 21, 2023
[inductor] Added non-integer expr support for floordiv in triton codegen (pytorch#115751)