Verified Solution

[pytorch/pytorch] [CUDA] AdaptiveAvgPool2d: START_IND macro causes int32 overflow → OOB read

### ROOT CAUSE

The `START_IND` macro in `AdaptiveAveragePooling.cu` uses integer arithmetic that can overflow for large input dimensions. Specifically, the intermediate results of `(a) / (b) * (c)` and `(a) % (b)` can exceed `int32` limits when `a`, `b`, or `c` are large. The overflowed value wraps to an incorrect (possibly negative) index, which causes an out-of-bounds memory read during the pooling computation.

### CODE FIX

Replace the `START_IND` macro with a version that performs the intermediate calculations in `int64_t` to prevent overflow:

```c
// Original (problematic) macro: all arithmetic happens in the
// 32-bit argument type, so the intermediate product can overflow.
#define START_IND(a,b,c) ((a) / (b) * (c) + (a) % (b))

// Fixed macro: promote to int64_t before dividing and multiplying.
#define START_IND(a,b,c) (((int64_t)(a) / (b)) * (c) + ((int64_t)(a) % (b)))
```

This change ensures all intermediate calculations are performed in `int64_t`, avoiding overflow for large values. The fix is applied in the CUDA kernel code for `adaptive_average_pool`.
