Verified Solution: [tensorflow/tensorflow] Integer overflow in Tile kernel MultiplyShapeDims causes heap buffer overflow and SIGBUS crash
### ROOT CAUSE
The issue arises from an unchecked integer multiplication in the `MultiplyShapeDims` function, which computes tensor dimensions. When two large dimensions are multiplied, the product can exceed the maximum value of `int64_t`. Signed overflow is undefined behavior in C++; in practice the result usually wraps to a negative or otherwise incorrect value. That bogus dimension is then used to size a memory allocation, so later reads and writes fall outside the buffer, which produces the heap buffer overflow and the SIGBUS crash.
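To make the wraparound concrete, here is a minimal sketch in plain C++. It is not TensorFlow code: `WrappedProduct` and `LooksNegativeAsInt64` are hypothetical helper names, and the multiplication is done in unsigned arithmetic (where wrapping is well-defined) purely to show what bit pattern an overflowing product leaves behind.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: computes a * b modulo 2^64 using unsigned arithmetic.
// Unsigned overflow is well-defined, so this shows the wrapped bit pattern
// without invoking the undefined behavior of signed overflow.
uint64_t WrappedProduct(int64_t a, int64_t b) {
  return static_cast<uint64_t>(a) * static_cast<uint64_t>(b);
}

// Hypothetical helper: true when the bit pattern has its top bit set, i.e.
// it would read as a negative number if reinterpreted as a two's-complement
// int64_t.
bool LooksNegativeAsInt64(uint64_t bits) {
  return bits > static_cast<uint64_t>(std::numeric_limits<int64_t>::max());
}
```

With these helpers, `WrappedProduct(3037000500, 3037000500)` has its top bit set, so it reads as negative, while `WrappedProduct(int64_t{1} << 32, int64_t{1} << 32)` wraps all the way around to zero. Either outcome yields a dimension that does not match the amount of data the kernel goes on to copy.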
### CODE FIX
Check for overflow before each multiplication so the product can never wrap. The same guard applies to both the GPU and CPU implementations:
```cpp
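// For tile_op_gpu.cu.h (GPU kernel)
// Requires <cstdint> for int64_t and INT64_MAX.
//
// Computes output[i] = shape1[i] * shape2[i] for each index that is valid in
// both inputs, guarding the multiplication against int64_t overflow.
template <typename T>
__global__ void MultiplyShapeDimsKernelLauncher(const int64_t dim1_count,
                                                const int64_t dim2_count,
                                                const T* shape1,
                                                const T* shape2,
                                                T* output) {
  // Widen before multiplying so the index itself cannot overflow.
  const int64_t i =
      static_cast<int64_t>(blockIdx.x) * blockDim.x + threadIdx.x;
  if (i < dim1_count && i < dim2_count) {
    const int64_t a = shape1[i];
    const int64_t b = shape2[i];
    if (a == 0 || b == 0) {
      output[i] = 0;
    } else if (a < 0 || b < 0 || a > INT64_MAX / b) {
      // Negative input or overflowing product: write 0 as a sentinel.
      // A kernel cannot return a Status, and 0 is indistinguishable from a
      // genuine zero-sized dimension, so validating on the host is preferable.
      output[i] = 0;
    } else {
      output[i] = static_cast<T>(a * b);
    }
  }
}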
```
```cpp
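// For tile_op.cc (CPU function)
// Requires <cstddef>, <cstdint>, <limits>, and <vector>.
// The element type int64_t is assumed; the original template arguments
// were lost.
Status MultiplyShapeDims(const std::vector<int64_t>& shape1,
                         const std::vector<int64_t>& shape2,
                         std::vector<int64_t>* output) {
  if (shape1.size() != shape2.size()) {
    return errors::InvalidArgument(
        "shape1 and shape2 must have the same size: ",
        shape1.size(), " vs ", shape2.size());
  }
  output->resize(shape1.size());
  for (size_t i = 0; i < shape1.size(); ++i) {
    const int64_t a = shape1[i];
    const int64_t b = shape2[i];
    // The division-based check below is only valid for non-negative inputs.
    if (a < 0 || b < 0) {
      return errors::InvalidArgument(
          "Negative dimension at index ", i, ": ", a, " and ", b);
    }
    if (a == 0 || b == 0) {
      (*output)[i] = 0;
    } else if (a > std::numeric_limits<int64_t>::max() / b) {
      // Report overflow to the caller instead of silently writing 0.
      return errors::InvalidArgument(
          "Dimension product overflows int64 at index ", i, ": ", a, " * ", b);
    } else {
      (*output)[i] = a * b;
    }
  }
  return OkStatus();
}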
```
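The CPU function above depends on TensorFlow's `Status` and `errors` helpers, so it cannot be compiled on its own. The following is a standalone sketch of the same division-based guard using only the C++17 standard library. `CheckedDimProduct` and `CheckedTileShape` are hypothetical names chosen for illustration, not TensorFlow APIs, and `std::nullopt` stands in for an error status.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Hypothetical helper (not a TensorFlow API): returns a * b, or std::nullopt
// if either factor is negative or the product would not fit in int64_t.
std::optional<int64_t> CheckedDimProduct(int64_t a, int64_t b) {
  if (a < 0 || b < 0) return std::nullopt;
  if (a == 0 || b == 0) return int64_t{0};
  // For positive a and b, a * b overflows exactly when a > max / b.
  if (a > std::numeric_limits<int64_t>::max() / b) return std::nullopt;
  return a * b;
}

// Hypothetical helper: element-wise checked product of a shape and its
// multiples. Returns std::nullopt on a size mismatch or any invalid entry.
std::optional<std::vector<int64_t>> CheckedTileShape(
    const std::vector<int64_t>& shape,
    const std::vector<int64_t>& multiples) {
  if (shape.size() != multiples.size()) return std::nullopt;
  std::vector<int64_t> out;
  out.reserve(shape.size());
  for (std::size_t i = 0; i < shape.size(); ++i) {
    const std::optional<int64_t> dim =
        CheckedDimProduct(shape[i], multiples[i]);
    if (!dim) return std::nullopt;
    out.push_back(*dim);
  }
  return out;
}
```

Note that this validates each dimension separately. The total element count is itself a product of dimensions, so a caller that computes it would need the same guard there.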