[pytorch/pytorch] Feature Request: Extend device checks to support privateuse1 for shared at::native operators logic
### ROOT CAUSE
The device checks in several shared at::native operators reject the `privateuse1` device even when the operator logic is device-agnostic. Because these checks whitelist only specific built-in device types, a custom backend registered as `privateuse1` cannot reuse the stable operator implementations and is forced to duplicate them.
### CODE FIX
To resolve this, relax the device checks in the affected at::native operators so that `at::DeviceType::PrivateUse1` is accepted wherever the logic is device-agnostic. For example, a check that previously rejected every non-CPU device becomes:
```cpp
// Accept PrivateUse1 alongside CPU instead of rejecting all non-CPU devices.
// This lets custom backends reuse the shared operator implementation.
if (device.type() != at::DeviceType::CPU &&
    device.type() != at::DeviceType::PrivateUse1) {
  TORCH_CHECK(false, "unsupported device type for this operator: ", device.type());
}
```
This change ensures that the `privateuse1` device is recognized and the operator implementation is reused, reducing code duplication and leveraging PyTorch's stable logic. Apply this fix to all affected operators in the codebase.