Add XPU support (duplicate #125) #209

Open: wants to merge 6 commits into base: main

ma595 (Member) commented Dec 20, 2024

Adds XPU support to examples and associated instructions in the documentation.

Cpp-Linter Report ⚠️

Some files did not pass the configured checks!

clang-format (v12.0.0) reports: 1 file(s) not formatted
  • src/ctorch.cpp

ma595 (Member, Author) commented Dec 20, 2024

Build script for CSD3 (@ma595 needs to check this works end to end).

# Start from a clean environment and load the compilers and Python.
module purge
module load default-dawn
module load intel-oneapi-compilers/2025.0.3/gcc/sb5vj5us
module load gcc/14.2.0/vaetnoca
module load python/3.11.9/gcc/7xr7o47s

# Create a virtual environment and install PyTorch from the XPU test wheel index.
python3 -m venv ./venv3-pvc
source venv3-pvc/bin/activate
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/test/xpu

git clone git@github.com:Cambridge-ICCS/FTorch.git
cd FTorch/src; mkdir build; cd build

# Point CMake at the pip-installed Torch.
export TORCH=$(python -c "import torch; print(torch.__path__[0])")

export CMAKE_PREFIX_PATH=$TORCH

# Configure without specifying a Fortran compiler.
cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/rds/project/rds-5mCMIDBOkPU/rse/ftorch/FTorch/src/build/install

# Same configuration, with tests enabled and ifx as the Fortran compiler.
cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/rds/project/rds-5mCMIDBOkPU/rse/ftorch/FTorch/src/build/install -DCMAKE_BUILD_TESTS=TRUE  -DCMAKE_Fortran_COMPILER=$(which ifx)

cmake --build . --target install

ma595 changed the title from "Add PVC support (duplicate #125)" to "Add XPU support (duplicate #125)" on Dec 20, 2024
ma595 (Member, Author) commented Dec 20, 2024

Running the 2_ResNet_18 example (using gfortran).

./resnet_infer_fortran

[ERROR]: 0 <= device && static_cast<size_t>(device) < device_allocators.size() INTERNAL ASSERT FAILED at "/pytorch/c10/xpu/XPUCachingAllocator.cpp":555, please report a bug to PyTorch. Allocator not initialized for device 0: did you call init?

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x14626d1e6688 in ???
#1  0x146251733d68 in ???
#2  0x1463501f95af in ???
#3  0x14633846fca4 in ???
#4  0x1463385010bc in ???
#5  0x14633835ca87 in ???
#6  0x146350e79501 in ???
#7  0x1463501fc1f6 in ???
#8  0x146350e65fc6 in ???
Segmentation fault
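
Before digging into the FTorch side of this failure, it may be worth confirming that the PyTorch installed in the virtual environment can itself see and allocate on XPU device 0. The following is a minimal sketch rather than a diagnosis of the assert above: `xpu_status` is a hypothetical helper name, and the only PyTorch calls it relies on are `torch.xpu.is_available()`, `torch.xpu.device_count()` and `torch.empty(...)`. It assumes nothing about how FTorch initialises the device.

```python
import importlib.util


def xpu_status(device_index=0):
    """Report whether this environment's PyTorch can see and use an XPU device.

    Hypothetical helper for illustration; not part of FTorch or PyTorch.
    """
    status = {
        "torch_installed": False,
        "xpu_available": False,
        "device_count": 0,
        "allocation_ok": False,
    }
    # Avoid a hard failure when torch is not installed in this environment.
    if importlib.util.find_spec("torch") is None:
        return status
    try:
        import torch
    except (ImportError, OSError):
        return status
    status["torch_installed"] = True

    # torch.xpu may be missing in builds without XPU support.
    xpu = getattr(torch, "xpu", None)
    if xpu is None or not xpu.is_available():
        return status
    status["xpu_available"] = True
    status["device_count"] = xpu.device_count()

    # Try a tiny allocation on the requested device index.
    if device_index < status["device_count"]:
        try:
            torch.empty(1, device=f"xpu:{device_index}")
            status["allocation_ok"] = True
        except RuntimeError:
            pass
    return status


if __name__ == "__main__":
    print(xpu_status())
```

If `xpu_available` is false or `device_count` is 0 on the node where `./resnet_infer_fortran` was run, the problem likely lies in the environment (drivers, modules, or the node itself) rather than in the FTorch code. If the allocation succeeds from Python, the C++/Fortran device-initialisation path is the more likely place to look.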

ma595 self-assigned this on Dec 21, 2024