Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* Add missing using
* Fix rng usage and remove machine argument
* Add runtime generated functions init to extensions
* Fix github docs link
* Add more docs to the kernel generation and add AMDGPU kernel
* WIP fix GPU ext
* Make the extensions load correctly
* Update Numa impl too
* Add warning that GPUs are experimental
* Include new file in the docs
1 parent 857c322, commit 695bdb6
Showing 16 changed files with 167 additions and 125 deletions.
@@ -1,8 +1,20 @@
 module AMDGPUExt
 
-using ComputableDAGs, AMDGPU
+using ComputableDAGs
+using UUIDs
+using AMDGPU
+
+function __init__()
+    @debug "Loading AMDGPUExt"
+
+    push!(ComputableDAGs.DEVICE_TYPES, ROCmGPU)
+    ComputableDAGs.CACHE_STRATEGIES[ROCmGPU] = [LocalVariables()]
+
+    return nothing
+end
 
 # include specialized AMDGPU functions here
 include("devices/rocm/impl.jl")
+include("devices/rocm/function.jl")
 
 end
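
The `__init__` hook added above registers the device type with the host package when the extension is loaded. A minimal sketch of that pattern follows. It is not part of the commit: `HostSketch`, `ExtSketch`, and `FakeGPU` are hypothetical stand-ins, and the element types of the two registries are assumptions, since the diff does not show how `DEVICE_TYPES` and `CACHE_STRATEGIES` are declared.

```julia
# Hypothetical stand-in for the host package. The names DEVICE_TYPES,
# CACHE_STRATEGIES and LocalVariables mirror the diff; their types are assumed.
module HostSketch
abstract type AbstractDevice end
struct LocalVariables end
struct FakeGPU <: AbstractDevice end   # defined in the host, like ROCmGPU

const DEVICE_TYPES = Type[]
const CACHE_STRATEGIES = Dict{Type,Vector{Any}}()
end

# Hypothetical stand-in for the extension module.
module ExtSketch
using ..HostSketch: HostSketch, FakeGPU, LocalVariables

# Julia calls __init__ automatically once the module has been loaded,
# so the registration happens at load time rather than at precompile time.
function __init__()
    push!(HostSketch.DEVICE_TYPES, FakeGPU)
    HostSketch.CACHE_STRATEGIES[FakeGPU] = [LocalVariables()]
    return nothing
end
end

# After both modules are defined, the registries contain the new device.
println(HostSketch.DEVICE_TYPES)
```

Doing the registration in `__init__` rather than at module top level matters because mutations of another module's globals made during precompilation are not preserved at load time.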
@@ -0,0 +1,31 @@
+function ComputableDAGs.kernel(::Type{ROCmGPU}, graph::DAG, instance)
+    machine = cpu_st()
+    tape = ComputableDAGs.gen_tape(graph, instance, machine, context_module)
+
+    init_caches = Expr(:block, tape.initCachesCode...)
+    assign_inputs = Expr(:block, ComputableDAGs.expr_from_fc.(tape.inputAssignCode)...)
+    code = Expr(:block, ComputableDAGs.expr_from_fc.(tape.computeCode)...)
+
+    function_id = ComputableDAGs.to_var_name(UUIDs.uuid1(ComputableDAGs.rng[1]))
+    res_sym = eval(
+        ComputableDAGs.gen_access_expr(
+            ComputableDAGs.entry_device(tape.machine), tape.outputSymbol
+        ),
+    )
+    expr = Meta.parse(
+        "function compute_$(function_id)(input_vector, output_vector, n::Int64)
+            id = (workgroupIdx().x - 1) * workgroupDim().x + workitemIdx().x
+            if (id > n)
+                return
+            end
+            @inline data_input = input_vector[id]
+            $(init_caches)
+            $(assign_inputs)
+            $code
+            @inline output_vector[id] = $res_sym
+            return nothing
+        end"
+    )
+
+    return expr
+end
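
The kernel above is assembled as source text with expressions interpolated into it, then turned back into an `Expr` with `Meta.parse`. The sketch below reproduces that technique on the CPU so it can be run without a GPU or AMDGPU.jl. It is not part of the commit: the body fragments, `compute_demo`, `kernel_fn`, and the extra `group`, `item`, and `groupsize` arguments are hypothetical, and the nested loop only emulates a 1-D launch using the usual 1-based global index scheme (workgroup offset plus the work-item index within the group).

```julia
# Hypothetical body fragments standing in for the tape's generated code.
code = Expr(:block, :(tmp = data_input * 2), :(result = tmp + 1))
res_sym = :result

# Interpolating an Expr into a string prints it as Julia source, so the
# parsed result is an ordinary function definition.
expr = Meta.parse(
    "function compute_demo(input_vector, output_vector, n::Int64, group, item, groupsize)
        id = (group - 1) * groupsize + item
        if (id > n)
            return
        end
        data_input = input_vector[id]
        $code
        output_vector[id] = $res_sym
        return nothing
    end"
)

kernel_fn = eval(expr)

input = collect(1:10)
output = zeros(Int, length(input))
groupsize = 4

# Emulate the launch grid: cld rounds up, so the last group overhangs the
# data and the `id > n` guard discards the surplus work items.
for group in 1:cld(length(input), groupsize), item in 1:groupsize
    # invokelatest avoids world-age issues when calling a freshly eval'ed function
    Base.invokelatest(kernel_fn, input, output, length(input), group, item, groupsize)
end

println(output)
```

Each element is written exactly once because every `(group, item)` pair maps to a distinct `id`, and only ids up to `n` pass the guard.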
@@ -0,0 +1,44 @@
+# file for struct definitions used by the extensions
+# since extensions can't export names themselves
+
+"""
+    CUDAGPU <: AbstractGPU
+
+Representation of a specific CUDA GPU that code can run on. Implements the [`AbstractDevice`](@ref) interface.
+
+!!! note
+    This requires CUDA to be loaded to be useful.
+"""
+mutable struct CUDAGPU <: AbstractGPU
+    device::Any # CuDevice
+    cacheStrategy::CacheStrategy
+    FLOPS::Float64
+end
+
+"""
+    oneAPIGPU <: AbstractGPU
+
+Representation of a specific Intel GPU that code can run on. Implements the [`AbstractDevice`](@ref) interface.
+
+!!! note
+    This requires oneAPI to be loaded to be useful.
+"""
+mutable struct oneAPIGPU <: AbstractGPU
+    device::Any # oneAPI.oneL0.ZeDevice
+    cacheStrategy::CacheStrategy
+    FLOPS::Float64
+end
+
+"""
+    ROCmGPU <: AbstractGPU
+
+Representation of a specific AMD GPU that code can run on. Implements the [`AbstractDevice`](@ref) interface.
+
+!!! note
+    This requires AMDGPU to be loaded to be useful.
+"""
+mutable struct ROCmGPU <: AbstractGPU
+    device::Any # HIPDevice
+    cacheStrategy::CacheStrategy
+    FLOPS::Float64
+end