Has queue:
ary.queue -> dpctl.SyclQueue
Accommodates different types of USM data:
ary.usm_type -> "shared"|"device"|"host"
resAry = arLib.func( ary1, ary2, ... )
# All aryK.queue must have the same SYCL context, or an error is raised
# Kernel is executed on one of `ary1.queue`, `ary2.queue`, ....
# Where to run is up to the application [must be worked out and documented]
# Kernels access array data using USM array pointers.
# most common case: all queues are the same, result is on that device.
# Result is USMArray
# Allocated on the device/ctx in execution queue.
# Promotion of usm_type ? (favor shared, but preserve if all input types are same)
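The rules above can be sketched in plain Python. This is a minimal sketch, not dpctl code: `FakeQueue` and `FakeUSMArray` are hypothetical stand-ins, picking the first input's queue is an assumed policy (the notes leave the choice to the application), and the promotion rule simply follows the suggestion in the note (preserve the type if all inputs agree, otherwise favor "shared").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeQueue:
    """Hypothetical stand-in for dpctl.SyclQueue; only the context matters here."""
    context: str
    device: str


@dataclass(frozen=True)
class FakeUSMArray:
    """Hypothetical stand-in for a USM array carrying a queue and a USM type."""
    queue: FakeQueue
    usm_type: str  # "shared", "device" or "host"


def select_queue(arrays):
    """Return an execution queue, or raise if the SYCL contexts differ."""
    contexts = {a.queue.context for a in arrays}
    if len(contexts) != 1:
        raise ValueError("all input queues must share one SYCL context")
    # Assumption: pick the first input's queue; the notes leave this policy open.
    return arrays[0].queue


def promote_usm_type(arrays):
    """Preserve the USM type if all inputs agree, otherwise favor "shared"."""
    types = {a.usm_type for a in arrays}
    return types.pop() if len(types) == 1 else "shared"


q_gpu = FakeQueue(context="ctx0", device="gpu")
a = FakeUSMArray(q_gpu, "device")
b = FakeUSMArray(q_gpu, "shared")

# The result lives on the execution queue and gets the promoted USM type.
result = FakeUSMArray(select_queue([a, b]), promote_usm_type([a, b]))
```

Keeping queue selection and type promotion as two separate functions makes it easy to swap in a different policy once the open questions are settled.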
Python object encompassing host array, and storing SYCL queue.
arLib.ashostarray(obj, queue=q) -> hostAry
# Must validate array type (can not contain)
# Host memory is read-only.
resAry = arLib.func( hostAry1, usmAry2, ... )
# All queues must have the same SYCL context, or error is raised
# Kernel is executed on one of queues, up to application
# Kernel accesses data via SYCL buffer and SYCL USM pointer (if any)
# Result is USMArray
# Allocated on the device/ctx in execution queue
# Promotion of usm_type ?
Alternative: Copy Host memory to USMArray.
Pros of using Host Array: Avoid double memory allocation on host
Cons: the buffer must be created and destroyed by arLib.func, and buffer destruction synchronizes with the host. This may hurt concurrency.
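The host-array object described above could look roughly like the following. This is a sketch under stated assumptions: `HostArray` is a hypothetical class, the queue is an opaque placeholder, and the element-type check is a guess at what "validate array type" means, since the note does not say what the array must not contain. Only the standard library is used; no SYCL buffer is created here.

```python
import array


class HostArray:
    """Hypothetical wrapper: a read-only view of host memory plus a queue."""

    # Assumption: only plain numeric element formats are accepted.
    _ALLOWED_FORMATS = ("b", "B", "h", "H", "i", "I", "l", "L", "q", "Q", "f", "d")

    def __init__(self, obj, queue):
        view = memoryview(obj)  # raises TypeError if obj has no buffer protocol
        if view.format not in self._ALLOWED_FORMATS:
            raise TypeError(f"unsupported element format: {view.format!r}")
        self.data = view.toreadonly()  # host memory is exposed read-only
        self.queue = queue


def ashostarray(obj, queue=None):
    """Wrap existing host memory without copying it."""
    return HostArray(obj, queue)


host_values = array.array("d", [1.0, 2.0, 3.0])
hostAry = ashostarray(host_values, queue="placeholder-queue")
```

Because the wrapper holds a view rather than a copy, it reflects the "avoid double memory allocation on host" advantage; the buffer lifetime cost listed under Cons would arise inside arLib.func, which this sketch does not model.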
Xsh = arLib.ones(..., device="gpu")
Ysh = arLib.func1(Xsh)
Zsh = arLib.empty(...)
hLib.func2(Xsh, out=Zsh)
def foo(X):
    # stmnt1 : implemented on host only
    A_host_memory = stmnt1(X)
    # stmnt2 : implemented on host and on SYCL
    if is_sycl_visible(X):
        B = sycl_capable_stmnt2(X)
    else:
        B = host_stmnt2(X)
    # stmnt3 : implemented on host only
    return comb(A_host_memory, B)

foo(Xsh)    # sycl_capable stmnt2 is called
foo(Xhost)  # generic host stmnt2 is called
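One possible way to write the `is_sycl_visible` check used in `foo` is to test for the `__sycl_usm_array_interface__` attribute. This is an assumption about how the predicate might be implemented, not something the notes specify; `UsmBacked` is a hypothetical stand-in class and its attribute contents are placeholders.

```python
def is_sycl_visible(x):
    """Assumed check: x is SYCL-visible if it exposes the USM array interface."""
    return hasattr(x, "__sycl_usm_array_interface__")


class UsmBacked:
    """Hypothetical stand-in for an array backed by USM memory."""
    __sycl_usm_array_interface__ = {"syclobj": None}  # placeholder contents


usm_result = is_sycl_visible(UsmBacked())
host_result = is_sycl_visible([1.0, 2.0])
```

A protocol-style attribute check keeps the dispatch independent of any particular array class, so new USM-backed types work without changes to `foo`.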
The need to dispatch is generally present for code intending to support multiple types of inputs. It is endemic to Python at large.
Array-API solves this by standardizing across array library implementations.
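As an illustration of that idea, the Array API standard lets a function ask its input for the implementing namespace through `__array_namespace__` instead of branching on concrete types. The sketch below uses toy classes only: `ToyArray`, `ToyNamespace`, and `library_agnostic_sum` are hypothetical names, and the toy namespace implements a single function.

```python
class ToyArray:
    """Hypothetical array type that advertises its namespace, Array-API style."""

    def __init__(self, values):
        self.values = list(values)

    def __array_namespace__(self, api_version=None):
        return ToyNamespace


class ToyNamespace:
    """Hypothetical namespace standing in for an array library module."""

    @staticmethod
    def add(a, b):
        return ToyArray(x + y for x, y in zip(a.values, b.values))


def library_agnostic_sum(a, b):
    # Look up the implementing library from the input rather than its type.
    xp = a.__array_namespace__()
    return xp.add(a, b)


out = library_agnostic_sum(ToyArray([1, 2]), ToyArray([3, 4]))
```

The calling code never names a specific library, which is what removes the per-type branches seen in `foo`.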