This repository has been archived by the owner on Aug 19, 2024. It is now read-only.

Data API compliant data containers

USM Array

Has queue:

ary.queue -> dpctl.SyclQueue.

Accommodates different types of USM data:

ary.usm_type -> "shared"|"device"|"host".

resAry = arLib.func( ary1, ary2, ... )
   # All aryK.queue must have the same SYCL context, or an error is raised
   # Kernel is executed on one of `ary1.queue`, `ary2.queue`, ....
   # Where to run is up to the application [must be worked out and documented]
   # Kernels access array data using USM array pointers.
   #    most common case: all queues are the same, result is on that device.
   # Result is USMArray
   #    Allocated on the device/ctx in execution queue.
   #    Promotion of usm_type ? (favor shared, but preserve if all input types are same)
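
The context check and the tentative promotion rule above can be sketched in plain Python. This is only an illustration under stated assumptions: `Queue` and `UsmArray` are hypothetical stand-ins (not dpctl classes), picking the first input's queue is an arbitrary choice that the notes leave open, and the promotion rule is one reading of "favor shared, but preserve if all input types are same".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Queue:
    """Hypothetical stand-in for dpctl.SyclQueue; only context and device are modeled."""
    context: str
    device: str


@dataclass(frozen=True)
class UsmArray:
    """Hypothetical stand-in for a USM array: carries a queue and a usm_type."""
    queue: Queue
    usm_type: str  # "shared" | "device" | "host"


def execution_queue(arrays):
    """Return the queue to execute on, or raise if the SYCL contexts differ."""
    if not arrays:
        raise ValueError("at least one input array is required")
    contexts = {a.queue.context for a in arrays}
    if len(contexts) != 1:
        raise ValueError("all input queues must share one SYCL context")
    # Assumption: use the first input's queue; the notes leave this choice open.
    return arrays[0].queue


def promote_usm_type(arrays):
    """Preserve the usm_type if all inputs agree, otherwise favor "shared"."""
    types = {a.usm_type for a in arrays}
    return types.pop() if len(types) == 1 else "shared"


def result_placement(arrays):
    """Describe where the result would live: execution queue plus promoted usm_type."""
    return UsmArray(queue=execution_queue(arrays), usm_type=promote_usm_type(arrays))
```

With two `"device"` inputs on the same context the result stays `"device"`; mixing `"device"` and `"host"` yields `"shared"`; inputs from different contexts raise `ValueError`.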

Host Array

A Python object that wraps a host array and stores a SYCL queue.

arLib.ashostarray(obj, queue=q) -> hostAry
   # Must validate array type (can not contain)
   # Host memory is read-only.
resAry = arLib.func( hostAry1, usmAry2, ... )
   # All queues must have the same SYCL context, or error is raised
   # Kernel is executed on one of queues, up to application
   # Kernel accesses data via SYCL buffer and SYCL USM pointer (if any)
   # Result is USMArray
   #    Allocated on the device/ctx in execution queue
   #    Promotion of usm_type ?

Alternative: Copy Host memory to USMArray.

Pros of using Host Array: avoids a double memory allocation on the host.

Cons: the buffer must be created and destroyed by arLib.func, and buffer destruction synchronizes with the host. This may hurt concurrency.
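
A minimal sketch of what a host array wrapper might look like, using only the standard library. The `HostArray` class and this `ashostarray` body are hypothetical; the contiguity check is an assumed example of "validating the array type", since the notes do not spell out the actual rule. The point illustrated is that the wrapper holds a read-only view of the caller's memory rather than a copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostArray:
    """Hypothetical wrapper: a read-only view of host memory plus a queue."""
    data: memoryview
    queue: object  # would be a dpctl.SyclQueue in a real implementation


def ashostarray(obj, queue):
    """Wrap a buffer-protocol object without copying its memory."""
    view = memoryview(obj)  # raises TypeError if obj has no buffer interface
    # Assumption: only C-contiguous buffers are accepted in this sketch.
    if not view.c_contiguous:
        raise ValueError("expected a C-contiguous buffer")
    return HostArray(data=view.toreadonly(), queue=queue)
```

Because no copy is made, later writes through the original object remain visible through `HostArray.data`, while writes through the wrapper raise `TypeError`.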

Example

Xsh = arLib.ones(..., device="gpu")
Ysh = arLib.func1(Xsh)
Zsh = arLib.empty(...)
hLib.func2(Xsh, out=Zsh)

def foo(X):
   # stmnt1 : implemented on host only
   A_host_memory = stmnt1(X)
   # stmnt2 : implemented on host and on SYCL
   if is_sycl_visible(X):
       B = sycl_capable_stmnt2(X)
   else:
       B = host_stmnt2(X)
   # stmnt3 : implemented on host only
   return comb(A_host_memory, B)

foo(Xsh)     # sycl_capable_stmnt2 is called
foo(Xhost)   # generic host_stmnt2 is called
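
For readers who want to run the dispatch pattern, here is a self-contained sketch. Everything in it is a stand-in: `SyclArray` is a hypothetical container, `is_sycl_visible` is assumed to detect SYCL-visible data by the presence of a `queue` attribute, and the three statements are replaced by trivial arithmetic.

```python
class SyclArray:
    """Hypothetical SYCL-visible container: exposes a `queue` attribute."""

    def __init__(self, values, queue="gpu-queue"):
        self.values = list(values)
        self.queue = queue


def is_sycl_visible(x):
    # Assumption: SYCL-visible data is recognized by having a queue attached.
    return getattr(x, "queue", None) is not None


def values_of(x):
    """Return the elements as a host list, whichever container holds them."""
    return x.values if is_sycl_visible(x) else list(x)


def sycl_capable_stmnt2(x):
    # Stand-in for a kernel that would be submitted to x.queue.
    return ("sycl", sum(x.values))


def host_stmnt2(x):
    # Generic host implementation of the same step.
    return ("host", sum(x))


def foo(x):
    a_host = max(values_of(x))  # stmnt1: implemented on host only
    if is_sycl_visible(x):      # stmnt2: dispatch on the input type
        b = sycl_capable_stmnt2(x)
    else:
        b = host_stmnt2(x)
    return a_host, b            # comb: here simply a tuple
```

Calling `foo(SyclArray([1, 2, 3]))` takes the SYCL branch, while `foo([1, 2, 3])` takes the host branch; the caller's code is identical in both cases.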

The need to dispatch is generally present for code intending to support multiple types of inputs. It is endemic to Python at large.

The Array API standard addresses this by standardizing the interface across array library implementations.
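
As a rough illustration of how standardization removes the explicit branch: the Array API standard has each array expose an `__array_namespace__()` method that returns the module-like namespace implementing the standard functions for that array type. The sketch below mimics this with toy classes; `HostArray`, `DeviceArray`, `host_ns`, and `device_ns` are hypothetical and implement only `sqrt`.

```python
import math


class _Array:
    """Minimal hypothetical array: a list of floats plus a namespace hook."""

    namespace = None  # set per subclass below

    def __init__(self, values):
        self.values = [float(v) for v in values]

    def __array_namespace__(self, api_version=None):
        # The Array API standard defines this method on conforming arrays.
        return self.namespace


class HostArray(_Array):
    pass


class DeviceArray(_Array):
    pass


class host_ns:
    """Toy namespace for host arrays."""

    @staticmethod
    def sqrt(x):
        return HostArray(math.sqrt(v) for v in x.values)


class device_ns:
    """Toy namespace standing in for a device-backed library."""

    @staticmethod
    def sqrt(x):
        return DeviceArray(math.sqrt(v) for v in x.values)


HostArray.namespace = host_ns
DeviceArray.namespace = device_ns


def elementwise_root(x):
    """Library-agnostic code: no isinstance checks, no explicit dispatch."""
    xp = x.__array_namespace__()
    return xp.sqrt(x)
```

The same `elementwise_root` works for both array types and returns a result of the matching type, because the input itself selects the implementation.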