This repository has been archived by the owner on Aug 19, 2024. It is now read-only.

executable file
·
48 lines (31 loc) · 2.57 KB

USMArray_Goals.md

USMArray::goals

  1. Supports USM allocated data of all sycl::usm::alloc types ("shared", "device", "host").
  2. Has features needed to accommodate zero-copy sharing of malloc_device memory allocated by DL frameworks.
  3. Implements operations on USM data using precompiled SYCL kernels only.
    1. This is a necessity for usm::alloc::device, since device allocations cannot be accessed directly from the host.
    2. The same implementation applies to all usm::alloc types.
  4. Near-term goal: a data container supporting structural operations (everything that can be supported using SYCL alone, i.e. the data-API core with the exception of linear_algebra). This allows the package to stay lean (small binary size).
  5. Data container used by dpnp.

USMArray::implications

  1. USMArray cannot be a subclass of numpy.ndarray.
  2. USMArray must be strided to accommodate Torch/TF tensors.
  3. USMArray must be converted to NumPy explicitly (a zero-copy operation for shared/host USM).
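
The strided requirement can be illustrated with a minimal sketch. It assumes a hypothetical `StridedView` class (not part of any real package) and uses a host `array.array` as a stand-in for a USM allocation. The point is that a shape/strides/offset description lets a non-contiguous layout, such as a transposed framework tensor, be adopted without copying.

```python
from array import array


class StridedView:
    """Hypothetical 2-D strided view; strides are counted in elements."""

    def __init__(self, data, shape, strides, offset=0):
        self.data = data          # flat buffer, shared and never copied
        self.shape = tuple(shape)
        self.strides = tuple(strides)
        self.offset = offset

    def _flat_index(self, idx):
        i, j = idx
        if not (0 <= i < self.shape[0] and 0 <= j < self.shape[1]):
            raise IndexError("index out of bounds")
        # Element address = offset + sum(index_k * stride_k)
        return self.offset + i * self.strides[0] + j * self.strides[1]

    def __getitem__(self, idx):
        return self.data[self._flat_index(idx)]

    def __setitem__(self, idx, value):
        self.data[self._flat_index(idx)] = value

    def transpose(self):
        # Swapping shape and strides reinterprets the same memory.
        return StridedView(self.data, self.shape[::-1],
                           self.strides[::-1], self.offset)


flat = array("d", range(6))              # stand-in for a USM allocation
c_order = StridedView(flat, (2, 3), (3, 1))
transposed = c_order.transpose()         # shape (3, 2), strides (1, 3)
transposed[2, 1] = 50.0                  # writes through to the shared buffer
```

A container restricted to contiguous layouts would have to copy `transposed` before adopting it, which would defeat zero-copy sharing.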

USMArray::PyData_Evolution

  1. Implement a data-API compliant array library and promote data-API adoption upstream.
  2. Allow for zero-copy conversion from USMArray(type=Union["shared","host"]) to numpy.ndarray.
  3. Enabling a path to adoption does not guarantee adoption.
    1. Points in our favor: CPU array computations become multi-threaded.
    2. Code can run across XPUs.
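
A minimal sketch of the explicit-conversion rule follows. The `FakeUSMBuffer` class and its method names are hypothetical, and ordinary host memory stands in for USM. The sketch only shows the intended contract: "shared" and "host" allocations convert to a host view without copying, while "device" allocations require an explicit copy.

```python
from array import array

USM_TYPES = {"shared", "host", "device"}
HOST_ACCESSIBLE = {"shared", "host"}


class FakeUSMBuffer:
    """Hypothetical stand-in for a USM allocation (host memory underneath)."""

    def __init__(self, values, usm_type):
        if usm_type not in USM_TYPES:
            raise ValueError(f"unknown USM type: {usm_type!r}")
        self.usm_type = usm_type
        self._storage = array("d", values)

    def to_host_view(self):
        """Explicit zero-copy conversion; only valid for shared/host USM."""
        if self.usm_type not in HOST_ACCESSIBLE:
            raise TypeError("device USM is not host-accessible; "
                            "use copy_to_host() instead")
        return memoryview(self._storage)

    def copy_to_host(self):
        """Explicit copy; stands in for a device-to-host transfer."""
        return array("d", self._storage)


shared = FakeUSMBuffer([1.0, 2.0, 3.0], "shared")
view = shared.to_host_view()
view[0] = 10.0                      # visible in the original allocation

device = FakeUSMBuffer([1.0, 2.0], "device")
try:
    device.to_host_view()
    device_view_refused = False
except TypeError:
    device_view_refused = True

host_copy = device.copy_to_host()
host_copy[0] = -1.0                 # the copy is independent of the source
```

Both conversions are spelled out at the call site, so the caller always knows whether memory is shared or duplicated.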

Cons of numpy.ndarray sub-classing as a mainstream data container

  1. NumPy documentation recommends against it, so subclassing is likely to create friction.

  2. A subclass is an implicit view of USM shared memory as host memory (this goes against the "explicit is better than implicit" principle from the Zen of Python).

    1. Explicit conversion is recommended by the data-API and by Python philosophy.
  3. Technical issues:

    1. It is not possible to properly subclass a NumPy array in Cython (cf. Cython/issue/799).

    2. A subclass must guarantee use of a USM buffer (impossible to accomplish using NumPy's public API functionality).

      numpy.ndarray.__new__(subclass, ...) -> "creates subclass instance with malloc memory rather than usm_shared memory"
    3. Possible overhead of dispatching [?]

  4. Sub-classing also goes against the grain of the data-API approach (see assumptions::dependencies).

  5. Rules out support for usm::alloc::device.
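
The buffer-guarantee problem from the technical issues above can be reproduced with plain NumPy. In this sketch `USMBackedArray` is a hypothetical subclass, and a `bytearray` stands in for a USM allocation. Its `__new__` always supplies an external buffer, yet calling `numpy.ndarray.__new__` directly bypasses that logic and lets NumPy allocate the memory itself.

```python
import numpy as np


class USMBackedArray(np.ndarray):
    """Hypothetical subclass that tries to always use an external buffer."""

    def __new__(cls, shape, dtype=np.float64):
        nbytes = int(np.prod(shape)) * np.dtype(dtype).itemsize
        buf = bytearray(nbytes)  # stand-in for a USM shared allocation
        return super().__new__(cls, shape, dtype=dtype, buffer=buf)


# Intended path: the array's memory comes from the supplied buffer.
intended = USMBackedArray((4,))

# Bypass path: USMBackedArray.__new__ never runs, so NumPy allocates
# ordinary host memory for an instance of the subclass.
bypass = np.ndarray.__new__(USMBackedArray, (4,), dtype=np.float64)

print(type(bypass).__name__)     # USMBackedArray
print(intended.flags.owndata)    # False: memory belongs to the buffer
print(bypass.flags.owndata)      # True: memory was allocated by NumPy
```

Because the bypass goes through a public constructor, the subclass cannot enforce the invariant that its data lives in USM memory.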