[RFC] Integration with cuDNN via IREE compiler/runtime plugins #12

ezhulenev · 2023-04-18T18:29:15Z

One of the initial goals of the openxla-nvgpu plugin is to show how to integrate NVIDIA libraries with the IREE compiler/runtime. The work is already in progress, and a few PRs have been merged. The design of this integration is outlined in this document: https://docs.google.com/document/d/1WzSH7LdQdL1CQmlIOUyy6auDiX6d3cl5LAzZU_I4KCY/edit#

chsigg · 2023-04-20T16:04:52Z

Thanks, Eugene, for sharing the doc; it looks like a solid plan.

I will try to cover here some things that the doc does not discuss.

Input dialect

We initially started by lowering from MHLO, but have since switched to StableHLO. The two dialects are mostly the same, so the switch shouldn't be difficult, but we will likely need to update the ResNet-50 model that we want to use to show the performance advantage of libraries over the code that IREE is currently able to generate.

Do I understand correctly that the various IREE importers are targeting StableHLO and the standard IREE pipeline is able to consume this?

Layout assignment

This is currently being worked out on the IREE side. My very high-level thinking is that it will provide an external layout interface that can be attached to StableHLO ops to communicate preferred layouts. We could then attach the same interface to cuDNN ops to constrain the layouts to what cuDNN expects.
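To make the idea concrete, here is a rough Python sketch of how such an externally attached layout interface might behave. It is not the IREE design, which is still being worked out. The registry, the function names, the `cudnn.convolution` op name, and the specific layout lists are all hypothetical and only illustrate how a cuDNN op could narrow the choice left open by a StableHLO op.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical registry: op name -> function returning preferred layouts,
# most preferred first. An empty list means "no preference".
LayoutFn = Callable[[], List[str]]
_LAYOUT_MODELS: Dict[str, LayoutFn] = {}


def attach_layout_model(op_name: str, fn: LayoutFn) -> None:
    """Attach a layout preference to an op from outside the op definition."""
    _LAYOUT_MODELS[op_name] = fn


def preferred_layouts(op_name: str) -> List[str]:
    fn = _LAYOUT_MODELS.get(op_name)
    return fn() if fn else []


def pick_common_layout(producer: str, consumer: str) -> Optional[str]:
    """Pick a layout both ops accept, honoring the consumer's ordering."""
    p = preferred_layouts(producer)
    c = preferred_layouts(consumer)
    if not p:
        return c[0] if c else None
    if not c:
        return p[0]
    for layout in c:
        if layout in p:
            return layout
    return None  # No agreement: a layout conversion would be needed.


# Assumed preferences, for illustration only.
attach_layout_model("stablehlo.convolution", lambda: ["NCHW", "NHWC"])
attach_layout_model("cudnn.convolution", lambda: ["NHWC"])

print(pick_common_layout("stablehlo.convolution", "cudnn.convolution"))
```

In this sketch the StableHLO op accepts either layout, and the cuDNN op's narrower preference decides the result.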

Cost model

We will implement a model that estimates the cost of StableHLO ops and of their cuDNN graph equivalents. This will determine which subgraphs are outlined to cuDNN ops. The model needs to take downstream fusion opportunities into account, because the performance profiles of a fused vs. unfused op are vastly different.
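A minimal Python sketch of the decision described above, assuming the cost model can supply three estimates per candidate subgraph. The `Candidate` type, its field names, and the numbers are made up for illustration; the real model and its inputs are not defined yet.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """A subgraph that could be outlined to a cuDNN graph (hypothetical)."""
    name: str
    unfused_codegen_us: float  # IREE codegen, no fusion with neighbors
    fused_codegen_us: float    # IREE codegen, fused with downstream ops
    cudnn_us: float            # equivalent cuDNN graph


def should_outline(c: Candidate) -> bool:
    # Compare cuDNN against the best codegen option, including the fused one.
    # Ignoring fusion would bias the decision toward outlining.
    best_codegen_us = min(c.unfused_codegen_us, c.fused_codegen_us)
    return c.cudnn_us < best_codegen_us


def select_outlined(candidates: List[Candidate]) -> List[str]:
    return [c.name for c in candidates if should_outline(c)]


# Illustrative numbers only, not measurements.
candidates = [
    Candidate("conv_bias_relu", 120.0, 100.0, 60.0),
    Candidate("small_conv_add", 10.0, 4.0, 8.0),
]
print(select_outlined(candidates))
```

The second candidate shows why fusion matters: cuDNN beats the unfused codegen estimate (8 vs. 10), but loses to the fused one (8 vs. 4), so the subgraph stays with IREE codegen.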

Compilation pipeline

I need to leave this as a placeholder, because I haven't looked into it yet. But we should document our requirements for hooking into the IREE compilation pipeline. The main open issue here is when/how we perform the outlining of StableHLO ops and lowering to cuDNN ops. As far as I know, the downstream path from cuDNN ops is already working.

ezhulenev added the enhancement label Apr 18, 2023

ezhulenev assigned ezhulenev, frgossen and chsigg Apr 18, 2023
