Add Documentation and Compat for Registry (#13)
Co-authored-by: turquoisedragon2926 <rarockiasamy3@gatech.edu>
turquoisedragon2926 and Richard2926 authored Apr 11, 2024
1 parent 32fd1c9 commit 2d90aef
Showing 5 changed files with 350 additions and 0 deletions.
31 changes: 31 additions & 0 deletions .github/workflows/TagBot.yml
@@ -0,0 +1,31 @@
name: TagBot
on:
issue_comment:
types:
- created
workflow_dispatch:
inputs:
lookback:
default: 3
permissions:
actions: read
checks: read
contents: write
deployments: read
issues: read
discussions: read
packages: read
pages: read
pull-requests: read
repository-projects: read
security-events: read
statuses: read
jobs:
TagBot:
if: github.event_name == 'workflow_dispatch' || github.actor == 'JuliaTagBot'
runs-on: ubuntu-latest
steps:
- uses: JuliaRegistries/TagBot@v1
with:
token: ${{ secrets.GITHUB_TOKEN }}
ssh: ${{ secrets.DOCUMENTER_KEY }}
16 changes: 16 additions & 0 deletions Project.toml
@@ -3,6 +3,22 @@ uuid = "db9e0614-c73c-4112-a40c-114e5b366d0d"
authors = ["Richard Rex <richardr2926@gatech.edu>", "Thomas Grady <tgrady@gatech.edu>"]
version = "1.1.1"

[compat]
julia = "1.9.4"
CUDA = "5.2.0"
ChainRulesCore = "1.23.0"
Combinatorics = "1.0.2"
DataStructures = "0.18.18"
FFTW = "1.8.0"
Flux = "0.14.15"
GraphViz = "0.2.0"
JSON = "0.21.4"
LRUCache = "1.6.1"
LaTeXStrings = "1.3.1"
MPI = "0.20.19"
Match = "2.0.0"
OMEinsum = "0.8.1"

[deps]
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
227 changes: 227 additions & 0 deletions README.md
@@ -0,0 +1,227 @@
# ParametricOperators.jl

[![][license-img]][license-status]
<!-- [![][zenodo-img]][zenodo-status] -->

ParametricOperators.jl is a Julia-based scientific library designed to facilitate the creation and manipulation of tensor operations on large-scale data using Kronecker products. It provides an efficient and mathematically consistent way to express tensor programs and their distribution in the context of machine learning.

## Features
- <b>Kronecker Product Operations:</b> Implements tensor operations as Kronecker products of linear operators acting along multiple dimensions.
- <b>Parameterization Support:</b> Enables parametric functions in tensor programs, crucial for statistical optimization algorithms.
- <b>High-Level Abstractions:</b> Stays close to the underlying mathematics, providing a seamless experience for scientific practitioners.
- <b>Distributed Computing:</b> Scales Kronecker product tensor programs and gradient computation to multi-node distributed systems.
- <b>Domain-Specific Language:</b> Optimized for Julia's just-in-time compilation, allowing complex operators to be constructed entirely at compile time.
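
The core identity behind these Kronecker operations can be checked in a few lines of Base Julia — a standalone sketch using only `LinearAlgebra`, not the ParametricOperators API: applying `A ⊗ B` to a vectorized matrix is equivalent to applying each factor along its own dimension.

```julia
using LinearAlgebra

A = rand(Float32, 3, 3)  # operator acting along the second dimension
B = rand(Float32, 4, 4)  # operator acting along the first dimension
X = rand(Float32, 4, 3)  # data; vec(X) has length 4 * 3 = 12

# Kronecker identity: (A ⊗ B) * vec(X) == vec(B * X * Aᵀ)
lhs = kron(A, B) * vec(X)
rhs = vec(B * X * transpose(A))
@assert lhs ≈ rhs
```

This is why a chain of per-dimension operators never needs to materialize the full `12 × 12` Kronecker matrix: the right-hand side only ever multiplies small factors against the data.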

## Setup

```julia
julia> using Pkg
julia> Pkg.activate("path/to/your/project")
julia> Pkg.add("ParametricOperators")
```

This will add `ParametricOperators.jl` as a dependency to your project.

## Examples

### 1. FFT of 3D Tensor

```julia
using Pkg
Pkg.activate("./path/to/your/environment")

using ParametricOperators

T = Float32

gt, gx, gy = 100, 100, 100

# Define a transform along each dimension
Ft = ParDFT(T, gt)
Fx = ParDFT(Complex{T}, gx)
Fy = ParDFT(Complex{T}, gy)

# Create a Kronecker operator that chains together the transforms
F = Fy ⊗ Fx ⊗ Ft

# Apply the transform to a random input
# (`gpu` requires a GPU backend such as Flux; drop `|> gpu` to run on the CPU)
x = rand(T, gt, gx, gy) |> gpu
y = F * vec(x)
```

### 2. Distributed FFT of a 3D Tensor

Make sure to add the necessary dependencies. Depending on your hardware, you may also need to load an appropriate MPI implementation.

```julia
julia> using Pkg
julia> Pkg.activate("path/to/your/project")
julia> Pkg.add("MPI")
julia> Pkg.add("CUDA")
```

Copy the following code into a `.jl` file
```julia
using Pkg
Pkg.activate("./path/to/your/environment")

using ParametricOperators
using CUDA
using MPI

MPI.Init()

comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)
size = MPI.Comm_size(comm)

# Assign GPUs to MPI ranks manually; adjust this mapping to your hardware.
CUDA.device!(rank % 4)
partition = [1, 1, size]

T = Float32

# Define your Global Size and Data Partition
gt, gx, gy = 100, 100, 100
nt, nx, ny = [gt, gx, gy] .÷ partition

# Define a transform along each dimension
Ft = ParDFT(T, gt)
Fx = ParDFT(Complex{T}, gx)
Fy = ParDFT(Complex{T}, gy)

# Create and distribute the Kronecker operator that chains together the transforms
F = Fy ⊗ Fx ⊗ Ft
F = distribute(F, partition)

# Apply the transform on a random input
x = rand(T, nt, nx, ny) |> gpu
y = F * vec(x)

MPI.Finalize()
```

You can run the above by doing:

`srun -n N_TASKS julia code_above.jl`
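
If your cluster does not use SLURM, the same script can be launched with the `mpiexecjl` wrapper that ships with MPI.jl. This is a sketch: the rank count `4` and the script name are placeholders, and `mpiexecjl` must first be installed once per Julia depot.

```shell
# One-time setup: install the mpiexecjl launcher into ~/.julia/bin
julia -e 'using MPI; MPI.install_mpiexecjl()'

# Launch the script above on 4 MPI ranks
~/.julia/bin/mpiexecjl -n 4 julia code_above.jl
```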

### 3. Parametrized Convolution on 3D Tensor

Make sure to add the necessary dependencies to compute the gradient:

```julia
julia> using Pkg
julia> Pkg.activate("path/to/your/project")
julia> Pkg.add("Zygote")
```

```julia
using Pkg
Pkg.activate("./path/to/your/environment")

using ParametricOperators
using Zygote

T = Float32

gt, gx, gy = 100, 100, 100

# Define a transform along each dimension
St = ParMatrix(T, gt, gt)
Sx = ParMatrix(T, gx, gx)
Sy = ParMatrix(T, gy, gy)

# Create a Kronecker operator that chains together the transforms
S = Sy ⊗ Sx ⊗ St

# Parametrize our transform
θ = init(S) |> gpu

# Apply the transform on a random input
x = rand(T, gt, gx, gy) |> gpu
y = S(θ) * vec(x)

# Compute the gradient of some objective with respect to our parameters
θ′ = gradient(θ -> sum(S(θ) * vec(x)), θ)
```

### 4. Distributed Parametrized Convolution of a 3D Tensor

Make sure to add the necessary dependencies. Depending on your hardware, you may also need to load an appropriate MPI implementation.

```julia
julia> using Pkg
julia> Pkg.activate("path/to/your/project")
julia> Pkg.add("MPI")
julia> Pkg.add("CUDA")
julia> Pkg.add("Zygote")
```

Copy the following code into a `.jl` file
```julia
using Pkg
Pkg.activate("./path/to/your/environment")

using ParametricOperators
using Zygote
using CUDA
using MPI

MPI.Init()

comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)
size = MPI.Comm_size(comm)

# Assign GPUs to MPI ranks manually; adjust this mapping to your hardware.
CUDA.device!(rank % 4)
partition = [1, 1, size]

T = Float32

# Define your Global Size and Data Partition
gt, gx, gy = 100, 100, 100
nt, nx, ny = [gt, gx, gy] .÷ partition

# Define a transform along each dimension
St = ParMatrix(T, gt, gt)
Sx = ParMatrix(T, gx, gx)
Sy = ParMatrix(T, gy, gy)

# Create and distribute the Kronecker operator that chains together the transforms
S = Sy ⊗ Sx ⊗ St
S = distribute(S, partition)

# Parametrize our transform
θ = init(S) |> gpu

# Apply the transform on a random input
x = rand(T, nt, nx, ny) |> gpu
y = S(θ) * vec(x)

# Compute the gradient of some objective with respect to our parameters
θ′ = gradient(θ -> sum(S(θ) * vec(x)), θ)

MPI.Finalize()
```

You can run the above by doing:

`srun -n N_TASKS julia code_above.jl`
<!-- ## Citation
If you use our software for your research, we appreciate it if you cite us following the bibtex in [CITATION.bib](CITATION.bib). -->

## Issues

This section will contain common issues and corresponding fixes. Currently, we only support Julia 1.9.
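
If your default Julia installation is newer than 1.9, a matching toolchain can be installed side by side with `juliaup`. This sketch assumes `juliaup` is available on your system, and `--project=.` is a placeholder for your project directory.

```shell
# Install a Julia 1.9 channel alongside your default installation
juliaup add 1.9

# Run your project with the 1.9 toolchain and resolve its dependencies
julia +1.9 --project=. -e 'using Pkg; Pkg.instantiate()'
```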

## Authors

Richard Rex, [richardr2926@gatech.edu](mailto:richardr2926@gatech.edu) <br/>
Thomas Grady <br/>
Mark Glines <br/>

[license-status]:LICENSE
<!-- [zenodo-status]:https://doi.org/10.5281/zenodo.6799258 -->
[license-img]:http://img.shields.io/badge/license-MIT-brightgreen.svg?style=flat
<!-- [zenodo-img]:https://zenodo.org/badge/DOI/10.5281/zenodo.3878711.svg?style=plastic -->
43 changes: 43 additions & 0 deletions examples/3D_conv.jl
@@ -0,0 +1,43 @@
using Pkg
Pkg.activate("./")

using ParametricOperators
using Zygote
using CUDA
using MPI

MPI.Init()

comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)
size = MPI.Comm_size(comm)

CUDA.device!(rank % 4)
partition = [1, 1, size]

T = Float32

# Define your Global Size and Data Partition
gt, gx, gy = 100, 100, 100
nt, nx, ny = [gt, gx, gy] .÷ partition

# Define a transform along each dimension
St = ParMatrix(T, gt, gt)
Sx = ParMatrix(T, gx, gx)
Sy = ParMatrix(T, gy, gy)

# Create and distribute the Kronecker operator that chains together the transforms
S = Sy ⊗ Sx ⊗ St
S = distribute(S, partition)

# Parametrize our transform
θ = init(S) |> gpu

# Apply the transform on a random input
x = rand(T, nt, nx, ny) |> gpu
y = S(θ) * vec(x)

# Compute the gradient of some objective with respect to our parameters
θ′ = gradient(θ -> sum(S(θ) * vec(x)), θ)

MPI.Finalize()
33 changes: 33 additions & 0 deletions examples/3d_fft.jl
@@ -0,0 +1,33 @@
using Pkg
Pkg.activate("./")

using ParametricOperators
using CUDA
using MPI

MPI.Init()

comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)
size = MPI.Comm_size(comm)

CUDA.device!(rank % 4)
partition = [1, 1, size]

T = Float32

# Global and Local Partition
gt, gx, gy = 100, 100, 100
nt, nx, ny = [gt, gx, gy] .÷ partition

Ft = ParDFT(T, gt)
Fx = ParDFT(Complex{T}, gx)
Fy = ParDFT(Complex{T}, gy)

F = Fy ⊗ Fx ⊗ Ft
F = distribute(F, partition)

x = rand(T, nt, nx, ny) |> gpu
y = F * vec(x)

MPI.Finalize()
