Holographic-Projector

High-performance simulations of holographic projectors on GPUs, using CUDA. The implementations are explained in this paper.

The main CLI application cuda/holo computes superpositions at multiple target positions with respect to multiple source positions.

Project structure

cuda # C++/CUDA code. Additionally the CUDA libraries cuBLAS, thrust and CUB are used.
matlab # Matlab scripts that can be used to run simulations.
py # legacy Python code which was used for prototyping. This does not require a GPU.

Results

Holographic Projection Example

Quality

A brute-force simulation.

Brute-force simulation

A Monte Carlo simulation (approximation).

Monte Carlo simulation

Performance

Three algorithms, in comparison to a baseline.

Speedup

More: Efficiency, FLOPS, Runtime

Setup & Dependencies

The main dependencies are nvcc and gcc. Make sure they are installed and added to the path. For example, on the Nikhef intranet, add the following directories:

/usr/local/cuda-11.0/bin
/cvmfs/sft.cern.ch/lcg/releases/gcc/*/bin

and optionally run

LD_LIBRARY_PATH=/cvmfs/sft.cern.ch/lcg/releases/gcc/8.3.0.1/x86_64-centos7/lib64:$LD_LIBRARY_PATH

Usage

Compile the CUDA program using

make build

which is an alias for nvcc -o holo main.cu -l curand -l cublas -std=c++14 -arch=compute_70 -code=sm_70. The compile-time constants can be changed by appending -D{compile_time_constants}. Then you can run the CLI application using

cuda/holo

Run cuda/holo -h to list all valid arguments.
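The -D mechanism mentioned above usually pairs with guarded defaults in a header. The following is a minimal sketch of that idiom, not a copy of the repository's headers: the names EXAMPLE_BLOCK_SIZE and example_block_size are hypothetical and only illustrate how a default can be overridden at compile time.

```cpp
#include <cstddef>

// Hypothetical constant name, used for illustration only.
// The guard keeps the default unless the compiler was invoked with
// -DEXAMPLE_BLOCK_SIZE=<value>.
#ifndef EXAMPLE_BLOCK_SIZE
#define EXAMPLE_BLOCK_SIZE 128
#endif

// Expose the macro as a typed constant so it can be used as a template
// argument or array size.
constexpr std::size_t example_block_size = EXAMPLE_BLOCK_SIZE;

// Reject invalid overrides at compile time rather than at run time.
static_assert(example_block_size > 0, "EXAMPLE_BLOCK_SIZE must be positive");
```

With this pattern, appending -DEXAMPLE_BLOCK_SIZE=256 to the compiler invocation would replace the default of 128 without editing the source.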

By default, the required source and target positions are generated automatically. There are 3 types of distributions, and they are indicated using the symbols:

  • x the original input distribution, with positions u.
  • y the projector distribution, with positions v.
  • z the projection distribution, with positions w. This distribution will represent the original input distribution.

Using x as source, a target distribution y can be computed. Then, using y as source, z can be computed.

Alternatively, external data can be used. Use the flag -f {directory} to indicate the directory that contains the dataset files. These files should be binary arrays of double-precision floating-point values (little-endian encoding by default) and should be named as follows.

  • x0_amp.dat, x0_phase.dat
  • u0.dat (for the source data)
  • v0.dat (for the projector target positions)
  • w0_0.dat (for the projection target positions) - this file is only used if the boolean flag -F is supplied.

The projector distribution is written to y0_amp.dat and the projection distribution is written to z0_0_amp.dat.
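As a rough illustration of the file format described above, the following sketch writes and reads flat arrays of raw 8-byte doubles. It assumes a little-endian host (such as x86-64), so the in-memory representation can be written unchanged. The function names write_doubles and read_doubles are hypothetical, and the layout of the position files (number of coordinates per point and their ordering) is not specified here, so the sketch treats every file as a flat array.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Write values as raw 8-byte doubles.
// Assumption: the host is little-endian, so no byte swapping is needed.
void write_doubles(const std::string& path, const std::vector<double>& values) {
    std::ofstream out(path, std::ios::binary);
    if (!out) throw std::runtime_error("cannot open " + path);
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(values.size() * sizeof(double)));
    if (!out) throw std::runtime_error("write failed: " + path);
}

// Read a whole file back as a flat array of doubles.
std::vector<double> read_doubles(const std::string& path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) throw std::runtime_error("cannot open " + path);
    const std::streamsize n_bytes = static_cast<std::streamsize>(in.tellg());
    if (n_bytes < 0 || static_cast<std::size_t>(n_bytes) % sizeof(double) != 0)
        throw std::runtime_error("unexpected file size: " + path);
    in.seekg(0);
    std::vector<double> values(static_cast<std::size_t>(n_bytes) / sizeof(double));
    in.read(reinterpret_cast<char*>(values.data()), n_bytes);
    if (!in) throw std::runtime_error("read failed: " + path);
    return values;
}
```

For example, write_doubles("data/x0_amp.dat", amplitudes) could prepare an input file in a hypothetical directory data, and read_doubles("data/y0_amp.dat") could load the projector amplitudes after a run.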


Interference patterns of 1 and 5 points.

Interference Pattern 1 Point

Interference Pattern 5 Points

Overview of CUDA code

The main kernel to compute superpositions is superposition::per_block, which repeatedly calls superposition::phasor_displacement.
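To make the idea of a superposition concrete, here is a minimal host-side sketch of a brute-force sum over all sources for each target position. It is not the repository's kernel. The physical model is an assumption made for illustration: each source with amplitude a and phase φ at distance d from the target contributes (a/d)·exp(i(φ + 2πd/λ)). The names Point3, Source, phasor_at and superpose are hypothetical.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

struct Point3 { double x, y, z; };

struct Source {
    Point3 position;
    double amplitude;  // assumed non-negative
    double phase;      // radians
};

// Contribution of one source at one target position.
// Assumed model: spherical wave with 1/d attenuation. The target must not
// coincide with the source, otherwise d is zero.
std::complex<double> phasor_at(const Source& s, const Point3& target,
                               double wavelength) {
    const double two_pi = 2.0 * std::acos(-1.0);
    const double dx = target.x - s.position.x;
    const double dy = target.y - s.position.y;
    const double dz = target.z - s.position.z;
    const double d = std::sqrt(dx * dx + dy * dy + dz * dz);
    return std::polar(s.amplitude / d, s.phase + two_pi * d / wavelength);
}

// Brute force: every target accumulates the contribution of every source.
std::vector<std::complex<double>> superpose(const std::vector<Source>& sources,
                                            const std::vector<Point3>& targets,
                                            double wavelength) {
    std::vector<std::complex<double>> result(targets.size());
    for (std::size_t j = 0; j < targets.size(); ++j) {
        std::complex<double> sum(0.0, 0.0);
        for (const Source& s : sources) sum += phasor_at(s, targets[j], wavelength);
        result[j] = sum;
    }
    return result;
}
```

The cost is proportional to the number of sources times the number of targets, which is what makes a GPU implementation attractive. On the GPU the inner sum would be split across threads and blocks rather than run as a sequential loop.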

The main files are cuda/main.h and cuda/main.cu, which call functions declared in cuda/transform.cu to compute superposition transformations:

  • a brute force transformation transform_full()

There are a number of variants of SuperpositionPerBlockHelper macros, which are used in combination with superposition_per_block_helper() functions. They allow the GPU geometry to be passed as template parameters, which is required for CUB library functions.
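CUB collectives such as cub::BlockReduce take the block size as a template parameter, so a geometry chosen at run time has to be mapped onto a fixed set of compile-time instantiations. The sketch below shows that dispatch pattern in plain host C++. It is an illustration under stated assumptions, not the repository's macros: per_block_sums, DISPATCH_BLOCK_SIZE and per_block_sums_dispatch are hypothetical names, and a sequential loop stands in for the GPU blocks.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Stand-in for a kernel whose block size must be known at compile time.
// Each "block" reduces up to BlockSize consecutive elements to one partial sum.
template <std::size_t BlockSize>
std::vector<double> per_block_sums(const std::vector<double>& data) {
    static_assert(BlockSize > 0, "block size must be positive");
    std::vector<double> partial;
    for (std::size_t start = 0; start < data.size(); start += BlockSize) {
        const std::size_t end = std::min(start + BlockSize, data.size());
        partial.push_back(std::accumulate(
            data.begin() + static_cast<std::ptrdiff_t>(start),
            data.begin() + static_cast<std::ptrdiff_t>(end), 0.0));
    }
    return partial;
}

// Hypothetical dispatch macro: expands to one switch case per supported size,
// each calling a separate template instantiation.
#define DISPATCH_BLOCK_SIZE(size) \
    case size: return per_block_sums<size>(data);

// Map a run-time block size onto the compile-time instantiations.
std::vector<double> per_block_sums_dispatch(std::size_t block_size,
                                            const std::vector<double>& data) {
    switch (block_size) {
        DISPATCH_BLOCK_SIZE(2)
        DISPATCH_BLOCK_SIZE(4)
        DISPATCH_BLOCK_SIZE(8)
        default: throw std::invalid_argument("unsupported block size");
    }
}
```

The trade-off of this design is that only the listed sizes are available at run time, and each supported size adds one instantiation to the binary.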

Additionally, cuda/macros.h contains macros and constants.


Various tests are included in cuda/test.cu and cuda/test_gpu.cu. There are no tests written for the MC estimators (transform()).