Use GPUCompiler.lower_byval to avoid using the @codelet for inserting the preamble to fetch the arguments #25

Open
giordano opened this issue Jul 29, 2023 · 1 comment
Labels
code generation Related to GPUCompiler code generation infrastructure enhancement New feature or request

Comments

giordano commented Jul 29, 2023

CC: @maleadt

@giordano giordano added enhancement New feature or request code generation Related to GPUCompiler code generation infrastructure labels Jul 29, 2023
maleadt commented Jul 29, 2023

using LLVM

function main()
    Context() do ctx
        mod = LLVM.Module("test")

        # dummy function that takes its two operands as regular parameters
        ft = LLVM.FunctionType(LLVM.Int32Type(), [LLVM.Int32Type(), LLVM.Int32Type()])
        f = LLVM.Function(mod, "add", ft)
        IRBuilder() do builder
            bb = BasicBlock(f, "entry")
            position!(builder, bb)
            val = add!(builder, parameters(f)[1], parameters(f)[2])
            ret!(builder, val)
        end

        # new function without parameters: each argument is fetched through a call instead
        new_ft = LLVM.FunctionType(LLVM.Int32Type())
        new_f = LLVM.Function(mod, "new_add", new_ft)
        value_map = Dict{LLVM.Value, LLVM.Value}()
        IRBuilder() do builder
            bb = BasicBlock(new_f, "parameters")
            position!(builder, bb)
            for (i, param) in enumerate(parameters(f))
                intr_ft = LLVM.FunctionType(value_type(param))
                intr = LLVM.Function(mod, "get_param_$i", intr_ft)
                new_param = call!(builder, intr_ft, intr)
                value_map[param] = new_param
            end

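            # inline IR: clone the body of `f` into `new_f`, remapping the old
            # parameters to the fetched values through `value_map`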
            value_map[f] = new_f
            clone_into!(new_f, f; value_map,
                        changes=LLVM.API.LLVMCloneFunctionChangeTypeGlobalChanges)

            # fall through from the `parameters` block into the cloned `entry` block
            br!(builder, blocks(new_f)[2])
        end

        display(mod)
    end
    return
end

isinteractive() || main()

; ModuleID = 'test'
source_filename = "test"

define i32 @add(i32 %0, i32 %1) {
entry:
  %2 = add i32 %0, %1
  ret i32 %2
}

define i32 @new_add() {
parameters:
  %0 = call i32 @get_param_1()
  %1 = call i32 @get_param_2()
  br label %entry

entry:                                            ; preds = %parameters
  %2 = add i32 %0, %1
  ret i32 %2
}

declare i32 @get_param_1()

declare i32 @get_param_2()
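
As a rough illustration of what the `value_map` does in the snippet above, here is a toy model in plain Julia. It is only a sketch and does not use LLVM.jl: the `Inst` struct, the `remap` helper, and the `fetched_N` names are all made up for this example and stand in for LLVM instructions, the cloning step, and the results of the `get_param_N` calls.

```julia
# Toy model, NOT the LLVM.jl API: an instruction is a result name, an opcode,
# and a list of operand names.
struct Inst
    result::Symbol
    op::Symbol
    args::Vector{Symbol}
end

# Return a copy of `body` in which every operand that appears in `value_map`
# is replaced by the value it maps to (hypothetical helper).
remap(body::Vector{Inst}, value_map::Dict{Symbol,Symbol}) =
    [Inst(i.result, i.op, [get(value_map, a, a) for a in i.args]) for i in body]

# Original function: two parameters and a single `add`.
params = [:a, :b]
body = [Inst(:sum, :add, [:a, :b])]

# Preamble: one fetch per original parameter, no operands.
preamble = [Inst(Symbol("fetched_", n), Symbol("get_param_", n), Symbol[])
            for n in eachindex(params)]

# Map each old parameter to the result of its fetch.
value_map = Dict{Symbol,Symbol}(p => inst.result for (p, inst) in zip(params, preamble))

# New function body: the preamble followed by the remapped original body.
new_body = vcat(preamble, remap(body, value_map))
```

In this model `new_body` ends with an `add` whose operands are the fetched values rather than the original parameters, which is the same shape as `@new_add` in the IR above.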
