Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV] CodeGen of RVE and ilp32e/lp64e ABIs #76777

Merged
merged 1 commit into from
Jan 16, 2024
Merged

Conversation

wangpc-pp
Copy link
Contributor

This commit includes the necessary changes to clang and LLVM to support
codegen of RVE and the ilp32e/lp64e ABIs.

The differences between RVE and RVI are:

  • RVE reduces the integer register count to 16(x0-x16).
  • The ABI should be ilp32e for 32 bits and lp64e for 64 bits.

RVE can be combined with all current standard extensions.

The central changes in ilp32e/lp64e ABI, compared to ilp32/lp64 are:

  • Only 6 integer argument registers (rather than 8).
  • Only 2 callee-saved registers (rather than 12).
  • A Stack Alignment of 32bits (rather than 128bits).
  • ilp32e isn't compatible with D ISA extension.

If ilp32e or lp64 is used with an ISA that has any of the registers
x16-x31 and f0-f31, then these registers are considered temporaries.

To be compatible with the implementation of ilp32e in GCC, we don't use
aligned registers to pass variadic arguments and set stack alignment
to 4-bytes for types with length of 2*XLEN.

FastCC is also supported on RVE, while GHC isn't since there is only one
avaiable register.

Differential Revision: https://reviews.llvm.org/D70401

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:RISC-V clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen mc Machine (object) code llvm:support labels Jan 3, 2024
@wangpc-pp wangpc-pp requested review from topperc and yetingk January 3, 2024 04:24
@llvmbot
Copy link
Member

llvmbot commented Jan 3, 2024

@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-backend-risc-v
@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-llvm-support

@llvm/pr-subscribers-clang

Author: Wang Pengcheng (wangpc-pp)

Changes

This commit includes the necessary changes to clang and LLVM to support
codegen of RVE and the ilp32e/lp64e ABIs.

The differences between RVE and RVI are:

  • RVE reduces the integer register count to 16(x0-x16).
  • The ABI should be ilp32e for 32 bits and lp64e for 64 bits.

RVE can be combined with all current standard extensions.

The central changes in ilp32e/lp64e ABI, compared to ilp32/lp64 are:

  • Only 6 integer argument registers (rather than 8).
  • Only 2 callee-saved registers (rather than 12).
  • A Stack Alignment of 32bits (rather than 128bits).
  • ilp32e isn't compatible with D ISA extension.

If ilp32e or lp64 is used with an ISA that has any of the registers
x16-x31 and f0-f31, then these registers are considered temporaries.

To be compatible with the implementation of ilp32e in GCC, we don't use
aligned registers to pass variadic arguments and set stack alignment
to 4-bytes for types with length of 2*XLEN.

FastCC is also supported on RVE, while GHC isn't since there is only one
avaiable register.

Differential Revision: https://reviews.llvm.org/D70401


Patch is 422.14 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/76777.diff

45 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+2)
  • (modified) clang/lib/Basic/Targets/RISCV.cpp (+13-1)
  • (modified) clang/lib/Basic/Targets/RISCV.h (+12)
  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+2-1)
  • (modified) clang/lib/CodeGen/TargetInfo.h (+2-1)
  • (modified) clang/lib/CodeGen/Targets/RISCV.cpp (+25-10)
  • (modified) clang/lib/Driver/ToolChains/Arch/RISCV.cpp (+4)
  • (modified) clang/test/CodeGen/RISCV/riscv32-abi.c (+3)
  • (added) clang/test/CodeGen/RISCV/riscv32-ilp32e-error.c (+4)
  • (modified) clang/test/CodeGen/RISCV/riscv32-vararg.c (+421-141)
  • (modified) clang/test/CodeGen/RISCV/riscv64-abi.c (+4)
  • (modified) clang/test/CodeGen/RISCV/riscv64-vararg.c (+2)
  • (modified) clang/test/Preprocessor/riscv-target-features.c (+22)
  • (modified) llvm/docs/RISCVUsage.rst (+6)
  • (modified) llvm/docs/ReleaseNotes.rst (+2)
  • (modified) llvm/include/llvm/Support/RISCVAttributes.h (+1-1)
  • (modified) llvm/lib/Support/RISCVISAInfo.cpp (+4-4)
  • (modified) llvm/lib/Target/RISCV/GISel/RISCVCallLowering.cpp (+1-1)
  • (modified) llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.cpp (+5)
  • (modified) llvm/lib/Target/RISCV/MCTargetDesc/RISCVTargetStreamer.cpp (+8-5)
  • (modified) llvm/lib/Target/RISCV/RISCVCallingConv.td (+14-1)
  • (modified) llvm/lib/Target/RISCV/RISCVFeatures.td (+7)
  • (modified) llvm/lib/Target/RISCV/RISCVFrameLowering.cpp (+21-3)
  • (modified) llvm/lib/Target/RISCV/RISCVFrameLowering.h (+1-6)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+65-28)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.h (+1-1)
  • (modified) llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp (+17-3)
  • (modified) llvm/lib/Target/RISCV/RISCVTargetMachine.cpp (+13-3)
  • (modified) llvm/test/CodeGen/RISCV/callee-saved-fpr32s.ll (+432-2)
  • (modified) llvm/test/CodeGen/RISCV/callee-saved-fpr64s.ll (+216-1)
  • (modified) llvm/test/CodeGen/RISCV/callee-saved-gprs.ll (+530)
  • (added) llvm/test/CodeGen/RISCV/calling-conv-ilp32e.ll (+2556)
  • (added) llvm/test/CodeGen/RISCV/calling-conv-lp64e.ll (+221)
  • (added) llvm/test/CodeGen/RISCV/calling-conv-rv32f-ilp32e.ll (+83)
  • (modified) llvm/test/CodeGen/RISCV/interrupt-attr.ll (+745)
  • (added) llvm/test/CodeGen/RISCV/rv32e.ll (+25)
  • (added) llvm/test/CodeGen/RISCV/rv64e.ll (+25)
  • (removed) llvm/test/CodeGen/RISCV/rve.ll (-8)
  • (modified) llvm/test/CodeGen/RISCV/stack-realignment-with-variable-sized-objects.ll (+60)
  • (modified) llvm/test/CodeGen/RISCV/stack-realignment.ll (+652)
  • (modified) llvm/test/CodeGen/RISCV/target-abi-valid.ll (+4-5)
  • (added) llvm/test/CodeGen/RISCV/vararg-ilp32e.ll (+148)
  • (modified) llvm/test/CodeGen/RISCV/vararg.ll (+1261)
  • (modified) llvm/test/MC/RISCV/option-invalid.s (-3)
  • (modified) llvm/test/MC/RISCV/target-abi-invalid.s (+6-3)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 0c8fec691bf3c9..aa8d27f5c17551 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -971,6 +971,8 @@ RISC-V Support
 ^^^^^^^^^^^^^^
 - Unaligned memory accesses can be toggled by ``-m[no-]unaligned-access`` or the
   aliases ``-m[no-]strict-align``.
+- CodeGen of RV32E/RV64E are supported experimentally.
+- CodeGen of ilp32e/lp64e are supported experimentally.
 
 - Default ABI with F but without D was changed to ilp32f for RV32 and to lp64f
   for RV64.
diff --git a/clang/lib/Basic/Targets/RISCV.cpp b/clang/lib/Basic/Targets/RISCV.cpp
index 6bc57a83a2d5ae..8db989df04c87b 100644
--- a/clang/lib/Basic/Targets/RISCV.cpp
+++ b/clang/lib/Basic/Targets/RISCV.cpp
@@ -154,7 +154,7 @@ void RISCVTargetInfo::getTargetDefines(const LangOptions &Opts,
   else
     Builder.defineMacro("__riscv_float_abi_soft");
 
-  if (ABIName == "ilp32e")
+  if (ABIName == "ilp32e" || ABIName == "lp64e")
     Builder.defineMacro("__riscv_abi_rve");
 
   Builder.defineMacro("__riscv_arch_test");
@@ -215,6 +215,13 @@ void RISCVTargetInfo::getTargetDefines(const LangOptions &Opts,
     Builder.defineMacro("__riscv_misaligned_fast");
   else
     Builder.defineMacro("__riscv_misaligned_avoid");
+
+  if (ISAInfo->hasExtension("e")) {
+    if (Is64Bit)
+      Builder.defineMacro("__riscv_64e");
+    else
+      Builder.defineMacro("__riscv_32e");
+  }
 }
 
 static constexpr Builtin::Info BuiltinInfo[] = {
@@ -386,6 +393,11 @@ bool RISCVTargetInfo::handleTargetFeatures(std::vector<std::string> &Features,
   if (llvm::is_contained(Features, "+experimental"))
     HasExperimental = true;
 
+  if (ABI == "ilp32e" && ISAInfo->hasExtension("d")) {
+    Diags.Report(diag::err_invalid_feature_combination)
+        << "ILP32E must not be used with the D ISA extension";
+    return false;
+  }
   return true;
 }
 
diff --git a/clang/lib/Basic/Targets/RISCV.h b/clang/lib/Basic/Targets/RISCV.h
index f98c88cd45f831..bfbdafb682c851 100644
--- a/clang/lib/Basic/Targets/RISCV.h
+++ b/clang/lib/Basic/Targets/RISCV.h
@@ -132,6 +132,12 @@ class LLVM_LIBRARY_VISIBILITY RISCV32TargetInfo : public RISCVTargetInfo {
   }
 
   bool setABI(const std::string &Name) override {
+    if (Name == "ilp32e") {
+      ABI = Name;
+      resetDataLayout("e-m:e-p:32:32-i64:64-n32-S32");
+      return true;
+    }
+
     if (Name == "ilp32" || Name == "ilp32f" || Name == "ilp32d") {
       ABI = Name;
       return true;
@@ -156,6 +162,12 @@ class LLVM_LIBRARY_VISIBILITY RISCV64TargetInfo : public RISCVTargetInfo {
   }
 
   bool setABI(const std::string &Name) override {
+    if (Name == "lp64e") {
+      ABI = Name;
+      resetDataLayout("e-m:e-p:64:64-i64:64-i128:128-n32:64-S64");
+      return true;
+    }
+
     if (Name == "lp64" || Name == "lp64f" || Name == "lp64d") {
       ABI = Name;
       return true;
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index d78f2594a23764..4e999ddf9fe027 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -229,7 +229,8 @@ createTargetCodeGenInfo(CodeGenModule &CGM) {
       ABIFLen = 32;
     else if (ABIStr.ends_with("d"))
       ABIFLen = 64;
-    return createRISCVTargetCodeGenInfo(CGM, XLen, ABIFLen);
+    bool EABI = ABIStr.endswith("e");
+    return createRISCVTargetCodeGenInfo(CGM, XLen, ABIFLen, EABI);
   }
 
   case llvm::Triple::systemz: {
diff --git a/clang/lib/CodeGen/TargetInfo.h b/clang/lib/CodeGen/TargetInfo.h
index 0c0781a2d5ab9d..7682f197041c74 100644
--- a/clang/lib/CodeGen/TargetInfo.h
+++ b/clang/lib/CodeGen/TargetInfo.h
@@ -496,7 +496,8 @@ createPPC64_SVR4_TargetCodeGenInfo(CodeGenModule &CGM, PPC64_SVR4_ABIKind Kind,
                                    bool SoftFloatABI);
 
 std::unique_ptr<TargetCodeGenInfo>
-createRISCVTargetCodeGenInfo(CodeGenModule &CGM, unsigned XLen, unsigned FLen);
+createRISCVTargetCodeGenInfo(CodeGenModule &CGM, unsigned XLen, unsigned FLen,
+                             bool EABI);
 
 std::unique_ptr<TargetCodeGenInfo>
 createCommonSPIRTargetCodeGenInfo(CodeGenModule &CGM);
diff --git a/clang/lib/CodeGen/Targets/RISCV.cpp b/clang/lib/CodeGen/Targets/RISCV.cpp
index 1e1d249b37ac06..0851d1993d0c0f 100644
--- a/clang/lib/CodeGen/Targets/RISCV.cpp
+++ b/clang/lib/CodeGen/Targets/RISCV.cpp
@@ -25,8 +25,9 @@ class RISCVABIInfo : public DefaultABIInfo {
   // ISA might have a wider FLen than the selected ABI (e.g. an RV32IF target
   // with soft float ABI has FLen==0).
   unsigned FLen;
-  static const int NumArgGPRs = 8;
-  static const int NumArgFPRs = 8;
+  const int NumArgGPRs;
+  const int NumArgFPRs;
+  const bool EABI;
   bool detectFPCCEligibleStructHelper(QualType Ty, CharUnits CurOff,
                                       llvm::Type *&Field1Ty,
                                       CharUnits &Field1Off,
@@ -34,8 +35,10 @@ class RISCVABIInfo : public DefaultABIInfo {
                                       CharUnits &Field2Off) const;
 
 public:
-  RISCVABIInfo(CodeGen::CodeGenTypes &CGT, unsigned XLen, unsigned FLen)
-      : DefaultABIInfo(CGT), XLen(XLen), FLen(FLen) {}
+  RISCVABIInfo(CodeGen::CodeGenTypes &CGT, unsigned XLen, unsigned FLen,
+               bool EABI)
+      : DefaultABIInfo(CGT), XLen(XLen), FLen(FLen), NumArgGPRs(EABI ? 6 : 8),
+        NumArgFPRs(FLen != 0 ? 8 : 0), EABI(EABI) {}
 
   // DefaultABIInfo's classifyReturnType and classifyArgumentType are
   // non-virtual, but computeInfo is virtual, so we overload it.
@@ -86,7 +89,7 @@ void RISCVABIInfo::computeInfo(CGFunctionInfo &FI) const {
   }
 
   int ArgGPRsLeft = IsRetIndirect ? NumArgGPRs - 1 : NumArgGPRs;
-  int ArgFPRsLeft = FLen ? NumArgFPRs : 0;
+  int ArgFPRsLeft = NumArgFPRs;
   int NumFixedArgs = FI.getNumRequiredArgs();
 
   int ArgNum = 0;
@@ -396,9 +399,12 @@ ABIArgInfo RISCVABIInfo::classifyArgumentType(QualType Ty, bool IsFixed,
   // Determine the number of GPRs needed to pass the current argument
   // according to the ABI. 2*XLen-aligned varargs are passed in "aligned"
   // register pairs, so may consume 3 registers.
+  // TODO: To be compatible with GCC's behaviors, we don't align registers
+  // currently if we are using ILP32E calling convention. This behavior may be
+  // changed when RV32E/ILP32E is ratified.
   int NeededArgGPRs = 1;
   if (!IsFixed && NeededAlign == 2 * XLen)
-    NeededArgGPRs = 2 + (ArgGPRsLeft % 2);
+    NeededArgGPRs = 2 + (EABI && XLen == 32 ? 0 : (ArgGPRsLeft % 2));
   else if (Size > XLen && Size <= 2 * XLen)
     NeededArgGPRs = 2;
 
@@ -480,6 +486,13 @@ Address RISCVABIInfo::EmitVAArg(CodeGenFunction &CGF, Address VAListAddr,
 
   auto TInfo = getContext().getTypeInfoInChars(Ty);
 
+  // TODO: To be compatible with GCC's behaviors, we force arguments with
+  // 2×XLEN-bit alignment and size at most 2×XLEN bits like `long long`,
+  // `unsigned long long` and `double` to have 4-byte alignment. This
+  // behavior may be changed when RV32E/ILP32E is ratified.
+  if (EABI && XLen == 32)
+    TInfo.Align = std::min(TInfo.Align, CharUnits::fromQuantity(4));
+
   // Arguments bigger than 2*Xlen bytes are passed indirectly.
   bool IsIndirect = TInfo.Width > 2 * SlotSize;
 
@@ -499,8 +512,9 @@ namespace {
 class RISCVTargetCodeGenInfo : public TargetCodeGenInfo {
 public:
   RISCVTargetCodeGenInfo(CodeGen::CodeGenTypes &CGT, unsigned XLen,
-                         unsigned FLen)
-      : TargetCodeGenInfo(std::make_unique<RISCVABIInfo>(CGT, XLen, FLen)) {}
+                         unsigned FLen, bool EABI)
+      : TargetCodeGenInfo(
+            std::make_unique<RISCVABIInfo>(CGT, XLen, FLen, EABI)) {}
 
   void setTargetAttributes(const Decl *D, llvm::GlobalValue *GV,
                            CodeGen::CodeGenModule &CGM) const override {
@@ -526,6 +540,7 @@ class RISCVTargetCodeGenInfo : public TargetCodeGenInfo {
 
 std::unique_ptr<TargetCodeGenInfo>
 CodeGen::createRISCVTargetCodeGenInfo(CodeGenModule &CGM, unsigned XLen,
-                                      unsigned FLen) {
-  return std::make_unique<RISCVTargetCodeGenInfo>(CGM.getTypes(), XLen, FLen);
+                                      unsigned FLen, bool EABI) {
+  return std::make_unique<RISCVTargetCodeGenInfo>(CGM.getTypes(), XLen, FLen,
+                                                  EABI);
 }
diff --git a/clang/lib/Driver/ToolChains/Arch/RISCV.cpp b/clang/lib/Driver/ToolChains/Arch/RISCV.cpp
index 0717e3b813e1e2..76fedc576b8656 100644
--- a/clang/lib/Driver/ToolChains/Arch/RISCV.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/RISCV.cpp
@@ -210,6 +210,7 @@ StringRef riscv::getRISCVABI(const ArgList &Args, const llvm::Triple &Triple) {
   // rv32e -> ilp32e
   // rv32* -> ilp32
   // rv64g | rv64*d -> lp64d
+  // rv64e -> lp64e
   // rv64* -> lp64
   StringRef Arch = getRISCVArch(Args, Triple);
 
@@ -285,6 +286,7 @@ StringRef riscv::getRISCVArch(const llvm::opt::ArgList &Args,
   // 3. Choose a default based on `-mabi=`
   //
   // ilp32e -> rv32e
+  // lp64e -> rv64e
   // ilp32 | ilp32f | ilp32d -> rv32imafdc
   // lp64 | lp64f | lp64d -> rv64imafdc
   if (const Arg *A = Args.getLastArg(options::OPT_mabi_EQ)) {
@@ -292,6 +294,8 @@ StringRef riscv::getRISCVArch(const llvm::opt::ArgList &Args,
 
     if (MABI.equals_insensitive("ilp32e"))
       return "rv32e";
+    else if (MABI.starts_with_insensitive("lp64e"))
+      return "rv64e";
     else if (MABI.starts_with_insensitive("ilp32"))
       return "rv32imafdc";
     else if (MABI.starts_with_insensitive("lp64")) {
diff --git a/clang/test/CodeGen/RISCV/riscv32-abi.c b/clang/test/CodeGen/RISCV/riscv32-abi.c
index 040ae500fc60e4..3645e15ccefcdc 100644
--- a/clang/test/CodeGen/RISCV/riscv32-abi.c
+++ b/clang/test/CodeGen/RISCV/riscv32-abi.c
@@ -5,6 +5,8 @@
 // RUN:   | FileCheck -check-prefixes=ILP32-ILP32F-ILP32D,ILP32F-ILP32D,ILP32-ILP32F,ILP32F %s
 // RUN: %clang_cc1 -triple riscv32 -target-feature +f -target-feature +d -target-abi ilp32d -emit-llvm %s -o - \
 // RUN:   | FileCheck -check-prefixes=ILP32-ILP32F-ILP32D,ILP32F-ILP32D,ILP32D %s
+// RUN: %clang_cc1 -triple riscv32 -emit-llvm -target-abi ilp32e %s -o - \
+// RUN:     | FileCheck -check-prefixes=ILP32-ILP32F-ILP32D,ILP32-ILP32F,ILP32,ILP32E %s
 
 #include <stddef.h>
 #include <stdint.h>
@@ -2064,4 +2066,5 @@ union float16_u f_ret_float16_u(void) {
 }
 
 //// NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
+// ILP32E: {{.*}}
 // ILP32F: {{.*}}
diff --git a/clang/test/CodeGen/RISCV/riscv32-ilp32e-error.c b/clang/test/CodeGen/RISCV/riscv32-ilp32e-error.c
new file mode 100644
index 00000000000000..86444112bde297
--- /dev/null
+++ b/clang/test/CodeGen/RISCV/riscv32-ilp32e-error.c
@@ -0,0 +1,4 @@
+// RUN: not %clang_cc1 -triple riscv32 -target-feature +d -emit-llvm -target-abi ilp32e %s 2>&1 \
+// RUN:     | FileCheck -check-prefix=ILP32E-WITH-FD %s
+
+// ILP32E-WITH-FD: error: invalid feature combination: ILP32E must not be used with the D ISA extension
diff --git a/clang/test/CodeGen/RISCV/riscv32-vararg.c b/clang/test/CodeGen/RISCV/riscv32-vararg.c
index 02b1ed38e26556..1c4e41f2f54c8f 100644
--- a/clang/test/CodeGen/RISCV/riscv32-vararg.c
+++ b/clang/test/CodeGen/RISCV/riscv32-vararg.c
@@ -1,9 +1,11 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 2
 // RUN: %clang_cc1 -triple riscv32 -emit-llvm %s -o - | FileCheck %s
 // RUN: %clang_cc1 -triple riscv32 -target-feature +f -target-abi ilp32f -emit-llvm %s -o - \
-// RUN:     | FileCheck %s
+// RUN:     | FileCheck %s -check-prefixes=CHECK,CHECK-ILP32F
 // RUN: %clang_cc1 -triple riscv32 -target-feature +d -target-feature +f -target-abi ilp32d -emit-llvm %s -o - \
-// RUN:     | FileCheck %s
+// RUN:     | FileCheck %s -check-prefixes=CHECK,CHECK-ILP32D
+// RUN: %clang_cc1 -triple riscv32 -target-abi ilp32e -emit-llvm %s -o - \
+// RUN:     | FileCheck %s -check-prefixes=CHECK,CHECK-ILP32E
 
 #include <stddef.h>
 #include <stdint.h>
@@ -102,24 +104,60 @@ int f_va_1(char *fmt, ...) {
 // used to pass varargs with 2x xlen alignment and 2x xlen size. Ensure the
 // correct offsets are used.
 
-// CHECK-LABEL: define dso_local double @f_va_2
-// CHECK-SAME: (ptr noundef [[FMT:%.*]], ...) #[[ATTR0]] {
-// CHECK-NEXT:  entry:
-// CHECK-NEXT:    [[FMT_ADDR:%.*]] = alloca ptr, align 4
-// CHECK-NEXT:    [[VA:%.*]] = alloca ptr, align 4
-// CHECK-NEXT:    [[V:%.*]] = alloca double, align 8
-// CHECK-NEXT:    store ptr [[FMT]], ptr [[FMT_ADDR]], align 4
-// CHECK-NEXT:    call void @llvm.va_start(ptr [[VA]])
-// CHECK-NEXT:    [[ARGP_CUR:%.*]] = load ptr, ptr [[VA]], align 4
-// CHECK-NEXT:    [[TMP0:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR]], i32 7
-// CHECK-NEXT:    [[ARGP_CUR_ALIGNED:%.*]] = call ptr @llvm.ptrmask.p0.i32(ptr [[TMP0]], i32 -8)
-// CHECK-NEXT:    [[ARGP_NEXT:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR_ALIGNED]], i32 8
-// CHECK-NEXT:    store ptr [[ARGP_NEXT]], ptr [[VA]], align 4
-// CHECK-NEXT:    [[TMP1:%.*]] = load double, ptr [[ARGP_CUR_ALIGNED]], align 8
-// CHECK-NEXT:    store double [[TMP1]], ptr [[V]], align 8
-// CHECK-NEXT:    call void @llvm.va_end(ptr [[VA]])
-// CHECK-NEXT:    [[TMP2:%.*]] = load double, ptr [[V]], align 8
-// CHECK-NEXT:    ret double [[TMP2]]
+// CHECK-ILP32F-LABEL: define dso_local double @f_va_2
+// CHECK-ILP32F-SAME: (ptr noundef [[FMT:%.*]], ...) #[[ATTR0]] {
+// CHECK-ILP32F-NEXT:  entry:
+// CHECK-ILP32F-NEXT:    [[FMT_ADDR:%.*]] = alloca ptr, align 4
+// CHECK-ILP32F-NEXT:    [[VA:%.*]] = alloca ptr, align 4
+// CHECK-ILP32F-NEXT:    [[V:%.*]] = alloca double, align 8
+// CHECK-ILP32F-NEXT:    store ptr [[FMT]], ptr [[FMT_ADDR]], align 4
+// CHECK-ILP32F-NEXT:    call void @llvm.va_start(ptr [[VA]])
+// CHECK-ILP32F-NEXT:    [[ARGP_CUR:%.*]] = load ptr, ptr [[VA]], align 4
+// CHECK-ILP32F-NEXT:    [[TMP0:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR]], i32 7
+// CHECK-ILP32F-NEXT:    [[ARGP_CUR_ALIGNED:%.*]] = call ptr @llvm.ptrmask.p0.i32(ptr [[TMP0]], i32 -8)
+// CHECK-ILP32F-NEXT:    [[ARGP_NEXT:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR_ALIGNED]], i32 8
+// CHECK-ILP32F-NEXT:    store ptr [[ARGP_NEXT]], ptr [[VA]], align 4
+// CHECK-ILP32F-NEXT:    [[TMP1:%.*]] = load double, ptr [[ARGP_CUR_ALIGNED]], align 8
+// CHECK-ILP32F-NEXT:    store double [[TMP1]], ptr [[V]], align 8
+// CHECK-ILP32F-NEXT:    call void @llvm.va_end(ptr [[VA]])
+// CHECK-ILP32F-NEXT:    [[TMP2:%.*]] = load double, ptr [[V]], align 8
+// CHECK-ILP32F-NEXT:    ret double [[TMP2]]
+//
+// CHECK-ILP32D-LABEL: define dso_local double @f_va_2
+// CHECK-ILP32D-SAME: (ptr noundef [[FMT:%.*]], ...) #[[ATTR0]] {
+// CHECK-ILP32D-NEXT:  entry:
+// CHECK-ILP32D-NEXT:    [[FMT_ADDR:%.*]] = alloca ptr, align 4
+// CHECK-ILP32D-NEXT:    [[VA:%.*]] = alloca ptr, align 4
+// CHECK-ILP32D-NEXT:    [[V:%.*]] = alloca double, align 8
+// CHECK-ILP32D-NEXT:    store ptr [[FMT]], ptr [[FMT_ADDR]], align 4
+// CHECK-ILP32D-NEXT:    call void @llvm.va_start(ptr [[VA]])
+// CHECK-ILP32D-NEXT:    [[ARGP_CUR:%.*]] = load ptr, ptr [[VA]], align 4
+// CHECK-ILP32D-NEXT:    [[TMP0:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR]], i32 7
+// CHECK-ILP32D-NEXT:    [[ARGP_CUR_ALIGNED:%.*]] = call ptr @llvm.ptrmask.p0.i32(ptr [[TMP0]], i32 -8)
+// CHECK-ILP32D-NEXT:    [[ARGP_NEXT:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR_ALIGNED]], i32 8
+// CHECK-ILP32D-NEXT:    store ptr [[ARGP_NEXT]], ptr [[VA]], align 4
+// CHECK-ILP32D-NEXT:    [[TMP1:%.*]] = load double, ptr [[ARGP_CUR_ALIGNED]], align 8
+// CHECK-ILP32D-NEXT:    store double [[TMP1]], ptr [[V]], align 8
+// CHECK-ILP32D-NEXT:    call void @llvm.va_end(ptr [[VA]])
+// CHECK-ILP32D-NEXT:    [[TMP2:%.*]] = load double, ptr [[V]], align 8
+// CHECK-ILP32D-NEXT:    ret double [[TMP2]]
+//
+// CHECK-ILP32E-LABEL: define dso_local double @f_va_2
+// CHECK-ILP32E-SAME: (ptr noundef [[FMT:%.*]], ...) #[[ATTR0]] {
+// CHECK-ILP32E-NEXT:  entry:
+// CHECK-ILP32E-NEXT:    [[FMT_ADDR:%.*]] = alloca ptr, align 4
+// CHECK-ILP32E-NEXT:    [[VA:%.*]] = alloca ptr, align 4
+// CHECK-ILP32E-NEXT:    [[V:%.*]] = alloca double, align 8
+// CHECK-ILP32E-NEXT:    store ptr [[FMT]], ptr [[FMT_ADDR]], align 4
+// CHECK-ILP32E-NEXT:    call void @llvm.va_start(ptr [[VA]])
+// CHECK-ILP32E-NEXT:    [[ARGP_CUR:%.*]] = load ptr, ptr [[VA]], align 4
+// CHECK-ILP32E-NEXT:    [[ARGP_NEXT:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR]], i32 8
+// CHECK-ILP32E-NEXT:    store ptr [[ARGP_NEXT]], ptr [[VA]], align 4
+// CHECK-ILP32E-NEXT:    [[TMP0:%.*]] = load double, ptr [[ARGP_CUR]], align 4
+// CHECK-ILP32E-NEXT:    store double [[TMP0]], ptr [[V]], align 8
+// CHECK-ILP32E-NEXT:    call void @llvm.va_end(ptr [[VA]])
+// CHECK-ILP32E-NEXT:    [[TMP1:%.*]] = load double, ptr [[V]], align 8
+// CHECK-ILP32E-NEXT:    ret double [[TMP1]]
 //
 double f_va_2(char *fmt, ...) {
   __builtin_va_list va;
@@ -133,40 +171,106 @@ double f_va_2(char *fmt, ...) {
 
 // Two "aligned" register pairs.
 
-// CHECK-LABEL: define dso_local double @f_va_3
-// CHECK-SAME: (ptr noundef [[FMT:%.*]], ...) #[[ATTR0]] {
-// CHECK-NEXT:  entry:
-// CHECK-NEXT:    [[FMT_ADDR:%.*]] = alloca ptr, align 4
-// CHECK-NEXT:    [[VA:%.*]] = alloca ptr, align 4
-// CHECK-NEXT:    [[V:%.*]] = alloca double, align 8
-// CHECK-NEXT:    [[W:%.*]] = alloca i32, align 4
-// CHECK-NEXT:    [[X:%.*]] = alloca double, align 8
-// CHECK-NEXT:    store ptr [[FMT]], ptr [[FMT_ADDR]], align 4
-// CHECK-NEXT:    call void @llvm.va_start(ptr [[VA]])
-// CHECK-NEXT:    [[ARGP_CUR:%.*]] = load ptr, ptr [[VA]], align 4
-// CHECK-NEXT:    [[TMP0:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR]], i32 7
-// CHECK-NEXT:    [[ARGP_CUR_ALIGNED:%.*]] = call ptr @llvm.ptrmask.p0.i32(ptr [[TMP0]], i32 -8)
-// CHECK-NEXT:    [[ARGP_NEXT:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR_ALIGNED]], i32 8
-// CHECK-NEXT:    store ptr [[ARGP_NEXT]], ptr [[VA]], align 4
-// CHECK-NEXT:    [[TMP1:%.*]] = load double, ptr [[ARGP_CUR_ALIGNED]], align 8
-// CHECK-NEXT:    store double [[TMP1]], ptr [[V]], align 8
-// CHECK-NEXT:    [[ARGP_CUR1:%.*]] = load ptr, ptr [[VA]], align 4
-// CHECK-NEXT:    [[ARGP_NEXT2:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR1]], i32 4
-// CHECK-NEXT:    store ptr [[ARGP_NEXT2]], ptr [[VA]], align 4
-// CHECK-NEXT:    [[TMP2:%.*]] = load i32, ptr [[ARGP_CUR1]], align 4
-// CHECK-NEXT:    store i32 [[TMP2]], ptr [[W]], align 4
-// CHECK-NEXT:    [[ARGP_CUR3:%.*]] = load ptr, ptr [[VA]], align 4
-// CHECK-NEXT:    [[TMP3:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR3]], i32 7
-// CHECK-NEXT:    [[ARGP_CUR3_ALIGNED:%.*]] = call ptr @llvm.ptrmask.p0.i32(ptr [[TMP3]], i32 -8)
-// CHECK-NEXT:    [[ARGP_NEXT4:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR3_ALIGNED]], i32 8
-// CHECK-NEXT:    store ptr [[ARGP_NEXT4]], ptr [[VA]], align 4
-// CHECK-NEXT:    [[TMP4:%.*]] = load double, ptr [[ARGP_CUR3_ALIGNED]], align 8
-// CHECK-NEXT:    store double [[TMP4]], ptr [[X]], align 8
-// CHECK-NEXT:    call void @llvm.va_end(ptr [[VA]])
-// CHECK-NEXT:    [[TMP5:%.*]] = load double, ptr [[V]], align 8
-// CHECK-NEXT:    [[TMP6:%.*]] = load double, ptr [[X]], align 8
-// CHECK-NEXT:    [[ADD:%.*]] = fadd double [[TMP5]], [[TMP6]]
-// CHECK-NEXT:    ret double [[ADD]]
+// CHECK-ILP32F-LABEL: define dso_local double @f_va_3
+// CHECK-ILP32F-SAME: (ptr noundef [[FMT:%.*]], ...) #[[ATTR0]] {
+// CHECK-ILP32F-NEXT:  entry:
+// CHECK-ILP32F-NEXT:    [[FMT_ADDR:%.*]] = alloca ptr, align 4
+// CHECK-ILP32F-NEXT:    [[VA:%.*]] = alloca ptr, align 4
+// CHECK-ILP32F-NEXT:    [[V:%.*]] = alloca double, align 8
+// CHECK-ILP32F-NEXT:    [[W:%.*]] = alloca i32, align 4
+// CHECK-ILP32F-NEXT:    [[X:%.*]] = alloca double, align 8
+// CHECK-ILP32F-NEXT:    store ptr [[FMT]], ptr [[FMT_ADDR]], align 4
+// CHECK-ILP32F-NEXT:    call void @llvm.va_start(ptr [[VA]])
+// CHECK-ILP32F-NEXT:    [[ARGP_CUR:%.*]] = load ptr, ptr [[VA]], align 4
+// CHECK-ILP32F-NEXT:    [[TMP0:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR]], i32 7
+// CHECK-ILP32F-NEXT:    [[ARGP_CUR_ALIGNED:%.*]] = call ptr @llvm.ptrmask.p0.i32(ptr [[TMP0]], i32 -8)
+// CHECK-ILP32F-NEXT:    [[ARGP_NEXT:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR_ALIGNED]], i32 8
+// CHECK-ILP32F-NEXT:    store ptr [[ARGP_NEXT]], ptr [[VA]], align 4
+// CHECK-ILP32F-NEXT:    [[TMP1:%.*]] = load double, ptr [[ARGP_CUR_ALIGNED]], align 8
+// CHECK-ILP32F-NEXT:    store double [[...
[truncated]

@topperc
Copy link
Collaborator

topperc commented Jan 3, 2024

What was the last bit of discussion on the phabricator review? I can no longer access it.

clang/lib/Driver/ToolChains/Arch/RISCV.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/RISCV/RISCVISelLowering.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp Outdated Show resolved Hide resolved
llvm/test/CodeGen/RISCV/calling-conv-lp64e.ll Outdated Show resolved Hide resolved
llvm/test/CodeGen/RISCV/calling-conv-ilp32e.ll Outdated Show resolved Hide resolved
@athei
Copy link

athei commented Jan 3, 2024

What was the last bit of discussion on the phabricator review? I can no longer access it.

It was basically just waiting for final approval. The agreement was that it is good enough to be merged in as experimental even with unfinalized ABI.

I really hope we can get this in before end of this month to be able to use it in rustc.

@asb
Copy link
Contributor

asb commented Jan 4, 2024

The conclusion from the previous review was this was OK to merge. I think I held it up by not responding to a ping (apologies). I've had another scan through and don't see a problem with merging this and considering it experimental once Craig's review comments are addressed.

For the psABI, there's a gap here I think as GCC and LLVM need to deviate from what is currently written. I think riscv-non-isa/riscv-elf-psabi-doc#257 tried to partially address this. I wonder if we need a tracking issue in that repo to try to get this properly documented via that PR or otherwise (what do you think @jrtc27 @kito-cheng?).

@kito-cheng
Copy link
Member

Hmmm, RISC-V ISA is growth after riscv-non-isa/riscv-elf-psabi-doc#257 again, I mean...we have zfinx and zdinx, which is also valid combination with rv32e/rv64e, so we may need to revise ilp32e ABI again on the psABI side, but my intention is not to block this PR.

Anyway, I think we could track that on psABI repo, and we...already has an old issue to track that: riscv-non-isa/riscv-elf-psabi-doc#269

@kito-cheng
Copy link
Member

@wangpc-pp did you have interested on helping psABI side? it would be great if you can help since I suspect I don't have enough bandwidth to deal with that soon.

@wangpc-pp wangpc-pp requested a review from zixuan-wu January 4, 2024 17:28
@wangpc-pp
Copy link
Contributor Author

wangpc-pp commented Jan 4, 2024

@wangpc-pp did you have interested on helping psABI side? it would be great if you can help since I suspect I don't have enough bandwidth to deal with that soon.

Yes, I'm glad to. I think what we need to do is to fix some Zdinx issues? :-)

And, I think I have to explain the background here:
I started to maintain this patch about two years ago when I was in T-head, since they have some old RV32E cores (AFAIK, SiFive has RV32E cores too?) and some customers needed the support of RV32E/ilp32e. Now, both T-head and SiFive didn't release RV32E cores any more, and the RVI doesn't put its focuses on RVE too I think. But I don't think there is no vendor that developed or is developing their own RV32E implementations.
Now I am no longer working at T-head and my current work is not about RV32E, so I don't have environment to test it. But I think this patch can work just fine since it has been used in T-head's downstream for a long time (since LLVM 13, IIRC) and @zixuan-wu has reported several issues (already solved) before.
Apart from vendors' need, the request is mainly from Rust language community I think. IIUC, this is for the Rust implementation of SBI (https://github.com/rustsbi/rustsbi). They are using T-head E902 (rv32emc?) and RV32E core from SiFive.

So, yeah, I will try my best to fix the ABI issue, but I think we still need the help of the RISC-V community. I will contact with T-head guys to see if they are interested in this.

@asb
Copy link
Contributor

asb commented Jan 4, 2024

@nemanjai I'm curious if you have an interest / need to support RVE or not?

llvm/docs/RISCVUsage.rst Outdated Show resolved Hide resolved
@koute
Copy link

koute commented Jan 4, 2024

But I think this patch can work just fine [...] Apart from vendors' need, the request is mainly from Rust language community I think.

To put our 3 cents here, we also need it. We are (well, it's mostly me right now since it's still a prototype) currently building a virtual machine that is based on RV32E (and in the future also RV64E), and, long story short, thanks to the magic of RV32E it achieves execution performance roughly the same as wasmtime (which is a state-of-art WASM VM) while compiling into native code over ~200 times faster and being orders of magnitude simpler (because you can mostly naively translate RV32E machine code 1-to-1 into native AMD64 code as the reduced number of registers makes this barely possible, and the code is still very fast).

As far as testing this patch, as a fairly decent test case I've used one of the previous versions to compile DOOM into RV32E and ran it under our VM (in fact, you can play it here it you want), so the patch definitely works.

Anyway, people are underestimating RV32E thinking it's only for bottom tier microcontrollers, but this couldn't be further from the truth. It's not perfect, but it's also a really great VM bytecode; dare I say, I think it's a better WASM than WASM for non-Web uses.

@koute
Copy link

koute commented Jan 5, 2024

Side note: shouldn't we also update compiler-rt/lib/builtins/riscv/{save, restore}.S? E.g. with something like this:

diff --git a/compiler-rt/lib/builtins/riscv/restore.S b/compiler-rt/lib/builtins/riscv/restore.S
index 73f64a920d66..2e185ecae3f7 100644
--- a/compiler-rt/lib/builtins/riscv/restore.S
+++ b/compiler-rt/lib/builtins/riscv/restore.S
@@ -20,6 +20,46 @@
 
   .text
 
+#if __riscv_abi_rve == 1
+
+#if __riscv_xlen == 32
+  .globl  __riscv_restore_2
+  .type   __riscv_restore_2,@function
+  .globl  __riscv_restore_1
+  .type   __riscv_restore_1,@function
+  .globl  __riscv_restore_0
+  .type   __riscv_restore_0,@function
+__riscv_restore_1:
+__riscv_restore_0:
+  lw      s1,  4(sp)
+  lw      s0,  8(sp)
+  lw      ra,  12(sp)
+  addi    sp, sp, 16
+  ret
+#elif __riscv_xlen == 64
+  .globl  __riscv_restore_2
+  .type   __riscv_restore_2,@function
+__riscv_restore_2:
+  ld      s1,  8(sp)
+  addi    sp, sp, 16
+  // fallthrough into __riscv_restore_1/0
+
+  .globl  __riscv_restore_1
+  .type   __riscv_restore_1,@function
+  .globl  __riscv_restore_0
+  .type   __riscv_restore_0,@function
+__riscv_restore_1:
+__riscv_restore_0:
+  ld      s0,  0(sp)
+  ld      ra,  8(sp)
+  addi    sp, sp, 16
+  ret
+#else
+# error "xlen must be 32 or 64 for save-restore implementation
+#endif
+
+#else
+
 #if __riscv_xlen == 32
 
   .globl  __riscv_restore_12
@@ -164,3 +204,5 @@ __riscv_restore_0:
 #else
 # error "xlen must be 32 or 64 for save-restore implementation
 #endif
+
+#endif
diff --git a/compiler-rt/lib/builtins/riscv/save.S b/compiler-rt/lib/builtins/riscv/save.S
index 85501aeb4c2e..86b62a369cd0 100644
--- a/compiler-rt/lib/builtins/riscv/save.S
+++ b/compiler-rt/lib/builtins/riscv/save.S
@@ -16,6 +16,54 @@
 
   .text
 
+#if __riscv_abi_rve == 1
+
+#if __riscv_xlen == 32
+  .globl  __riscv_save_2
+  .type   __riscv_save_2,@function
+  .globl  __riscv_save_1
+  .type   __riscv_save_1,@function
+  .globl  __riscv_save_0
+  .type   __riscv_save_0,@function
+__riscv_save_3:
+__riscv_save_2:
+__riscv_save_1:
+__riscv_save_0:
+  addi    sp, sp, -16
+  sw      s1,  4(sp)
+  sw      s0,  8(sp)
+  sw      ra,  12(sp)
+  jr      t0
+#elif __riscv_xlen == 64
+  .globl  __riscv_save_2
+  .type   __riscv_save_2,@function
+__riscv_save_3:
+__riscv_save_2:
+  addi   sp, sp, -112
+  li     t1, 80
+.Lriscv_save_3_2:
+  sd     s1, 88(sp)
+  sd     s0, 96(sp)
+  sd     ra, 104(sp)
+  add    sp, sp, t1
+  jr     t0
+
+  .globl  __riscv_save_1
+  .type   __riscv_save_1,@function
+  .globl  __riscv_save_0
+  .type   __riscv_save_0,@function
+__riscv_save_1:
+__riscv_save_0:
+  addi   sp, sp, -16
+  sd     s0, 0(sp)
+  sd     ra, 8(sp)
+  jr     t0
+#else
+# error "xlen must be 32 or 64 for save-restore implementation
+#endif
+
+#else
+
 #if __riscv_xlen == 32
 
   .globl  __riscv_save_12
@@ -184,3 +232,5 @@ __riscv_save_0:
 #else
 # error "xlen must be 32 or 64 for save-restore implementation
 #endif
+
+#endif

(I don't remember why exactly since I did it a long time ago, but for some reason I do have this patch in my LLVM fork, so it probably was necessary for something.)

@wangpc-pp
Copy link
Contributor Author

Side note: shouldn't we also update compiler-rt/lib/builtins/riscv/{save, restore}.S? E.g. with something like this:
[...]
(I don't remember why exactly since I did it a long time ago, but for some reason I do have this patch in my LLVM fork, so it probably was necessary for something.)

Oh, it may be about stack alignment issue for libcalls. I may have fixed it.
As for your diffs, it seems that you only handle the __riscv_save/restore_[2|1|0], which is incomplete. And the code is not different with non-rve cases?

@koute
Copy link

koute commented Jan 5, 2024

As for your diffs, it seems that you only handle the __riscv_save/restore_[2|1|0], which is incomplete. And the code is not different with non-rve cases?

Yes, I mostly copy-pasted the existing code and removed all of the code dealing with registers not available on RV32E, so having only __riscv_save/restore_[2|1|0] is intended I suppose because there are only this many saved registers on RV32E. (There's probably a better way of doing it, and it looks like I screwed up the RV64E part of the patch.)

If I remember correctly I think I did this because otherwise compiling compiler-rt was not possible for RV32E and the compilation was spewing out errors about the unavailable registers? But I need to check this again once I finish porting this newest version of the patch to the most recent version of Rust.

@wangpc-pp
Copy link
Contributor Author

wangpc-pp commented Jan 5, 2024

As for your diffs, it seems that you only handle the __riscv_save/restore_[2|1|0], which is incomplete. And the code is not different with non-rve cases?

Yes, I mostly copy-pasted the existing code and removed all of the code dealing with registers not available on RV32E, so having only __riscv_save/restore_[2|1|0] is intended I suppose because there are only this many saved registers on RV32E. (There's probably a better way of doing it, and it looks like I screwed up the RV64E part of the patch.)

If I remember correctly I think I did this because otherwise compiling compiler-rt was not possible for RV32E and the compilation was spewing out errors about the unavailable registers? But I need to check this again once I finish porting this newest version of the patch to the most recent version of Rust.

Oh, I see. My previous comment was wrong.
I just checked GCC implementation, we do need to handle RVE cases (but GCC still lacks of RV64E handling). LLVM hasn't handled this mainly because we haven't support RVE now, I think.
Can you fire a PR for these changes? I think we should support it in compiler-rt once we have merged this PR.

@koute
Copy link

koute commented Jan 5, 2024

Can you fire a PR for these changes? I think we should support it in compiler-rt once we have merged this PR.

Once this PR is merged, sure, I can make a PR. (With a better patch than my current hacky copy-paste job.)

@nemanjai
Copy link
Member

nemanjai commented Jan 8, 2024

@nemanjai I'm curious if you have an interest / need to support RVE or not?

I most certainly do. Thank you for alerting me to this PR.

@asb
Copy link
Contributor

asb commented Jan 9, 2024

The RISCVUsage change looks good to me, and it's fantastic that @nemanjai might be able to help push this forward further.

I think all the review comments were addressed, so I'd be happy to merge if @topperc is happy his comments were resolved. Though also fine to wait longer if @nemanjai you wanted to review this as well.

Copy link
Member

@nemanjai nemanjai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not my intent to hold up approval of this patch. In addition to the minor comments I added, I plan to do some local testing but even if the testing reveals issues, they can be fixed on subsequent commits.

llvm/lib/Target/RISCV/RISCVCallingConv.td Outdated Show resolved Hide resolved
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/RISCV/RISCVFeatures.td Show resolved Hide resolved
llvm/docs/RISCVUsage.rst Outdated Show resolved Hide resolved
llvm/lib/Target/RISCV/MCTargetDesc/RISCVTargetStreamer.cpp Outdated Show resolved Hide resolved
@@ -985,9 +1003,10 @@ void RISCVFrameLowering::determineCalleeSaves(MachineFunction &MF,
};

for (auto Reg : CSRegs)
SavedRegs.set(Reg);
if (Reg < RISCV::X16 || !Subtarget.isRVE())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if we are using ilp32e/lp64e ABI on a subtarget that isn't RVE? Should these registers be saved?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The psABI says:

If used with an ISA that has any of the registers x16-x31 and f0-f31, then these registers are considered temporaries.

So I think we should.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Maybe just a little comment here to demonstrate to the user that this isn't an omission but a conscious decision.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though it's nearly impossible to have such configuration in real application, I saved x15-x16 in 20ffba3.

Diags.Report(diag::err_invalid_feature_combination)
<< "ILP32E cannot be used with the D ISA extension";
return false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it also need to check ilp32e isn't compatible with zcmp?

Copy link
Contributor Author

@wangpc-pp wangpc-pp Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we should!
(Sigh...even RISC-V International has forgotten RVE, as many new ISA extensions don't even think about impacts on RVE 😞).


There are still many problems to fix, I will leave this zcmp issue there and fix it after revising the psABI part.

@wangpc-pp wangpc-pp requested review from nemanjai and topperc January 11, 2024 07:44
Copy link
Member

@nemanjai nemanjai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments have been addressed, so this LGTM. I'll of course defer to @asb and @topperc for final approval.

Copy link
Contributor

@asb asb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for your persistence on this!

@wangpc-pp
Copy link
Contributor Author

It's not easy to push this forward and thanks a lot!
I'll merge this soon. If there are some issues (hope not 😄), follow-up patches will fix them.

This commit includes the necessary changes to clang and LLVM to support
codegen of `RVE` and the `ilp32e`/`lp64e` ABIs.

The differences between `RVE` and `RVI` are:
* `RVE` reduces the integer register count to 16(x0-x16).
* The ABI should be `ilp32e` for 32 bits and `lp64e` for 64 bits.

`RVE` can be combined with all current standard extensions.

The central changes in ilp32e/lp64e ABI, compared to ilp32/lp64 are:
* Only 6 integer argument registers (rather than 8).
* Only 2 callee-saved registers (rather than 12).
* A Stack Alignment of 32bits (rather than 128bits).
* ilp32e isn't compatible with D ISA extension.

If `ilp32e` or `lp64` is used with an ISA that has any of the registers
x16-x31 and f0-f31, then these registers are considered temporaries.

To be compatible with the implementation of ilp32e in GCC, we don't use
aligned registers to pass variadic arguments and set stack alignment\
to 4-bytes for types with length of 2*XLEN.

FastCC is also supported on RVE, while GHC isn't since there is only one
avaiable register.

Differential Revision: https://reviews.llvm.org/D70401
@wangpc-pp wangpc-pp merged commit 3ac9fe6 into llvm:main Jan 16, 2024
4 of 5 checks passed
@wangpc-pp wangpc-pp deleted the main-rve branch January 16, 2024 12:44
@koute
Copy link

koute commented Jan 16, 2024

As a third-party bystander I just want to say, thank you to everyone involved! And especially @wangpc-pp, a huge thank you for pushing this through!

justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
This commit includes the necessary changes to clang and LLVM to support
codegen of `RVE` and the `ilp32e`/`lp64e` ABIs.

The differences between `RVE` and `RVI` are:
* `RVE` reduces the integer register count to 16(x0-x16).
* The ABI should be `ilp32e` for 32 bits and `lp64e` for 64 bits.

`RVE` can be combined with all current standard extensions.

The central changes in ilp32e/lp64e ABI, compared to ilp32/lp64 are:
* Only 6 integer argument registers (rather than 8).
* Only 2 callee-saved registers (rather than 12).
* A Stack Alignment of 32bits (rather than 128bits).
* ilp32e isn't compatible with D ISA extension.

If `ilp32e` or `lp64` is used with an ISA that has any of the registers
x16-x31 and f0-f31, then these registers are considered temporaries.

To be compatible with the implementation of ilp32e in GCC, we don't use
aligned registers to pass variadic arguments and set stack alignment\
to 4-bytes for types with length of 2*XLEN.

FastCC is also supported on RVE, while GHC isn't since there is only one
avaiable register.

Differential Revision: https://reviews.llvm.org/D70401
wangpc-pp pushed a commit that referenced this pull request Apr 11, 2024
Register spills (save/restore) in RISC-V embedded work differently
because there are less registers and different stack alignment.

[GCC equivalent
](https://github.com/gcc-mirror/gcc/blob/master/libgcc/config/riscv/save-restore.S#L298C16-L336)

Follow up from #76777.

---------

Signed-off-by: xermicus <cyrill@parity.io>
llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Apr 12, 2024
Register spills (save/restore) in RISC-V embedded work differently
because there are less registers and different stack alignment.

[GCC equivalent
](https://github.com/gcc-mirror/gcc/blob/master/libgcc/config/riscv/save-restore.S#L298C16-L336)

Follow up from llvm#76777.

---------

Signed-off-by: xermicus <cyrill@parity.io>
(cherry picked from commit bd32aaa)
tstellar pushed a commit to llvmbot/llvm-project that referenced this pull request Apr 15, 2024
Register spills (save/restore) in RISC-V embedded work differently
because there are less registers and different stack alignment.

[GCC equivalent
](https://github.com/gcc-mirror/gcc/blob/master/libgcc/config/riscv/save-restore.S#L298C16-L336)

Follow up from llvm#76777.

---------

Signed-off-by: xermicus <cyrill@parity.io>
(cherry picked from commit bd32aaa)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:RISC-V clang:codegen clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category llvm:support mc Machine (object) code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants