Skip to content

Commit

Permalink
spark-3.5-scala-2.13/GHSA-8qv5-68g4-248j/GHSA-2jc4-r94c-rp7h-fix (#39819
Browse files Browse the repository at this point in the history
)

**Context and implications of the changes:**

The inability to update `org.scala-lang:scala-library` to version
`2.13.9` or higher in `spark-3.5.x` arises from a critical build issue
documented in
[SPARK-44376](https://issues.apache.org/jira/browse/SPARK-44376). This
problem stems from the transition to Scala `2.13.11`
([SPARK-40497](https://issues.apache.org/jira/browse/SPARK-40497)) and
the deprecation of the `-target` argument in favor of `-release` in the
[scala-maven-plugin](davidB/scala-maven-plugin#722).
This change introduces stricter compatibility checks, breaking builds
when using Java 11 or later.

The key errors include inaccessible `sun.*` classes like `DirectBuffer`
and `Unsafe`, which are not exported in Java versions above 8
([source](scala/bug#12643)). Despite manual
attempts to adjust Maven configurations, such as switching
`-target:jvm-1.8` to `-release:8`, compilation failures persist due to
inherent restrictions imposed by the `-release` argument. This issue is
compounded by the Scala Maven plugin's automatic addition of the
`-release` argument for Scala `2.13.9` and above, leading to
incompatibilities when targeting Java 8 compatibility while running on
newer Java versions.

While upstream remediated by upgrading to `2.13.11`, even `2.13.9` is
not possible with `scala-maven-plugin` version `4.8.0`. Currently,
`scala.version` is hardcoded to `2.13.8`, and fixing this requires
implementation of the following PRs:
1. https://github.com/apache/spark/pull/41626/files
2. https://github.com/apache/spark/pull/42899/files

**The only reason remediation is achievable is due to the following
conditions:**
1. It is already merged upstream and awaiting a `4.0.0` release.
2. The commits are not intertwined with other more complicated
initiatives or functional changes.
3. Support for Java 8 runtime dependency is not dropped with this
implementation.
4. This is a critical CVE.
5. Thorough package and image-level testing is in place.
6. Downgrading `scala-maven-plugin` to `4.7.1` does not introduce any
new CVEs beyond those already existing in `4.8.0`.

**This was a more involved remediation due to:**
1. The dynamic setting of this version property via
[change-scala-version.sh](https://github.com/apache/spark/blob/3b0ac45391708642ce6a1779e3c234bab0e40b66/dev/change-scala-version.sh#L72C1-L81C21).
2. The malformed `scala.version` property, which does not follow usual
`pom.xml` conventions, with the same property value defined twice in the
`pom.xml`, causing maven/pombump to not function.
3. The incredibly careful and diligent investigation required to gain
confidence in the fix, alongside the experience required from working on
previous spark-3.5 issues.

Bonus: Was able to tack on an additional remediation for
ivy/GHSA-2jc4-r94c-rp7h

---------

Signed-off-by: jamie-albert <jamie.albert@chainguard.dev>
  • Loading branch information
jamie-albert authored Jan 17, 2025
1 parent d10c014 commit be7bdf0
Show file tree
Hide file tree
Showing 3 changed files with 108 additions and 4 deletions.
4 changes: 2 additions & 2 deletions spark-3.5-scala-2.13.yaml
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
package:
name: spark-3.5-scala-2.13
version: 3.5.4
epoch: 0
epoch: 1
description: Unified engine for large-scale data analytics
copyright:
- license: Apache-2.0
Expand Down Expand Up @@ -50,7 +50,7 @@ pipeline:
# These are not CVE patches, but patches to make the build work.
- uses: patch
with:
patches: make-distribution.patch mvn.patch
patches: make-distribution.patch mvn.patch GHSA-8qv5-68g4-248j-fix.patch

- runs: |
./dev/change-scala-version.sh 2.13
Expand Down
104 changes: 104 additions & 0 deletions spark-3.5-scala-2.13/GHSA-8qv5-68g4-248j-fix.patch
Original file line number Diff line number Diff line change
@@ -0,0 +1,104 @@
From 55764a9ae8ef92512ef65229deac221c1c6afa95 Mon Sep 17 00:00:00 2001
From: yangjie01 <yangjie01@baidu.com>
Date: Fri, 15 Sep 2023 18:54:05 -0700
Subject: [PATCH] [SPARK-40497][BUILD] Re-upgrade Scala to 2.13.11

### What changes were proposed in this pull request?
This PR aims to re-upgrade Scala to 2.13.11, after SPARK-45144 was merged, the build issues mentioned in https://github.com/apache/spark/pull/41943 should no longer exist.

- https://www.scala-lang.org/news/2.13.11

Additionally, this pr adds a new suppression rule for warning message: `Implicit definition should have explicit type`, this is a new compile check introduced by https://github.com/scala/scala/pull/10083, we must fix it when we upgrading to use Scala 3

### Why are the changes needed?
This release improves collections, adds support for JDK 20 and 21, adds support for JDK 17 `sealed`:
- https://github.com/scala/scala/pull/10363
- https://github.com/scala/scala/pull/10184
- https://github.com/scala/scala/pull/10397
- https://github.com/scala/scala/pull/10348
- https://github.com/scala/scala/pull/10105

There are 2 known issues in this version:

- https://github.com/scala/bug/issues/12800
- https://github.com/scala/bug/issues/12799

For the first one, there is no compilation warning messages related to `match may not be exhaustive` in Spark compile log, and for the second one, there is no case of `method.isAnnotationPresent(Deprecated.class)` in Spark code, there is just

https://github.com/apache/spark/blob/8c84d2c9349d7b607db949c2e114df781f23e438/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala#L130

in Spark Code, and I checked `javax.annotation.Nonnull` no this issue.

So I think These two issues will not affect Spark itself, but this doesn't mean it won't affect the code written by end users themselves

The full release notes as follows:

- https://github.com/scala/scala/releases/tag/v2.13.11

### Does this PR introduce _any_ user-facing change?
Yes, this is a Scala version change.

### How was this patch tested?
- Existing Test

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #42918 from LuciferYang/SPARK-40497-2.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
---
pom.xml | 6 +++++-
project/SparkBuild.scala | 4 +++-
2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/pom.xml b/pom.xml
index 91d93e8c3cc2c..779f9e64f1dcb 100644
--- a/pom.xml
+++ b/pom.xml
@@ -177,7 +177,7 @@
<scala-collection-compat.version>2.7.0</scala-collection-compat.version>
<scalatest-maven-plugin.version>2.2.0</scalatest-maven-plugin.version>
<!-- dont update scala-maven-plugin to version 4.8.1 SPARK-42809 and SPARK-43595 -->
- <scala-maven-plugin.version>4.8.0</scala-maven-plugin.version>
+ <scala-maven-plugin.version>4.7.1</scala-maven-plugin.version>
<maven.scaladoc.skip>false</maven.scaladoc.skip>
<versions-maven-plugin.version>2.16.0</versions-maven-plugin.version>
<!-- for now, not running scalafmt as part of default verify pipeline -->
@@ -3630,7 +3630,7 @@
<profile>
<id>scala-2.13</id>
<properties>
- <scala.version>2.13.8</scala.version>
+ <scala.version>2.13.11</scala.version>
<scala.binary.version>2.13</scala.binary.version>
</properties>
<build>
@@ -3689,6 +3689,10 @@
-->
<arg>-Wconf:cat=unused-imports&amp;src=org\/apache\/spark\/graphx\/impl\/VertexPartitionBase.scala:s</arg>
<arg>-Wconf:cat=unused-imports&amp;src=org\/apache\/spark\/graphx\/impl\/VertexPartitionBaseOps.scala:s</arg>
+ <!--
+ SPARK-40497 Upgrade Scala to 2.13.11 and suppress `Implicit definition should have explicit type`
+ -->
+ <arg>-Wconf:msg=Implicit definition should have explicit type:s</arg>
</args>
<compilerPlugins combine.self="override">
</compilerPlugins>
diff --git a/project/SparkBuild.scala b/project/SparkBuild.scala
index f67be83561d35..dc2e20c644b2a 100644
--- a/project/SparkBuild.scala
+++ b/project/SparkBuild.scala
@@ -286,7 +286,9 @@ object SparkBuild extends PomBuild {
// TODO(SPARK-43850): Remove the following suppression rules and remove `import scala.language.higherKinds`
// from the corresponding files when Scala 2.12 is no longer supported.
"-Wconf:cat=unused-imports&src=org\\/apache\\/spark\\/graphx\\/impl\\/VertexPartitionBase.scala:s",
- "-Wconf:cat=unused-imports&src=org\\/apache\\/spark\\/graphx\\/impl\\/VertexPartitionBaseOps.scala:s"
+ "-Wconf:cat=unused-imports&src=org\\/apache\\/spark\\/graphx\\/impl\\/VertexPartitionBaseOps.scala:s",
+ // SPARK-40497 Upgrade Scala to 2.13.11 and suppress `Implicit definition should have explicit type`
+ "-Wconf:msg=Implicit definition should have explicit type:s"
)
}
}

4 changes: 2 additions & 2 deletions spark-3.5-scala-2.13/pombump-properties.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ properties:
value: "3.25.5"
- property: woodstox-core.version
value: "5.4.0"
- property: scala.version
value: "2.13.10"
- property: zookeeper.version
value: "3.7.2" # Changed from 3.7.3 to an existing version
- property: ivy.version
value: "2.5.3"

0 comments on commit be7bdf0

Please sign in to comment.