Upgrade Hive client to 2.3.9 and HIVE platform JDK toolchain to 17#177
Open
YogeshKothari26 wants to merge 1 commit into
Bumps the plugin's pinned Hive client 1.2.2 -> 2.3.9 and the HIVE platform's JDK toolchain 8 -> 17 in Defaults.java. Bytecode for the HIVE platform stays at Java 8 (options.release.set(8)) so produced consumer UDF jars remain runnable on Java 8 runtimes. Spark (spark_2.11, spark_2.12) and Trino subprojects are not changed. Motivation: downstream UDF projects can move their build JVM to Java 17 without per-project workarounds for Hive 1.2.2's unresolvable org.pentaho transitive dep or its JDK 17 reflection access issues. See PR description for the full change list and testing summary.
b904862 to 0c71148
Summary
Bumps the plugin's pinned Hive client 1.2.2 → 2.3.9 and the HIVE platform's JDK toolchain 8 → 17 in Defaults.java. Bytecode for the HIVE platform stays at Java 8 (options.release.set(8)) so the produced consumer UDF jars remain runnable on Java 8 runtimes.
Spark (spark_2.11, spark_2.12) and Trino subprojects are not changed.
Motivation
- Hive 1.2.2 transitively pulls org.pentaho:pentaho-aggdesigner-algorithm:5.1.5-jhyde, which is not resolvable on Maven Central; downstream builds that don't add an explicit exclusion fail to resolve. With Hive 2.3.9, the dependency can be excluded cleanly.
- Hive 1.2.2's embedded HiveServer2 + DataNucleus + Derby reflection paths fail under JDK 17. Hive 2.3.9 is JDK-17-friendly with the standard --add-opens flags applied here.
Changes
- Defaults.java: HIVE platform JavaLanguageVersion.of(8) → of(17); add org.pentaho exclusion to the consumer's hive-exec compileOnly configuration.
- TransportPlugin.java: pin bytecode release = 8 for non-Trino Java platforms; add --add-opens JVM args to the test launcher when the platform's JLV is >= 17 and the platform is non-Trino.
- transportable-udfs-plugin/build.gradle: hive-version 1.2.2 → 2.3.9 in the generated version-info.properties.
- transportable-udfs-hive/build.gradle: hive-exec 1.2.2 → 2.3.9 (compileOnly + testImplementation), add org.pentaho exclusion, add --add-opens to the subproject's own test task when the build JVM is JDK 17.
- transportable-udfs-test-hive/build.gradle: hive-exec / hive-service 1.2.2 → 2.3.9, add org.pentaho exclusion.
- HiveTester.java: two Hive 2.x compat fixes: replace the removed FunctionInfo(boolean, String, GenericUDF) ctor with FunctionInfo(FunctionType.PERSISTENT, ...); disable METASTORE_SCHEMA_VERIFICATION and enable datanucleus.schema.autoCreateAll on the embedded HiveConf (Hive 2.3.x's embedded Derby strictly verifies schema version on startup).
Testing
1. End-to-end Spark integration
A Spark cluster job registers Hive UDFs built by this plugin snapshot and runs SQL queries that exercise:
- StdString upper-case with null and empty-string edge cases.
- StdLong addition with null-argument propagation.
- StdMap<StdString,StdString> construction with key-lookup, size, and null-key cases.
All cases are registered via CREATE TEMPORARY FUNCTION + SELECT against a SparkSession. This exercises driver↔executor serialization, Spark's Hive UDF compatibility shim, and metastore-backed function registration end-to-end. All test cases pass.
2. JAR diff vs master
SHA-256 + size + class-set + bytecode-major-version on the 7 shipped plugin artifacts (snapshot vs baseline 0.2.0):
- Unchanged jars are CONTENT_IDENTICAL (every class file byte-identical; zip metadata drift only). The 2 jars with deliberate content change are transportable-udfs-plugin.jar (the toolchain wiring) and transportable-udfs-test-hive.jar (the Hive 2.x call-site fixes), i.e. only the files this PR touches.
- version-info.properties flips hive-version=1.2.2 → 2.3.9 as intended.
- Bytecode is major version: 52 via javap -v (Java 8 bytecode target, which preserves grid runtime compatibility). Trino wrapper unchanged at major version: 61.
3. Plugin example UDFs (build + unit)
./gradlew build test on the PR branch: 51 / 51 example tests pass (17 hiveTest on JDK 17 launcher, 17 trinoTest, 17 generic) across transportable-udfs-examples, covering every StdUDF arity + Std type (primitives, StdArray, StdMap, StdStruct, nested). Build is green on both JDK 8 and JDK 17 build JVMs.
Note for consumers bumping their build JVM to JDK 17
The Spark platforms (spark_2.11, spark_2.12) intentionally stay at JavaLanguageVersion.of(8) because Spark 3.1.1 hits SPARK-33772 on JDK 17. If you bump your build JVM to JDK 17 and use either Spark platform, keep a JDK 8 toolchain reachable from Gradle (e.g., add its path to org.gradle.java.installations.paths in your gradle.properties). The HIVE platform itself needs no consumer-side workaround. This pin lifts once the plugin's pinned Spark version is bumped to 3.5.x.
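Consumers who move their build JVM to JDK 17 may want to confirm that their UDF jars still carry Java 8 bytecode, as the javap -v check in the Testing section does for the plugin artifacts. The following is a minimal sketch of that check in plain Java, not part of this PR: the class name ClassVersionCheck and its methods are hypothetical, and it assumes a regular (non-multi-release) jar. It reads the class-file header (magic number, minor version, major version) of every .class entry and reports the highest major version found; 52 corresponds to Java 8 and 61 to Java 17.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of the plugin: reports class-file major versions in a jar.
final class ClassVersionCheck {
    static final int JAVA_8_MAJOR = 52; // class-file major version for Java 8

    // Reads the class-file header: u4 magic, u2 minor_version, u2 major_version.
    static int majorVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file (bad magic number)");
        }
        data.readUnsignedShort();        // minor_version, ignored here
        return data.readUnsignedShort(); // major_version
    }

    // Returns the highest major version among all .class entries, or 0 if there are none.
    // Assumption: the jar is not multi-release; entries under META-INF/versions/ are not special-cased.
    static int maxMajorVersion(String jarPath) throws IOException {
        int max = 0;
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (!entry.getName().endsWith(".class")) {
                    continue;
                }
                try (InputStream in = jar.getInputStream(entry)) {
                    max = Math.max(max, majorVersion(in));
                }
            }
        }
        return max;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ClassVersionCheck.java <path-to-jar>");
            System.exit(2);
        }
        int max = maxMajorVersion(args[0]);
        System.out.println("highest class-file major version: " + max);
        // Exit non-zero when any class targets something newer than Java 8.
        System.exit(max > JAVA_8_MAJOR ? 1 : 0);
    }
}
```

With JDK 11 or later this can be run directly as `java ClassVersionCheck.java path/to/udf.jar`; the non-zero exit code makes it usable as a CI gate, though javap -v remains the reference check.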