[VL] Change the loadQuantum config if velox cache is enabled. #8197

yikf · 2024-12-10T07:59:53Z

What changes were proposed in this pull request?

Follow-up to #8186: change the loadQuantum config only if the Velox cache is enabled.

How was this patch tested?

manual tests

github-actions · 2024-12-10T08:00:10Z

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

Other pull requests

github-actions · 2024-12-10T08:00:25Z

Run Gluten Clickhouse CI on x86

github-actions · 2024-12-10T08:15:51Z

Run Gluten Clickhouse CI on x86

yikf · 2024-12-10T08:18:39Z

@Yohahaha @PHILO-HE @jackylee-ch Could you please take a look, thanks!

yikf · 2024-12-10T08:18:56Z

also cc @FelixYBW

github-actions · 2024-12-10T08:30:06Z

Run Gluten Clickhouse CI on x86

PHILO-HE · 2024-12-10T09:35:00Z

gluten-core/src/main/scala/org/apache/gluten/GlutenPlugin.scala

+      logWarning(
+        s"Velox currently only support up to 8MB load quantum size on SSD cache, change config " +
+          s"{$LOAD_QUANTUM.key} value from loadQuantum to $maxLoadQuantumOfVeloxCache")
+      conf.set(LOAD_QUANTUM.key, maxLoadQuantumOfVeloxCache.toString)


I note this is a static config. Can it really be reset?

I would prefer to check the load quantum in VeloxBackend::initCache and throw when it is larger than 8MB while the cache is enabled.

@Yohahaha When users use the default values and enable the Velox cache, throwing directly may not be a good choice.

Adjusting the load quantum in VeloxBackend::initCache would also mean checking and modifying backendConf_, which appears to be kept read-only for safety.

I am fine with either approach.

@PHILO-HE It can be reset. The reset happens during the initialization stage of the SparkContext, before the SparkSession is initialized.

"When users use the default values and enable the Velox cache, throwing directly may not be a good choice."

It is a good choice; not all default values suit the Velox cache.

Keeping the check in GlutenPlugin is OK, but modifying the config does not make sense to me. We could add a load quantum check like the one for spark.gluten.sql.columnar.backend.velox.fileHandleCacheEnabled.

@PHILO-HE @jackylee-ch do you have any thoughts on this?
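For readers following the thread, the two options debated above (lower the value with a warning, or reject it outright) can be sketched roughly as follows. This is a minimal, self-contained sketch, not the merged code. A plain Map stands in for SparkConf, values are assumed to be raw byte counts, and the names LoadQuantumSketch, clampLoadQuantum, requireLoadQuantum, and the cacheEnabled key are hypothetical. Only the loadQuantum key, the 8MB limit, and the 256MB default are taken from this PR's diff and comments.

```scala
object LoadQuantumSketch {
  // Key taken from the GlutenConfig diff in this PR.
  val LoadQuantumKey = "spark.gluten.sql.columnar.backend.velox.loadQuantum"
  // Assumed key name for the Velox cache switch (not shown in this thread).
  val CacheEnabledKey = "spark.gluten.sql.columnar.backend.velox.cacheEnabled"

  // 8MB limit from the diff: 8 * 1024 * 1024 bytes.
  val MaxLoadQuantumForVeloxCache: Long = 8L * 1024 * 1024
  // 256MB default, as mentioned in the review comment on GlutenConfig.
  val DefaultLoadQuantum: Long = 256L * 1024 * 1024

  private def cacheEnabled(conf: Map[String, String]): Boolean =
    conf.get(CacheEnabledKey).exists(_.toBoolean)

  private def loadQuantum(conf: Map[String, String]): Long =
    conf.get(LoadQuantumKey).map(_.toLong).getOrElse(DefaultLoadQuantum)

  /** Option A: lower the value to the limit and return a warning message. */
  def clampLoadQuantum(conf: Map[String, String]): (Map[String, String], Option[String]) = {
    val current = loadQuantum(conf)
    if (cacheEnabled(conf) && current > MaxLoadQuantumForVeloxCache) {
      val warning =
        s"Changing $LoadQuantumKey from $current to $MaxLoadQuantumForVeloxCache " +
          "because the Velox cache is enabled."
      (conf.updated(LoadQuantumKey, MaxLoadQuantumForVeloxCache.toString), Some(warning))
    } else {
      (conf, None)
    }
  }

  /** Option B: fail fast with IllegalArgumentException instead of changing the config. */
  def requireLoadQuantum(conf: Map[String, String]): Unit = {
    val current = loadQuantum(conf)
    require(
      !cacheEnabled(conf) || current <= MaxLoadQuantumForVeloxCache,
      s"$LoadQuantumKey must be at most $MaxLoadQuantumForVeloxCache when the Velox cache is enabled")
  }
}
```

With the cache enabled and no explicit loadQuantum, option A silently keeps the job running at 8MB, while option B stops a user who only flipped the cache switch. That difference is the trade-off the reviewers are weighing.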

PHILO-HE · 2024-12-10T09:53:19Z

shims/common/src/main/scala/org/apache/gluten/GlutenConfig.scala

@@ -2097,13 +2097,12 @@ object GlutenConfig {
      .intConf
      .createWithDefault(1)

-  // Velox currently only support up to 8MB load quantum size on SSD.
  val LOAD_QUANTUM =
    buildStaticConf("spark.gluten.sql.columnar.backend.velox.loadQuantum")
      .internal()
      .doc("Set the load quantum for velox file scan")


It is worth adding more comments here. E.g., "Recommend to use the default value (256MB) for performance consideration. If Velox cache is enabled, it can be 8MB at most."

Yohahaha · 2024-12-10T10:57:55Z

gluten-core/src/main/scala/org/apache/gluten/GlutenPlugin.scala

@@ -249,6 +249,18 @@ private[gluten] class GlutenDriverPlugin extends DriverPlugin with Logging {
        s"${COLUMNAR_VELOX_CACHE_ENABLED.key} and " +
          s"${COLUMNAR_VELOX_FILE_HANDLE_CACHE_ENABLED.key} should be enabled together.")
    }
+
+    val maxLoadQuantumOfVeloxCache = 8 * 1024 * 1024


nit: maxLoadQuantumForVeloxCache

github-actions · 2024-12-10T12:17:50Z

Run Gluten Clickhouse CI on x86

github-actions · 2024-12-10T12:18:44Z

Run Gluten Clickhouse CI on x86

jackylee-ch · 2024-12-10T13:50:30Z

gluten-core/src/main/scala/org/apache/gluten/GlutenPlugin.scala

+    ) {
+      logWarning(
+        s"Velox currently only support up to 8MB load quantum size on SSD cache, change config " +
+          s"{$LOAD_QUANTUM.key} value from loadQuantum to $maxLoadQuantumForVeloxCache")


nit:

s"${LOAD_QUANTUM.key} value from ${loadQuantum} to $maxLoadQuantumForVeloxCache. " +
  s"User can set ${LOAD_QUANTUM.key}=$maxLoadQuantumForVeloxCache to skip this warning."
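To see why this nit matters, the following sketch compares the message as written in the reviewed hunk with the suggested form. It is self-contained and illustrative only: ConfEntry is a hypothetical stand-in for Gluten's config entry type, and the 256MB starting value is assumed for the example.

```scala
object InterpolationSketch {
  // Hypothetical stand-in for a Gluten config entry; only `key` matters here.
  final case class ConfEntry(key: String)

  val LOAD_QUANTUM: ConfEntry =
    ConfEntry("spark.gluten.sql.columnar.backend.velox.loadQuantum")
  val loadQuantum: Long = 256L * 1024 * 1024
  val maxLoadQuantumForVeloxCache: Long = 8L * 1024 * 1024

  // As in the reviewed hunk: `$LOAD_QUANTUM` is interpolated on its own
  // (via toString), so the braces and ".key" remain literal text, and
  // "loadQuantum" is plain text rather than the variable's value.
  val asWritten: String =
    s"{$LOAD_QUANTUM.key} value from loadQuantum to $maxLoadQuantumForVeloxCache"

  // As suggested in the review: `${...}` evaluates the whole expression.
  val asSuggested: String =
    s"${LOAD_QUANTUM.key} value from ${loadQuantum} to $maxLoadQuantumForVeloxCache. " +
      s"User can set ${LOAD_QUANTUM.key}=$maxLoadQuantumForVeloxCache to skip this warning."

  def main(args: Array[String]): Unit = {
    println(asWritten)
    println(asSuggested)
  }
}
```

In this sketch the first message prints the entry's toString wrapped in literal braces, while the second prints the key and the actual old and new values.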

github-actions · 2024-12-11T10:11:58Z

Run Gluten Clickhouse CI on x86

github-actions · 2024-12-11T10:50:50Z

Run Gluten Clickhouse CI on x86

github-actions · 2024-12-11T10:52:35Z

Run Gluten Clickhouse CI on x86

Yohahaha

👍

yikf · 2024-12-12T09:25:42Z

@Yohahaha @jackylee-ch @PHILO-HE thanks!

github-actions bot added CORE works for Gluten Core VELOX labels Dec 10, 2024

yikf force-pushed the loadQuantum-followup branch from f12995f to 8c8ae3f Compare December 10, 2024 08:15

yikf mentioned this pull request Dec 10, 2024

[VL] Change loadQuantum default value to 8MB from 256MB #8186

Merged

PHILO-HE reviewed Dec 10, 2024

Yohahaha reviewed Dec 10, 2024

yikf force-pushed the loadQuantum-followup branch from 4bedd2b to aa7f1d2 Compare December 10, 2024 12:17

jackylee-ch reviewed Dec 10, 2024

yikf force-pushed the loadQuantum-followup branch from 3e94164 to 2c1bd22 Compare December 11, 2024 10:11

yikf force-pushed the loadQuantum-followup branch from 2c1bd22 to 4c32021 Compare December 11, 2024 10:50

change the loadQuantum config if velox cache is enabled.

5e0a4c7

yikf force-pushed the loadQuantum-followup branch from 4c32021 to 5e0a4c7 Compare December 11, 2024 10:52

jackylee-ch approved these changes Dec 12, 2024

Yohahaha approved these changes Dec 12, 2024

Yohahaha merged commit 65f4ad1 into apache:main Dec 12, 2024
48 checks passed

yikf deleted the loadQuantum-followup branch December 12, 2024 09:25

yikf added a commit to yikf/incubator-gluten that referenced this pull request Dec 13, 2024

[VL] Change the loadQuantum config if velox cache is enabled (apache#…

763e38b

…8197)
