[VL] Change the loadQuantum config if velox cache is enabled. #8197

Merged: 1 commit, Dec 12, 2024

@yikf yikf commented Dec 10, 2024

What changes were proposed in this pull request?

Follow-up to #8186: change the loadQuantum config only if the Velox cache is enabled.

How was this patch tested?

manual tests

@github-actions github-actions bot added CORE works for Gluten Core VELOX labels Dec 10, 2024

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

Run Gluten Clickhouse CI on x86

@yikf yikf force-pushed the loadQuantum-followup branch from f12995f to 8c8ae3f Compare December 10, 2024 08:15

yikf commented Dec 10, 2024

@Yohahaha @PHILO-HE @jackylee-ch Could you please take a look, thanks!

yikf commented Dec 10, 2024

also cc @FelixYBW

logWarning(
  s"Velox currently only support up to 8MB load quantum size on SSD cache, change config " +
    s"{$LOAD_QUANTUM.key} value from loadQuantum to $maxLoadQuantumOfVeloxCache")
conf.set(LOAD_QUANTUM.key, maxLoadQuantumOfVeloxCache.toString)
Contributor

I note this is a static config. Can it really be reset?

Contributor

I would prefer to check the load quantum in VeloxBackend::initCache and throw when it is larger than 8MB while the cache is enabled.

Contributor Author

@Yohahaha When users keep the default values and enable the Velox cache, throwing directly may not be a good choice.

Modifying the load quantum in VeloxBackend::initCache would also mean checking and modifying backendConf_, which appears to be kept read-only for safety.

I am fine with either approach.

Contributor Author

@PHILO-HE It can be reset. The change happens while the SparkContext is being initialized, before the SparkSession is initialized.

Contributor

> When users use the default values and enable velox cache, throws directly may not be a good choice.

It is a good choice; not all default values fit the Velox cache.

Keeping the check in GlutenPlugin is OK, but modifying the config does not make sense to me. We could add a load quantum check like the one for spark.gluten.sql.columnar.backend.velox.fileHandleCacheEnabled.
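
The two options debated in this thread can be sketched side by side. This is a minimal, self-contained illustration and not the code in this PR: the object name, the helper names `clamp` and `validate`, the returned tuple, and the choice of `IllegalArgumentException` are assumptions made for the example. Only the config key and the 8MB limit come from the discussion.

```scala
object LoadQuantumPolicy {
  // Config key quoted in this PR.
  val LoadQuantumKey = "spark.gluten.sql.columnar.backend.velox.loadQuantum"
  // 8MB limit mentioned in this PR for the Velox SSD cache.
  val MaxLoadQuantumForVeloxCache: Long = 8L * 1024 * 1024

  // Option 1 (hypothetical helper): lower the value and report a warning
  // instead of failing. Returns the effective value and an optional message.
  def clamp(cacheEnabled: Boolean, loadQuantum: Long): (Long, Option[String]) =
    if (cacheEnabled && loadQuantum > MaxLoadQuantumForVeloxCache) {
      val msg =
        s"$LoadQuantumKey changed from $loadQuantum to $MaxLoadQuantumForVeloxCache"
      (MaxLoadQuantumForVeloxCache, Some(msg))
    } else {
      (loadQuantum, None)
    }

  // Option 2 (hypothetical helper): fail fast and leave the config untouched.
  def validate(cacheEnabled: Boolean, loadQuantum: Long): Unit =
    if (cacheEnabled && loadQuantum > MaxLoadQuantumForVeloxCache) {
      throw new IllegalArgumentException(
        s"$LoadQuantumKey must be at most $MaxLoadQuantumForVeloxCache " +
          s"when the Velox cache is enabled, got $loadQuantum")
    }
}
```

With option 1, a user who enables the cache and keeps the default load quantum still gets a running job plus a warning; with option 2, the same user gets an error at startup and has to set the value explicitly.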

Contributor Author

@PHILO-HE @jackylee-ch do you have any thoughts on this?

@@ -2097,13 +2097,12 @@ object GlutenConfig {
      .intConf
      .createWithDefault(1)

  // Velox currently only support up to 8MB load quantum size on SSD.
  val LOAD_QUANTUM =
    buildStaticConf("spark.gluten.sql.columnar.backend.velox.loadQuantum")
      .internal()
      .doc("Set the load quantum for velox file scan")
Contributor

It is worth adding more comments here. E.g., "Recommend to use the default value (256MB) for performance consideration. If Velox cache is enabled, it can be 8MB at most."

@@ -249,6 +249,18 @@ private[gluten] class GlutenDriverPlugin extends DriverPlugin with Logging {
        s"${COLUMNAR_VELOX_CACHE_ENABLED.key} and " +
          s"${COLUMNAR_VELOX_FILE_HANDLE_CACHE_ENABLED.key} should be enabled together.")
    }

    val maxLoadQuantumOfVeloxCache = 8 * 1024 * 1024
Contributor

nit: maxLoadQuantumForVeloxCache

@yikf yikf force-pushed the loadQuantum-followup branch from 4bedd2b to aa7f1d2 Compare December 10, 2024 12:17

) {
  logWarning(
    s"Velox currently only support up to 8MB load quantum size on SSD cache, change config " +
      s"{$LOAD_QUANTUM.key} value from loadQuantum to $maxLoadQuantumForVeloxCache")
Contributor

nit:

s"${LOAD_QUANTUM.key} value from ${loadQuantum} to $maxLoadQuantumForVeloxCache. " +
s"User can set ${LOAD_QUANTUM.key}=$maxLoadQuantumForVeloxCache to skip this warning."
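
The difference this nit points at is easy to miss. In a Scala `s` string, `$name` interpolates only the identifier that follows it, so `{$LOAD_QUANTUM.key}` prints the braces, the entry's `toString`, and a literal `.key`, while a bare `loadQuantum` is plain text. A small sketch of both forms follows; `ConfEntry` is a hypothetical stand-in for Gluten's config entry type, and the two numeric values are just 256MB and 8MB written out in bytes.

```scala
object InterpolationDemo {
  // Hypothetical stand-in for a config entry; not Gluten's real class.
  final case class ConfEntry(key: String)

  val LOAD_QUANTUM = ConfEntry("spark.gluten.sql.columnar.backend.velox.loadQuantum")
  val loadQuantum = 268435456L
  val maxLoadQuantumForVeloxCache = 8388608L

  // As first written: only `$LOAD_QUANTUM` is interpolated;
  // the braces, `.key`, and the word `loadQuantum` stay literal.
  val before: String =
    s"{$LOAD_QUANTUM.key} value from loadQuantum to $maxLoadQuantumForVeloxCache"

  // With `${...}` the full expressions are evaluated.
  val after: String =
    s"${LOAD_QUANTUM.key} value from ${loadQuantum} to $maxLoadQuantumForVeloxCache"
}
```

Here `before` contains `{ConfEntry(...).key}` and the literal word `loadQuantum`, whereas `after` contains the config key and the numeric value.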

@yikf yikf force-pushed the loadQuantum-followup branch from 3e94164 to 2c1bd22 Compare December 11, 2024 10:11

@yikf yikf force-pushed the loadQuantum-followup branch from 2c1bd22 to 4c32021 Compare December 11, 2024 10:50

@yikf yikf force-pushed the loadQuantum-followup branch from 4c32021 to 5e0a4c7 Compare December 11, 2024 10:52

Contributor

@Yohahaha Yohahaha left a comment

👍

@Yohahaha Yohahaha merged commit 65f4ad1 into apache:main Dec 12, 2024
48 checks passed

yikf commented Dec 12, 2024

@Yohahaha @jackylee-ch @PHILO-HE thanks!

@yikf yikf deleted the loadQuantum-followup branch December 12, 2024 09:25
yikf added a commit to yikf/incubator-gluten that referenced this pull request Dec 13, 2024