-
Notifications
You must be signed in to change notification settings - Fork 443
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[VL] Change the loadQuantum config if velox cache is enabled. #8197
Conversation
Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues? https://github.com/apache/incubator-gluten/issues Then could you also rename commit message and pull request title in the following format?
See also: |
Run Gluten Clickhouse CI on x86 |
f12995f
to
8c8ae3f
Compare
Run Gluten Clickhouse CI on x86 |
@Yohahaha @PHILO-HE @jackylee-ch Could you please take a look, thanks! |
also cc @FelixYBW |
Run Gluten Clickhouse CI on x86 |
logWarning( | ||
s"Velox currently only support up to 8MB load quantum size on SSD cache, change config " + | ||
s"{$LOAD_QUANTUM.key} value from loadQuantum to $maxLoadQuantumOfVeloxCache") | ||
conf.set(LOAD_QUANTUM.key, maxLoadQuantumOfVeloxCache.toString) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I note this is a static config, Can it be really reset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to check load quantum in VeloxBackend::initCache
, throws when load quantum larger than 8m when cache enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Yohahaha When users use the default values and enable velox cache, throws directly may not be a good choice.
VeloxBackend::initCache modifying the load quantum fundamentally also involves checking and modifying the backendConf_, which seems to make backendConf_ read-only for safety purposes.
I am fine with either approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PHILO-HE It can be reset, which happens during the initialization stage of the SparkContext, before the initialization of the SparkSession.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When users use the default values and enable velox cache, throws directly may not be a good choice.
it's a good choice, not all default values are fit velox cache.
Keep check in GlutenPlugin is ok, but modify config does not make sense to me, we could add load quantum check like spark.gluten.sql.columnar.backend.velox.fileHandleCacheEnabled
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PHILO-HE @jackylee-ch have any thought on this?
@@ -2097,13 +2097,12 @@ object GlutenConfig { | |||
.intConf | |||
.createWithDefault(1) | |||
|
|||
// Velox currently only support up to 8MB load quantum size on SSD. | |||
val LOAD_QUANTUM = | |||
buildStaticConf("spark.gluten.sql.columnar.backend.velox.loadQuantum") | |||
.internal() | |||
.doc("Set the load quantum for velox file scan") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is worth adding more comments here. E.g., "Recommend to use the default value (256MB) for performance consideration. If Velox cache is enabled, it can be 8MB at most."
@@ -249,6 +249,18 @@ private[gluten] class GlutenDriverPlugin extends DriverPlugin with Logging { | |||
s"${COLUMNAR_VELOX_CACHE_ENABLED.key} and " + | |||
s"${COLUMNAR_VELOX_FILE_HANDLE_CACHE_ENABLED.key} should be enabled together.") | |||
} | |||
|
|||
val maxLoadQuantumOfVeloxCache = 8 * 1024 * 1024 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maxLoadQuantumForVeloxCache
4bedd2b
to
aa7f1d2
Compare
Run Gluten Clickhouse CI on x86 |
1 similar comment
Run Gluten Clickhouse CI on x86 |
) { | ||
logWarning( | ||
s"Velox currently only support up to 8MB load quantum size on SSD cache, change config " + | ||
s"{$LOAD_QUANTUM.key} value from loadQuantum to $maxLoadQuantumForVeloxCache") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
s"${LOAD_QUANTUM.key} value from ${loadQuantum} to $maxLoadQuantumForVeloxCache. " +
s"User can set ${LOAD_QUANTUM.key}=$maxLoadQuantumForVeloxCache to skip this warning."
3e94164
to
2c1bd22
Compare
Run Gluten Clickhouse CI on x86 |
2c1bd22
to
4c32021
Compare
Run Gluten Clickhouse CI on x86 |
4c32021
to
5e0a4c7
Compare
Run Gluten Clickhouse CI on x86 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@Yohahaha @jackylee-ch @PHILO-HE thanks! |
What changes were proposed in this pull request?
#8186 followup, change the loadQuantum config if velox cache is enabled only.
How was this patch tested?
manual tests