Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[oneMKL][DFT] Full revision of DFT parts of specs + addition of {F,B}WD_STRIDES configuration parameters #506

Merged

Conversation

raphael-egan
Copy link
Contributor

@raphael-egan raphael-egan commented Oct 24, 2023

Full revision of the DFT-owned parts of the oneMKL specs. This is motivated by the need to introduce new configuration parameters {F,B}WD_STRIDES to the SYCL APIs of oneMKL's DFT domain, which could not be done in a consistent way without addressing the many inconsistencies of the current form of the specs.

  • The introduction of those new parameters was extensively discussed for (closed-source) oneMKL;
  • The proposal was accepted at oneMKL PxT meeting from 2023/10/17;
  • These suggested changes will be presented at Math SIG meeting on 2023/10/25.

I am keeping an archive of a local build off a commit (referenced in the name of the archive) for easier reading/review of the suggested changes (can be opened in a web browser to read a user-friendly rendered version): onemkl_dft_specs_fwd_bwd_strides_build_9e336a2f8707b4a85cdc11e6b404a49b54960b34.tar.gz

…lue as 1) this is the only appearance of REAL_COMPLEX found in the sources (the specs lack much clarification about it if we keep it there), 2) the changes targeted herein - addition of {F,B}WD_STRIDES - would result in even more nightmare-ish additions of other formats than CCE_FORMAT (and clarifications thereof) that would be supported only for 1D real transfomrs on CPU devices by closed-source oneMKL anyways... Also removing config_param::PACKED_FORMAT as its very existence is questionable (and confusing imo) if CCE is the only format registered in the specs for conjugate-even complex sequences anyways
Copy link
Contributor

@Rbiessy Rbiessy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much better to me overall!

consequence, mix-and-matching either of ``config_param::INPUT_STRIDES`` or
``config_param::OUTPUT_STRIDES`` with either of ``config_param::FWD_STRIDES`` or
``config_param::BWD_STRIDES`` is **not** to be supported and would result in an
exception being thrown at commit time due to invalid configuration, if attempted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this contradicts with the previous sentence since you are saying that setting (FWD|BWD)_STRIDES wipes out any previous value set with (INPUT|OUTPUT)_STRIDES. That would mean the descriptor has to remember that an exception needs to be thrown in set_value and throw it later at commit time. I think it would be easier to debug for a user and simpler to implement if the exception is thrown in set_value.
Also what is the point of wiping out the other values in this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the intention was to actually not burden ourselves with such convoluted logic that would intend to keep track of what was done or not, in which order and, if what was done doesn't make sense to us somehow at some point throw right away (before even getting to commit) for the sole purpose of supporting (in part) something that we are actually deprecating along with a new feature. Usually the full configuration settings are required to determine whether a configuration is consistent or not so a strategy to throw before commit could quickly become nightmare-ish to define, let alone implement.
I can see how the suggested wording might be ambiguous though and I would suggest the following alternative, if desired: " As a consequence, setting either of config_param::INPUT_STRIDES or config_param::OUTPUT_STRIDES along with either of config_param::FWD_STRIDES or config_param::BWD_STRIDES must not to be supported and would result in an exception being thrown at commit time due to invalid configuration, if attempted."
Please let me know if you'd like that and if you agree with it.

More details:

To successfully commit, the descriptor must be fully aware of the intended data layouts in either domain and those must abide by the general consistency requirements (possibly only for one compute direction if using (IN|OUT)PUT_STRIDES). During the deprecation period, the users would have the choice to communicate them

  • either via (F|B)WD_STRIDES;
  • or via (IN|OUT)PUT_STRIDES;

but not via one of each couple of parameters, for instance.

In other words, what is meant to be conveyed here is that one may choose either couple of configuration parameters but not mix-n-match. For instance, a user attempting to configure a descriptor for a backward transform using something like
desc.set_value(config_param::INPUT_STRIDES, ...);
desc.set_value(config_param::FWD_STRIDES, ...);
desc.commit(...);
would receive 1) a warning message from the first operation advising them to prefer (F|B)WD_STRIDES over (IN,OUT)PUT_STRIDES (see following paragraph), 2) an exception thrown at commit time due to inconsistent configuration. Why? Because the second operation wipes out what the first operation did set, resulting in fwd strides being the only strides defined and set internally (Note: the default backward strides would be wiped out by the first operation here above and left unset thereafter). As a result, there is not couple of strides (internally) that satisfies the "general consistency requirement" specified above, therefore the descriptor is not properly configured and it can't commit: an error must be thrown at commit.

The rationale here is to force users to not mix-n-match during the deprecation period. While the above example could arguably be valid for a backward dft thereafter, it would not be for a forward dft. The motivation for introducing (F|B)WD_STRIDES and deprecating (IN|OUT)PUT_STRIDES was precisely to avoid this kind of ill-defined descriptor that can commit to one direction only.

If not opting for such a protocol, there are many more possible use case scenarii that could or could not make sense and would or would not be ill-defined, for instance
1)
desc.set_value(config_param::INPUT_STRIDES, a);
desc.set_value(config_param::FWD_STRIDES, c);
desc.set_value(config_param::OUTPUT_STRIDES, b);
desc.set_value(config_param::BWD_STRIDES, d);
where c != a, c != b, d != a, d != b. Which strides are to be used/considered at commit in that case? Which set_value should throw if any?
2)
desc.set_value(config_param::INPUT_STRIDES, a);
desc.set_value(config_param::OUTPUT_STRIDES, b);
// fwd and bwd strides not set but also left to their default values if not wiped
where default_fwd != a, default_fwd != b, default_bwd != a, default_bwd != b. Which strides are to be used/considered at commit in that case? Which set_value should throw if any?
3) etc.

In such cases, the suggested protocol clarifies what's expected to happen/be implemented, in my opinion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I completely agree that we need some way to prevent the user from mix-and-matching the 2 strides configs. Thinking about it some more I have a slightly different suggestion, we could wipe the other stride config as you suggest but throw the exception in set_value. To compare the 2 suggestions with your examples:

desc.set_value(config_param::INPUT_STRIDES, a);
desc.set_value(config_param::FWD_STRIDES, c);
desc.set_value(config_param::OUTPUT_STRIDES, b);
desc.set_value(config_param::BWD_STRIDES, d);
where c != a, c != b, d != a, d != b
  • With your original proposal, the third set_value wipes fwd_strides and bwd_strides and the fourth set_value wipes input_strides and output_strides. At commit time the descriptor ends up with only bwd_strides set to some values. The descriptor can figure out that an exception needs to be thrown in this case. It cannot know whether the user set fwd_strides to an empty vector or if it was wiped because of setting INPUT_STRIDES or OUTPUT_STRIDES.
  • With my new suggestion, the first set_value is valid and it wipes fwd_strides and bwd_strides. The second set_value sees that fwd_strides are wiped, the exception is thrown now. The user gets the exception earlier.
desc.set_value(config_param::INPUT_STRIDES, a);
desc.set_value(config_param::OUTPUT_STRIDES, b);
// fwd and bwd strides not set but also left to their default values if not wiped
where default_fwd != a, default_fwd != b, default_bwd != a, default_bwd != b
  • With your original proposal, at commit time the descriptor needs to check a boolean to understand that fwd_strides and bwd_strides are wiped but no exception need to be thrown i.e. the user didn't try to set fwd_strides and bwd_strides before setting input_strides, output_strides. It's simple to deduce that input_strides and output_strides must be used.
  • With my new suggestion, the situation is the same than with your proposal except that no booleans are required.

It seems better to me if we can throw the exception in set_value rather than at commit time unless I'm missing something?
In any case, we may also want to add that setting any stride to an empty vector throws an invalid_argument exception to avoid misleading error messages.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think double-setting the strides with both INPUT/OUTPUT and FWD/BWD is a configuration consistency problem and should be caught at commit like all other consistency problems. We can't really tell if we wiped the strides to zero or if the customer set the strides to zero themselves and then reset them to the real strides, which would be valid. I think we should keep any errors thrown at set_value specific to invalid arguments, like an empty vector.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I understand that these specs are not meant to be (nor necessarily become) a "documentation" for the closed-source oneMKL library, one cannot deny the fact that whoever wrote them in the first place was (more than) strongly "inspired" by the behavior and design of the closed-source oneMKL library, however questionable that choice was. That might have been in part because I keep hearing "rumors" that closed-source oneMKL will have to become spec-compliant some time "soon" but I don't actually know more details about that. Anyhow, I meant my changes to basically follow the footsteps of that "original author" (unknown to me) in that sense, i.e., opting for not checking anything configuration-related before commit.

As a developer of the closed-source oneMKL library, my understanding is that anything that relates to users calling set_value is literally "free game" in the following sense: they can use it however many times they want, possibly overwriting previously set values doing so, so long as their usage is not nonsensical (configuration parameters are valid and configuration values are valid, too). The configuration values themselves, the configuration parameters being set, and the order in which they are set, are not meant to be verified on-the-fly in any way - unless the function itself can't make sense of it all, of course (e.g. passing an empty vector in this case, which would be a good reason to make it throw a oneapi::mkl::invalid_argument() as specified in the section "set_value" here and in light of the expected nature for the configuration value that is "std::vector<int64_t> of size $(d +1)$", which is now specified in the table of section "config_param" here) - since all of the configuration settings "may still change" until commit is called.

Please also note that get_value is supposed and expected to write the content of a "std::vector<int64_t> of size $(d +1)$" too, now based on those changes, for any *_STRIDES configuration parameters being queried, at any time (before and/or after commit). Similarly, to set_value the usage of get_value is supposed to be "free game".

@Rbiessy, I have updated the content of section "Configuring strides for input and output data [deprecated, not recommended]" in here. Please review and let me know if that looks ok to you (with a thumbs-up) or comment. The most notable update is that the wording "wipe(d)" is no longer used but the supposedly-not-used stride configuration values are "reset to std::vector<std::int64_t>(d+1, 0)" instead. In light of the updated content therein and the consistency requirements, an exception is to be thrown at commit time for both of your example, and no flag is required either for your case no 2.

Finally, I would like to probe everyone's opinion (@hparks-intel, @t4c1, @Rbiessy, @FMarno) on the following question: the specs used to be written in a way that, for in-place complex descriptors, the output strides were basically ignored and overwritten by the input strides (see the last paragraph of this page). I personally think that is pretty bad and I have (unconsciously) omitted that in my changes so far: does anyone of you want me to re-introduce that in the current changes? Note: if we were to re-introduce it without any additional change, a use case scenario like (desc being an in-place complex descriptor)
desc.set_value(config_param::FWD_STRIDES, a);
desc.set_value(config_param::INPUT_STRIDES, c);
desc.set_value(config_param::BWD_STRIDES, b);
desc.set_value(config_param::OUTPUT_STRIDES, d);
would lead to an error at commit time but
desc.set_value(config_param::FWD_STRIDES, a);
desc.set_value(config_param::OUTPUT_STRIDES, c);
desc.set_value(config_param::BWD_STRIDES, b);
desc.set_value(config_param::INPUT_STRIDES, d);
would not. I don't want to be the person trying to justify that to any user, personally so my personal vote is to not re-introduce it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I understand the idea that set_value and get_value should be "free game" as you say. I think it can lead to confusing errors at commit time but that's how it is.

I'm also happy with the last paragraph that you mention. It makes sense to allow different OUTPUT_STRIDES for in-place transforms IMO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point about potential confusion with the two stride options during the deprecation period, and I'm fine with removing that line for now. It might merit further discussion though about whether "in-place" just means the starting configuration and ending configuration both fit in the same data container, so you could input a column batch and output a row batch to the same memory, or if it means each transform in the batch is truly in-place, which would disallow that sort of transposing the batch. I think the latter was the intent of that provision. It's easier to develop for the scenario where in-place means every transform is in-place, but that doesn't mean it's the best thing for the customers and therefore the spec.

Copy link
Contributor

@FMarno FMarno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of improvements, nice!

Copy link
Contributor

@hparks-intel hparks-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great overall. I just have some minor nitpicking of phrasing so far, and I upvoted a bunch of Finlay's comments.

@raphael-egan raphael-egan changed the title [oneMKL][DFT] Revision of data layout and data storage configurations and addition of {F,B}WD_STRIDES configuration parameters [oneMKL][DFT] Full revision of DFT parts of specs inc. data layout,data storage and addition of {F,B}WD_STRIDES configuration parameters Oct 30, 2023
@raphael-egan raphael-egan changed the title [oneMKL][DFT] Full revision of DFT parts of specs inc. data layout,data storage and addition of {F,B}WD_STRIDES configuration parameters [oneMKL][DFT] Full revision of DFT parts of specs + addition of {F,B}WD_STRIDES configuration parameters Oct 30, 2023
…ayouts.rst

Co-authored-by: Parks, Helen <helen.parks@intel.com>
Copy link
Contributor

@hparks-intel hparks-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bunch of minor comments from a last-pass close read. I've got three more pages to read closely.

raphael-egan and others added 4 commits October 31, 2023 09:36
Co-authored-by: Parks, Helen <helen.parks@intel.com>
Co-authored-by: Parks, Helen <helen.parks@intel.com>
Co-authored-by: Parks, Helen <helen.parks@intel.com>
Co-authored-by: Parks, Helen <helen.parks@intel.com>
Copy link
Contributor

@hparks-intel hparks-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last round of nitpicking from me.

Comment on lines +188 to +189
- for complex descriptors, the offset, stride(s) (and distances, if relevant)
must be equal in forward and backward domain;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relevant to another discussion here, we are already explicitly disallowing different input/output forward/backward strides for in-place transforms with this comment. So there's no change of behavior implied by removing the line about ignoring certain strides for in-place transforms.

Copy link
Contributor Author

@raphael-egan raphael-egan Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, not really... currently something like
// desc is a fresh newly created 2D complex descriptor for sizes (n1 x n2), set for in-place by default
desc.set_value(config_param::INPUT_STRIDES, {0, n2, 1})
// not setting the output strides or even setting them to literally **anything** different than {0, n2, 1}
desc.commit(queue)
is not expected to break since "output strides are ignored" due to the nature of the descriptor and (current) specified behavior for such in-place DFTs. With the changes herein though, the output strides (all of values 0 if not set or not consistent with input strides, if set) would not satisfy the consistency requirement and commit would be expected to fail.

raphael-egan and others added 2 commits October 31, 2023 11:26
…ams.rst

Co-authored-by: Parks, Helen <helen.parks@intel.com>
Co-authored-by: Finlay <finlay.marno@codeplay.com>
Copy link
Contributor

@hparks-intel hparks-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Barring a few minor edits this looks good to me. Thanks!

@spencerpatty spencerpatty merged commit 3717a9a into uxlfoundation:main Nov 2, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants