Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang][SYCL] Allow structs as free function kernel arguments #15334

Merged
merged 19 commits into from
Sep 26, 2024

Conversation

Fznamznon
Copy link
Contributor

@Fznamznon Fznamznon commented Sep 9, 2024

Support for non-decomposed structs with pointers is added.
Doesn't include support for structs containing SYCL special types aka accessors, streams and etc.

@Fznamznon Fznamznon marked this pull request as ready for review September 10, 2024 12:21
@Fznamznon Fznamznon requested review from a team as code owners September 10, 2024 12:21
Copy link
Contributor

@tahonermann tahonermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments on code added. I haven't looked at the test additions in depth yet.

else if (ParamTy->isStructureOrClassType()) {
if (KF_FOR_EACH(handleStructType, Param, ParamTy)) {
CXXRecordDecl *RD = ParamTy->getAsCXXRecordDecl();
visitRecord(RD, Param, RD, ParamTy, Handlers...);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing RD for both the 1st and 3rd arguments seems surprising here. The situation doesn't seem quite analogous to visitField() above. I'm having a difficult time figuring out exactly what visitRecord() is actually intending to do; the owner/wrapper distinction seems weird to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, a lot of code around the visitors is intended to have some "owner" because the original use case is lambda/functor whose fields need to be visited. However it doesn't seem to affect things anyhow and I suspect not having it is fine. Also this comment

// type (which doesn't exist in cases where it is a FieldDecl in the

suggests so.
I transformed this argument to nullptr to avoid confusion.

Comment on lines 1981 to 1987
// Check that the type is defined at namespace scope.
const DeclContext *DeclCtx = RD->getDeclContext();
if (!DeclCtx->isTranslationUnit() && !isa<NamespaceDecl>(DeclCtx)) {
Diag.Report(PD->getLocation(), diag::err_bad_kernel_param_type)
<< ParamTy;
IsInvalid = true;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs updates to handle declarations in ExternCContextDecl and LinkageSpecDecl declaration contexts. We should presumably traverse through those to the enclosing TranslationUnitDecl or NamespaceDecl context. Tests for that would be good; the forward declaration in the integration header should reproduce the enclosing extern "C", extern "C++", etc... context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added handling for LinkageSpecDecl here, however I'm not able to write the code that would give me ExternCContextDecl to handle. Does clang emit it still?

the forward declaration in the integration header should reproduce the enclosing extern "C", extern "C++", etc... context.

I wonder, what would be the benefit of doing that? I suppose linkage declaration contexts shouldn't affect the name. I see that the code generating forward declarations is intentionally skipping LinkageSpecDecl .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into ExternCContextDecl and confirmed that it is never used as either a lexical or semantic context for a declaration (it is used to collect and identify extern "C" declarations that might appear in distinct lexical contexts). So, nothing to be done for it; the change to handle LinkageSpecDecl is all that is needed.

I wonder, what would be the benefit of doing that? I suppose linkage declaration contexts shouldn't affect the name. I see that the code generating forward declarations is intentionally skipping LinkageSpecDecl .

It could affect name mangling for variable and function declarations which in turn could affect mangling elsewhere. A highly contrived example involving kernel names is below. It looks like icx currently fails to handle such cases regardless of whether f() is declared extern "C" though; https://godbolt.org/z/KWhh5xWq9. In general, it looks like icx fails to generate correct integration headers for class templates with non-type template parameters that reference other symbols.

#include <sycl/sycl.hpp>
extern "C" void f();
template<void(*)()> class kernel_name {};
int main() {
  sycl::queue q;
  q.submit([](sycl::handler &h) {
	h.single_task<kernel_name<f>>([]{});
  });
  q.wait();
}

I don't know how important such cases are. Since SYCL doesn't support function pointers in device code, it could be useful to smuggle a function reference through the type system. icx accepts the following example: https://godbolt.org/z/TGW5Ynnjz.

#include <sycl/sycl.hpp>
extern "C" SYCL_EXTERNAL void f();
template<void(&FN)()> struct X {
  void operator()() const {
	FN();
  }
};
int main() {
  sycl::queue q;
  X<f> x;
  q.submit([=](sycl::handler &h) {
	h.single_task<struct KN>([=]{ x(); });
  });
  q.wait();
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it seems the problem here is that we don't support forward declaring functions in the integration header. I'm not sure how useful the examples above, since plain SYCL doesn't allow function pointers anyway. I see that supporting these cases may require big functional changes around integration header generation, so I'm not sure if we should do this as a part of the PR. My preference is to add support for these changes as a separate PR. WDYT?

// TODO
unsupportedFreeFunctionParamType();
// This is a field which should not be decomposed.
CXXRecordDecl *FieldRecordDecl = ParamTy->getAsCXXRecordDecl();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FieldRecordDecl doesn't seem like the right name here. Perhaps ParamRecordDecl?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, thanks for the catch!

@Fznamznon
Copy link
Contributor Author

Fznamznon commented Sep 12, 2024

@intel/dpcpp-cfe-reviewers I also noticed that passing arrays, including passing arrays of pointers when wrapped within a struct seems to be working properly with this PR. Does it make sense to add a test for the case as a part of this PR or is it better to do that separately?

@Fznamznon
Copy link
Contributor Author

Fznamznon commented Sep 13, 2024

If there are no code changes I guess adding it to this PR works. It is up to you. I am fine either way

Ok, this PR is already 500 lines long, let's create a new one for array tests once this one is merged.

Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending resolution of @tahonermann 's comments

@Fznamznon
Copy link
Contributor Author

Ping @tahonermann

Copy link
Contributor

@tahonermann tahonermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good. I suggested some changes, some of which can be argued are out of scope for this PR.

Comment on lines 1964 to 1985
// For free functions all struct/class kernel arguments are forward declared
// in integration header, that adds additional restrictions for kernel
// arguments.
// Lambdas are not forward declarable. So, diagnose them properly.
if (RD->isLambda()) {
Diag.Report(PD->getLocation(), diag::err_bad_kernel_param_type)
<< ParamTy;
IsInvalid = true;
return isValid();
}

// Check that the type is defined at namespace scope.
const DeclContext *DeclCtx = RD->getDeclContext();
while (!DeclCtx->isTranslationUnit() &&
(isa<NamespaceDecl>(DeclCtx) || isa<LinkageSpecDecl>(DeclCtx)))
DeclCtx = DeclCtx->getParent();

if (!DeclCtx->isTranslationUnit()) {
Diag.Report(PD->getLocation(), diag::err_bad_kernel_param_type)
<< ParamTy;
IsInvalid = true;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it would make sense to factor this out to an isForwardDeclarable() function? I looked for an existing one, but didn't find one. There are several cases where a forward declarable declaration is required and I'm already skeptical that we're diagnosing violations correctly. See DiagnoseKernelNameType() for additional code that could be factored out and merged.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines +2064 to +2066
// TODO manipulate struct depth once special types are supported for free
// function kernels.
// ++StructFieldDepth;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we at least diagnose cases that involve SYCL special types now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC this is already diagnosed by calling handleOtherType during visitation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually maybe not. It is not obvious to me whether we will hit that code since we aren't decomposing yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we didn't diagnose because we don't decompose yet. I added diagnosing.

SourceLocation(), SourceLocation(), SourceRange());
}

Expr *createStructTemporary(ParmVarDecl *OrigFunctionParameter) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name suggestion:

Suggested change
Expr *createStructTemporary(ParmVarDecl *OrigFunctionParameter) {
Expr *createCopyInitExpr(ParmVarDecl *OrigFunctionParameter) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

// CHECK-NEXT: UnaryOperator {{.*}} '__generated_Derived1 *' prefix '&' cannot overflow
// CHECK-NEXT: DeclRefExpr {{.*}} '__generated_Derived1' lvalue ParmVar {{.*}} '__arg_S2' '__generated_Derived1'
// CHECK-NEXT: ImplicitCastExpr {{.*}} 'int' <LValueToRValue>
// CHECK-NEXT: DeclRefExpr {{.*}} 'int' lvalue ParmVar {{.*}} '__arg_end' 'int'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test doesn't really validate code gen. Perhaps it should be in clang/test/AST; see the ast-dump-* tests there. Perhaps we should create a clang/test/ASTSYCL directory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test is living in the right folder. We have similar AST checks in SemaSYCL.

This test doesn't really validate code gen

Can you elaborate on what you mean by this? Why should a test in SemaSYCL validate code gen? We have CodeGenSYCL for that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I can elaborate! I apparently looked at the wrong file name and/or suffered a hallucination :)

In general though, I would expect to find tests that validate the AST under clang/test/AST*, tests that validate diagnostics (or lack there of) under clang/test/Sema*, and tests that validate IR under clang/test/CodeGen*. I would admit no surprise to being informed that we are not so disciplined in distributing tests in that way though! :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea we haven't separated AST tests from diagnostic tests. They all live in SemaSYCL. Some tests in SemaSYCL which purely check AST have a -ast appended to the name but we aren't consistent with that either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR follows existing style. If I add just one test to a new directory, this will create a confusion for other contributors on where to add such tests. To avoid the confusion, I think we should move all AST checking tests to a separate directory in a separate PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A separate PR for moving the AST tests makes sense to me.


__attribute__((sycl_device))
[[__sycl_detail__::add_ir_attributes_function("sycl-single-task-kernel", 0)]]
void ff_4(Outer::DefinedWithinAClass S1) { // expected-error {{'Outer::DefinedWithinAClass' cannot be used as the type of a kernel parameter}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diagnostic message could be improved; it doesn't explain why the type can't be used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added notes emitted. Hopefully this improves the situation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this looks good, thank you!

@Fznamznon
Copy link
Contributor Author

@tahonermann , @elizabethandrews I updated the patch according to the last feedback, it suddenly became quite bigger, so could you please take a look again whenever you have time?

Copy link
Contributor

@tahonermann tahonermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. I noted two minor issues to be addressed, but this looks good to me; I don't feel a need to review again.

Comment on lines +4352 to +4355
return UnaryOperator::Create(SemaSYCLRef.getASTContext(), E, UO_Deref,
E->getType()->getPointeeType(), VK_LValue,
OK_Ordinary, SourceLocation(), false,
SemaSYCLRef.SemaRef.CurFPFeatureOverrides());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency with surrounding functions. Clang-format might have other ideas though.

Suggested change
return UnaryOperator::Create(SemaSYCLRef.getASTContext(), E, UO_Deref,
E->getType()->getPointeeType(), VK_LValue,
OK_Ordinary, SourceLocation(), false,
SemaSYCLRef.SemaRef.CurFPFeatureOverrides());
return UnaryOperator::Create(
SemaSYCLRef.getASTContext(), E, UO_Deref,
E->getType()->getPointeeType(), VK_LValue,
OK_Ordinary, SourceLocation(), false,
SemaSYCLRef.SemaRef.CurFPFeatureOverrides());

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clang-format doesn't agree.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-format makes bad choices sometimes :)

def note_free_function_kernel_param_type_not_fwd_declarable : Note<
"%0 is not forward declarable">;
def note_free_function_kernel_param_type_not_supported : Note<
"%0 is not yet supported as free function kernel parameter">;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"%0 is not yet supported as free function kernel parameter">;
"%0 is not yet supported as a free function kernel parameter">;

Copy link
Contributor

@elizabethandrews elizabethandrews Sep 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having different diagnostics for free function parameter type and kernel param type is unnecessary. Other than the diagnostics you added/modified in this PR, won't other invalid types in free functions generate the old diagnostic err_bad_kernel_param_type? IMO I think we can keep the old diagnostic, passing type to it. The note diagnostic can be generated additionally for extra information where required

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the diagnostics could be unified, but that would require changes to the existing diagnostics since their message text explicitly references the concept of a "kernel name". I'm not sure such unification would provide much benefit.

def err_nullptr_t_type_in_sycl_kernel : Error<"%0 is an invalid kernel name, "
                                        "'std::nullptr_t' is declared in the 'std' namespace ">;
def err_invalid_std_type_in_sycl_kernel : Error<"%0 is an invalid kernel name, " 
                                          "%q1 is declared in the 'std' namespace ">;
def err_sycl_kernel_incorrectly_named : Error<
  "%select{%1 is invalid; kernel name should be forward declarable "
  "at namespace scope"
  "|unscoped enum %1 requires fixed underlying type"
  "|unnamed type %1 is invalid; provide a kernel name, or use "
  "'-fsycl-unnamed-lambda' to enable unnamed kernel lambdas"
  "}0">;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used err_bad_kernel_param_type instead of a new one.

@tahonermann , I think @elizabethandrews meant another set of diagnostics.

@Fznamznon
Copy link
Contributor Author

Fznamznon commented Sep 25, 2024

The test failures seem to be unrelated. See #15407 (comment)

@Fznamznon
Copy link
Contributor Author

@intel/llvm-gatekeepers , this is ready for merge. Please merge.

@martygrant martygrant merged commit 4d0d876 into intel:sycl Sep 26, 2024
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants