-
Notifications
You must be signed in to change notification settings - Fork 15.5k
[SYCL] SYCL host kernel launch support for the sycl_kernel_entry_point attribute. #152403
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[SYCL] SYCL host kernel launch support for the sycl_kernel_entry_point attribute. #152403
Conversation
56407fb to
5b42f6b
Compare
7a913b2 to
70f34c3
Compare
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
|
nit: typo in PR description: |
…t attribute.
The `sycl_kernel_entry_point` attribute facilitates the generation of an
offload kernel entry point function with parameters corresponding to the
(potentially decomposed) kernel arguments and a body that (potentially
reconstructs the arguments and) executes the kernel. This change adds
symmetric support for the SYCL host through an interface that provides
symbol names and (potentially decomposed) kernel arguments to the SYCL
library.
Consider the following function declared with the `sycl_kernel_entry_point`
attribute with a call to this function occurring in the implementation of
a SYCL kernel invocation function such as `sycl::handler::single_task()`.
template<typename KernelNameType, typename KernelType>
[[clang::sycl_kernel_entry_point(KernelNameType)]]
void kernel_entry_point(KernelType kerne) {
kernel();
}
The body of the above function specifies the parameters and body of the
generated offload kernel entry point. Clearly, a call to the above function
by a SYCL kernel invocation function is not intended to execute the body
as written. Previously, code generation emitted an empty function body so
that calls to the function had no effect other than to trigger the generation
of the offload kernel entry point. The function body is therefore available
to hook for SYCL library support and is now substituted with a call to a
(SYCL library provided) function template named `sycl_enqueue_kernel_launch()`
with the kernel name type passed as the first template argument, the
symbol name of the offload kernel entry point passed as a string literal for
the first function argument, and the (possibly decomposed) parameters passed
as the remaining explicit function arguments. Given a call like this:
kernel_entry_point<struct KN>([]{})
the body of the instantiated `kernel_entry_point()` specialization would be
substituted as follows with "kernel-symbol-name" substituted for the
generated symbol name and `kernel` forwarded (This assumes no kernel
argument decomposition; if decomposition was required, `kernel` would be
replaced with its corresponding decomposed arguments).
sycl_enqueue_kernel_launch<KN>("kernel-symbol-name", kernel)
Name lookup and overload resolution for the `sycl_enqueue_kernel_launch()`
function is performed at the point of definition of the
`sycl_kernel_entry_point` attributed function (or the point of instantiation
for an instantiated function template specialization). If overload
resolution fails, the program is ill-formed.
Implementation of the `sycl_enqueue_kernel_launch()` function might require
additional information provided by the SYCL library. This is facilitated by
removing the previous prohibition against use of the `sycl_kernel_entry_point`
attribute with a non-static member function. If the `sycl_kernel_entry_point`
attributed function is a non-static member function, then overload resolution
for the `sycl_enqueue_kernel_launch()` function template may select a
non-static member function in which case, `this` will be implicitly passed
as the implicit object argument.
If a `sycl_kernel_entry_point` attributed function is a non-static member
function, use of `this` in a potentially evaluated expression is prohibited
in the definition (since `this` is not a kernel argument and will not be
available within the generated offload kernel entry point function).
Support for kernel argument decomposition and reconstruction is not yet
implemented.
87f0c2b to
2a1c23d
Compare
…#51) * Add support for host kernel launch stmt generation This adds generation of a call to sycl_enqueue_kernel_launch function aka "launcher" function. The launcher function can be a memeber of a class or a free function defined at namespace scope. The lookup is performed from SKEP attributed function scope. Because unqualified lookup requires Scope object present and it only exists during parsing stage and already EOLed at the point where templates instantiated, I had to move some parts of SYCLKernelCallStmt generation to earlier stages and now TreeTransform knows how to process SYCLKernelCallStmt. I also had to invent a new expression - UnresolvedSYCLKernelExpr which represents a string containing kernel name of a kernel that doesn't exist yet. This expression is supposed to be transformed to a StringLiteral during template instantiation phase. It should never reach AST consumers like CodeGen of constexpr evaluators. This still requires more testing and FIXME cleanups, but since it evolved into a quite complicated patch I'm pushing it for earlier feedback. * Remove a fixme from SemaSYCL * Do not crash if original body was invalid * Add AST test for skep-attributed member * Fix a warning * Extend codegen test a bit * Find and replace UnresolvedSYCLKernelNameExpr -> UnresolvedSYCLKernelLaunchExpr * Implement the thing * One more find and replace * I don't know how it looks like * Find and replace again * Switch to UnresolvedSYCLKernelEntryPointStmt * Apply suggestions from code review * Remove log.txt * Implement visiting * Add tests * Apply suggestions from code review Co-authored-by: Tom Honermann <tom@honermann.net> * IdExpr -> KernelLaunchIdExpr * Don't rely on compound * UnresolvedSYCLKernelEntryPointStmt -> UnresolvedSYCLKernelCall * Fix warnings * Rename sycl_enqueue_kernel_launch -> sycl_kernel_launch * Apply suggestions from code review Co-authored-by: Tom Honermann <tom@honermann.net> * Remove array decay * Add windows run line to the sema test --------- Co-authored-by: Tom Honermann <tom@honermann.net>
In case a function with skep attribute is instantiated two times with the same kernel name the attribute is invalid due to the conflicting name. Make sure to exit from instantiation of UnresolvedSYCLKernelCallStmt in this case.
…ivially-copyable (#53) device-copyable doesn't mean trivially-copyable, so we may encounter arguments that need cleanup. Adds test that verifies presence of the dtor call in the synthesized code.
- Added/edited lots of comments. - Expanded testing for sycl_kernel_launch lookup and reorganized the tests to make it easier to identify and/or plug testing gaps. - Added a missing sycl_kernel_entry_point attribute in a test. - Reordered various declarations for better grouping and consistency. - Renamed some variables and functions. - Removed an unnecessary include directive. - Adjusted the location used for the implicit sycl_kernel_launch call. The opening brace of the original function body is used where available and the general function location is used otherwise. - Corrected lookup to unconditionally look for a template name; previously a spurious error about a 'template' keyword could be issued. - Added missing code gen checks for use of a member function as the SYCL kernel entry point function. - Other misc style edits for consistency.
…atic member function with an explicit object parameter.
Reworked diagnostics to include a synthesized code context and generation of notes detailing implicit calls to sycl_kernel_launch().
🐧 Linux x64 Test Results
✅ The build succeeded and all tests passed. |
Fznamznon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just minors. I don't like the size of the PR but it is not horribly huge. I think this should be moved out of draft state.
Prompted by code review, this rewording is intended to clarify the prose and better tie it to the example code. Mention of kernel argument decomposition and reconstruction is removed since such support is not yet implemented. Additional changes include use of identifiers that match the SYCL specification for consistency, avoidance of duplication, a change of std::forward<>() to std::move(), and oher misc edits.
|
@erichkeane, @bader, this PR is now ready for community review. |
| // empty body; the original body is emitted in the offload kernel entry | ||
| // point and the synthesized kernel launch code is only relevant for host | ||
| // compilation. | ||
| return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find myself wondering whether we should insert some sort of 'halt' instruction here. It seems that this is an 'error' case, but we're emitting it if it is ODR used (since we don't have a way of determining reachability).
however, since we're skipping the body, this is obviously just a case where it is UB to actually EXECUTE this function. I know we can now do device asserts, so some sort of panic/terminate/etc would be appropriate here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good idea. I'll poke around to see if I can find a good option.
| } | ||
| } | ||
|
|
||
| if (isa<CXXConstructorDecl>(FD)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is happening in this file change? What relationship does this have to the kernel-launch support?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We previously rejected any non-static member function and that caught constructors and destructors. We now allow non-static member functions for the express purpose of passing this to sycl_kernel_launch(), but still want to reject constructors and destructors, so diagnostics for them now require more explicit support.
| } | ||
| if (const auto *MD = dyn_cast<CXXMethodDecl>(FD)) { | ||
| if (!MD->isStatic()) { | ||
| if (MD->isExplicitObjectMemberFunction()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this patch now allow new 'types' of functions? IMO, this should be a separate patch in a quite sizable patch.
That said, allowing member functions is interesting, but I don't see why we would reject explicit-object-member-functions, they are effectively static?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The motivation for allowing the sycl_kernel_entry_point attribute on non-static member functions is solely so that the call to sycl_kernel_launch will have a this pointer available. This allows the SYCL RT to store per-invocation state in the enclosing class without having to resort to some kind of (probably type unsafe) context pointer to be passed to sycl_kernel_launch.
The support for non-static member functions prohibits (implicit or explicit) use of this within the function body (except for in the implicit call to sycl_kernel_launch()); see the diagnostic added to OutlinedFunctionDeclBodyInstantiator::TransformCXXThisExpr() in clang/lib/Sema/SemaSYCL.cpp. We could potentially do similarly for an explicit object parameter, but the desired semantics aren't clear to me. this is always a pointer and the pointed to object is never a kernel argument. An explicit object type can be a non-reference/non-pointer type. In that case, should it become a kernel argument? Should the implicit call to sycl_kernel_launch() implicitly use the explicit object parameter? With no use case or particular motivation for supporting explicit object parameters, I felt the conservative thing to do is just prohibit their use. We can relax this if motivation arises with clear semantics arises.
| for (const Expr *Arg : Args) { | ||
| ExprValueKind EVK = Arg->getValueKind(); | ||
| const char *ValueCategory = | ||
| (EVK == VK_LValue ? "lvalue" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whats going on here? This sort of printing is more of a 'dump' kinda thing, right? IF we're using this for diagnostics, we've done SOMETHING seriously wrong either in implementation or design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent a long time trying to find a more "right" way to do this without success. I'm open to suggestions. I followed the closest precedent I could find; convertCallArgsToString() as used for __builtin_dump_struct() (which, yes, does more of a 'dump' kinda thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function doesn't actually print anything, so the name is a bit surprising. But I've got concerns with this approach as well.
- We want all diagnostic text to be in the .td file (largely to ease folks who are translating diagnostics into other languages)
- This is using standards terminology in an unhelpful way; value categories are not something we want to force users to reason about. Why is the value category even important?
- I suspect this is going to be verbose in practice given how long C++ type names can be (and this scales with the number of arguments passe).
Can you help me understand how this information helps the user determine how to fix the reported issue? Maybe we don't need to print it at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These diagnostics are intended to help SYCL run-time library implementers. SYCL programmers will only see these diagnostics if the SYCL RT implementation is defective.
The motivation for including the value category is to help SYCL RT implementers debug overload resolution failures. In this PR, the arguments passed to sycl_kernel_launch() (aside from the first which is always a string literal) are always xvalues. That will change in one of the next PRs when we implement support for SYCL library types that require special handling when present within a kernel argument (e.g., as the type of a data member). When that support lands, references to subobjects of those types will be passed as an additional lvalue argument and the corresponding kernel argument will be passed as an lvalue instead of an xvalue. The SYCL RT may use perfect forwarding in its implementation of sycl_kernel_launch() and thus be influenced by whether arguments are passed as lvalues or xvalues. In practice, sycl_kernel_launch() is expected to be implemented something like this:
template<typename KernelName, typename... KernelArgs>
auto sycl_kernel_launch(const char* kn, KernelArgs&&... args) {
// Process each argument individually in order.
set_kernel_argument(std::forward<KernelArgs>(args))...;
}
Take a look at the diagnostics issued for test bad14 in clang/test/SemaSYCL/sycl-kernel-launch.cpp. There is no source code to refer to for the implicit call to sycl_kernel_launch(). If the call arguments were not provided in the diagnostic, it would be quite difficult to understand why the call is ambiguous. In this case, someone who is familiar with how the sycl_kernel_entry_point attribute works would be able to infer that an int was passed, but that becomes more difficult when dependent types are involved; and more difficult still once the support for SYCL special library types lands.
I'm happy to rename the function. print... was chosen for consistency with PrintInstantiationStack(). How about convertCallArgsValueCategoryAndTypeToString() for consistency with convertCallArgsToString?
I would be happy to get suggestions for how better to present this information. I'm not totally opposed to dropping the value category information, but I do think it has potential to be helpful. In the event of a SYCL RT defect leading to a SYCL programmer seeing one of these diagnostics, I would expect all of this detail to be helpful in a bug report filed against the SYCL RT.
I understand the desire to keep all diagnostic text in the .td file for translation purposes. But honestly, from what I've learned from my Unicode involvement, major changes will be needed to Clang's diagnostic representation to support modern translation; it will need to be upgraded to something that can support features provided by the Unicode MessageFormat2 project. If someone comes forward to take that on, addressing cases like this will be a minor distraction.
|
|
||
| template<typename KernelName, typename... Ts> | ||
| void sycl_kernel_launch(const char *, Ts... Args) {} | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this change doing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this PR, any call to a sycl_kernel_entry_point attributed function requires a sycl_kernel_launch declaration to be in scope (similar to how a CUDA kernel call requires a cudaConfigureCall() declaration to be in scope).
|
|
||
| // A generic kernel launch function. | ||
| template<typename KN, typename... Ts> | ||
| void sycl_kernel_launch(const char *, Ts...) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would have been lovely for the purposes of actually reviewing such a large patch if these were just done in a separate review, even if they were ineffectual at the time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The downside of doing so is that a requested change to the name or in how the implicit calls are constructed would have required the declarations to be updated. I understand the goal of making the patch smaller, but I think there is also value in keeping it contained (especially if it were to be reverted later).
|
|
||
| // A generic kernel launch function. | ||
| template<typename KN, typename... Ts> | ||
| void sycl_kernel_launch(const char *, Ts...) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or at least... do we intend the library to support these? Maybe we should have an Inputs header for all of these...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SYCL RT implementation will provide these declarations. For most tests, no implementation is required (this declaration could have omitted the function body).
| template<typename KN, typename KT> | ||
| void sycl_kernel_launch(const char *, KT); | ||
| // expected-error@+4 {{no matching function for call to 'sycl_kernel_launch'}} | ||
| // expected-note-re@+2 {{in implicit call to 'sycl_kernel_launch' with template argument 'BADKN<5>' and function arguments (lvalue of type 'const char[{{[0-9]*}}]', xvalue of type 'BADKT<5>', xvalue of type 'int') required here}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm VERY much not a fan of how this is being displayed/printed here. This is really kind of an awful way of presenting this information. WE have planty of ways of printing argument lists, inventing a new one isn't a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recognize that this is novel and I welcome suggestions for how to do better.
The reason I went with this approach is because there is no source line to refer to, so a diagnostic can't reference a line of code for the (SYCL RT) programmer to refer to in order to understand how arguments are passed. Debugging an overload resolution failure requires knowing the number of arguments, their types, and their value categories. The diagnostic notes for candidates don't necessarily suffice on their own to understand what is going wrong without this additional information.
| // potentially result in additional errors. If construction of | ||
| // 'LaunchIDExpr' failed, then 'SYCLKernelLaunchIdExpr' will be assigned | ||
| // a null pointer value below; that is expected. | ||
| getCurFunction()->SYCLKernelLaunchIdExpr = LaunchIdExpr.get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if I understand correctly, a declaration with no definition is then not diagnosed as an error despite the attribute being invalid, because we never get a function definition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not categorize the attribute as being invalid in this case, but rather that a semantic effect of the attribute encountered an error. That diagnosis cannot be made in the absence of a definition because lookup for sycl_kernel_launch is performed from the point of definition.
| if (FD->isDefaulted()) { | ||
| Diag(SKEPAttr->getLocation(), diag::err_sycl_entry_point_invalid) | ||
| << SKEPAttr << /*defaulted function*/ 3; | ||
| << SKEPAttr << /*defaulted function*/ 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're at the point where this diagnostic should change to use %enum_select instead of %select so that we can use the enumerators here. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The challenge is always where to declare the enumeration. I would prefer to separate the diagnostics rather than introduce additional declarations for the sole purpose of specifying a selection argument. I understand that there has been some sentiment expressed in the past for avoiding duplicated subtext in diagnostics to ease future translation, but I think that advice is misplaced; modern translation may require alterations to the non-substituted text to align with substituted fields (and in more ways than simple plural handling).
Either way, I would prefer to make such a change as a separate PR. I would be ok with landing that PR first and then rebasing this PR on top of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See enum_select in the Attr.td file. We've defined something exactly for this purpose. I'd be against doing it as a separate PR, its bad now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note Aaron is out on PTO until the new year, so this will have to wait for approval anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I wasn't aware that enum_select had been implemented like this. That is much better! I submitted a separate PR (#173122) to switch the existing uses to it and will rebase this PR on that one once it lands.
| SourceLocation Loc) { | ||
| // The current context must be the function definition context to ensure | ||
| // that parameter references occur within the correct scope. | ||
| assert(SemaRef.CurContext == FD); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to add && "explanation" to the assert.
|
|
||
| // Prepare a string literal that contains the kernel name. | ||
| ASTContext &Ctx = SemaRef.getASTContext(); | ||
| const std::string KernelName = SKI->GetKernelName(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| const std::string KernelName = SKI->GetKernelName(); | |
| const std::string &KernelName = SKI->GetKernelName(); |
or StringRef?
| // The current context must be the function definition context to ensure | ||
| // that name lookup and parameter and local variable creation are performed | ||
| // within the correct scope. | ||
| assert(SemaRef.CurContext == FD); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another good place for some && "explanation". (You should assume this comment applies to all asserts.)
| for (const Expr *Arg : Args) { | ||
| ExprValueKind EVK = Arg->getValueKind(); | ||
| const char *ValueCategory = | ||
| (EVK == VK_LValue ? "lvalue" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function doesn't actually print anything, so the name is a bit surprising. But I've got concerns with this approach as well.
- We want all diagnostic text to be in the .td file (largely to ease folks who are translating diagnostics into other languages)
- This is using standards terminology in an unhelpful way; value categories are not something we want to force users to reason about. Why is the value category even important?
- I suspect this is going to be verbose in practice given how long C++ type names can be (and this scales with the number of arguments passe).
Can you help me understand how this information helps the user determine how to fix the reported issue? Maybe we don't need to print it at all?
| /// Retrieve the original statement. | ||
| CompoundStmt *getOriginalStmt() { return cast<CompoundStmt>(OriginalStmt); } | ||
| const CompoundStmt *getOriginalStmt() const { | ||
| return cast<CompoundStmt>(OriginalStmt); | ||
| } | ||
|
|
||
| /// Set the original statement. | ||
| void setOriginalStmt(CompoundStmt *CS) { OriginalStmt = CS; } | ||
|
|
||
| /// Retrieve the kernel launch statement. | ||
| Stmt *getKernelLaunchStmt() { return KernelLaunchStmt; } | ||
| const Stmt *getKernelLaunchStmt() const { return KernelLaunchStmt; } | ||
|
|
||
| /// Set the kernel launch statement. | ||
| void setKernelLaunchStmt(Stmt *S) { KernelLaunchStmt = S; } | ||
|
|
||
| /// Retrieve the outlined function declaration. | ||
| OutlinedFunctionDecl *getOutlinedFunctionDecl() { return OFDecl; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest we remove comments for these setters and getters. They do not seem useful to me as they just repeat the method's names.
The same applies to UnresolvedSYCLKernelCallStmt class methods.
|
|
||
| #. The ``sycl_kernel_launch`` template is expected to be provided by the SYCL | ||
| library implementation. It is responsible for copying the kernel arguments | ||
| to device memory and for scheduling execution of the generated offload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| to device memory and for scheduling execution of the generated offload | |
| to device memory and for scheduling execution of the generated offload |
| library implementation. It is responsible for copying the kernel arguments | ||
| to device memory and for scheduling execution of the generated offload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming that SYCL kernel type decomposition + compiler optimizations eliminate the need for passing some kernel arguments how this information is supposed to be communicated to the SYCL runtime library?
| /// A SYCLKernelCallStmt record. | ||
| STMT_SYCLKERNELCALL, | ||
|
|
||
| /// A SYCLKernelCallStmt record. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, make the difference between STMT_SYCLKERNELCALL and STMT_UNRESOLVED_SYCL_KERNEL_CALL clear in this comment.
Otherwise, it makes sense to unify these statement codes under a single comment like this (i.e. remove a separator):
/// A SYCLKernelCallStmt record.
STMT_SYCLKERNELCALL,
STMT_UNRESOLVED_SYCL_KERNEL_CALL,
/// A GCC-style AsmStmt record.| if (SemaRef.isPotentialImplicitMemberAccess(EmptySS, Result, | ||
| /*IsAddressOfOperand*/ false)) | ||
| // The lookup result allows for a possible implicit member access that | ||
| // would require an implicit or explicit 'this' argument. | ||
| IdExpr = SemaRef.BuildPossibleImplicitMemberExpr( | ||
| EmptySS, SourceLocation(), Result, &TALI, SemaRef.getCurScope()); | ||
| else | ||
| IdExpr = SemaRef.BuildTemplateIdExpr(EmptySS, SourceLocation(), Result, | ||
| /*RequiresADL*/ true, &TALI); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coding standards suggests using braces for such if-statements.
https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements
The
sycl_kernel_entry_pointattribute facilitates the generation of an offload kernel entry point function with parameters corresponding to the (potentially decomposed) kernel arguments and a body that (potentially reconstructs the arguments and) executes the kernel. This change adds symmetric support for the SYCL host through an interface that provides symbol names and (potentially decomposed) kernel arguments to the SYCL library.Consider the following function declared with the
sycl_kernel_entry_pointattribute with a call to this function occurring in the implementation of a SYCL kernel invocation function such assycl::handler::single_task().The body of the above function specifies the parameters and body of the generated offload kernel entry point. Clearly, a call to the above function by a SYCL kernel invocation function is not intended to execute the body as written. Previously, code generation emitted an empty function body so that calls to the function had no effect other than to trigger the generation of the offload kernel entry point. The function body is therefore available to hook for SYCL library support and is now substituted with a call to a (SYCL library provided) function template named
sycl_kernel_launch()with the kernel name type passed as the first template argument, the symbol name of the offload kernel entry point passed as a string literal for the first function argument, and the (possibly decomposed) parameters passed as the remaining explicit function arguments. Given a call like this:the body of the instantiated
kernel_entry_point()specialization would be substituted as follows with "kernel-symbol-name" substituted for the generated symbol name andkernelforwarded (This assumes no kernel argument decomposition; if decomposition was required,kernelwould be replaced with its corresponding decomposed arguments).sycl_kernel_launch<KN>("kernel-symbol-name", kernel)Name lookup and overload resolution for the
sycl_kernel_launch()function is performed at the point of definition of thesycl_kernel_entry_pointattributed function (or the point of instantiation for an instantiated function template specialization). If overload resolution fails, the program is ill-formed.Implementation of the
sycl_kernel_launch()function might require additional information provided by the SYCL library. This is facilitated by removing the previous prohibition against use of thesycl_kernel_entry_pointattribute with a non-static member function. If thesycl_kernel_entry_pointattributed function is a non-static member function, then overload resolution for thesycl_kernel_launch()function template may select a non-static member function in which case,thiswill be implicitly passed as the implicit object argument.If a
sycl_kernel_entry_pointattributed function is a non-static member function, use ofthisin a potentially evaluated expression is prohibited in the definition (sincethisis not a kernel argument and will not be available within the generated offload kernel entry point function).Support for kernel argument decomposition and reconstruction is not yet implemented.