[SYCL][Doc] Initial spec conversion to Sphinx #16540

gmlueck · 2025-01-07T16:24:58Z

The upstream clang project uses reStructuredText and Sphinx for its
documentation, so it makes sense to use these doc tools also for the
SYCL extension specifications. Converting these documents to Sphinx
should remove one of the barriers to upstreaming our SYCL extensions.

This PR converts only a few specifications in order to get feedback on
the style and document structure.

I also took this as an opportunity to update the template we use for
writing new specifications. We've gained a lot of experience since the
template was first written, and these updates demonstrate our new
preferred style for writing specifications. I've also expanded the
template to show sample formatting for common scenarios that occur when
writing SYCL extensions. For example, there is a sample section showing
an extension that adds member functions to an existing class, a sample
section showing an extension that adds new non-member functions, etc.

Come up with a strategy that allows cross-file references to work when the specifications are viewed either on GitHub or in the Sphinx-generated HTML. See the comments in "conf.py" for details.

gmlueck · 2025-01-07T17:04:43Z

Here are a few specific questions I'd like feedback on:

Any comments in general about using reStructuredText instead of Asciidoc?
I intentionally omitted the copyright notice from the new specification documents because I think "Copyright Intel. All rights reserved" would be a problem for upstreaming these documents. Other Sphinx documentation in the llvm project does not seem to have any copyright notice, so I think it would be consistent for our documentation also to have no copyright notice. Note that the other Sphinx documentation does generate a copyright notice in the footer of the generated HTML ("Copyright LLVM Project"). I have not done this as part of this PR, but it probably should be done.
Many of the specification documents refer to the implementation as "DPC++". I'm not sure what to do about this. Should these be changed to say that the implementation is "clang"?
Should the name of an extension be in lower case or upper case? Previously, we always referred to extensions in lower case (e.g. "sycl_ext_oneapi_kernel_compiler"). However, I noticed that the official SYCL KHR extension names are in upper case (e.g. "SYCL_KHR_DEFAULT_CONTEXT"). I don't think this was a conscious decision, but I'd like to be consistent. This PR uses upper case for the newly formatted extensions, but I'm not sure I like that better. Opinions on upper vs. lower case?
I added a TOC at the top of each specification, which is consistent with other Sphinx documentation in the llvm project. I like it. Is anyone opposed?

Also, picking out a couple specific people for feedback:

@Pennycook: I'd appreciate your opinion on the formatting style for the .rst versions of the extensions and also on the guidelines in the updated "template.rst".

@bader: I'd appreciate your opinion on the tooling changes I made in this PR (e.g. "conf.py").

Of course, comments are welcome from everyone also!

gmlueck · 2025-01-07T17:06:42Z

sycl/doc/extensions/proposed/sycl_ext_oneapi_num_compute_units.rst

+   :align: left
+
+   ===========  ==========  =================
+   Device Type  Backend(s)  Number of Domains


@Pennycook: Is "Number of Domains" the right title for this column? This is what you had in the original Asciidoc version, but it seems like "Number of Compute Units" might be better.

Oops. You're right; I based this extension on the abandoned "execution domains" extension, and missed this.

bader

Any comments in general about using reStructuredText instead of Asciidoc?

IIRC, we chose Asciidoc to make it easier to integrate these documents into the SYCL spec. Don't we need to preserve the Asciidoc format?

The more general question: Is the LLVM repository the right place to host SYCL extension specifications? Would it be more appropriate to host them in the Khronos organization, as we do for SPIR-V extensions?

bader · 2025-01-08T00:01:41Z

sycl/doc/conf.py

+# ways:
+#
+#   * From the HTML that is generated from Sphinx, or
+#   * By using a web browser to navigate to the .rst file in the repo.


Do we really need this option?
LLVM and Clang documentation uses :doc: syntax and I guess all users rely on HTML documentation for links to work.

You suggest we use a different link style for writing SYCL documentation to enable cross-document links for GitHub-rendered documents. I'm not sure how we can catch the use of the wrong notation.

Can we ignore the GitHub rendering issue and follow LLVM/Clang link notation (i.e. use https://intel.github.io/llvm/ rendered documentation)?

It is extremely common for us to point people to the GitHub documentation for extensions. We do this all the time in email and internal trackers too. Does https://intel.github.io/llvm/ get updated automatically to reflect the top of the "sycl" (main) branch in the repo? If so, I guess we could get in the habit of pointing people there instead.

One problem could occur if we need to point people to the documentation in a particular branch of the repo. Maybe this is uncommon enough that it doesn't matter, though.

On the other hand, using the alternate link syntax seems easy to do. Is it really so offensive? It would be nice to be able to continue pointing people to the documentation in the repo.
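
On the "how can we catch the use of wrong notation" concern, one possible approach is a small lint over the .rst sources. The sketch below is hypothetical and not part of this PR: it assumes the alternate syntax avoids the Sphinx `:doc:` role (which GitHub's .rst viewer is not expected to resolve), and `find_doc_roles` is an illustrative name. The plain hyperlink in the sample is only a stand-in for whatever notation "conf.py" actually prescribes.

```python
import re

# Matches Sphinx cross-reference roles such as :doc:`target` or
# :doc:`Title <target>`.
DOC_ROLE = re.compile(r":doc:`([^`]+)`")


def find_doc_roles(rst_text):
    """Return (line_number, target) pairs for every :doc: role found."""
    hits = []
    for lineno, line in enumerate(rst_text.splitlines(), start=1):
        for match in DOC_ROLE.finditer(line):
            body = match.group(1)
            # ":doc:`Title <path>`" carries an explicit title; keep the path.
            explicit = re.fullmatch(r".*<([^>]+)>\s*", body)
            target = explicit.group(1) if explicit else body.strip()
            hits.append((lineno, target))
    return hits


# Hypothetical sample text; line 2 uses a plain hyperlink as a stand-in
# for the alternate, GitHub-friendly notation.
sample = """\
See :doc:`template` for the preferred style.
See `the template <template.rst>`_ for the preferred style.
Also :doc:`Queries <supported/sycl_ext_oneapi_free_function_queries>`.
"""

for lineno, target in find_doc_roles(sample):
    print(f"line {lineno}: :doc: reference to {target}")
```

A check like this could run in CI over "sycl/doc"; the inverse check (flagging the alternate notation if the project standardized on `:doc:`) would have the same shape.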

Does https://intel.github.io/llvm/ get updated automatically to reflect the top of the "sycl" (main) branch in the repo?

Yes. We update it every night and each time anything changes in clang/docs/ or sycl/doc/ directories. Exact rules:

llvm/.github/workflows/sycl-docs.yml

Lines 4 to 19 in c064279

    schedule:
      - cron: 0 1 * * *
    pull_request:
      branches:
        - sycl
      paths:
        - '.github/workflows/sycl-docs.yml'
        - 'clang/docs/**'
        - 'sycl/doc/**'
    push:
      branches:
        - sycl
      paths:
        - '.github/workflows/sycl-docs.yml'
        - 'clang/docs/**'
        - 'sycl/doc/**'
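
To make the trigger rules concrete, here is a rough Python approximation of the branch and path filters in the excerpt above. It is only a sketch: `triggers_docs_build` is a hypothetical helper, and `fnmatch` is used as a stand-in for GitHub's own glob matching, which differs in corner cases. The nightly `cron` entry (01:00 UTC) fires regardless of these filters.

```python
from fnmatch import fnmatchcase

# Path filters copied from the workflow excerpt.
DOC_PATHS = [
    ".github/workflows/sycl-docs.yml",
    "clang/docs/**",
    "sycl/doc/**",
]


def triggers_docs_build(branch, changed_files):
    """Approximate the push/pull_request filters.

    The branch (the target branch, for a pull request) must be 'sycl'
    and at least one changed file must match one of the path patterns.
    """
    if branch != "sycl":
        return False
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in DOC_PATHS
    )


print(triggers_docs_build("sycl", ["sycl/doc/extensions/index.rst"]))
print(triggers_docs_build("sycl", ["sycl/source/queue.cpp"]))
```

Under these assumptions, a commit that touches only source files does not rebuild the site until the nightly run.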

I don't know if using an alternative style for SYCL documentation references is a real issue for the project, but is using the Asciidoc format a real blocker for upstreaming SYCL?
The MLIR project managed to get accepted although it uses the Markdown format.
To be fair, the LLVM and Clang sub-projects also use Markdown alongside reStructuredText, but the rendered MLIR documentation has a completely different look from the LLVM/Clang documentation because MLIR uses the Hugo framework to generate its website instead of Sphinx.

I don't know if using an alternative style for SYCL documentation references is a real issue for the project, but is using the Asciidoc format a real blocker for upstreaming SYCL?

I'm not sure. When I talked to @AaronBallman, he seemed to think that converting to RST/Sphinx was important. I'm not sure if it's an absolute requirement, though.

I'd be OK with any of these options, but they all have some sort of down-side:

Option 1

Upstream only the "supported" and KHR extensions (not the "experimental" ones)

Move the "supported" specifications to the Khronos repo and use Asciidoc

Keep the "experimental" and "proposed" extensions only in intel/llvm and use Asciidoc for their specs

Con: Someone needs to refactor the code to remove all the "experimental" extensions before upstreaming

Option 2

Upstream only the KHR extensions (not the "supported" or "experimental" ones)

Keep the "supported", "experimental", and "proposed" extensions only in intel/llvm and use Asciidoc for their specs

Con: Someone needs to refactor the code to remove all the "supported" and "experimental" extensions before upstreaming

Option 3

Upstream the "supported", "experimental", and KHR extensions

Upstream the "supported" and "experimental" extension specs also and convert them to RST/Sphinx

Con: Someone needs to reformat the extension specs into RST/Sphinx, and we need to convert them back to Asciidoc later in order to make them into KHRs.

Option 4

Upstream the "supported", "experimental", and KHR extensions

Upstream the "supported" and "experimental" extension specs also but leave them in Asciidoc format

Con: Unclear if clang community will accept this

To give some idea of the magnitude of the work, there are over 100 extensions ("supported", "experimental", and "proposed" combined), and their specifications total over 40K lines of text.
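
A count like this can be reproduced with a short script. The following is a sketch under assumptions: `count_specs` is a hypothetical helper, and both the `.asciidoc` suffix and the directory passed to it are guesses about the repo layout rather than anything stated in this thread.

```python
from pathlib import Path


def count_specs(root, suffix=".asciidoc"):
    """Return (number_of_files, total_lines) for spec files under root."""
    files = sorted(Path(root).rglob(f"*{suffix}"))
    total_lines = 0
    for path in files:
        # errors="replace" keeps the count going if a file is not valid UTF-8.
        with path.open(encoding="utf-8", errors="replace") as handle:
            total_lines += sum(1 for _ in handle)
    return len(files), total_lines


if __name__ == "__main__":
    # Assumed location of the extension specifications.
    n_files, n_lines = count_specs("sycl/doc/extensions")
    print(f"{n_files} specifications, {n_lines} lines")
```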

@bader, have you given any thought to what we will do with all the other SYCL documents in "sycl/doc" that are not extension specifications? Will we convert this to RST/Sphinx? Will the generated HTML be hosted someplace, or will the files just live in the repo?

EDIT: The total count of extension specifications includes "supported", "experimental", and "proposed". A previous version of this comment omitted the word "supported" from that list.

@bader, have you given any thought to what we will do with all the other SYCL documents in "sycl/doc" that are not extension specifications? Will we convert this to RST/Sphinx?

I didn't plan any changes in sycl/doc for the upstreaming. From day one, we have used the Markdown format for SYCL documentation so that it would be upstream-ready. LLVM sub-projects use Markdown in addition to RST (even more extensively since the move to GitHub, to be friendlier to first-time and irregular contributors).

Will the generated HTML be hosted someplace, or will the files just live in the repo?

I expect the documentation for the SYCL library will be in HTML format and will be hosted at llvm.org like that of other sub-projects. See https://releases.llvm.org/ for examples. To make this happen, we must upstream a minimal version of the working implementation first.

Tagging @sergey-semenov, in case he has other plans. As I understand it, @sergey-semenov is working on upstreaming everything in the sycl directory.

Oh! That's an interesting point -- at some point, we're going to upstream the library stuff at which point we will basically need a separate website (same as many of the other top-level projects in the monorepo).

Yes, all of the extensions I'm talking about are extensions to the SYCL library. None of them are extensions to the clang compiler proper. I don't suppose that changes your opinion on using Asciidoc? :-)

My thinking is that if the experimental extension is implemented/being implemented in upstream, then we'd want documentation upstream that explains it

That makes sense, and it's aligned with my thinking also.

Really? The website sure reads to me like you can take any format they support as input and output RST (because of ↔︎ = conversion from and to in the docs).

That's not how I read the website. The line for Asciidoc has a right-pointing arrow, which the legend defines as "conversion to". If they supported conversion both ways, I think that line would have a double-headed arrow. In addition, the release notes make several mentions of "asciidoc writer", but there is no mention of "asciidoc reader". For comparison, the release notes mention both "rst writer" and "rst reader".

One more comment for now ... I'm not sure if this is a question to @sergey-semenov or to the group as a whole. How do we expect the upstreamed SYCL documentation to be organized? Looking at the LLVM download page, it seems like there are 7 top-level "entry points" for the documentation:

llvm

clang

lld

clang-extra

libc++

poly

flang

Are we thinking that SYCL would be an 8th top-level entry point? Or, would SYCL support be part of "clang"? For comparison, OpenMP documentation is under "clang", so it seems (to me) that it would make sense for SYCL documentation to be there also.

In order for this to work, would we move the SYCL documentation files in the repo to "clang/docs" instead of "sycl/doc"? Or, would we change the CMake rules to build the "clang" documentation from both the "clang/docs" and "sycl/doc" directories?

OpenMP provides compiler pragmas to access its features, so it makes sense to have the OpenMP documentation as part of the compiler documentation.
SYCL provides a library API, so I think adding another top-level entry fits better.

Yes, all of the extensions I'm talking about are extensions to the SYCL library. None of them are extensions to the clang compiler proper. I don't suppose that changes your opinion on using Asciidoc? :-)

It does -- for the SYCL library documentation, I think MLIR is a reasonable precedent to point to for not using RST because it will be its own top-level project in the monorepo and has different needs. I was opposed to mixing and matching within Clang.

That's not how I read the website. The line for Asciidoc has a right-pointing arrow, which the legend defines as "conversion to".

Ah, you may be right! Drat.

SYCL provides a library API, so I think adding another top-level entry fits better.

+1

Excellent! It sounds like we may be converging on Option 4, then:

Option 4

Upstream the "supported", "experimental", and KHR extensions

Upstream the "supported" and "experimental" extension specs also but leave them in Asciidoc format

Is anyone opposed to this?

Still to do, then, is to add build rules to generate HTML from the Asciidoc files and integrate this into the top-level SYCL website.

Pennycook · 2025-01-08T11:05:14Z

sycl/doc/extensions/index.rst

+.. toctree::
+   :maxdepth: 1
+
+   supported/sycl_ext_oneapi_free_function_queries


Until now, we've directed people to GitHub to read our extensions. This syntax doesn't render correctly on GitHub, producing this:

I'm used to finding and reading extensions on GitHub, so I don't really like this. If we did decide to switch everything over to RST, I think we'd need to make sure we do a good job of highlighting somewhere that people should check the online documentation instead. I see pros and cons to this: only the stable documentation would be visible on the web, which is good; but it would be harder for people to read through extensions that are implemented in open-source but not yet reflected in the HTML documentation.

Pennycook · 2025-01-08T11:08:44Z