-
-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce an automated build process #989
Conversation
One question comes to mind - how do we still support distributing the OWL schema as well as our default SHACL schema? |
I don't see how this question is related to this PR - I will try not to change any RDF content (only the severity of one SHACL shape needs to be touched). What am I missing? |
(Answers relate to the build process as it is currently implemented)
No, not without running the build.
No, not in the git repo - but the latest state of the
No, that file is generated during the build and ends up in |
Note: with the introduction of the changelog in its current form we state that we adhere to semantic versioning, and we should require/encourage each PR to update the changelog. |
I have installed Maven as per your instructions and it works as you state. However, is there not a way to make the latest version of QUDT available on our repository so that all I would need is to do a git pull of the repo and I'm back in business? Currently that is how my environment works in TopBraid. I worry that we could lose users if they have to do the extra build steps or even extract the zip file. Stated another way, even if we adopt the architecture you describe, could we not set up a separate folder somewhere that just contains the "built" graphs? Can we not invoke Maven on the Git host? |
It is definitely possible to include the build results in the repo. All we would need to do is un-ignore the Let's weigh advantages and disadvantages of including the build results in the git repo:
Disadvantages:
I am leaning toward only tracking the source files in the git repo, not the results:
Two alternative ideas, if we must have the result files in a git repo:
|
... But I might have misunderstood your specific problem, which is working with the source of any branch (not necessarily main or even a PR). You now expect to find the final (built) version of all files on any branch upon checkout, ie, without running a build. I don't think that will be possible (unless we add the built files to the repo as normal files). However, I think it will only be a problem for people doing qudt admin work. |
@fkleedorfer, you are raising some great issues, and some great architectural candidate solutions. We should definitely talk these through and come to a decision soon. You have captured my concerns as well. |
Well, if we agree that we do not want generated files as part of the normal repo, we can just leave the build system as is in that regard and work on its other functionality (eg version number replacement) If we ever determine that we really need it, we can create the built-branch or built-repo later and just recreate the commits from our release tags (check out tag, build project, copy to branch/repo, commit) and include a step in our github action that pushes the built files onto the branch/repo) |
Regarding the version number replacement: I think the best solution is to put a placeholder in every position where we want the current version to be put.
All of these are RDF string properties or Text, so putting a placeholder like Example: in the source file
becomes
In a normal build (not a release, this would be replaced by the snapshot version, such as:
Whereas in a release build, this would be replaced by the release version, such as:
|
Just so I understand, if we start with a Release of 2.1.44, and several merges are performed from various PRs, do we get: Is that how it works, or does the 45 keep getting incremented with each merge? |
From your earlier comment: Two alternative ideas, if we must have the result files in a git repo:
I'm leaning toward your second idea. We could name the repo something like qudt-distribution as a read-only repo to support users who have git-aware applications that want to just 'git pull' the latest main branch, or even pull some earlier version. This way, people could use the qudt-public-repo with Maven that will support contributions and edits if they want, or qudt-distribution that will reflect the latest Release or snapshot. |
Sorry, I did not explain this. It is actually very simple because there is no magic at all, it's all manual: The version number is in the pom.xml file, near the top:
This is the current version. The convention, as far as I've seen it is this:
So, it's like you suspected:
|
I am fine with this and wonder why we haven't done it before now.
…On Fri, Nov 8, 2024 at 7:15 AM Florian Kleedorfer ***@***.***> wrote:
Just so I understand, if we start with a Release of 2.1.44, and several
merges are performed from various PRs, do we get:
Existing Release: 2.1.44
First PR merge: 2.1.45-SNAPSHOT
Second PR merge: 2.1.45-SNAPSHOT
Next Release: 2.1.45
Is that how it works, or does the 45 keep getting incremented with each
merge?
Sorry, I did not explain this.
It is actually very simple because there is no magic at all, it's all
manual:
The version number is in the pom.xml file, near the top:
<?xml version="1.0" encoding="UTF-8"?>
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>org.qudt</groupId>
<artifactId>qudt-public-repo</artifactId>
<version>2.1.45-SNAPSHOT</version> <--- there
<packaging>pom</packaging>
This is the current version. The convention, as far as I've seen it is
this:
1. while you work on the code and keep making commits, the version
does not change and is always the version you are aiming for with
'-SNAPSHOT' appended. (i.e. 12.34-SNAPSHOT).
2. when you make a release, the maven-release-plugin is run behind the
scenes, which requires as parameters the version you want to release and
the next 'development' (=snapshot) version. We pass these arguments to the
github action that makes the release - the only two arguments it requires.
It changes the version in the pom file, we package that for the release,
and then it changes the version in the pom file to the next development
version. (maybe you wouldn't call that whole process 'manual' as i did
above - but at least there is no automatic version increment)
So, it's like you suspected:
- existing Release: 2.1.44
- First PR merge: 2.1.45-SNAPSHOT
- Second PR merge: 2.1.45-SNAPSHOT
- Next Release: 2.1.45 - unless you decide to release a different
version, which may be warranted depending on what has changed (see semantic
versioning <https://semver.org/>, if we want to adhere to that). You'd
pass 2.1.45 as the release version and 2.1.46-SNAPSHOT as the next
development version in the github user interface to achieve that.
—
Reply to this email directly, view it on GitHub
<#989 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AATQRWNQWS7JCGSLSNOK4OLZ7TIPLAVCNFSM6AAAAABQYBW54WVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDINRVGAYDQNRRGA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
--
Jack
|
3be408a
to
01e299f
Compare
Rebased on main and made all necessary changes. I think we can remove DRAFT status. Also, I think it would be a good idea to switch our settings such that the default action for PRs is to squash and rebase. In case of a squash I would use the commit message of the first commit in this branch as the commit message. |
I'm about to jump into another meeting, but I think your applicableUnits algorithm needs to ignore any deprecated units. I'm getting lots of validation errors because quantity kinds are referring to some deprecated units. Somewhere in your nested SELECT calls you need a
|
Excellent catch @steveraysteveray - I had lowered the severity of the shacl shape to warning to make the build not fail. I had not seen the connection with the applicableUnits calculation. Fixed now. |
Thanks @fkleedorfer, the target files that are in the imports closure now pass the validation cleanly! I stepped back to take a look at where we are now, and I have some observations/questions. I used the quantitykind file to focus my observations.
Are these remnants of a merge conflict? They should be removed, right? Or even ignore that whole folder in your scripts?
(For item 5. above, I'm thinking we have 2 communities we want to support:
For my own work, I expect to jump back and forth between both communities. Group 1 for my own updates to QUDT, and Group 2 for my use of QUDT in committee work. |
3 and 4 are accidents. |
Are people that creat profile distributions in either of your groups?I agree that a large number of QUDT users (I hesitate to say ‘overwhelming’) just want to download or link to the repository and not contribute. The experience for them should be simple and transparent.Having lost track of the path of this discussion through all the threads I would like to test out the build process myself. Where is the documentation?
|
@jhodgesatmb, see the qudt-board Slack channel at 2:42pm on October 28. |
@fkleedorfer, I'm still impressed by your applicableUnit query. I thought I would try to improve it because instances of qudt:SystemOfQuantityKinds also use the qudt:hasQuantityKind relation, so they could be accidentally picked up in other scenarios. I realize that in your pom.xml you only included the unit and quantitykind graphs to do the inferencing, but still I thought it would be good to make it more robust. However, I clearly messed up with my addition of
line in the query, so in the end I just put things back the way they were... |
@jhodgesatmb, to your first question, yes, I would imagine people in group 1 would create profile distributions, and people in group 2 would use such distributions. |
I will look into it, thanks! |
Ok
|
@steveraysteveray another good catch - if I include the 'systems' files, the applicableUnits.ttl contains entries such as:
Added the requirement for type qudt:Unit and verified it produces the same result as the original query without the systems files |
?unit qudt:hasQuantityKind ?qk | ||
?unit a qudt:Unit | ||
?unit qudt:hasQuantityKind ?qk . | ||
?unit rdf:type qudt:Unit . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same statement (?unit rdf:type qudt:Unit
) is also needed in line 79. I did that in a later commit, so don't worry about it.
@steveraysteveray your points 3 and 4 from earlier I cannot reproduce. When you clone the repo in a new folder and check out the branch 959/maven, it looks clean to me |
To your point 2 - it is totally possible to not make the zip file upon |
Did that in the last commit. |
Watching from a distance. Very pleased to see all this. |
In Terms of Tasks left to do in this PR (copied from above): These were the tasks:
I've decided that:
Which leaves us with the question whether or not we should leave the directory structure the way it is now: The former toplevel directories are moved to A bit of background about this structure (what's I would recommend that we keep the structure as it is in this PR at the moment: Advantages:
Disadvantages:
other considerations? |
I am guessing that by 2:42 pm you mean EST or EDT. If I back this 3 hours
it would be 11:42 am PDT. That took me to the github site but I didn't see
a procedure there.
Jack
…On Sat, Nov 9, 2024 at 8:33 AM steveraysteveray ***@***.***> wrote:
@jhodgesatmb <https://github.com/jhodgesatmb>, see the qudt-board Slack
channel at 2:42pm on October 28.
—
Reply to this email directly, view it on GitHub
<#989 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AATQRWOURAQIAVIUDEP6T73Z7Y2PLAVCNFSM6AAAAABQYBW54WVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDINRWGI3TOMRVGU>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
--
Jack
|
@jhodgesatmb there is some documentation on how to build the project in the README.md file in this PR. |
Do I have to check out the repository in order to see the procedure? There
is nothing on the landing page of this PR that says what the starting point
is. I would rather not check out a branch and confuse everything I have
locally, and I cannot seem to see the README just on the github page. If I
go to the 'Code' page and open the README.md file it is the regular README
and nothing about this process.
Jack
…On Mon, Nov 11, 2024 at 12:24 AM Florian Kleedorfer < ***@***.***> wrote:
@jhodgesatmb <https://github.com/jhodgesatmb> there is some documentation
on how to build the project in the README.md
<https://github.com/qudt/qudt-public-repo/blob/43f1193d2c937343b25e5e4fd27bd22d8cc07729/README.md>
file in this PR.
—
Reply to this email directly, view it on GitHub
<#989 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AATQRWMTPME764UD6ZGHYPT2ABSUTAVCNFSM6AAAAABQYBW54WVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDINRXGUYTKOBTGU>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
--
Jack
|
## Functionality of the build system: Run the build with `mvn install` - check RDF source formatting - fail build if there are violations - enable to fix formatting via `mvn spotless:apply` - remove all `qudt:applicableUnits` triples from the quantitykinds file - copy all relevant sources to `target/dist` - replace the version placeholder, `$$QUDT_VERSION$$` everywhere in `target/dist` - infer `qudt:applicableUnits` by applying `src/build/inference/inferApplicableUnits.ttl` and add those triples to the quantitykinds file in `target/dist` - evaluate all SHACL shapes on the build result in `target/dist` and fail the build if there are violations Profile `zip` builds the release zip: `mvn -Pzip install` ## Github Actions Github actions are defined in `.github/workflows`: - maven.yml - runs the build upon push to a PR or when a PR is merged to `main`. In the latter case, the action makes a github release `snapshot` and a tag with that name, which will overwrite the previous such snapshot release - release.yml - manually invokable action that makes a release. Parameter `release_version` and `next_development_version` are required and will be used for making the release and preparing the repo for the next development cycle. This action makes changes to the repo (`pom.xml`, `CHANGELOG.md`), which are committed to a new branch, and a PR to `main` is created during its execution. This PR has to be merged manually when we are happy with the results of the release. ## Changes to sources that were required - introduce a placeholder, `$$QUDT_VERSION$$` wherever the current version is needed - move the rdf source folders from the root dir to `src/main/rdf` (not technically required, but makes clear what is source and what is generated) - add CHANGELOG.md ## Other changes - Folder structure: the `collections` folder ended up as `src/main/validation` ## Documentation - build phases and associated plugin executions are listed in the comments in `pom.xml`
879a6d1
to
d7e5cc4
Compare
I've updated the pom.xml for the new folder structure and squashed everything into one commit with a nice commit message. I think this is ready for merge (preferably rebase, actually) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dist folder loads cleanly into TopBraid 7.1.1, and validates there without errors.
The src/main/rdf folder loads successfully into a separate workspace in TopBraid 7.0 that I can use for the publication.
I will pause to let others do their testing.
@steveraysteveray you'll have to change the code that replaces the version number in those files. Would it not be better to create the release off the results in |
We will need some careful sequencing here. Publication of the web pages is distinct from creation of the GitHub Release. We need access to the src files to update the version numbers and metadata like the date of publication, for both publication and the Release. I cannot have both the src graphs and the dist graphs present in the TopBraid workspace at the same time because of base URI conflicts. So, the sequence could be something like:
So in the near term, I could do the sequence above. The published web pages would have the correct metadata, but the src files would not. Then I could manually invoke just the metadata update routines in a separate workspace, operating on the src files, and push those in a separate PR. This will be a little tricky, but doable I believe. Then we could discuss the testing and execution of the GitHub Release. How does this sound? |
I think, for now, we don't need to release a new version. I expect some bugfixing and tweaking after we merge this PR. So we have time to get the publication workflow fixed. I think it is a good idea to work on that after we merge. Maybe we can automate it fully, too. However, I don't seem to fully understand the problems you describe. For one, if the solution is to do a manual find/replace in I don't understand the problem with the metadata either... Is it that we should do a second replace for the date of publication? It would be no problem to add that to this PR! Using the placeholder approach, it's really quite easy. We can also adapt that approach for IRIs or non-string literals if needed. |
I do not understand the problem either. Perhaps the best way forward is to get the release process working under the build process, and then try to get the web pages to build the same way and compare to the current way they are created.There is no question that this is kind of scary but the upside would be huge.How does all this fit in with Ralph’s project?Jack Hodges, Ph.D.Arbor StudiosOn Nov 12, 2024, at 8:28 AM, Florian Kleedorfer ***@***.***> wrote:
I think, for now, we don't need to release a new version. I expect some bugfixing and tweaking after we merge this PR. So we have time to get the publication workflow fixed.
I think it is a good idea to work on that after we merge. Maybe we can automate it fully, too.
However, I don't seem to fully understand the problems you describe. For one, if the solution is to do a manual find/replace in src and then run the maven build (which replaces the version placeholder with the version), why to the replacing in the first place? And why not just use the release zip (once it has been released on github).
I don't understand the problem with the metadata either... Is it that we should do a second replace for the date of publication? It would be no problem to add that to this PR! Using the placeholder approach, it's really quite easy. We can also adapt that approach for IRIs or non-string literals if needed.
—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: ***@***.***>
|
OK, here's the excerpt from the webservices call to update the metadata in each of the graphs:
and for the catalog:
So if we can do that in this PR, that could work, along with your placeholder approach for the version. Want to take a shot at that? Then I can comment out the respective calls in the script and proceed with publishing the web pages. Happy to drive all this from either the dist folder or the zip file. |
...or we could just merge the current PR and keep working on the publication from there. |
Ok with me.Jack Hodges, Ph.D.Arbor StudiosOn Nov 12, 2024, at 11:53 AM, steveraysteveray ***@***.***> wrote:
...or we could just merge the current PR and keep working on the publication from there.
—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: ***@***.***>
|
That's the better approach. Sounds like we can make these webservice calls from the github action. Let's work on that in a subsequent PR. |
@ralphtq can we merge? |
Oops, I had forgotten to update the changelog! |
This is a - for now prototypical - implementation of a build process based on maven.
Structure
Operations
Note
Still TODO
src/main/rdf
is acceptabletarget
), either use a variable in those places or replace the previous version with the current oneTODOs that we should do after merging this PR but before the next release
TODOs we can do later
Addresses #959