Update fp-002-format.md #1562

Merged: 3 commits, Aug 31, 2021
12 changes: 7 additions & 5 deletions principles/fp-002-format.md
@@ -14,15 +14,17 @@ The ontology is made available in a common formal language in an accepted concre

A common format allows the maximum number of people to access and reuse an ontology.

## Implementation
## Recommendations and Requirements

All ontologies MUST have at least one OWL product whose name corresponds to the registered id. Thus the ontology whose IRI is http://purl.obolibrary.org/obo/ro.owl (known to the OBO Foundry as 'RO'), must have at least the product ro.owl. Developers are free to use whatever combination of technologies and formats is appropriate for development. However, the official OWL PURL for the ontology must resolve to a syntactically valid OWL file using the [RDF-XML](https://www.w3.org/TR/rdf-syntax-grammar/) syntax.
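As a rough illustration of the RDF/XML requirement above, here is a minimal Python sketch of a pre-release sanity check. It is not an OWL validator and not part of any Foundry tooling: the function name `looks_like_rdfxml_owl` is hypothetical, and the check only confirms that the text is well-formed XML, that the root element is `rdf:RDF`, and that an `owl:Ontology` element is present directly under it. Loading the file in ROBOT or Protégé remains the more reliable test.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_NS = "http://www.w3.org/2002/07/owl#"


def looks_like_rdfxml_owl(text: str) -> bool:
    """Heuristic check only (hypothetical helper, not an official validator)."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        # Not well-formed XML at all (e.g. Turtle, OBO, Manchester syntax).
        return False
    # RDF/XML documents have rdf:RDF as the root element;
    # OWL/XML documents have owl:Ontology as the root instead.
    if root.tag != f"{{{RDF_NS}}}RDF":
        return False
    # Expect an ontology header as a direct child of rdf:RDF.
    return root.find(f"{{{OWL_NS}}}Ontology") is not None
```

A file served in OWL/XML or Turtle at the `.owl` PURL would fail this check, which is the situation the requirement is meant to rule out.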

### Recommendations and Requirements
Ontologies can OPTIONALLY produce an OBO-Format file. This is conventionally the same IRI as the owl, but with .owl changed to .obo. Note that an obo product is not listed by default. If you produce an OBO format product, you should register it under the 'products' field in the appropriate metadata file found in this [folder](https://github.com/OBOFoundry/OBOFoundry.github.io/tree/master/ontology).
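The naming convention for the optional OBO product can be sketched in a few lines of Python. The helper name `obo_product_iri` is hypothetical; it only applies the ".owl changed to .obo" rule described above and does not check that the resulting IRI resolves or is registered.

```python
def obo_product_iri(owl_iri: str) -> str:
    """Derive the conventional OBO-format IRI from an OWL product IRI."""
    suffix = ".owl"
    if not owl_iri.endswith(suffix):
        raise ValueError(f"expected an IRI ending in {suffix!r}: {owl_iri}")
    # Swap only the trailing extension, leaving the rest of the IRI untouched.
    return owl_iri[: -len(suffix)] + ".obo"
```

For the RO example in the text, `obo_product_iri("http://purl.obolibrary.org/obo/ro.owl")` gives `http://purl.obolibrary.org/obo/ro.obo`.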
Collaborator

What about OBO Graph JSON?

Contributor Author

(Just an FYI: The text above was pulled from existing text found within the criteria for review link for this principle.)

I don't know enough about that format. I would say that any format convertible by ROBOT or Protege is fair game. Is that the case for OBO Graph JSON? If so, I'll generalize the statement and add that as another example.

Collaborator

Yes, ROBOT will also generate OBO Graph JSON. I think you're right that it's good to mention that any other exports will probably help people.

Member

My suggestion would be to replace the first sentence of this paragraph with: "Ontologies can OPTIONALLY produce products in other formats. For historical reasons, OBO-Format files are often produced." Then the rest of the paragraph (as written) is an example of a .obo product.


We make a distinction between how an ontology is developed and how it is presented for release. Developers are free to use whatever combination of technologies and formats is appropriate for development. However, the official OWL PURL for the ontology must resolve to a syntactically valid OWL file using the [RDF-XML](https://www.w3.org/TR/rdf-syntax-grammar/) syntax.
It does not matter to us if you maintain the source for your ontology in obo or owl or some hybrid. You have the option of either publishing the alternate format yourself (using a tool like ROBOT) or you can have the OBO central build pipeline do this for you. For more information, see the FAQ entry What is the Build field?.

Note: some groups publish an .obo version, and the OBO Foundry pipeline takes care of making the valid .owl file. See the FAQ for details. You may also submit the ontology for review as OBO, see 'criteria for review' below.
## Implementation

Note also that previously we recommended that ontologies may be available in Manchester syntax or OWL-XML, but we have revised this in order to make the official OWL release consumable by a wider variety of tools.
ROBOT offers functionality to convert a variety of formats, including OBO, to RDF/XML. Protégé allows you to save ontologies in RDF/XML, as well. The [Ontology 101 Tutorial](https://ontology101tutorial.readthedocs.io/en/latest/StartingProtege.html) has directions on starting and saving in Protégé.
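A minimal Python sketch of driving the ROBOT conversion from a build script follows. It assumes the `robot` command is installed and on the PATH, and that ROBOT infers the output syntax from the file extension, with `.owl` producing RDF/XML. The function names and the file names `ro.obo` and `ro.owl` are illustrative only.

```python
import shutil
import subprocess


def build_convert_command(source: str, target: str) -> list[str]:
    """Assemble a `robot convert` invocation (format inferred from the target extension)."""
    return ["robot", "convert", "--input", source, "--output", target]


def convert_to_rdfxml(source: str, target: str) -> None:
    """Run the conversion; raises if ROBOT is missing or reports an error."""
    if shutil.which("robot") is None:
        raise RuntimeError("robot was not found on PATH")
    # check=True turns a non-zero exit status (e.g. a malformed input file)
    # into a CalledProcessError instead of silently continuing.
    subprocess.run(build_convert_command(source, target), check=True)
```

Calling `convert_to_rdfxml("ro.obo", "ro.owl")` would then fail loudly when ROBOT rejects the input, which doubles as a basic format check on the source file.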
Collaborator

Any recommendations on validating the OBO file format if curation is done primarily there?

Contributor Author

There is an ongoing discussion about whether or not the Foundry will take up the task of converting file formats. If the answer is 'yes', then we'd indeed need to say something about validating OBO format (and OBO Graph JSON and Manchester and others). Perhaps it would be enough to recommend that the developers at least attempt to use their file as input for ROBOT or Protege, both of which would complain if the format is incorrect.

Collaborator

For parsing OBO, there is also @althonos's pronto package. I'd recommend it with the caveat that it's quite strict, and curators don't tend to like being told they've made mistakes by a computer program.

Member

@althonos, Aug 4, 2021

For the purpose of validation, I also wrote fastobo-validator, which is available in the ODK. It's a bit better at reporting errors (and can detect some additional issues pronto can't find, like frames with duplicate definitions or broken ISBN references).

Contributor

All the curators I know love to be told they made a mistake by a computer program. In fact, many of them love to code nowadays!

Contributor

@nataled I don't think we should take on any such responsibilities (transformations into OBO etc.). We need to decentralise and enable (fishing rod) through tools and training rather than providing services that maintain technical illiteracy. I would just add a link to fastobo-validator here as a way for people to validate their OBO files (90% of all OBO files use ROBOT's ominous `--check false`, which means they are broken). As EWG you need to decide whether we would want to require non-broken OBO as part of dashboard checking. I think that's a huge churn, but possibly useful. No idea what's best.

Collaborator

@matentzn you're lucky to know such good curators! I am thinking back to the dark days of my PhD where I became a heretic when introducing the PyBEL biological expression language validator.

I agree with @matentzn: centralizing work on the data itself wouldn't be sustainable. Reporting on what could be converted might be a useful metric, but I wouldn't say it should be required at the moment.

Contributor Author

@matentzn @cthoyt the decision to provide or not provide conversion services is not one the EWG makes. ALL decisions on principles are made by Ops; the EWG just puts them into words. The wording reflected above is merely a rewrite of what's already given in the principle and in the FAQ (IIRC). As I mentioned in my previous comment, there is an unfinished discussion about what services the Foundry is willing to provide in this regard. It's been proposed that we do NOT provide the conversion service, and so far as I recall, that seems to be the way it will be, but, again, until it is finalized the EWG cannot take the sentence out.

Member

I don't see the need to mention OBO-format validators on the Principle page. In general, there's so much to know about ontology development best practises, and that sort of documentation should live somewhere else.


### Examples
