Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metadata should be a nested arbitrary KV store #185

Open
earthboundkid opened this issue Feb 23, 2018 · 13 comments
Open

Metadata should be a nested arbitrary KV store #185

earthboundkid opened this issue Feb 23, 2018 · 13 comments

Comments

@earthboundkid
Copy link

I started using the metadata feature and was surprised to find that values are not inherited from the root object down to specific databases and tables. This makes metadata much less useful and requires a lot of pointless duplication.

Ideally, metadata should allow arbitrary key-value pairs, and there should be a way of accessing metadata either in an inherited or non-inherited manner. Something like metadata.page.key vs. metadata.this.key might work as an interface.

@simonw
Copy link
Owner

simonw commented Mar 4, 2018

Are you talking specifically about accessing metadata from HTML templates? That makes a lot of sense, I'll think about how this could work.

@earthboundkid
Copy link
Author

earthboundkid commented Mar 5, 2018

Yes. I think the simplest implementation is to change lines like

        metadata = self.ds.metadata.get('databases', {}).get(name, {})

to

metadata = {
    **self.ds.metadata,
    **self.ds.metadata.get('databases', {}).get(name, {}),
}

so that specified inner values overwrite outer values, but only if they exist.

@simonw
Copy link
Owner

simonw commented Mar 27, 2018

OK, I have an implementation of this. I realised that not ALL metadata should be inherited: it makes sense for source/source_url/license/license_url to be inherited, but it doesn't make sense for the title and description to be inherited down to the individual databases and tables.

simonw added a commit that referenced this issue Mar 27, 2018
…tadata

If you set the source_url/license_url/source/license fields in your root
metadata those values will now be inherited all the way down to the database
and table templates.

The title/description are NOT inherited.

Also added unit tests for the HTML generated by the metadata.

Refs #185
@simonw
Copy link
Owner

simonw commented Mar 27, 2018

One thing that's missing from this: if you set source/license data at the individual database level they should be inherited by tables within that database.

@simonw
Copy link
Owner

simonw commented Mar 27, 2018

Also needed: the ability to unset metadata. If the root metadata specifies a license_url it should be possible to set "license_url": null on a child database or table. The current implementation will ignore null (or empty string) values and default to the top level value.

I think the templates themselves should be able to indicate if they want the inherited values or not. That way we could support arbitrary key/values and avoid the application code having special knowledge of license_url etc.

@earthboundkid
Copy link
Author

I think the templates themselves should be able to indicate if they want the inherited values or not. That way we could support arbitrary key/values and avoid the application code having special knowledge of license_url etc.

Yes, you could have metadata that works like metadata does currently and inherited_metadata that works with inheritance.

@earthboundkid
Copy link
Author

It would be nice to also allow arbitrary keys (maybe under a parent key called params or something to prevent conflicts). For our datasette project, we just have a bunch of dictionaries defined in the base template for things like site URL and column humanized names: https://github.com/baltimore-sun-data/salaries-datasette/blob/master/templates/base.html It would be cleaner if this were in the metadata.json.

@simonw
Copy link
Owner

simonw commented Mar 27, 2018

I am SO inspired by what you've done with https://salaries.news.baltimoresun.com/ - that's pretty much my ideal use-case for Datasette, and it's by far the most elaborate customization I've seen so far. I'd love to hear other ideas that came up while building that.

@earthboundkid
Copy link
Author

@simonw Other than metadata, the biggest item on wishlist for the salaries project was the ability to reorder by column. Of course, that could be done with a custom SQL query, but we didn't want to have to reimplement all the nav/pagination stuff from scratch.

@carolinp, feel free to add your thoughts.

@simonw
Copy link
Owner

simonw commented Apr 9, 2018

@carlmjohnson in case you aren't following along with #189 I've shipped the first working prototype of sort-by-column - you can try it out here: https://datasette-issue-189-demo-2.now.sh/salaries-7859114-7859114/2017+Maryland+state+salaries?_search=university&_sort_desc=annual_salary

@simonw simonw added the feature label Jul 10, 2018
@simonw simonw added this to the Next release milestone Jul 24, 2018
@simonw
Copy link
Owner

simonw commented Aug 11, 2018

I've been worrying about how this one relates to #260 - I'd like to validate metadata (to help protect against people e.g. misspelling license_url and then being confused when their license isn't displayed properly), but this issue requests the ability to add arbitrary additional keys to the metadata structure.

I think the solution is to introduce a metadata key called extra_metadata_keys which allows you to specifically list the extra keys that you want to enable. Something like this:

{
    "title": "My title",
    "source": "Source",
    "source_url": "https://www.example.com/",
    "release_date": "2018-04-01",
    "extra_metadata_keys": ["release_date"]
}

@earthboundkid
Copy link
Author

That seems good to me.

@simonw simonw removed this from the Next release milestone May 13, 2019
@sivy
Copy link

sivy commented Dec 21, 2024

Here to vote for extra_metadata_keys if it would allow us to set custom metadata for databases and tables. :)

Use case -- I'm socializing datasette at work but our use case right now is writing tooling to download various data from government websites and other sources and consolidating them into one sqlite db that I can wrap in datasette to let us play with looking for connections in the data. But I really want to be able store source metadata (name, about url, download url, license, and some other internal stuff) at the table or even column level.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants