
Create model schema from a JSON file #1054

Merged: 12 commits merged into main from 992-schema-json-file on Mar 5, 2025

Conversation

@datejada (Member) commented Feb 26, 2025

This pull request reads the schema from a JSON file, which also includes the description, units, default value, and constraints of each parameter in the model. The main changes are:

Configuration Management:

Code Enhancements:

Schema Handling:

  • src/input-schemas.jl: Refactored the schema definitions to read from a JSON file instead of hardcoding them in the script. This change simplifies the maintenance and updates of schema definitions.
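
  A minimal sketch (not the PR's actual code; it assumes the JSON.jl and OrderedCollections.jl packages and a hypothetical layout in which each column entry carries a "type" field alongside the extra metadata) of what the refactored loading step could look like:

  using JSON: parsefile
  using OrderedCollections: OrderedDict

  # Hypothetical layout: table name => column name => {"type", "UoM", "default", "description", "constraints"}
  raw = parsefile("input-schemas.json"; dicttype = OrderedDict)

  # Rebuild the plain column => SQL-type mapping that the rest of the code already uses.
  schema_per_table_name = OrderedDict(
      table => OrderedDict(col => props["type"] for (col, props) in cols) for (table, cols) in raw
  )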

Documentation updates:

  • docs/src/50-schemas.md: Updated the documentation to inform users that schemas can now be found in the input-schemas.json file and provided an example code snippet for generating Markdown documentation from the JSON schema.
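
  The snippet from the docs is not reproduced here; as a hedged sketch (field names such as "type", "UoM", "default", and "description" are assumed from the discussion in this thread), generating Markdown tables from the JSON could look roughly like this:

  using JSON: parsefile

  schemas = parsefile("input-schemas.json")
  cell(x) = x === nothing ? "" : string(x)   # render JSON null as an empty cell

  open("schemas.md", "w") do io
      for (table, columns) in schemas
          println(io, "### ", table, "\n")
          println(io, "| Name | Type | UoM | Default | Description |")
          println(io, "|------|------|-----|---------|-------------|")
          for (col, props) in columns
              println(io, "| ", col,
                  " | ", cell(get(props, "type", nothing)),
                  " | ", cell(get(props, "UoM", nothing)),
                  " | ", cell(get(props, "default", nothing)),
                  " | ", cell(get(props, "description", nothing)), " |")
          end
          println(io)
      end
  end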

Related issues

Closes #992, #1039

Checklist

  • I am following the contributing guidelines
  • Tests are passing
  • Lint workflow is passing
  • Docs were updated and workflow is passing


codecov bot commented Feb 26, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.79%. Comparing base (aefad2e) to head (87cf5ee).
Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1054   +/-   ##
=======================================
  Coverage   97.79%   97.79%           
=======================================
  Files          29       29           
  Lines         951      951           
=======================================
  Hits          930      930           
  Misses         21       21           

☔ View full report in Codecov by Sentry.

@datejada added the benchmark (PR only - Run benchmark on PR) label on Feb 26, 2025
github-actions bot (Contributor) commented Feb 26, 2025

Benchmark Results

Time:

                                       aefad2e... (base)    87cf5ee... (head)    aefad2e... / 87cf5ee...
energy_problem/create_model            34.5 ± 3.4 s         35.1 ± 2.8 s         0.981
energy_problem/input_and_constructor   26.1 ± 0.062 s       26.2 ± 0.086 s       0.998
time_to_load                           2.59 ± 0.015 s       2.61 ± 0.0048 s      0.991

Memory:

                                       aefad2e... (base)          87cf5ee... (head)          aefad2e... / 87cf5ee...
energy_problem/create_model            0.204 G allocs: 11.4 GB    0.204 G allocs: 11.4 GB    1
energy_problem/input_and_constructor   26.8 M allocs: 0.952 GB    26.8 M allocs: 0.952 GB    1
time_to_load                           0.159 k allocs: 11.2 kB    0.159 k allocs: 11.2 kB    1

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

@datejada (Member, Author):

@suvayu and @abelsiqueira these are the changes to have the schema (and more) in a JSON file. I have updated the documentation to create the parameter section from this file, including the units of measurement, description, constraints, and default values. Your comments and suggestions are more than welcome 😉

@clizbe (Member) commented Mar 1, 2025

This is great! But now that we want the true schema inside the model, we should probably (after merging this) do another update that makes this JSON whatever format we want the user to send to TEM. I guess that's the User Format and then we use pre-processing to make the internal format. Let's discuss on Monday. :)

@clizbe (Member) left a comment

Thanks for doing this! I'm happy about this change but have a couple of questions.

@@ -2,7 +2,7 @@

 The optimization model parameters with the input data must follow the schema below for each table. To create these tables we currently use CSV files that follow this same schema and then convert them into tables using TulipaIO, as shown in the basic example of the [Tutorials](@ref basic-example) section.

-The schemas can be accessed at any time after loading the package by typing `TulipaEnergyModel.schema_per_table_name` in the Julia console. Here is the complete list of model parameters in the schemas per table (or CSV file):
+The schemas can be found in the `input-schemas.json` or can be accessed at any time after loading the package by typing `TulipaEnergyModel.schema_per_table_name` in the Julia console. Here is the complete list of model parameters in the schemas per table (or CSV file):
Member:

Also, does this command produce a proper output now? When I tried it, it only gave me a statement of "structure with 20 rows" or something lame but not the actual output.
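
As a side note, here is a minimal sketch (not from this thread) of printing the full contents in the Julia REPL, assuming `schema_per_table_name` maps table names to column => type pairs:

using TulipaEnergyModel

for (table, schema) in TulipaEnergyModel.schema_per_table_name
    println(table)
    for (column, type) in schema
        println("  ", column, " => ", type)
    end
end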

@suvayu (Member) left a comment

Good first pass!

Comment on lines +193 to +194
null,
"basic"
Member:

What is the intention here? Does null mean no unit_commitment? In that case does TEM still expect this column in the tables?

I think the logically correct way to do this would be something like this:

{
  "unit_commitment_integer": {...},
  "unit_commitment_method": {
    ...
    "constraints": {
      "oneOf": ["basic", "something-else"]
    }
  },
  "required": ["unit_commitment_integer"]
}

This means there will always be a unit_commitment_integer column, and unit_commitment_method will be present only when the previous one is true (which could be checked in code).

I understand if the current structure in TEM does not allow that; in that case, this could be a future improvement.
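
For illustration, a rough sketch (hypothetical table name, not TEM's actual validation code) of how such a conditional requirement could be checked:

using DataFrames

allowed_methods = ("basic", "something-else")
# `unit_commitment_method` only matters when `unit_commitment_integer` is true for that row.
for row in eachrow(assets_df)   # assets_df: hypothetical DataFrame built from the asset table
    if row.unit_commitment_integer
        @assert row.unit_commitment_method in allowed_methods
    end
end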

@abelsiqueira (Member) left a comment

I made some comments. I don't know much about JSON Schemas, so take it with a grain of salt.
Another comment is that since the JSON schema is a particular type of file, it could benefit from a linter (quick googling gave this option: https://github.com/python-jsonschema/check-jsonschema). @suvayu might have a better recommendation.

I got a bit confused on how you're planning to use this, but since you're already in discussion with @suvayu on this, I will leave the details to you

"assets_profiles": {
"asset": {
"UoM": null,
"constraints": null,
Member:

There might be a way to constrain the value of asset to be one of the values from the asset table. I don't actually know if it is possible.

Member:

Technically it's possible, but we are not going to use the schemas fully as intended (I'll respond to your earlier "how to use" question separately), so this is something we could do explicitly in code for now.

Member Author:

ok! I will check if it is possible. Thanks!

@suvayu (Member) commented Mar 3, 2025

> I got a bit confused on how you're planning to use this, but since you're already in discussion with @suvayu on this, I will leave the details to you

So while discussing this before starting on the change, I realised that doing this "properly" might be difficult because I didn't know of any tooling within Julia to do this kind of schema-based validation. So we essentially simplified the schema into a map of mini schemas. Using it then becomes simpler, because you don't do complex validation anymore; it's just for one column.

E.g. you will see it doesn't define the columns properly; something like this would be the proper way to do it:

{
  "name": "string",
  "type": "object",
  "properties": [
    {
      "name": "type",
      "type": "string",
      "oneOf": ["INTEGER", "DOUBLE", "VARCHAR", "BOOL"]
    },
    {
      "name": "constraints",
      "type": "object"
    },
    {
      "name": "UoM",
      "type": "string"
    },
    {
      "name": "default"
    }
  ],
  "required": ["type"]
}

But then you need libraries that generate these schemas, and/or that can then validate data using these later (e.g. pydantic, or marshmallow in Python).

So we simplified our approach to what we can use effectively. It is now a slightly more extendable version of the OrderedDict already in input-schemas.jl, adding a few additional fields such as units, default, description, etc. The use in code is unchanged: you get the same OrderedDict as before, but in the docs you can have richer, more informative lists/tables.

I hope that answers your question.

EDIT: If you do it "properly", then you can use references and restrict a value based on what you had somewhere else. But there are still some limits to it: since JSON Schema does not have a generic conditional primitive, it's all predefined, so what you can do is limited.

@abelsiqueira (Member) left a comment

Thanks for the extra info, @suvayu.
LGTM

@datejada force-pushed the 992-schema-json-file branch from 039310f to d796722 on March 4, 2025 17:40
@datejada (Member, Author) commented Mar 4, 2025

Hi @suvayu and @clizbe, after giving it some more thought, I decided to adopt different schemas for assets_profiles and assets_timeframe_profiles. While they share the same columns, their constraints are different. This distinction wasn't relevant before we introduced the JSON file, which is why we previously reused the schema. Now that it is relevant, having separate schemas for each file allows us to establish a direct 1:1 relationship between the table definitions and their schemas. This change will streamline the code, making it cleaner and less repetitive.

Please let me know if you agree or if you have any additional comments or suggestions.

Thank you!

@datejada requested review from suvayu and clizbe on March 4, 2025 18:49
@datejada requested a review from suvayu on March 5, 2025 08:38
@clizbe (Member) commented Mar 5, 2025

@datejada WOW that was a good decision because the new implementation is gorgeous! 😍

@datejada requested a review from clizbe on March 5, 2025 10:41
@clizbe (Member) left a comment

Looks GREAT!
Sorry for the slow review.

@clizbe merged commit ae662fe into main on Mar 5, 2025
8 checks passed
@datejada deleted the 992-schema-json-file branch on March 6, 2025 07:16
@clizbe mentioned this pull request on Mar 6, 2025
Labels: benchmark (PR only - Run benchmark on PR)
Successfully merging this pull request may close these issues.

Store schema in a JSON file
4 participants