516 dynamic tables and columns #718

Open
Simon-Will wants to merge 37 commits into OpenEnergyPlatform:develop from Simon-Will:516-dynamic-tables-and-columns

Conversation

@Simon-Will
Contributor

@Simon-Will Simon-Will commented Jan 24, 2026

A first attempt at creating tables based on the MaStR XSD files.

Checklist until this is really ready:

  • Download current documentation and create database model from it
  • Use new database model for all the insertion code
  • Use fallback XSD for when using the current documentation fails for some reason
  • Implement CSV export
    • I radically simplified the existing CSV export. It was pretty complex, joined several tables and backfilled the basic units table. I found that a bit much for an export. There's probably a way to make it work in much the same way as it used to work, but I frankly didn't want to spend the time to fully understand all of what's going on there. Let's talk about it!
  • Implement translation feature
  • Give the user an easy way to use the mastr_table_to_db_model returned from Mastr.generate_data_model, e.g. by adding a function that generates a Python code snippet with the SQLAlchemy models/tables.
    • I solved this by having Mastr.generate_data_model return SQLAlchemy core tables, not ORM models. They are easy to just print and a user can then copy them to their code & modify them. They are also the best common ground. After all, some users might not use the ORM.
    • There's also a function format_mastr_table_to_db_table that makes printing easy for the user.
  • Clear up the date situation. I made a couple of changes to utils_download_bulk.py because I found the date handling unnecessarily complex. #697 ("Add interactive download functionality for MaStR date selection #696") changes the same code and adds support for retrieving available XML download dates. If #697 is merged, we have to update this retrieval logic to also retrieve the documentation download dates.
    • I reverted my changes regarding the dates.
    • I added the docs download to the download browsing, etc. Note that some old XSD files are invalid (e.g. 20240101) and cannot be read with XMLSchema. We fall back to the XSD files in the library in that case.
  • Think about how we handle the transition from users' existing databases, especially w.r.t. translated databases and all the renamed columns.
    • My proposal: Since it is extremely difficult to provide an upgrade path from the old table & column names to the way they are done now, I think we should just tell users to adjust their existing queries so they fit the newest open-mastr version. This is fine imo because this whole thing here will trigger a major version bump anyway.
    • Please let's talk about table name translations. I'm using the old names here in this new code, but would rather like to create new names that are closer to the names of the original MaStR export files.
  • Create usage examples
  • Address a couple of open points
    • How to determine primary key of tables? By hardcoding it for MaStR tables we know? Or by checking the available columns and choosing the most likely one based on some hierarchy (e.g. "Id > MastrNummer > EinheitMastrNummer > …") Cf. this code
      • I hard-coded the id column for the tables we know. For unknown future tables, a column "openMastrId" will be inserted. This is also done for the EinheitenAenderungNetzbetreiberzuordnungen table because there is no primary key.
    • How much do we want to adjust/normalize column names? Cf. this code
      • I decided to only do straightforward changes (MaStR -> MaStR, ß -> ss, deleting surrounding whitespace, etc.). No singularization/pluralization of column names à la VerknuepfteEinheitenMaStRNummern -> VerknuepfteEinheit.
    • Do we want to handle the case where adding only some columns to a table fails? Cf. this code
      • I decided not to add special handling for that.
  • Go through the library and remove newly obsolete code
  • Add tests
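
The two decisions above about primary keys and column names can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names `choose_primary_key` and `normalize_column_name` and the `KNOWN_ID_COLUMNS` entry are hypothetical, and only the "ß -> ss" and whitespace rules from the checklist are shown.

```python
# Fallback id column for tables without a known primary key (name taken from the checklist).
FALLBACK_ID_COLUMN = "openMastrId"

# Hypothetical, illustrative entry only; the real hard-coded mapping lives in the PR's code.
KNOWN_ID_COLUMNS = {"ExampleTable": "Id"}


def choose_primary_key(table_name, known_id_columns=KNOWN_ID_COLUMNS):
    """Return the hard-coded id column for a known table, else the fallback column."""
    return known_id_columns.get(table_name, FALLBACK_ID_COLUMN)


def normalize_column_name(name):
    """Apply only straightforward changes: strip surrounding whitespace and replace ß with ss."""
    return name.strip().replace("ß", "ss")


print(choose_primary_key("ExampleTable"))
print(choose_primary_key("EinheitenAenderungNetzbetreiberzuordnungen"))
print(normalize_column_name("  Straße "))
```

Hard-coding keeps the behaviour predictable for known tables, while the fallback column covers tables that appear in future exports or that have no natural key.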

Type of change (CHANGELOG.md)

Added

  • Add the new method Mastr.generate_data_model that downloads the newest MaStR documentation and uses the XSD file to build SQLAlchemy models from the contained definitions
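
To give a rough idea of what building a table definition from an XSD file involves, here is a stdlib-only sketch. It is not the implementation behind Mastr.generate_data_model: the XSD snippet, the type mapping, and the helper names `columns_from_xsd` and `create_table_sql` are assumptions for illustration, and it emits plain SQLite DDL instead of SQLAlchemy tables. It follows the approach mentioned in this PR of making every column except the primary key nullable.

```python
import sqlite3
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical, heavily simplified XSD; real MaStR XSD files are larger and more varied.
EXAMPLE_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="ExampleUnit">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Id" type="xs:string" />
        <xs:element name="Bruttoleistung" type="xs:decimal" minOccurs="0" />
        <xs:element name="Inbetriebnahmedatum" type="xs:date" minOccurs="0" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# Assumed mapping from XSD types to SQL types; unknown types fall back to TEXT.
XSD_TO_SQL = {"xs:string": "TEXT", "xs:decimal": "NUMERIC", "xs:date": "DATE"}


def columns_from_xsd(xsd_text, table_name, primary_key):
    """Return (name, sql_type, is_primary_key) tuples for one top-level element."""
    root = ET.fromstring(xsd_text)
    table_element = root.find(f"{XS}element[@name='{table_name}']")
    if table_element is None:
        raise ValueError(f"No element named {table_name!r} in the schema")
    columns = []
    for element in table_element.iter(f"{XS}element"):
        if element is table_element:
            continue  # iter() also yields the table element itself
        name = element.get("name")
        sql_type = XSD_TO_SQL.get(element.get("type"), "TEXT")
        columns.append((name, sql_type, name == primary_key))
    return columns


def create_table_sql(table_name, columns):
    """Build a CREATE TABLE statement; only the primary key is NOT NULL."""
    parts = []
    for name, sql_type, is_primary_key in columns:
        constraint = " PRIMARY KEY NOT NULL" if is_primary_key else ""
        parts.append(f'"{name}" {sql_type}{constraint}')
    return f'CREATE TABLE "{table_name}" ({", ".join(parts)})'


columns = columns_from_xsd(EXAMPLE_XSD, "ExampleUnit", primary_key="Id")
connection = sqlite3.connect(":memory:")
connection.execute(create_table_sql("ExampleUnit", columns))
```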

Updated

  • Update the method Mastr.download with two optional new arguments mastr_table_to_db_table, with which the user can pass their own database schema, and alter_database_tables, with which the user can prevent open-mastr from issuing any DDL statements.
  • Change CSV export by removing joining tables. The tables are now exported as they are.
  • Change the default names of tables and columns that are created and used for the import.
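
Exporting a table "as it is" boils down to a plain SELECT with no joins. The following sketch shows the idea with the standard library only; the function name `export_table_to_csv` and the example table are hypothetical and not taken from open-mastr.

```python
import csv
import io
import sqlite3


def export_table_to_csv(connection, table_name, out):
    """Write all rows of one table to a CSV stream, without joining other tables."""
    cursor = connection.execute(f'SELECT * FROM "{table_name}"')
    writer = csv.writer(out)
    # The header row comes straight from the column names of the table.
    writer.writerow([description[0] for description in cursor.description])
    writer.writerows(cursor)


# Hypothetical example table in an in-memory database.
connection = sqlite3.connect(":memory:")
connection.execute('CREATE TABLE "example_units" ("Id" TEXT PRIMARY KEY, "Bruttoleistung" REAL)')
connection.execute('INSERT INTO "example_units" VALUES (?, ?)', ("A1", 9.5))

buffer = io.StringIO()
export_table_to_csv(connection, "example_units", buffer)
print(buffer.getvalue())
```

Users who relied on the joined output of the old export would need to perform the joins themselves after the export.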

Removed

  • Remove the method Mastr.translate. The user can now get English table and column names by passing english=True to the generate_data_model or download method.

Workflow checklist

Automation

Closes #516
Closes #577

PR-Assignee

Reviewer

  • 🐙 Follow the Reviewer Guidelines
  • 🐙 Provide feedback and show sufficient appreciation for the work done

@Simon-Will Simon-Will marked this pull request as draft January 24, 2026 15:40
@Simon-Will Simon-Will force-pushed the 516-dynamic-tables-and-columns branch from c85c1fe to fc5bfec Compare January 24, 2026 15:45
@Simon-Will Simon-Will force-pushed the 516-dynamic-tables-and-columns branch from 8550e9a to f31622c Compare February 3, 2026 18:13
@Simon-Will Simon-Will force-pushed the 516-dynamic-tables-and-columns branch 2 times, most recently from 68eb2ad to 2158b8a Compare February 5, 2026 11:35
@pt-kkraemer
Collaborator

pt-kkraemer commented Feb 9, 2026

On 19.02.2026, a new public version of the Gesamtdatenexport will include a bugfix concerning attributes in the .xsd files:
[screenshot]

This will probably have no effect on what you have done already, right?

@Simon-Will
Contributor Author

Interesting, thanks for pointing that out, @pt-kkraemer! It shouldn't make any difference for now because we make all attributes except the primary keys nullable anyway.

But I'll be sure to check out the difference in the XSD files to make sure I understand that point correctly.

As for this whole PR, it's almost ready now as you can see from the checklist. If anyone already has comments on the approach, please let them be heard. I don't think I will make substantial changes to the non-testing code anymore.

Comment thread CITATION.cff Outdated
Comment thread CITATION.cff Outdated
@Simon-Will
Contributor Author

Hi @FlorianK13 and @pt-kkraemer, thanks a lot for your reviews and suggestions! I addressed your comments with the two latest commits. Could you have another look?

Comment thread open_mastr/mastr.py Outdated
@pt-kkraemer
Collaborator

I just got the same error as in #723 when using the latest pull from your fork, @Simon-Will.
[screenshot]

@FlorianK13 wrote in #723 that this will be fixed, but I don't think it is. Maybe we can talk about this in our next dev meeting?

Comment thread tests/test_mastr.py
Comment thread open_mastr/mastr.py Outdated
Comment thread open_mastr/utils/xsd_tables.py Outdated
@Simon-Will
Contributor Author

All discussions resolved. I'm waiting for the 0.17 release and will then update this branch before it can be merged. :)

@FlorianK13
Member

FlorianK13 commented Apr 7, 2026

@Simon-Will can you resolve the conflicts? Afterwards I think we can merge.

@Simon-Will
Contributor Author

@FlorianK13, conflicts resolved. I'm running the download & import right now to see if it goes well.

@Simon-Will
Contributor Author

The import is going well. So this is ready again, but I think we want to do #736 first.

@Simon-Will
Contributor Author

Simon-Will commented Apr 20, 2026

I merged develop into this pull request branch again. So it's ready for merge again. To whoever wants to merge: I'd recommend squashing all these commits I made. The history isn't really clean enough to be useful imo.

@FlorianK13 FlorianK13 self-requested a review April 23, 2026 11:12
FlorianK13 previously approved these changes Apr 23, 2026
@FlorianK13 FlorianK13 self-requested a review April 23, 2026 14:27
@FlorianK13 FlorianK13 dismissed their stale review April 23, 2026 14:28

As discussed we want to extend the changelog

Member

@FlorianK13 FlorianK13 left a comment

@Simon-Will, can you look at my comment and make this change to the CHANGELOG if you agree with the changes described there? Afterwards I can squash-merge the PR.

Comment thread CHANGELOG.md

## [v0.xx.x] Unreleased - 202x-xx-xx
### Added
- Add the option to pass a custom database schema
Member

We discussed this in a call, and @nesnoj asked whether we could use an LLM to generate an extended changelog, since a lot of stuff happened here. I did this, and it produced this new changelog:

[v0.xx.x] Unreleased - 202x-xx-xx

Added

  • Add Mastr.generate_data_model method that downloads the current MaStR documentation and generates SQLAlchemy tables from the XSD definitions; supports english=True for English column names
    #718
  • Add mastr_table_to_db_table argument to Mastr.download to pass a custom database schema
    #718
  • Add alter_database_tables argument to Mastr.download to prevent open-mastr from issuing DDL statements
    #718

Changed

  • Switch to dynamic table generation based on parsing of XSD files from the MaStR documentation; fall back to bundled XSD files if the downloaded documentation is invalid
    #718
  • Change default table and column names to align more closely with the original MaStR export file names
    #718
  • Simplify CSV export by removing cross-table joins; tables are exported as-is
    #718

Removed

  • Remove Mastr.translate; English table and column names are now available via the english=True parameter in generate_data_model and download
    #718
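
The fallback mentioned in the "Changed" entry above could look roughly like this. It is a sketch under stated assumptions: the function name `load_schema_with_fallback` is hypothetical, and the standard-library XML parser stands in for the XMLSchema-based reading used in the PR, so it only detects XML that is not well-formed rather than every kind of invalid XSD.

```python
import xml.etree.ElementTree as ET


def load_schema_with_fallback(downloaded_xsd, bundled_xsd):
    """Parse the downloaded XSD; if that fails, parse the bundled XSD instead.

    Returns the parsed root element and a label naming the source that was used.
    """
    try:
        return ET.fromstring(downloaded_xsd), "downloaded"
    except ET.ParseError:
        # The downloaded documentation is unreadable, so use the copy shipped with the library.
        return ET.fromstring(bundled_xsd), "bundled"


# Hypothetical inputs: a broken download and a minimal bundled schema.
broken_download = "<schema><unclosed></schema>"
bundled = "<schema/>"

root, source = load_schema_with_fallback(broken_download, bundled)
print(source)
```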

3 participants