FEAT - Add a fetcher for the electricity forecasting dataset#2013

Open
lisaleemcb wants to merge 14 commits into
skrub-data:mainfrom
lisaleemcb:issue1844_electricityexample_timeseries

@lisaleemcb

Contributor

Added a time series dataset for a forecasting example.

To do: upload to osf.io
Addresses #1844

@rcap107 rcap107 changed the title Issue1844 electricityexample timeseries FEAT - Add a fetcher for the electricity forecasting dataset Apr 7, 2026
@rcap107

rcap107 commented Apr 7, 2026

Member

The dataset has been updated on osf at https://osf.io/download/d8ykq

@rcap107 rcap107 left a comment

Member

Thanks for the PR @lisaleemcb! I left a few comments

Comment thread skrub/datasets/_fetching.py Outdated
Comment thread skrub/datasets/_fetching.py
Co-authored-by: Riccardo Cappuzzo <7548232+rcap107@users.noreply.github.com>
@jeromedockes

Member

One thing we need to check is the license(s) of the original data, to make sure we are allowed to redistribute it this way.

@lisaleemcb lisaleemcb marked this pull request as ready for review June 1, 2026 08:31

@rcap107 rcap107 left a comment

Member

Looks good to me, thanks a lot @lisaleemcb

@jeromedockes jeromedockes left a comment

Member

thanks @lisaleemcb ! :)

Comment thread doc/api_reference.py Outdated
Comment thread skrub/datasets/tests/test_fetching.py Outdated
Comment thread skrub/datasets/tests/test_fetching.py Outdated
Comment thread skrub/datasets/tests/test_fetching.py Outdated
lisaleemcb and others added 4 commits June 2, 2026 16:25
updated test_dataset_files()

Co-authored-by: Jérôme Dockès <jerome@dockes.org>
…lisaleemcb/skrub into issue1844_electricityexample_timeseries
Comment thread doc/api_reference.py Outdated
Comment thread skrub/datasets/tests/test_fetching.py Outdated
Comment thread skrub/datasets/tests/test_fetching.py Outdated

@jeromedockes jeromedockes left a comment

Member

LGTM besides the little fix needed on the test! thank you @lisaleemcb

)
path = _fetching.fetch_electricity_forecasting()
downloaded = [f.name for f in Path(path).iterdir() if f.is_file()]
assert files == set(downloaded)

Member

It seems the weather files are parquet, not csv? Also, it is more common to write `assert actual == expected` than `assert expected == actual`, so writing it in that order makes the pytest error a little easier to read for people used to that convention.
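
A minimal sketch of the suggested ordering, for illustration only. The file names and the `list_file_names` helper are hypothetical (the real file list is not shown in this thread), and a temporary directory stands in for the folder returned by the fetcher so that nothing is downloaded.

```python
from pathlib import Path
import tempfile


def list_file_names(folder):
    """Return the names of the regular files directly inside ``folder``."""
    return {f.name for f in Path(folder).iterdir() if f.is_file()}


# Hypothetical file names, used only for illustration; they are not taken
# from the actual dataset.
expected = {"load.csv", "weather.csv"}

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the downloaded dataset folder.
    for name in expected:
        (Path(tmp) / name).write_text("a,b\n1,2\n")
    actual = list_file_names(tmp)

# Computed value on the left, reference value on the right.
assert actual == expected
```

With this ordering, the value produced by the code under test is the left operand of the comparison that pytest reports on failure, which matches what many readers expect.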

Contributor Author

Hi @jeromedockes, thanks for the tip about the boolean check!

Regarding the files, the folder should have both csv and parquet files, but I'm not checking for the parquet ones. Do you want me to change that?

@rcap107

rcap107 commented Jun 15, 2026

Member

Discussed IRL: there should be only csv files. We also need to upload the new zip version (with csv only) to osf.
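
One way the "csv only" requirement could be checked, as a rough sketch rather than the test used in this PR. The `non_csv_files` helper and the file names are hypothetical, and a temporary directory stands in for the extracted archive.

```python
from pathlib import Path
import tempfile


def non_csv_files(folder):
    """Return the sorted names of files in ``folder`` without a .csv suffix."""
    return sorted(
        f.name
        for f in Path(folder).iterdir()
        if f.is_file() and f.suffix.lower() != ".csv"
    )


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical contents: two csv files plus a leftover parquet file,
    # mimicking the mixed archive discussed above.
    for name in ("load.csv", "weather.csv", "weather.parquet"):
        (Path(tmp) / name).touch()
    leftovers = non_csv_files(tmp)

# A test on the re-uploaded archive would assert that this list is empty.
print(leftovers)  # ['weather.parquet']
```

Reporting the offending file names, rather than a bare boolean, makes a failure point directly at the files that still need to be removed from the archive.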

3 participants