Releases: pcalnon/juniper-data
juniper-data v0.9.0 — delay_product capacity generator (DP-3)
juniper-data v0.9.0 Release Notes
Release Date: 2026-06-22
Version: 0.9.0
Codename: DP-3 Capacity Dataset
Release Type: MINOR
Authored from the canonical
juniper-ml/notes/templates/TEMPLATE_RELEASE_NOTES.md.
Overview
Adds the delay_product synthetic time-series generator — the capacity-demonstrating dataset
for the juniper-recurrence DP-3 readout spectrum — plus routine dependency / CI maintenance. The new
generator's regression target is a bilinear product of two delayed in-window values, a quadratic
form in the LMU memory state that a linear readout provably cannot fit (so it exposes a clear
nonlinear ≫ linear r² gap, unlike the near-linear forecasting synthetics). Backward-compatible:
purely additive (a new generator + dependency bumps).
Status: STABLE — additive / backward-compatible; all existing generators and the 3-D NPZ
contract are unchanged.
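The linear-versus-nonlinear gap described above can be illustrated with a small numpy experiment. This is only a sketch under stated assumptions: it uses i.i.d. Gaussian values as a stand-in for the in-window samples (not the LMU memory state), and it swaps the random-Fourier-feature readout for explicit pairwise-product features, which is the simplest feature map that contains the bilinear target. The helper names `r_squared` and `fit_predict` are made up for the example.

```python
import numpy as np

def r_squared(y: np.ndarray, y_hat: np.ndarray) -> float:
    """Coefficient of determination of predictions y_hat against y."""
    ss_res = float(np.sum((y - y_hat) ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return 1.0 - ss_res / ss_tot

def fit_predict(features: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Ordinary least squares with a bias column; returns in-sample predictions."""
    design = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ coef

rng = np.random.default_rng(0)
x = rng.standard_normal((2000, 4))   # assumed stand-in for 4 in-window values
y = x[:, 1] * x[:, 3]                # bilinear product of two of them

# Linear readout: sees only the raw values.
r2_linear = r_squared(y, fit_predict(x, y))

# Non-linear readout (pairwise products instead of RFF): contains x1 * x3 exactly.
i, j = np.triu_indices(x.shape[1])
r2_nonlinear = r_squared(y, fit_predict(x[:, i] * x[:, j], y))

print(f"linear r2={r2_linear:.3f}, nonlinear r2={r2_nonlinear:.3f}")
```

In this toy setting the linear fit stays near zero while the product features recover the target almost exactly; the real bench numbers depend on the actual generator and readout.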
Release Summary
- Release type: MINOR
- Primary focus: New delay_product capacity generator (DP-3) + dependency / CI maintenance
- Breaking changes: NO
- Priority summary: Unblocks the juniper-recurrence DP-3 P2 bench (the RFF-readout capacity gap)
Features Summary
| ID | Feature | Status | Version | Phase |
|---|---|---|---|---|
| DP-3 §8a | delay_product capacity generator | Done | 0.9.0 | P2 |
What's New
delay_product synthetic generator (DP-3 capacity instrument)
An irregularly-sampled sinusoid superposition (the same non-uniform Δt sampling as irregular_sine)
whose regression target is the bilinear product of two delayed in-window values,
y = x(t−τ₁)·x(t−τ₂), with lag1 / lag2 step-delays kept strictly inside the lookback.
Changes:
- The product is a quadratic form in the (linear) LMU memory state, so a linear readout
provably cannot fit it (r² bounded below 1) while a non-linear (random-Fourier-feature) readout
can. This makes it the capacity-demonstrating dataset that complements the near-linear synthetics
(where the linear readout is already at its ceiling).
- Emits the standard additive 3-D NPZ contract
({X, y, dt, target_dt, observed_mask}_{train,test,full}, task_type="regression",
time_unit="steps") and reuses the leakage-safe window_timed_series windowing (the target reads
only the emitted window contents; y_full == concat(train, test)).
- Registered as delay_product in the generator registry; numpy-only, no extra. See juniper-ml
notes/JUNIPER_RECURRENCE_DP3_READOUT_SPECTRUM_DESIGN_2026-06-20.md §8a.
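The target y = x(t−τ₁)·x(t−τ₂) can be sketched in a few lines of numpy. This is a minimal illustration, not the juniper-data implementation: the function name `delay_product_targets`, its signature, and the convention that a lag of k steps counts back from the last in-window sample are assumptions made for the example.

```python
import numpy as np

def delay_product_targets(windows: np.ndarray, lag1: int, lag2: int) -> np.ndarray:
    """Hypothetical helper: y = x(t - lag1) * x(t - lag2), one value per window.

    windows has shape (n, L). Lags are step delays counted back from the last
    in-window sample (index L - 1); that indexing convention is assumed here.
    """
    _, length = windows.shape
    for lag in (lag1, lag2):
        # Both delays must stay inside the lookback so the target never reads
        # anything outside the emitted window.
        if not 0 <= lag < length:
            raise ValueError(f"lag {lag} falls outside the lookback of {length} steps")
    last = length - 1
    return windows[:, last - lag1] * windows[:, last - lag2]

# Toy series cut into overlapping lookbacks of 4 steps.
series = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
windows = np.lib.stride_tricks.sliding_window_view(series, 4)
y = delay_product_targets(windows, lag1=1, lag2=3)
print(y)  # [ 3.  8. 15.]
```

Because the target is computed only from values already present in each window, the same function applies unchanged to train, test, and full splits.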
Bug Fixes
None.
Improvements
Routine maintenance bundled into this release:
- Dependency bumps: actions/checkout 6 → 7, anthropics/claude-code-action 1.0.148 → 1.0.154,
and the python-minor dependency group (16 updates).
- CI / tooling: local coverage reproduction (make coverage + util script), asyncio_mode=auto
for the pytest-asyncio config, and pre-push pre-commit gates wired via default_install_hook_types.
Test Results
The delay_product generator ships with a dedicated unit-test module (contract, genuinely
non-uniform dt, the known-answer bilinear target, determinism, parameter validation, and schema)
and is wired into the parametrized end-to-end synthetic-regression and scaling test suites. The full
juniper-data suite is green in CI.
Upgrade Notes
This is a backward-compatible MINOR release. No migration steps required.
pip install --upgrade juniper-data==0.9.0
Known Issues
None known at time of release.
What's Next
- juniper-recurrence DP-3 P2 bench: the bench will delegate to delay_product (via
juniper_data.generators) to demonstrate the RFF-readout capacity gap (nonlinear ≫ linear r²),
alongside the tie on the existing near-linear datasets.
Contributors
- Paul Calnon
Version History
| Version | Date | Description |
|---|---|---|
| 0.9.0 | 2026-06-22 | delay_product DP-3 capacity generator + maintenance |
| 0.8.0 | 2026-06-19 | Configurable equities regression_target |
| 0.7.1 | 2026-06-19 | equities wheel packaging fix |
| 0.7.0 | 2026-06-19 | Δt sequence data foundation |
juniper-data v0.8.0 — configurable equities regression_target
MINOR release. Adds a configurable equities regression target.
EquitiesParams (inherited by EquitiesSeqParams) gains regression_target: "next_close" | "return" | "log_return", controlling the y_reg_* representation:
- next_close (default): raw next-day close, byte-identical to prior output;
- return: next_close / close - 1;
- log_return: ln(next_close / close).
The raw close is non-stationary; the return variants are stationary (standard conditioning for trending price data). Both equities and equities_seq honor it via a shared helper. No change to the direction target, the feature matrix, or any other array; the default keeps every existing artifact byte-identical.
Motivated by the juniper-recurrence Δt-LMU equities_seq finding (raw-close target → r²≈−50). Feature PR: #195. Full notes: notes/releases/RELEASE_NOTES_v0.8.0.md (#198).
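The three target representations are simple element-wise transforms of the close series. The sketch below only mirrors the formulas listed above; the function `make_regression_target` and its signature are hypothetical and are not the shared helper in juniper-data.

```python
import numpy as np

def make_regression_target(close: np.ndarray, next_close: np.ndarray,
                           regression_target: str = "next_close") -> np.ndarray:
    """Hypothetical helper implementing the three documented options."""
    if regression_target == "next_close":
        return next_close.astype(float)          # raw, non-stationary level
    if regression_target == "return":
        return next_close / close - 1.0          # simple return
    if regression_target == "log_return":
        return np.log(next_close / close)        # log return
    raise ValueError(f"unknown regression_target: {regression_target!r}")

close = np.array([100.0, 110.0, 121.0])
next_close = np.array([110.0, 121.0, 108.9])

simple = make_regression_target(close, next_close, "return")
logret = make_regression_target(close, next_close, "log_return")
print(simple, logret)
```

The return variants remove the price level, which is why they are better conditioned for trending data than the raw next-day close.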
🤖 Generated with Claude Code
v0.7.1 — equities wheel packaging fix
Juniper Data v0.7.1 Release Notes
Release Date: 2026-06-19
Version: 0.7.1
Release Type: PATCH
Overview
Patch release fixing a packaging defect in 0.7.0: the equities generators' bundled S&P 500
constituents CSV was not shipped inside the wheel, leaving the equities / equities_seq extras
non-functional from a pip install. No API change.
Status: STABLE — backward-compatible patch. No migration.
Fixed
- Ship sp500_constituents.csv inside the wheel. 0.7.0 packaged only *.py, so the equities
generators raised FileNotFoundError on the bundled constituents file from a pip install of
juniper-data[equities]==0.7.0 (the file is loaded via Path(__file__).parent / "sp500_constituents.csv",
which works in a source checkout but is absent from the built wheel). Adds a [tool.setuptools.package-data]
entry (juniper_data.generators.equities = ["*.csv"]) so the constituents list ships in the
wheel + sdist, plus a CI build-step assertion that the CSV is present in the built wheel (guards
the actual failure mode against a future regression). (juniper-data#193)
The defect was surfaced by the juniper-recurrence benchmark's equities_seq row, which could not
load the generator from the published juniper-data[equities]==0.7.0 wheel.
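A build-step check of this kind can be expressed with the standard library alone, since a wheel is a zip archive. The following is a sketch of the idea rather than the assertion used in the project's CI: the function name `wheel_has_member` is invented, and the in-memory archive below is a stand-in for a real built wheel.

```python
import io
import zipfile

def wheel_has_member(wheel, suffix: str) -> bool:
    """Return True if any file inside the wheel (a zip archive) ends with suffix.

    `wheel` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(wheel) as archive:
        return any(name.endswith(suffix) for name in archive.namelist())

# Stand-in for a built wheel: an in-memory zip with the package layout
# implied by the package-data entry above.
fake_wheel = io.BytesIO()
with zipfile.ZipFile(fake_wheel, "w") as archive:
    archive.writestr("juniper_data/generators/equities/__init__.py", "")
    archive.writestr("juniper_data/generators/equities/sp500_constituents.csv", "ticker\n")

csv_present = wheel_has_member(fake_wheel, "sp500_constituents.csv")
print(csv_present)  # True
```

Checking the built artifact, rather than the source tree, targets the failure mode that 0.7.0 hit: the file existed in the checkout but never reached the wheel.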
Upgrade
pip install --upgrade "juniper-data[equities]" # 0.7.1: equities/equities_seq now work from PyPI
Backward-compatible; no migration steps. The synthetic generators (multi_sine, mackey_glass,
ar_p, irregular_sine) were unaffected by the 0.7.0 defect and continue to work without any extra.
Known Issues
None. All required CI checks pass; the new build-step assertion confirms the CSV ships in the wheel.
Version History
| Version | Date | Description |
|---|---|---|
| 0.7.1 | 2026-06-19 | Fix: equities constituents CSV now ships in the wheel |
| 0.7.0 | 2026-06-19 | Synthetic dt-sequence generators + scaling meta channel |
| 0.6.0 | 2026-04-08 | Versioning, batch ops, systemd, PostgreSQL fixes |
Links
- Full Changelog
- Previous Release
- Fix PR: juniper-data#193
v0.7.0 — Δt Sequence Data Foundation
Juniper Data v0.7.0 Release Notes
Release Date: 2026-06-19
Version: 0.7.0
Codename: Δt Sequence Data Foundation
Release Type: MINOR
Overview
This release completes the Δt-native sequence data foundation for the Juniper
recurrence workstream. JuniperData can now generate irregular- and regular-Δt
time-series datasets — both synthetic (closed-form, zero-dependency) and real
(S&P 500 equities) — that emit the additive 3-D NPZ sequence contract (WS-1),
plus an advisory scaling-meta channel and build provenance on the health surface.
Status: STABLE — backward-compatible, additive contract. No breaking changes.
Release Summary
- Release type: MINOR
- Primary focus: New features — irregular/regular-Δt sequence generators, the 3-D sequence contract, scaling meta, build provenance
- Breaking changes: NO (every existing classification generator and NPZ invariant is unchanged; all new fields are optional/additive)
- Headline: ships the generators that were merged to main after v0.6.0 but were absent from the published 0.6.0 wheel, closing the publish-first gap that blocked the juniper-recurrence benchmark and recurrence-model evaluation
What's New
Δt sequence generators (the recurrence "hello-world" datasets)
Synthetic regression generators — multi_sine, mackey_glass, ar_p (#187)
Three numpy-only, deterministic, offline generators emitting the additive 3-D
sequence NPZ contract (WS-1) as task_type="regression". Each samples a process
at a regular Δt and windows it into (W, L, 1) sequences with a per-step dt, a
fixed target_dt forecast horizon, an all-ones observed_mask, and the target
carried directly in y_*. multi_sine is a superposition of K sinusoids
(closed-form known answer when noise-free); mackey_glass integrates the chaotic
delay-differential equation (β=0.2, γ=0.1, n=10, τ=17); ar_p is a stable
autoregressive process. No optional extra required — pure numpy.
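The regular-Δt windowing described above can be sketched as follows. This is an illustrative reconstruction, not the library code: the function name `window_regular_series`, the `horizon` parameter, and the (W, L) shape chosen for observed_mask are assumptions; only the (W, L, 1) sequence shape, the per-step dt with a zero first entry, the fixed target_dt, and the all-ones mask come from the notes.

```python
import numpy as np

def window_regular_series(values: np.ndarray, sample_dt: float,
                          lookback: int, horizon: int = 1):
    """Hypothetical sketch: cut a regularly sampled series into 3-D sequences."""
    n_windows = len(values) - lookback - horizon + 1
    if n_windows <= 0:
        raise ValueError("series too short for this lookback and horizon")
    starts = np.arange(n_windows)
    idx = starts[:, None] + np.arange(lookback)[None, :]
    X = values[idx][..., None]                        # (W, L, 1)
    y = values[starts + lookback - 1 + horizon]       # value `horizon` steps past each window
    dt = np.full((n_windows, lookback), sample_dt)
    dt[:, 0] = 0.0                                    # first step has no predecessor
    target_dt = np.full(n_windows, horizon * sample_dt)
    observed_mask = np.ones((n_windows, lookback), dtype=bool)
    return X, y, dt, target_dt, observed_mask

# Noise-free sinusoid sampled every 0.5 time units.
values = np.sin(0.5 * np.arange(12))
X, y, dt, target_dt, observed_mask = window_regular_series(values, sample_dt=0.5, lookback=4)
print(X.shape, y.shape, dt.shape, target_dt.shape)  # (8, 4, 1) (8,) (8, 4) (8,)
```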
Irregular-Δt synthetic generator — irregular_sine (#188)
A fourth numpy-only regression generator that samples a continuous-time sinusoid
superposition at non-uniform times (sample_dt · U[1−jitter, 1+jitter]), so
the windowed artifact carries a genuinely non-uniform per-step dt and a variable
target_dt. The synthetic, known-answer counterpart to equities_seq's
calendar-gap irregularity. Backed by a new window_timed_series(values, times, …)
helper.
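The non-uniform sampling rule sample_dt · U[1−jitter, 1+jitter] can be sketched directly. The function name `irregular_sample_times`, the seed handling, and the example frequencies are assumptions for illustration; the real generator and window_timed_series may differ in detail.

```python
import numpy as np

def irregular_sample_times(n: int, sample_dt: float, jitter: float, seed: int = 0) -> np.ndarray:
    """Hypothetical sketch: n sample times whose gaps are sample_dt * U[1-jitter, 1+jitter]."""
    rng = np.random.default_rng(seed)
    gaps = sample_dt * rng.uniform(1.0 - jitter, 1.0 + jitter, size=n - 1)
    return np.concatenate([[0.0], np.cumsum(gaps)])

times = irregular_sample_times(n=50, sample_dt=0.1, jitter=0.5)

# Evaluate a continuous-time sinusoid superposition at those times
# (frequencies and amplitudes chosen arbitrarily for the example).
values = np.sin(2 * np.pi * 0.5 * times) + 0.5 * np.sin(2 * np.pi * 1.3 * times)

gaps = np.diff(times)
print(gaps.min(), gaps.max())
```

Because the signal is evaluated in closed form at the jittered times, the series keeps a known answer while the per-step dt varies from sample to sample.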
Real irregular-Δt sequences — equities_seq (#171) and equities (#164)
equities produces daily per-(ticker, day) records for S&P 500 constituents
(Yahoo Finance OHLCV + SEC EDGAR shares/market-cap, 52-week high/low, cost basis)
with dual targets (one-hot next-day direction + auxiliary next-day-close
regression). equities_seq is its windowed 3-D sequence variant carrying genuine
calendar-gap irregular Δt. Both require the [equities] extra (yfinance,
pandas).
Advisory dt / target scaling-meta channel (#189)
A generator may now report how its per-step dt and regression target should be
standardized, via a reserved "scaling" key that the dataset route pops into two
new optional DatasetMeta fields — dt_scaling and target_scaling. The scaling
is advisory: the NPZ keeps RAW arrays (every contract invariant intact); a
consumer standardizes at ingestion and denormalizes for metrics using the
persisted stats. New core/scaling.py (exact-inverse standardize /
inverse_standardize, std≈0 guard) and core/meta.py::pop_scaling_meta. The four
synthetic generators gain a scaling: "identity" | "standardize" parameter
(standardize descriptors fit on the train split only — no test leakage).
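The advisory flow (fit statistics on the train split, standardize at ingestion, invert for metrics) can be sketched as below. The names standardize and inverse_standardize appear in the notes, but the signatures here, the `fit_standardize_stats` helper, and the choice to fall back to std = 1.0 when the spread is near zero are assumptions, not the contents of core/scaling.py.

```python
import numpy as np

def fit_standardize_stats(train_values: np.ndarray, eps: float = 1e-12):
    """Fit mean/std on the train split only, so no test statistics leak in."""
    mean = float(np.mean(train_values))
    std = float(np.std(train_values))
    if std < eps:      # std≈0 guard: avoid dividing by zero on constant data
        std = 1.0
    return mean, std

def standardize(values, mean: float, std: float) -> np.ndarray:
    return (np.asarray(values, dtype=float) - mean) / std

def inverse_standardize(scaled, mean: float, std: float) -> np.ndarray:
    return np.asarray(scaled, dtype=float) * std + mean

train = np.array([1.0, 2.0, 3.0, 4.0])
test = np.array([5.0, 6.0])

mean, std = fit_standardize_stats(train)          # persisted stats
test_scaled = standardize(test, mean, std)        # consumer-side ingestion
test_restored = inverse_standardize(test_scaled, mean, std)  # back to raw units for metrics
```

Keeping the arrays raw in the NPZ and shipping only the statistics means a consumer that ignores the scaling channel sees exactly the same data as before.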
Sequence contract foundation (WS-1) (#169, #170)
A per-entity sequence-windowing primitive with a Hypothesis leakage-property test
(#169), and a regression/sequence-tolerant dataset contract that makes class
metadata optional and dispatches on task_type (#170).
Build provenance on the health surface (#180)
/v1/health and /v1/health/ready now report the source git_sha and ISO-8601
build_date baked into the image (GIT_SHA / BUILD_DATE / APP_VERSION
build-args → OCI labels + env vars; new juniper_data.provenance accessor; values
flow into set_build_info(...) and the shared ReadinessResponse). Foundation for
ecosystem stale-image detection. Requires juniper-observability>=0.4.0.
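Reading the baked-in values at runtime amounts to looking up the three environment variables named above. The sketch below is an assumption about how such an accessor might look; it is not the juniper_data.provenance API, and the function name and the "unknown" fallback are invented for illustration.

```python
import os

def read_build_provenance(env=None) -> dict:
    """Hypothetical accessor: collect build provenance from environment variables."""
    env = os.environ if env is None else env
    return {
        "git_sha": env.get("GIT_SHA", "unknown"),        # source commit of the image
        "build_date": env.get("BUILD_DATE", "unknown"),  # ISO-8601 build timestamp
        "version": env.get("APP_VERSION", "unknown"),
    }

# Example with an explicit mapping instead of the real process environment.
info = read_build_provenance({"GIT_SHA": "abc1234", "BUILD_DATE": "2026-06-19T00:00:00+00:00"})
print(info)
```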
Compatibility
- fastapi 0.137 route-introspection compatibility (_IncludedRouter) (#181),
starlette>=1.0.1 floor (CVE-2026-48710), and routine dependency bumps.
API Changes
New / changed response fields
| Surface | Change | Breaking? |
|---|---|---|
| DatasetMeta | New optional dt_scaling, target_scaling descriptors | No |
| /v1/health, /v1/health/ready | New git_sha, build_date provenance fields | No |
| Dataset metadata | n_classes / class_distribution now optional (task_type="regression") | No |
New generators registered on the dataset route
multi_sine, mackey_glass, ar_p, irregular_sine (no extra) and equities,
equities_seq ([equities] extra). All emit the 3-D sequence NPZ contract:
X (n,T,F), y / y_reg, dt (n,T, dt[:,0]=0), target_dt (n,),
seq_lengths, observed_mask — split-suffixed (_train / _test / _full).
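A consumer can sanity-check the shapes stated above before training. The checker below is a sketch that covers only the invariants spelled out in these notes (X is 3-D, dt is (n, T) with a zero first column, target_dt is (n,)); the function name `check_sequence_split` and the exact key spelling (`X_train`, `dt_train`, ...) are assumptions based on the split-suffix description.

```python
import numpy as np

def check_sequence_split(arrays: dict, split: str) -> None:
    """Hypothetical validator for one split of the 3-D sequence contract."""
    X = arrays[f"X_{split}"]
    dt = arrays[f"dt_{split}"]
    target_dt = arrays[f"target_dt_{split}"]
    if X.ndim != 3:
        raise ValueError(f"X_{split} must be (n, T, F), got shape {X.shape}")
    n, T, _ = X.shape
    if dt.shape != (n, T):
        raise ValueError(f"dt_{split} must be ({n}, {T}), got {dt.shape}")
    if not np.all(dt[:, 0] == 0):
        raise ValueError(f"dt_{split}[:, 0] must be 0 for every sequence")
    if target_dt.shape != (n,):
        raise ValueError(f"target_dt_{split} must be ({n},), got {target_dt.shape}")

# Minimal synthetic arrays that satisfy the checks.
n, T, F = 3, 5, 1
dt = np.full((n, T), 0.1)
dt[:, 0] = 0.0
arrays = {
    "X_train": np.zeros((n, T, F)),
    "dt_train": dt,
    "target_dt_train": np.full(n, 0.1),
}
check_sequence_split(arrays, "train")  # passes silently
```

With a saved artifact, the same function could be applied to the mapping returned by `numpy.load`, which supports the same key lookup.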
Upgrade Notes
This is a backward-compatible MINOR release. No migration steps are required for
existing classification datasets or consumers.
pip install --upgrade juniper-data # synthetic generators, core, API
pip install --upgrade "juniper-data[equities]" # + equities / equities_seq
- The API server extra pulls juniper-observability>=0.4.0 (provenance helpers).
- Synthetic Δt generators (multi_sine, mackey_glass, ar_p, irregular_sine)
need no optional extra.
Known Issues
- equities / equities_seq require network access to Yahoo Finance and SEC
EDGAR at generation time; they are excluded from the offline test path. Not a
functional defect.
- None blocking. All required CI checks (unit/integration across Python
3.12–3.14, pre-commit, CodeQL, security, lockfile freshness, quality gate) pass.
What's Next
- Consumed downstream: the juniper-recurrence benchmark and recurrence-model
evaluation depend on these published generators (the Δt thesis was validated
against irregular_sine).
- Eval extensions: noisy synthetic variants (noise_std > 0) and real
equities_seq benchmarking.
- Scaling/synthetic generator enhancements tracked under WS-4.
Version History
| Version | Date | Description |
|---|---|---|
| 0.7.0 | 2026-06-19 | Synthetic dt-sequence generators + scaling meta channel |
| 0.6.0 | 2026-04-08 | Versioning, batch ops, systemd, PostgreSQL fixes |
v0.6.0
Highlights
Major release with dataset versioning, batch operations, security hardening, and infrastructure improvements.
Added
- Dataset Versioning (CAN-DEF-005 Phase 1): Logical dataset names with auto-incrementing version numbers. Atomic version allocation prevents duplicates under concurrency. New endpoints: GET /v1/datasets/versions, GET /v1/datasets/latest.
- Batch Operations (CAN-DEF-006): POST /v1/datasets/batch-create, PATCH /v1/datasets/batch-tags, POST /v1/datasets/batch-export for operating on multiple datasets in single requests.
- Docker Secrets: File-based secrets support via get_secret() utility.
- Systemd Integration: Service unit and management CLI for native Linux deployments.
- CSV Import Path Traversal Protection: New JUNIPER_DATA_IMPORT_DIR setting restricts CSV file imports to a configurable base directory.
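Restricting imports to a base directory is commonly done by resolving the requested path and confirming it still sits under the base. The sketch below shows that general technique with pathlib; it is not the juniper-data implementation, and the function name `resolve_import_path` and the example directory are hypothetical.

```python
from pathlib import Path

def resolve_import_path(import_dir: str, requested: str) -> Path:
    """Hypothetical check: resolve `requested` under `import_dir` or raise."""
    base = Path(import_dir).resolve()
    # Joining with an absolute `requested` discards `base`, and ".." segments
    # can climb out of it; resolving first exposes both cases.
    candidate = (base / requested).resolve()
    if not candidate.is_relative_to(base):   # Path.is_relative_to needs Python 3.9+
        raise ValueError(f"{requested!r} escapes the import directory")
    return candidate

base_dir = "/srv/juniper/imports"   # stand-in for the JUNIPER_DATA_IMPORT_DIR value
safe = resolve_import_path(base_dir, "batch1/points.csv")

try:
    resolve_import_path(base_dir, "../../etc/passwd")
    traversal_blocked = False
except ValueError:
    traversal_blocked = True
```

Resolving before comparing matters: a plain string-prefix check on the unresolved path would accept inputs such as `batch1/../../secrets.csv`.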
Fixed
- Synchronized version across __init__.py, pyproject.toml, and Dockerfile
- PostgreSQL metadata/artifact split-brain on save failure
- PostgreSQL temp artifact race conditions on concurrent saves
- Advisory lock namespace collision between dataset ID and version allocation
- Generic n_classes fallback replacing spiral-specific params.n_spirals
- Removed inconsistent fallback that crashed for non-spiral generators with empty training sets
Changed
- Updated GitHub Actions (checkout v6, setup-python v6.2, upload-artifact v7, codecov v6)
- Sentry PII and traces sample rate now configurable (defaults: PII=False, sample_rate=0.1)
- AGENTS.md comprehensive audit and update
- Documentation link checker with cross-repo skip mode
Security
- CSV import generator now validates file paths against configurable import directory
- Sentry PII transmission disabled by default (was enabled)
Stats
- 849 tests passing
- 98%+ code coverage
- All pre-commit hooks passing