Release list

v0.2.2 Latest

kousuke-nakano released this 02 Jun 00:42

v0.2.2

262c64c

First stable release since v0.1.0. v0.2.2 ships everything accumulated across four alphas (v0.2.0a1, v0.2.1a1, v0.2.1a2, v0.2.2a1) plus a final round of polish. Per-alpha sections are preserved below; this entry is a roll-up of the highlights from v0.1.0 to v0.2.2.

Highlights (v0.1.0 -> v0.2.2)

Optimization

Linear Method (LM) optimizer integrated under method="sr" with a unified use_lm / lm_subspace_dim hierarchy (plain SR / aSR / LM). New |v_0|^2 < 0.9 fallback to plain SR keeps non-linear-regime updates from producing NaN energies.
Adaptive learning rate for Stochastic Reconfiguration.
MO optimization for JSD via the projection method with Attacalite-Sorella regularization, plus geminal AO -> MO projection.
AO basis optimization (opt_J3_basis_coeff/exp, opt_lambda_basis_coeff/exp) with shell-shared constraint and dual symmetrization.
Distributed tall-CG SR solver via psum, removing mpi_size-scaling memory in the SR solve.

Performance

Fast-update algorithms used throughout MCMC / VMC / LRDMC, with mat-vec hot paths converted to GEMM for better GPU utilization.
On-GPU VMC optimization with use_device_collectives auto-selected by JAX backend; multi-GPU run_optimize supported.
LU -> SVD in determinant / geminal / GFMC_n / GFMC_t for ill-conditioned stability; Cartesian / Spherical AO conversion (Cartesian GTOs are substantially faster on GPU); ECP fast path (compute_ecp_coulomb_potential_fast).
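The mat-vec to GEMM conversion mentioned above can be illustrated with a minimal NumPy sketch. The matrix shapes and names (`A`, `X`) are hypothetical and are not taken from the jQMC kernels; the point is only that many independent matrix-vector products can be replaced by one matrix-matrix product with the same result.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((64, 32))    # hypothetical coefficient matrix
X = rng.standard_normal((32, 128))   # 128 column vectors, e.g. one per walker

# Mat-vec formulation: one matrix-vector product per column.
Y_loop = np.stack([A @ X[:, i] for i in range(X.shape[1])], axis=1)

# GEMM formulation: a single matrix-matrix product over all columns.
Y_gemm = A @ X
```

Both formulations produce the same array; the GEMM form gives the backend one large, regular operation to schedule instead of many small ones.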

Numerical precision

Mixed-precision support with "full" / "mixed" modes and per-zone dtype control, built on three explicit design principles. The AGP/SD geminal path stays in fp64 to prevent amplification of fp32 errors in log|det|; electron-nucleus r - R differences are computed in fp64 before downcasting to avoid catastrophic cancellation. The ao_grad_lap and mo_grad_lap zones are split for finer-grained control.
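The catastrophic-cancellation point can be seen in a small NumPy sketch. The coordinates below are made up for illustration, and the code does not use the jQMC precision API; it only contrasts "downcast, then subtract" with "subtract in fp64, then downcast".

```python
import numpy as np

# Hypothetical coordinates: an electron 1e-6 away from a nucleus located far from the origin.
r = np.array([100.0 + 1.0e-6, 0.0, 0.0], dtype=np.float64)  # electron position
R = np.array([100.0, 0.0, 0.0], dtype=np.float64)           # nucleus position

# Downcast first, then subtract: the fp32 spacing near 100 is about 7.6e-6,
# so the 1e-6 offset is rounded away before the subtraction happens.
diff_fp32_first = r.astype(np.float32) - R.astype(np.float32)

# Subtract in fp64, then downcast: the small difference is representable in fp32.
diff_fp64_first = (r - R).astype(np.float32)
```

In the first variant the x-component of the difference collapses to zero, while the second keeps it close to 1e-6.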

Features

LRDMC atomic forces with the Pathak-Wagner regularization.
Runtime-selectable Jastrow forms: jastrow_1b_type and jastrow_2b_type (exp / pade).
use_swct flag to toggle Space Warp Coordinate Transformation in MCMC and GFMC_n / GFMC_t.

`jqmc_workflow` automation package

jqmc_workflow is introduced as a multi-stage QMC pipeline orchestrator (WF conversion -> VMC opt -> MCMC / LRDMC production) with automatic step estimation, checkpointing, and remote job management.

Bug fixes

GFMC_n / GFMC_t spin-polarized (n_up != n_dn, n_dn >= 1) MPI bug.
MPI deadlock in max_time / stop_flag checks; implementation-dependent behavior of Allreduce on scalars (replaced with allreduce).
Incorrect optimizer step estimation; NaN values in force calculations; MCMC memory overflow from r_up_history / r_dn_history storage.

Infrastructure

Restart files migrated from pickle .chk to HDF5 .h5 (no backward compatibility).
Ruff lint pipeline (jqmc-lint-ruff.yml) and pre-commit updates; non-ASCII cleanup across code and docstrings.
Nightly CI and Codecov enabled, with pytest-xdist support.
Examples: 11 end-to-end tutorials (jqmc-example01 to jqmc-example08, jqmc-workflow-example01 to jqmc-workflow-example03).
Project ownership transferred to the jqmc-project GitHub organization; URLs updated.

Breaking changes since v0.1.0

Restart files: pickle .chk is no longer supported; HDF5 .h5 is the only format.
Optimizer API: num_param_opt, opt_filter_min_SN_ratio, adaptive_learning_rate, and method="lm" are all removed or replaced; the Linear Method is now accessed via method="sr" with use_lm=true (and the new lm_subspace_dim / lm_cond parameters).

See the per-alpha sections below for full details.

v0.2.2a1 Pre-release

kousuke-nakano released this 19 May 04:45

v0.2.2a1

5ee1cf1

This release brings configurable mixed-precision support, deep kernel-level performance work (AOs, Jastrow, det/Jastrow ratios, GFMC), on-GPU VMC optimization, and a project-wide lint/cleanup.

New features

Mixed precision support: Added a configurable precision system with per-zone dtype control (fp64 by default; fp32 in selected zones in "mixed" mode). Refactored as selectable-precision modules with three explicit design principles. The geminal/AGP/SD path remains in fp64 to prevent fp32 amplification of log|det|, and electron-nucleus r - R differences are computed in fp64 before downcasting to avoid catastrophic cancellation. ao_grad_lap and mo_grad_lap precision zones are split for finer-grained control. The public API is reduced to a single mode selector: "full" or "mixed".
On-GPU VMC optimization: VMC parameter optimization can now run entirely on GPU. Added use_device_collectives (auto-selected by the JAX backend: GPU=True, otherwise False) with an MPI/JAX consistency check, along with matching CLI/TOML options. Multi-GPU run_optimize is supported.
Ruff lint pipeline: Added jqmc-lint-ruff.yml GitHub Action and updated .pre-commit-config.yaml. Applied auto-fixes and manual cleanups across the codebase, and removed non-ASCII characters from code and docstrings.

Performance & memory

AO module (HLO-level): Reduced L1/L2/DRAM traffic. Replaced segment_sum with a bucketed reduce+gather scheme (including V_l). Unrolled (8Z)**l in the Cartesian kernels to avoid XLA while loops. Removed eps from Cartesian GTOs. Fused AO/MO value/grad/lap into a single dispatch on hot paths.
Streaming caches on hot paths: Streamed cached AO and paired tables into every det-ratio / jas-ratio hot path. Improved the J3 streaming cache in jastrow_factor.py. Reused J3 streaming-state AOs in LRDMC mesh / ECP non-local ratios. Improved the K state carry in jqmc/wavefunction.py.
Jastrow ratios: Optimized J2 ratio from an O(N^2) baseline to O(N * N_grid) per-grid sums. Introduced a slim J3 state carry in the MCMC wavefunction update. Polished Jastrow with a dense (N, N) up-up / dn-dn pair reduction and removed scatter-add while loops.
GFMC_t: Added a streaming kinetic-state path (parity with GFMC_n) and replaced a Python while loop with lax.while_loop in the projection.
Misc: Vectorized the electron-configuration generator and PRNG key initialization. Switched the jackknife standard deviation to a two-pass centered sum-of-squares for better numerical stability. Replaced hessian() with jvp(grad) for the NN-Jastrow Laplacian.
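The two-pass centered sum-of-squares mentioned for the jackknife standard deviation can be sketched as follows. This is a minimal illustration for the jackknife error of a sample mean; the function name `jackknife_std_two_pass` and the test data are hypothetical and do not reproduce the jQMC implementation.

```python
import numpy as np

def jackknife_std_two_pass(samples):
    """Jackknife standard error of the mean via a two-pass centered sum of squares."""
    x = np.asarray(samples, dtype=np.float64)
    n = x.size
    loo = (x.sum() - x) / (n - 1)       # leave-one-out means
    center = loo.mean()                 # pass 1: mean of the leave-one-out estimates
    ss = np.sum((loo - center) ** 2)    # pass 2: centered sum of squares
    return np.sqrt((n - 1) / n * ss)

# Samples with a large offset, where sum(x**2) - n * mean**2 would lose most digits.
rng = np.random.default_rng(1)
x = 1.0e6 + rng.standard_normal(1000)
err = jackknife_std_two_pass(x)
```

Centering before squaring keeps the summed terms small, which is why this form tolerates data with a large mean better than a one-pass `sum(x**2) - n * mean**2` formula.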

Bug fixes

GFMC/MCMC logging: Improved loggers in jqmc/jqmc_gfmc.py and jqmc/jqmc_mcmc.py.

Workflow (`jqmc_workflow`)

Refactored workflow modules (vmc_workflow.py, mcmc_workflow.py, lrdmc_workflow.py, lrdmc_ext_workflow.py, workflow.py, and the _*.py helpers) for readability, with no behavior change.

Tests & infrastructure

Polished tolerance control across the test suite; introduced a medium tolerance for numerical-Laplacian tests and removed the separate autodiff tolerance.
Removed test_kinetic_energy_analytic_and_numerical and test_numerial_and_auto_grads_and_laplacians_ln_Det because the numerical references are intrinsically unstable; analytical versions are already validated against JAX autograd.
Shortened jqmc-run-full-pytest.yml and updated GitHub Actions.
Made the numerical-Laplacian debug functions more stable.

v0.2.1a2 Pre-release

kousuke-nakano released this 24 Apr 05:45

v0.2.1a2

dfaa1b0

Minor update focusing on workflow improvements, bug fixes, and new benchmark infrastructure.

New features

Kernel benchmark suite: Added benchmark modules and tests for profiling kernel performance.
cleanup_patterns option: Added a cleanup_patterns configuration option to jqmc_workflow for automatic post-run file cleanup, with support for recursive matching in subdirectories.
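As a rough illustration of pattern-based cleanup with recursive matching, the sketch below deletes files matching glob patterns under a working directory. The function `cleanup_files` and its arguments are hypothetical; the actual `cleanup_patterns` semantics in jqmc_workflow may differ.

```python
from pathlib import Path

def cleanup_files(workdir, patterns, recursive=True):
    """Delete files under workdir matching any glob pattern; return removed relative paths."""
    root = Path(workdir)
    removed = []
    for pattern in patterns:
        # rglob also descends into subdirectories; glob stays at the top level.
        matches = root.rglob(pattern) if recursive else root.glob(pattern)
        for path in sorted(matches):    # materialize the matches before deleting
            if path.is_file():
                path.unlink()
                removed.append(path.relative_to(root).as_posix())
    return removed
```

For example, `cleanup_files("run_dir", ["*.tmp"])` would remove `run_dir/a.tmp` and `run_dir/sub/b.tmp` while leaving files with other extensions untouched.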

Bug fixes

MPI deadlock in max_time / stop_flag: Fixed a deadlock that could occur during max_time and stop_flag checks in MPI runs.

v0.2.1a1 Pre-release

kousuke-nakano released this 16 Apr 04:59

v0.2.1a1

f438e94

This release focuses on a major update of the VMC optimizer (Linear Method), extended AO basis optimization, memory/performance improvements to the jqmc kernel package, and substantial hardening of the jqmc_workflow automation package.

New features

Linear Method (LM) optimizer: Implemented the Linear Method optimizer, which solves the generalized eigenvalue problem $\bar{H} v = E \bar{S} v$ for the optimal parameter update and provides a robust, fast alternative to plain Stochastic Reconfiguration (SR).
- LM is now integrated into the method="sr" code path, controlled by the use_lm flag and lm_subspace_dim parameter (inspired by TurboRVB's ncg=1 design).
- Unified optimizer hierarchy under method="sr":
  - use_lm=false: plain SR
  - use_lm=true, lm_subspace_dim=0: adaptive SR (aSR) with gamma scaling
  - use_lm=true, lm_subspace_dim>0: LM with SR collective variable + top-$p$ S/N ratio parameters
  - use_lm=true, lm_subspace_dim=-1: LM with SR collective variable + all parameters
- SR collective variable ($g = S^{-1}f$) is used as the first LM basis vector for stability.
- dgelscut-based preconditioning with iterative eigenvalue conditioning on the correlation matrix (condition number $\leq 1/\epsilon$), inspired by TurboRVB's implementation.
- S-orthonormalization ($P = U \Lambda^{-1/2}$) converts the generalized eigenvalue problem to standard form.
- Symmetrization of $H = K + B$ before eigenvalue solve to suppress finite-sample noise.
- Eigenvector selection by $\max |v_0|^2$ criterion.
- Separate epsilon (SR regularization) and lm_cond (LM dgelscut threshold) parameters.
- Fallback mechanisms: plain SR fallback ($\gamma=0.1$) when aSR finds no positive root; plain SR fallback ($0.1 \cdot g_\mathrm{sr}$) when LM does not predict energy improvement ($E_\mathrm{LM} > E_0 + 3\sigma$).
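The core linear-algebra steps listed above (symmetrization of $H$, S-orthonormalization with $P = U \Lambda^{-1/2}$, reduction to a standard eigenproblem, and selection by $\max |v_0|^2$) can be sketched in NumPy. This is a simplified illustration under stated assumptions: the plain eigenvalue threshold `cond_eps` is a stand-in for the dgelscut-based conditioning, index 0 is assumed to be the reference component, and the function name `solve_lm_sketch` is hypothetical. The SR fallbacks are not included.

```python
import numpy as np

def solve_lm_sketch(H, S, cond_eps=1e-3):
    """Solve H v = E S v after symmetrizing H and S-orthonormalizing the basis."""
    H_sym = 0.5 * (H + H.T)                 # suppress antisymmetric finite-sample noise
    lam, U = np.linalg.eigh(S)              # S = U diag(lam) U^T
    keep = lam > cond_eps * lam.max()       # simplified conditioning (assumption)
    P = U[:, keep] / np.sqrt(lam[keep])     # P = U Lambda^{-1/2}, so P^T S P = I
    E, W = np.linalg.eigh(P.T @ H_sym @ P)  # standard symmetric eigenproblem
    V = P @ W                               # back to the original basis
    idx = int(np.argmax(V[0, :] ** 2))      # pick the eigenvector with max |v_0|^2
    return E[idx], V[:, idx]

# Toy data: a well-conditioned S and a noisy, non-symmetric H.
rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
S = A @ A.T / n + np.eye(n)
H = rng.standard_normal((n, n))
E_sel, v_sel = solve_lm_sketch(H, S)
```

The returned vector is S-normalized ($v^T S v = 1$) and satisfies the generalized eigenvalue equation for the symmetrized $H$ whenever no eigenvalue of $S$ is discarded.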
Extended AO basis optimization: Implemented opt_J3_basis_coeff, opt_J3_basis_exp, opt_lambda_basis_coeff, and opt_lambda_basis_exp options for optimizing three-body Jastrow and geminal AO basis exponents and coefficients in VMC.
- Shell-shared constraint: Same-atom, same-shell primitives share exponents/coefficients via symmetrize_metric (size-preserving shell averaging), consistent with j3_matrix/lambda_matrix symmetrization.
- Dual symmetrization strategy: $O_k$ derivatives are symmetrized at source in get_dln_WF for accurate $f$ and $S$, and post-hoc symmetrization is applied after apply_block_update to prevent floating-point drift over hundreds of optimization steps.
- Improved AO basis exponent selection using a log-spaced median window with widened margin (/2.5).
use_swct parameter: Added use_swct flag to MCMC, GFMC_t, and GFMC_n classes to control Space Warp Coordinate Transformation (SWCT) on/off for atomic force calculations. Default is True for MCMC and False for GFMC (LRDMC). When disabled, zero arrays are used for omega/grad_omega, and force formulas reduce to bare Hellmann-Feynman and Pulay forces.
S/N ratio filter: Applied S/N ratio filtering before SR matrix construction to reduce the effective matrix dimension, improving both speed and numerical stability. O_matrix_local is sliced to selected parameters before building $S$, so all SR computations operate in the reduced space.
Shape assertions: Added rigorous shape assertions (using mcmc_counter, num_walkers, n_atoms) to get_E, get_aF, get_gF, get_aH in MCMC, GFMC_t, GFMC_n, and their debug counterparts.
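The S/N ratio filter described above can be sketched as follows: estimate a signal-to-noise ratio per parameter, keep the top-$p$ parameters, and slice the $O$ matrix before building $S$. The S/N definition used here (mean of per-sample force contributions divided by its standard error) and the names `select_by_sn_ratio`, `O_local`, and `e_local` as used below are assumptions for illustration, not the jQMC implementation.

```python
import numpy as np

def select_by_sn_ratio(O_local, e_local, top_p):
    """Return sorted indices of the top_p parameters by |f_k| / sigma(f_k)."""
    dO = O_local - O_local.mean(axis=0)
    dE = e_local - e_local.mean()
    g = -2.0 * dO * dE[:, None]                        # per-sample force contributions
    f = g.mean(axis=0)                                 # generalized force estimate
    sigma = g.std(axis=0, ddof=1) / np.sqrt(g.shape[0])  # standard error of f
    sn = np.abs(f) / sigma
    return np.sort(np.argsort(sn)[-top_p:]), sn

# Toy data: only parameter 0 is strongly correlated with the local energy.
rng = np.random.default_rng(0)
n_samples, n_params = 2000, 6
O_local = rng.standard_normal((n_samples, n_params))
e_local = 2.0 * O_local[:, 0] + 0.1 * rng.standard_normal(n_samples)

idx, sn = select_by_sn_ratio(O_local, e_local, top_p=3)
dO_sel = O_local[:, idx] - O_local[:, idx].mean(axis=0)  # slice before building S
S_reduced = dO_sel.T @ dO_sel / n_samples                # (top_p, top_p) matrix
```

Slicing first yields the same matrix as building the full $S$ and extracting the selected block, but only the reduced matrix is ever formed.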

Performance & memory

Buffer-based MPI reduce in SR: Eliminated list() round-trips and switched to buffer-based MPI reduce in the SR optimizer for lower overhead.
Pre-compute collective observable: Compute $O_\mathrm{SR} = \delta O \cdot g$ while the full $O$-matrix is still in memory during the SR solve, avoiding a redundant get_dln_WF call in the LM path.
Avoid redundant JIT compilation: Refactored run_optimize in jqmc_mcmc.py to skip redundant computation via early continue, reducing unnecessary recompilation.
jax.clear_caches() after optimization loop: Added cache clearing after the optimization loop as an OOM workaround.
Store np.array instead of list(np.array): Refactored internal data storage to use np.array directly, reducing memory fragmentation.
Wrapped properties with np.asarray(): Prevent accidental storage of JAX arrays in checkpoint data to avoid OOM.
Avoid redundant energy/force post-processing: Skip unnecessary re-computation of energy and force post-processing.
Better memory management: Improved memory handling in jqmc_gfmc.py and jqmc_mcmc.py.

Bug fixes

GFMC_n / GFMC_t spin-polarized MPI bug: Fixed a critical bug for systems where n_up != n_dn and n_dn >= 1 with MPI >= 2 processes.
GFMC_t projection averaging: Fixed incorrect averaging of the number of projections across MPI ranks in GFMC_t.
SR with num_params >= num_samples: Fixed MPI bug when the number of optimizable parameters exceeds the number of samples.
MPI Allreduce for scalars: Replaced Allreduce with allreduce for scalar int and float values in jqmc_mcmc.py and jqmc_gfmc.py, as Allreduce for scalars exhibits implementation-dependent behavior.
Optimizer step estimation: Fixed estimate_required_steps — removed incorrect ceil rounding and max clamp that ignored walker_ratio; added min_steps parameter.
SR stability near convergence: Improved stability of SR with adaptive learning rate in the vicinity of convergence.
Pytree inconsistency: Fixed a JAX pytree structural mismatch.
S/N ratio diagnostics: Fixed averaging (last S/N ratio → averaged S/N ratios) and trivial output bugs.

Workflow (`jqmc_workflow`)

Major refactoring: Comprehensive overhaul of all workflow modules (vmc_workflow.py, mcmc_workflow.py, lrdmc_workflow.py, workflow.py) with improved robustness, cleaner code structure, and new _phase.py module for phase management.
SSH / file-descriptor leak fixes: Fixed SSH connection hangs and leaks; consolidated Machine objects to prevent resource exhaustion.
Continuation behavior: Changed and improved the behavior of workflow continuations with SHA256-based input fingerprinting for reliable restart detection.
Step count accumulation: Fixed a bug in accumulated step counts for VMC and LRDMC workflows.
VMC convergence check: Implemented a new VMC energy-slope-based convergence check.
New VMC workflow parameters: Introduced additional configurable parameters for vmc_workflow.py.
Output parser fixes: Fixed parsers for workflow output processing.
FileFrom handling: Fixed and polished FileFrom file-transfer logic.
Job ID checks: Updated and improved job ID check logic for remote execution.
Error estimation in workflows: Fixed error estimation methods used by workflows.
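SHA256-based input fingerprinting for restart detection can be sketched with the standard library. The function `fingerprint_inputs` and the choice of what gets hashed (a parameter dictionary plus named file contents) are hypothetical; the notes do not specify which inputs jqmc_workflow actually hashes.

```python
import hashlib
import json

def fingerprint_inputs(params, file_bytes):
    """Return a SHA256 hex digest over parameters and input file contents."""
    h = hashlib.sha256()
    # Serialize with sorted keys so that dictionary ordering does not change the digest.
    h.update(json.dumps(params, sort_keys=True).encode("utf-8"))
    for name in sorted(file_bytes):
        h.update(name.encode("utf-8"))
        h.update(file_bytes[name])
    return h.hexdigest()

fp_a = fingerprint_inputs({"num_steps": 100, "method": "sr"}, {"wf.h5": b"abc"})
fp_b = fingerprint_inputs({"method": "sr", "num_steps": 100}, {"wf.h5": b"abc"})
fp_c = fingerprint_inputs({"method": "sr", "num_steps": 200}, {"wf.h5": b"abc"})
```

A stored fingerprint that matches the current one suggests the stage can be continued; a mismatch suggests the inputs changed and the stage should be rerun.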

Breaking changes

Removed num_param_opt and opt_filter_min_SN_ratio: These parameters have been removed from run_optimize(), CLI, workflow, and TOML config. SR and optax now always optimize all parameters; parameter selection is handled internally by the LM subspace mechanism.
Replaced adaptive_learning_rate with use_lm: The adaptive_learning_rate flag is replaced by the use_lm flag, which controls the unified LM/aSR optimizer hierarchy.
Removed method="lm" as separate code path: The Linear Method is now accessed via method="sr" with use_lm=true.
New optimizer parameter names: lm_cond (default 0.001) replaces the previous LM-specific delta/epsilon naming.
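The breaking changes above can be summarized as a small migration helper operating on a plain dictionary of optimizer options. Both functions are hypothetical illustrations: they encode only the renames and removals listed in these notes and the mode hierarchy described for method="sr", and they are not part of the jQMC API.

```python
def migrate_optimizer_options(old):
    """Map pre-v0.2.1a1 optimizer options onto the new names (illustrative only)."""
    new = dict(old)
    for removed in ("num_param_opt", "opt_filter_min_SN_ratio"):
        new.pop(removed, None)               # removed without replacement
    if "adaptive_learning_rate" in new:
        new["use_lm"] = bool(new.pop("adaptive_learning_rate"))
    if new.get("method") == "lm":
        new["method"] = "sr"                 # LM now lives under method="sr"
        new["use_lm"] = True
    return new

def describe_sr_mode(use_lm, lm_subspace_dim):
    """Name the optimizer selected by (use_lm, lm_subspace_dim) under method='sr'."""
    if not use_lm:
        return "plain SR"
    if lm_subspace_dim == 0:
        return "adaptive SR (aSR)"
    if lm_subspace_dim > 0:
        return f"LM: SR collective variable + top-{lm_subspace_dim} S/N parameters"
    if lm_subspace_dim == -1:
        return "LM: SR collective variable + all parameters"
    raise ValueError("lm_subspace_dim must be -1, 0, or a positive integer")
```

The helper deliberately leaves lm_subspace_dim and lm_cond unset, since the notes do not state which values reproduce the behavior of a given old configuration.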

v0.2.0a1 Pre-release

kousuke-nakano released this 10 Mar 12:43

v0.2.0a1

b3e1b9b

This is a major update with drastic performance improvements, new features, and a new workflow automation package (jqmc_workflow).

Performance

Drastic speedups: MCMC, VMC, and LRDMC are all significantly faster than the previous version thanks to pervasive use of fast-update algorithms throughout the code.
LU -> SVD replacement: Replaced LU factorizations with SVD across determinant, geminal, GFMC_n, and GFMC_t modules, greatly improving numerical stability for ill-conditioned matrices.
GEMM optimization: Converted matrix-vector operations to matrix-matrix (GEMM) operations in Coulomb potential, determinant, and Jastrow factor modules for better GPU utilization.
Cartesian / Spherical AO conversion: Implemented Cartesian AO <-> Spherical AO conversion. Cartesian GTOs are substantially faster than spherical GTOs on GPUs, so users can now exploit this for better throughput.
ECP fast computation: Implemented compute_ecp_coulomb_potential_fast for efficient pseudopotential evaluation.
vmap + jit fix: vmap-ed functions are now explicitly wrapped with jit, as vmap does not automatically JIT-compile the mapped function.
Removed mpi4jax dependency for CG: Conjugate gradient (CG) solver now uses pure MPI on CPUs, eliminating the mpi4jax dependency.
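The LU -> SVD replacement can be illustrated with a minimal NumPy sketch that obtains log|det A| and the inverse from the singular value decomposition. The function name `logabsdet_and_inverse_svd` and the test matrix are hypothetical; the jQMC kernels are written in JAX and may organize this differently.

```python
import numpy as np

def logabsdet_and_inverse_svd(A):
    """Compute log|det A| and A^{-1} from the SVD A = U diag(s) V^T."""
    U, s, Vt = np.linalg.svd(A)
    logabsdet = np.sum(np.log(s))   # |det A| is the product of the singular values
    A_inv = (Vt.T / s) @ U.T        # A^{-1} = V diag(1/s) U^T
    return logabsdet, A_inv

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 4.0 * np.eye(4)
logabsdet, A_inv = logabsdet_and_inverse_svd(A)
```

Because the singular values are available explicitly, near-singular directions are visible and can be inspected or regularized, which a plain LU factorization does not expose as directly.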

Optimization

Adaptive learning rate for Stochastic Reconfiguration: Implemented a linear-method-inspired automatic learning-rate adjustment scheme, leading to dramatically faster optimization convergence.
Molecular orbital optimization: Added MO optimization for JSD wavefunctions via the projection method with Attacalite-Sorella regularization.
Geminal AO -> MO projection: Implemented AO overlap matrix computation and geminal AO -> MO projection for constrained optimization.

New features

LRDMC force calculations: Implemented LRDMC atomic forces with the Pathak–Wagner regularization.
Jastrow functions: Added jastrow_1b_type ('exp' / 'pade') and jastrow_2b_type ('pade' / 'exp') fields to Jastrow_one_body_data and Jastrow_two_body_data, enabling runtime selection of the one-body and two-body Jastrow functional forms.
- Exponential form: $u(r) = \frac{1}{2b}(1 - e^{-br})$
- Padé form: $u(r) = \frac{r}{2(1 + br)}$
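The two functional forms above can be written directly as NumPy functions. The names `jastrow_exp` and `jastrow_pade` are hypothetical and are not the jQMC API; the sketch only evaluates the formulas as stated.

```python
import numpy as np

def jastrow_exp(r, b):
    """Exponential form: u(r) = (1 - exp(-b r)) / (2 b)."""
    return (1.0 - np.exp(-b * r)) / (2.0 * b)

def jastrow_pade(r, b):
    """Pade form: u(r) = r / (2 (1 + b r))."""
    return r / (2.0 * (1.0 + b * r))

r = np.linspace(0.0, 5.0, 6)
u_exp = jastrow_exp(r, b=1.0)
u_pade = jastrow_pade(r, b=1.0)
```

Both forms vanish at r = 0, have slope 1/2 there, and saturate at 1/(2b) for large r; they differ in how quickly they approach that limit.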

Bug fixes

Fixed force calculations producing NaN values; added NaN checks in all tests.
Fixed MCMC memory overflow caused by storing r_up_history / r_dn_history.
Fixed wavefunction without Jastrow not working for MCMC.
Fixed missing NN-Jastrow derivatives in _GFMC_n_debug.

Infrastructure

Restart file format change: Switched restart files from pickle-based *.chk to HDF5-based *.h5. Note: backward compatibility with old *.chk files is not maintained.
jqmc_workflow package: Introduced the jqmc_workflow automation package for orchestrating multi-stage QMC pipelines (WF conversion → VMC optimization → MCMC / LRDMC production) with automatic step estimation, checkpointing, and remote job management.
Removed SWCT_data: Cleaned up legacy SWCT_data class as part of codebase refactoring.
More comprehensive tests: Substantially expanded the test suite to cover the new features and improve overall reliability.
Expanded examples: Reorganized and enriched the examples/ directory with 11 end-to-end tutorials (jqmc-example01–jqmc-example08, jqmc-workflow-example01–jqmc-workflow-example03) covering single-point VMC/LRDMC, force calculations, GPU walker-scaling benchmarks, interaction-energy workflows, and PES scans with automated jqmc_workflow pipelines.

v0.1.0 Pre-release

kousuke-nakano released this 05 Feb 12:15

v0.1.0

f3e2518

Release of the first stable version of jQMC.

Known Limitation(s)

Periodic Boundary Condition (PBC) calculations are being implemented for the next major release.

v0.1.0a3 Pre-release

kousuke-nakano released this 24 Jan 05:46

v0.1.0a3

3ae7322

Release of the third alpha version of jQMC.

Key Features

Analytical derivatives:
- Implemented analytical gradients and Laplacians for atomic and molecular orbitals in both spherical and Cartesian GTO bases.
- JAX autograd is now used primarily for validating the analytical gradients.
- Logarithmic derivatives of the wavefunction and derivatives of atomic force calculations still use JAX autograd.
Testing precision:
- Tightened and systematized decimal controls in tests, improving overall reliability.
Fast updates:
- Expanded fast-update implementations to more functions, yielding significant speedups in both MCMC and GFMC modules.

v0.1.0a1 Pre-release

kousuke-nakano released this 14 Jan 01:48

v0.1.0a1

679220f

Release of the second alpha version of jQMC.

Key Features

Neural Network Jastrow:
- Introduced NNJastrow, a PauliNet-inspired neural network architecture for many-body Jastrow factors, enabling more accurate wavefunction ansatz.
Optimization Control:
- Implemented proper gradient masking mechanisms (e.g., with_param_grad_mask). This allows for selectively freezing or optimizing specific parameter blocks (One-body, Two-body, Three-body, NN, and Geminal coefficients) during the VMC optimizations.

Enhancements & Fixes

I/O: Changed the storage format for hamiltonian_data from pickled binary files to HDF5 (.h5) for better portability and compatibility.
Documentation: Updated README.md, docstrings, and API references to reflect recent changes and fix Sphinx warnings.
CI/CD: Updated pre-commit configurations and GitHub workflow triggers.
Code Quality: Refactored code based on suggestions and improved type hinting.

v0.1.0a0 Pre-release

kousuke-nakano released this 09 Jan 01:54

v0.1.0a0

c024bd3

We are pleased to announce the first alpha release of jQMC, a Python-based Quantum Monte Carlo package built on JAX.

Key Features

JAX-based Core: Fully utilizes JAX's Just-In-Time (JIT) compilation and automatic vectorization (vmap) for high-performance simulations on GPUs and TPUs.
Algorithms:
- Variational Monte Carlo (VMC): Supports wavefunction optimization via Stochastic Reconfiguration (SR) and Natural Gradient methods.
- Lattice Regularized Diffusion Monte Carlo (LRDMC): A stable and efficient projection method for ground state calculations.
Wavefunctions:
- Ansatz: Supports Jastrow-Slater Determinant (JSD) and Jastrow-Antisymmetrized Geminal Power (JAGP).
- Jastrow Factors: Includes One-body, Two-body, and Three/Four-body terms.
- Determinant Types: Single Determinant (SD), Antisymmetrized Geminal Power (AGP), and Number-constrained AGP (AGPn).
I/O & Interoperability:
- TREX-IO Support: Interfaces with the TREX-IO library (HDF5 backend) for standardized input of molecular structure and basis sets (Cartesian & Spherical GTOs).
Parallelization:
- MPI Support: Uses mpi4py for efficient parallelization across multiple nodes.
Documentation:
- Comprehensive technical notes on Wavefunctions, VMC, LRDMC, and JAX implementation details.
- Examples demonstrating usage for various systems (H2, N2, Water, etc.).

Known Limitations (Alpha)

Periodic Boundary Conditions (PBC) are currently in development.
Atomic force calculations with spherical harmonics are computationally intensive on current JAX versions.
Complex wavefunctions are not yet supported.

Releases: jqmc-project/jQMC
