2 changes: 1 addition & 1 deletion docs/examples/Power Flow Example.ipynb
Expand Up @@ -1537,7 +1537,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "power-grid-model",
"display_name": "power-grid-model (3.14.3)",
"language": "python",
"name": "python3"
},
Expand Down
97 changes: 91 additions & 6 deletions docs/user_manual/dataset-terminology.md
Expand Up @@ -11,6 +11,89 @@ attribute.
For detailed data types used throughout `power-grid-model`, please refer to
[Python API Reference](../api_reference/python-api-reference.md).

## Buffer Type

Defines how component data is ordered in memory.

### Row-based (row-major)

Attributes of the same component are stored contiguously before moving to the next component.

### Columnar (column-major)

Attributes are grouped across components by attribute type.
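
The two layouts can be contrasted with a minimal sketch. Plain Python lists stand in for the numpy arrays the library works with, the `id` and `u_rated` attributes are used for illustration only, and the helper names `to_columnar` and `to_row_based` are hypothetical, not part of `power-grid-model`.

```python
# Row-based: one record per component, all attributes of a component adjacent.
row_based = [
    (1, 10500.0),  # (id, u_rated) of the first component
    (2, 10500.0),
    (3, 400.0),
]

# Columnar: one sequence per attribute, spanning all components.
columnar = {
    "id": [1, 2, 3],
    "u_rated": [10500.0, 10500.0, 400.0],
}


def to_columnar(rows, attribute_names):
    """Regroup row-based records into one list per attribute (hypothetical helper)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(attribute_names)}


def to_row_based(columns, attribute_names):
    """Regroup columnar data into one tuple per component (hypothetical helper)."""
    return list(zip(*(columns[name] for name in attribute_names)))
```

Both forms hold the same values; only the grouping differs, so converting in either direction loses nothing.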

## Buffer Representation

Defines whether component data can be interpreted as a dense 2D matrix.

### Dense

Dense buffers represent data as a rectangular matrix.
This representation implies that all scenarios contain the same number of component entries.

### Sparse

Component data is stored as a flattened 1D buffer.

Scenario boundaries are defined using an index pointer (`indptr`).
The `indptr` defines how the flattened buffer is segmented into per-scenario ranges.

Sparse buffers may be either uniform or non-uniform.
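
As a rough illustration of how `indptr` segments the flattened buffer, the sketch below slices a plain Python list into per-scenario ranges. The function name `split_scenarios` and the sample records are assumptions made for this example; the library itself operates on numpy arrays.

```python
def split_scenarios(data, indptr):
    """Return one slice data[indptr[k]:indptr[k + 1]] per scenario k."""
    return [data[indptr[k]:indptr[k + 1]] for k in range(len(indptr) - 1)]


# Six flattened entries spread over three scenarios (illustrative values).
data = ["a", "b", "c", "d", "e", "f"]
indptr = [0, 2, 2, 6]  # length is number of scenarios + 1

scenarios = split_scenarios(data, indptr)
```

Here the second scenario is empty because `indptr[1] == indptr[2]`, which a dense 2D layout cannot express.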

## Component Dataset Independency

Defines whether all scenarios operate on the same component IDs.

### Independent

All scenarios modify the same component IDs in the same order.

A reset is required between scenarios.

### Dependent

Different scenarios may modify different components.

A reset is required between scenarios.
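
The condition that all scenarios address the same component IDs in the same order can be checked with a short sketch like the one below. This is not the library's implementation; the function name `all_scenarios_share_ids` and the list-of-lists input are assumptions for illustration.

```python
def all_scenarios_share_ids(ids_per_scenario):
    """True if every scenario lists the same component IDs in the same order."""
    if not ids_per_scenario:
        return True
    first = list(ids_per_scenario[0])
    return all(list(ids) == first for ids in ids_per_scenario[1:])


same_ids = all_scenarios_share_ids([[1, 2, 3], [1, 2, 3]])
different_order = all_scenarios_share_ids([[1, 2, 3], [3, 2, 1]])
different_ids = all_scenarios_share_ids([[1, 2], [1]])
```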

## Component Data Uniformity

Defines whether all scenarios contain the same number of component entries.
Uniformity is independent of buffer representation.

### Uniform

All scenarios contain the same number of component entries.

- Dense buffers are always uniform (by construction)
- Sparse buffers may also be uniform

### Non-uniform

Scenarios contain different numbers of component entries.

- Only possible in sparse representation
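
For a sparse buffer, uniformity can be derived from `indptr` alone. The following is a minimal sketch with hypothetical helper names (`elements_per_scenario`, `is_uniform`), using plain Python lists instead of numpy arrays.

```python
def elements_per_scenario(indptr):
    """Number of component entries in each scenario, from consecutive indptr differences."""
    return [indptr[k + 1] - indptr[k] for k in range(len(indptr) - 1)]


def is_uniform(indptr):
    """True if all scenarios contain the same number of component entries."""
    return len(set(elements_per_scenario(indptr))) <= 1


uniform_counts = elements_per_scenario([0, 2, 4, 6])
non_uniform_counts = elements_per_scenario([0, 2, 2, 6])
```

A sparse buffer that passes this check could also be laid out as a dense 2D matrix with one row per scenario.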

## Serialization Representation

Defines how datasets are serialized.

### Compact List

Uses positional arrays instead of named attributes.
The attributes present in the dataset are stored separately.

Generated when using `use_compact_list=True`.

### Named Map

Uses explicit attribute names per component.

### Mixed

Combination of compact list and named map (only possible in manual construction, e.g. validation datasets).
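
The difference between the two forms can be sketched with plain JSON-compatible structures. The helper `to_compact_list` and the sample attribute values are hypothetical and only illustrate the idea; they are not the serializer's actual code.

```python
import json

# Named map: every component carries its own attribute names.
named_map = [
    {"id": 1, "u_rated": 10500.0},
    {"id": 2, "u_rated": 400.0},
]


def to_compact_list(components, attribute_names):
    """Drop the names and keep the values in the order given by attribute_names."""
    return [[component[name] for name in attribute_names] for component in components]


# In the compact form the attribute names are stored once, separately from the values.
attributes = ["id", "u_rated"]
compact_list = to_compact_list(named_map, attributes)

print(json.dumps(named_map))
print(json.dumps(compact_list))
```

The compact form avoids repeating attribute names per component, at the cost of relying on the separately stored attribute order to interpret each value.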

## Data structures

```{mermaid}
Expand Down Expand Up @@ -75,7 +158,7 @@ graph TD
elements of all components) for a single scenario.
- **{py:class}`BatchDataset <power_grid_model.data_types.BatchDataset>`:** A data type storing update and/or output
data for one or more scenarios.
A batch dataset can contain sparse or dense data, depending on the component.
A batch dataset can contain dense or sparse representations per component.

- **{py:class}`ComponentData <power_grid_model.data_types.ComponentData>`:** The data corresponding to the component.
- **{py:class}`DataArray <power_grid_model.data_types.DataArray>`:** A data array can be a single or a batch array.
Expand All @@ -85,10 +168,11 @@ graph TD
- **{py:class}`BatchArray <power_grid_model.data_types.BatchArray>`:** Multiple batches of data can be represented
in sparse or dense forms.
- **{py:class}`DenseBatchArray <power_grid_model.data_types.DenseBatchArray>`:** A 2D structured numpy array
containing a list of components of the same type for each scenario.
containing a list of components of the same type for each scenario. This implies all scenarios contain the
same number of components (uniform structure).
- **{py:class}`SparseBatchArray <power_grid_model.data_types.SparseBatchArray>`:** A typed dictionary with a 1D
numpy array of `IndexPointer` type under `indptr` key and `SingleArray` under `data` key which is all components
flattened over all batches.
flattened across scenarios, with scenario boundaries defined by `indptr`.
- **{py:class}`ColumnarData <power_grid_model.data_types.ColumnarData>`:** A dictionary of attributes as keys and
individual numpy arrays as values.
This format is described in more detail in
Expand Down Expand Up @@ -183,9 +267,10 @@ The batch size is the number of scenarios.
- **n_scenarios:** The total number of scenarios in the batch.
(Same as Batch Size)

- **n_component_elements_per_scenario:** The number of elements of a specific component for each scenario.
This can be an integer (for dense batches), or a list of integers for sparse batches, where each integer in the list
represents the number of elements of a specific component for the scenario corresponding to the index of the integer.
- **n_component_elements_per_scenario:** The number of component instances per scenario, independent of representation
format (dense or sparse). This can be an integer (for dense batches), or a list of integers for sparse batches,
where each integer in the list represents the number of elements of a specific component for the scenario
corresponding to the index of the integer.

- **Sub-batch:** When computing in parallel, all scenarios in batch calculation are distributed over threads.
Each thread handles a subset of the `Batch`, called a `Sub-batch`.
Expand Down
62 changes: 30 additions & 32 deletions docs/user_manual/serialization.md
Original file line number Diff line number Diff line change
Expand Up @@ -40,16 +40,16 @@ data.

#### JSON schema attributes object

[`Attributes`](#json-schema-attributes-object) contains specified attributes per [`Component`](#json-schema-component)
type (e.g.: `"node"`).
It is only required for those components that contain `HomogeneousComponentData` objects and that data needs to follow
the attributes listed in this object.
It may be empty if for data for all instances certain component is `InhomogeneousComponentData`.
It reduces compression when a dataset largely follows the exact same pattern.
[`Attributes`](#json-schema-attributes-object) defines the attribute list and ordering
for each [`Component`](#json-schema-component) (e.g.: `"node"`) when component data is represented
using the compact list format (`use_compact_list=True`).

The order of attributes in this section determines the order of values in the compact list representation.
This is independent of whether the component data is stored as `DenseComponentData` or `SparseComponentData`.

- [`Attributes`](#json-schema-attributes-object): `Object`
- [`Component`](#json-schema-component): [`ComponentAttributes`](#json-schema-component-attributes) containing the
desired [`Attribute`](#json-schema-attribute)s for that [`Component`](#json-schema-component).
- [`Component`](#json-schema-component): [`ComponentAttributes`](#json-schema-component-attributes)
defining the ordered list of [`Attribute`](#json-schema-attribute)s for that component.

For example, for an `"update"` dataset that contains only updates to the `"from_status"` attribute of `"branch"`
components, it may be `{"branch": ["from_status"]}`.
Expand All @@ -59,6 +59,7 @@ components, it may be `{"branch": ["from_status"]}`.
A [`Component`](#json-schema-component) string contains the component name (see also the [Components](components.md)
reference).
E.g.: `"node"`

- [`Component`](#json-schema-component): `string`

Expand Down Expand Up @@ -124,33 +125,30 @@ remains the same.

#### JSON schema component data object

A [`ComponentData`](#json-schema-component-data-object) object is either a
[`HomogeneousComponentData`](#json-schema-homogeneous-component-data-object) object or an
[`InhomogeneousComponentData`](#json-schema-inhomogeneous-component-data-object) object
A [`ComponentData`](#json-schema-component-data-object) represents the data of a single component instance.

It can be stored in either dense or sparse representation:

- [`ComponentData`](#json-schema-component-data-object):
[`HomogeneousComponentData`](#json-schema-homogeneous-component-data-object) |
[`InhomogeneousComponentData`](#json-schema-inhomogeneous-component-data-object)
- [`DenseComponentData`](#json-schema-component-data-object-dense-representation)
- [`SparseComponentData`](#json-schema-component-data-object-sparse-representation)

#### JSON schema homogeneous component data object
#### JSON schema component data object (dense representation)

A [`HomogeneousComponentData`](#json-schema-homogeneous-component-data-object) object contains the actual values of a
certain component following the exact order of the attributes listed in the [`attributes`](#json-schema-root-object)
field in the [`PowerGridModelRoot`](#json-schema-root-object) object.
A dense component data object stores values in a fixed positional order defined by the `attributes` field
in the root object.

- [`HomogeneousComponentData`](#json-schema-homogeneous-component-data-object): `Array`
- [`AttributeValue`](#json-schema-attribute-value): the value of each attribute.
- [`DenseComponentData`](#json-schema-component-data-object-dense-representation): `Array`
- [`AttributeValue`](#json-schema-attribute-value): values in the exact order defined by the component's attribute
list.

#### JSON schema inhomogeneous component data object
#### JSON schema component data object (sparse representation)

An [`InhomogeneousComponentData`](#json-schema-inhomogeneous-component-data-object) object contains actual values per
attribute of a certain component.
Contrary to the [`HomogeneousComponentData`](#json-schema-homogeneous-component-data-object), it lists the names of the
attributes for which the values are specified, so the attributes may be in arbitrary order and do not have to follow the
schema listed in the [`attributes`](#json-schema-root-object) field in the
[`PowerGridModelRoot`](#json-schema-root-object) object.
A component data object in sparse representation stores values grouped by attribute, with explicit attribute names.
Unlike the dense representation, it imposes no fixed ordering, so attributes may appear in arbitrary order
and may vary between components or scenarios.

- [`InhomogeneousComponentData`](#json-schema-inhomogeneous-component-data-object): `Object`
- [`SparseComponentData`](#json-schema-component-data-object-sparse-representation): `Object`
- [`Attribute`](#json-schema-attribute): [`AttributeValue`](#json-schema-attribute-value): the value of each attribute.
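
A reader of such a dataset has to handle both forms of component data: an array whose values follow the attribute list, and an object with explicit attribute names. The sketch below shows one way to normalize both into name-to-value mappings. The function `decode_component` and the sample values are hypothetical and not taken from the library.

```python
def decode_component(entry, attribute_names):
    """Return a name -> value mapping for one component entry."""
    if isinstance(entry, dict):
        # Object form: attribute names are explicit, so order is irrelevant.
        return dict(entry)
    if len(entry) != len(attribute_names):
        raise ValueError("positional entry does not match the attribute list")
    # Array form: values follow the order of the attribute list.
    return dict(zip(attribute_names, entry))


# Attribute list for this component type, as it would appear under `attributes`.
source_attributes = ["id", "node", "status", "u_ref"]

# A mixture of both forms within one component type (illustrative values).
sources = [
    [15, 1, 1, 1.0],
    {"id": 16, "node": 1, "status": 1, "u_ref": 1.05},
]

decoded = [decode_component(entry, source_attributes) for entry in sources]
```

Checking the entry length against the attribute list catches positional data that was written against a different attribute order or count.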

Expand Down Expand Up @@ -255,11 +253,11 @@ The type is listed for each attribute in [Components](components.md).

The following example contains an input dataset.
The nodes and sym_loads are represented using
[`HomogeneousComponentData`](#json-schema-homogeneous-component-data-object),
the lines are represented using [`InomogeneousComponentData`](#json-schema-inhomogeneous-component-data-object),
[`DenseComponentData`](#json-schema-component-data-object-dense-representation),
the lines are represented using [`SparseComponentData`](#json-schema-component-data-object-sparse-representation),
while the sources are represented using a mixture of
[`HomogeneousComponentData`](#json-schema-homogeneous-component-data-object) and
[`InomogeneousComponentData`](#json-schema-inhomogeneous-component-data-object).
[`DenseComponentData`](#json-schema-component-data-object-dense-representation) and
[`SparseComponentData`](#json-schema-component-data-object-sparse-representation).

```json
{
Expand Down
12 changes: 6 additions & 6 deletions src/power_grid_model/_core/power_grid_model.py
Expand Up @@ -613,11 +613,11 @@ def calculate_power_flow( # noqa: PLR0913
- key: Component type name to be updated in batch.
- value:

- For homogeneous update batch (a 2D numpy structured array):
- For dense (uniform) update batch (a 2D numpy structured array):

- Dimension 0: Each batch.
- Dimension 1: Each updated element per batch for this component type.
- For inhomogeneous update batch (a dictionary containing two keys):
- For sparse (non-uniform) update batch (a dictionary containing two keys):

- indptr: A 1D numpy int64 array with length n_batch + 1. Given batch number k, the
update array for this batch is data[indptr[k]:indptr[k + 1]]. This is the concept of
Expand Down Expand Up @@ -800,11 +800,11 @@ def calculate_state_estimation( # noqa: PLR0913
- key: Component type name to be updated in batch.
- value:

- For homogeneous update batch (a 2D numpy structured array):
- For dense (uniform) update batch (a 2D numpy structured array):

- Dimension 0: Each batch.
- Dimension 1: Each updated element per batch for this component type.
- For inhomogeneous update batch (a dictionary containing two keys):
- For sparse (non-uniform) update batch (a dictionary containing two keys):

- indptr: A 1D numpy int64 array with length n_batch + 1. Given batch number k, the
update array for this batch is data[indptr[k]:indptr[k + 1]]. This is the concept of
Expand Down Expand Up @@ -964,11 +964,11 @@ def calculate_short_circuit( # noqa: PLR0913
- key: Component type name to be updated in batch
- value:

- For homogeneous update batch (a 2D numpy structured array):
- For dense (uniform) update batch (a 2D numpy structured array):

- Dimension 0: each batch
- Dimension 1: each updated element per batch for this component type
- For inhomogeneous update batch (a dictionary containing two keys):
- For sparse (non-uniform) update batch (a dictionary containing two keys):

- indptr: A 1D numpy int64 array with length n_batch + 1. Given batch number k, the
update array for this batch is data[indptr[k]:indptr[k + 1]]. This is the concept of
Expand Down