1 change: 1 addition & 0 deletions AUTHORS.rst
@@ -23,6 +23,7 @@ The following persons contributed to the development of the |pyam| package:
- Pietro Monticone `@pitmonticone <https://github.com/pitmonticone>`_
- Edward Byers `@byersiiasa <https://github.com/byersiiasa>`_
- Fridolin Glatter `@glatterf42 <https://github.com/glatterf42>`_
- Zachary Schmidt `@zacharyschmidt <https://github.com/zacharyschmidt>`_

| The core maintenance of the |pyam| package is done by
the *Scenario Services & Scientific Software* research theme
1 change: 1 addition & 0 deletions RELEASE_NOTES.md
@@ -1,5 +1,6 @@
# Next release

- [#875](https://github.com/IAMconsortium/pyam/pull/875) Add methods to the `compute` module implementing Kaya decomposition analysis.
- [#880](https://github.com/IAMconsortium/pyam/pull/880) Use `pd.Series.iloc[pos]` for forward-compatibility
- [#877](https://github.com/IAMconsortium/pyam/pull/xxx) Support `engine` and other `pd.ExcelFile` keywords.

244 changes: 244 additions & 0 deletions pyam/compute.py
@@ -4,8 +4,10 @@
import pandas as pd
import wquantiles

import pyam
from pyam._debiasing import _compute_bias
from pyam.index import replace_index_values
from pyam.kaya import kaya_factors, kaya_variables, lmdi
from pyam.timeseries import growth_rate
from pyam.utils import remove_from_list

@@ -249,6 +251,248 @@ def bias(self, name, method, axis):
"""
        _compute_bias(self._df, name, method, axis)

    def kaya_variables(self, scenarios, append=False):
Member

There's a method kaya_variables and an imported module with the same name, which then also has a method of that name... Hard to follow the code logic here.

Author

I changed the function names to compute_kaya_variables and compute_kaya_factors. Now compute.py has methods kaya_variables and kaya_factors which call functions compute_kaya_variables and compute_kaya_factors. The functions are imported from the kaya_variables.py and kaya_factors.py modules, so those names are still duplicated. Let me know if this is still unclear and I'll make more changes.

        """Compute the variables needed to compute Kaya factors
        for the Kaya Decomposition Analysis.

        Parameters
        ----------
        scenarios : iterable of tuples (model, scenario, region)
            The (model, scenario, region) combinations to be included.
Member

Wondering about two issues with the scenarios filter:

  1. Why not just compute kaya-variables for all data in the IamDataFrame?
  2. If it's necessary to filter, why not allow filter-kwargs and apply them like this?

         def kaya_variables(self, append=False, **kwargs):
             _data = self._df.filter(**kwargs)
             ...

I'm concerned about adding an argument "scenarios" that is actually a combination of model-scenario-region, that's quite confusing...
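
For illustration only, the filter-kwargs idea can be sketched with plain pandas. The helper `filter_rows` and the sample data below are hypothetical stand-ins, not pyam's own filtering code; the sketch only shows how keyword arguments could replace a dedicated `scenarios` argument.

```python
import pandas as pd


def filter_rows(data: pd.DataFrame, **kwargs) -> pd.DataFrame:
    """Keep rows whose columns match the keyword filters (hypothetical helper)."""
    mask = pd.Series(True, index=data.index)
    for column, accepted in kwargs.items():
        # accept either a single string or an iterable of strings per column
        values = [accepted] if isinstance(accepted, str) else list(accepted)
        mask &= data[column].isin(values)
    return data[mask]


# made-up sample data in long format
data = pd.DataFrame(
    {
        "model": ["model_a", "model_a", "model_b"],
        "scenario": ["scen_a", "scen_b", "scen_a"],
        "region": ["World", "World", "World"],
        "variable": ["Population"] * 3,
        "value": [7.0, 7.1, 6.9],
    }
)

subset = filter_rows(data, model="model_a", scenario=["scen_a", "scen_b"])
print(subset)
```

Calling the helper without keyword arguments returns all rows, which matches suggestion 1 as the default behaviour.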

Author

@zacharyschmidt zacharyschmidt Dec 9, 2024

I removed the scenarios filter and followed suggestion 1--we'll try to compute kaya-variables for all scenario/model/region combinations in the dataframe. I didn't try to implement returning partial sets of kaya-variables for scenario/model/region combinations that have only partial input data (though, if you think this is an important feature I'll give it a try). There's a new helper function in the kaya_variables module, _is_input_data_incomplete(), that relies on the require_data() method to make sure the data is complete enough so that none of the arithmetic operations in later processing will throw errors.

require_data() only considers model/scenario combinations (it does not care if all variables are present in all regions), but I want to alert the user when regions without full input data are present, so there is extra logic in _is_input_data_incomplete() to log this information.
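
A minimal sketch of the per-region completeness check described above, written with plain pandas so that it runs on its own. The function name `find_incomplete_combinations`, the shortened `REQUIRED` set, and the sample data are assumptions for illustration; this is not the code of `_is_input_data_incomplete()`.

```python
import pandas as pd

# Hypothetical, shortened list of required inputs (the real list is longer).
REQUIRED = {"Population", "Final Energy", "Primary Energy"}


def find_incomplete_combinations(data: pd.DataFrame, required: set) -> dict:
    """Map each (model, scenario, region) to its missing required variables."""
    incomplete = {}
    grouped = data.groupby(["model", "scenario", "region"])["variable"]
    for key, variables in grouped:
        missing = required - set(variables)
        if missing:
            incomplete[key] = sorted(missing)
    return incomplete


# made-up data: "World" is complete, "Europe" only reports Population
data = pd.DataFrame(
    {
        "model": ["model_a"] * 4,
        "scenario": ["scen_a"] * 4,
        "region": ["World", "World", "World", "Europe"],
        "variable": ["Population", "Final Energy", "Primary Energy", "Population"],
    }
)

incomplete = find_incomplete_combinations(data, REQUIRED)
for key, missing in incomplete.items():
    print(f"{key} is missing: {missing}")
```

Grouping by region as well as model and scenario is what lets the check report regions that a model/scenario-level test would not flag.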

        append : bool, optional
            Whether to append computed timeseries data to this instance.

        Returns
        -------
        :class:`IamDataFrame` or **None**
            Computed timeseries data, or None if the scenario arguments are
            invalid or required input data is missing.

        Notes
        -----

        Example of calling the method:

        .. code-block:: python

            df.compute.kaya_variables(scenarios=[("model_a", "scenario_a", "region_a"),
                                                 ("model_b", "scenario_b", "region_b")],
                                      append=True)

        The IamDataFrame must contain the following variables, otherwise the method
        will return None:

        .. list-table::
           :header-rows: 1

           * - Required Variables
           * - Population
           * - GDP (MER or PPP)
           * - Final Energy
           * - Primary Energy
           * - Primary Energy|Coal
           * - Primary Energy|Oil
           * - Primary Energy|Gas
           * - Emissions|CO2|Industrial Processes
           * - Emissions|CO2|Carbon Capture and Storage
           * - Emissions|CO2|Carbon Capture and Storage|Biomass
           * - Emissions|CO2|Fossil Fuels and Industry
           * - Emissions|CO2|AFOLU
           * - Carbon Sequestration|CCS|Fossil|Energy
           * - Carbon Sequestration|CCS|Fossil|Industrial Processes
           * - Carbon Sequestration|CCS|Biomass|Energy
           * - Carbon Sequestration|CCS|Biomass|Industrial Processes

        """
        valid_scenarios = _validate_kaya_scenario_args(scenarios=scenarios)
        if valid_scenarios is None:
            return None
        kaya_variables_frame = kaya_variables.kaya_variables(self._df, valid_scenarios)
        if kaya_variables_frame is None:
            return None
        if append:
            self._df.append(
                _find_non_duplicate_rows(self._df, kaya_variables_frame), inplace=True
            )

        return kaya_variables_frame

    def kaya_factors(self, scenarios, append=False):
        """Compute the Kaya factors needed for the Kaya Decomposition Analysis.

        Parameters
        ----------
        scenarios : iterable of tuples (model, scenario, region)
            The (model, scenario, region) combinations to be included.
        append : bool, optional
            Whether to append computed timeseries data to this instance.

        Returns
        -------
        :class:`IamDataFrame` or **None**
            Computed timeseries data, or None if the scenario arguments are
            invalid or required input data is missing.

        Notes
        -----

        Example of calling the method:

        .. code-block:: python

            df.compute.kaya_factors(scenarios=[("model_a", "scenario_a", "region_a"),
                                               ("model_b", "scenario_b", "region_b")],
                                    append=True)

        The IamDataFrame must contain the following variables, otherwise the method
        will return None:

        .. list-table::
           :header-rows: 1

           * - Required Variables
           * - Population
           * - GDP (MER or PPP)
           * - Final Energy
           * - Primary Energy
           * - Primary Energy|Coal
           * - Primary Energy|Oil
           * - Primary Energy|Gas
           * - Emissions|CO2|Industrial Processes
           * - Emissions|CO2|Carbon Capture and Storage
           * - Emissions|CO2|Carbon Capture and Storage|Biomass
           * - Emissions|CO2|Fossil Fuels and Industry
           * - Emissions|CO2|AFOLU
           * - Carbon Sequestration|CCS|Fossil|Energy
           * - Carbon Sequestration|CCS|Fossil|Industrial Processes
           * - Carbon Sequestration|CCS|Biomass|Energy
           * - Carbon Sequestration|CCS|Biomass|Industrial Processes

        """
        valid_scenarios = _validate_kaya_scenario_args(scenarios=scenarios)
        if valid_scenarios is None:
            return None
        # local name differs from the imported `kaya_variables` module
        kaya_variables_frame = self.kaya_variables(valid_scenarios, append=False)
        if kaya_variables_frame is None:
            return None
        kaya_factors_frame = kaya_factors.kaya_factors(
            kaya_variables_frame, valid_scenarios
        )
        if kaya_factors_frame is None:
            return None
        if append:
            self._df.append(
                _find_non_duplicate_rows(self._df, kaya_factors_frame), inplace=True
            )
        return kaya_factors_frame

    def kaya_lmdi(self, ref_scenario, int_scenario, append=False):
        """Calculate the logarithmic mean Divisia index (LMDI) decomposition
        using Kaya factors.

        Parameters
        ----------
        ref_scenario : tuple of strings (model, scenario, region)
            The (model, scenario, region) to be used as the reference scenario
            in the LMDI calculation.
        int_scenario : tuple of strings (model, scenario, region)
            The (model, scenario, region) to be used as the intervention scenario
            in the LMDI calculation.
        append : bool, optional
            Whether to append computed timeseries data to this instance.

        Returns
        -------
        :class:`IamDataFrame` or **None**
            Computed timeseries data, or None if the two scenarios are invalid
            or required input data is missing.

        Notes
        -----

        Example of calling the method:

        .. code-block:: python

            df.compute.kaya_lmdi(ref_scenario=("model_a", "scenario_a", "region_a"),
                                 int_scenario=("model_b", "scenario_b", "region_b"),
                                 append=True)

        The IamDataFrame must contain the following variables, otherwise the method
        will return None:

        .. list-table::
           :header-rows: 1

           * - Required Variables
           * - Population
           * - GDP (MER or PPP)
           * - Final Energy
           * - Primary Energy
           * - Primary Energy|Coal
           * - Primary Energy|Oil
           * - Primary Energy|Gas
           * - Emissions|CO2|Industrial Processes
           * - Emissions|CO2|Carbon Capture and Storage
           * - Emissions|CO2|Carbon Capture and Storage|Biomass
           * - Emissions|CO2|Fossil Fuels and Industry
           * - Emissions|CO2|AFOLU
           * - Carbon Sequestration|CCS|Fossil|Energy
           * - Carbon Sequestration|CCS|Fossil|Industrial Processes
           * - Carbon Sequestration|CCS|Biomass|Energy
           * - Carbon Sequestration|CCS|Biomass|Industrial Processes

        The model, scenario, and region fields of the results dataframe are the
        values from the reference and intervention scenarios, concatenated in the
        form ``reference_scenario_value::intervention_scenario_value``.

        Example results data::

            model             scenario        region        variable           unit     year  value
            model_a::model_a  scen_a::scen_b  World::World  FE/GNP (LMDI)      unknown  2010  1.321788
            model_a::model_a  scen_a::scen_b  World::World  GNP/P (LMDI)       unknown  2010  0.000000
            model_a::model_a  scen_a::scen_b  World::World  PEDEq/FE (LMDI)    unknown  2010  0.816780
            model_a::model_a  scen_a::scen_b  World::World  PEFF/PEDEq (LMDI)  unknown  2010  0.000000
            model_a::model_a  scen_a::scen_b  World::World  Population (LMDI)  unknown  2010  0.000000
            model_a::model_a  scen_a::scen_b  World::World  TFC/PEFF (LMDI)    unknown  2010  4.853221

        """
        valid_ref_and_int_scenarios = _validate_kaya_scenario_args(
            scenarios=[ref_scenario, int_scenario]
        )
        # we must have two different scenarios to calculate kaya_lmdi
        if (valid_ref_and_int_scenarios is None) or (
            len(valid_ref_and_int_scenarios) != 2
        ):
            return None
        # local name differs from the imported `kaya_factors` module
        kaya_factors_frame = self.kaya_factors(
            valid_ref_and_int_scenarios, append=False
        )
        if kaya_factors_frame is None:
            return None
        kaya_lmdi_frame = lmdi.corrected_lmdi(
            kaya_factors_frame, ref_scenario, int_scenario
        )
        if kaya_lmdi_frame is None:
            return None
        if append:
            self._df.append(
                _find_non_duplicate_rows(self._df, kaya_lmdi_frame), inplace=True
            )
        return kaya_lmdi_frame


def _validate_kaya_scenario_args(scenarios):
    validated_scenarios = []
    for scenario in scenarios:
        if (len(scenario) == 3) and _kaya_args_are_strings(scenario):
            # store as tuple so that duplicates can be detected by hashing
            validated_scenarios.append(tuple(scenario))
    # don't recalculate for identical scenarios (drop duplicates, keep order)
    unique_scenarios = list(dict.fromkeys(validated_scenarios))
    if len(unique_scenarios) == 0:
        return None
    return unique_scenarios


def _kaya_args_are_strings(scenario):
    return all(isinstance(arg, str) for arg in scenario)


def _find_non_duplicate_rows(original_df, variables_to_add):
    # anti-join: keep only rows of `variables_to_add` not already in `original_df`
    variables_for_append = pyam.IamDataFrame(
        variables_to_add.as_pandas(meta_cols=False)
        .merge(original_df.as_pandas(meta_cols=False), how="left", indicator=True)
        .query('_merge=="left_only"')
        .drop(columns="_merge")
    )
    return variables_for_append


def _compute_learning_rate(x, performance, experience):
"""Internal implementation for computing implicit learning rate from timeseries data
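
The helper `_find_non_duplicate_rows` in the hunk above relies on a pandas left merge with `indicator=True` to keep only rows that are not yet present (an anti-join). The following sketch shows that pattern in isolation on two made-up frames; the column names and values are illustrative assumptions.

```python
import pandas as pd

# rows already present in the original data (made-up values)
existing = pd.DataFrame(
    {
        "variable": ["Population", "Final Energy"],
        "year": [2010, 2010],
        "value": [7.0, 350.0],
    }
)

# newly computed rows, one of which duplicates an existing row
computed = pd.DataFrame(
    {
        "variable": ["Population", "GNP/P"],
        "year": [2010, 2010],
        "value": [7.0, 9.5],
    }
)

# merge on all shared columns; `_merge` records where each row was found
merged = computed.merge(existing, how="left", indicator=True)
new_rows = merged.query('_merge == "left_only"').drop(columns="_merge")
print(new_rows)
```

Because the merge uses every shared column, including `value`, a row with the same index columns but a different value counts as new and would still be passed on to `append`.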
17 changes: 17 additions & 0 deletions pyam/kaya/input_variable_names.py
@@ -0,0 +1,17 @@
POPULATION = "Population"
GDP_MER = "GDP|MER"
GDP_PPP = "GDP|PPP"
FINAL_ENERGY = "Final Energy"
PRIMARY_ENERGY = "Primary Energy"
PRIMARY_ENERGY_COAL = "Primary Energy|Coal"
PRIMARY_ENERGY_OIL = "Primary Energy|Oil"
PRIMARY_ENERGY_GAS = "Primary Energy|Gas"
EMISSIONS_CO2_INDUSTRIAL_PROCESSES = "Emissions|CO2|Industrial Processes"
EMISSIONS_CO2_CCS = "Emissions|CO2|Carbon Capture and Storage"
EMISSIONS_CO2_CCS_BIOMASS = "Emissions|CO2|Carbon Capture and Storage|Biomass"
EMISSIONS_CO2_FOSSIL_FUELS_AND_INDUSTRY = "Emissions|CO2|Fossil Fuels and Industry"
EMISSIONS_CO2_AFOLU = "Emissions|CO2|AFOLU"
CCS_FOSSIL_ENERGY = "Carbon Sequestration|CCS|Fossil|Energy"
CCS_FOSSIL_INDUSTRY = "Carbon Sequestration|CCS|Fossil|Industrial Processes"
CCS_BIOMASS_ENERGY = "Carbon Sequestration|CCS|Biomass|Energy"
CCS_BIOMASS_INDUSTRY = "Carbon Sequestration|CCS|Biomass|Industrial Processes"
6 changes: 6 additions & 0 deletions pyam/kaya/kaya_factor_names.py
@@ -0,0 +1,6 @@
GNP_per_P = "GNP/P"
FE_per_GNP = "FE/GNP"
PEdeq_per_FE = "PEDEq/FE"
PEFF_per_PEDEq = "PEFF/PEDEq"
TFC_per_PEFF = "TFC/PEFF"
NFC_per_TFC = "NFC/TFC"
89 changes: 89 additions & 0 deletions pyam/kaya/kaya_factors.py
@@ -0,0 +1,89 @@
from functools import reduce

from pyam.kaya import input_variable_names, kaya_factor_names, kaya_variable_names


def kaya_factors(kaya_variables_frame, scenarios):
    kaya_factors_frames = []
    for scenario in scenarios:
        # named `scenario_data` to avoid shadowing the built-in `input`
        scenario_data = kaya_variables_frame.filter(
            model=scenario[0], scenario=scenario[1], region=scenario[2]
        )
        if scenario_data.empty:
            # skip this combination but keep processing the remaining ones
            continue
        kaya_factors_frames.append(_calc_gnp_per_p(scenario_data))
        kaya_factors_frames.append(_calc_fe_per_gnp(scenario_data))
        kaya_factors_frames.append(_calc_pedeq_per_fe(scenario_data))
        kaya_factors_frames.append(_calc_peff_per_pedeq(scenario_data))
        kaya_factors_frames.append(_calc_tfc_per_peff(scenario_data))
        kaya_factors_frames.append(_calc_nfc_per_tfc(scenario_data))
        kaya_factors_frames.append(
            scenario_data.filter(
                variable=[kaya_variable_names.TFC, input_variable_names.POPULATION]
            )
        )
    if len(kaya_factors_frames) == 0:
        return None
    return reduce(lambda x, y: x.append(y), kaya_factors_frames)


def _calc_gnp_per_p(input_data):
    variable = input_variable_names.GDP_PPP
    if input_data.filter(variable=variable).empty:
        variable = input_variable_names.GDP_MER
    return input_data.divide(
        variable,
        input_variable_names.POPULATION,
        kaya_factor_names.GNP_per_P,
        append=False,
    )


def _calc_fe_per_gnp(input_data):
    variable = input_variable_names.GDP_PPP
    if input_data.filter(variable=variable).empty:
        variable = input_variable_names.GDP_MER
    return input_data.divide(
        input_variable_names.FINAL_ENERGY,
        variable,
        kaya_factor_names.FE_per_GNP,
        append=False,
    )


def _calc_pedeq_per_fe(input_data):
    return input_data.divide(
        input_variable_names.PRIMARY_ENERGY,
        input_variable_names.FINAL_ENERGY,
        kaya_factor_names.PEdeq_per_FE,
        append=False,
    )


def _calc_peff_per_pedeq(input_data):
    return input_data.divide(
        kaya_variable_names.PRIMARY_ENERGY_FF,
        input_variable_names.PRIMARY_ENERGY,
        kaya_factor_names.PEFF_per_PEDEq,
        append=False,
    )


def _calc_tfc_per_peff(input_data):
    return input_data.divide(
        kaya_variable_names.TFC,
        kaya_variable_names.PRIMARY_ENERGY_FF,
        kaya_factor_names.TFC_per_PEFF,
        ignore_units="Mt CO2/EJ",
        append=False,
    )


def _calc_nfc_per_tfc(input_data):
    return input_data.divide(
        kaya_variable_names.NFC,
        kaya_variable_names.TFC,
        kaya_factor_names.NFC_per_TFC,
        ignore_units=True,
        append=False,
    ).rename(unit={"unknown": ""})
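
The six ratios computed in this file form a chain in which each numerator is the next ratio's denominator, so multiplying them by Population recovers Net Fossil Carbon. The sketch below checks that identity with plain Python. The numbers are made up for illustration, and `kaya_chain` is a hypothetical helper, not part of the PR.

```python
import math

# made-up input values for a single year (units omitted on purpose)
inputs = {
    "Population": 7000.0,
    "GDP|PPP": 70000.0,
    "Final Energy": 400.0,
    "Primary Energy": 550.0,
    "Primary Energy|Fossil": 450.0,
    "Total Fossil Carbon": 36000.0,
    "Net Fossil Carbon": 34000.0,
}

# order of the chain: each entry is divided by the one before it
CHAIN = list(inputs)


def kaya_chain(values: dict) -> dict:
    """Return Population plus the six successive ratios along the chain."""
    factors = {"Population": values["Population"]}
    for denominator, numerator in zip(CHAIN, CHAIN[1:]):
        factors[f"{numerator}/{denominator}"] = values[numerator] / values[denominator]
    return factors


factors = kaya_chain(inputs)
# the product telescopes back to the last entry of the chain
reconstructed = math.prod(factors.values())
print(reconstructed, inputs["Net Fossil Carbon"])
```

This telescoping product is why a decomposition over these factors can attribute a change in net emissions to the individual ratios.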
3 changes: 3 additions & 0 deletions pyam/kaya/kaya_variable_names.py
@@ -0,0 +1,3 @@
PRIMARY_ENERGY_FF = "Primary Energy|Fossil"
TFC = "Total Fossil Carbon"
NFC = "Net Fossil Carbon"
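
As background for the `kaya_lmdi` method in `pyam/compute.py`, here is a minimal sketch of the textbook additive LMDI decomposition for a quantity that is a product of factors. It is an assumption that this resembles what `lmdi.corrected_lmdi` computes; that function is not shown in this diff and may apply further corrections. The factor names and values are made up.

```python
import math


def log_mean(a: float, b: float) -> float:
    """Logarithmic mean of two positive numbers."""
    if math.isclose(a, b):
        return a
    return (a - b) / (math.log(a) - math.log(b))


def additive_lmdi(ref: dict, intervention: dict) -> dict:
    """Attribute the change in the product of factors to each factor."""
    total_ref = math.prod(ref.values())
    total_int = math.prod(intervention.values())
    weight = log_mean(total_int, total_ref)
    return {
        name: weight * math.log(intervention[name] / ref[name]) for name in ref
    }


# made-up factor values for a reference and an intervention scenario
ref = {"Population": 7000.0, "GNP/P": 10.0, "FE/GNP": 0.006, "Carbon/FE": 80.0}
intervention = {"Population": 7000.0, "GNP/P": 10.0, "FE/GNP": 0.005, "Carbon/FE": 60.0}

effects = additive_lmdi(ref, intervention)
total_change = math.prod(intervention.values()) - math.prod(ref.values())
print(effects)
print(sum(effects.values()), total_change)
```

The contributions sum to the total change without a residual, and factors that are identical in both scenarios contribute zero, which is consistent with the zero rows for Population and GNP/P in the example results of the `kaya_lmdi` docstring.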