Add kaya_variables, kaya_factors, and kaya_lmdi methods to the comput… #884
base: main
Changes from 1 commit
b741dbb
734d3c0
612e8f6
e7a7797
491b8ec
ae90ca8
e0ef67d
c9438db
5557830
126283a
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -4,8 +4,10 @@ | |
| import pandas as pd | ||
| import wquantiles | ||
|
|
||
| import pyam | ||
| from pyam._debiasing import _compute_bias | ||
| from pyam.index import replace_index_values | ||
| from pyam.kaya import kaya_factors, kaya_variables, lmdi | ||
| from pyam.timeseries import growth_rate | ||
| from pyam.utils import remove_from_list | ||
|
|
||
|
|
@@ -249,6 +251,248 @@ def bias(self, name, method, axis): | |
| """ | ||
| _compute_bias(self._df, name, method, axis) | ||
|
|
||
| def kaya_variables(self, scenarios, append=False): | ||
| """Compute the variables needed to compute Kaya factors | ||
| for the Kaya Decomposition Analysis. | ||
|
|
||
|
|
||
| Parameters | ||
| ---------- | ||
| scenarios : iterable of tuples (model, scenario, region) | ||
| The (model, scenario, region) combinations to be included. | ||
|
Member
Wondering about two issues with the scenarios filter:
I'm concerned about adding an argument "scenarios" that is actually a combination of model-scenario-region, that's quite confusing...
Author
I removed the scenarios filter and followed suggestion 1: we'll try to compute kaya-variables for all scenario/model/region combinations in the dataframe. I didn't try to implement returning partial sets of kaya-variables for scenario/model/region combinations that have only partial input data (though if you think this is an important feature, I'll give it a try). There's a new helper function in the kaya_variables module, `_is_input_data_incomplete()`, that relies on the `require_data()` method to make sure the data is complete enough that none of the arithmetic operations in later processing will throw errors. `require_data()` only considers model/scenario combinations (it does not check whether all variables are present in all regions), but I want to alert the user when regions without full input data are present, so there is extra logic in `_is_input_data_incomplete()` to log this information. |
||
| append : bool, optional | ||
| Whether to append computed timeseries data to this instance. | ||
|
|
||
| Returns | ||
| ------- | ||
| :class:`IamDataFrame` or **None** | ||
| Computed timeseries data or None if `append=True`. | ||
|
|
||
| Notes | ||
| ----- | ||
|
|
||
| Example of calling the method: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| df.compute.kaya_variables(scenarios=[("model_a", "scenario_a", "region_a"), | ||
| ("model_b", "scenario_b", "region_b")], | ||
| append=True) | ||
|
|
||
| The IamDataFrame must contain the following variables, otherwise the method | ||
| will return None: | ||
| .. list-table:: | ||
| - Required Variables | ||
| - Population | ||
| - GDP (MER or PPP) | ||
| - Final Energy | ||
| - Primary Energy | ||
| - Primary Energy|Coal | ||
| - Primary Energy|Oil | ||
| - Primary Energy|Gas | ||
| - Emissions|CO2|Industrial Processes | ||
| - Emissions|CO2|Carbon Capture and Storage | ||
| - Emissions|CO2|Carbon Capture and Storage|Biomass | ||
| - Emissions|CO2|Fossil Fuels and Industry | ||
| - Emissions|CO2|AFOLU | ||
| - Carbon Sequestration|CCS|Fossil|Energy | ||
| - Carbon Sequestration|CCS|Fossil|Industrial Processes | ||
| - Carbon Sequestration|CCS|Biomass|Energy | ||
| - Carbon Sequestration|CCS|Biomass|Industrial Processes | ||
|
|
||
| """ | ||
| valid_scenarios = _validate_kaya_scenario_args(scenarios=scenarios) | ||
| if valid_scenarios is None: | ||
| return None | ||
| kaya_variables_frame = kaya_variables.kaya_variables(self._df, valid_scenarios) | ||
| if kaya_variables_frame is None: | ||
| return None | ||
| if append: | ||
| self._df.append( | ||
| _find_non_duplicate_rows(self._df, kaya_variables_frame), inplace=True | ||
| ) | ||
|
|
||
| return kaya_variables_frame | ||
|
|
||
| def kaya_factors(self, scenarios, append=False): | ||
| """Compute the Kaya factors | ||
| for the Kaya Decomposition Analysis. | ||
|
|
||
|
|
||
| Parameters | ||
| ---------- | ||
| scenarios : iterable of tuples (model, scenario, region) | ||
| The (model, scenario, region) combinations to be included. | ||
| append : bool, optional | ||
| Whether to append computed timeseries data to this instance. | ||
|
|
||
| Returns | ||
| ------- | ||
| :class:`IamDataFrame` or **None** | ||
| Computed timeseries data or None if `append=True`. | ||
|
|
||
| Notes | ||
| ----- | ||
|
|
||
| Example of calling the method: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| df.compute.kaya_factors(scenarios=[("model_a", "scenario_a", "region_a"), | ||
| ("model_b", "scenario_b", "region_b")], | ||
| append=True) | ||
|
|
||
| The IamDataFrame must contain the following variables, otherwise the method | ||
| will return None: | ||
| .. list-table:: | ||
| - Required Variables | ||
| - Population | ||
| - GDP (MER or PPP) | ||
| - Final Energy | ||
| - Primary Energy | ||
| - Primary Energy|Coal | ||
| - Primary Energy|Oil | ||
| - Primary Energy|Gas | ||
| - Emissions|CO2|Industrial Processes | ||
| - Emissions|CO2|Carbon Capture and Storage | ||
| - Emissions|CO2|Carbon Capture and Storage|Biomass | ||
| - Emissions|CO2|Fossil Fuels and Industry | ||
| - Emissions|CO2|AFOLU | ||
| - Carbon Sequestration|CCS|Fossil|Energy | ||
| - Carbon Sequestration|CCS|Fossil|Industrial Processes | ||
| - Carbon Sequestration|CCS|Biomass|Energy | ||
| - Carbon Sequestration|CCS|Biomass|Industrial Processes | ||
|
|
||
| """ | ||
| valid_scenarios = _validate_kaya_scenario_args(scenarios=scenarios) | ||
| if valid_scenarios is None: | ||
| return None | ||
| kaya_variables = self.kaya_variables(valid_scenarios, append=False) | ||
| if kaya_variables is None: | ||
| return None | ||
| kaya_factors_frame = kaya_factors.kaya_factors(kaya_variables, valid_scenarios) | ||
| if kaya_factors_frame is None: | ||
| return None | ||
| if append: | ||
| self._df.append( | ||
| _find_non_duplicate_rows(self._df, kaya_factors_frame), inplace=True | ||
| ) | ||
| return kaya_factors_frame | ||
|
|
||
| def kaya_lmdi(self, ref_scenario, int_scenario, append=False): | ||
| """Calculate the logarithmic mean Divisia index (LMDI) decomposition | ||
| using Kaya factors. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| ref_scenario : tuple of strings (model, scenario, region) | ||
| The (model, scenario, region) to be used as the reference scenario | ||
| in the LMDI calculation. | ||
| int_scenario : tuple of strings (model, scenario, region) | ||
| The (model, scenario, region) to be used as the intervention scenario | ||
| in the LMDI calculation. | ||
| append : bool, optional | ||
| Whether to append computed timeseries data to this instance. | ||
|
|
||
| Returns | ||
| ------- | ||
| :class:`IamDataFrame` or **None** | ||
| Computed timeseries data or None if `append=True`. | ||
|
|
||
| Notes | ||
| ----- | ||
|
|
||
| Example of calling the method: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| df.compute.kaya_lmdi(ref_scenario=("model_a", "scenario_a", "region_a"), | ||
| int_scenario=("model_b", "scenario_b", "region_b"), | ||
| append=True) | ||
|
|
||
| The IamDataFrame must contain the following variables, otherwise the method | ||
| will return None: | ||
| .. list-table:: | ||
| - Required Variables | ||
| - Population | ||
| - GDP (MER or PPP) | ||
| - Final Energy | ||
| - Primary Energy | ||
| - Primary Energy|Coal | ||
| - Primary Energy|Oil | ||
| - Primary Energy|Gas | ||
| - Emissions|CO2|Industrial Processes | ||
| - Emissions|CO2|Carbon Capture and Storage | ||
| - Emissions|CO2|Carbon Capture and Storage|Biomass | ||
| - Emissions|CO2|Fossil Fuels and Industry | ||
| - Emissions|CO2|AFOLU | ||
| - Carbon Sequestration|CCS|Fossil|Energy | ||
| - Carbon Sequestration|CCS|Fossil|Industrial Processes | ||
| - Carbon Sequestration|CCS|Biomass|Energy | ||
| - Carbon Sequestration|CCS|Biomass|Industrial Processes | ||
|
|
||
| The model, scenario, and region fields for the results dataframe will be | ||
| concatenated values from the reference and intervention scenarios in the | ||
| form reference_scenario_value::intervention_scenario_value. | ||
|
|
||
| Example results data: | ||
|
|
||
| model scenario region variable unit year value | ||
| model_a::model_a scen_a::scen_b World::World FE/GNP (LMDI) unknown 2010 1.321788 | ||
| model_a::model_a scen_a::scen_b World::World GNP/P (LMDI) unknown 2010 0.000000 | ||
| model_a::model_a scen_a::scen_b World::World PEDEq/FE (LMDI) unknown 2010 0.816780 | ||
| model_a::model_a scen_a::scen_b World::World PEFF/PEDEq (LMDI) unknown 2010 0.000000 | ||
| model_a::model_a scen_a::scen_b World::World Population (LMDI) unknown 2010 0.000000 | ||
| model_a::model_a scen_a::scen_b World::World TFC/PEFF (LMDI) unknown 2010 4.853221 | ||
|
|
||
| """ | ||
| valid_ref_and_int_scenarios = _validate_kaya_scenario_args( | ||
| scenarios=[ref_scenario, int_scenario] | ||
| ) | ||
| # we must have two different scenarios to calculate kaya_lmdi | ||
| if (valid_ref_and_int_scenarios is None) or ( | ||
| len(valid_ref_and_int_scenarios) != 2 | ||
| ): | ||
| return None | ||
| kaya_factors = self.kaya_factors(valid_ref_and_int_scenarios, append=False) | ||
| if kaya_factors is None: | ||
| return None | ||
| kaya_lmdi_frame = lmdi.corrected_lmdi(kaya_factors, ref_scenario, int_scenario) | ||
| if kaya_lmdi_frame is None: | ||
| return None | ||
| if append: | ||
| self._df.append( | ||
| _find_non_duplicate_rows(self._df, kaya_lmdi_frame), inplace=True | ||
| ) | ||
| return kaya_lmdi_frame | ||
|
|
||
|
|
||
| def _validate_kaya_scenario_args(scenarios): | ||
| validated_scenarios = [] | ||
| for scenario in scenarios: | ||
| if (len(scenario) == 3) and _kaya_args_are_strings(scenario): | ||
| validated_scenarios.append(scenario) | ||
| # drop duplicates (keeping order) so identical scenarios aren't recalculated | ||
| unique_scenarios = list(dict.fromkeys(validated_scenarios)) | ||
| if len(unique_scenarios) == 0: | ||
| return None | ||
| return unique_scenarios | ||
|
|
||
|
|
||
| def _kaya_args_are_strings(scenario): | ||
| for arg in scenario: | ||
| if not isinstance(arg, str): | ||
| return False | ||
| return True | ||
|
|
||
|
|
||
| def _find_non_duplicate_rows(original_df, variables_to_add): | ||
| variables_for_append = pyam.IamDataFrame( | ||
| variables_to_add.as_pandas(meta_cols=False) | ||
| .merge(original_df.as_pandas(meta_cols=False), how="left", indicator=True) | ||
| .query('_merge=="left_only"') | ||
| .drop(columns="_merge") | ||
| ) | ||
| return variables_for_append | ||
|
|
||
|
|
||
| def _compute_learning_rate(x, performance, experience): | ||
| """Internal implementation for computing implicit learning rate from timeseries data | ||
|
|
||
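The `_find_non_duplicate_rows` helper above relies on a pandas left merge with `indicator=True` to keep only rows that are not already present. A minimal sketch of that anti-join pattern on plain DataFrames follows; the function name `find_new_rows` and the sample data are hypothetical and not part of the PR.

```python
import pandas as pd


def find_new_rows(existing: pd.DataFrame, candidate: pd.DataFrame) -> pd.DataFrame:
    """Return the rows of `candidate` that have no identical row in `existing`."""
    # With no `on=` argument, merge joins on all columns the two frames share.
    merged = candidate.merge(existing, how="left", indicator=True)
    # `_merge` is "left_only" for candidate rows without a match in `existing`.
    return merged.query('_merge == "left_only"').drop(columns="_merge")


# Hypothetical sample data in a long timeseries layout
existing = pd.DataFrame(
    {"variable": ["Population", "GDP|PPP"], "year": [2010, 2010], "value": [1.0, 2.0]}
)
candidate = pd.DataFrame(
    {"variable": ["Population", "Final Energy"], "year": [2010, 2010], "value": [1.0, 3.0]}
)

new_rows = find_new_rows(existing, candidate)
print(new_rows)
```

Because the join here includes the `value` column, a row with the same variable and year but a different value counts as new.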
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| POPULATION = "Population" | ||
| GDP_MER = "GDP|MER" | ||
| GDP_PPP = "GDP|PPP" | ||
| FINAL_ENERGY = "Final Energy" | ||
| PRIMARY_ENERGY = "Primary Energy" | ||
| PRIMARY_ENERGY_COAL = "Primary Energy|Coal" | ||
| PRIMARY_ENERGY_OIL = "Primary Energy|Oil" | ||
| PRIMARY_ENERGY_GAS = "Primary Energy|Gas" | ||
| EMISSIONS_CO2_INDUSTRIAL_PROCESSES = "Emissions|CO2|Industrial Processes" | ||
| EMISSIONS_CO2_CCS = "Emissions|CO2|Carbon Capture and Storage" | ||
| EMISSIONS_CO2_CCS_BIOMASS = "Emissions|CO2|Carbon Capture and Storage|Biomass" | ||
| EMISSIONS_CO2_FOSSIL_FUELS_AND_INDUSTRY = "Emissions|CO2|Fossil Fuels and Industry" | ||
| EMISSIONS_CO2_AFOLU = "Emissions|CO2|AFOLU" | ||
| CCS_FOSSIL_ENERGY = "Carbon Sequestration|CCS|Fossil|Energy" | ||
| CCS_FOSSIL_INDUSTRY = "Carbon Sequestration|CCS|Fossil|Industrial Processes" | ||
| CCS_BIOMASS_ENERGY = "Carbon Sequestration|CCS|Biomass|Energy" | ||
| CCS_BIOMASS_INDUSTRY = "Carbon Sequestration|CCS|Biomass|Industrial Processes" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| GNP_per_P = "GNP/P" | ||
| FE_per_GNP = "FE/GNP" | ||
| PEdeq_per_FE = "PEDEq/FE" | ||
| PEFF_per_PEDEq = "PEFF/PEDEq" | ||
| TFC_per_PEFF = "TFC/PEFF" | ||
| NFC_per_TFC = "NFC/TFC" |
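Read as ratios, these six names chain together: each denominator is the previous numerator, so multiplying them by Population telescopes to NFC. The sketch below checks that arithmetic with made-up numbers; the variable names and values are illustrative assumptions, and GNP stands in for whichever GDP variable is used.

```python
import math

# Made-up illustrative quantities for a single model/scenario/region/year
population = 7.0
gnp = 70.0
final_energy = 350.0
primary_energy = 500.0
primary_energy_fossil = 400.0
total_fossil_carbon = 30.0
net_fossil_carbon = 28.0

# One ratio per factor name in the constants above
factors = {
    "GNP/P": gnp / population,
    "FE/GNP": final_energy / gnp,
    "PEDEq/FE": primary_energy / final_energy,
    "PEFF/PEDEq": primary_energy_fossil / primary_energy,
    "TFC/PEFF": total_fossil_carbon / primary_energy_fossil,
    "NFC/TFC": net_fossil_carbon / total_fossil_carbon,
}

# Population times the product of all factors recovers net fossil carbon
reconstructed = population * math.prod(factors.values())
assert math.isclose(reconstructed, net_fossil_carbon)
```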
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,89 @@ | ||
| from functools import reduce | ||
|
|
||
| from pyam.kaya import input_variable_names, kaya_factor_names, kaya_variable_names | ||
|
|
||
|
|
||
| def kaya_factors(kaya_variables_frame, scenarios): | ||
| kaya_factors_frames = [] | ||
| for scenario in scenarios: | ||
| input = kaya_variables_frame.filter( | ||
| model=scenario[0], scenario=scenario[1], region=scenario[2] | ||
| ) | ||
| if input.empty: | ||
| # skip scenarios without data instead of aborting the remaining ones | ||
| continue | ||
| kaya_factors_frames.append(_calc_gnp_per_p(input)) | ||
|
|
||
| kaya_factors_frames.append(_calc_fe_per_gnp(input)) | ||
| kaya_factors_frames.append(_calc_pedeq_per_fe(input)) | ||
| kaya_factors_frames.append(_calc_peff_per_pedeq(input)) | ||
| kaya_factors_frames.append(_calc_tfc_per_peff(input)) | ||
| kaya_factors_frames.append(_calc_nfc_per_tfc(input)) | ||
| kaya_factors_frames.append( | ||
| input.filter( | ||
| variable=[kaya_variable_names.TFC, input_variable_names.POPULATION] | ||
| ) | ||
| ) | ||
| if len(kaya_factors_frames) == 0: | ||
| return None | ||
| return reduce(lambda x, y: x.append(y), kaya_factors_frames) | ||
|
|
||
|
|
||
| def _calc_gnp_per_p(input_data): | ||
| variable = input_variable_names.GDP_PPP | ||
| if input_data.filter(variable=variable).empty: | ||
| variable = input_variable_names.GDP_MER | ||
| return input_data.divide( | ||
| variable, | ||
| input_variable_names.POPULATION, | ||
| kaya_factor_names.GNP_per_P, | ||
| append=False, | ||
| ) | ||
|
|
||
|
|
||
| def _calc_fe_per_gnp(input_data): | ||
| variable = input_variable_names.GDP_PPP | ||
| if input_data.filter(variable=variable).empty: | ||
| variable = input_variable_names.GDP_MER | ||
| return input_data.divide( | ||
| input_variable_names.FINAL_ENERGY, | ||
| variable, | ||
| kaya_factor_names.FE_per_GNP, | ||
| append=False, | ||
| ) | ||
|
|
||
|
|
||
| def _calc_pedeq_per_fe(input_data): | ||
| return input_data.divide( | ||
| input_variable_names.PRIMARY_ENERGY, | ||
| input_variable_names.FINAL_ENERGY, | ||
| kaya_factor_names.PEdeq_per_FE, | ||
| append=False, | ||
| ) | ||
|
|
||
|
|
||
| def _calc_peff_per_pedeq(input_data): | ||
| return input_data.divide( | ||
| kaya_variable_names.PRIMARY_ENERGY_FF, | ||
| input_variable_names.PRIMARY_ENERGY, | ||
| kaya_factor_names.PEFF_per_PEDEq, | ||
| append=False, | ||
| ) | ||
|
|
||
|
|
||
| def _calc_tfc_per_peff(input_data): | ||
| return input_data.divide( | ||
| kaya_variable_names.TFC, | ||
| kaya_variable_names.PRIMARY_ENERGY_FF, | ||
| kaya_factor_names.TFC_per_PEFF, | ||
| ignore_units="Mt CO2/EJ", | ||
| append=False, | ||
| ) | ||
|
|
||
|
|
||
| def _calc_nfc_per_tfc(input_data): | ||
| return input_data.divide( | ||
| kaya_variable_names.NFC, | ||
| kaya_variable_names.TFC, | ||
| kaya_factor_names.NFC_per_TFC, | ||
| ignore_units=True, | ||
| append=False, | ||
| ).rename(unit={"unknown": ""}) | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| PRIMARY_ENERGY_FF = "Primary Energy|Fossil" | ||
| TFC = "Total Fossil Carbon" | ||
| NFC = "Net Fossil Carbon" |
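For orientation, the kind of decomposition that `kaya_lmdi` is named after can be sketched in a few lines. This is the textbook additive LMDI for a quantity that is a product of positive factors, not the PR's `lmdi.corrected_lmdi`, whose implementation is not shown here. The names `log_mean` and `additive_lmdi` and all numbers are hypothetical.

```python
import math


def log_mean(a: float, b: float) -> float:
    """Logarithmic mean L(a, b) = (a - b) / (ln a - ln b), with L(a, a) = a."""
    if math.isclose(a, b):
        return a
    return (a - b) / (math.log(a) - math.log(b))


def additive_lmdi(ref: dict, intervention: dict) -> dict:
    """Attribute the change in a product of factors to each individual factor.

    Assumes both dicts share the same keys and all values are positive.
    """
    total_ref = math.prod(ref.values())
    total_int = math.prod(intervention.values())
    weight = log_mean(total_int, total_ref)
    return {
        name: weight * math.log(intervention[name] / ref[name]) for name in ref
    }


# Made-up factor values: only FE/GNP differs between the two scenarios
ref = {"Population": 7.0, "GNP/P": 10.0, "FE/GNP": 5.0}
intervention = {"Population": 7.0, "GNP/P": 10.0, "FE/GNP": 4.0}

contributions = additive_lmdi(ref, intervention)

# The contributions sum to the total change (280 - 350 = -70) with no residual
total_change = math.prod(intervention.values()) - math.prod(ref.values())
assert math.isclose(sum(contributions.values()), total_change)
```

In this sketch a factor that is identical in both scenarios contributes exactly zero, so the whole change is attributed to `FE/GNP`.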
There's a method `kaya_variables` and an imported module with the same name, which then also has a method of that name... Hard to follow the code logic here.

I changed the function names to `compute_kaya_variables` and `compute_kaya_factors`. Now compute.py has methods `kaya_variables` and `kaya_factors` which call functions `compute_kaya_variables` and `compute_kaya_factors`. The functions are imported from the kaya_variables.py and kaya_factors.py modules, so those names are still duplicated. Let me know if this is still unclear and I'll make more changes.
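To make the remaining name overlap concrete, here is a small sketch of how Python resolves a bare name inside a method that shares its name with a module-level object. The `SimpleNamespace` is a hypothetical stand-in for the imported module, and the `Compute` class is illustrative only, not the PR's code.

```python
import types

# Hypothetical stand-in for an imported module named `kaya_variables`
kaya_variables = types.SimpleNamespace(
    compute_kaya_variables=lambda df: f"computed from {df}"
)


class Compute:
    def kaya_variables(self, df):
        # Inside a method, the bare name `kaya_variables` resolves to the
        # module-level object above, not to this method: names defined in a
        # class body are not part of the enclosing scope of its methods.
        return kaya_variables.compute_kaya_variables(df)


result = Compute().kaya_variables("df")
print(result)
```

The code runs as intended, but a reader has to track which `kaya_variables` is meant at each use. Importing the function directly or under an alias would be one way to remove the ambiguity.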