Initial import: Azure SAP zone alignment Pacemaker/OCF resource agent#2122
sanoop-t wants to merge 3 commits into ClusterLabs:main
|
Can one of the project admins check and authorise this run please: https://ci.kronosnet.org/job/resource-agents/job/resource-agents-pipeline/job/PR-2122/1/input |
|
Hi @oalbrigt, I noticed the PR was moved to draft a few days ago. I didn’t see any comments, so I wanted to check whether there’s anything specific you’d like me to address, or whether more work is needed before review. Happy to make updates. Thanks! |
|
We are discussing where it would make the most sense to host this: with the SAP agents or in this repo. I'll get back to you when we have made a decision. |
|
@oalbrigt Hello, please let me know if there are any updates from your discussion. |
|
@oalbrigt Hello, following up on this. Let me know if there are any updates from your discussion. |
|
I'm still waiting for an update from our internal discussion, and have asked for an update again. |
|
@oalbrigt Hello, let me know if there are any updates from your discussion. |
|
@sanoop-t We've decided to merge it here. So I'll do a full review next week. From a quick glance you have to add it to the See https://github.com/ClusterLabs/resource-agents/blob/main/doc/dev-guides/ra-dev-guide.asc#submitting-resource-agents for more info. |
|
Can one of the project admins check and authorise this run please: https://haci.fast.eng.rdu2.dc.redhat.com/job/resource-agents/job/resource-agents-pipeline/job/PR-2122/2/input |
Add azure-sap-zone to configure.ac, heartbeat/Makefile.am, doc/man/Makefile.am, and .gitignore following the existing azure-events pattern. The agent is conditionally built when Python 3.6+ is available.
2cfa0e7 to 1060cc6
|
Can one of the project admins check and authorise this run please: https://haci.fast.eng.rdu2.dc.redhat.com/job/resource-agents/job/resource-agents-pipeline/job/PR-2122/3/input |
|
Thanks @oalbrigt! I've updated the PR with the build system integration:
All following the existing azure-events pattern. Ready for your full review whenever you get a chance. |
|
Great. Thanks. |
# Pacemaker/OCF tooling does not always populate OCF_ROOT/OCF_FUNCTIONS_DIR when querying meta-data.
# Fall back to common distro paths so the `ocf` Python helper can be imported reliably.
_ocf_root = os.environ.get("OCF_ROOT") or "/usr/lib/ocf"
OCF_FUNCTIONS_DIR = os.environ.get("OCF_FUNCTIONS_DIR") or os.path.join(_ocf_root, "lib", "heartbeat")
Let's use the simpler approach that all our agents use (I've not had anyone complain about the default path yet):
https://github.com/ClusterLabs/resource-agents/blob/main/doc/dev-guides/writing-python-agents.md#run-loop-and-metadata-example
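For readers without the guide open, the simpler pattern referred to here looks roughly like the sketch below. It is an approximation of what the in-tree Python agents do, not a quote from the guide. The helper name `ocf_functions_dir` is hypothetical and exists only so the path logic can be exercised without the OCF library installed; the agents normally inline the expression at module level.

```python
import os
import sys


def ocf_functions_dir(environ=os.environ):
    """Return OCF_FUNCTIONS_DIR, deriving it from OCF_ROOT when unset.

    Hypothetical helper for illustration only. Unlike the snippet under
    review, there is no hard-coded distro fallback path.
    """
    return environ.get(
        "OCF_FUNCTIONS_DIR",
        "%s/lib/heartbeat" % environ.get("OCF_ROOT"),
    )


# Make the shared OCF Python library importable.
sys.path.append(ocf_functions_dir())
# import ocf  # resolves once the agent is installed alongside ocf.py
```

The trade-off is that an unset `OCF_ROOT` yields an unusable path instead of a guessed one, which the reviewer considers acceptable for an agent that ships in this repository.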
sys.path.append(OCF_FUNCTIONS_DIR)
try:
    import ocf  # type: ignore
except Exception:  # pragma: no cover
Let's get rid of this and any other duplicate code (use logger(), ocf_exit_reason(), is_probe(), etc. from ocf.py). There is no reason to have a fallback now that the agent will be in the repository with the OCF library.
When you've done that I'll do a full review of the agent.
@oalbrigt Done! I've updated the agent to use the standard ocf.py library:
- Replaced the custom OCF path-searching and _OCFShim fallback class with the standard 3-line import pattern used by other agents in the repo
- Removed the custom _is_probe_operation() function; now uses ocf.is_probe()
- Adjusted validate_action to handle string parameter values from the ocf.py run loop
Successfully validated the changes through our testing cycle. Ready for your full review.
Remove the _OCFShim fallback class and custom OCF path-searching logic. Use the standard ocf.py import pattern consistent with other Python agents in the repository. Remove the custom _is_probe_operation() function in favor of ocf.is_probe(). Adjust validate_action to handle string parameter values from the ocf.py run loop.
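The point about string parameter values can be illustrated with a small sketch. It assumes that parameters reach the validation code as strings (or unset) and must be converted before use. The helper names `parse_bool` and `parse_non_negative_int` and the accepted spellings are hypothetical, not taken from the agent; `stop_vms` and `wait_time` are the parameter names from this PR.

```python
def parse_bool(value, default=False):
    """Interpret an OCF parameter string as a boolean (hypothetical helper)."""
    if value is None or str(value).strip() == "":
        return default
    text = str(value).strip().lower()
    if text in ("1", "true", "yes", "on", "y"):
        return True
    if text in ("0", "false", "no", "off", "n"):
        return False
    raise ValueError("not a boolean: %r" % (value,))


def parse_non_negative_int(value, name):
    """Convert a string parameter to an int >= 0, naming it in any error."""
    try:
        number = int(str(value).strip())
    except ValueError:
        raise ValueError("%s must be an integer, got %r" % (name, value))
    if number < 0:
        raise ValueError("%s must not be negative, got %d" % (name, number))
    return number


# Example: values as they might arrive from the cluster configuration.
stop_vms = parse_bool("true")
wait_time = parse_non_negative_int("30", "wait_time")
```

Raising `ValueError` keeps the sketch independent of the OCF library; a real validate action would presumably report the problem through the library's own error helpers instead.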
|
Can one of the project admins check and authorise this run please: https://haci.fast.eng.rdu2.dc.redhat.com/job/resource-agents/job/resource-agents-pipeline/job/PR-2122/4/input |
PR Summary
This PR adds a new Python-based OCF resource agent under heartbeat/ named azure-sap-zone. The agent helps align SAP application-server activity with the active (PROMOTED) HANA node’s Azure Availability Zone (or a logical “zone group” for non-zonal/PPG deployments), to reduce cross-zone latency and to automate app-tier switching during HANA failover.

Key behavior
- hana_vm_zones mapping (non-zonal/PPG deployments).
- stop_vms=false: deactivates SAP (passive mode) without stopping VMs.
- stop_vms=true: stops SAP, waits for shutdown, then deallocates those VMs.

Notable parameters
- hana_resource (required): Pacemaker HANA resource name used to read status attributes.
- app_vm_names (comma-separated), or app_vm_name_pattern (regex), or app_vm_zones (mapping; can also supply names).
- hana_vm_zones="hanavm1:1,hanavm2:2"
- app_vm_zones="sapapp01:1,sapapp02:1,sapapp03:2,..."
- client_id (optional): user-assigned MI; if omitted, system-assigned MI is used.
- stop_vms, wait_before_stop_sap, wait_time, soft_shutdown_timeout
- retry_count, retry_wait for ARM API retries

Safety/correctness notes
- If app_vm_zones/hana_vm_zones are provided and Azure zone metadata exists, the agent verifies the mapping matches Azure and fails early on mismatch.
- PowerState/*) from instanceView.

Dependencies / packaging
- requests is treated as an optional import so meta-data discovery can still work; Azure API operations require python3-requests.

Testing / validation performed
- meta-data output generation and validate-all parameter validation paths.

Notes
- #!@PYTHON@ -tt (consistent with other Python agents that are patched/installed by packaging or install instructions).
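
The mapping format and the early mismatch check described in the summary can be sketched as follows. This is an illustration under assumptions, not the agent's code: the function names `parse_zone_mapping` and `find_zone_mismatches` are hypothetical, and the Azure side is represented by a plain dictionary rather than a real API call.

```python
def parse_zone_mapping(raw):
    """Parse "vm1:1,vm2:2" into {"vm1": "1", "vm2": "2"} (hypothetical helper)."""
    mapping = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing or doubled commas
        name, sep, zone = item.partition(":")
        name, zone = name.strip(), zone.strip()
        if not sep or not name or not zone:
            raise ValueError("expected 'vm:zone', got %r" % item)
        mapping[name] = zone
    return mapping


def find_zone_mismatches(configured, azure_zones):
    """Return {vm: (configured_zone, azure_zone)} where the two disagree.

    VMs without Azure zone metadata are skipped, since there is nothing
    to compare against in non-zonal deployments.
    """
    mismatches = {}
    for vm, zone in configured.items():
        actual = azure_zones.get(vm)
        if actual and actual != zone:
            mismatches[vm] = (zone, actual)
    return mismatches


configured = parse_zone_mapping("hanavm1:1,hanavm2:2")
# Stand-in for zone metadata that would be read from Azure.
reported = {"hanavm1": "1", "hanavm2": "3"}
problems = find_zone_mismatches(configured, reported)
```

Here `problems` contains only `hanavm2`, which is the kind of result on which a validation step could fail before any VM operation is attempted.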