diff --git a/.claude/skills/testing/SKILL.md b/.claude/skills/testing/SKILL.md index 5df96cfc2..dc3c078d5 100644 --- a/.claude/skills/testing/SKILL.md +++ b/.claude/skills/testing/SKILL.md @@ -7,7 +7,7 @@ description: Testing guide and pre-commit testing strategy for PTO Runtime. Use ## Test Types -1. **Python unit tests (ut-py)** (`tests/ut/`): Standard pytest tests for the Python compilation pipeline and nanobind bindings. Run with `pytest tests/ut`. Tests declaring `@pytest.mark.requires_hardware[("<platform>")]` auto-skip unless `--platform` points to a matching device. +1. **Python unit tests (ut-py)** (`tests/ut/py/`): Standard pytest tests for the Python compilation pipeline and nanobind bindings. Run with `pytest tests/ut/py`. Tests declaring `@pytest.mark.requires_hardware[("<platform>")]` auto-skip unless `--platform` points to a matching device. 2. **C++ unit tests (ut-cpp)** (`tests/ut/cpp/`): GoogleTest-based tests for pure C++ modules. Run with `cmake -B tests/ut/cpp/build -S tests/ut/cpp && cmake --build tests/ut/cpp/build && ctest --test-dir tests/ut/cpp/build -LE requires_hardware --output-on-failure`. Hardware-required tests carry a `requires_hardware` or `requires_hardware_<platform>` ctest label and are filtered via `-LE`. 3. **Scene tests** (`examples/{arch}/*/`, `tests/st/{arch}/*/`): End-to-end `@scene_test` classes declared inside `test_*.py`. Sim variants run cross-platform (Linux/macOS); hardware variants require the CANN toolkit and an Ascend device. Discovery is by pytest (batch) or `python test_*.py` (standalone); `#591`'s parallel orchestrator handles device bin-packing and ChipWorker reuse automatically. diff --git a/docs/ci.md b/docs/ci.md index bed71707c..d2c159c38 100644 --- a/docs/ci.md +++ b/docs/ci.md @@ -112,7 +112,7 @@ Three hardware tiers, applied to all test categories. See [testing.md](testing.m ## Test Sources -### `tests/ut/` — Python unit tests (ut-py) +### `tests/ut/py/` — Python unit tests (ut-py) Python unit tests.
Run via pytest, filtered by `--platform` + `requires_hardware` marker. diff --git a/docs/testing.md b/docs/testing.md index 53edf17fd..0c45efa64 100644 --- a/docs/testing.md +++ b/docs/testing.md @@ -60,12 +60,12 @@ Three test categories: | Category | Abbrev | Location | Runner | Description | | -------- | ------ | -------- | ------ | ----------- | | System tests | st | `examples/`, `tests/st/` | pytest (`@scene_test`) or standalone `python test_*.py` | Full end-to-end cases (compile + run + validate) | -| Python unit tests | ut-py | `tests/ut/` | pytest | Unit tests for nanobind-exposed and Python modules | +| Python unit tests | ut-py | `tests/ut/py/` | pytest | Unit tests for nanobind-exposed and Python modules | | C++ unit tests | ut-cpp | `tests/ut/cpp/` | ctest (GoogleTest) | Unit tests for pure C++ modules | ### Choosing ut-py vs ut-cpp -If a module is exposed via nanobind (used by both C++ and Python), test in **ut-py** (`tests/ut/`). +If a module is exposed via nanobind (used by both C++ and Python), test in **ut-py** (`tests/ut/py/`). If a module is pure C++ with no Python binding, test in **ut-cpp** (`tests/ut/cpp/`). ## Scene Test CLI Options @@ -384,9 +384,18 @@ conftest.py # Root: --platform/--device options, ST fixtures ### C++ Unit Tests (`tests/ut/cpp/`) -GoogleTest-based tests for shared components (`src/common/task_interface/` and `src/{arch}/runtime/common/`): +See [ut-test-suite.md](ut-test-suite.md) for the full per-file coverage +reference (370+ test cases across 55 files).
-- `test_data_type.cpp` — DataType enum, get_element_size(), get_dtype_name() +GoogleTest-based tests organized by component: + +| Subdirectory | Component under test | +| ------------ | -------------------- | +| `pto2_a2a3/` | PTO2 a2a3 on-chip runtime (`src/a2a3/runtime/tensormap_and_ringbuffer/`) | +| `pto2_a5/` | PTO2 a5 on-chip runtime (`src/a5/runtime/tensormap_and_ringbuffer/`) | +| `hierarchical/` | Host-side hierarchical runtime (`src/common/hierarchical/`) | +| `types/` | Cross-cutting types (`src/common/task_interface/`, `pto_types.h`) | +| `hardware/` | Tests requiring Ascend hardware (`comm_hccl`, etc.) | ```bash cmake -B tests/ut/cpp/build -S tests/ut/cpp @@ -394,7 +403,7 @@ cmake --build tests/ut/cpp/build ctest --test-dir tests/ut/cpp/build --output-on-failure ``` -### Python Unit Tests (`tests/ut/`) +### Python Unit Tests (`tests/ut/py/`) Tests for the nanobind extension and the Python build pipeline: @@ -403,10 +412,10 @@ Tests for the nanobind extension and the Python build pipeline: ```bash # No-hardware runner (hw tests auto-skip, no-hw tests run) -pytest tests/ut +pytest tests/ut/py # a2a3 hardware runner (no-hw tests skip, hw + a2a3-specific tests run) -pytest tests/ut --platform a2a3 +pytest tests/ut/py --platform a2a3 ``` ### Examples (`examples/{arch}/`) @@ -434,21 +443,26 @@ Hardware-only scene tests for large-scale and feature-rich scenarios that are to ### New C++ Unit Test -Add a new test file to `tests/ut/cpp/` and register it in `tests/ut/cpp/CMakeLists.txt`: +Add a new test file to the appropriate subdirectory under `tests/ut/cpp/` and register it in `tests/ut/cpp/CMakeLists.txt`: + +| Component | Subdirectory | Helper function | +| --------- | ------------ | --------------- | +| PTO2 a2a3 header-only | `pto2_a2a3/` | `add_a2a3_pto2_test` | +| PTO2 a2a3 runtime-linked | `pto2_a2a3/` | `add_a2a3_pto2_runtime_test` | +| PTO2 a5 | `pto2_a5/` | `add_a5_pto2_test` | +| Hierarchical host runtime | `hierarchical/` | `add_hierarchical_test` 
| +| Task interface types | `types/` | `add_task_interface_test` | +| Hardware (CANN) | `hardware/` | `add_comm_api_test` | ```cmake -add_executable(test_my_component - test_my_component.cpp - test_stubs.cpp +# Example: header-only PTO2 a2a3 test +add_a2a3_pto2_test(test_my_component pto2_a2a3/test_my_component.cpp) + +# Example: runtime-linked PTO2 a2a3 test +add_a2a3_pto2_runtime_test(test_my_component + SOURCES pto2_a2a3/test_my_component.cpp + EXTRA_SOURCES ${PTO2_RUNTIME_SOURCES} ) -target_include_directories(test_my_component PRIVATE ${COMMON_DIR} ${TMR_RUNTIME_DIR} ${PLATFORM_INCLUDE_DIR}) -target_link_libraries(test_my_component gtest_main) -add_test(NAME test_my_component COMMAND test_my_component) - -# If hardware required: -# set_tests_properties(test_my_component PROPERTIES LABELS "requires_hardware") -# If specific platform required: -# set_tests_properties(test_my_component PROPERTIES LABELS "requires_hardware_a2a3") ``` #### C++ hardware tests needing NPU devices diff --git a/docs/ut-test-suite.md b/docs/ut-test-suite.md new file mode 100644 index 000000000..c1c66d208 --- /dev/null +++ b/docs/ut-test-suite.md @@ -0,0 +1,372 @@ +# Unit Test Suite Reference + +Comprehensive reference for the unit tests under `tests/ut/`. For build +commands, hardware classification, and CI integration see +[testing.md](testing.md) and [ci.md](ci.md). 
+ +## Directory Layout + +```text +tests/ut/ +├── cpp/ # C++ GoogleTest binaries (CMake) +│ ├── CMakeLists.txt # Build orchestration and helper functions +│ ├── test_helpers.h # Shared test utilities +│ ├── stubs/ +│ │ └── test_stubs.cpp # Platform-abstraction stubs (logging, asserts) +│ ├── hierarchical/ # Host-side hierarchical runtime (L0-L6) +│ │ ├── test_tensormap.cpp +│ │ ├── test_ring.cpp +│ │ ├── test_scope.cpp +│ │ ├── test_orchestrator.cpp +│ │ ├── test_scheduler.cpp +│ │ └── test_worker_manager.cpp +│ ├── platform/ # Platform abstraction layer (sim variant) +│ │ ├── test_platform_memory_allocator.cpp +│ │ └── test_platform_host_log.cpp +│ ├── types/ # Cross-cutting ABI-contract types +│ │ ├── test_child_memory.cpp +│ │ ├── test_pto_types.cpp +│ │ └── test_tensor.cpp +│ ├── pto2_a2a3/ # PTO2 on-chip runtime (A2A3 architecture) +│ │ ├── test_a2a3_pto2_fatal.cpp +│ │ ├── test_core_types.cpp +│ │ ├── test_dispatch_payload.cpp +│ │ ├── test_handshake.cpp +│ │ ├── test_submit_types.cpp +│ │ ├── test_ring_buffer.cpp +│ │ ├── test_ring_buffer_edge.cpp +│ │ ├── test_tensormap_edge.cpp +│ │ ├── test_ready_queue.cpp +│ │ ├── test_scheduler_state.cpp +│ │ ├── test_scheduler_edge.cpp +│ │ ├── test_shared_memory.cpp +│ │ ├── test_boundary_edge.cpp +│ │ ├── test_coupling.cpp +│ │ ├── test_coupling_stub.cpp +│ │ ├── test_runtime_graph.cpp +│ │ ├── test_runtime_lifecycle.cpp +│ │ ├── test_runtime_status.cpp +│ │ ├── test_orchestrator_submit.cpp +│ │ └── test_orchestrator_fatal.cpp +│ ├── pto2_a5/ # PTO2 on-chip runtime (A5 architecture) +│ │ └── test_a5_pto2_fatal.cpp +│ └── hardware/ # Hardware-gated tests (CANN required) +│ └── test_hccl_comm.cpp +└── py/ # Python pytest-based tests + ├── conftest.py # Fixtures, sys.path setup + ├── test_elf_parser.py + ├── test_env_manager.py + ├── test_kernel_compiler.py + ├── test_runtime_compiler.py + ├── test_toolchain.py + ├── test_toolchain_setup.py + ├── test_task_interface.py + ├── test_runtime_builder.py + ├── 
test_chip_worker.py + ├── test_hostsub_fork_shm.py + ├── test_worker/ # Worker subsystem tests + │ ├── test_host_worker.py + │ ├── test_bootstrap_channel.py + │ ├── test_bootstrap_context_hw.py + │ ├── test_bootstrap_context_sim.py + │ ├── test_error_propagation.py + │ ├── test_group_task.py + │ ├── test_l4_recursive.py + │ ├── test_mailbox_atomics.py + │ ├── test_multi_worker.py + │ ├── test_platform_comm.py + │ ├── test_worker_distributed_hw.py + │ └── test_worker_distributed_sim.py +``` + +## Organization Principles + +### Subdirectory-per-component + +C++ tests are grouped by the source component they exercise: + +| Subdirectory | Source under test | CMake helper | +| ------------ | ----------------- | ------------ | +| `hierarchical/` | `src/common/hierarchical/` | `add_hierarchical_test` | +| `platform/` | `src/a2a3/platform/` (sim variant) | inline targets | +| `types/` | `src/common/task_interface/` | `add_task_interface_test` | +| `pto2_a2a3/` | `src/a2a3/runtime/tensormap_and_ringbuffer/` | `add_a2a3_pto2_test` / `add_a2a3_pto2_runtime_test` | +| `pto2_a5/` | `src/a5/runtime/tensormap_and_ringbuffer/` | `add_a5_pto2_test` | +| `hardware/` | HCCL comm backend (needs CANN) | `add_comm_api_test` | + +Python tests are grouped by functional area: build infrastructure +(compilers, toolchain, ELF parsing), nanobind bindings, and the worker +subsystem. + +### Header-only vs runtime-linked + +PTO2 tests come in two flavors: + +- **Header-only** (`add_a2a3_pto2_test`): compile against orchestration/ + runtime headers only. No `.cpp` from the runtime is linked. Used for + type-layout, constant, and API-contract tests. +- **Runtime-linked** (`add_a2a3_pto2_runtime_test`): link the real + `pto_ring_buffer.cpp`, `pto_shared_memory.cpp`, `pto_scheduler.cpp`, + `pto_tensormap.cpp` (and optionally `pto_orchestrator.cpp`, + `pto_runtime2.cpp`). Used for behavioral and integration tests. 
+ +### Hardware gating + +All tests default to `no_hardware` (runnable on standard CI runners). Tests +that need Ascend hardware are gated by: + +- **C++**: `SIMPLER_ENABLE_HARDWARE_TESTS` CMake option + ctest labels + (`requires_hardware_a2a3`). +- **Python**: `@pytest.mark.requires_hardware` / `requires_hardware("a2a3")` + markers. + +### Test-design conventions + +- **AAA pattern**: Arrange-Act-Assert structure in each test. +- **Fixtures over globals**: GoogleTest fixtures (`TEST_F`) manage per-test + state; pytest fixtures handle setup/teardown. +- **Stubs for platform isolation**: `stubs/test_stubs.cpp` provides logging, + assertion, and timer stubs so on-chip runtime code compiles on x86/macOS + without CANN dependencies. +- **Edge-case files**: Files named `*_edge.cpp` focus on boundary conditions, + concurrency stress, and design-contract verification. + +## Test Design Philosophy + +The suite targets three goals: + +1. **ABI contract verification** — `sizeof`, `alignof`, field offsets, and + enum values are checked with `static_assert` and runtime assertions. + This catches silent layout drift when headers change. + +2. **Component isolation** — each test exercises one module with minimal + dependencies. Coupling tests (`test_coupling.cpp`, + `test_coupling_stub.cpp`) explicitly measure and document inter-component + dependencies. + +3. **Bug-candidate documentation** — edge-case tests encode known defects + and design tradeoffs as executable tests. When a test documents a real + src defect, it is preserved as a regression barrier. When a test + documents intentional design (e.g., LIFO dispatch order), it serves as + a contract anchor. + +## Coverage Map + +### C++ — Hierarchical Runtime (`hierarchical/`) + +Source: `src/common/hierarchical/` + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_tensormap.cpp` | 4 | Insert, lookup, overwrite, erase by task ID. Compound keys (pointer + worker ID). 
| +| `test_ring.cpp` | 5+ | Slot allocation monotonicity, heap slab alignment, FIFO reclamation, allocation bounds, back-pressure with small heap (8 KiB). | +| `test_scope.cpp` | 5 | Scope depth tracking, begin/end pairing, nested scopes, task registration and release callbacks, empty scope handling. | +| `test_orchestrator.cpp` | 1+ | Wiring TensorMap + Ring + Scope + ReadyQueues into a full Orchestrator. Independent-task readiness detection. | +| `test_scheduler.cpp` | 2+ | MockWorker-based dispatch verification. Single-task and task-group dispatch through Scheduler + WorkerManager integration. | +| `test_worker_manager.cpp` | 4+ | Worker pool lifecycle (THREAD mode), idle worker selection, dispatch, group dispatch. CountingWorker tracks run() calls. | + +### C++ — Platform Abstraction (`platform/`) + +Source: `src/a2a3/platform/` + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_platform_memory_allocator.cpp` | 4 | Sim memory allocator: allocation tracking, multi-allocation, nullptr safety, untracked-pointer handling. | +| `test_platform_host_log.cpp` | 3+ | HostLogger singleton: level filtering (`is_enabled`), env-var parsing (`PTO_LOG_LEVEL`), `reinitialize()` behavior. | + +### C++ — Cross-cutting Types (`types/`) + +Source: `src/common/task_interface/` + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_child_memory.cpp` | 3 | `ContinuousTensor` ABI layout (`sizeof == 40`), `child_memory` bit field, blob serialization roundtrip. | +| `test_pto_types.cpp` | 5+ | `TaskOutputTensors` init/materialize/get_ref/max-outputs, `Arg` tensor/scalar storage, `add_scalars_i32` zero-extension, `copy_scalars_from`. | +| `test_tensor.cpp` | 5+ | Segment intersection logic (overlapping, touching, disjoint, zero-length), `make_tensor_external()` factory, cache-line layout coupling. 
| + +### C++ — PTO2 A2A3 On-chip Runtime (`pto2_a2a3/`) + +Source: `src/a2a3/runtime/tensormap_and_ringbuffer/` + +#### API and type contracts (header-only) + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_a2a3_pto2_fatal.cpp` | 3+ | Fatal-path reporting through `pto2_orchestration_api.h`. Fake runtime + va_list formatting. | +| `test_core_types.cpp` | 5 | `PTO2TaskId` encode/extract (ring in upper 32, local in lower 32), roundtrip, `PTO2TaskSlotState` size (64 bytes), `PTO2_ALIGN_UP` macro. | +| `test_dispatch_payload.cpp` | 5+ | `PTO2DispatchPayload` 64-byte alignment, SPMD context index constants, `LocalContext`/`GlobalContext` field read/write. | +| `test_handshake.cpp` | 4+ | Handshake protocol macros: `MAKE_ACK_VALUE`/`MAKE_FIN_VALUE`, `EXTRACT_TASK_ID`/`EXTRACT_TASK_STATE`, bit-31 state encoding, reserved task IDs. | +| `test_submit_types.cpp` | 3+ | `pto2_subtask_active()` bitmask (AIC, AIV0, AIV1), `pto2_active_mask_to_shape()`, `pto2_mixed_kernels_to_active_mask()`. | +| `test_runtime_status.cpp` | 9 | `pto2_runtime_status()`: zero codes, single-error negation, precedence rules (orch > sched), pass-through for already-negative codes, range non-overlap. | + +#### Ring buffer and memory allocation (runtime-linked) + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_ring_buffer.cpp` | 10+ | `PTO2TaskAllocator` init, state queries, window size, heap allocation, FIFO reclamation, wrap-guard boundary. | +| `test_ring_buffer_edge.cpp` | 10+ | Edge cases: wrap-guard at `tail==alloc_size`, fragmentation reporting (`max` not `sum`), zero-size allocation, exact-heap-size allocation, oversized allocation, window saturation, slot mapping, task ID near INT32_MAX. `DepListPool` edge cases: contract violation, prepend chain, high-water mark, overflow error code. 
| + +#### TensorMap (runtime-linked) + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_tensormap_edge.cpp` | 15+ | Bug-candidate documentation: `check_overlap()` dimension mismatch, lookup saturation (16-producer limit), pool exhaustion, ABA in `cleanup_retired()`, `copy_from_tensor` zero-padding. Edge cases: 0-dim tensors, max-dim tensors, zero-length shapes. | + +#### Scheduler and ready queue (runtime-linked) + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_ready_queue.cpp` | 17 | `ReadyQueue` MPMC: empty pop, single push/pop, FIFO ordering, capacity limit, slot reuse, batch push/pop, size accuracy. Multi-threaded: 2P/2C and 1P/4C stress. `LocalReadyBuffer` LIFO: reset, ordering, overflow. | +| `test_scheduler_state.cpp` | 5+ | `init_slot()` helper, `check_and_handle_consumed` transitions (COMPLETED to CONSUMED), fanin/fanout reference counting. | +| `test_scheduler_edge.cpp` | 25+ | `ReadyQueue` edge cases: interleaved push/pop, exact-capacity fill/drain, relaxed-ordering size guard, high-contention stress (4P/4C, 5000 items). `LocalReadyBuffer` LIFO dispatch order, overflow, null backing. `SharedMem` edge: zero window size, corruption detection, undersized buffer, region non-overlap, header alignment. `TaskState` lifecycle: PENDING to CONSUMED, simultaneous subtask completion, fanin/fanout exactly-once semantics, invalid transitions. | + +#### Shared memory (runtime-linked) + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_shared_memory.cpp` | 6+ | `PTO2SharedMemoryHandle` create/destroy, ownership, header init values, per-ring independence, pointer alignment (`PTO2_ALIGN_SIZE`), `calculate_size()`. | + +#### Boundary and stress tests (runtime-linked) + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_boundary_edge.cpp` | 17+ | `ReadyQueue` stress: 8P/8C, rapid fill/drain cycles, batch contention. 
`TaskAllocator` re-init: reset counter, heap, error state, multi-cycle, stale `last_alive`. Sequence wrap near INT64_MAX: single, fill/drain, interleaved, batch, concurrent. `SharedMemory` concurrency: per-ring isolation, atomic increment, `orchestrator_done` race, monotonic advancement, validate after concurrent writes. | + +#### Coupling analysis (runtime-linked) + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_coupling.cpp` | 4+ | Architectural coupling detection: whether components can operate in isolation. `TMRSystem` full init/destroy measuring dependency graph. | +| `test_coupling_stub.cpp` | 14 | `DepPool` stub isolation: reclaim below/at interval. Scheduler without orchestrator: init/destroy, standalone `ReadyQueue`, fanin release, non-profiling path, mixed-task completion. `TensorMap` link decoupling: builds without `orchestrator.cpp`, orchestrator pointer never dereferenced in hot path. Compile-time include coupling: `RingBuffer` to `Scheduler`, duplicated slot-mask formula, `PTO2_MAX_RING_DEPTH` in 4 components, transitive includes. Profiling behavior: CAS guard in profiling vs atomicity in non-profiling. | + +#### Orchestrator (runtime-linked) + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_orchestrator_submit.cpp` | 12 | `set_scheduler`, `alloc_tensors` validation (empty/scalar/input args mark fatal), output-only materialization, post-fatal short-circuit, submit with error args, pure-input submit, output materialization, `orchestrator_done` idempotency. | +| `test_orchestrator_fatal.cpp` | 11 | Fatal error latching: initial state, `report_fatal` sets local flag + shared code, second report does not overwrite, `ERROR_NONE` does not latch, all 9 error codes latch correctly, null/empty/varargs format strings, status helper reads latched code. 
| + +#### Runtime lifecycle (runtime-linked) + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_runtime_lifecycle.cpp` | 12 | `pto2_runtime_create_custom` initialization, orchestrator-to-scheduler connection, default creation, null SM handle, caller-allocated buffers, null-safe destroy, heap release, `set_mode`, ops table population, `is_fatal` / `report_fatal`. | + +#### Runtime graph — host_build_graph (runtime-linked) + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_runtime_graph.cpp` | 10 | `RuntimeGraph`: monotonic task IDs, field storage, successor updates (fanout/fanin), ready-task detection, diamond DAG, linear chain, fanout/fanin consistency, max-task limit, tensor-pair management, function binary address mapping. | + +### C++ — PTO2 A5 On-chip Runtime (`pto2_a5/`) + +Source: `src/a5/runtime/tensormap_and_ringbuffer/` + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_a5_pto2_fatal.cpp` | 3 | API short-circuit after fatal, explicit fatal routing through ops table, `alloc_tensor` overflow reports invalid args instead of asserting. | + +### C++ — Hardware Tests (`hardware/`) + +Gated by `SIMPLER_ENABLE_HARDWARE_TESTS=ON`. Labeled +`requires_hardware_a2a3`. + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_hccl_comm.cpp` | 3+ | HCCL backend lifecycle: `dlopen(libhost_runtime.so)`, comm init/alloc/query/destroy. CTest resource allocation for 2-device tests. | + +### Python — Build Infrastructure (`py/`) + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_elf_parser.py` | 3+ | ELF64 and Mach-O `.text` section extraction from raw struct-packed binaries. `_extract_cstring`, `extract_text_section`. | +| `test_env_manager.py` | 5+ | `env_manager.get()`, `ensure()`, caching behavior, error on unset/empty vars. Uses `monkeypatch` for env isolation. 
| +| `test_kernel_compiler.py` | 4+ | Platform include dirs (a2a3 vs a5), orchestration include dirs. Mock `ASCEND_HOME_PATH` fixture. | +| `test_runtime_compiler.py` | 4+ | `BuildTarget` CMake arg generation, `root_dir` absoluteness, `binary_name`, `RuntimeCompiler` singleton reset. | +| `test_toolchain.py` | 5+ | `_parse_compiler_env()` for conda flags, `GxxToolchainCmakeArgs` (plain/conda env, quoted paths, CMAKE_C/CXX_FLAGS). | +| `test_toolchain_setup.py` | 18 | CCEC toolchain compile flags (a2a3/a5, aic/aiv), unknown platform, missing compiler. Gxx15 toolchain (`__DAV_VEC__`/`__DAV_CUBE__` defines, `__CPU_SIM`). Gxx/Aarch64Gxx cmake args, env vars, cross-compile. `ToolchainType` enum values. | +| `test_runtime_builder.py` | 16 | Runtime discovery (real project tree), config resolution, missing/empty dirs, sorted output. `get_binaries()` error handling, compiler invocation count, path resolution, error propagation. Integration: real compilation produces non-empty `.so` files. | + +### Python — Nanobind and Type Contracts (`py/`) + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_task_interface.py` | 10+ | `DataType` enum ABI values (FLOAT32, FLOAT16, INT32, ...), `get_element_size()` parametrized, nanobind `_task_interface` extension (`ContinuousTensor`, `TaskArgs`, `ChipStorageTaskArgs`), torch integration. | + +### Python — ChipWorker and Fork/SHM (`py/`) + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_chip_worker.py` | 11 | `ChipCallConfig` defaults/setters/repr. `ChipWorker` state machine: uninitialized state, run-before-set-device, set-device-before-init, reset/finalize idempotency, init-after-finalize, nonexistent lib. Python import verification. | +| `test_hostsub_fork_shm.py` | 6 | `SharedMemory` cross-fork access. `torch.share_memory_()` mutations across fork. Callable registry in forked child. Mailbox state machine (IDLE/TASK_READY/TASK_DONE cycling). 
Parallel wall-time verification (3 SubWorkers). Threading after fork. | + +### Python — Worker Subsystem (`py/test_worker/`) + +| File | Tests | What it covers | +| ---- | ----- | -------------- | +| `test_host_worker.py` | 18 | Worker lifecycle (init/close, context manager, register-after-init). Single sub-task execution and multiple runs. `submit_sub()` return type. Scope management (run-managed, user-nested, 3-deep nesting). `alloc()` tensor validity, dependency wiring, unused-freed, no-leak across runs. Sub-callable receives tensor metadata, scalar, empty args. | +| `test_bootstrap_channel.py` | 7 | `BootstrapChannel` state machine: fresh=IDLE, write success/error fields, reset transition, cross-process fork, buffer-ptr overflow, error message truncation. | +| `test_bootstrap_context_hw.py` | 1 | 2-rank hardware smoke: `ChipWorker.bootstrap_context` populates device_ctx, window_base, window_size, buffer_ptrs. | +| `test_bootstrap_context_sim.py` | 4 | 2-rank sim bootstrap, `load_from_host` roundtrip, channel SUCCESS fields, invalid-placement error publishing. | +| `test_error_propagation.py` | 5 | Sub-worker exception surfacing (type/message preserved), missing callable_id, failure-does-not-wedge (next run succeeds), post-failure submit re-raises, L4-chained failure surfaces with layer prefixes. | +| `test_group_task.py` | 3 | `submit_sub_group` with 2 args dispatches to 2 SubWorkers, single-arg group, group-then-dependent-task ordering. | +| `test_l4_recursive.py` | 13 | L4 lifecycle (no children, with L3 child, context manager). Validation (level check, add-after-init, initialized-child). L4-to-L3 dispatch (single, triple, with own subs). Multiple runs no-leak. L3 child with multiple subs. L3 own orchestrator. Generalized `_Worker` level parameter. | +| `test_mailbox_atomics.py` | 6 | `_mailbox_store_i32`/`load_i32` roundtrip (positive, negative, offset). Cross-process visibility via `MAP_SHARED`. Release/acquire ordering: payload visible when state observed. 
L3 sub-worker dispatch roundtrip. | +| `test_multi_worker.py` | 3 | Two-worker parallel execution with thread-local isolation. Sequential task stress (20 tasks, 1 SubWorker). 20 tasks across 2 SubWorkers, all complete exactly once. | +| `test_platform_comm.py` | 1 | 2-rank hardware smoke: `comm_init` to `comm_destroy` lifecycle (barrier failure tolerated per HCCL 507018). | +| `test_worker_distributed_hw.py` | 1 | 2-rank hardware smoke: `Worker(chip_bootstrap_configs=...)` populates `chip_contexts` with device_ctx, window_base, buffer_ptrs per rank. No `comm_barrier`. | +| `test_worker_distributed_sim.py` | 5 | Worker-level chip bootstrap on sim: happy-path `chip_contexts` population + `/dev/shm` leak check, pre-init access rejection, invalid placement error path + cleanup, level-below-3 rejection, config/device_ids length mismatch. | + +## Test Counts Summary + +| Category | Files | Approx. test cases | +| -------- | ----- | ------------------ | +| C++ hierarchical | 6 | 20+ | +| C++ platform | 2 | 7+ | +| C++ types | 3 | 13+ | +| C++ PTO2 A2A3 | 20 | 180+ | +| C++ PTO2 A5 | 1 | 3 | +| C++ hardware | 1 | 3+ | +| Python build infra | 7 | 50+ | +| Python nanobind | 1 | 10+ | +| Python ChipWorker/fork | 2 | 17 | +| Python worker subsystem | 12 | 67+ | +| **Total** | **55** | **370+** | + +## Infrastructure + +### CMake Helper Functions + +| Function | Linker scope | Use for | +| -------- | ------------ | ------- | +| `add_hierarchical_test(name src)` | Full hierarchical runtime sources | Tests under `hierarchical/` | +| `add_task_interface_test(name src)` | Header-only (`task_interface/`) | ABI-contract tests under `types/` | +| `add_a2a3_pto2_test(name src)` | Header-only (orchestration + runtime headers) | PTO2 type/constant tests | +| `add_a2a3_pto2_runtime_test(name SOURCES ...
EXTRA_SOURCES ...)` | Stubs + selected runtime `.cpp` files | Behavioral PTO2 tests | +| `add_a5_pto2_test(name src)` | Header-only (A5 orchestration + runtime) | A5-specific tests | +| `add_comm_api_test(name src)` | CANN `libascendcl` + `dlopen` | Hardware-gated HCCL tests | + +### Platform Stubs (`stubs/test_stubs.cpp`) + +Provides userspace implementations for symbols that on-chip runtime code +expects from the AICPU environment: + +- `unified_log_{error,warn,info,debug,always}` — logging (stderr) +- `get_sys_cnt_aicpu()` — timer stub (returns 0) +- `get_stacktrace()` — stack trace (returns empty string) +- `assert_impl()` — assertion handler (throws `AssertionError`) + +This allows the full runtime `.cpp` files to compile and link on +x86_64/aarch64/macOS without CANN. + +### Python conftest (`py/conftest.py`) + +- Adds `PROJECT_ROOT` to `sys.path` for `import simpler_setup` +- Adds `python/` for `from simpler import env_manager` +- Adds `python/simpler/` for legacy `import env_manager` compatibility +- Provides `project_root` fixture returning the `PROJECT_ROOT` `Path` + +### Test Helpers (`test_helpers.h`) + +- `test_ready_queue_init()` — initialize a `ReadyQueue` with + caller-provided buffer and arbitrary start sequence number diff --git a/simpler_setup/runtime_compiler.py b/simpler_setup/runtime_compiler.py index 4de30ec2b..519ef0726 100644 --- a/simpler_setup/runtime_compiler.py +++ b/simpler_setup/runtime_compiler.py @@ -78,6 +78,11 @@ def get_instance(cls, platform: str = "a2a3") -> "RuntimeCompiler": cls._instances[platform] = cls(platform) return cls._instances[platform] + @classmethod + def reset_instances(cls) -> None: + """Clear the singleton cache. 
Intended for test isolation.""" + cls._instances.clear() + def __init__(self, platform: str = "a2a3"): self.platform = platform self.project_root = PROJECT_ROOT diff --git a/tests/ut/py/conftest.py b/tests/ut/py/conftest.py index e3936c3c0..7de3b3bb3 100644 --- a/tests/ut/py/conftest.py +++ b/tests/ut/py/conftest.py @@ -8,15 +8,33 @@ # ----------------------------------------------------------------------------------------------------------- """Pytest configuration for Python unit tests (tests/ut/py/). -Adds project directories to sys.path so that simpler_setup, task_interface, -and host_worker modules are importable without installing the package. +Adds project directories to sys.path so that: +- ``import simpler_setup`` works (PROJECT_ROOT on path) +- ``from simpler import env_manager`` works (python/ on path) +- legacy ``import env_manager`` works (python/simpler/ on path) """ import sys from pathlib import Path -_ROOT = Path(__file__).resolve().parent.parent.parent.parent -for _d in [_ROOT, _ROOT / "python"]: +import pytest + +_ROOT = Path(__file__).parent.parent.parent.parent + +# Each new entry is inserted at index 0, so lookup priority is the reverse of +# the list order: python/simpler/ (legacy ``import env_manager``), then python/ +# (``from simpler import env_manager``), then PROJECT_ROOT (``import simpler_setup``). +for _d in [ + _ROOT, + _ROOT / "python", + _ROOT / "python" / "simpler", +]: _s = str(_d) if _s not in sys.path: sys.path.insert(0, _s) + + +@pytest.fixture +def project_root(): + """Return the project root directory.""" + return _ROOT diff --git a/tests/ut/py/test_elf_parser.py b/tests/ut/py/test_elf_parser.py new file mode 100644 index 000000000..a2fffe53d --- /dev/null +++ b/tests/ut/py/test_elf_parser.py @@ -0,0 +1,209 @@ +# Copyright (c) PyPTO Contributors. +# This program is free software, you can redistribute it and/or modify it under the terms and conditions of +# CANN Open Software License Agreement Version 2.0 (the "License").
+# Please refer to the License for details. You may not use this file except in compliance with the License. +# THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. +# See LICENSE in the root of the software repository for the full text of the License. +# ----------------------------------------------------------------------------------------------------------- +"""Tests for python/elf_parser.py - ELF64 and Mach-O .text extraction.""" + +import struct +import tempfile + +import pytest + +from simpler_setup.elf_parser import _extract_cstring, extract_text_section + + +def _build_elf64_with_text(text_data: bytes) -> bytes: + """Build a minimal ELF64 .o file with a .text section.""" + # String table: \0.text\0.shstrtab\0 + strtab = b"\x00.text\x00.shstrtab\x00" + text_name_offset = 1 # offset of ".text" in strtab + shstrtab_name_offset = 7 # offset of ".shstrtab" in strtab + + # ELF header (64 bytes) + e_shoff = 64 # section headers right after ELF header + e_shnum = 3 # null + .text + .shstrtab + e_shstrndx = 2 # .shstrtab is section 2 + + elf_header = bytearray(64) + elf_header[0:4] = b"\x7fELF" + elf_header[4] = 2 # 64-bit + elf_header[5] = 1 # little-endian + elf_header[6] = 1 # version + struct.pack_into(" bytes: + """Build a minimal Mach-O 64-bit .o file with __text section.""" + # Header (32 bytes) + header = bytearray(32) + struct.pack_into("= 1 + assert any("a2a3" in d and "platform" in d and "include" in d for d in dirs) + + def test_a5sim_include_dirs(self): + """a5sim platform include dirs point to a5/platform/include.""" + env_manager._cache["ASCEND_HOME_PATH"] = None + from simpler_setup.kernel_compiler import KernelCompiler # noqa: PLC0415 + + kc = KernelCompiler(platform="a5sim") + dirs = kc.get_platform_include_dirs() + assert any("a5" in d and "platform" in d and "include" in d for d in dirs) + + 
+# ============================================================================= +# Orchestration include directory tests +# ============================================================================= + + +class TestOrchestrationIncludeDirs: + """Tests for get_orchestration_include_dirs().""" + + def test_a2a3_includes_runtime_dir(self, sim_compiler): + """Orchestration includes contain the runtime-specific directory.""" + dirs = sim_compiler.get_orchestration_include_dirs("host_build_graph") + assert any("host_build_graph" in d and "runtime" in d for d in dirs) + + def test_a5_includes_runtime_dir(self): + """A5 orchestration includes point to a5 runtime directory.""" + env_manager._cache["ASCEND_HOME_PATH"] = None + from simpler_setup.kernel_compiler import KernelCompiler # noqa: PLC0415 + + kc = KernelCompiler(platform="a5sim") + dirs = kc.get_orchestration_include_dirs("host_build_graph") + assert any("a5" in d and "host_build_graph" in d for d in dirs) + + +# ============================================================================= +# Platform to architecture mapping tests +# ============================================================================= + + +class TestPlatformToArchMapping: + """Tests for platform -> architecture directory mapping.""" + + def test_a2a3_maps_to_a2a3(self, sim_compiler): + """a2a3sim maps to a2a3 architecture directory.""" + assert "a2a3" in str(sim_compiler.platform_dir) + + def test_a5sim_maps_to_a5(self): + """a5sim maps to a5 architecture directory.""" + env_manager._cache["ASCEND_HOME_PATH"] = None + from simpler_setup.kernel_compiler import KernelCompiler # noqa: PLC0415 + + kc = KernelCompiler(platform="a5sim") + assert "a5" in str(kc.platform_dir) + + def test_unknown_platform_raises(self): + """Unknown platform raises ValueError.""" + env_manager._cache["ASCEND_HOME_PATH"] = None + from simpler_setup.kernel_compiler import KernelCompiler # noqa: PLC0415 + + with pytest.raises(ValueError, match="Unknown platform"): 
+ KernelCompiler(platform="z9000") + + +# ============================================================================= +# Toolchain selection tests (via compile_incore public API) +# ============================================================================= + + +class TestToolchainSelection: + """Tests for toolchain selection behavior via public API.""" + + def test_unknown_platform_compile_raises(self): + """Unknown platform raises ValueError at construction time.""" + env_manager._cache["ASCEND_HOME_PATH"] = None + from simpler_setup.kernel_compiler import KernelCompiler # noqa: PLC0415 + + with pytest.raises(ValueError, match="Unknown platform"): + KernelCompiler(platform="z9000_nonexistent") + + +# ============================================================================= +# Compilation error handling tests (via public compile methods) +# ============================================================================= + + +class TestCompilationErrors: + """Tests for compilation error handling via public API.""" + + def test_compile_incore_missing_source_raises(self, sim_compiler, tmp_path): + """Compiling a non-existent source file raises an error.""" + bad_source = str(tmp_path / "nonexistent_kernel.cpp") + with pytest.raises((RuntimeError, FileNotFoundError, OSError)): + sim_compiler.compile_incore(bad_source, core_type="aiv") + + def test_compile_orchestration_subprocess_failure(self, sim_compiler, tmp_path): + """Compilation failure propagates error with stderr content.""" + source = tmp_path / "dummy.cpp" + source.write_text("int main() {}") + with patch("simpler_setup.kernel_compiler.subprocess.run") as mock_run: + mock_run.return_value = MagicMock(returncode=1, stdout="", stderr="error: undefined reference to 'foo'") + with pytest.raises(RuntimeError, match="undefined reference"): + sim_compiler.compile_orchestration( + "host_build_graph", + str(source), + ) + + +# ============================================================================= +# 
Orchestration config loading tests (via get_orchestration_include_dirs) +# ============================================================================= + + +class TestOrchestrationConfig: + """Tests for orchestration config behavior via public API.""" + + def test_nonexistent_runtime_include_dirs(self, sim_compiler): + """Non-existent runtime still returns base include dirs (no crash).""" + dirs = sim_compiler.get_orchestration_include_dirs("nonexistent_runtime") + # Should return at least the platform includes, not crash + assert isinstance(dirs, list) diff --git a/tests/ut/py/test_runtime_builder.py b/tests/ut/py/test_runtime_builder.py index 6d5951dcd..97458d388 100644 --- a/tests/ut/py/test_runtime_builder.py +++ b/tests/ut/py/test_runtime_builder.py @@ -287,7 +287,7 @@ def _reset_compiler_singleton(self): from simpler_setup.runtime_compiler import RuntimeCompiler # noqa: PLC0415 yield - RuntimeCompiler._instances.clear() + RuntimeCompiler.reset_instances() def test_get_binaries_returns_valid_paths(self, platform, runtime_name): """get_binaries(build=True) produces RuntimeBinaries with existing files.""" diff --git a/tests/ut/py/test_runtime_compiler.py b/tests/ut/py/test_runtime_compiler.py new file mode 100644 index 000000000..673da3510 --- /dev/null +++ b/tests/ut/py/test_runtime_compiler.py @@ -0,0 +1,151 @@ +# Copyright (c) PyPTO Contributors. +# This program is free software, you can redistribute it and/or modify it under the terms and conditions of +# CANN Open Software License Agreement Version 2.0 (the "License"). +# Please refer to the License for details. You may not use this file except in compliance with the License. +# THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. +# See LICENSE in the root of the software repository for the full text of the License. 
+# ----------------------------------------------------------------------------------------------------------- +"""Unit tests for simpler_setup/runtime_compiler.py -- CMake-based runtime compilation.""" + +import os +from unittest.mock import MagicMock, patch + +import pytest +from simpler import env_manager + +# ============================================================================= +# Fixtures +# ============================================================================= + + +@pytest.fixture(autouse=True) +def _clear_env_manager_cache(): + """Clear env_manager cache before and after each test.""" + env_manager._cache.clear() + yield + env_manager._cache.clear() + + +@pytest.fixture(autouse=True) +def _reset_compiler_singleton(): + """Reset RuntimeCompiler singleton cache after each test.""" + from simpler_setup.runtime_compiler import RuntimeCompiler # noqa: PLC0415 + + yield + RuntimeCompiler.reset_instances() + + +# ============================================================================= +# BuildTarget tests +# ============================================================================= + + +class TestBuildTarget: + """Tests for BuildTarget CMake argument generation.""" + + def test_cmake_args_assembly(self, tmp_path): + """gen_cmake_args() combines toolchain args with include/source dirs.""" + env_manager._cache["ASCEND_HOME_PATH"] = None + from simpler_setup.runtime_compiler import BuildTarget # noqa: PLC0415 + + mock_toolchain = MagicMock() + mock_toolchain.get_cmake_args.return_value = ["-DCMAKE_CXX_COMPILER=g++"] + + target = BuildTarget(mock_toolchain, str(tmp_path), "libtest.so") + args = target.gen_cmake_args(include_dirs=[str(tmp_path / "inc")], source_dirs=[str(tmp_path / "src")]) + + assert "-DCMAKE_CXX_COMPILER=g++" in args + assert any("CUSTOM_INCLUDE_DIRS" in a for a in args) + assert any("CUSTOM_SOURCE_DIRS" in a for a in args) + + def test_root_dir_is_absolute(self, tmp_path): + """get_root_dir() returns an absolute path.""" + 
env_manager._cache["ASCEND_HOME_PATH"] = None + from simpler_setup.runtime_compiler import BuildTarget # noqa: PLC0415 + + mock_toolchain = MagicMock() + target = BuildTarget(mock_toolchain, str(tmp_path / "src"), "lib.so") + assert os.path.isabs(target.get_root_dir()) + + def test_binary_name(self, tmp_path): + """get_binary_name() returns the configured name.""" + env_manager._cache["ASCEND_HOME_PATH"] = None + from simpler_setup.runtime_compiler import BuildTarget # noqa: PLC0415 + + mock_toolchain = MagicMock() + target = BuildTarget(mock_toolchain, str(tmp_path), "mylib.so") + assert target.get_binary_name() == "mylib.so" + + +# ============================================================================= +# RuntimeCompiler tests +# ============================================================================= + + +class TestRuntimeCompiler: + """Tests for RuntimeCompiler initialization and validation.""" + + @patch("simpler_setup.runtime_compiler.RuntimeCompiler._ensure_host_compilers") + def test_unknown_platform_raises(self, mock_ensure): + """Unknown platform raises ValueError with supported list.""" + from simpler_setup.runtime_compiler import RuntimeCompiler # noqa: PLC0415 + + with pytest.raises(ValueError, match="Unknown platform.*Supported"): + RuntimeCompiler("z9000") + + @patch("simpler_setup.runtime_compiler.RuntimeCompiler._ensure_host_compilers") + def test_missing_platform_dir_raises(self, mock_ensure, tmp_path): + """Precondition only: the a2a3sim platform dir is absent under an empty root.""" + # a2a3sim expects src/a2a3/platform/sim/ to exist. + # This test does not construct RuntimeCompiler, so the ValueError path is not exercised; + # it only confirms that the expected platform dir is missing under an empty tmp_path. + phantom_dir = tmp_path / "src" / "a2a3" / "platform" / "sim" + assert not phantom_dir.is_dir() + + def test_singleton_pattern(self): + """get_instance() returns same instance for same platform.""" + env_manager._cache["ASCEND_HOME_PATH"] = None + from simpler_setup.runtime_compiler import 
RuntimeCompiler # noqa: PLC0415 + + with patch.object(RuntimeCompiler, "_ensure_host_compilers"): + rc1 = RuntimeCompiler.get_instance("a2a3sim") + rc2 = RuntimeCompiler.get_instance("a2a3sim") + assert rc1 is rc2 + + +# ============================================================================= +# Compiler availability tests (via construction behavior) +# ============================================================================= + + +class TestCompilerAvailability: + """Tests for compiler availability via construction.""" + + def test_sim_platform_construction_succeeds(self): + """Sim platform can be constructed (no hardware compilers needed).""" + env_manager._cache["ASCEND_HOME_PATH"] = None + from simpler_setup.runtime_compiler import RuntimeCompiler # noqa: PLC0415 + + with patch.object(RuntimeCompiler, "_ensure_host_compilers"): + rc = RuntimeCompiler("a2a3sim") + assert rc.platform == "a2a3sim" + + +# ============================================================================= +# Compile target validation tests +# ============================================================================= + + +class TestCompileTargetValidation: + """Tests for compile() target platform validation.""" + + @patch("simpler_setup.runtime_compiler.RuntimeCompiler._ensure_host_compilers") + def test_invalid_target_platform_raises(self, mock_ensure): + """Invalid target platform raises ValueError.""" + env_manager._cache["ASCEND_HOME_PATH"] = None + from simpler_setup.runtime_compiler import RuntimeCompiler # noqa: PLC0415 + + rc = RuntimeCompiler("a2a3sim") + with pytest.raises(ValueError, match="Invalid target platform"): + rc.compile("gpu", [], [], None) diff --git a/tests/ut/py/test_task_interface.py b/tests/ut/py/test_task_interface.py index 66024f2a9..e08c8d2ec 100644 --- a/tests/ut/py/test_task_interface.py +++ b/tests/ut/py/test_task_interface.py @@ -48,6 +48,7 @@ def test_enum_values_exist(self): assert DataType.UINT32 is not None def test_enum_int_values(self): + # 
ABI contract: values must match C++ header. assert DataType.FLOAT32.value == 0 assert DataType.FLOAT16.value == 1 assert DataType.INT32.value == 2 @@ -315,6 +316,7 @@ def test_clear(self): class TestTensorArgType: def test_enum_values(self): + # ABI contract: values must match C++ header. assert TensorArgType.INPUT.value == 0 assert TensorArgType.OUTPUT.value == 1 assert TensorArgType.INOUT.value == 2 @@ -444,7 +446,7 @@ def test_clear(self): assert args.scalar_count() == 0 def test_no_capacity_limit_tensors(self): - """TaskArgs is vector-backed — no per-class capacity limit on tensors.""" + """TaskArgs is vector-backed -- no per-class capacity limit on tensors.""" args = TaskArgs() for i in range(20): args.add_tensor(ContinuousTensor.make(i, (1,), DataType.INT8)) @@ -464,6 +466,7 @@ def test_no_capacity_limit_scalars(self): class TestArgDirection: def test_enum_values(self): + # ABI contract: values must match C++ header. assert ArgDirection.SCALAR.value == 0 assert ArgDirection.IN.value == 1 assert ArgDirection.OUT.value == 2 diff --git a/tests/ut/py/test_toolchain_setup.py b/tests/ut/py/test_toolchain_setup.py new file mode 100644 index 000000000..c45347e66 --- /dev/null +++ b/tests/ut/py/test_toolchain_setup.py @@ -0,0 +1,235 @@ +# Copyright (c) PyPTO Contributors. +# This program is free software, you can redistribute it and/or modify it under the terms and conditions of +# CANN Open Software License Agreement Version 2.0 (the "License"). +# Please refer to the License for details. You may not use this file except in compliance with the License. +# THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. +# See LICENSE in the root of the software repository for the full text of the License. 
+# ----------------------------------------------------------------------------------------------------------- +"""Unit tests for simpler_setup/toolchain.py -- Toolchain configuration and flag generation.""" + +import os +from unittest.mock import patch + +import pytest +from simpler import env_manager + +from simpler_setup.toolchain import ( + Aarch64GxxToolchain, + CCECToolchain, + Gxx15Toolchain, + GxxToolchain, + ToolchainType, +) + +# ============================================================================= +# Fixtures +# ============================================================================= + + +@pytest.fixture(autouse=True) +def _clear_env_manager_cache(): + """Clear env_manager cache before each test.""" + env_manager._cache.clear() + yield + env_manager._cache.clear() + + +@pytest.fixture +def mock_ascend_home(tmp_path): + """Provide a fake ASCEND_HOME_PATH with expected compiler directories.""" + ascend = tmp_path / "ascend_toolkit" + # Create ccec paths for A2A3 + (ascend / "bin").mkdir(parents=True) + (ascend / "bin" / "ccec").touch() + (ascend / "bin" / "ld.lld").touch() + # Create ccec paths for A5 + (ascend / "tools" / "bisheng_compiler" / "bin").mkdir(parents=True) + (ascend / "tools" / "bisheng_compiler" / "bin" / "ccec").touch() + (ascend / "tools" / "bisheng_compiler" / "bin" / "ld.lld").touch() + # Create aarch64 cross-compiler paths + (ascend / "tools" / "hcc" / "bin").mkdir(parents=True) + (ascend / "tools" / "hcc" / "bin" / "aarch64-target-linux-gnu-g++").touch() + (ascend / "tools" / "hcc" / "bin" / "aarch64-target-linux-gnu-gcc").touch() + + env_manager._cache["ASCEND_HOME_PATH"] = str(ascend) + return str(ascend) + + +# ============================================================================= +# CCECToolchain tests +# ============================================================================= + + +class TestCCECToolchain: + """Tests for CCECToolchain compile flags and cmake args.""" + + def 
test_compile_flags_a2a3_aiv(self, mock_ascend_home): + """A2A3 platform with aiv core type produces dav-c220-vec flags.""" + tc = CCECToolchain(platform="a2a3") + flags = tc.get_compile_flags(core_type="aiv") + flag_str = " ".join(flags) + assert "dav-c220-vec" in flag_str + + def test_compile_flags_a2a3_aic(self, mock_ascend_home): + """A2A3 platform with aic core type produces dav-c220-cube flags.""" + tc = CCECToolchain(platform="a2a3") + flags = tc.get_compile_flags(core_type="aic") + flag_str = " ".join(flags) + assert "dav-c220-cube" in flag_str + + def test_compile_flags_a5_aiv(self, mock_ascend_home): + """A5 platform with aiv core type produces dav-c310-vec flags.""" + tc = CCECToolchain(platform="a5") + flags = tc.get_compile_flags(core_type="aiv") + flag_str = " ".join(flags) + assert "dav-c310-vec" in flag_str + + def test_compile_flags_a5_aic(self, mock_ascend_home): + """A5 platform with aic core type produces dav-c310-cube flags.""" + tc = CCECToolchain(platform="a5") + flags = tc.get_compile_flags(core_type="aic") + flag_str = " ".join(flags) + assert "dav-c310-cube" in flag_str + + def test_unknown_platform_raises(self, mock_ascend_home): + """Unknown platform raises ValueError on get_compile_flags.""" + tc = CCECToolchain(platform="unknown") + with pytest.raises(ValueError, match="Unknown platform"): + tc.get_compile_flags(core_type="aiv") + + def test_missing_ccec_compiler_raises(self, tmp_path): + """Missing ccec binary raises FileNotFoundError.""" + ascend = tmp_path / "empty_toolkit" + (ascend / "bin").mkdir(parents=True) + # No ccec binary created + env_manager._cache["ASCEND_HOME_PATH"] = str(ascend) + + with pytest.raises(FileNotFoundError, match="ccec compiler not found"): + CCECToolchain(platform="a2a3") + + def test_cmake_args_contain_bisheng(self, mock_ascend_home): + """CMake args include BISHENG_CC and BISHENG_LD.""" + tc = CCECToolchain(platform="a2a3") + args = tc.get_cmake_args() + assert any("BISHENG_CC" in a for a in args) + 
assert any("BISHENG_LD" in a for a in args) + + +# ============================================================================= +# Gxx15Toolchain tests +# ============================================================================= + + +class TestGxx15Toolchain: + """Tests for Gxx15Toolchain compile flags.""" + + def test_compile_flags_aiv_defines(self): + """aiv core type adds -D__DAV_VEC__.""" + env_manager._cache["ASCEND_HOME_PATH"] = None + tc = Gxx15Toolchain() + flags = tc.get_compile_flags(core_type="aiv") + assert "-D__DAV_VEC__" in flags + + def test_compile_flags_aic_defines(self): + """aic core type adds -D__DAV_CUBE__.""" + env_manager._cache["ASCEND_HOME_PATH"] = None + tc = Gxx15Toolchain() + flags = tc.get_compile_flags(core_type="aic") + assert "-D__DAV_CUBE__" in flags + + def test_compile_flags_no_core_type(self): + """Empty core type adds neither __DAV_VEC__ nor __DAV_CUBE__.""" + env_manager._cache["ASCEND_HOME_PATH"] = None + tc = Gxx15Toolchain() + flags = tc.get_compile_flags(core_type="") + assert "-D__DAV_VEC__" not in flags + assert "-D__DAV_CUBE__" not in flags + + def test_compile_flags_contain_cpu_sim(self): + """Simulation flags include -D__CPU_SIM.""" + env_manager._cache["ASCEND_HOME_PATH"] = None + tc = Gxx15Toolchain() + flags = tc.get_compile_flags() + assert "-D__CPU_SIM" in flags + + def test_cmake_args_respect_env_vars(self): + """CMake args use CC/CXX env vars when set.""" + env_manager._cache["ASCEND_HOME_PATH"] = None + tc = Gxx15Toolchain() + with patch.dict(os.environ, {"CC": "my-gcc", "CXX": "my-g++"}): + args = tc.get_cmake_args() + assert "-DCMAKE_C_COMPILER=my-gcc" in args + assert "-DCMAKE_CXX_COMPILER=my-g++" in args + + +# ============================================================================= +# GxxToolchain tests +# ============================================================================= + + +class TestGxxToolchain: + """Tests for GxxToolchain.""" + + def test_cmake_args_with_ascend(self, 
mock_ascend_home): + """With ASCEND_HOME_PATH, cmake args include it.""" + tc = GxxToolchain() + args = tc.get_cmake_args() + assert any("ASCEND_HOME_PATH" in a for a in args) + + def test_cmake_args_without_ascend(self): + """Without ASCEND_HOME_PATH, cmake args do not include it.""" + env_manager._cache["ASCEND_HOME_PATH"] = None + tc = GxxToolchain() + args = tc.get_cmake_args() + assert not any("ASCEND_HOME_PATH" in a for a in args) + + def test_compile_flags_contain_std17(self): + """Compile flags include C++17 standard.""" + env_manager._cache["ASCEND_HOME_PATH"] = None + tc = GxxToolchain() + flags = tc.get_compile_flags() + assert "-std=c++17" in flags + + +# ============================================================================= +# Aarch64GxxToolchain tests +# ============================================================================= + + +class TestAarch64GxxToolchain: + """Tests for Aarch64GxxToolchain.""" + + def test_cmake_args_cross_compile(self, mock_ascend_home): + """CMake args include aarch64 cross-compiler paths.""" + tc = Aarch64GxxToolchain() + args = tc.get_cmake_args() + assert any("aarch64-target-linux-gnu-gcc" in a for a in args) + assert any("aarch64-target-linux-gnu-g++" in a for a in args) + + def test_missing_compiler_raises(self, tmp_path): + """Missing aarch64 compiler raises FileNotFoundError.""" + ascend = tmp_path / "no_hcc" + (ascend / "tools" / "hcc" / "bin").mkdir(parents=True) + # No compiler binaries created + env_manager._cache["ASCEND_HOME_PATH"] = str(ascend) + + with pytest.raises(FileNotFoundError, match="aarch64"): + Aarch64GxxToolchain() + + +# ============================================================================= +# ToolchainType tests +# ============================================================================= + + +class TestToolchainType: + """Tests for ToolchainType enum.""" + + def test_enum_values(self): + """ToolchainType values match compile_strategy.h.""" + # ABI contract: values must match 
compile_strategy.h. + assert ToolchainType.CCEC == 0 + assert ToolchainType.HOST_GXX_15 == 1 + assert ToolchainType.HOST_GXX == 2 + assert ToolchainType.AARCH64_GXX == 3