Draft

127 commits
b762e65
Added explicit jobid logging to srun
leventeBajczi Mar 20, 2024
702f525
Added logging to retry
leventeBajczi Mar 20, 2024
6037e7f
Trying with small wait before seff
leventeBajczi Mar 20, 2024
512fc56
Fixed cancelling flow, as well as job id matching
leventeBajczi Mar 21, 2024
d8f1495
Using sacct instead of seff by default
leventeBajczi Mar 21, 2024
5431275
Added logging
leventeBajczi Mar 21, 2024
18a827e
Fixed log
leventeBajczi Mar 21, 2024
5e5f74a
Fixed line splitting
leventeBajczi Mar 21, 2024
796ef71
str()
leventeBajczi Mar 21, 2024
093b448
Added none checks to int()
leventeBajczi Mar 21, 2024
cafd1d9
Added log
leventeBajczi Mar 21, 2024
35bfb00
str() -> .decode()
leventeBajczi Mar 21, 2024
25958b1
Missing decode()
leventeBajczi Mar 21, 2024
1c47cd0
Check for not available memory
leventeBajczi Mar 21, 2024
83ca919
Removed sleep()
leventeBajczi Mar 21, 2024
1ab722d
Formatted
leventeBajczi Mar 21, 2024
2b297b9
Fixed stopping
leventeBajczi Apr 21, 2024
86aff97
using slurm arrayjob
leventeBajczi Oct 14, 2024
c963163
Added slurm with array-based aggregation
leventeBajczi Oct 17, 2024
b77be49
Added back non-array executor
leventeBajczi Oct 19, 2024
2fbcac4
Finalized arrayexecutor
leventeBajczi Oct 21, 2024
2dee15f
runexec no longer relative
leventeBajczi Nov 21, 2024
a9df02d
Merge remote-tracking branch 'upstream/main' into slurm-array-aggregate
leventeBajczi Nov 21, 2024
3ad572d
Changed to shlex.join
leventeBajczi Nov 21, 2024
63132ac
determining version in singularity now
leventeBajczi Nov 21, 2024
dd8d67b
Fix cwd
leventeBajczi Nov 21, 2024
9c13fb5
Removed erroneous param
leventeBajczi Nov 21, 2024
fb997a5
Fixed paths
leventeBajczi Nov 21, 2024
dedea8f
fixed result file collection
leventeBajczi Nov 21, 2024
ad9fa6f
Moving files as well
leventeBajczi Nov 21, 2024
2159e5f
Fixed source file path
leventeBajczi Nov 21, 2024
b14ad92
retrying version in singularity
leventeBajczi Nov 21, 2024
45eced4
fixed copying
leventeBajczi Nov 21, 2024
0510de7
Fixed dest path
leventeBajczi Nov 21, 2024
ddd77e7
fixed copying; no longer overwriting directory
leventeBajczi Nov 21, 2024
6a1d316
Using task result_files_folder instead of runset result_files_folder
leventeBajczi Nov 21, 2024
6b174f9
Enhanced logging
leventeBajczi Nov 21, 2024
8e2fe48
Only copying files if folder exists
leventeBajczi Nov 21, 2024
cf975aa
Forcing version query in singularity
leventeBajczi Nov 22, 2024
d0b7ee1
Fixed version string
leventeBajczi Nov 22, 2024
1108b4c
fix version parsing
leventeBajczi Nov 22, 2024
c7670ea
Determining system info now
leventeBajczi Nov 22, 2024
49707d9
Black
leventeBajczi Nov 22, 2024
9206329
Preserving logfiles if encountering an error
leventeBajczi Nov 22, 2024
c370475
Added logging
leventeBajczi Nov 22, 2024
217431b
creating folder if doesn't exist
leventeBajczi Nov 22, 2024
67b184d
syntax fix
leventeBajczi Nov 22, 2024
0295b5f
Retrying
leventeBajczi Nov 22, 2024
9706c73
fixed retry
leventeBajczi Nov 22, 2024
fd4f6ad
Fixed aggregation of results
leventeBajczi Nov 22, 2024
8131b76
Only using fuse where necessary
leventeBajczi Nov 22, 2024
5fc0bd5
Removed prefix, no longer necessary.
leventeBajczi Nov 22, 2024
ea9410b
keeping only witnesses
leventeBajczi Nov 23, 2024
9edfe88
Fixed timelimit for arrays
leventeBajczi Nov 23, 2024
4ad77c7
Merge remote-tracking branch 'upstream/main' into slurm-array-aggregate
leventeBajczi Nov 23, 2024
81c6c86
Overwriting version() instead of _version_from_tool()
leventeBajczi Nov 23, 2024
f07c037
Fix false positive timeout
leventeBajczi Nov 23, 2024
05c888d
int()
leventeBajczi Nov 23, 2024
c42de14
Handling Nones
leventeBajczi Nov 23, 2024
93ea05a
Format, prepare for MR
leventeBajczi Nov 23, 2024
dad8bd5
enhance documentation
leventeBajczi Nov 23, 2024
704161a
Black
leventeBajczi Nov 23, 2024
8a2babe
Add missing REUSE
leventeBajczi Nov 23, 2024
5d526df
singularity is not optional
leventeBajczi Nov 23, 2024
d633afc
sleeping before running array
leventeBajczi Nov 23, 2024
bc8435a
Specify ro and rw for binds
leventeBajczi Nov 23, 2024
23923d3
fix quotes
leventeBajczi Nov 23, 2024
adeb41d
Ruff fix
leventeBajczi Nov 23, 2024
ea14f51
Moved to resultfiles-based approach
leventeBajczi Nov 23, 2024
39256a0
Fixed copying result files
leventeBajczi Nov 23, 2024
222b306
Added catch-all witness pattern
leventeBajczi Nov 23, 2024
65c9a19
fixed concat
leventeBajczi Nov 23, 2024
5067d69
added cache to func
leventeBajczi Nov 23, 2024
cda2d98
Fixed moving files, only moving those that are necessary
leventeBajczi Nov 24, 2024
54eddc2
Syntax fix
leventeBajczi Nov 24, 2024
4ab571e
added logging for retrying
leventeBajczi Nov 25, 2024
dc6f5f2
Added logic to re-run experiments
leventeBajczi Nov 25, 2024
46b7925
Fixed argument
leventeBajczi Nov 25, 2024
a707842
Added filter
leventeBajczi Nov 25, 2024
52a207e
Added logging, fixed name
leventeBajczi Nov 25, 2024
dfbf57e
using proper logfile name now
leventeBajczi Nov 25, 2024
d5bfa92
Fixed logfile and resultfile paths
leventeBajczi Nov 25, 2024
7ddf826
Fixed resultfile paths
leventeBajczi Nov 25, 2024
ed7cf23
Added logging, and no longer adding status back
leventeBajczi Nov 25, 2024
6e94524
Added missing param
leventeBajczi Nov 25, 2024
92fa124
Added .cmdline()
leventeBajczi Nov 25, 2024
2288a77
Fixed copying
leventeBajczi Nov 25, 2024
8d0e7d5
fixed max, reformat
leventeBajczi Nov 25, 2024
4691c1e
No attempting recovery if there is no logfiles.zip
leventeBajczi Nov 26, 2024
00f17d0
copying instead of reading-writing
leventeBajczi Nov 26, 2024
2ca8afc
Fixed len() for filter()
leventeBajczi Nov 26, 2024
e11e8ab
Added further checks to recovery
leventeBajczi Nov 28, 2024
b11f215
reformat
leventeBajczi Nov 28, 2024
187925c
Fixed f-string
leventeBajczi Nov 28, 2024
74b2a73
Fixed options comparison
leventeBajczi Nov 28, 2024
19d70d8
Added srun timeout
leventeBajczi Nov 28, 2024
eed7c19
Not unzipping old files and logfiles any more
leventeBajczi Nov 28, 2024
c9e24bd
Added prefix to names
leventeBajczi Nov 28, 2024
57b3c4b
Fixed which params to check
leventeBajczi Nov 30, 2024
61baa52
Moved output printing outside of try block
leventeBajczi Nov 30, 2024
76fc58c
Removed run result setting to outside of try
leventeBajczi Dec 2, 2024
5d24b42
Merging rundefs if smaller than batchsize
leventeBajczi Mar 13, 2025
72a1af0
added --fakeroot and --contain to singularity command list
leventeBajczi Mar 13, 2025
5facff3
Added workaround to container
leventeBajczi Mar 13, 2025
021c42f
Added workaround to container, now working
leventeBajczi Mar 13, 2025
69c8a97
chmod TMPDIR as well
leventeBajczi Mar 13, 2025
346e602
Added readonlydir /
leventeBajczi Mar 13, 2025
1705836
Added fullaccessdir
leventeBajczi Mar 13, 2025
06a944f
Added missing overlay dir
leventeBajczi Mar 15, 2025
5b7130b
readonly basedir
leventeBajczi Mar 15, 2025
0118b47
simplified filesystem
leventeBajczi Apr 8, 2025
2255b67
Added --copy-tool
leventeBajczi Apr 11, 2025
88ab216
Merge remote-tracking branch 'upstream/main' into slurm-array-aggregate
leventeBajczi Apr 11, 2025
ecb68ca
added parens
leventeBajczi Apr 11, 2025
907c21f
Unsetting TMPDIR
leventeBajczi Apr 11, 2025
367eebe
Added input file copying as well
leventeBajczi Apr 11, 2025
a7db875
Added input file copying as well (fixed)
leventeBajczi Apr 11, 2025
79bb006
Added input file copying as well (fixed)
leventeBajczi Apr 11, 2025
e4a4006
put upper and workdir back
leventeBajczi Apr 11, 2025
ca39e01
Added missing mapping
leventeBajczi Apr 11, 2025
acf3454
Switched cd and cp
leventeBajczi Apr 11, 2025
bc91d1d
Added /tmp
leventeBajczi Apr 11, 2025
438d7a3
Added sleeping and fsync
leventeBajczi Apr 20, 2025
7ad0d6e
Added better log
leventeBajczi Apr 20, 2025
d817a58
Instead of fsync, using delay
leventeBajczi Apr 21, 2025
d4d5ed0
removed dependency on runexec
leventeBajczi Oct 20, 2025
fdabfb5
removed nonsense import
leventeBajczi Oct 20, 2025
64 changes: 62 additions & 2 deletions contrib/slurm-benchmark.py
@@ -14,6 +14,8 @@
import os
import sys

sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))

import benchexec.benchexec
import benchexec.tools
import benchexec.util
@@ -43,6 +45,12 @@ def create_argument_parser(self):
action="store_true",
help="Use SLURM to execute benchmarks.",
)
slurm_args.add_argument(
"--slurm-array",
dest="slurm_array",
action="store_true",
help="Use SLURM array jobs to execute benchmarks.",
)
slurm_args.add_argument(
"--singularity",
dest="singularity",
@@ -61,13 +69,65 @@ def create_argument_parser(self):
dest="retry",
type=int,
default="0",
help="Retry killed jobs this many times. Use -1 for unbounded retry attempts.",
help="Retry killed jobs this many times. Use -1 for unbounded retry attempts (cannot be used with --slurm-array).",
)

slurm_args.add_argument(
"--aggregation-factor",
dest="aggregation_factor",
type=int,
default="10",
help="Aggregation factor for batch jobs (this many tasks will run in a single SLURM job).",
)
slurm_args.add_argument(
"--batch-size",
dest="batch_size",
type=int,
default="5000",
help="Split run sets into batches of at most this size. Helpful in avoiding errors with script sizes.",
)
slurm_args.add_argument(
"--parallelization",
dest="concurrency_factor",
type=int,
default="4",
help="Run this many tasks at once in one job.",
)
slurm_args.add_argument(
"--overtime-factor",
dest="overtime_factor",
type=float,
default="1.1",
help="Factor which by to scale timelimits to overapproximate CPU time limit with walltime limit.",
)
slurm_args.add_argument(
"--continue-interrupted",
dest="continue_interrupted",
action="store_true",
help="Continue a previously interrupted job.",
)
slurm_args.add_argument(
"--copy-tool",
dest="copy_tool",
action="store_true",
help="Make a copy of the tool folder in the container.",
)
slurm_args.add_argument(
"--generate-only",
dest="generate_only",
action="store_true",
help="Only generate the SLURM array description, don't run it.",
)

return parser

def load_executor(self):
if self.config.slurm:
if self.config.slurm_array:
from slurm import arrayexecutor as executor
elif self.config.slurm:
logging.error(
"Single-job-based SLURM-integration is no longer supported. Use --slurm-array instead."
)
from slurm import slurmexecutor as executor
else:
logging.warning(
94 changes: 94 additions & 0 deletions contrib/slurm/README-old.md
@@ -0,0 +1,94 @@
<!--
This file is part of BenchExec, a framework for reliable benchmarking:
https://github.com/sosy-lab/benchexec

SPDX-FileCopyrightText: 2021 Dirk Beyer <https://www.sosy-lab.org>
SPDX-FileCopyrightText: 2024 Levente Bajczi
SPDX-FileCopyrightText: Critical Systems Research Group
SPDX-FileCopyrightText: Budapest University of Technology and Economics <https://www.ftsrg.mit.bme.hu>

SPDX-License-Identifier: Apache-2.0
-->
# BenchExec Extension for Benchmarking via SLURM

> [!CAUTION]
> This single-job-based SLURM integration is no longer maintained. For the documentation of the maintained, array-based version, see [README.md](./README.md).

This Python script extends BenchExec, a benchmarking framework, to facilitate benchmarking via SLURM, optionally using a Singularity container.

In case of problems, please tag in an [issue](https://github.com/sosy-lab/benchexec/issues/new/choose): [Levente Bajczi](https://github.com/leventeBajczi) (@leventeBajczi).

## Preliminaries

* [SLURM](https://slurm.schedmd.com/documentation.html) is an open-source job scheduling and workload management system used primarily in high-performance computing (HPC) environments.
* [Singularity](https://docs.sylabs.io/guides/latest/user-guide/) is a containerization platform designed for scientific and high-performance computing (HPC) workloads, providing users with a reproducible and portable environment for running applications and workflows.

## Requirements

* SLURM, tested with `slurm 22.05.7`, should work within `22.x.x`
* Singularity (optional), tested with `singularity-ce version 4.0.1`, should work within `4.x.x`

## Usage
1. Run the script with Python 3:
```
python3 $BENCHEXEC_FOLDER/contrib/slurm-benchmark.py [options]
```
Options:
- `--slurm`: Use SLURM to execute benchmarks. Will revert to regular (local) benchexec if not given.
- `--singularity <path_to_sif>`: Specify the path to the Singularity .sif file to use. See usage later.
- `--scratchdir <path>`: Specify the directory for temporary files. The script will use this parameter to create temporary directories for file storage per-run, which get discarded later. By default, this is the CWD, which might result in temporary files being generated by the thousands in the working directory. On some systems, this must be on the same mount, or even under the same hierarchy as the current directory. Must exist, be writable, and be a directory.
- `--retry-killed <N>`: Retry killed jobs (e.g., due to SLURM errors) this many times. Use -1 for unbounded retry attempts.
- `-N <N>`: Specify the factor of parallelism, i.e., how many instances to start at a time. Tested with up to `1000`, probably works with much higher values as well.
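
For example, a typical invocation might look like the following (the `.sif` path, scratch directory, and benchmark definition file are illustrative):

```
python3 $BENCHEXEC_FOLDER/contrib/slurm-benchmark.py --slurm \
    --singularity tool.sif --scratchdir /tmp/scratch \
    --retry-killed 3 -N 100 benchmark-definition.xml
```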

## Overview of the Workflow

This works similarly to BenchExec; however, instead of delegating each run to `runexec`, it delegates to `srun` from SLURM.

1. If the `--singularity` option is given, the script wraps the command to run in a container. This is useful for dependency management (in most HPC environments, arbitrary package installations are frowned upon). For a simple container, use the following:

```singularity
BootStrap: docker
From: ubuntu:22.04

%post
apt -y update
apt -y install openjdk-17-jre-headless libgomp1 libmpfr-dev fuse-overlayfs
```

Use `singularity build [--remote / --fakeroot] --fix-perms <name>.sif <name>.def` to build the container.

Notice the `fuse-overlayfs` package. That is mandatory for the overlay filesystem to work properly.

The script parameterizes `singularity exec` with the following params:
* `-B $PWD:/lower`: Bind the working directory to `/lower` (could be read-only)
* `--no-home`: Do not bind the home directory
* `-B {tempdir}:/overlay`: Bind the temporary directory to `/overlay` (must be writeable)
* `--fusemount "container:fuse-overlayfs -o lowerdir=/lower -o upperdir=/overlay/upper -o workdir=/overlay/work $HOME"`: Mount an overlay filesystem at $HOME, where modifications go to the temp dir but files can be read from the current dir

We also wrap this command inside the container using `bash -c "{command} && echo 0 > exitcode || echo $? > exitcode"` to save the exit code of the process, _and_ always have 0 as the exit code of a completed run. Otherwise, we could not differentiate between a FAILURE caused by SLURM issues (e.g., transport failures) and a command that simply failed, and retrying would not work.
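
Put together, the command submitted for a single run looks roughly like the following sketch (the temp-dir path, `.sif` name, and tool command are illustrative):

```
singularity exec \
    -B "$PWD":/lower \
    --no-home \
    -B /tmp/run-0001:/overlay \
    --fusemount "container:fuse-overlayfs -o lowerdir=/lower -o upperdir=/overlay/upper -o workdir=/overlay/work $HOME" \
    tool.sif \
    bash -c '<tool command> && echo 0 > exitcode || echo $? > exitcode'
```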

2. Currently, the following parameters are passed to `srun` (calculated from the benchmark's parameters):
* `-t <hh:mm:ss>` CPU time limit (generally, SLURM will round up to the nearest minute)
* `-c <cpus>` number of CPUs
* `--threads-per-core=1` only use one thread per core
* `--mem-per-cpu <mem/cpus>` memory allocation in MB per CPU
* `--ntasks=1` number of tasks per node
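
For example, a run limited to 15 minutes of CPU time on 2 CPUs with 4000 MB of memory would translate to roughly the following (values illustrative):

```
srun -t 00:15:00 -c 2 --threads-per-core=1 --mem-per-cpu 2000 --ntasks=1 <wrapped command>
```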

3. The script parses the resulting job ID, and after the job finishes, runs `seff` to gather resource usage data:
* Exit code
* Status
* CPU time [s]
* Wall time [s]
* Memory [MB]
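
The relevant part of a `seff` report looks roughly like this (the exact format varies between SLURM versions; values are illustrative):

```
State: COMPLETED (exit code 0)
Cores: 2
CPU Utilized: 00:12:34
Job Wall-clock time: 00:07:30
Memory Utilized: 1.20 GB
```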

## Limitations

Currently, there are the following limitations compared to local benchexec:

1. No advanced resource constraining / monitoring: only CPU time, CPU core and memory limits are handled, and only CPU time, wall time, and memory usage are monitored.
2. No exotic paths in the command are handled: only the current working directory and its children are visible in the container.
3. The user on the host and the container should not differ (due to using $HOME in the commands).
4. Without singularity, no constraint is placed on the resulting files of the runs: this will populate the current directory with all the output files of all the runs.
5. For timed-out runs, where SLURM terminated the run, no CPU time values are available.
6. The executor only works with hyperthreading disabled, because nodes cannot be queried for their number of threads per core. Assuming it is always 2 is risky, as it may not hold universally. Since SLURM lets us request only whole cores rather than threads, we would have to divide the requested number of threads by the threads-per-core value, which is unknown when hyperthreading may be enabled.
7. Cancelling a benchmark run (by sending SIGINT) could be delayed up to a few minutes depending on the SLURM configuration.
72 changes: 35 additions & 37 deletions contrib/slurm/README.md
@@ -11,7 +11,10 @@ SPDX-License-Identifier: Apache-2.0
-->
# BenchExec Extension for Benchmarking via SLURM

This Python script extends BenchExec, a benchmarking framework, to facilitate benchmarking via SLURM, optionally using a Singularity container.
> [!IMPORTANT]
> The previous, single-job-based SLURM integration is no longer maintained. For its documentation, see [README-old.md](./README-old.md).

This Python script extends BenchExec, a benchmarking framework, to facilitate benchmarking via SLURM array jobs using Singularity containers.

In case of problems, please tag in an [issue](https://github.com/sosy-lab/benchexec/issues/new/choose): [Levente Bajczi](https://github.com/leventeBajczi) (@leventeBajczi).

@@ -23,69 +26,64 @@ In case of problems, please tag in an [issue](https://github.com/sosy-lab/benche
## Requirements

* SLURM, tested with `slurm 22.05.7`, should work within `22.x.x`
* Singularity (optional), tested with `singularity-ce version 4.0.1`, should work within `4.x.x`
* Singularity, tested with `singularity-ce version 4.0.1`, should work within `4.x.x`
* cgroup support is required

## Usage
1. Run the script with Python 3:
```
python3 $BENCHEXEC_FOLDER/contrib/slurm-benchmark.py [options]
```
Options:
- `--slurm`: Use SLURM to execute benchmarks. Will revert to regular (local) benchexec if not given.
- `--slurm-array`: Use SLURM array jobs to execute benchmarks. Will revert to regular (local) benchexec if not given.
- `--singularity <path_to_sif>`: Specify the path to the Singularity .sif file to use. See usage later.
- `--scratchdir <path>`: Specify the directory for temporary files. The script will use this parameter to create temporary directories for file storage per-run, which get discarded later. By default, this is the CWD, which might result in temporary files being generated by the thousands in the working directory. On some systems, this must be on the same mount, or even under the same hierarchy as the current directory. Must exist, be writable, and be a directory.
- `--retry-killed <N>`: Retry killed jobs (e.g., due to SLURM errors) this many times. Use -1 for unbounded retry attempts.
- `-N <N>`: Specify the factor of parallelism, i.e., how many instances to start at a time. Tested with up to `1000`, probably works with much higher values as well.
- `-N <N>`: Specify the factor of parallelism, i.e., how many jobs to submit at a time. Tested with up to `1000`, probably works with much higher values as well.
- `--aggregation-factor`: Put this many runs into a single job of the array.
- `--batch-size`: Submit at most this many runs of a run collection at a time. Lower values might hurt responsiveness; higher values might cause problems with script sizes. A good size is around a few thousand.
- `--parallelization`: Execute this many runs in parallel inside a single job of the array. A full example invocation is shown below.
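
For example (paths, values, and the benchmark definition file are illustrative):

```
python3 $BENCHEXEC_FOLDER/contrib/slurm-benchmark.py --slurm-array \
    --singularity tool.sif --scratchdir /scratch/$USER \
    --aggregation-factor 10 --parallelization 4 --batch-size 5000 \
    --retry-killed 2 benchmark-definition.xml
```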

## Overview of the Workflow

This works similarly to BenchExec, however, instead of delegating each run to `runexec`, it delegates to `srun` from SLURM.
This works similarly to BenchExec; however, instead of delegating each run directly to `runexec`, it creates a hierarchy of run infos and an array job description for SLURM, which is then executed using `sbatch`. `runexec` is still used to measure and limit resources.

1. If the `--singularity` option is given, the script wraps the command to run in a container. This is useful for dependency management (in most HPC environments, arbitrary package installations are frowned upon). For a simple container, use the following:
1. The script wraps the command to run in a container. This is useful for dependency management (in most HPC environments, arbitrary package installations are frowned upon). For a simple container, use the following:

```singularity
BootStrap: docker
From: ubuntu:22.04

%post
apt -y update
apt -y install openjdk-17-jre-headless libgomp1 libmpfr-dev fuse-overlayfs
BootStrap: docker
From: ubuntu:24.04

%post
apt -y update
apt -y install <necessary packages for the tools>
apt -y install software-properties-common
add-apt-repository ppa:sosy-lab/benchmarking
apt -y install benchexec fuse-overlayfs
mkdir /work
mkdir /upper
```

Use `singularity build [--remote / --fakeroot] --fix-perms <name>.sif <name>.def` to build the container.
Use `singularity build [--remote / --fakeroot] --fix-perms <name>.sif <name>.def` to build the container. A remote service (e.g., [sylabs](https://cloud.sylabs.io/builder)) may be used if root permissions are missing.

Notice the `fuse-overlayfs` package. That is mandatory for the overlay filesystem to work properly.
Notice the `fuse-overlayfs` and `benchexec` packages. These are mandatory for the overlay filesystem to work properly and for `runexec` to exist in the container.

The script parameterizes `singularity exec` with the following params:
* `-B $PWD:/lower`: Bind the working directory to `/lower` (could be read-only)
* `-B "/sys/fs/cgroup:/sys/fs/cgroup"`: Bind the cgroup hierarchy for use inside the container
* `-B {basedir}`: Bind the "base directory" (directory of the .sif file) (can be read-only)
* `-B {workdir}:/lower`: Bind the current directory to `/lower` (can be read-only)
* `--no-home`: Do not bind the home directory
* `-B {tempdir}:/overlay`: Bind the temporary directory to `/overlay` (must be writeable)
* `--fusemount "container:fuse-overlayfs -o lowerdir=/lower -o upperdir=/overlay/upper -o workdir=/overlay/work $HOME"`: mount an overlay filesystem at $HOME, where modifications go in the temp dir but files can be read from the current dir

We also wrap this command inside the container using `bash -c "{command} && echo 0 > exitcode || echo $? > exitcode` to save the exitcode of the process, _and_ always have 0 as the exitcode of a completed run. Otherwise, we cannot differentiate between a FAILURE happening due to SLURM-issues (e.g., transport failures), or a simply failing command. Otherwise, retrying would not work.
* `--fusemount "container:fuse-overlayfs -o lowerdir=/lower -o upperdir=/overlay/upper -o workdir=/overlay/work {workdir}"`: Mount an overlay filesystem at `{workdir}` under `{basedir}`, where modifications go to the temp dir
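
Assembled, the `singularity exec` invocation looks roughly like the following sketch (the brace-enclosed placeholders follow the list above; the `.sif` name and inner command are illustrative):

```
singularity exec \
    -B /sys/fs/cgroup:/sys/fs/cgroup \
    -B {basedir} \
    -B {workdir}:/lower \
    --no-home \
    -B {tempdir}:/overlay \
    --fusemount "container:fuse-overlayfs -o lowerdir=/lower -o upperdir=/overlay/upper -o workdir=/overlay/work {workdir}" \
    {basedir}/tool.sif <runexec command>
```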

2. Currently, the following parameters are passed to `srun` (calculated from the benchmark's parameters):
* `-t <hh:mm:ss>` CPU timelimit (generally, SLURM will round up to nearest minute)
* `-c <cpus>` number of cpus
* `--threads-per-core=1` only use one thread per core
* `--mem-per-cpu <mem/cpus>` memory allocaiton in MBs per cpu
* `--ntasks=1` number of tasks per node
2. A `--batch-size`-sized portion of the runs is organized into bins of size `--aggregation-factor`. Each bin corresponds to one job in the array. Inside each bin, up to `--parallelization` `runexec` instances can be started with exact resource allocations and usage reporting. Output files and the output log are stored inside the temp dir. If an error is encountered (most commonly because `fuse` locks up and causes a TIMEOUT without any logs being ready), the run is put into a second-chance queue to be run again, at most `--retry-killed` times. A minimal sketch of the binning logic is shown below.
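
A minimal sketch of the binning described above (illustrative names, not the actual implementation):

```python
import math


def make_bins(runs, aggregation_factor):
    """Group runs into bins; each bin becomes one job of the SLURM array."""
    return [
        runs[i : i + aggregation_factor]
        for i in range(0, len(runs), aggregation_factor)
    ]


# A batch of 5000 runs with --aggregation-factor 10 yields a 500-job array;
# inside each array job, up to --parallelization runexec instances run concurrently.
bins = make_bins(list(range(5000)), 10)
assert len(bins) == math.ceil(5000 / 10)
```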

3. The script parses the resulting job ID, and after the job finishes, runs `seff` to gather resource usage data:
* Exit code
* Status
* CPU time [s]
* Wall time [s]
* Memory [MB]
3. The script parses the resource usage and status of each run, as it would with regular `runexec`.

## Limitations

Currently, there are the following limitations compared to local benchexec:

1. No advanced resource constraining / monitoring: only CPU time, CPU core and memory limits are handled, and only CPU time, wall time, and memory usage are monitored.
2. No exotic paths in the command are handled: only the current working directory and its children are visible in the container
3. The user on the host and the container should not differ (due to using $HOME in the commands).
4. Without singularity, no constraint is placed on the resulting files of the runs: this will populate the current directory with all the output files of all the runs.
5. For timed-out runs, where SLURM terminated the run, no CPU time values are available.
6. The executor only works with hyperthreading disabled, due to the inability to query nodes about the number of threads per core. Assuming it's always 2 is risky, as it may not hold true universally. Consequently, because we can only request whole cores from SLURM instead of threads, we must divide the requested number of threads by the threads-per-core value, which is unknown if hyperthreading could be enabled.
7. Cancelling a benchmark run (by sending SIGINT) could be delayed up to a few minutes depending on the SLURM configuration.
1. No exotic paths in the command are handled: only the directory of the `.sif` file and its children are visible in the container.
1. The executor only works with hyperthreading disabled, because nodes cannot be queried for their number of threads per core. Assuming it is always 2 is risky, as it may not hold universally. Since SLURM lets us request only whole cores rather than threads, we would have to divide the requested number of threads by the threads-per-core value, which is unknown when hyperthreading may be enabled.
1. `fuse` sometimes locks up (more precisely: is stuck in an uninterruptible state) for the entire duration of a job. My guess is that the underlying Lustre file system does not like it when the same path is overlaid from hundreds of nodes at the same time. As a mitigation, we re-run timed-out jobs (not runs!).