geopmlaunch(1) – application launch wrapper
Synopsis
geopmlaunch launcher [launcher_opt] [geopm_opt] -- executable [executable_opt]
Description
The geopmlaunch
application enables execution of the GEOPM runtime
along with a distributed parallel compute application, executable,
using the command line interface for the underlying application
scheduler, launcher, deployed on the HPC system. The geopmlaunch
command line interface is designed to support many application
schedulers including SLURM srun
, ALPS aprun
, and Intel / MPICH
mpiexec
. The geopmlaunch
command line interface has been designed
to wrap the underlying application scheduler while reinterpreting the
command line options, launcher_opt, specified for it. In this way,
the user can easily modify an existing job launch command to enable
the GEOPM runtime by prefixing the command with geopmlaunch
and
extending the existing scheduler options with the options for the
GEOPM runtime, geopm_opt. The GEOPM runtime options control the
behavior of the GEOPM runtime and are detailed in the GEOPM Options section below.
All command line options accepted by the underlying job scheduler
(e.g. srun
or aprun
) can be passed to the geopmlaunch
wrapper,
except for CPU affinity related options. The wrapper
script reinterprets the command line to pass modified options and set
environment variables for the underlying application scheduler. The
GEOPM options, geopm_opt, are translated into environment variables
to be interpreted by the GEOPM runtime (see the ENVIRONMENT section of
geopm(7)). The reinterpreted command line, including
the
environment modification, is printed to standard output by the script
before execution. The command that is printed can be executed in the
bash
shell to replicate the execution without using geopmlaunch
directly. The command is modified to support the GEOPM control thread
by setting CPU affinity for each process and increasing the number of
processes per node or CPUs per process.
Note
The primary compute executable and its options,
executable_opt, must appear at the end of the command line and be
preceded by two dashes: --
. The GEOPM launcher will not parse
arguments to the right of the first --
sequence and will pass the
arguments that follow unaltered while removing the first --
from the
command line. Refer to the EXAMPLES section below.
Supported Launchers
The launcher is selected by specifying the launcher as the first command line parameter. Available launcher values are listed below.
srun, SrunLauncher: Wrapper for the SLURM resource manager’s
srun
job launcher. The-cc
and--cpu-bind
options are reserved for use by GEOPM; do not specify CPU-binding options when usinggeopmlaunch
.aprun, AlpsLauncher: Wrapper for the Cray ALPS
aprun
job launcher. The-cc
and--cpu-binding
options are reserved for use by GEOPM; do not specify these when usinggeopmlaunch
.impi, IMPIExecLauncher: Wrapper for the Intel MPI
mpiexec
job launcher. TheKMP_AFFINITY
I_MPI_PIN_DOMAIN
, andMV2_ENABLE_AFFINITY
environment variables are reserved for use by GEOPM and are overwritten when usinggeopmlaunch
.ompi, OMPIExecLauncher: Wrapper for the Open MPI
mpiexec
job launcher. The-rf
and--rank-file
as well as explicitly specifying number of processes with-H
and--host
options are reserved for use by GEOPM; do not specify when usinggeopmlaunch
.SrunTOSSLauncher: Wrapper for
srun
when used with the Trilab Operating System Software stack. This special launcher was developed to support special affinity plugins for SLURM that were deployed at LLNL’s computing center. The-cc
and--cpu-bind
options are reserved for use by GEOPM; do not specify when usinggeopmlaunch
.pals, PALSLauncher: Wrapper for Cray-HPE PALS launcher, leveraging
mpiexec
. The--cpu-bind
option is reserved for use by GEOPM; do not specify CPU-binding options when usinggeopmlaunch
.
GEOPM Options
- --geopm-report path
Specifies the path to the GEOPM report output file that is generated at the conclusion of the run if this option is provided. If the option is not provided, a report named
geopm.report
will be created. The GEOPM report contains a summary of the profiling information collected by GEOPM throughout the application execution. Refer to geopm_report(7) for a description of the information contained in the report.This option is used by the launcher to set the
GEOPM_REPORT
environment variable. The command line option will override any value currently set in the environment. See the ENVIRONMENT section of geopm(7).- --geopm-report-signals signals
Used to insert additional measurements into the report beyond the default values. See the Notes On Sampling section of geopm_report(7) for more information about how signal samples are summarized over time and across sampling domains in a report.
The value of signals must be formatted as a comma-separated list of valid signal names. The available signals and their descriptions are documented in the geopm_pio(7) man page.
By default, the signals in the report are aggregated to the board domain. A domain other than board can be specified by appending the signal name with an
'@'
character and then specifying one of the domains. For example, the following will extend the region and application totals sections of the report with package energy for each package and DRAM energy summed over the all DIMMs:--geopm-report-signals=CPU_ENERGY@package,DRAM_ENERGY
The geopmread(1) executable enables discovery of signals and domains available on your system. The signal names and domain names given for this parameter are specified as in the geopmread(1) command line interface.
- --geopm-trace path
The base name and path of the trace file(s) generated if this option is specified. One trace file is generated for each compute node used by the application containing a pipe-delimited ASCII table describing a time series of measurements made by the GEOPM runtime. The path is extended with the host name of the node for each created file. The trace files will be written to the file system path specified or current directory if only a file name is given. This feature is primarily a debugging tool, and may not scale to large node counts due to file system issues. This option is used by the launcher to set the
GEOPM_TRACE
environment variable. The command line option will override any value currently set in the environment. See the ENVIRONMENT section of geopm(7).- --geopm-trace-signals signals
Used to insert additional columns into the trace beyond the default columns and the columns added by the Agent. This option has no effect unless tracing is enabled with
--geopm-trace
. The value must be formatted as a comma-separated list of valid signal names. When not specified all custom signals added to the trace will be sampled and aggregated for the entire node unless the domain is specified by appending"@domain_type"
to the signal name. For example, the following will add total DRAM energy and power as columns in the trace:--geopm-trace-signals=DRAM_ENERGY,DRAM_POWER
The signals available and their descriptions are documented in the PlatformIO(3) man page.
TIME
,EPOCH_COUNT
,REGION_HASH
,REGION_HINT
,REGION_PROGRESS
,CPU_ENERGY
,DRAM_ENERGY
,CPU_POWER
,DRAM_POWER
,CPU_FREQUENCY_STATUS
,CPU_CYCLES_THREAD
,CPU_CYCLES_REFERENCE
,CPU_CORE_TEMPERATURE
are included in the trace by default. A domain other than board can be specified by appending the signal name with an'@'
character and then specifying one of the domains, e.g:--geopm-trace-signals=CPU_POWER@package,CPU_ENERGY@package
will trace the package power and energy for each package on the system. The geopmread(1) executable enables discovery of signals and domains available on your system. The signal names and domain names given for this parameter are specified as in the geopmread(1) command line interface. This option is used by the launcher to set the
GEOPM_TRACE_SIGNALS
environment variable. The command line option will override any value currently set in the environment. See the ENVIRONMENT section of geopm(7).- --geopm-trace-profile
The base name and path of the profile trace file(s) generated if this option is specified. One trace file is generated for each compute node used by the application containing a pipe-delimited ASCII table describing a log of each call to the
geopm_prof_*()
APIs. The path is extended with the host name of the node for each created file. The profile trace files will be written to the file system path specified or current directory if only a file name is given. This feature is primarily a debugging tool, and may not scale to large node counts due to file system issues. This option is used by the launcher to set theGEOPM_TRACE_PROFILE
environment variable. The command line option will override any value currently set in the environment. See the ENVIRONMENT section of geopm(7).- --geopm-trace-endpoint-policy path
The path to the endpoint policy trace file generated if this option is specified. This file tracks only policies sent through the endpoint at the root controller, not all policies within the controller tree. If
--geopm-endpoint
is not provided, or if the agent does not have any policy values, this file will not be created. This option is used by the launcher to set theGEOPM_TRACE_ENDPOINT_POLICY`
environment variable. The command line option will override any value currently set in the environment. See the ENVIRONMENT section of geopm(7).- --geopm-profile name
The name of the profile which is printed in the report and trace files. This name can be used to index the data in post-processing. For example, when running a sweep of experiments with multiple power caps, the profile could contain the power setting for one run. The default profile name is “default”. This option is used by the launcher to set the
GEOPM_PROFILE
environment variable. The command line option will override any value currently set in the environment. See the ENVIRONMENT section of geopm(7).- --geopm-ctl CONTROL_MODE
Use the GEOPM runtime and launch GEOPM with one of three
CONTROL_MODE
s: application, process, or pthread.When used with
srun
, the application method of launch must be called inside an existing allocation made withsalloc
orsbatch
and the command must request all the compute nodes assigned to the allocation. This method is the default if the--geopm-ctl
option is not provided.When invoked with non-MPI applications, the process and pthread methods will silently fail to launch geopm. Only the application method will launch geopm with non-MPI applications.
The process method allocates one extra MPI process per node for the GEOPM controller. The process method can be used in the widest variety of cases, but some systems require that each MPI process be assigned the same number of CPUs which may waste resources by assigning more than one CPU to the GEOPM controller process.
The pthread method spawns a thread from one MPI process per node to run the GEOPM controller. The application method launches the geopmctl(1) application in the background which connects to the primary compute application. The pthread option requires support for
MPI_THREAD_MULTIPLE
, which is not enabled at many sites.The
--geopm-ctl
option is used by the launcher to set theGEOPM_CTL
environment variable. The command line option will override any value currently set in the environment. See the ENVIRONMENT section of geopm(7).- --geopm-agent agent
Specify the Agent type. The Agent defines the control algorithm used by the GEOPM runtime. Available agents are:
"monitor"
(default, enables profiling features only),"power_balancer"
(optimizes runtime under a power cap),"power_governor"
(enforces a uniform power cap), and"frequency_map"
(runs each region at a specified frequency). See geopm_agent_monitor(7), geopm_agent_power_balancer(7), geopm_agent_power_governor(7), geopm_agent_frequency_map(7) and geopm_agent_ffnet(7) for descriptions of each agent.For more details on the responsibilities of an agent, see geopm::Agent(3).
This option is used by the launcher to set the
GEOPM_AGENT
environment variable. The command line option will override any value currently set in the environment. See the ENVIRONMENT section of geopm(7).- --geopm-policy policy
GEOPM policy JSON file used to configure the Agent plugin. If the policy is provided through this file, it will only be read once and cannot be changed dynamically. In this mode, samples will not be provided to the resource manager. See geopmagent(1) and geopm_agent(3) for more information about how to create this input file.
This option is used by the launcher to set the
GEOPM_POLICY
environment variable. The command line option will override any value currently set in the environment. See the ENVIRONMENT section of geopm(7).- --geopm-init-control path
Initialize any available controls with the values in the file specified by path. The file format is the same as that for
geopmwrite
with one control specified per line. The comment character is ‘#’. It may be placed anywhere on a line to stop parsing of that line. If the comment character results in invalid syntax an error will be raised.Files that contain only comments are valid. They will be parsed but no controls will be written. A file that is truly empty (i.e. contains only ‘0’) will raise an error.
Example file contents:
CPU_POWER_LIMIT_CONTROL board 0 200 # Set a 200W power limit # Also set the time limit CPU_POWER_TIME_WINDOW_CONTROL board 0 0.015
This option is used by the launcher to set the
GEOPM_INIT_CONTROL
environment variable. The command line option will override any value currently set in the environment. See the ENVIRONMENT section of geopm(7).- --geopm-affinity-enable
GEOPM will choose CPU affinity settings to minimize interference between the GEOPM Runtime, the OS, and the application. When specified, the launcher will emit command line arguments and/or environment variables related to affinity settings for the underlying launcher. The user should refrain from using of command line options or environment variable that are known to modify application CPU affinity when specifying this option for
geopmlaunch
.- --geopm-endpoint endpoint
Prefix for shared memory keys used by the endpoint. The endpoint receives policies dynamically from the resource manager. Refer to geopm_endpoint(3) for more detail.
If this option is provided, the GEOPM controller will also send samples to the endpoint at runtime, depending on the Agent selected. This option overrides the use of
--geopm-policy
to receive policy values. This option is used by the launcher to set theGEOPM_ENDPOINT
environment variable. The command line option will override any value currently set in the environment. See the ENVIRONMENT section of geopm(7).- --geopm-timeout sec
Time in seconds that the application should wait for the GEOPM controller to connect over shared memory. The default value is 30 seconds. This option is used by the launcher to set the
GEOPM_TIMEOUT
environment variable. The command line option will override any value currently set in the environment. See the ENVIRONMENT section of geopm(7).- --geopm-plugin-path path
The search path for GEOPM plugins. It is a colon-separated list of directories used by GEOPM to search for shared objects which contain GEOPM plugins. In order to be available to the GEOPM runtime, plugins should register themselves with the appropriate factory. See geopm::PluginFactory(3) for information about the GEOPM plugin interface.
A zero-length directory name indicates the current working directory; this can be specified by a leading or trailing colon, or two adjacent colons. The default search location is always loaded first and is determined at library configuration time and by way of the
'pkglib'
variable (typically/usr/lib64/geopm/
). This option is used by the launcher to set theGEOPM_PLUGIN_PATH
environment variable. The command line option will override any value currently set in the environment. See the ENVIRONMENT section of geopm(7).- --geopm-record-filter filter
Applies the user specified filter to the application record data feed. The filters currently supported are
"proxy_epoch"
and"edit_distance"
. These filters can be used to infer the application outer loop (epoch) without modifying the application by inserting calls togeopm_prof_epoch()
(see geopm_prof(3)). Region entry and exit may be captured automatically through runtimes such as MPI and OpenMP.The
"proxy_epoch"
filter looks for entries into a specific region that serves as a proxy for epoch events. The filter is specified as a comma-separated list. The first value selects the filter by name:"proxy_epoch"
. The second value in the comma-separated list specifies a region that will be used as a proxy for calls togeopm_prof_epoch()
. If the value can be interpreted as an integer, it will be used as the numerical region hash of the region name, otherwise, the value is interpreted as the region name. The third value that can be provided in the comma-separated list is optional. If provided, it specifies the number of region entries into the proxy region that are expected per outer loop. By default, this is assumed to be 1. The fourth optional parameter that can be specified in the comma-separated list is the number of region entries into the proxy region that are expected prior to the outer loop beginning. By default, this is assumed to be 0. In the following example, theMPI_Barrier
region entry is used as a proxy for the epoch event:--geopm-record-filter=proxy_epoch,MPI_Barrier
In the next example the
MPI_Barrier
region is specified as a hash and the calls per outer loop is given as 6:--geopm-record-filter=proxy_epoch,0x7b561f45,6
In the last example the calls prior to startup is specified as 10:
--geopm-record-filter=proxy_epoch,MPI_Barrier,6,10
Note: you must specify the calls per outer loop in order to specify the calls prior to startup.
The
"edit_distance"
filter will attempt to infer the epoch based on patterns in the region entry events using an edit distance algorithm. The filter is specified as string beginning with the name"edit_distance"
; if optional parameters are specified, they are provided as a comma-separated list following the name. The first parameter is the buffer size; the default if not provided is 100. The second parameter is the minimum stable period length in number of records. The third parameter is the stable period hysteresis factor. The fourth parameter is the unstable period hysteresis factor. In the following example, the"edit_distance"
filter will be used with all optional parameters provided:--geopm-record-filter=edit_distance,200,8,2.0,3.0
- --geopm-debug-attach rank
Enables a serial debugger such as
gdb
to attach to a job when the GEOPM PMPI wrappers are enabled. If set to a numerical value, the associated rank will wait inMPI_Init()
until a debugger is attached and the local variable"cont"
is set to a non-zero value. If set, but not to a numerical value then all ranks will wait. The runtime will print a message explaining the hostname and process ID that the debugger should attach to. This option is used by the launcher to set theGEOPM_DEBUG_ATTACH
environment variable. The command line option will override any value currently set in the environment. See the ENVIRONMENT section of geopm(7).- --geopm-preload
Use LD_PRELOAD to link libgeopm.so at runtime. This can be used to enable the GEOPM runtime when an application has not been compiled against libgeopm.so.
- --geopm-hyperthreads-disable
Prevent the launcher from trying to use hyperthreads for pinning purposes when attempting to satisfy the MPI ranks / OMP threads configuration specified. This is done for both the controller and the application. An error is raised if the launcher cannot satisfy the current request without hyperthreads.
- --geopm-ctl-disable
Used to allow passing the command through to the underlying launcher. By default,
geopmlaunch
will launch the GEOPM runtime in process mode. When this option is specified, the GEOPM runtime will not be launched.- --geopm-ctl-local
Disable all communication between controllers running on different compute nodes. This will result in one report file per host, and each report file will have the
hostname
appended to the file path requested by the user. Note this option has no effect if--geopm-ctl
or--geopm-ctl-disable
are provided. This is especially useful when launching non-MPI applications.- --geopm-ompt-enable
Enable OMPT detection of OpenMP regions. See the INTEGRATION WITH OMPT section of geopm(7) for more information about OpenMP region detection.
- --geopm-period
Override the control loop period specified by the Agent. All agents have a default control loop period, and this command line option allows users to set a different value in units of seconds. With longer control loop periods, the overhead for using GEOPM will be reduced, but more interpolation will be required when aligning the sparsely sampled hardware signals with the application feedback. Additionally, agent reaction time is reduced with longer control loop periods.
- --geopm-program-filter
Only enable profiling for processes where their
program_invocation_name
orprogram_invocation_name_short_name
matches one of the names in the comma separated list provided by option. This is especially useful when launching a bash script to avoid profiling bash or other ancillary commands that are not part of the main application process set. See program_invocation_name(3) for more details.
Examples
Use geopmlaunch
to queue a job using geopmbench
on a SLURM managed system
requesting two nodes using 32 application MPI process each with four threads:
geopmlaunch srun -N 2 -n 32 -c 4 \
--geopm-ctl=process \
--geopm-report=tutorial6.report \
-- ./geopmbench tutorial6_config.json
Use geopmlaunch
to launch the miniFE
executable with the same configuration,
but on an ALPS managed system:
geopmlaunch aprun -N 2 -n 64 --cpus-per-pe 4 \
--geopm-ctl process \
--geopm-report miniFE.report \
-- ./miniFE.x -nx 256 -ny 256 -nz 256
Environment
Every command line option to the launcher can also be specified as an
environment variable if desired (except for --geopm-ctl
).
For example, instead of specifying --geopm-trace=geopm.trace
one can
instead set in the environment GEOPM_TRACE=geopm.trace
prior to
invoking the launcher script. The environment variables are named the
same as the command line option but have the hyphens replaced with
underscores, and are all uppercase. The command line options take
precedence over the environment variables.
The usage of --geopm-ctl
here is slightly different from how the
controller handles the GEOPM_CTL
environment variable. In the
case of the launcher, one can specify process, pthread, or
application to the command line argument. In the case of
GEOPM_CTL
one can ONLY specify process
or pthread
, as
launching the controller as a separate application is handled through
the geopmctl
binary.
The interpretation of the environment is affected if either of the GEOPM configuration files exist:
/etc/geopm/environment-default.json
/etc/geopm/environment-override.json
These files may specify system default and override settings for all
of the GEOPM environment variables. The environment-default.json
file contains a JSON object mapping GEOPM environment variable strings
to strings that define default values for any unspecified GEOPM
environment variable or unspecified geopmlaunch
command line
options. The environment-override.json
contains a JSON object that
defines values for GEOPM environment variables that take precedence
over any settings provided by the user either through the environment
or through the geopmlaunch
command line options. The order of
precedence for each GEOPM variable is: override configuration file,
geopmlaunch
command line option, environment setting, the default
configuration file, and finally there are some preset default values
that are coded into GEOPM which have the lowest precedence.
The KMP_WARNINGS
environment variable is set to 'FALSE'
, thus
disabling the Intel OpenMP warnings. This avoids warnings emitted
because the launcher configures the OMP_PROC_BIND
environment
variable to support applications compiled with a non-Intel
implementation of OpenMP.
See Also
geopm(7), geopmpy(7), geopm_agent_monitor(7), geopm_agent_power_balancer(7), geopm_agent_power_governor(7), geopm_report(7), geopm_error(3), geopmctl(1)