Basic concepts about SLURM
The Slurm scheduler provides three key functions:
it allocates access to resources (compute nodes) to users for some duration of time so they can perform work.
it provides a framework for starting, executing, and monitoring work (typically a parallel job such as MPI) on a set of allocated nodes.
it arbitrates contention for resources by managing a queue of pending jobs.
A job consists of two parts: resource requests and job steps.
Resource requests describe the amount of computing resources (CPUs, memory, expected run time, etc.) that the job will need to run successfully.
Job steps describe the individual tasks that must be executed within a job. Most often a single job needs to execute several individual computations to be completed; each partial execution within a job is called a job step. You can execute a job step with the SLURM command srun.
A job consists of one or more steps, each consisting of one or more tasks, each using one or more CPUs.
Jobs are typically created with the sbatch command and steps with the srun command. Tasks are requested at the job level with --ntasks or --ntasks-per-node, or at the step level with --ntasks, and CPUs are requested per task with --cpus-per-task. Note that jobs submitted with sbatch have one implicit step: the Bash script itself.
The typical way of creating a job is to write a job submission script. A submission script is a shell script (e.g. a Bash script) whose first comments, if prefixed with #SBATCH, are interpreted by Slurm as parameters describing resource requests and submission options.
Figure 1 is an example of a job submission script. In this example we request a total of 6 CPUs and 12 GB of RAM that the job steps share. You can define any number of job steps, but each step is limited to the resources allocated to the job (often the input of one step is the output of another).
In the example, job step 1 is parallelized into 3 tasks, each using 2 CPUs (this requires code written with MPI or a similar programming paradigm). Job step 2 executes a serial code that uses only 1 core; it is useless to request more than 1 task and 1 CPU for this step because the code is not parallelized (its input data is the output of job step 1). Finally, job step 3 executes a single task that can run on 6 cores in parallel, which requires code written with OpenMP or a similar programming paradigm.

Figure 1. Example of a job with 3 job steps and 3 tasks per job step.
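A job like the one in Figure 1 could be sketched as the submission script below. This is only an illustration of the step/task/CPU hierarchy; the program names mpi_program, serial_program and omp_program are placeholders for your own executables:

```shell
#!/bin/bash
#SBATCH --ntasks=3          # up to 3 tasks per job step
#SBATCH --cpus-per-task=2   # 2 CPUs per task (6 CPUs in total)
#SBATCH --mem=12G           # 12 GB of RAM shared by the job steps

# Job step 1: MPI code, 3 tasks with 2 CPUs each
srun --ntasks=3 --cpus-per-task=2 mpi_program

# Job step 2: serial code, a single task on a single CPU
srun --ntasks=1 --cpus-per-task=1 serial_program

# Job step 3: OpenMP code, one task using all 6 CPUs
export OMP_NUM_THREADS=6
srun --ntasks=1 --cpus-per-task=6 omp_program
```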
Memory and CPU requests
A large number of users request far more memory and CPUs than their jobs use.
While it is important to request more memory than will be used (10-20% more is usually sufficient), requesting 100x, or even 10,000x, more memory only reduces the number of jobs a user can run, as well as overall throughput on the cluster. Many users will be able to run far more jobs if they request more reasonable amounts of memory.
If your job cannot be parallelized, use only 1 CPU. First read the user guide of your application to find out whether your code can be parallelized and with which parameters, then submit it specifying the correct number of CPUs.
When a job finishes without having used the resources it requested, we send the user an email showing how much memory and CPU the job actually used; this information can be used to adjust requests for future jobs. The SLURM directives for memory requests are --mem and --mem-per-cpu. It is in the user’s best interest to adjust the memory request to a more realistic value.
Requesting more memory than needed will not speed up analyses. Based on their experience of finding that their personal computers run faster when adding more memory, users often believe that requesting more memory will make their analyses run faster. This is not the case. An application running on the cluster will have access to all of the memory it requests, and we never swap RAM to disk. If an application can use more memory, it will get more memory. SLURM only kills a job when it crosses the limit set by its memory request.
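As an illustration of adjusting the request, suppose the usage email reports that a past run peaked at about 5 GB (a hypothetical figure). A reasonable request for the next run leaves only a modest safety margin:

```shell
#SBATCH --mem=6G            # roughly 20% above the 5 GB actually used
# or, equivalently, per allocated CPU:
#SBATCH --mem-per-cpu=6G    # when the job uses a single CPU
```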
Slurm commands
Below we briefly explain some of the most commonly used SLURM commands; you can review the complete information on the SLURM site.
Monitoring jobs: squeue
The squeue command is a tool we use to pull up information about the jobs in the queue. You can use the extended command squeue_ to retrieve statistics about the efficiency of your job. By default, the squeue command will print out the job ID, QoS, username, job status, number of nodes, and name of nodes for all jobs queued or running within Slurm. Usually you won’t need information for all jobs queued in the system, so you can restrict the output to your own jobs with the --user flag:
$ squeue --user=USERNAME
We can output non-abbreviated information with the --long flag. This flag will print out the non-abbreviated default information with the addition of a time limit field:
$ squeue --user=USERNAME --long
The squeue command also provides users with a means to calculate a job’s estimated start time by adding the --start flag to our command. This will append Slurm’s estimated start time for each job in our output information.
Note: The start time provided by this command can be inaccurate. This is because the time calculated is based on jobs queued or running in the system. If a job with a higher priority is queued after the command is run, your job may be delayed.
$ squeue --user=USERNAME --start
When checking the status of a job, you may want to repeatedly call the squeue command to check for updates. We can accomplish this by adding the --iterate flag to our squeue command. This will run squeue every n seconds, allowing for a frequent, continuous update of queue information without needing to repeatedly call squeue:
$ squeue --user=USERNAME --start --iterate=n_seconds
Press ctrl-c to stop the command from looping and bring you back to the terminal.
more information about squeue
The squeue command details a variety of information on an active job’s status with state and reason codes. Job state codes describe a job’s current state in queue (e.g. pending, completed). Job reason codes describe the reason why the job is in its current state.
The following tables outline a variety of job state and reason codes you may encounter when using squeue to check on your jobs.
Job State Codes:
| Job State | Code | Explanation |
|---|---|---|
| COMPLETED | CD | The job has completed successfully. |
| COMPLETING | CG | The job is finishing but some processes are still active. |
| FAILED | F | The job terminated with a non-zero exit code and failed to execute. |
| PENDING | PD | The job is waiting for resource allocation. It will eventually run. |
| PREEMPTED | PR | The job was terminated because of preemption by another job. |
| RUNNING | R | The job is currently allocated to a node and running. |
| SUSPENDED | S | A running job has been stopped with its cores released to other jobs. |
| STOPPED | ST | A running job has been stopped with its cores retained. |
A full list of these Job State codes can be found in Slurm’s documentation.
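squeue can also filter by these state codes with the -t/--states flag, for example to list only your pending jobs:

```shell
$ squeue --user=USERNAME --states=PENDING
```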
Job Reason Codes:
| Reason Code | Explanation |
|---|---|
| Priority | One or more higher priority jobs are in the queue. Your job will eventually run. |
| Dependency | This job is waiting for a dependent job to complete and will run afterwards. |
| Resources | The job is waiting for resources to become available and will eventually run. |
| InvalidAccount | The job’s account is invalid. Cancel the job and rerun with a correct account. |
| InvalidQOS | The job’s QoS is invalid. Cancel the job and rerun with a correct QoS. |
| QOSGrpCpuLimit | All CPUs assigned to your job’s specified QoS are in use; the job will run eventually. |
| QOSGrpMaxJobsLimit | The maximum number of jobs for your job’s QoS has been met; the job will run eventually. |
| QOSGrpNodeLimit | All nodes assigned to your job’s specified QoS are in use; the job will run eventually. |
| PartitionCpuLimit | All CPUs assigned to your job’s specified partition are in use; the job will run eventually. |
| PartitionMaxJobsLimit | The maximum number of jobs for your job’s partition has been met; the job will run eventually. |
| PartitionNodeLimit | All nodes assigned to your job’s specified partition are in use; the job will run eventually. |
| AssociationCpuLimit | All CPUs assigned to your job’s specified association are in use; the job will run eventually. |
| AssociationMaxJobsLimit | The maximum number of jobs for your job’s association has been met; the job will run eventually. |
| AssociationNodeLimit | All nodes assigned to your job’s specified association are in use; the job will run eventually. |
A full list of these Job Reason Codes can be found in Slurm’s documentation
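The reason code for your jobs can also be printed directly with a custom output format (%.10i is the job ID, %.4t the compact state code and %r the reason):

```shell
$ squeue --user=USERNAME --format="%.10i %.4t %r"
```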
Monitoring finished jobs: sacct
The sacct command allows users to pull up status information about past jobs. This command is used on jobs that have previously run on the system, instead of currently running jobs.
We can use a job’s ID…
$ sacct --jobs=job-id
…or your Garnatxa username…
$ sacct --user=USERNAME
…to pull up accounting information on jobs run at an earlier time.
By default, sacct will only pull up jobs that were run on the current day. We can use the --starttime flag to tell the command to look beyond its short-term cache of jobs.
$ sacct --user=USERNAME --starttime=YYYY-MM-DD
To see a non-abbreviated version of sacct output, use the --long flag:
$ sacct --user=USERNAME --starttime=YYYY-MM-DD --long
The standard output of sacct may not provide the information we want. To remedy this, we can use the --format flag to choose what we want in our output. The format flag takes a comma-separated list of variables which specify the output data:
$ sacct --user=USERNAME --format=var_1,var_2, ... ,var_N
A chart of some variables is provided below:
| Variable | Description |
|---|---|
| account | Account the job ran under. |
| avecpu | Average CPU time of all tasks in the job. |
| averss | Average resident set size of all tasks in the job. |
| cputime | Formatted (elapsed time * CPU count) used by a job or step. |
| elapsed | The job’s elapsed time, formatted as DD-HH:MM:SS. |
| exitcode | The exit code returned by the job script or salloc. |
| jobid | The ID of the job. |
| jobname | The name of the job. |
| maxdiskread | Maximum number of bytes read by all tasks in the job. |
| maxdiskwrite | Maximum number of bytes written by all tasks in the job. |
| maxrss | Maximum resident set size of all tasks in the job. |
| ncpus | Number of allocated CPUs. |
| nnodes | The number of nodes used in a job. |
| ntasks | Number of tasks in a job. |
| priority | Slurm priority. |
| qos | Quality of service. |
| reqcpu | Required number of CPUs. |
| reqmem | Required amount of memory for a job. |
| user | Username of the person who ran the job. |
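For example, to review how much memory your recent jobs actually peaked at alongside their elapsed time and final state:

```shell
$ sacct --user=USERNAME --starttime=YYYY-MM-DD --format=jobid,jobname,elapsed,maxrss,state
```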
more information about sacct
Canceling jobs: scancel
Sometimes you may need to stop a job entirely while it’s running. The best way to accomplish this is with the scancel command, which allows you to cancel jobs you are running on Garnatxa using the job’s ID. The command looks like this:
$ scancel job-id
To cancel multiple jobs, you can pass a space-separated list of job IDs:
$ scancel job-id1 job-id2 job-id3
To cancel all your jobs (running and pending):
$ scancel -u USERNAME
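scancel can also target jobs by attribute, for example cancelling only your pending jobs, or all of your jobs with a given name:

```shell
$ scancel --user=USERNAME --state=PENDING
$ scancel --user=USERNAME --name=jobname
```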
more information about scancel
Checking efficiency of running jobs: squeue_
The extended command squeue_ allows users to easily pull up status information about their currently running jobs, including the requested resources and the resources used so far. The command reports the achieved efficiency as a percentage. It is very important that you monitor your jobs and check the resources they actually consume. If squeue_ shows an efficiency smaller than 80%, you should adjust the resource requests of your next job.
To check the efficiency of a running job:
$ squeue_ -j 10075
__________________________________________________________________________________________________________________________________________________________________________
| ST | JOB | NAME | USER | ACCOUNT | QOS | STARTIME | TIME | TIME_LEFT | ND | CPU E.CPU | PEAK_MEM E.MEM | NODES |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| R | 10075 | EXAMPLE | USER1 | ACC1 | short | 2023-01-13 | 4:51:34 | 19:08:26 | 1 | 1/10 10% | 23G/100G 23% | cn00 |
|__________________________________________________________________________________________________________________________________________________________________________|
CPU: the average number of CPUs used during the execution time of the task. In the example, 1/10 means that the user requested 10 CPUs but only 1 CPU is being used on average.
PEAK_MEM: the maximum amount of memory the job has used during the execution. In the example, 23G/100G means that the user requested a total of 100 GB of memory and at some point during execution the job reached 23 GB.
What parameter should I use to measure my efficiency? Check the CPU and PEAK_MEM columns to determine whether the efficiency of your job is below 75% in CPU or memory. If it is, modify the requested parameters in your submission script. These values give you an idea of the resources consumed by your job so far. If the efficiency is very low and your job has only been running for a short time, cancel the job and adjust the requirements, or wait for the job to finish to be sure of what it consumed (next section).
To check the efficiency of all your jobs:
$ squeue_ -u USERNAME
Checking efficiency of running and completed jobs: sacct_
The resources consumed by a running or finished job can be checked by executing sacct_:
$ sacct_ -j 10516
[USERNAME@master test]$ sacct_ -j 10516
________________________________________________________________________________________________________________________________________________________________________________________________________________
| JOBID | NAME | START | END | ELAPSED | TOTAL_CPU | USER | ACCOUNT | QOS | CPU E.CPU | PEAK_MEM E.MEM | STATE | EXIT_CODE |
|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 10516 | seqJobTest | 2023-01-11T11:09:32 | 2023-01-11T11:11:33 | 00:02:01 | 02:01.567 | xxx | admin | short | 1/1 100% | 6G/10G 60% | COMPLETED | 0:0 |
| 10516.batch | batch | 2023-01-11T11:09:32 | 2023-01-11T11:11:33 | 00:02:01 | 00:00.373 | | admin | | -/2 - | -/10G - | COMPLETED | 0:0 |
| 10516.0 | stress | 2023-01-11T11:09:33 | 2023-01-11T11:11:33 | 00:02:00 | 02:01.194 | | admin | | -/2 - | -/10G - | COMPLETED | 0:0 |
|________________________________________________________________________________________________________________________________________________________________________________________________________________|
In the example, the job with job_id 10516 consumed 6 GB of memory but the user requested 10 GB, so the efficiency was only 60%. To show the last jobs finished by a user:
$ sacct_ -u USERNAME
And to get a brief output (only efficiencies and discarding job steps):
$ sacct_ -b -u USERNAME
Plotting the job efficiency over time: plotjob
The command plotjob can be used to display a plot of the consumed resources (CPU or memory) over the execution time. You can use this command with running or finished jobs.
To use this command you will need to establish an X11 forwarding connection with Garnatxa (ssh -X). Use plotjob -h to get more information.
Attention
Keep in mind that the gray areas on the plot mean wasted resources for your jobs and for the rest of the users. Pay attention to the mean (CPU) or peak (memory) data and adjust your sbatch script with the resources your job really needs.
Example of plotting the cpu efficiency:
ssh -X USERNAME@garnatxa
plotjob -j job_id -o cpu

Example of plotting the memory efficiency:
ssh -X USERNAME@garnatxa
plotjob -j <job_id> -o mem

If you cannot get graphical output, you can save the plot to a file on disk and send it to an external location.
plotjob -j <job_id> -o mem -s
ls /tmp/mem_plot_1857479.png
scp /tmp/mem_plot_1857479.png user@external_host:/tmp
Controlling queued and running jobs: scontrol
The scontrol command provides users extended control of their jobs run through Slurm. This includes actions like suspending a job, holding a job from running, or pulling extensive status information on jobs.
To suspend a job that is currently running on the system, we can use scontrol with the suspend command. This will stop a running job at its current step so that it can be resumed at a later time. We can suspend a job by typing the command:
$ scontrol suspend job_id
To resume a paused job, we use scontrol with the resume command:
$ scontrol resume job_id
Slurm also provides a utility to hold jobs that are queued in the system. Holding a job will place it at the lowest priority, effectively “holding” it from being run. A job can only be held while it is waiting in the queue. We use the hold command to place a job into a held state:
$ scontrol hold job_id
We can then release a held job using the release command:
$ scontrol release job_id
scontrol can also provide information on jobs using the show job command. The information provided from this command is quite extensive and detailed, so be sure to either clear your terminal window, grep certain information from the command, or pipe the output to a separate text file:
Output to console
$ scontrol show job job_id
Streaming output to a textfile
$ scontrol show job job_id > outputfile.txt
Piping output to grep to find lines containing the word “Time”
$ scontrol show job job_id | grep Time
more information about scontrol
Submitting jobs to the cluster: sbatch
sbatch submits a batch script to Slurm. The batch script may be given to sbatch through a file name on the command line, or if no file name is specified, sbatch will read in a script from standard input. The batch script may contain options preceded with “#SBATCH” before any executable commands in the script. sbatch will stop processing further #SBATCH directives once the first non-comment non-whitespace line has been reached in the script. sbatch exits immediately after the script is successfully transferred to the Slurm controller and assigned a Slurm job ID. The batch script is not necessarily granted resources immediately; it may sit in the queue of pending jobs for some time before its required resources become available.
By default both standard output and standard error are directed to a file of the name “slurm-%j.out”, where the “%j” is replaced with the job allocation number. The file will be generated on the first node of the job allocation. Other than the batch script itself, Slurm does no movement of user files. When the job allocation is finally granted for the batch script, Slurm runs a single copy of the batch script on the first node in the set of allocated nodes.
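For example, the %j substitution can be combined with the --error option to send output and errors to separate files named after the job ID:

```shell
#SBATCH --output=myjob_%j.out   # standard output, e.g. myjob_6757.out
#SBATCH --error=myjob_%j.err    # standard error in its own file
```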
A batch job is a shell script that is processed by a batch system. A typical batch job is shown below. It has four sections
shebang (line 1)
submit options (lines 3-5)
initialization (lines 7-8)
data handling and work (lines 10-12)
that are explained afterwards. The example shows the basic structure. Real batch jobs can become more complex.
1#!/bin/bash
2
3# submit options
4#SBATCH --ntasks=1
5#SBATCH --time=00:05:00
6
7# initialization
8module load package/version
9
10# data handling and work
11cd /path/to/working/directory
12binary [arguments]
13
14exit
Explanations
1. shebang. The first line of every shell script is the shebang, which specifies the command line interpreter to use.
2. submit options. In a batch job the next lines contain submit options. Alternatively, options can be given as arguments to the submit command; the syntax for specifying options is the same in both cases. In a job script, submit options must be preceded by a special prefix, which is #SBATCH for the SLURM batch system. Syntactically, the first character of the prefix makes such a line a shell script comment. The submit command stops processing these lines once the first line containing a shell command has been reached.
3. initialization. Whether system-specific initialization is needed depends on the system; for example, on our system the module function is often needed in job-specific initialization. For job-specific initialization there are two typical use cases: for application packages, the corresponding module must be loaded; for self-compiled software, it might be necessary to load (or switch to) exactly the same modules that were loaded at compile time. For MPI programs launched with the mpirun command, the MPI module used at compile time must be loaded in any case.
4. data handling and work. This part contains the commands for handling data and the actual work to be performed, including selecting the working directory. The default working directory is the directory in which the submit command was issued.
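The submit options described above can therefore also be given on the command line instead of in the script; command-line options take precedence over #SBATCH lines. Both invocations below (job.sh is a placeholder name) request the same resources:

```shell
$ sbatch --ntasks=1 --time=00:05:00 job.sh   # options on the command line
$ sbatch job.sh                              # options read from #SBATCH lines
```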
When you write an sbatch file you can use a set of read-only variables to obtain the values of your job’s resource requests:
| Variable | Description |
|---|---|
| $SLURM_JOB_ID | The job ID. |
| $SLURM_SUBMIT_DIR | The path of the job submission directory. |
| $SLURM_SUBMIT_HOST | The hostname of the node used for job submission. |
| $SLURM_JOB_NODELIST | The list of nodes assigned to the job. |
| $SLURM_CPUS_PER_TASK | Number of CPUs per task. |
| $SLURM_CPUS_ON_NODE | Number of CPUs on the allocated node. |
| $SLURM_JOB_CPUS_PER_NODE | Count of processors available to the job on this node. |
| $SLURM_CPUS_PER_GPU | Number of CPUs requested per allocated GPU. |
| $SLURM_MEM_PER_CPU | Memory per CPU. Same as --mem-per-cpu. |
| $SLURM_MEM_PER_GPU | Memory per GPU. |
| $SLURM_MEM_PER_NODE | Memory per node. Same as --mem. |
| $SLURM_GPUS | Number of GPUs requested. |
| $SLURM_NTASKS | The number of tasks. |
| $SLURM_NTASKS_PER_NODE | Number of tasks requested per node. |
| $SLURM_NTASKS_PER_SOCKET | Number of tasks requested per socket. |
| $SLURM_NTASKS_PER_CORE | Number of tasks requested per core. |
| $SLURM_NTASKS_PER_GPU | Number of tasks requested per GPU. |
| $SLURM_NNODES | Total number of nodes in the job’s resource allocation. |
| $SLURM_TASKS_PER_NODE | Number of tasks to be initiated on each node. |
| $SLURM_ARRAY_JOB_ID | Job array’s master job ID number. |
| $SLURM_ARRAY_TASK_ID | Job array ID (index) number. |
| $SLURM_ARRAY_TASK_COUNT | Total number of tasks in a job array. |
| $SLURM_ARRAY_TASK_MAX | Job array’s maximum ID (index) number. |
| $SLURM_ARRAY_TASK_MIN | Job array’s minimum ID (index) number. |
| $SLURM_RESTART_COUNT | The number of times your job was restarted due to node failures. |
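As a sketch, a job script can log the allocation it actually received using these variables. The report_alloc helper below is hypothetical; the :-N/A fallbacks only exist so the snippet also runs outside a Slurm job:

```shell
#!/bin/bash
# Print the resources Slurm granted to this job; outside a job the
# SLURM_* variables are unset, so fall back to N/A.
report_alloc() {
  echo "job=${SLURM_JOB_ID:-N/A} nodes=${SLURM_JOB_NODELIST:-N/A} ntasks=${SLURM_NTASKS:-N/A} cpus=${SLURM_CPUS_PER_TASK:-N/A}"
}
report_alloc
```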
Job scripts, the sbatch command, and the srun command support many different resource requests in the form of flags. These flags are available to all forms of jobs. To review all possible flags for these commands, please visit the Slurm page on sbatch. Below, we have listed some useful directives to consider when running your job script.
| Type | Description | Flag |
|---|---|---|
| Allocation | Specify an allocation account | --account=allocation |
| Quality of service | Specify a QoS (see the section limits) | --qos=qos |
| Sending email | Receive email at beginning or end of job completion | --mail-type=type |
| Email address | Email address to receive the email | --mail-user=user |
| Number of nodes | The number of nodes needed to run the job | --nodes=nodes |
| Number of tasks | The total number of processes needed to run the job | --ntasks=processes |
| Tasks per node | The number of processes to assign to each node | --ntasks-per-node=processes |
| CPUs per task | The number of CPUs used per task | --cpus-per-task=number_cpus |
| Total memory | The total memory (per node requested) required for the job | --mem=memory (units: K,M,G,T; default M) |
| Memory per CPU | The memory per CPU | --mem-per-cpu=memory (units: K,M,G,T; default M) |
| Wall time | The max amount of time your job will run | --time=wall_time |
| Job name | Name your job so you can identify it in the queue | --job-name=jobname |
| Multithreading | Each task will only use 1 thread per core | --hint=nomultithread or --threads-per-core=1 |
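Putting several of these flags together, a job header might look like the sketch below (the account, QoS, job name and email address are placeholders to replace with your own values):

```shell
#!/bin/bash
#SBATCH --job-name=myanalysis
#SBATCH --account=myaccount
#SBATCH --qos=short
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=1
#SBATCH --mem=8G
#SBATCH --time=01:00:00
#SBATCH --mail-type=END
#SBATCH --mail-user=user@example.com
```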
Below is a set of example scripts that you can use as templates for building your own SLURM submission scripts. Each script submits a type of job (sequential, parallel, MPI, OpenMP, etc.).
If you want to test the scripts before submitting your own jobs to the cluster, copy the directory /doc/test/ to your account.
[USERNAME@master ~]$ cp -pr /doc/test/ .
The scripts solve a typical problem in the bioinformatics environment: indexing a reference sequence and then aligning multiple reads to the reference genome. The directory structure is:
[USERNAME@master ~]$ cd test
[USERNAME@master test]$ ls -R
.:
ArrayJob.sh data executables FileJob.sh files MPIJob.sh OpenMPJob.sh out ref SequentialJob.sh
./data:
reads_00.fq reads_02.fq reads_04.fq reads_06.fq reads_08.fq reads_10.fq reads_12.fq reads_14.fq reads_16.fq reads_18.fq reads_20.fq
reads_01.fq reads_03.fq reads_05.fq reads_07.fq reads_09.fq reads_11.fq reads_13.fq reads_15.fq reads_17.fq reads_19.fq
./out:
./ref:
chr8.fa
If you choose to execute one of these sample scripts, please make sure you understand what each #SBATCH directive does before using the script to submit your jobs. Otherwise, you may not get the result you want and may waste valuable computing resources.
Basic, Single-Threaded Job
This script can serve as the template for many single-processor applications. The --mem flag can be used to request the appropriate amount of memory for your job. Please make sure to test your application and set this value to a reasonable number based on actual memory use. The %j in the --output line tells SLURM to substitute the job ID into the name of the output file. You can also add a -e or --error line with an error file name to separate output and error logs. Note that this type of job is not parallel, so it needs a single CPU; remember to indicate this with the --ntasks and --cpus-per-task lines.
See the script: SequentialJob.sh
1#!/bin/bash
2#SBATCH --job-name=seqJobTest # Job name (showed with squeue)
3#SBATCH --output=seqJobTest_%j.out # Standard output and error log
4#SBATCH --qos=short # QoS: short,medium,long,long-mem
5#SBATCH --nodes=1 # Required only 1 node
6#SBATCH --ntasks=1 # Required only 1 task
7#SBATCH --cpus-per-task=1 # Required only 1 cpu
8#SBATCH --mem=10G # Required 10GB of memory
9#SBATCH --time=00:05:00 # Required 5 minutes of execution time.
10
11# The first command is to load the required software.
12module load biotools
13
14# Index the reference genome (ref/chr8.fa). The output files will be renamed with prefix: chr8_ref
15srun bwa index ref/chr8.fa -p ref/chr8_ref
16
17# Align a single file of reads (data/reads_00.fq) to the indexed reference (ref/chr8_ref). We are using a single cpu (parameter: -t 1)
18srun bwa aln -I -t 1 ref/chr8_ref data/reads_00.fq > out/example_aln.sai
19
20exit 0
Commented lines:
4. --qos=short: we are requesting 1 CPU, 10 GB of memory and 5 minutes of execution time, so we have to select the QoS short (see limits).
5. --nodes=1: the minimum number of nodes to run on. A maximum node count may also be specified with the syntax (min-max): --nodes=1-4. **You can omit this parameter and let SLURM select the number of nodes that are necessary.**
6. --ntasks=1: each job step (each of the srun lines) is submitted with only 1 task.
7. --cpus-per-task=1: each task (each of the srun lines) will run on a single CPU; see the notes below.
8. --mem=10G: request 10 GB of RAM for the whole job.
15. srun bwa index: the first job step in the job. It is a sequential job (not parallelized, so it only needs a single CPU) that indexes the reference genome.
18. srun bwa aln: the second job step in the job. It is a sequential job (not parallelized, so it only needs a single CPU) that aligns one of the 20 read files to the reference genome.
Important
Garnatxa always allocates an even number of logical threads. This is for performance reasons related to hyper-threading technology: each physical core in Garnatxa provides two computation threads, which are reserved exclusively for a single job. This means that when you request a single CPU, Slurm internally reserves an even number. Whether to use the extra thread (CPU) is then up to you, and depends on whether your job is capable of parallelization.
We start by launching the job to the queue system. The sbatch command returns the job identifier that will be used afterwards.
[USERNAME@master test]$ sbatch SequentialJob.sh
Submitted batch job 6757
Now we can review the state of the job in the batch system. squeue shows whether the job is running (R) or pending, waiting for resources (PD). The extended command squeue_ also returns efficiency information about the job.
This example takes approximately 2 minutes to finish.
[USERNAME@master test]$ squeue -u USERNAME
JOBID PARTITION QOS NAME USER ST TIME NODES NODELIST(REASON)
6757 global short seqJobTe USERNAME R 0:05 1 cn07
[USERNAME@master test]$ squeue_ -u USERNAME
____________________________________________________________________________________________________________________________________________________________________________________________________________________
| ST | JOB | NAME | USER | ACCOUNT | QOS | STARTIME | TIME | TIME_LEFT | ND | CPU E.CPU | PEAK_CPU EFFIC | PEAK_MEM EFFIC | NOW_MEM EFFIC | NODES |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| R | 6757 | seqJobTest | USERNAME | admin | short | 2022-11-14 | 2:46 | 2:14 | 1 | 1/2 50% | 1/2 50% | 0G/10G 0% | 0G/10G 0% | cn00 |
|____________________________________________________________________________________________________________________________________________________________________________________________________________________|
While the job is running we can check the amount of CPU and memory used by typing squeue_. You will have to wait at least a couple of minutes until the system shows the first efficiency results.
In the example the job is using only 1 CPU, but the system allocated 2 because that is the minimum request. The job is currently using 220 MB (squeue_ only shows values above 1 GB) although we requested 10 GB. Note that the PEAK_MEM column shows the maximum amount of memory consumed by your job during the whole execution.
We can wait until the job is completed and then check the resources consumed. In any case, it is clear that you should adjust the number of CPUs and the amount of memory in the next execution.
At any time we can also check the status of a job by typing sacct or sacct_. If the job is finished it will be marked as COMPLETED.
[USERNAME@master test]$ sacct_ -j 6757
__________________________________________________________________________________________________________________________________________________________________________________________________________________
| JOBID | NAME | START | END | ELAPSED | TOTAL_CPU | USER | ACCOUNT | QOS | CPU | E.CPU | PEAK_MEM | E.MEM | STATE | EXIT_CODE |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 6757 | seqJobTest | 2022-11-14T17:12:20 | 2022-11-14T17:15:02 | 00:02:42 | 00:02:40 | USERNAME | admin | short | 1/1 | 100% | 0G/10G | 0% | COMPLETED | 0:0 |
| 6757.batch | batch | 2022-11-14T17:12:20 | 2022-11-14T17:15:02 | 00:02:42 | 00:02:40 | | admin | | -/2 | - | -/10G | - | COMPLETED | 0:0 |
| 6757.0 | bwa | 2022-11-14T17:12:21 | 2022-11-14T17:14:12 | 00:01:51 | 00:01:48 | | admin | | -/2 | - | -/10G | - | COMPLETED | 0:0 |
| 6757.1 | bwa | 2022-11-14T17:14:02 | 2022-11-14T17:14:47 | 00:00:50 | 00:00:48 | | admin | | -/2 | - | -/10G | - | COMPLETED | 0:0 |
|__________________________________________________________________________________________________________________________________________________________________________________________________________________|
The above command shows that the main job (6757) executed two job steps (6757.0 and 6757.1). The columns E.CPU and E.MEM show the achieved efficiency (in this case the job consumed less than 1GB of memory).
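The CPU efficiency reported by sacct_ can be reproduced by hand from the TOTAL_CPU and ELAPSED columns. Below is a minimal sketch in plain Bash (the helper name to_seconds is ours, not a SLURM command) using the values of job 6757; integer division gives roughly the percentage shown:

```shell
#!/bin/bash
# Convert an HH:MM:SS (or MM:SS) timestamp to seconds
to_seconds() {
    local IFS=':'
    local parts=($1) total=0 p
    for p in "${parts[@]}"; do
        total=$((total * 60 + 10#$p))   # 10# avoids octal parsing of "08", "09"
    done
    echo "$total"
}

elapsed=$(to_seconds "00:02:42")   # ELAPSED column of job 6757
cpu=$(to_seconds "00:02:40")       # TOTAL_CPU column
ncpus=1                            # CPUs actually used by the job
echo "Efficiency: $((100 * cpu / (elapsed * ncpus)))%"
```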
Finally, we can see the output of the job. Remember that the name of the output file contains the job id number.
[USERNAME@master test]$ more seqJobTest_6757.out
[bwa_index] Pack FASTA... 0.98 sec
[bwa_index] Construct BWT for the packed sequence...
[BWTIncCreate] textLength=292728044, availableWord=32597292
[BWTIncConstructFromPacked] 10 iterations done. 53770876 characters processed.
[BWTIncConstructFromPacked] 20 iterations done. 99337660 characters processed.
[BWTIncConstructFromPacked] 30 iterations done. 139833516 characters processed.
[BWTIncConstructFromPacked] 40 iterations done. 175822284 characters processed.
[BWTIncConstructFromPacked] 50 iterations done. 207805164 characters processed.
[BWTIncConstructFromPacked] 60 iterations done. 236227612 characters processed.
[BWTIncConstructFromPacked] 70 iterations done. 261485516 characters processed.
[BWTIncConstructFromPacked] 80 iterations done. 283930748 characters processed.
[bwt_gen] Finished constructing BWT in 85 iterations.
[bwa_index] 72.14 seconds elapse.
[bwa_index] Update BWT... 0.59 sec
[bwa_index] Pack forward-only FASTA... 0.58 sec
[bwa_index] Construct SA from BWT and Occ... 33.64 sec
[main] Version: 0.7.17-r1188
[main] CMD: /storage/apps/BWA/0.7.17/bin/bwa index -p ref/chr8_ref ref/chr8.fa
[main] Real time: 110.993 sec; CPU: 107.935 sec
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 3.35 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 262144 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 3.33 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 524288 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 3.31 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 786432 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 3.25 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 1048576 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 3.25 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 1310720 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 3.25 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 1572864 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 3.25 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 1835008 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 3.28 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 2097152 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 3.26 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 2359296 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 3.24 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 2621440 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 3.26 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 2883584 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 3.27 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 3145728 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 3.25 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 3407872 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 1.18 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 3502500 sequences have been processed.
[main] Version: 0.7.17-r1188
[main] CMD: bwa aln -I -t 1 ref/chr8_ref data/reads_00.fq
[main] Real time: 50.314 sec; CPU: 47.208 sec
Multi-Threaded SMP Job
This script can serve as a template for applications that are capable of using multiple processors on a single server or physical computer. These applications are commonly referred to as threaded, OpenMP, PTHREADS, or shared memory applications. While they can use multiple processors, they cannot make use of multiple servers, and all the processors must be on the same node.
These applications require shared memory and can only run on one node; as such, it is important to remember the following:
You must set --ntasks=1, and then set --cpus-per-task to the number of OpenMP threads you wish to use. You must also make the application aware of how many processors to use. How that is done depends on the application: for applications using the OpenMP paradigm, set OMP_NUM_THREADS to a value less than or equal to the number of cpus-per-task you set; for other applications, use a command line option when calling the application. Check whether your application provides a parameter indicating the number of threads to parallelize with.
The script below requests 4 CPUs to parallelize one of the job steps. Observe that the indexing process is not parallelized (it will use only 1 CPU). However, the alignment process uses the parameter -t to request 4 threads. You can use the SLURM variable $SLURM_CPUS_PER_TASK to reference the number of CPUs per task requested (written in the line #SBATCH --cpus-per-task=4).
1   #!/bin/bash
2
3   #SBATCH --job-name=multiThreadJob        # Job name
4   #SBATCH --output=multiThreadJob_%j.out   # Standard output and error log
5   #SBATCH --nodes=1                        # Run all processes on a single node
6   #SBATCH --ntasks=1                       # Run a single task
7   #SBATCH --cpus-per-task=4                # Number of CPU cores per task
8   #SBATCH --mem=1gb                        # Job memory request
9   #SBATCH --time=00:05:00                  # Time limit hrs:min:sec
10  #SBATCH --qos=short                      # QoS: short,medium,long,long-mem
11
12  export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # Only if your application uses the OpenMP paradigm
13
14  # Load the required software (bwa)
15  module load biotools
16
17  # Index the reference genome (ref/chr8.fa). The output files will be renamed with prefix: chr8_ref
18  srun bwa index ref/chr8.fa -p ref/chr8_ref
19
20  # Align a single file of reads (data/reads_00.fq) to the indexed reference file (ref/chr8.fa).
21  # We are using 4 cpus (parameter: -t $SLURM_CPUS_PER_TASK)
22  srun bwa aln -I -t $SLURM_CPUS_PER_TASK ref/chr8_ref data/reads_00.fq > out/example_aln.sai
23
24  exit 0
Commented lines:
6. --ntasks=1 , each job step is assigned to a single task, but each task can request 4 CPUs (see line 7).
7. --cpus-per-task=4 , if the job step is parallelized it can employ up to 4 CPUs.
8. --mem=1gb , the entire job (including all the job steps) requires 1GB of RAM. Review this parameter after your job has finished.
18. srun bwa index , the first job step. The indexing process is a sequential task, so we cannot use more than 1 CPU.
22. srun bwa aln -I -t $SLURM_CPUS_PER_TASK , the second job step. The alignment process can be parallelized to run faster. Use the -t parameter of the bwa application to specify the number of threads running concurrently. We can use $SLURM_CPUS_PER_TASK to reference the number of CPUs requested in the sbatch script (line 7).
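The OMP_NUM_THREADS assignment can be made a little more robust. A minimal sketch, assuming an OpenMP application: default to 1 thread when the script is run outside SLURM, and export the variable so that the processes launched by srun actually inherit it:

```shell
#!/bin/bash
# Fall back to 1 thread if SLURM_CPUS_PER_TASK is unset (e.g. when running
# the script by hand); export so that child processes see the value.
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
echo "Using $OMP_NUM_THREADS OpenMP threads"
```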
Now submit the multithreaded job and wait until it is completed. Inspecting the resulting file, you can check that the alignment job step took 33.59 seconds, while the sequential version took 50.31 seconds:
[USERNAME@master test]$ sbatch MultiThreadJob.sh
Submitted batch job 7035
[USERNAME@master test]$ cat multiThreadJob_7035.out
[bwa_index] Pack FASTA... 1.06 sec
[bwa_index] Construct BWT for the packed sequence...
[BWTIncCreate] textLength=292728044, availableWord=32597292
[BWTIncConstructFromPacked] 10 iterations done. 53770876 characters processed.
[BWTIncConstructFromPacked] 20 iterations done. 99337660 characters processed.
[BWTIncConstructFromPacked] 30 iterations done. 139833516 characters processed.
[BWTIncConstructFromPacked] 40 iterations done. 175822284 characters processed.
[BWTIncConstructFromPacked] 50 iterations done. 207805164 characters processed.
[BWTIncConstructFromPacked] 60 iterations done. 236227612 characters processed.
[BWTIncConstructFromPacked] 70 iterations done. 261485516 characters processed.
[BWTIncConstructFromPacked] 80 iterations done. 283930748 characters processed.
[bwt_gen] Finished constructing BWT in 85 iterations.
[bwa_index] 69.26 seconds elapse.
[bwa_index] Update BWT... 0.65 sec
[bwa_index] Pack forward-only FASTA... 0.64 sec
[bwa_index] Construct SA from BWT and Occ... 31.88 sec
[main] Version: 0.7.17-r1188
[main] CMD: /storage/apps/BWA/0.7.17/bin/bwa index -p ref/chr8_ref ref/chr8.fa
[main] Real time: 105.585 sec; CPU: 103.520 sec
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 5.35 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 262144 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 5.26 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 524288 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 5.31 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 786432 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 5.23 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 1048576 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 5.29 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 1310720 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 5.44 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 1572864 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 5.24 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 1835008 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 5.34 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 2097152 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 5.23 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 2359296 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 5.37 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 2621440 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 5.44 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 2883584 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 5.35 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 3145728 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 5.34 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 3407872 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 1.95 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 3502500 sequences have been processed.
[main] Version: 0.7.17-r1188
[main] CMD: /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 4 ref/chr8_ref data/reads_00.fq
[main] Real time: 33.590 sec; CPU: 75.471 sec
Message Passing Interface (MPI) Jobs
MPI is a specification used by software developers to make use of a cluster of computers. A set of libraries exists for using this standard on modern High Performance Computing (HPC) clusters. The challenge with a computing cluster is that while some of the CPUs share memory (shared memory), others have a distributed memory architecture and are only connected by network. With MPI, developers can make use of distributed memory, shared memory, or a hybrid of both.
If your application uses MPI you can review the next script, which can serve as a template for MPI (Message Passing Interface) applications. These are applications that can use multiple processors that may, or may not, be on multiple compute nodes.
First, we need to load the openmpi module so that your application is able to run with the MPI libraries.
Some parameters listed in the script:
-n, --ntasks=<number> Number of tasks (MPI ranks). We are requesting 80 MPI processes (tasks), each of which will use a single CPU, so the total number of CPUs to be used is 80.
-c, --cpus-per-task=<ncpus> Request ncpus cores per task.
-N, --nodes=<minnodes[-maxnodes]> Request that a minimum of minnodes nodes be allocated to this job. We can omit this parameter and SLURM will choose a suitable number of nodes to allocate the job.
As you can see, the last line in the script launches the mpirun command, passing the number of MPI tasks that will be created (use the SLURM variable ${SLURM_NTASKS}).
1   [USERNAME@master test]$ cat MPIJob.sh
2   #!/bin/bash
3   #SBATCH --job-name=MPIJob       # Job name
4   #SBATCH --nodes=2               # Maximum number of nodes to be allocated
5   #SBATCH --ntasks=80             # Number of MPI tasks (i.e. processes)
6   #SBATCH --cpus-per-task=1       # Number of cores per MPI task
7   #SBATCH --mem=1G                # Memory per node
8   #SBATCH --time=00:05:00         # Wall time limit (days-hrs:min:sec)
9   #SBATCH --output=MPIJob_%j.log  # Path to the standard output and error files relative to the working directory
10  #SBATCH --qos=short             # QoS: short,medium,long,long-mem
11
12  echo "JOBID = $SLURM_JOB_ID"
13  echo "Number of Nodes Allocated = $SLURM_JOB_NUM_NODES"
14  echo "Number of Tasks Allocated = $SLURM_NTASKS"
15  echo "Number of Cores/Task Allocated = $SLURM_CPUS_PER_TASK"
16
17  module load openmpi4
18
19  mpirun -np ${SLURM_NTASKS} ./mpi_hello_world
Commented lines:
4. --nodes=2 , in the example we are requesting 80 MPI tasks (80 MPI processes = 80 CPUs), so we will need 2 nodes (40 CPUs per node). You can omit this parameter and SLURM will select the number of nodes your job needs.
6. --cpus-per-task=1 , each MPI process will consume 1 CPU.
7. --mem=1G , the memory requested per node. Alternatively we could have specified --mem-per-cpu; in that case the total memory requested by the job would be: ntasks * cpus-per-task * mem-per-cpu.
17. module load openmpi4 , we need to load the openmpi libraries before running an MPI application.
19. mpirun -np ${SLURM_NTASKS} , launch the MPI application (in this example a trivial hello_world). You must set the number of MPI tasks with the -np parameter (use the ${SLURM_NTASKS} variable to reference the number of tasks written in line 5 of the sbatch script: --ntasks).
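The total-memory rule for --mem-per-cpu mentioned above can be checked with quick shell arithmetic. A minimal sketch using the task counts from the script above and a hypothetical --mem-per-cpu=1G:

```shell
#!/bin/bash
ntasks=80          # --ntasks
cpus_per_task=1    # --cpus-per-task
mem_per_cpu_gb=1   # hypothetical --mem-per-cpu=1G

# Total memory the job would request: ntasks * cpus-per-task * mem-per-cpu
total=$((ntasks * cpus_per_task * mem_per_cpu_gb))
echo "Total memory requested: ${total}G"
```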
After the job is completed we can check that the 80 MPI tasks were run on two nodes.
[USERNAME@master test]$ sbatch MPIJob.sh
Submitted batch job 7053
[USERNAME@master test]$ more MPIJob_7053.log
Number of Nodes Allocated = 2
Number of Tasks Allocated = 80
Number of Cores/Task Allocated = 1
Hello world from processor osd01, rank 42 out of 80 processors
Hello world from processor osd01, rank 46 out of 80 processors
Hello world from processor osd01, rank 51 out of 80 processors
Hello world from processor osd01, rank 55 out of 80 processors
Hello world from processor osd01, rank 69 out of 80 processors
Hello world from processor osd01, rank 79 out of 80 processors
Hello world from processor osd01, rank 40 out of 80 processors
Hello world from processor osd00, rank 11 out of 80 processors
Hello world from processor osd01, rank 47 out of 80 processors
Hello world from processor osd01, rank 50 out of 80 processors
Hello world from processor osd01, rank 52 out of 80 processors
Hello world from processor osd01, rank 54 out of 80 processors
Hello world from processor osd01, rank 60 out of 80 processors
Hello world from processor osd01, rank 62 out of 80 processors
Hello world from processor osd00, rank 29 out of 80 processors
Hello world from processor osd01, rank 64 out of 80 processors
Hello world from processor osd00, rank 32 out of 80 processors
Hello world from processor osd01, rank 67 out of 80 processors
Hello world from processor osd01, rank 41 out of 80 processors
Hello world from processor osd00, rank 36 out of 80 processors
Hello world from processor osd01, rank 43 out of 80 processors
Hello world from processor osd00, rank 39 out of 80 processors
Hello world from processor osd01, rank 44 out of 80 processors
Hello world from processor osd01, rank 49 out of 80 processors
Hello world from processor osd01, rank 53 out of 80 processors
Hello world from processor osd00, rank 2 out of 80 processors
Hello world from processor osd01, rank 57 out of 80 processors
Hello world from processor osd00, rank 4 out of 80 processors
Hello world from processor osd01, rank 58 out of 80 processors
Hello world from processor osd00, rank 5 out of 80 processors
Hello world from processor osd01, rank 61 out of 80 processors
Hello world from processor osd00, rank 10 out of 80 processors
Hello world from processor osd01, rank 65 out of 80 processors
Hello world from processor osd00, rank 12 out of 80 processors
Hello world from processor osd01, rank 68 out of 80 processors
Hello world from processor osd00, rank 15 out of 80 processors
Hello world from processor osd01, rank 70 out of 80 processors
Hello world from processor osd00, rank 20 out of 80 processors
Hello world from processor osd01, rank 71 out of 80 processors
Hello world from processor osd00, rank 8 out of 80 processors
Hello world from processor osd01, rank 73 out of 80 processors
Hello world from processor osd01, rank 74 out of 80 processors
Hello world from processor osd00, rank 17 out of 80 processors
Hello world from processor osd01, rank 75 out of 80 processors
Hello world from processor osd00, rank 3 out of 80 processors
Hello world from processor osd01, rank 76 out of 80 processors
Hello world from processor osd00, rank 19 out of 80 processors
Hello world from processor osd01, rank 78 out of 80 processors
Hello world from processor osd00, rank 33 out of 80 processors
Hello world from processor osd01, rank 45 out of 80 processors
Hello world from processor osd00, rank 34 out of 80 processors
Hello world from processor osd01, rank 48 out of 80 processors
Hello world from processor osd00, rank 37 out of 80 processors
Hello world from processor osd01, rank 56 out of 80 processors
Hello world from processor osd00, rank 9 out of 80 processors
Hello world from processor osd01, rank 59 out of 80 processors
Hello world from processor osd00, rank 13 out of 80 processors
Hello world from processor osd01, rank 63 out of 80 processors
Hello world from processor osd00, rank 24 out of 80 processors
Hello world from processor osd01, rank 66 out of 80 processors
Hello world from processor osd00, rank 26 out of 80 processors
Hello world from processor osd01, rank 72 out of 80 processors
Hello world from processor osd00, rank 35 out of 80 processors
Hello world from processor osd01, rank 77 out of 80 processors
Hello world from processor osd00, rank 1 out of 80 processors
Hello world from processor osd00, rank 22 out of 80 processors
Hello world from processor osd00, rank 27 out of 80 processors
Hello world from processor osd00, rank 30 out of 80 processors
Hello world from processor osd00, rank 31 out of 80 processors
Hello world from processor osd00, rank 38 out of 80 processors
Hello world from processor osd00, rank 0 out of 80 processors
Hello world from processor osd00, rank 14 out of 80 processors
Hello world from processor osd00, rank 16 out of 80 processors
Hello world from processor osd00, rank 28 out of 80 processors
Hello world from processor osd00, rank 21 out of 80 processors
Hello world from processor osd00, rank 25 out of 80 processors
Hello world from processor osd00, rank 7 out of 80 processors
Hello world from processor osd00, rank 6 out of 80 processors
Hello world from processor osd00, rank 18 out of 80 processors
Hello world from processor osd00, rank 23 out of 80 processors
Parallelization of data: ArrayJobs
When you have many files that should be processed with the same application, you can use SLURM arrays to parallelize the processing. The script below takes as input 20 files of genome reads and runs the alignment step on them simultaneously. The path to each input file (read file) is assigned to a component of the array. You set the range of array indices with the sbatch line: #SBATCH --array=1-20. It is possible to select which components of the array will be processed: to submit a job array with index values of 1, 3, 5 and 7 you would specify: $ sbatch --array=1,3,5,7. To submit a job array with index values between 1 and 7 with a step size of 2 (i.e. 1, 3, 5 and 7): $ sbatch --array=1-7:2. The maximum number of simultaneously running tasks from the job array may be specified using a "%" separator. For example, --array=0-15%4 will limit the number of simultaneously running tasks from this job array to 4.
You can find more information in the SLURM documentation about job arrays.
1   #!/bin/bash
2   #SBATCH --job-name=ArrayJob
3   #SBATCH --output=arrayJob_%A_%a.out
4   #SBATCH --ntasks=1
5   #SBATCH --cpus-per-task=1
6   #SBATCH --time=00:30:00
7   #SBATCH --mem-per-cpu=1G
8   #SBATCH --array=1-20
9   #SBATCH --qos=short
10
11  # Load the required software (bwa)
12  module load biotools
13
14  # List all reads
15  FILES=(data/*)
16
17  INPUTFILE=${FILES[$SLURM_ARRAY_TASK_ID]}
18  OUTPUTFILE=$(basename ${FILES[$SLURM_ARRAY_TASK_ID]} .fq)
19
20  # Index the reference genome (ref/chr8.fa). The output files will be renamed with prefix: chr8_ref
21  srun bwa index ref/chr8.fa -p ref/chr8_ref
22
23  # Align one file of reads to the indexed reference file (ref/chr8.fa). We are using a single cpu (parameter: -t 1)
24  srun bwa aln -I -t 1 ref/chr8_ref ${INPUTFILE} > out/example_ali_${OUTPUTFILE}.sai
25
26  exit 0
Commented lines:
7. --mem-per-cpu=1G , you are requesting the memory per CPU instead of the total memory (--mem) per job. Every job in the array will request ntasks * cpus-per-task * mem-per-cpu = 1 * 1 * 1G = 1GB.
21. srun bwa index , the indexing of the reference genome is a preliminary step common to the following 20 alignment processes.
24. srun bwa aln , we need to align 20 different files of reads. A SLURM array creates 20 tasks, and each task executes a different alignment depending on the input file assigned by SLURM. Note that with SLURM arrays we can use $SLURM_ARRAY_TASK_ID to reference the task that is running.
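The mapping between $SLURM_ARRAY_TASK_ID and an input file can be tested outside SLURM. A sketch reproducing lines 15-18 of the script, with the variable set by hand and a small stand-in data directory; note that Bash arrays are 0-indexed, so index 0 selects the first file (with --array=1-20 the first file in data/ is skipped):

```shell
#!/bin/bash
# Create a small stand-in data directory (the real one holds the read files)
mkdir -p data_demo
touch data_demo/reads_00.fq data_demo/reads_01.fq data_demo/reads_02.fq

# Simulate one array task: SLURM sets this variable itself
SLURM_ARRAY_TASK_ID=1

FILES=(data_demo/*)                      # 0-indexed: element 0 is reads_00.fq
INPUTFILE=${FILES[$SLURM_ARRAY_TASK_ID]}
OUTPUTFILE=$(basename "$INPUTFILE" .fq)  # strip directory and .fq suffix

echo "Task $SLURM_ARRAY_TASK_ID would align $INPUTFILE"
echo "Output: out/example_ali_${OUTPUTFILE}.sai"
```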
If you submit the script, all the array tasks can run simultaneously, up to the per-user limit of 1000 running jobs. The job finishes when all input files (reads) have been processed, and the output of each alignment is stored in an individual file. Note that with SLURM job arrays the whole script is executed as many times as there are components in the array. In the example above the indexing job step is executed 20 times, even though it is the same process with the same inputs and outputs. This step will probably fail because its output files are overwritten 20 times. You should remove the indexing line from this job and submit a previous, single job in charge of doing the indexing. You can also read the next section, which explains another way to submit jobs in parallel.
[USERNAME@master test]$ sbatch ArrayJob.sh
Submitted batch job 7054
[USERNAME@master test]$ squeue_ -u USERNAME
JOBID QOS NAME USER ACCOUNT TIME TIME_LEFT START_TIME NODES CPU MIN_M NODELIST ST REASON
7054_[11-20] short ArrayJob USERNAME admin 0:00 30:00 N/A 1 1 1G PD AssocMaxJobsLimit
7054_6 short ArrayJob USERNAME admin 1:35 28:25 2022-11-17 1 2 1G cn03 R None
7054_7 short ArrayJob USERNAME admin 1:35 28:25 2022-11-17 1 2 1G cn03 R None
7054_8 short ArrayJob USERNAME admin 1:35 28:25 2022-11-17 1 2 1G cn03 R None
7054_9 short ArrayJob USERNAME admin 1:35 28:25 2022-11-17 1 2 1G cn03 R None
7054_10 short ArrayJob USERNAME admin 1:35 28:25 2022-11-17 1 2 1G cn03 R None
[USERNAME@master test]$ ls arrayJob*
arrayJob_7054_10.out arrayJob_7054_12.out arrayJob_7054_14.out arrayJob_7054_16.out arrayJob_7054_18.out arrayJob_7054_1.out arrayJob_7054_2.out arrayJob_7054_4.out arrayJob_7054_6.out arrayJob_7054_8.out
arrayJob_7054_11.out arrayJob_7054_13.out arrayJob_7054_15.out arrayJob_7054_17.out arrayJob_7054_19.out arrayJob_7054_20.out arrayJob_7054_3.out arrayJob_7054_5.out arrayJob_7054_7.out arrayJob_7054_9.out
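One way to run the indexing exactly once, as suggested above, is to submit it as a separate job and make the array depend on it. A hedged sketch of this submission fragment (the job script names IndexJob.sh and the split itself are hypothetical; --dependency and --parsable are standard sbatch options). It is guarded so it only does something on a machine where sbatch exists:

```shell
#!/bin/bash
# This fragment only makes sense on a SLURM login node
command -v sbatch >/dev/null || { echo "sbatch not available (demo only)"; exit 0; }

# Submit the indexing as its own single-task job
# (IndexJob.sh would contain only the 'srun bwa index ...' step)
jobid=$(sbatch --parsable IndexJob.sh)

# Start the 20 alignments only after the indexing job completed successfully
sbatch --dependency=afterok:${jobid} ArrayJob.sh
```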
If you have a variable number of files to process in each execution, you can define the limits of the array outside the sbatch file. For the above example, you can automatically calculate the number of files to process from the command line:
[USERNAME@master test]$ sbatch --array=0-`ls data|wc -l` ArrayJob.sh
In this way you can change the number of files in the data directory, omitting the directive #SBATCH --array=1-20
in the sbatch configuration file.
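Note that with 0-based Bash array indexing the upper bound should be the file count minus one; otherwise the last array task would point past the end of FILES. A sketch computing the bounds explicitly, using a stand-in data directory with 21 read files like the example:

```shell
#!/bin/bash
# Stand-in data directory with 21 read files (reads_00.fq .. reads_20.fq)
mkdir -p data_demo2
touch data_demo2/reads_{00..20}.fq

num_files=$(ls data_demo2 | wc -l | tr -d ' ')
max_index=$((num_files - 1))    # array indices run 0..num_files-1
echo "Would submit: sbatch --array=0-${max_index} ArrayJob.sh"
```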
Parallelization of data using a file of commands (arrays version)
Alternatively, you can use an input file with a list of samples/datasets (one per line) to process. You must first create a text file with one execution line per file to be processed. The example below shows the file that contains each alignment command.
1   [USERNAME@master test]$ cat list_of_cmd.txt
2   bwa aln -I -t 1 ref/chr8_ref data/reads_00.fq > out/example_ali_reads_00.sai
3   bwa aln -I -t 1 ref/chr8_ref data/reads_01.fq > out/example_ali_reads_01.sai
4   bwa aln -I -t 1 ref/chr8_ref data/reads_02.fq > out/example_ali_reads_02.sai
5   bwa aln -I -t 1 ref/chr8_ref data/reads_03.fq > out/example_ali_reads_03.sai
6   bwa aln -I -t 1 ref/chr8_ref data/reads_04.fq > out/example_ali_reads_04.sai
7   bwa aln -I -t 1 ref/chr8_ref data/reads_05.fq > out/example_ali_reads_05.sai
8   bwa aln -I -t 1 ref/chr8_ref data/reads_06.fq > out/example_ali_reads_06.sai
9   bwa aln -I -t 1 ref/chr8_ref data/reads_07.fq > out/example_ali_reads_07.sai
10  bwa aln -I -t 1 ref/chr8_ref data/reads_08.fq > out/example_ali_reads_08.sai
11  bwa aln -I -t 1 ref/chr8_ref data/reads_09.fq > out/example_ali_reads_09.sai
12  bwa aln -I -t 1 ref/chr8_ref data/reads_10.fq > out/example_ali_reads_10.sai
13  bwa aln -I -t 1 ref/chr8_ref data/reads_11.fq > out/example_ali_reads_11.sai
14  bwa aln -I -t 1 ref/chr8_ref data/reads_12.fq > out/example_ali_reads_12.sai
15  bwa aln -I -t 1 ref/chr8_ref data/reads_13.fq > out/example_ali_reads_13.sai
16  bwa aln -I -t 1 ref/chr8_ref data/reads_14.fq > out/example_ali_reads_14.sai
17  bwa aln -I -t 1 ref/chr8_ref data/reads_15.fq > out/example_ali_reads_15.sai
18  bwa aln -I -t 1 ref/chr8_ref data/reads_16.fq > out/example_ali_reads_16.sai
19  bwa aln -I -t 1 ref/chr8_ref data/reads_17.fq > out/example_ali_reads_17.sai
20  bwa aln -I -t 1 ref/chr8_ref data/reads_18.fq > out/example_ali_reads_18.sai
21  bwa aln -I -t 1 ref/chr8_ref data/reads_19.fq > out/example_ali_reads_19.sai
22  bwa aln -I -t 1 ref/chr8_ref data/reads_20.fq > out/example_ali_reads_20.sai
Then we modify the SLURM array submission script to read from the file of commands and execute each line in parallel.
1   [USERNAME@master test]$ cat ArrayJob_List.sh
2   #!/bin/bash
3   #SBATCH --job-name=ArrayJob_List
4   #SBATCH --output=arrayJob_List_%A_%a.out
5   #SBATCH --ntasks=1
6   #SBATCH --cpus-per-task=1
7   #SBATCH --time=00:30:00
8   #SBATCH --mem-per-cpu=1G
9   #SBATCH --array=0-20
10  #SBATCH --qos=short
11
12  # Load the required software (bwa)
13  module load biotools
14
15  # Put the content of the file of commands into an array
16  readarray -t ARRAY_OF_COMMANDS <list_of_cmd.txt
17
18  # Index the reference genome (ref/chr8.fa). The output files will be renamed with prefix: chr8_ref
19  srun bwa index ref/chr8.fa -p ref/chr8_ref
20
21  # Execute the alignment command assigned to this array task (one line of list_of_cmd.txt)
22  eval srun ${ARRAY_OF_COMMANDS[$SLURM_ARRAY_TASK_ID]}
23
24  exit 0
16. readarray -t ARRAY_OF_COMMANDS <list_of_cmd.txt . We read the content of the file of commands and put each line into an array.
22. eval srun ${ARRAY_OF_COMMANDS[$SLURM_ARRAY_TASK_ID]} . We submit one alignment per component in the array. We need to prefix the command with eval so that the redirection symbol '>' is interpreted by bash.
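The readarray/eval pattern can be checked in an ordinary shell, without srun. A sketch with a hypothetical two-line command file, showing why eval is needed for the '>' redirection:

```shell
#!/bin/bash
# Build a small command file (stand-in for list_of_cmd.txt)
cat > demo_cmd.txt <<'EOF'
echo first  > demo_out_0.txt
echo second > demo_out_1.txt
EOF

readarray -t ARRAY_OF_COMMANDS < demo_cmd.txt   # one command per element

# Without eval, '>' would be passed to echo as a literal argument;
# eval makes the shell re-parse the line so the redirection applies.
SLURM_ARRAY_TASK_ID=1
eval "${ARRAY_OF_COMMANDS[$SLURM_ARRAY_TASK_ID]}"

cat demo_out_1.txt
```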
The problem with SLURM arrays in this example is that the indexing is executed 20 times when it should be executed only once, and it will probably fail because its result files are overwritten 20 times. For this type of job (where part of the work should be executed only once) it is better to run srun in the background. The next section shows the same example running srun in the background:
Parallelization of data: srun in background (not recommended)
As an alternative to the array method, you can use a single job that processes all the files concurrently. Note that this should be the last method to consider, as it is a very inefficient way of working on the cluster. It is useful when you have many files to process but only need to parallelize some of the job steps. Remember that with SLURM arrays you execute the same commands in every component of the array. The template below shows how to run only some parts of your job in parallel.
1   [USERNAME@master test]$ cat FileJob.sh
2   #!/bin/bash
3   #SBATCH --job-name=FileJob        # Job name to show with squeue
4   #SBATCH --output=FileJob_%j.out   # Output file
5   #SBATCH --ntasks=20               # Maximum number of concurrent tasks (job steps)
6   #SBATCH --time=00:30:00           # Time limit to execute the job (30 minutes)
7   #SBATCH --mem-per-cpu=1G          # Required memory per core
8   #SBATCH --cpus-per-task=2         # CPUs assigned per task
9   #SBATCH --qos=short               # QoS: short,medium,long,long-mem
10
11  # Load the required software (bwa)
12  module load biotools
13
14  # Index the reference genome (ref/chr8.fa). The output files will be renamed with prefix: chr8_ref
15  srun -n 1 -c 1 bwa index ref/chr8.fa -p ref/chr8_ref
16
17  for file in data/*
18  do
19      # Align each file of reads to the indexed reference (ref/chr8.fa) using a single cpu (parameter: -t 1)
20      srun -n 1 -c 2 -Q --exclusive bwa aln -I -t 1 ref/chr8_ref $file > out/example_aln_$(basename $file .fq).sai &
21  done
22  wait
23
24  exit 0
Commented lines:
5. --ntasks=20 , by setting the ntasks parameter we can run up to 20 job steps in parallel.
7. --mem-per-cpu=1G , in this type of job you have to specify the memory consumed per cpu.
8. --cpus-per-task=2 , in background jobs you must specify an even number of cpus.
15. srun -n 1 -c 1 bwa index , bwa index is a sequential task, so we cannot parallelize it with more than 1 cpu.
20. srun -n 1 -c 2 -Q --exclusive bwa aln -I -t 1 ref/chr8_ref $file > out/example_aln_$(basename $file .fq).sai & , we iterate over all the files in the data directory. Note that each srun is submitted in background mode (the & symbol at the end), so each job step starts without waiting for the previous iteration to finish. However, since we specified a limit of 20 tasks, SLURM will only run up to 20 job steps concurrently, i.e. up to 20 distinct alignments at once. Remember to specify only one task per srun command (srun -n 1); if you omit this, the same alignment will be submitted 20 times.
22. wait , this directive makes the main job wait until all job steps have finished before exiting.
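The background-and-wait pattern itself is plain Bash and can be sketched without SLURM: each iteration is launched with & (here a dummy subshell stands in for the srun command) and the script blocks at wait until every background process has finished:

```shell
#!/bin/bash
# Launch three dummy "job steps" concurrently (stand-ins for srun commands)
for i in 1 2 3
do
    ( sleep 0.$i; echo "step $i done" > demo_step_$i.txt ) &
done

wait    # block until all background steps have finished
echo "all steps finished"
```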
Attention
For this type of parallelization (background job steps) you must specify an even number of cores in the slurm script: #SBATCH --cpus-per-task=2
and srun -c 2
Warning
You should not submit jobs that require more than 5,000 srun processes running in the background. The system will kill all srun processes beyond this limit.
[USERNAME@master test]$ sbatch FileJob.sh
Submitted batch job 7106
Connect to the internal node where the job is running:
[USERNAME@master test]$ ssh cn00
Check that you have 20 alignments running concurrently.
[USERNAME@osd00 ~]$ top -u USERNAME
top - 10:20:17 up 3 days, 55 min, 1 user, load average: 11,84, 13,12, 10,09
Tasks: 1168 total, 3 running, 1165 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2,8 us, 1,2 sy, 0,0 ni, 93,1 id, 2,6 wa, 0,0 hi, 0,3 si, 0,0 st
MiB Mem : 1546500,+total, 1356580,+free, 161530,3 used, 28390,0 buff/cache
MiB Swap: 4096,0 total, 4096,0 free, 0,0 used. 1378350,+avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1449923 USERNAME 20 0 142440 129924 1992 R 100,0 0,0 0:36.34 /storage/apps/BWA/0.7.17/bin/bwa index ref/chr8.fa -p ref/chr8_ref
top - 10:21:45 up 3 days, 56 min, 1 user, load average: 14,05, 12,64, 10,15
Tasks: 1258 total, 22 running, 1236 sleeping, 0 stopped, 0 zombie
%Cpu(s): 25,6 us, 1,3 sy, 0,0 ni, 72,6 id, 0,2 wa, 0,0 hi, 0,3 si, 0,0 st
MiB Mem : 1546500,+total, 1343804,+free, 171277,3 used, 31419,4 buff/cache
MiB Swap: 4096,0 total, 4096,0 free, 0,0 used. 1368603,+avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1450299 USERNAME 20 0 307940 295308 2232 R 100,0 0,0 0:24.13 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_03.fq
1450307 USERNAME 20 0 307940 295308 2232 R 100,0 0,0 0:24.11 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_07.fq
1450316 USERNAME 20 0 307940 295640 2564 R 100,0 0,0 0:23.92 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_18.fq
1450328 USERNAME 20 0 307940 295716 2636 R 100,0 0,0 0:23.94 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_14.fq
1450336 USERNAME 20 0 307940 295640 2564 R 100,0 0,0 0:23.79 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_04.fq
1450344 USERNAME 20 0 307940 295308 2232 R 100,0 0,0 0:23.95 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_17.fq
1450360 USERNAME 20 0 307940 295576 2496 R 100,0 0,0 0:23.89 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_08.fq
1450366 USERNAME 20 0 307940 295444 2364 R 100,0 0,0 0:23.83 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_10.fq
1450388 USERNAME 20 0 307940 295576 2496 R 100,0 0,0 0:23.76 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_02.fq
1450393 USERNAME 20 0 307940 295644 2568 R 100,0 0,0 0:23.69 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_13.fq
1450399 USERNAME 20 0 307940 295640 2564 R 100,0 0,0 0:23.69 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_05.fq
1450404 USERNAME 20 0 307940 295456 2376 R 100,0 0,0 0:23.78 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_09.fq
1450370 USERNAME 20 0 307940 295440 2364 R 100,0 0,0 0:23.85 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_19.fq
1450377 USERNAME 20 0 307940 295232 2152 R 100,0 0,0 0:23.80 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_20.fq
1450410 USERNAME 20 0 307940 295456 2376 R 100,0 0,0 0:23.80 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_11.fq
1450413 USERNAME 20 0 307940 295312 2232 R 100,0 0,0 0:23.77 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_01.fq
1450323 USERNAME 20 0 307956 295320 2232 R 94,4 0,0 0:23.32 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_12.fq
1450355 USERNAME 20 0 307936 295796 2572 R 69,3 0,0 0:16.62 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_00.fq
1450416 USERNAME 20 0 307936 295504 2316 R 69,3 0,0 0:15.91 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_15.fq
1450373 USERNAME 20 0 307936 295432 2208 R 61,7 0,0 0:15.97 /storage/apps/BWA/0.7.17/bin/bwa aln -I -t 1 ref/chr8_ref data/reads_16.fq
While the job is running you can check which tasks have already finished.
[USERNAME@master test]$ sacct -j 7115
Start End User Account JobID Name QoS Elapsed ReqCPUS AllocCPUS ReqMem MaxRSS TotalCPU State ExitCode
2022-11-18T12:48:10 Unknown USERNAME admin 7115 FileJob short 00:00:04 20 20 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:48:10 Unknown admin 7115.batch batch 00:00:04 20 20 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:48:10 Unknown admin 7115.0 bwa 00:00:04 20 20 1Gc 00:00:00 RUNNING 0:0
[USERNAME@master test]$ sacct -j 7115
Start End User Account JobID Name QoS Elapsed ReqCPUS AllocCPUS ReqMem MaxRSS TotalCPU State ExitCode
2022-11-18T12:48:10 Unknown USERNAME admin 7115 FileJob short 00:02:10 20 20 1Gc 01:54.503 RUNNING 0:0
2022-11-18T12:48:10 Unknown admin 7115.batch batch 00:02:10 20 20 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:48:10 2022-11-18T12:50:07 admin 7115.0 bwa 00:01:57 20 20 1Gc 216728K 01:54.503 COMPLETED 0:0
2022-11-18T12:50:07 Unknown admin 7115.1 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.2 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.3 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.4 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.5 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.6 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.7 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.8 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.9 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.10 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.11 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.12 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.13 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.14 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.15 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.16 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.17 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.18 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.19 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
2022-11-18T12:50:07 Unknown admin 7115.20 bwa 00:00:13 1 1 1Gc 00:00:00 RUNNING 0:0
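To pick out only the finished steps, sacct's --parsable2 and --format options (standard Slurm options) produce delimiter-separated output that is easy to filter. Since the real command only works on the cluster, the sketch below runs the filter over simulated output:

```shell
# On the cluster: sacct -j 7115 --parsable2 --format=JobID,JobName,State > acct.txt
# Here we simulate that output (JobID|JobName|State) to show the filter step.
printf '7115.0|bwa|COMPLETED\n7115.1|bwa|RUNNING\n7115.2|bwa|COMPLETED\n' > acct.txt
awk -F'|' '$3 == "COMPLETED" {print $1}' acct.txt   # prints the ids of finished steps
```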
Parallelization of data using a file of commands (background version - not recommended)
The same concept, this time reading the commands from a file and launching them as background job steps.
[USERNAME@master test]$ cat list_of_cmd.txt
bwa aln -I -t 1 ref/chr8_ref data/reads_00.fq > out/example_ali_reads_00.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_01.fq > out/example_ali_reads_01.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_02.fq > out/example_ali_reads_02.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_03.fq > out/example_ali_reads_03.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_04.fq > out/example_ali_reads_04.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_05.fq > out/example_ali_reads_05.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_06.fq > out/example_ali_reads_06.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_07.fq > out/example_ali_reads_07.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_08.fq > out/example_ali_reads_08.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_09.fq > out/example_ali_reads_09.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_10.fq > out/example_ali_reads_10.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_11.fq > out/example_ali_reads_11.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_12.fq > out/example_ali_reads_12.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_13.fq > out/example_ali_reads_13.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_14.fq > out/example_ali_reads_14.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_15.fq > out/example_ali_reads_15.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_16.fq > out/example_ali_reads_16.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_17.fq > out/example_ali_reads_17.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_18.fq > out/example_ali_reads_18.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_19.fq > out/example_ali_reads_19.sai
bwa aln -I -t 1 ref/chr8_ref data/reads_20.fq > out/example_ali_reads_20.sai
[USERNAME@master test]$ cat FileJob_List.sh
#!/bin/bash
#SBATCH --job-name=FileJob # Job name to show with squeue
#SBATCH --output=FileJob_%j.out # Output file
#SBATCH --ntasks=10 # Maximum number of job steps to run in parallel
#SBATCH --cpus-per-task=2 # 2 CPUs per task (background jobs require an even number)
#SBATCH --time=00:30:00 # Time limit to execute the job (30 minutes)
#SBATCH --mem-per-cpu=1G # Required Memory per core
#SBATCH --qos=short # QoS:short,medium,long,long-mem
# Load the required software (bwa)
module load biotools
# Put the content of the file of commands into an array
readarray -t ARRAY_OF_COMMANDS <list_of_cmd.txt
# Index the reference genome (ref/chr8.fa). The output files will be renamed with prefix: chr8_ref
srun -n 1 -c 2 bwa index ref/chr8.fa -p ref/chr8_ref
# Run each alignment command from the file as a separate background job step (each uses a single cpu: -t 1)
for command in "${ARRAY_OF_COMMANDS[@]}"
do
eval srun -n 1 -c 2 -Q --exclusive $command &
done
wait
exit 0
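The readarray + eval pattern used in FileJob_List.sh can be tried outside Slurm; a minimal sketch with a hypothetical cmds.txt:

```shell
#!/bin/bash
# Read a file of shell commands into an array, then run each one in the
# background. On the cluster the eval line would wrap each command in
# srun -n 1 -c 2 -Q --exclusive, as in FileJob_List.sh.
printf 'echo one > out_one.txt\necho two > out_two.txt\n' > cmds.txt
readarray -t ARRAY_OF_COMMANDS < cmds.txt   # one array element per line
for command in "${ARRAY_OF_COMMANDS[@]}"; do
    eval "$command" &                       # eval is needed so the redirection in each line is honored
done
wait
```

Note that eval is what makes the `> out/...` redirections inside each command line take effect; without it, srun would treat the redirection as literal arguments.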