The most commonly used software is preinstalled on Garnatxa. If you need a widely used package that isn’t available on the system, contact the administrator and request it. If the software is only for testing or for your own use, install it in your own account.

Pre-installed software

The software most commonly used on clusters is already pre-installed on Garnatxa: C/C++ compilers, interpreters such as Python or R, numerical libraries such as BLAS and LAPACK, message passing libraries (e.g. OpenMPI) and bioinformatics tools (samtools, bowtie2, fastp, etc.). In addition, several versions of each package are often installed.

If you are already familiar with an environment modules tool and you are only interested in the list of software available on the cluster, go to the Available software section. Otherwise, please keep reading to learn how to enable the software you want.

Modules environment

The installed software is enabled with the module command. Its main options are:

module avail                  list available software (modules)
module load | add [module]    set up the environment to use the software
module list                   list currently loaded software
module purge                  clear the environment
module spider                 list all possible modules
module show                   show the commands in the module file
module help                   get help

For instance, to list the software that you have loaded and ready to use:

[USERNAME@master ~]$ module list

Currently Loaded Modules:
1) autotools   2) prun/2.2   3) gnu9/9.4.0   4) openmpi4/4.1.4   5) ohpc

To list all the available software that you can load:

    [USERNAME@master ~]$ module avail

    ------------------------------- /opt/ohpc/pub/moduledeps/gnu9 -----------------------------------------
    R/4.1.2    gsl/2.7    mpich/3.4.2-ucx    openblas/0.3.7    openmpi4/4.1.1

    ------------------------------ /storage/apps/modulefiles ----------------------------------------------
R/4.2.1                    (D)    bcftools/1.16          blast/2.13       fastqc/0.11.9                 intel/compiler/2021.5.0 (D)    iqtree2/2.2.0       matlab/R2019b           python/3.8            samtools/1.16
anaconda/anaconda3_2021.11        bedtools/2.30.0        bowtie2/2.4.5    freebayes/1.3.6               intel/debugger/2021.5.0        kmc/3.2.1           matlab/R2022b  (D)      python/3.11    (D)    spades/3.15
anaconda/anaconda3         (D)    biotools/1             bwa/0.7.17       intel/2019.4.243              intel/mpi/2021.5.0             mafft/7.505         metagenomics/1          roary/3.7.0           sra-tools/3.0.0
bbmap/39.01                       biotools/2      (D)    fastp/0.23       intel/compiler/2021.5.0_RT    intel/tbb/2021.5.0             mathematica/12.1    openmpi4/4.1.4 (L,D)    samstats/1.5.1

    ----------------------------- /opt/ohpc/pub/modulefiles ------------------------------------------------
    autotools (L)    charliecloud/0.15    cmake/3.21.3    gnu9/9.4.0 (L)    hwloc/2.5.0    libfabric/1.13.0    ohpc (L)    os    prun/2.2 (L)    singularity/3.7.1    ucx/1.11.2    valgrind/3.18.1

    Where:
    D:  Default Module
    L:  Module is loaded

Note

Available modules are named in the form software/version-toolchain. The toolchain suffix is omitted for many modules, as the listing above shows.

To show information about an installed package:

[USERNAME@master ~]$ module whatis R
R/4.2.1             : Name: R project for statistical computing built with the gnu8 compiler toolchain.
R/4.2.1             : Version: 4.2.1
R/4.2.1             : Category: utility, developer support, user tool
R/4.2.1             : Keywords: Statistics
R/4.2.1             : Description: R is a language and environment for statistical computing and graphics (S-Plus like).
R/4.2.1             : URL http://www.r-project.org/

Note

You can press the tab key while typing the name of a module to show the list of available versions. If you specify only the name of a module and not the version, the system automatically chooses the default version, normally the latest release. Default versions are marked with the (D) character in the output of: module avail

To load the default version of a module:

[USERNAME@master ~]$ module load R

To verify that the module is loaded and ready to use:

[USERNAME@master ~]$ module list
Currently Loaded Modules:
1) autotools   2) prun/2.2   3) gnu9/9.4.0   4) openmpi4/4.1.4   5) ohpc   6) openblas/0.3.7   7) R/4.2.1
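
In a script, the same check can be made without reading the output of module list. The following is a minimal sketch, assuming that the module system exports the colon-separated LOADEDMODULES environment variable (this is normally the case, but it is an assumption about the Garnatxa setup). The function name is_module_loaded is hypothetical.

```shell
#!/bin/bash
# Return success if a module is currently loaded.
# Accepts a bare name (R) or a full name/version (R/4.2.1).
# Assumption: LOADEDMODULES holds the loaded modules separated by ':'.
is_module_loaded() {
    local wanted="$1" entry
    local IFS=':'
    for entry in ${LOADEDMODULES:-}; do
        case "$entry" in
            "$wanted"|"$wanted"/*) return 0 ;;
        esac
    done
    return 1
}
```

For example, a job script could run is_module_loaded R || module load R before calling R.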

Then the software is available to use:

[USERNAME@master ~]$ R

R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

>

To change the version of a package:

[USERNAME@master ~]$ module load R/4.1.2

The following have been reloaded with a version change:
1) R/4.2.1 => R/4.1.2

To unload a module:

[USERNAME@master ~]$ module unload R

Save and restore your modules workspace

Every time you close an SSH session, you will have to reload all your modules the next time you connect to Garnatxa. To avoid reloading each module, you can save the currently loaded modules with the module save command. The saved modules will be loaded automatically at the beginning of each SSH session:

[USERNAME@master ~]$ module save

The modules workspace is saved in .lmod.d/default. You can delete this file to stop loading a default set of modules.

Note

You can also load a particular module at the start of each SSH session. For instance, to load the R module automatically, edit the .bashrc file and add the line module load R to the end of the file. Restart your SSH session and the R module will already be loaded.
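
The .bashrc edit described in this note can also be scripted. The sketch below appends a line to a file only if that exact line is not already present, so running it twice does not duplicate the module load command. The function name add_line_once is hypothetical.

```shell
#!/bin/bash
# Append a line to a file only if the file does not already contain it.
# Arguments: the line to add, the file to edit.
add_line_once() {
    local line="$1" file="$2"
    touch "$file"
    # -x matches whole lines, -F treats the line as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

A possible call is add_line_once 'module load R' "$HOME/.bashrc".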

Create your own modules

If you want to load software that is installed in your own account (or in /storage/projects), you can build your own modules and load them later. For instance, suppose you have a software repository with several packages installed in /home/USERNAME/software/.

First, create a directory for module files in your account:

[USERNAME@master ~]$ mkdir modulefiles

Add this path to the module search path:

[USERNAME@master ~]$ module use $HOME/modulefiles

Create the module file, specifying the path to the binaries and libraries. It is recommended to follow a directory hierarchy that keeps the versions of each package grouped together. For instance, suppose the compiled fastani binaries are in your /home.

Create the module directory and file:

[USERNAME@master ~]$ mkdir /home/USERNAME/modulefiles/fastani/

[USERNAME@master ~]$ vim /home/USERNAME/modulefiles/fastani/1.33
#%Module1.0#####################################################################
##
##  FastANI Module File
##

proc ModulesHelp { } {

    puts stderr " "
    puts stderr "This module provides FastANI"
    puts stderr "\nVersion 1.33\n"
}

module-whatis "Name: FastANI"
module-whatis "Version: 1.33"
module-whatis "Description: FastANI"

set version 1.33

always-load gnu9/9.4.0

set BASE_PATH /home/USERNAME/software/fastani

prepend-path PATH $BASE_PATH/bin/

Load the new module:

[USERNAME@master ~]$ module avail
-------------------------------------------- /home/USERNAME/modulefiles ------------------------------------------
fastani/1.33 (D)

[USERNAME@master ~]$ module load fastani
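
If you maintain several personal packages, writing each module file by hand becomes repetitive. The sketch below generates a minimal module file from a few arguments. It is an illustration only: the function name write_modulefile is hypothetical, and the generated file contains just the whatis lines and the PATH entry, not the help procedure or the always-load line shown above.

```shell
#!/bin/bash
# Write a minimal Tcl module file for a tool installed under a given prefix.
# Arguments: module root directory, tool name, version, install prefix.
write_modulefile() {
    local root="$1" name="$2" version="$3" prefix="$4"
    mkdir -p "$root/$name"
    cat > "$root/$name/$version" <<EOF
#%Module1.0
module-whatis "Name: $name"
module-whatis "Version: $version"
prepend-path PATH $prefix/bin
EOF
}
```

For the example above the call would be write_modulefile "$HOME/modulefiles" fastani 1.33 "$HOME/software/fastani", followed by module use $HOME/modulefiles.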

Available software

Garnatxa has the main applications used in bioinformatics environments installed. These tools are updated a few times a year or on demand. If you can’t find the tool you need, contact the data center administrator. Alternatively, you can install the software in your /home.

To simplify work with bioinformatics tools, you can load a module that contains a set of commonly used bioinformatics software. The packages are updated to the latest version every year. The latest release of the Biotools module contains the packages listed below:

Name           Version
NCBI-BLAST     2.13
samtools       1.16
bwa            0.7.17
bowtie2        2.4.5
fastqc         0.12.1
fastp          0.23
BBMap          39.01
bedtools       2.30.0
mafft          7.505
iqtree2        2.2.0
SRA Toolkit    3.0.0
bcftools       1.17
KMC            3.2.1

To load all these tools you only have to type: module load biotools. You can also load individual packages: module load samtools

Mamba / Anaconda environments

Anaconda provides Python and a long list of packages that are ready to be used in bioinformatics environments. To use Anaconda on Garnatxa, load the required anaconda module and then create your own conda environment. After this you can install all the packages you require and use them in a job.

However, conda is very slow at resolving a long list of dependencies. As a recommended alternative you can use Mamba, which works like conda but resolves dependencies much faster.

Mamba accepts the same parameters as conda. The mamba and conda commands are provided by the same software module.

The steps below describe how to create a new conda or mamba environment and install and use some bioinformatics packages:

First load the anaconda module:

[USERNAME@master ~]$ module load anaconda
(base) [USERNAME@master ~]$

Warning

We strongly recommend using the mamba command, but if you want to use conda then replace mamba with conda in the following commands.

Then create a new conda environment. You must specify a name for the environment with the -n parameter (e.g. biotools). Remember to specify the Python version: 2 or 3.

    (base) [USERNAME@master ~]$ mamba create -n biotools python=3
    Collecting package metadata (current_repodata.json): done
    Solving environment: done

    ## Package Plan ##

    environment location: /home/USERNAME/.conda/envs/biotools

    added / updated specs:
- python=3


    The following packages will be downloaded:

package                    |            build
---------------------------|-----------------
ca-certificates-2022.10.11 |       h06a4308_0         124 KB
certifi-2022.9.24          |  py310h06a4308_0         154 KB
pip-22.2.2                 |  py310h06a4308_0         2.4 MB
python-3.10.6              |       haa1d7c7_1        21.9 MB
readline-8.2               |       h5eee18b_0         357 KB
setuptools-65.5.0          |  py310h06a4308_0         1.2 MB
tzdata-2022f               |       h04d1e81_0         115 KB
zlib-1.2.13                |       h5eee18b_0         103 KB
------------------------------------------------------------
                                       Total:        26.3 MB

    The following NEW packages will be INSTALLED:

    _libgcc_mutex      pkgs/main/linux-64::_libgcc_mutex-0.1-main
    _openmp_mutex      pkgs/main/linux-64::_openmp_mutex-5.1-1_gnu
    bzip2              pkgs/main/linux-64::bzip2-1.0.8-h7b6447c_0
    ca-certificates    pkgs/main/linux-64::ca-certificates-2022.10.11-h06a4308_0
    certifi            pkgs/main/linux-64::certifi-2022.9.24-py310h06a4308_0
    ld_impl_linux-64   pkgs/main/linux-64::ld_impl_linux-64-2.38-h1181459_1
    libffi             pkgs/main/linux-64::libffi-3.3-he6710b0_2
    libgcc-ng          pkgs/main/linux-64::libgcc-ng-11.2.0-h1234567_1
    libgomp            pkgs/main/linux-64::libgomp-11.2.0-h1234567_1
    libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-11.2.0-h1234567_1
    libuuid            pkgs/main/linux-64::libuuid-1.0.3-h7f8727e_2
    ncurses            pkgs/main/linux-64::ncurses-6.3-h5eee18b_3
    openssl            pkgs/main/linux-64::openssl-1.1.1q-h7f8727e_0
    pip                pkgs/main/linux-64::pip-22.2.2-py310h06a4308_0
    python             pkgs/main/linux-64::python-3.10.6-haa1d7c7_1
    readline           pkgs/main/linux-64::readline-8.2-h5eee18b_0
    setuptools         pkgs/main/linux-64::setuptools-65.5.0-py310h06a4308_0
    sqlite             pkgs/main/linux-64::sqlite-3.39.3-h5082296_0
    tk                 pkgs/main/linux-64::tk-8.6.12-h1ccaba5_0
    tzdata             pkgs/main/noarch::tzdata-2022f-h04d1e81_0
    wheel              pkgs/main/noarch::wheel-0.37.1-pyhd3eb1b0_0
    xz                 pkgs/main/linux-64::xz-5.2.6-h5eee18b_0
    zlib               pkgs/main/linux-64::zlib-1.2.13-h5eee18b_0


    Proceed ([y]/n)? y


    Downloading and Extracting Packages
    setuptools-65.5.0    | 1.2 MB    | ################################################################################################################################################################################################## |         100%
    pip-22.2.2           | 2.4 MB    | ################################################################################################################################################################################################## |         100%
    tzdata-2022f         | 115 KB    | ################################################################################################################################################################################################## |         100%
    zlib-1.2.13          | 103 KB    | ################################################################################################################################################################################################## |         100%
    python-3.10.6        | 21.9 MB   | ################################################################################################################################################################################################## |         100%
    readline-8.2         | 357 KB    | ################################################################################################################################################################################################## |         100%
    certifi-2022.9.24    | 154 KB    | ################################################################################################################################################################################################## |         100%
    ca-certificates-2022 | 124 KB    | ################################################################################################################################################################################################## |         100%
    Preparing transaction: done
    Verifying transaction: done
    Executing transaction: done
    #
    # To activate this environment, use
    #
    #     $ mamba activate biotools
    #
    # To deactivate an active environment, use
    #
    #     $ mamba deactivate

Activate the new environment:

(base) [USERNAME@master ~]$ mamba activate biotools
(biotools) [USERNAME@master ~]$

Install the software:

    (biotools) [USERNAME@master ~]$ mamba install bowtie2
    Collecting package metadata (current_repodata.json): done
    Solving environment: done

    ## Package Plan ##

    environment location: /home/USERNAME/.conda/envs/biotools

    added / updated specs:
- bowtie2


    The following packages will be downloaded:

package                    |            build
---------------------------|-----------------
python-3.7.13              |       haa1d7c7_1        40.7 MB
setuptools-65.5.0          |   py37h06a4308_0         1.1 MB
------------------------------------------------------------
                                       Total:        41.8 MB

    The following NEW packages will be INSTALLED:

    bowtie2            bioconda/linux-64::bowtie2-2.2.5-py37h6bb024c_5
    perl               pkgs/main/linux-64::perl-5.26.2-h14c3975_0

    The following packages will be DOWNGRADED:

    certifi                         2022.9.24-py310h06a4308_0 --> 2022.9.24-py37h06a4308_0
    pip                                22.2.2-py310h06a4308_0 --> 22.2.2-py37h06a4308_0
    python                                  3.10.6-haa1d7c7_1 --> 3.7.13-haa1d7c7_1
    setuptools                         65.5.0-py310h06a4308_0 --> 65.5.0-py37h06a4308_0


    Proceed ([y]/n)? y


    Downloading and Extracting Packages
    python-3.7.13        | 40.7 MB   | ################################################################################################################################################################################### | 100%
    setuptools-65.5.0    | 1.1 MB    | ################################################################################################################################################################################### | 100%
    Preparing transaction: done
    Verifying transaction: done
    Executing transaction: done

Finally, use the installed software:

(biotools) [USERNAME@master ~]$ bowtie2 --version
/home/USERNAME/.conda/envs/biotools/bin/bowtie2-align-s version 2.2.5
64-bit
Built on default-0bf8b44e-835b-4e12-b9cf-ba75185f6582
Mon May 13 19:06:45 UTC 2019
Compiler: gcc version 7.3.0 (crosstool-NG 1.23.0.449-a04d0)
Options: -O3 -m64 -msse2  -funroll-loops -g3 -DPOPCNT_CAPABILITY
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}

Warning

Do not try to use modules from our module system as well as packages installed in a conda environment in the same job. If you use a conda environment, install everything you require for that job inside of your conda environment. Mixing modules from our software system with packages installed in conda environments will fail or cause major issues.

To deactivate your current conda environment:

(biotools) [USERNAME@master ~]$ mamba deactivate
(base) [USERNAME@master ~]$

To permanently remove an environment:

(base) [USERNAME@master ~]$ mamba env remove -n biotools

To list the available environments:

(base) [USERNAME@master ~]$ mamba info --envs
# conda environments:
#
DRAM                     /home/USERNAME/.conda/envs/DRAM
bio                      /home/USERNAME/.conda/envs/bio
bionew                   /home/USERNAME/.conda/envs/bionew
base                  *  /storage/apps/ANACONDA/anaconda3
bioenv                   /storage/apps/ANACONDA/anaconda3/envs/bioenv
metagenomicenv           /storage/apps/ANACONDA/anaconda3/envs/metagenomicenv

To export a conda environment and replicate it as a new environment:

(base) [USERNAME@master ~]$ mamba activate biotools
(biotools) [USERNAME@master ~]$ mamba env export --file environment.yml
(biotools) [USERNAME@master ~]$ vi environment.yml    # on the first line, change "name: biotools" to "name: biotools2"
(biotools) [USERNAME@master ~]$ mamba env create -f environment.yml
(biotools) [USERNAME@master ~]$ mamba activate biotools2
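
The manual edit of environment.yml can be replaced by a small sed command when the export has to be repeated. This is a sketch under two assumptions: the exported file has a line beginning with "name: ", and it may have a line beginning with "prefix: " that records the original location and is dropped here. The function name rename_env_file is hypothetical.

```shell
#!/bin/bash
# Copy an exported environment file, setting a new environment name
# and dropping the machine-specific "prefix:" line if there is one.
# Arguments: source file, destination file, new environment name.
rename_env_file() {
    local src="$1" dst="$2" new_name="$3"
    sed -e "s/^name: .*/name: $new_name/" -e '/^prefix: /d' "$src" > "$dst"
}
```

For example, rename_env_file environment.yml environment2.yml biotools2 writes a copy that can be passed to mamba env create -f environment2.yml while leaving the original export untouched.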

Podman/Singularity vs Docker

The use of Docker containers on Garnatxa is prohibited for security reasons. However, you can use a similar container platform such as Podman or Singularity. Podman supports Docker images and provides a docker alias to run containers, so you can use the docker command in the usual way.

Warning

Docker is not installed on Garnatxa because its design presents potential security issues for shared platforms with multiple users. Podman and Singularity, on the other hand, can be run by end users entirely within “user space”, that is, no special administrative privileges need to be assigned to a user in order for them to run and interact with containers on a platform where Podman or Singularity has been installed.

Singularity is a container platform. In some ways it appears similar to Docker from a user perspective, but in others, particularly in the system’s architecture, it is fundamentally different. These differences mean that Singularity is particularly well-suited to running on distributed, High Performance Computing (HPC) infrastructure.

Singularity images are a little different. Singularity uses the Singularity Image Format (SIF) and images are provided as single SIF files (with a .sif filename extension). Singularity images can be pulled from Singularity Hub, a registry for container images. Singularity is also capable of running containers based on images pulled from Docker Hub and some other sources.

You must load the singularity module before you can run a container.

$ module load singularity

Pulling images from Singularity Hub:

$ singularity pull hello-world.sif shub://vsoch/hello-world

Pulling images from Docker Hub:

$ singularity pull python-3.9.6.sif docker://python:3.9.6-slim-buster

Pulling images from external locations:

$ singularity pull --name fastqc-0.11.9--0.sif https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0

Checking the container information:

$ singularity inspect fastqc-0.11.9--0.sif
org.label-schema.build-date: Saturday_25_January_2020_5:26:17_UTC
org.label-schema.schema-version: 1.0
org.label-schema.usage.singularity.deffile.bootstrap: docker
org.label-schema.usage.singularity.deffile.from: quay.io/biocontainers/fastqc:0.11.9--0
org.label-schema.usage.singularity.version: 3.3.0-1

If we know the path of an executable that we want to run within a container, we can use the singularity exec command:

$ singularity exec fastqc-0.11.9--0.sif fastqc -h

Above we used the singularity exec command but we can use singularity run. To clarify, the difference between these two commands is:

  • singularity run: This will run the default command set for containers based on the specified image. This default command is set within the image metadata when the image is built. You do not specify a command to run when using singularity run, you simply specify the image file name. As we saw earlier, you can use the singularity inspect command to see what command is run by default when starting a new container based on an image.

  • singularity exec: This will start a container based on the specified image and run the command provided on the command line following singularity exec <image file name>. This will override any default command specified within the image metadata that would otherwise be run if you used singularity run.

You will sometimes need to bind additional host system directories into a container you are using, over and above those bound by default. For example:

  • There may be a shared dataset in a shared location that you need access to in the container.

  • You may require executables and software libraries from the host system in the container.

The -B option to the singularity command is used to specify additional binds. For example, to bind the /storage/group/shared directory into a container you could use (note this directory is unlikely to exist on the host system you are using so you’ll need to test this using a different directory):

singularity shell -B /storage/group/shared hello-world.sif

The following paths are bound by default and do not need to be mounted explicitly (for 3.6.x): $HOME, /sys:/sys, /proc:/proc, /tmp:/tmp, /var/tmp:/var/tmp, /etc/resolv.conf:/etc/resolv.conf, /etc/passwd:/etc/passwd, and $PWD. Any other path must be bound explicitly (syntax: host:container).
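
Because binding a directory that does not exist on the host makes the command fail, it can help to filter the list of directories first. The sketch below prints the existing directories joined by commas, which is the form the -B option accepts for several bind paths. The function name existing_binds is hypothetical, and paths containing commas are not handled.

```shell
#!/bin/bash
# Print the directories that exist on the host as a comma-separated list,
# suitable as the argument of the -B option.
existing_binds() {
    local dir out=""
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            # Add a comma only when the list already has an entry.
            out="${out:+$out,}$dir"
        fi
    done
    printf '%s\n' "$out"
}
```

A possible use is singularity shell -B "$(existing_binds /storage/group/shared /storage/projects)" hello-world.sif. If none of the directories exist the list is empty, so check the result before passing it to -B.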

Important information before the hands-on examples:

Singularity containers are able to run in batch or interactive sessions.

  • X-Sessions aren’t possible.

  • You can’t write outside of HOME or /home/<USERNAME>

In the following examples, we will run a bwa container and test it with the /doc/test data set.

$ module load singularity
$ cp -pr /doc/test/ .
$ cd test
$ singularity pull bwa_v0.7.17.sif oras://registry.forgemia.inra.fr/gafl/singularity/bwa/bwa:latest
INFO:    Downloading oras image

  1. Submitting a Singularity container in interactive mode (only to test)

$ interactive
$ singularity run bwa_v0.7.17.sif index ref/chr8.fa -p ref/chr8_ref
$ singularity run bwa_v0.7.17.sif aln -I -t 1 ref/chr8_ref data/reads_00.fq > out/example_aln.sai

  2. Submitting a Singularity job in batch mode

$ cat SingularityJob.sh

#!/bin/bash
#SBATCH --job-name=singularityTest          # Job name (showed with squeue)
#SBATCH --output=singularityTest_%j.out     # Standard output and error log
#SBATCH --ntasks=1                          # Required only 1 task
#SBATCH --cpus-per-task=1                   # Required only 1 cpu
#SBATCH --mem=10G                           # Required 10GB of memory
#SBATCH --time=00:05:00                     # Required 5 minutes of execution.
#SBATCH --qos=short                         # QoS: short,medium,long,long-mem

# Load the required software (singularity)
module load singularity

# Index the reference genome (ref/chr8.fa). The output files will be named with the prefix chr8_ref
srun singularity run bwa_v0.7.17.sif index ref/chr8.fa -p ref/chr8_ref

# Align a single file of reads (data/reads_00.fq) to the indexed reference (ref/chr8_ref). We are using a single cpu (parameter: -t 1)
srun singularity run bwa_v0.7.17.sif aln -I -t 1 ref/chr8_ref data/reads_00.fq > out/example_aln.sai

exit 0

$ sbatch SingularityJob.sh

Building our own container image in Singularity: we can build our own .sif image and permanently install the software we require. This must be done on our own computer, with administrator permissions.

  1. Download a base image from an external repository. This creates a directory tree where we can install the desired software:

$ singularity build --sandbox python_3.12-stretch docker://python:3.7.3-stretch
INFO:    Starting build...
Getting image source signatures
Copying blob 39aa0c89bda1 done
Copying blob 494c27a8a6b8 done
Copying blob 6f2f362378c5 done
Copying blob 7596bb83081b done
Copying blob 615db220d76c done
Copying blob 372744b62d49 done
Copying blob ac275157d894 done
Copying blob 98d16dec829a done
Copying blob c8514b1c6524 done
Copying config 5aa56d9ce3 done
Writing manifest to image destination
Storing signatures
2024/08/09 13:02:35  info unpack layer: sha256:6f2f362378c5a6fd915d96d11dda1e0223ccf213bf121ace56ae0f6616ea1dc8
2024/08/09 13:02:40  info unpack layer: sha256:494c27a8a6b820f9167ec7e368b3a9bb47d7029f4dc8c97c67091f3757a5bc4e
2024/08/09 13:02:40  info unpack layer: sha256:7596bb83081b6c8410df557d538a0ae45922cbf81e469c6f4cfa835247cb24ab
2024/08/09 13:02:41  info unpack layer: sha256:372744b62d49eba993652ee4a1201801fe278b687d85489101e07e7b9a4900e0
2024/08/09 13:02:45  info unpack layer: sha256:615db220d76c063138a2e6c5849703a7a80d682a682f7e1a841e6e7ed5f43879
2024/08/09 13:02:58  info unpack layer: sha256:39aa0c89bda1ee16e94ab7039cb0b9a9fce8a390769c2194aaf5fdf0ae1a2bdd
2024/08/09 13:02:59  info unpack layer: sha256:ac275157d894bedd09171a43b2b24ee6e7587a9544a1eae42ef2ea6b60584100
2024/08/09 13:03:00  info unpack layer: sha256:98d16dec829a865dcb9bad110c7f1fc04ceecb52cb00f4b37f85b592aa68089d
2024/08/09 13:03:00  info unpack layer: sha256:c8514b1c6524ef491c388ca3114b7f7e969e0e7507ea9efac08e663b982ec5d1
INFO:    Creating sandbox directory...
INFO:    Build complete: python_3.12-stretch

  2. Install the software. For instance, we will install the Python package named bwa with pip:

$ singularity exec --writable python_3.12-stretch pip3 install --upgrade pip

$ singularity exec --writable python_3.12-stretch pip3 install bwa
Collecting bwa
Downloading bwa-1.1.1-py3-none-any.whl.metadata (7.0 kB)
Collecting python-telegram-bot (from bwa)
Downloading python_telegram_bot-20.3-py3-none-any.whl.metadata (15 kB)
Collecting requests (from bwa)
Downloading requests-2.31.0-py3-none-any.whl.metadata (4.6 kB)
Collecting yagmail>=0.11.214 (from bwa)
Downloading yagmail-0.15.293-py2.py3-none-any.whl.metadata (2.9 kB)
Collecting premailer (from yagmail>=0.11.214->bwa)
Downloading premailer-3.10.0-py2.py3-none-any.whl.metadata (15 kB)
Collecting httpx~=0.24.0 (from python-telegram-bot->bwa)
Downloading httpx-0.24.1-py3-none-any.whl.metadata (7.4 kB)
Collecting charset-normalizer<4,>=2 (from requests->bwa)
Downloading charset_normalizer-3.3.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (33 kB)
Collecting idna<4,>=2.5 (from requests->bwa)
Downloading idna-3.7-py3-none-any.whl.metadata (9.9 kB)
Collecting urllib3<3,>=1.21.1 (from requests->bwa)
Downloading urllib3-2.0.7-py3-none-any.whl.metadata (6.6 kB)
Collecting certifi>=2017.4.17 (from requests->bwa)
Downloading certifi-2024.7.4-py3-none-any.whl.metadata (2.2 kB)
Collecting httpcore<0.18.0,>=0.15.0 (from httpx~=0.24.0->python-telegram-bot->bwa)
Downloading httpcore-0.17.3-py3-none-any.whl.metadata (18 kB)
Collecting sniffio (from httpx~=0.24.0->python-telegram-bot->bwa)
Downloading sniffio-1.3.1-py3-none-any.whl.metadata (3.9 kB)
Collecting lxml (from premailer->yagmail>=0.11.214->bwa)
Downloading lxml-5.2.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.4 kB)
Collecting cssselect (from premailer->yagmail>=0.11.214->bwa)
Downloading cssselect-1.2.0-py2.py3-none-any.whl.metadata (2.2 kB)
Collecting cssutils (from premailer->yagmail>=0.11.214->bwa)
Downloading cssutils-2.7.1-py3-none-any.whl.metadata (9.3 kB)
Collecting cachetools (from premailer->yagmail>=0.11.214->bwa)
Downloading cachetools-5.4.0-py3-none-any.whl.metadata (5.3 kB)
Collecting h11<0.15,>=0.13 (from httpcore<0.18.0,>=0.15.0->httpx~=0.24.0->python-telegram-bot->bwa)
Downloading h11-0.14.0-py3-none-any.whl.metadata (8.2 kB)
Collecting anyio<5.0,>=3.0 (from httpcore<0.18.0,>=0.15.0->httpx~=0.24.0->python-telegram-bot->bwa)
Downloading anyio-3.7.1-py3-none-any.whl.metadata (4.7 kB)
Collecting importlib-metadata (from cssutils->premailer->yagmail>=0.11.214->bwa)
Downloading importlib_metadata-6.7.0-py3-none-any.whl.metadata (4.9 kB)
Collecting exceptiongroup (from anyio<5.0,>=3.0->httpcore<0.18.0,>=0.15.0->httpx~=0.24.0->python-telegram-bot->bwa)
Downloading exceptiongroup-1.2.2-py3-none-any.whl.metadata (6.6 kB)
Collecting typing-extensions (from anyio<5.0,>=3.0->httpcore<0.18.0,>=0.15.0->httpx~=0.24.0->python-telegram-bot->bwa)
Downloading typing_extensions-4.7.1-py3-none-any.whl.metadata (3.1 kB)
Collecting zipp>=0.5 (from importlib-metadata->cssutils->premailer->yagmail>=0.11.214->bwa)
Downloading zipp-3.15.0-py3-none-any.whl.metadata (3.7 kB)
Downloading bwa-1.1.1-py3-none-any.whl (13 kB)
Downloading yagmail-0.15.293-py2.py3-none-any.whl (17 kB)
Downloading python_telegram_bot-20.3-py3-none-any.whl (545 kB)
        ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 545.4/545.4 kB 24.2 MB/s eta 0:00:00
Downloading requests-2.31.0-py3-none-any.whl (62 kB)
        ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.6/62.6 kB 10.4 MB/s eta 0:00:00
Downloading certifi-2024.7.4-py3-none-any.whl (162 kB)
        ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 163.0/163.0 kB 18.7 MB/s eta 0:00:00
Downloading charset_normalizer-3.3.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (136 kB)
        ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 136.8/136.8 kB 17.2 MB/s eta 0:00:00
Downloading httpx-0.24.1-py3-none-any.whl (75 kB)
        ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 75.4/75.4 kB 12.0 MB/s eta 0:00:00
Downloading idna-3.7-py3-none-any.whl (66 kB)
        ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 66.8/66.8 kB 11.1 MB/s eta 0:00:00
Downloading urllib3-2.0.7-py3-none-any.whl (124 kB)
        ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 kB 21.8 MB/s eta 0:00:00
Downloading premailer-3.10.0-py2.py3-none-any.whl (19 kB)
Downloading httpcore-0.17.3-py3-none-any.whl (74 kB)
        ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 74.5/74.5 kB 12.2 MB/s eta 0:00:00
Downloading sniffio-1.3.1-py3-none-any.whl (10 kB)
Downloading cachetools-5.4.0-py3-none-any.whl (9.5 kB)
Downloading cssselect-1.2.0-py2.py3-none-any.whl (18 kB)
Downloading cssutils-2.7.1-py3-none-any.whl (399 kB)
        ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 399.7/399.7 kB 36.0 MB/s eta 0:00:00
Downloading lxml-5.2.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.9/4.9 MB 110.2 MB/s eta 0:00:00
Downloading anyio-3.7.1-py3-none-any.whl (80 kB)
        ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 80.9/80.9 kB 13.1 MB/s eta 0:00:00
Downloading h11-0.14.0-py3-none-any.whl (58 kB)
        ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.3/58.3 kB 10.8 MB/s eta 0:00:00
Downloading importlib_metadata-6.7.0-py3-none-any.whl (22 kB)
Downloading typing_extensions-4.7.1-py3-none-any.whl (33 kB)
Downloading zipp-3.15.0-py3-none-any.whl (6.8 kB)
Downloading exceptiongroup-1.2.2-py3-none-any.whl (16 kB)
Installing collected packages: zipp, urllib3, typing-extensions, sniffio, lxml, idna, exceptiongroup, cssselect, charset-normalizer, certifi, cachetools, requests, importlib-metadata, h11, anyio, httpcore, cssutils, premailer, httpx, yagmail, python-telegram-bot, bwa
Successfully installed anyio-3.7.1 bwa-1.1.1 cachetools-5.4.0 certifi-2024.7.4 charset-normalizer-3.3.2 cssselect-1.2.0 cssutils-2.7.1 exceptiongroup-1.2.2 h11-0.14.0 httpcore-0.17.3 httpx-0.24.1 idna-3.7 importlib-metadata-6.7.0 lxml-5.2.2 premailer-3.10.0 python-telegram-bot-20.3 requests-2.31.0 sniffio-1.3.1 typing-extensions-4.7.1 urllib3-2.0.7 yagmail-0.15.293 zipp-3.15.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
  1. Try the installed software. Remember that you have not yet built the final image of the container; that is done in the next step.

$ singularity exec python_3.12-stretch  python3 -c "import bwa"
  1. Build the image. This can take a long time:

$ singularity build python_3.12-stretch.sif python_3.12-stretch

Now you can copy the container image to Garnatxa and run it there.

Git concepts

This guide is a basic tutorial covering the initial concepts of Git. For more advanced topics, see the official guide: https://git-scm.com/book/en/v2/

Git is a Version Control System (VCS) that records changes to a file or set of files over time so that you can recall specific versions later. It allows you to revert selected files back to a previous state, revert the entire project back to a previous state, compare changes over time, see who last modified something that might be causing a problem, who introduced an issue and when, and more. Using a VCS also generally means that if you make a mistake or lose files, you can easily recover. In addition, you get all this for very little overhead. But what types of files can be tracked? In practice nearly any type of file on a computer, although version control systems were designed for versioning source code.

Warning

Garnatxa provides a remote repository to push Git files. It should only be used to store source code. Avoid uploading non-ASCII files such as binaries, databases, compressed files, etc. Files of these types may be rejected when you try to store them in the repository.

Git is a Distributed Version Control System (DVCS). In a DVCS (such as Git, Mercurial, Bazaar or Darcs), clients don’t just check out the latest snapshot of the files; rather, they fully mirror the repository, including its full history. Thus, if any server dies, and these systems were collaborating via that server, any of the client repositories can be copied back up to the server to restore it. Every clone is really a full backup of all the data.
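
As a minimal sketch of the idea that every clone is a full backup, the following commands build a throwaway repository with two commits, clone it and count the commits available in the clone. The directory names, the file notes.txt and the demo identity are placeholders chosen for illustration only.

```shell
# Work in a temporary directory so nothing in your account is touched.
demo=$(mktemp -d)

# Create a small repository with two commits (identity set locally, for the demo only).
git init -q "$demo/original"
git -C "$demo/original" config user.name "Demo User"
git -C "$demo/original" config user.email "demo.user@example.com"
echo "first version" > "$demo/original/notes.txt"
git -C "$demo/original" add notes.txt
git -C "$demo/original" commit -q -m "First commit"
echo "second version" > "$demo/original/notes.txt"
git -C "$demo/original" commit -q -am "Second commit"

# Clone it: the clone receives the whole history, not just the latest snapshot.
git clone -q "$demo/original" "$demo/copy"
commits_in_copy=$(git -C "$demo/copy" rev-list --count HEAD)
echo "Commits available in the clone: $commits_in_copy"
```

If the original directory were lost, the copy would still contain both commits and could be used to recreate it.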

Main features

  • Backup & Restore: Save and edit files.

  • Synchronization: Share source code and update your local repository to the latest version.

  • Undo changes: Any change can be undone to go back to an older version.

  • Track changes: It is possible to track how files evolve and the differences between versions. Each change to the source code can be described so that it is easier to understand.

  • Track ownership: Git is a multiuser platform, so everybody can see who modified a file and when.

  • Sandboxing: Test code can be developed without interfering with stable code.

  • Branching & Merging: The test code (or any other source code) can be merged into the main line of development and vice versa.

  • Distributed and offline availability: Every user keeps a copy of all changes to the files (local repository). After committing a change they can push it to a remote repository, but if this remote location fails the data remains available in the local repositories.

../_images/gitlab8.png

Figure 1. Example of workflow in Git.

During the life cycle of a project a set of files can be modified many times. It is also possible to work on several lines of development of the same files in parallel (branches) and then merge the resulting sources into a stable release. Git allows you to track changes in files and recover the state of the project at any point in time, so it is possible to return to a specific version of a project. In addition, Git allows multi-user collaborative development: several users can modify the same source code without interfering with each other. The changes made to the files are reviewed, agreed and merged before obtaining a stable version in Git.

../_images/gitlab9.png

Figure 2. Files A, B and C are modified several times over time. Git allows you to commit snapshots of the project and tag them in order to preserve changes. You can return to each version of the project and compare differences.

It is very important to understand the stages and workspaces that a file goes through in Git. In Git we always work with files stored on our personal computer. We can identify the following working spaces:

  • Working directory: The entire project workspace. We can modify files as many times as we want. Changes to those files are not yet committed.

  • Staging Area: Set of files modified since the last commit that we have selected for the next one. We put in this area all the files to be committed.

  • Git Repository: Committed files. This stage includes the files that were recorded as a new snapshot of the project. You can return to old commits.

  • Remote server: It is used to push committed files to an external server. This server acts as a backup of the whole project. The Git project can be downloaded from this server to other locations.

Note

Garnatxa provides a remote Git repository to upload your code project. Check the Garnatxa’s Gitlab service section in order to add a new Git repository in the remote server.

../_images/gitlab10.png

Figure 3. Stages in git.
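
The stages above can be observed with git status --short. The following is a small sketch in a throwaway repository; the directory name, the file job.sh and the demo identity are assumptions made only for this example.

```shell
demo=$(mktemp -d)
git init -q "$demo/stages"
git -C "$demo/stages" config user.name "Demo User"
git -C "$demo/stages" config user.email "demo.user@example.com"

# 1) Working directory: a new file is untracked ("??").
echo 'echo "hello"' > "$demo/stages/job.sh"
state_untracked=$(git -C "$demo/stages" status --short)

# 2) Staging area: after git add the file is marked as added ("A ").
git -C "$demo/stages" add job.sh
state_staged=$(git -C "$demo/stages" status --short)

# 3) Git repository: after git commit nothing is pending.
git -C "$demo/stages" commit -q -m "Add job.sh"
state_committed=$(git -C "$demo/stages" status --short)

printf 'untracked: [%s]\nstaged:    [%s]\ncommitted: [%s]\n' \
    "$state_untracked" "$state_staged" "$state_committed"
```

The fourth stage, the remote server, only comes into play when git push is used.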

The general procedure to develop in Garnatxa is:

1- Develop your software on your personal computer or workstation. Test your code and produce a stable version in Git. Upload this release to the remote Git server (provided by Garnatxa).

2- Once you have uploaded a stable release to the Garnatxa repository (Gitlab), connect to Garnatxa and clone the project or download the changes to your code from the remote Git repository. Test the code in an interactive session or submit the jobs in the queue system.

3- Finally, submit jobs to Garnatxa.

The following sections provide a summary of the basic actions to handle Git on your local computer and upload changes to a remote server. You only need a personal computer with Git installed.

Preliminary actions

For this tutorial we are going to use a directory with some already existing source files. In this case we assume that our project already exists, but it is also possible to create an empty project and start developing in Git from scratch.

USERNAME@localhost:~$ scp -r USERNAME@garnatxa.uv.es:/doc/test .
USERNAME@localhost:~$ cd test
USERNAME@localhost:~/test$ ls

Create an empty project in Gitlab

We need to create a new project in Gitlab (the Git remote repository provided by Garnatxa). You can obtain more information about how to initialize a Gitlab project here: Garnatxa’s Gitlab service. Follow the steps in the guide to create a new project with the name: test (do not create a README.md file).

Initialize the local project

The following steps are only needed the first time you configure a new project in Git:

1. Configure some global variables in Git and initialize the local project. The git init command creates a .git directory with some internal configuration settings inside it.

localhost$~/test git config --global init.defaultBranch main
localhost$~/test git config --global user.name "USERNAME"
localhost$~/test git config --global user.email user.surname@example.com
localhost$~/test git init
Initialized empty Git repository in /home/USERNAME/test/.git/

2. Add the remote address of your test project in Gitlab to the local Git configuration. To get the address, go to Gitlab, open Projects->test->Clone->Clone with SSH and copy (ctrl+C) the text: git@garnatxagitlab.uv.es:USER_FIRSTNAME.USER_LASTNAME/test.git

localhost@$~/test$ git remote add origin git@garnatxagitlab.uv.es:user.surname/test.git

3. A branch in Git is a line of development that can coexist in parallel with other versions of the same code. For example, we can have a main branch where we only develop stable content and release stable versions (it is the branch from which our clients should download the stable versions) and at the same time another branch in which we work to solve a problem in the code. This way we can work on the bug without interfering with the stable version. When we fix the bug it will be possible to join the two branches into one, usually the main one. Gitlab and GitHub use the name main for the default branch. We make sure that our branch is called main with: git branch -m main

localhost@$~/test$ git branch -m main
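
The branch workflow described in step 3 can be sketched as follows. This is only an illustration in a temporary repository: the branch name bugfix, the file names and the demo identity are made up for the example.

```shell
demo=$(mktemp -d)
repo="$demo/branches"
git init -q "$repo"
git -C "$repo" config user.name "Demo User"
git -C "$repo" config user.email "demo.user@example.com"

# Stable line of development.
echo "stable code" > "$repo/app.txt"
git -C "$repo" add app.txt
git -C "$repo" commit -q -m "Stable release"
git -C "$repo" branch -M main          # make sure the branch is called main

# Parallel branch to work on a bug without touching main.
git -C "$repo" checkout -q -b bugfix
echo "fix for the bug" > "$repo/fix.txt"
git -C "$repo" add fix.txt
git -C "$repo" commit -q -m "Fix the bug"

# Back on main the fix is not there yet...
git -C "$repo" checkout -q main
# ...until the two branches are joined.
git -C "$repo" merge -q bugfix
git -C "$repo" log --oneline
```

After the merge, main contains both commits and the file fix.txt is present in the working directory.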

Working with Git

git status

The first command that we will use lets us know the state of our project files with respect to Git. Use git status whenever you want to know which files have been modified or committed with respect to the latest version stored in our local repository.

localhost$~/test git status
On branch main

No commits yet

Untracked files:
(use "git add <file>..." to include in what will be committed)
    ArrayJob.sh
    ArrayJob_List.sh
    FileJob.sh
    FileJob_List.sh
    MPIJob.sh
    MultiThreadJob.sh
    OpenMPJob.sh
    SequentialJob.sh
    SingularityJob.sh
    data/
    executables/
    list_of_cmd.txt
    ref/

nothing added to commit but untracked files present (use "git add" to track)

The output of the command tells us that nothing has been committed or staged yet. The list shows all the files and directories of test, which are still untracked because Git does not know about them. We will modify one of the files to see what happens, but first we must understand the following rule in Git.

.gitignore

On Garnatxa, Git should only be used to manage source code, so we must avoid uploading files outside of that scope: binaries, databases, data files (fasta, fastq, zip, etc.) or directories of input/output files. To avoid uploading these files and directories, create a special file named .gitignore and list in it the files/directories to omit. Review https://git-scm.com/docs/gitignore to learn more about ignore rules. In this example the directories data, out and executables of test should be ignored by Git. Use vim or another text editor to create the file .gitignore, then add the lines:

localhost$~/test vim .gitignore
data
executables
out
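
A quick way to confirm that the ignore rules work is git check-ignore. The sketch below recreates a reduced version of the test directory in a temporary location (the file data/sample.fa is a made-up placeholder) and writes .gitignore without opening an editor.

```shell
demo=$(mktemp -d)
repo="$demo/test"
git init -q "$repo"
mkdir "$repo/data" "$repo/executables" "$repo/out"
echo "ACGT" > "$repo/data/sample.fa"
echo "#!/bin/bash" > "$repo/SequentialJob.sh"

# Write the ignore list: one pattern per line.
printf '%s\n' data executables out > "$repo/.gitignore"

# git check-ignore -v reports which rule (if any) excludes a path.
git -C "$repo" check-ignore -v data/sample.fa

# Ignored directories no longer appear in the list of untracked files.
untracked=$(git -C "$repo" status --short)
echo "$untracked"
```

Only .gitignore and SequentialJob.sh should be listed as untracked; nothing under data, executables or out is offered for staging.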

git add

Now the files are local but we want to upload them to the Gitlab server so that they are accessible later in Garnatxa. To upload the files to the server (push in Git terminology) we must first complete two previous steps.

  1. Add the files to the staging area. This tells Git that these files are ready to be committed as a new version of our software. The git add command lets you select which directories or files you want to move to the staging area. To stage all the files modified or added in the working directory since the last commit, run: git add .

localhost$~/test$ git add .
localhost$~/test$ git status
On branch main

No commits yet

Changes to be committed:
(use "git rm --cached <file>..." to unstage)
    new file:   .gitignore
    new file:   ArrayJob.sh
    new file:   ArrayJob_List.sh
    new file:   FileJob.sh
    new file:   FileJob_List.sh
    new file:   MPIJob.sh
    new file:   MultiThreadJob.sh
    new file:   OpenMPJob.sh
    new file:   README.md
    new file:   SequentialJob.sh
    new file:   SingularityJob.sh
    new file:   list_of_cmd.txt
    new file:   ref/chr8.fa

git commit

  2. If we are sure that these are all the changes that will go into the next version of our software, we must commit the files in the staging area before we can upload them to the Gitlab server. A commit is something like taking a snapshot of the state of your development files. We can commit when we want to launch a new release of the software, or when we have corrected a bug in our code and want to upload it to a remote server so that other users can download the corrected version.

Use git commit -m 'description' and describe what the commit includes.

localhost$~/test$ git commit -m "My first commit"
[main (root-commit) d0068b2] My first commit
13 files changed, 2927504 insertions(+)
create mode 100644 .gitignore
create mode 100644 ArrayJob.sh
create mode 100644 ArrayJob_List.sh
create mode 100644 FileJob.sh
create mode 100644 FileJob_List.sh
create mode 100644 MPIJob.sh
create mode 100644 MultiThreadJob.sh
create mode 100644 OpenMPJob.sh
create mode 100644 README.md
create mode 100644 SequentialJob.sh
create mode 100644 SingularityJob.sh
create mode 100644 list_of_cmd.txt
create mode 100644 ref/chr8.fa
localhost$~/test$ git status
On branch main
nothing to commit, working tree clean

git push

  3. The last step is to upload the new commit to the Gitlab server in Garnatxa. Use git push -u origin main

localhost$~/test$ git push -u origin main
Enumerating objects: 16, done.
Counting objects: 100% (16/16), done.
Delta compression using up to 16 threads
Compressing objects: 100% (14/14), done.
Writing objects: 100% (16/16), 44.73 MiB | 3.25 MiB/s, done.
Total 16 (delta 3), reused 0 (delta 0), pack-reused 0
To garnatxagitlab.uv.es:user.surname/test.git
* [new branch]      main -> main
Branch 'main' set up to track remote branch 'main' from 'origin'.

git log

Use git log to review the history of commits. Each commit has an associated hash that can be used when we want to go back to a previous version of our code.

localhost$~/test$ git log
commit d0068b214af7c7efe580da766e1001bcdfd6a108 (HEAD -> main, origin/main)
Author: User  <user.surname@example.com>
Date:   Wed Jul 5 14:51:56 2023 +0200

My first commit

The same in brief mode: git log --pretty=oneline

localhost$~/test$  git log --pretty=oneline
98132db3ef221ea9043cc04702f81157e5b85bc2 (HEAD -> main, origin/main) Add a README file
d0068b214af7c7efe580da766e1001bcdfd6a108 My first commit

Now we can create a new file called README.md. Most Git projects have this file (it contains a brief description of the project), since it allows us to explain the functionality of our software as well as the steps for its installation, use, etc. Use a text editor to create it and add these lines:

localhost$~/test vim README.md
This is a testing project. We include some sbatch templates in SLURM.
The sbatch scripts allow to submit jobs to the queue system in Garnatxa.
Multiple types of jobs are implemented: Sequential, multithreads, MPI, arrays, background and singularity.

git status shows that the new file README.md is untracked (that is, outside the staging area). You could continue editing files and run git add at the end.

localhost$~/test git status
On branch main
Your branch is ahead of 'origin/main' by 1 commit.
(use "git push" to publish your local commits)

Untracked files:
(use "git add <file>..." to include in what will be committed)
    README.md

nothing added to commit but untracked files present (use "git add" to track)

Run git add, git commit and git push to send the new commit to Gitlab.

localhost$~/test git add .
localhost$~/test git commit -m "Add a README file"
[main 98132db] Add a README file
1 file changed, 3 insertions(+)
create mode 100644 README.md
localhost$~/test git push -u origin main
Enumerating objects: 7, done.
Counting objects: 100% (7/7), done.
Delta compression using up to 16 threads
Compressing objects: 100% (5/5), done.
Writing objects: 100% (5/5), 521 bytes | 521.00 KiB/s, done.
Total 5 (delta 3), reused 0 (delta 0), pack-reused 0
To garnatxagitlab.uv.es:user.surname/test.git
d0068b2..98132db  main -> main
Branch 'main' set up to track remote branch 'main' from 'origin'.
localhost$~/test  git log
commit 98132db3ef221ea9043cc04702f81157e5b85bc2 (HEAD -> main, origin/main)
Author: User  <user.surname@example.com>
Date:   Thu Jul 6 10:06:59 2023 +0200

Add a README file

commit d0068b214af7c7efe580da766e1001bcdfd6a108
Author: User  <user.surname@example.com>
Date:   Wed Jul 5 14:51:56 2023 +0200

My first commit

git show

git show displays the content of the last commit. It is also possible to obtain the differences between the last commit and older ones.

localhost$~/test  git show
commit 98132db3ef221ea9043cc04702f81157e5b85bc2 (HEAD -> main, origin/main)
Author: User  <user.surname@example.com>
Date:   Thu Jul 6 10:06:59 2023 +0200

Add a README file

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..47f41b3
--- /dev/null
+++ b/README.md
@@ -0,0 +1,3 @@
+This is a testing project. We include some sbatch templates in SLURM.
+The sbatch scripts allow to submit jobs to the queue system in Garnatxa.
+Multiple types of jobs are implemented: Sequential, multithreads, MPI, arrays, background and singularity.
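
To compare two specific commits, git diff accepts the two versions to compare. The following sketch builds a tiny history in a temporary repository (file content and demo identity are invented for the example) and compares the previous commit, HEAD~1, with the last one, HEAD.

```shell
demo=$(mktemp -d)
repo="$demo/history"
git init -q "$repo"
git -C "$repo" config user.name "Demo User"
git -C "$repo" config user.email "demo.user@example.com"

echo "line one" > "$repo/README.md"
git -C "$repo" add README.md
git -C "$repo" commit -q -m "First commit"
echo "line two" >> "$repo/README.md"
git -C "$repo" commit -q -am "Second commit"

# Names of the files that differ between the two commits.
changed=$(git -C "$repo" diff --name-only HEAD~1 HEAD)
echo "Changed files: $changed"

# Full line-by-line differences (added lines start with "+").
git -C "$repo" diff HEAD~1 HEAD
```

Commit hashes or tags can be used in place of HEAD~1 and HEAD.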

git tag

Some commits can be tagged to make it easier to refer to them in subsequent actions. The git tag command allows you to define tags for an already made commit. The tags are often used to reference a milestone achieved in software development, such as the release of a new version, the fixing of a bug, or the creation of a parallel development branch.

localhost$~/test  git tag -a v1.0 -m "First version of test" d0068b
localhost$~/test git log --pretty=oneline
98132db3ef221ea9043cc04702f81157e5b85bc2 (HEAD -> main, origin/main) Add a README file
d0068b214af7c7efe580da766e1001bcdfd6a108 (tag: v1.0) My first commit

When you create tags, remember to push them to the remote Gitlab repository with git push origin --tags. Otherwise the tags remain stored only in the local repository.

localhost$~/test git push origin --tags
Enumerating objects: 1, done.
Counting objects: 100% (1/1), done.
Writing objects: 100% (1/1), 167 bytes | 167.00 KiB/s, done.
Total 1 (delta 0), reused 0 (delta 0), pack-reused 0
To garnatxagitlab.uv.es:user.surname/test.git
* [new tag]         v1.0 -> v1.0
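
As a small sketch of how a tag relates to the commit it labels, the commands below tag the first commit of a throwaway repository and then resolve the tag back to a commit hash. The repository content and the demo identity are placeholders.

```shell
demo=$(mktemp -d)
repo="$demo/tags"
git init -q "$repo"
git -C "$repo" config user.name "Demo User"
git -C "$repo" config user.email "demo.user@example.com"

echo "one" > "$repo/file.txt"
git -C "$repo" add file.txt
git -C "$repo" commit -q -m "My first commit"
first=$(git -C "$repo" rev-parse HEAD)      # hash of the first commit
echo "two" >> "$repo/file.txt"
git -C "$repo" commit -q -am "Second commit"

# Annotated tag on the first commit, referenced by its hash.
git -C "$repo" tag -a v1.0 -m "First version of test" "$first"

# List the tags and resolve the tag back to the commit it labels.
tags=$(git -C "$repo" tag -l)
tagged=$(git -C "$repo" rev-parse "v1.0^{commit}")
echo "Tags: $tags"
echo "v1.0 points to: $tagged"
```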

git checkout

At any point in time we can return to a previous version of our code (this is only possible for committed snapshots). For example, to go back to the initial version before adding the README.md file, we use the git checkout <commit> command.

Check the content of the test directory, the file README.md is there because we committed it.

localhost$~/test ls
ArrayJob_List.sh  ArrayJob.sh  data  executables  FileJob_List.sh  FileJob.sh  list_of_cmd.txt  MPIJob.sh  MultiThreadJob.sh  OpenMPJob.sh  out  README.md  ref  SequentialJob.sh  SingularityJob.sh

Review the list of commits, remember that we tagged the initial commit with: v1.0

localhost$~/test git log --pretty=oneline
98132db3ef221ea9043cc04702f81157e5b85bc2 (HEAD -> main, origin/main) Add a README file
d0068b214af7c7efe580da766e1001bcdfd6a108 (tag: v1.0) My first commit

We can use a tag or the first characters of the commit identifier (six in this example, enough to be unambiguous) to switch to that commit.

localhost$~/test git checkout d0068b   # git checkout v1.0 produces the same effect

Now README.md is gone from the test directory. Notice how the HEAD pointer now points to the initial commit.

localhost$~/test ls
ArrayJob_List.sh  ArrayJob.sh  data  executables  FileJob_List.sh  FileJob.sh  list_of_cmd.txt  MPIJob.sh  MultiThreadJob.sh  OpenMPJob.sh  out  ref  SequentialJob.sh  SingularityJob.sh

localhost$~/test git log --pretty=oneline
d0068b214af7c7efe580da766e1001bcdfd6a108 (HEAD, tag: v1.0) My first commit

The important thing here is that the file README.md has not been removed from Git. We have only returned to an older snapshot of our project in which the README.md file did not yet exist. This is useful, for example, if we want to return to a point in the code where we detected an error and create a parallel branch to solve it.

We can return to the last commit in the main branch:

localhost$~/test git checkout main
Previous HEAD position was d0068b2 My first commit
Switched to branch 'main'
Your branch is up to date with 'origin/main'

localhost$~/test ls
ArrayJob_List.sh  ArrayJob.sh  data  executables  FileJob_List.sh  FileJob.sh  list_of_cmd.txt  MPIJob.sh  MultiThreadJob.sh  OpenMPJob.sh  out  README.md  ref  SequentialJob.sh  SingularityJob.sh

localhost$~/test git log --pretty=oneline
98132db3ef221ea9043cc04702f81157e5b85bc2 (HEAD -> main, origin/main) Add a README file
d0068b214af7c7efe580da766e1001bcdfd6a108 (tag: v1.0) My first commit
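
When an old commit is checked out, HEAD is detached, that is, it does not point to any branch. The sketch below shows one way to detect this state and to start a parallel branch from the old snapshot. It runs in a temporary repository; the branch name hotfix, the lightweight tag and the demo identity are assumptions made for the example.

```shell
demo=$(mktemp -d)
repo="$demo/checkout"
git init -q "$repo"
git -C "$repo" config user.name "Demo User"
git -C "$repo" config user.email "demo.user@example.com"

echo "echo job" > "$repo/SequentialJob.sh"
git -C "$repo" add SequentialJob.sh
git -C "$repo" commit -q -m "My first commit"
git -C "$repo" branch -M main
git -C "$repo" tag v1.0                     # lightweight tag, for brevity

echo "A testing project" > "$repo/README.md"
git -C "$repo" add README.md
git -C "$repo" commit -q -m "Add a README file"

# Go back to the tagged snapshot: HEAD is now detached (not on any branch).
git -C "$repo" checkout -q v1.0
where=$(git -C "$repo" rev-parse --abbrev-ref HEAD)
echo "After checkout v1.0: $where"

# Start a parallel branch from this old snapshot to work on a fix.
git -C "$repo" checkout -q -b hotfix
branch_now=$(git -C "$repo" rev-parse --abbrev-ref HEAD)
echo "Now on branch: $branch_now"
```

Commits made on hotfix do not affect main until the branches are merged, and git checkout main returns to the latest version at any time.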

git mv and git rm

To move or delete a file or directory in a local repository, use Git commands. If you move or delete files directly in the file system, Git does not stage those changes for you, so they will not be included in subsequent commits until you stage them yourself.

Create a new directory and move the file SingularityJob.sh there with git mv.

localhost$~/test mkdir others
localhost$~/test git mv SingularityJob.sh others
localhost$~/test git status
On branch main
Your branch is up to date with 'origin/main'.

Changes to be committed:
(use "git restore --staged <file>..." to unstage)
    renamed:    SingularityJob.sh -> others/SingularityJob.sh

Now try to remove the file with git rm

localhost$~/test git rm others/SingularityJob.sh
error: the following file has changes staged in the index:
others/SingularityJob.sh
(use --cached to keep the file, or -f to force removal)

The command returns an error because the new directory and the move were not committed beforehand. We can force the action, but this also deletes the others directory.

localhost$~/test git rm -f others/SingularityJob.sh

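To restore a file deleted with git rm, use git restore. Because the deletion is already staged, both the staging area and the working directory must be restored:

localhost$~/test git restore --staged --worktree SingularityJob.sh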
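
The sequence above can be reproduced safely in a temporary repository. In this sketch the deletion ends up staged, so the file is restored both in the staging area and on disk with the --staged and --worktree options of git restore. The repository location and the demo identity are placeholders.

```shell
demo=$(mktemp -d)
repo="$demo/restore"
git init -q "$repo"
git -C "$repo" config user.name "Demo User"
git -C "$repo" config user.email "demo.user@example.com"
echo "#!/bin/bash" > "$repo/SingularityJob.sh"
git -C "$repo" add SingularityJob.sh
git -C "$repo" commit -q -m "Add SingularityJob.sh"

# Reproduce the situation: a staged move followed by a forced removal.
mkdir "$repo/others"
git -C "$repo" mv SingularityJob.sh others
git -C "$repo" rm -q -f others/SingularityJob.sh

# The deletion is staged, so restore the file in the staging area and on disk.
git -C "$repo" restore --staged --worktree SingularityJob.sh

# An empty status means the project is back to the last commit.
git -C "$repo" status --short
```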

Repeat the process, but this time commit the others directory and its contents.

localhost$~/test mkdir others
localhost$~/test git mv SingularityJob.sh others
localhost$~/test git add .
localhost$~/test git commit -m "Move SingularityJob.sh to new directory others."
[main ef5dacc] Move SingularityJob.sh to new directory others.
1 file changed, 0 insertions(+), 0 deletions(-)
rename SingularityJob.sh => others/SingularityJob.sh (100%)

Finally upload all the commits in your local repository to Gitlab in Garnatxa.

localhost$~/test git push -u origin main
Enumerating objects: 4, done.
Counting objects: 100% (4/4), done.
Delta compression using up to 16 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 344 bytes | 344.00 KiB/s, done.
Total 3 (delta 1), reused 0 (delta 0), pack-reused 0
To garnatxagitlab.uv.es:user.surname/test.git
98132db..ef5dacc  main -> main
Branch 'main' set up to track remote branch 'main' from 'origin'.

Summary of commands

Git command                                   Description
--------------------------------------------  -----------
git init                                      Initialize a new Git project. It should only be executed once at the start of the project.
git status                                    Show the status and stage of the project files.
git log                                       Show the history of the commits made in the project.
git log --pretty=oneline                      The same, with one line per commit.
git log --graph                               The same, drawn as a timeline.
git show                                      Show the differences between commits.
git add .                                     Add files or directories to the staging area. This is the step before a commit.
git commit -m '<description>'                 Commit a snapshot of the project. The files located in the staging area are the ones included in that commit. Commits are stored on the local machine. Use git push to push the commits to a remote repository.
git push -u origin main                       Upload local commits to a remote repository. Use this remote repository as a backup and file sharing area.
git tag -a <tag> -m <description> <commit>    Label a commit to make it easier to refer to in the future. It usually coincides with a milestone in the development of your code, for example the release of a new version.
git checkout <branch> or <commit> or <tag>    Switch development to another commit or branch.
git rm                                        Remove a file from the local repository.
git mv                                        Move a file to a directory.
git restore                                   Restore a file deleted with git rm.

Garnatxa’s Gitlab service

GitLab is an open source code repository and collaborative software development platform for large DevOps and DevSecOps projects. GitLab offers a location for online code storage and capabilities for issue tracking. The repository enables hosting different development chains and versions, and allows users to inspect previous code and roll back to it in the event of unforeseen problems. GitLab is a competitor to GitHub. Because GitLab is built on Git, it works very similarly for source code management. Garnatxa provides a GitLab repository that enables you to upload your source code from your local computer to a remote server and download it again. In this way you can use your personal computer to develop and test code. When your development is stable you can commit and push your changes to the remote repository. The code stored in the Gitlab service can be accessed from Garnatxa: you can fetch the latest version of your code and submit it to the Garnatxa system. Remember that Garnatxa should only be used with production versions of your code.

Warning

The Gitlab service may only be used to upload source code. Avoid uploading databases, binary files or any other type of file that is not source code. The size of push operations is limited to 10MB.
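
Before pushing, it can help to look for unexpectedly large files among those tracked by Git. The function below is a hypothetical helper, not part of Git or of the Gitlab service: it lists tracked files whose size exceeds a limit given in bytes. Note that the limit in the warning applies to the whole push, so this check is only a rough guide.

```shell
# Hypothetical helper: print "size path" for every tracked file bigger than a limit.
#   $1 = path to the local repository, $2 = size limit in bytes
list_large_tracked_files() {
    repo_dir="$1"
    limit_bytes="$2"
    # git ls-files lists the files known to Git (staged or committed).
    git -C "$repo_dir" ls-files | while IFS= read -r path; do
        [ -f "$repo_dir/$path" ] || continue
        # Arithmetic expansion strips the padding some wc versions print.
        size=$(( $(wc -c < "$repo_dir/$path") ))
        if [ "$size" -gt "$limit_bytes" ]; then
            echo "$size $path"
        fi
    done
}
```

For example, list_large_tracked_files . 10485760 run inside a project reports files over 10 MiB (assuming that is what the 10MB limit means). Files reported here are candidates for .gitignore.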

To initialize and integrate your existing code in GitLab, follow the steps below.

Import ssh keys to Gitlab

In order to access your project from any Git client you need to generate an ssh public/private key pair and then import the public key into Gitlab.

  1. First open a terminal on your personal computer and make sure you have an ssh key generated.

localhost$ ~$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/user/.ssh/id_rsa):
/home/user/.ssh/id_rsa already exists.
Overwrite (y/n)?

If you receive an overwrite message then you already have a pair of ssh keys installed in your account and you can use them; cancel the command with ctrl+C. Otherwise continue creating the new keys.

Copy the content of the file id_rsa.pub to the clipboard (select the lines of the file, click the right mouse button and choose copy).

localhost$ ~$ cat /home/user/.ssh/id_rsa.pub
  2. Import the public key into Gitlab. Click on preferences (upper right hand side of the screen).

../_images/gitlab5.png
  3. Click on SSH keys and paste the content of the public key in the key box. Then click on add key.

../_images/gitlab6.png
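
The check in step 1 can also be done without starting ssh-keygen interactively. The function below is a hypothetical helper written for this guide: it creates an RSA key pair only when none exists at the given path. The empty passphrase (-N "") is an assumption made to keep the sketch non-interactive; a real passphrase is preferable in practice.

```shell
# Hypothetical helper: create an RSA key pair only if it does not exist yet.
#   $1 = path of the private key (the public key is stored next to it with .pub)
ensure_ssh_key() {
    key_path="$1"
    if [ -f "$key_path" ] && [ -f "$key_path.pub" ]; then
        echo "Key already present: $key_path.pub"
    else
        # -q: quiet, -t rsa: key type, -N "": empty passphrase, -f: output file
        ssh-keygen -q -t rsa -N "" -f "$key_path"
        echo "New key created: $key_path.pub"
    fi
}
```

Calling ensure_ssh_key "$HOME/.ssh/id_rsa" followed by cat "$HOME/.ssh/id_rsa.pub" prints the text to paste into the key box in Gitlab.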

Create an empty project in Gitlab

First you should create an empty project in the Gitlab repository:

  1. Sign in to Garnatxa’s Gitlab service: https://garnatxadoc.uv.es/gitlab . You must use an active Garnatxa account and password.

../_images/gitlab0.png

Figure 1. Select the LDAP tab, then enter the username and password that you use to access Garnatxa.

  2. Create a new empty project. To do this select Project -> New project

../_images/gitlab1.png

Figure 2. Create a new project in Gitlab.

  3. Create a blank project.

../_images/gitlab2.png

Figure 3. Create a blank project.

  4. Enter the data for the new project:

  • Enter the name of the project.

  • Select an identifier for the URL of the project. Choose a suggested name from the combo box.

  • You can choose the visibility level for your project:
    • Private: Select private if only you should have access to this project. If the project is shared in a group, the members of the group can also access the project.

    • Internal: All users logged in to Gitlab can access the project.

    • Public: The project can be accessed from the Internet.

  • If you already have existing source code and you want to integrate it with Gitlab, uncheck the option Initialize repository with a README. You can create this file later.

  • Click on the create project button.

../_images/gitlab3.png

Figure 4. Project related data.

  5. The next screen will show some information about your project. Copy the git address of the new project to the clipboard. You will need this address later in order to access the project.

../_images/gitlab4_bis.png

Figure 5. Select the git address (marked as a red rectangle in the picture) of the project and click on the right icon to copy the address to the clipboard.

Initialize some global variables

Initialize Git with some global parameters. Add the user name and email. For compatibility with Gitlab, GitHub and other Git hosting services it is better to rename the default branch from master to main.

localhost$ git config --global init.defaultBranch main
localhost$ git config --global user.name "User 1"
localhost$ git config --global user.email user1@example.com

Integrate an existing local project and push the code to Gitlab

If you already have a project with source code, you can push it to the newly created project in Gitlab.

Open a terminal and change to your source directory.

localhost$ cd LDAP_MANAGER
localhost$ git init
Initialized empty Git repository in /home/user1/LDAP_MANAGER/.git/

Configure the address of the Gitlab service in Garnatxa. Paste the address copied in step 5 above.

localhost$ git remote add origin git@garnatxagitlab.uv.es:jose.carrion/ldap_manager.git

Add all the source files to the staging area:

localhost$ git add .

Commit your files adding a comment:

localhost$ git commit -m "My first commit"
[master (root-commit) 1e9ed9c] My first commit
14 files changed, 1297 insertions(+)
create mode 100644 .project
create mode 100644 .pydevproject
create mode 100644 .settings/org.eclipse.core.resources.prefs
create mode 100644 .vscode/launch.json
create mode 100644 .vscode/settings.json
create mode 100644 LDAPManager2.py

Some old versions of git require forcing the change to branch main:

localhost$ git branch -m main

Finally push your initial commit to Gitlab

localhost$ git push -u origin main
Enumerating objects: 17, done.
Counting objects: 100% (17/17), done.
Delta compression using up to 16 threads
Compressing objects: 100% (16/16), done.
Writing objects: 100% (17/17), 19.15 KiB | 3.19 MiB/s, done.
Total 17 (delta 0), reused 0 (delta 0)
To garnatxagitlab.uv.es:jose.carrion/ldap_manager.git
* [new branch]      main -> main
Branch 'main' set up to track remote branch 'main' from 'origin'.

Check that you are on the main branch and that the commit was uploaded.

localhost$ git status
On branch main
Your branch is up to date with 'origin/main'.

nothing to commit, working tree clean

Explore your git project in Gitlab

After pushing a commit you can check that all files were uploaded to the Gitlab repository. Sign in to https://garnatxadoc.uv.es/gitlab and list the files from the Repository->files option.

../_images/gitlab7.png

Figure 1. Check the files related with the last commit.

Clone your Gitlab project in Garnatxa

When you have a stable release of your code pushed to Gitlab then you can use this release to compute in Garnatxa.

First you have to import your ssh public key from your account in Garnatxa to Gitlab. Follow the steps described in import ssh keys to Gitlab using the ssh public key in Garnatxa (/home/user/.ssh/id_rsa.pub).

Sign in to Gitlab and copy the git address of the project to clone.

../_images/gitlab4.png

Then open a terminal in Garnatxa and clone the project using the copied address:

Garnatxa$ git clone git@garnatxagitlab.uv.es:jose.carrion/ldap_manager.git
Cloning into 'ldap_manager'...
remote: Enumerating objects: 17, done.
remote: Total 17 (delta 0), reused 0 (delta 0), pack-reused 17
Receiving objects: 100% (17/17), 19.15 KiB | 3.19 MiB/s, done.

Garnatxa$ ls ldap_manager
LDAPManager2.py
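If you mark stable releases with git tags, you can clone exactly that release instead of the latest commit. The sketch below assumes such a tag exists. The function name clone_release and the tag v1.0 are hypothetical; tagging is not part of the steps above.

```shell
#!/usr/bin/env bash
# clone_release (hypothetical helper): clone a repository and check out a
# given tag, leaving the working tree at that release (detached HEAD).
clone_release() {
    local url="$1" tag="$2" dest="$3"
    # --branch accepts a tag name as well as a branch name
    git clone --branch "$tag" "$url" "$dest"
}
```

For example: clone_release git@garnatxagitlab.uv.es:jose.carrion/ldap_manager.git v1.0 ldap_manager, assuming the tag was created with `git tag v1.0` and uploaded with `git push origin v1.0` from localhost.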

VSCode

Visual Studio Code is a popular free and open-source code editing application that can be deployed on Linux, macOS and Windows. It has an integrated terminal within its user interface that removes the need to switch between command-line tasks and code editing. The functionality of VS Code can easily be extended by installing extensions. These extensions allow for almost arbitrary language support, debugging or remote development.

Using VS Code remotely on Garnatxa is discouraged because of the issues described below. Instead, we recommend using VS Code locally and synchronizing with Garnatxa via Git or automatic synchronization extensions in VS Code. Administrators may disable the use of VS Code for users who fail to comply with the basic rules of use.

Important

Visual Studio Code can cause severe problems on login nodes. VS Code is a very popular tool, but it has caused severe problems on shared resources. Users should be aware that these problems generally affect not only the user who triggers them but also everyone else accessing our shared resources. Use the tool with care and strictly follow the recommendations and settings indicated by the administrators.

The main known issues are:

  • VS Code uses a process called FileWatcher to constantly monitor files being modified in the remote folder. This puts a high load on Garnatxa’s CPU, memory, and disk. Remember that you are in a shared environment, and multiple users using VS Code remotely can disrupt the cluster’s normal operation. To avoid this:

    Do not open folders with a large number of files, as this puts a high load on Garnatxa’s file system.

    Never open the root folder of your account in Garnatxa (/home/<USER>); only open the folder containing your code.

    Do not use VS Code as a file viewer or file transfer tool.

  • VS Code can leave orphan processes running (and occupying resources indefinitely) on the login nodes. To avoid this:

    Close the remote connection explicitly once you have finished your working session: click the “SSH connection status” box in the lower left corner and, in the input box that opens, select the “Close Remote Connection” option. Also check the login nodes regularly for orphan processes.

  • VS Code can overload resources on the login nodes when the automatic file watcher and search are active. To avoid this:

    Restrict the scope of these tools, as well as that of the TypeScript and JavaScript language services.
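Two of the checks above can be done from a terminal on Garnatxa. The following is a rough sketch with hypothetical function names. It assumes that the VS Code remote server runs from a folder whose name contains vscode-server, so that this string appears in the command line of its processes. Verify this on your own account before relying on it.

```shell
#!/usr/bin/env bash
# count_files (hypothetical helper): count regular files under a folder
# before opening it in VS Code. Defaults to the current directory.
count_files() {
    local dir="${1:-.}"
    find "$dir" -type f | wc -l | tr -d ' '
}

# list_vscode_procs (hypothetical helper): list your own processes whose
# command line mentions "vscode-server" (assumed server folder name).
list_vscode_procs() {
    pgrep -u "$(id -u)" -af 'vscode-server' \
        || echo "no VS Code server processes found"
}
```

For example, count_files on the folder holding your code shows how many files the file watcher would have to track, and list_vscode_procs run after closing the remote session shows whether anything was left behind. Leftover processes can then be stopped with pkill -u "$(id -u)" -f vscode-server, provided no remote session of yours is still in use.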