Abel Newsletter #2, 2016

In this issue: the fall course week, the application deadline for CPU time through Notur, interesting conferences and external courses, along with the usual list of updated tools and applications now available on Abel or in the Lifeportal.

  • USIT Seksjon for IT i forskning (ITF), in English the Department for Research Computing (RC), is responsible for delivering IT support for research at the University of Oslo.
  • The department's groups operate infrastructure for research and support researchers in the use of computational resources, data storage, application portals, parallelization and optimization of code, and advanced user support.
  • The Abel High Performance Computing (HPC) cluster, Notur, and the NorStore storage resources are central components of USIT's IT support for researchers.
  • This newsletter is announced on the abel-users mailing list. All users with an account on Abel are automatically added to the abel-users list; this is mandatory. The newsletter is issued at least twice a year.


News and announcements

HPC basics for research training course, 08-10 November 2016

Training in using High-Performance Computing (HPC) efficiently, for Abel and Colossus (TSD) users. The course is open to all users of Notur systems, but examples and specifics will pertain to Abel and Colossus.

This November we are holding a shorter and more focused research computing training week in collaboration with the Software Carpentry initiative. Participants are expected to attend the Software Carpentry workshop prior to the course, unless they already have a working knowledge of the Unix shell.

The courses are held on November 08-10 in Ole-Johan Dahls hus and Kristen Nygaards hus. The event is free of charge, but requires registration. The lectures are held in English.

Registration: https://nettskjema.uio.no/answer/73724.html. Registration opens 01 August 2016 and closes when the course is full (32 participants maximum) or on 21 October 2016, whichever comes first.

Schedule: http://www.uio.no/english/services/it/research/events/hpc_for_research_november_2016.html

Questions? Ideas? Contact hpc-drift@usit.uio.no.

New Notur allocation period 2016.2, application deadline 26 August 2016

The Notur period 2016.2 (01.10.2016 - 31.03.2017) is approaching, and the deadline for applications for CPU hours or advanced user support is 26 August.

A kind reminder: if you have many CPU hours remaining in the current period, you should of course try to use them as soon as possible, but since many users will be doing the same, there is likely to be a resource squeeze and potentially long queue times. The quotas are based on even use throughout the allocation period. If you think you will not be able to spend all your allocated CPU hours, we greatly appreciate a note to sigma@uninett.no so that the hours can be released; you can get extra hours later if you turn out to need more. If you have already run out of hours, or are about to, you may contact sigma@uninett.no and ask for a little more. No guarantees, of course.

Run

projects

to list project accounts you are able to use.

Run

cost -p

or

cost -p nn0815k

to check a specific allocation (replace nn0815k with your own project's account name, as listed by projects).

Run

cost -p nn0815k --detail

to check your allocation and print consumption for all users of that allocation.
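The account name is also what you give the queue system when submitting jobs. A minimal batch-script sketch (nn0815k is the same placeholder as above; --account is the standard SLURM option for choosing which allocation to charge, and the other values are just examples):

#!/bin/bash
#SBATCH --job-name=example
#SBATCH --account=nn0815k      # placeholder; use an account listed by projects
#SBATCH --time=00:30:00
#SBATCH --mem-per-cpu=1G

echo "This job is charged to the nn0815k allocation"

Submit it with sbatch, and the consumed hours will show up in the cost output for that allocation.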

Planned maintenance stop 03-06 October 2016

At regular intervals it is necessary to bring down the entire cluster for maintenance and upgrades. We will have such a maintenance period this fall, October 03-06.

The exact start time and duration of the maintenance period may change, but the current plan is to close off user access to Abel on Monday October 03 at 9am. Prior to this we will put a limit on batch job submissions, so that jobs scheduled to last beyond 8am on October 03 will not start. Abel should return to normal operation by Thursday October 06, 4pm, or earlier.

More information and reminders will be sent to all Abel users closer to the event.

Abel storage running full - best practices and scheduled installation of additional 192 TB BeeGFS disk

When an HPC cluster reaches a certain age, its disks fill up. This has become increasingly apparent on Abel, in particular for our scratch disks, which have been running close to or above 90% full in recent months. As part of the maintenance stop we will add another 192 TB of BeeGFS disk, but until then we kindly ask you to limit the amount of data stored on the scratch disks and to remove data as soon as it is no longer needed. We understand of course that certain workflows require a lot of scratch disk while running, but the sooner results are moved out and temporary files removed, the better, to avoid interrupting the work of everyone on the cluster.
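As a general pattern, copy results out of scratch and delete temporaries as soon as a job is done with them. A sketch of what the tail of a job script could look like (this assumes a per-job scratch directory is available in $SCRATCH, which is common on SLURM clusters; paths and program names are placeholders):

cd $SCRATCH
./my_analysis input.dat > result.out   # placeholder for the actual workload
cp result.out $HOME/results/           # copy results to permanent storage
rm -f *.tmp                            # remove temporary files right away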

Introduction course for the new supercomputer in Tromsø (A1), 6-7 September in Oslo

Sigma2 is hosting a two-day training event at UiO to enable the best possible use of the new computation facility (currently known as A1). The first day covers best practices for the new system, followed by OpenACC and vectorization best practices. On the second day, experts from Intel will go deeper into vectorization. The seminar focuses on helping developers build the best possible code for this kind of system, and on guiding developers into a future where both vector width and core count increase, looking forward to Knights Landing and Skylake. The OpenACC part will guide developers in starting to use directives to exploit the GPUs in the new system.

The new system will use Intel Broadwell processors and Mellanox EDR (100 Gbit/s) InfiniBand, and will have about 30k cores. Nodes with up to 6 TiB of memory will be available, as well as GPU-accelerated nodes.

Intel developer workshop for application developers, September 9th at USIT (please contact us if you want to attend)

This workshop is primarily aimed at developers of the major consumers of CPU hours on the Sigma2 systems and is by invitation only. If your code consumes millions of hours, you should request to be invited. The focus is on code development using Intel tools for the new A1 system and the next-generation Skylake and Knights Landing processors. We'll work on the developers' own code and look at details that might be vital for performance.

Uninett Sigma2 has signed a contract for the new supercomputer that will take over for Vilje and Hexagon on 1 April 2017

Read more at https://www.sigma2.no/content/contract-signed-new-hpc-system-0

Abel and Stallo are expected to run through Q3 2018 and will then be replaced by the new Tromsø supercomputer; a new supercomputer will be put into operation in Trondheim at the time Abel and Stallo are shut down. Typical massively parallel jobs will then run in Trondheim from Q3 2018, while the supercomputer in Tromsø is redesigned to take over the high-throughput production currently handled by Abel and Stallo. We at USIT - Department for Research Computing will still work on operations, maintenance, user support, and training for the new supercomputers, although they will not be located in Oslo. (See the point above regarding the introduction course.)

Portal service Lifeportal now Kalmar2 enabled

As of January 7th 2016, we have implemented Kalmar2 access to the Lifeportal. The Kalmar e-identity union is a cross-Nordic authentication system for higher education and research: students and staff members at a Nordic university or research institution can use a single username and password to access services in other Nordic countries. So far, Kalmar2 is open to Norwegian (FEIDE), Danish (WAYF), and Finnish (HAKA) institutions. The full list of member institutions is available in the dropdown menu on the Lifeportal login page. Access to the Lifeportal is free for HAKA and WAYF members.

EUDAT events coming up

Supercomputing 2016 is in Salt Lake City, USA, 11-18 November

Read more at http://sc16.supercomputing.org/

A small contingent from USIT is attending. Contact us if you have any information you would like conveyed to a vendor, or if you want lecture notes from any of the tutorials.

Pilot service on visualization nodes connected to Abel/NorStore

We plan to start a pilot service for remote visualisation on Abel and NorStore, and have for that purpose set up several Linux nodes with 8 CPUs, 32 GB of RAM, and one or two NVIDIA Tesla cards (M2090, 6 GB).

If you are interested, please contact us (hpc-drift@usit.uio.no) and let us know what visualisation software you would like to see installed.

Reminder about availability of accelerated computing resources

NVIDIA GPUs:

A rack of nodes (16 nodes) equipped with NVIDIA K20X GPUs is now available on Abel. There is one NVIDIA card per CPU socket, i.e., two cards per host. The GPUs can be used by applications that are GPU/CUDA enabled, or by custom-built applications that can take advantage of GPUs. There is a CUDA module available to enable the toolkit. The Portland compiler suite (PGI) supports CUDA/GPUs through OpenACC; see http://openacc.org.
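For custom-built codes, a plausible starting point is sketched below. The exact CUDA module name is an assumption (check module avail first); pgi/16.5 is taken from the module list later in this newsletter, and -acc / -Minfo=accel are the PGI flags that enable OpenACC and report what was accelerated:

module load cuda                                 # assumed module name for the CUDA toolkit
nvcc my_kernel.cu -o my_cuda_app                 # compile a CUDA source file

module load pgi/16.5
pgcc -acc -Minfo=accel my_loop.c -o my_acc_app   # compile C code with OpenACC directives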

Two GPU devices can be used per node. The processors are from Intel and are of the same type as the CPUs on the standard compute nodes on Abel, but with only four cores per socket and a slightly lower clock frequency. The nodes have 64 GiB of memory, while the NVIDIA cards have 6 GiB each. The nodes are also equipped with FDR InfiniBand, which enables MPI.

Please contact us (hpc@usit.uio.no) if you wish to use the GPU resource. The queue system SLURM supports GPUs and sets the relevant flags needed at run time; request --partition=accel --gres=gpu:2 in addition to your normal parameters.
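Putting it together, a GPU job script could look like the following sketch. The --partition and --gres flags are the ones given above; the account, time, and memory values, the cuda module name, and the binary name are placeholders:

#!/bin/bash
#SBATCH --job-name=gpu-test
#SBATCH --account=nn0815k       # placeholder project account
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=4G
#SBATCH --partition=accel       # the GPU partition
#SBATCH --gres=gpu:2            # request both GPUs on the node

module purge
module load cuda                # assumed module name

./my_cuda_app                   # placeholder for your GPU-enabled binary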

LAMMPS, NAMD, ADF and VASP have been available with GPU support for some time already.

Intel Xeon Phi:

We are happy to announce that 4 nodes with Xeon Phi co-processors are installed and available for testing and development. Each node contains two SB (Sandy Bridge) processors (16 cores total), 128 GiB of memory, InfiniBand, and two Intel Xeon Phi cards, each with a 60-core MIC processor and 8 GiB of on-card memory.

These outdated co-processors are about to be upgraded to the new Knights Landing (KNL), which will be standalone compute nodes. This will happen some time during the spring of 2017. We'll try to provide early access as soon as possible; please contact us if you have pressing needs to port an application to KNL. All the Intel tools are ready for KNL.

The Intel developer site provides more information on products for HPC development.

Abel operations

File system quota

Each Abel user has a quota of 500 GB for his/her home directory. We have started to enforce these quotas.
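A quick, standard way to check how much space your home directory currently uses (plain Unix, nothing Abel-specific):

du -sh $HOME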

Follow operations

If you want to be informed about day-to-day operations, you can subscribe to the abel-operations list by emailing "subscribe abel-operations <Your Name>" to sympa@usit.uio.no. You can also follow us on Twitter as abelcluster: http://twitter.com/#!/abelcluster

New and updated software packages

The following is a list of new or updated software packages available on Abel with the module command. First, a quick reminder of basic module usage (standard Environment Modules commands):
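
module avail              # list all available packages and versions
module avail R            # list available versions of one package
module load R/3.3.0       # load a specific version into your environment
module list               # show currently loaded modules
module purge              # unload all modules

The new and updated packages: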
=== R 3.3.0 ===
module load R/3.3.0
 
=== R 3.3.0.gnu ===
module load R/3.3.0.gnu
 
=== R 3.3.0.profmem ===
module load R/3.3.0.profmem
 
=== abyss 1.9.0 ===
module load abyss/1.9.0
 
=== adf 2016.101 ===
module load adf/2016.101
 
=== amber 16 ===
module load amber/16
 
=== amber_gpu 16 ===
module load amber_gpu/16
 
=== beast2 2.4.2 ===
module load beast2/2.4.2
 
=== binutils 2.26 ===
module load binutils/2.26
 
=== bismark 0.16.1 ===
module load bismark/0.16.1
 
=== blasr 3.0.1 ===
module load blasr/3.0.1
 
=== boost 1.60.0 ===
module load boost/1.60.0
 
=== boost 1.60.0-intel ===
module load boost/1.60.0-intel
 
=== bowtie2 2.2.9 ===
module load bowtie2/2.2.9
 
=== busco v1.1 ===
module load busco/v1.1
 
=== bzip2 1.0.6 ===
module load bzip2/1.0.6
 
=== cufflinks 2.2.1 ===
module load cufflinks/2.2.1
 
=== curl 7.46.0 ===
module load curl/7.46.0
 
=== defuse 0.7.0 ===
module load defuse/0.7.0
 
=== dosageconvertor 1.0.2 ===
module load dosageconvertor/1.0.2
 
=== esysparticle 2.3.3 ===
module load esysparticle/2.3.3
 
=== gcc 6.1.0 ===
module load gcc/6.1.0
 
=== hisat2 2.0.1-beta ===
module load hisat2/2.0.1-beta
 
=== htslib 1.3.1 ===
module load htslib/1.3.1
 
=== humann2 0.8.1 ===
module load humann2/0.8.1
 
=== icu 56.1_intel ===
module load icu/56.1_intel
 
=== intel-libs 2016.3 ===
module load intel-libs/2016.3
 
=== intel 2016.3 ===
module load intel/2016.3
 
=== intel 2017.beta ===
module load intel/2017.beta
 
=== irap 0-7-0p13 ===
module load irap/0-7-0p13
 
=== jags 4.2.0 ===
module load jags/4.2.0
 
=== jags 4.2.0_2015_3 ===
module load jags/4.2.0_2015_3
 
=== julia 0.4.5 ===
module load julia/0.4.5
 
=== libbam 250216 ===
module load libbam/250216
 
=== libconfig 1.5 ===
module load libconfig/1.5
 
=== mafft 7.300 ===
module load mafft/7.300
 
=== mapdamage 2.0.6 ===
module load mapdamage/2.0.6
 
=== migrate migrate-3.6.11 ===
module load migrate/migrate-3.6.11
 
=== mothur 1.38.1 ===
module load mothur/1.38.1
 
=== namd 2.11 ===
module load namd/2.11
 
=== namd_gpu 2.11 ===
module load namd_gpu/2.11
 
=== ngsplot ngsplot-2.61 ===
module load ngsplot/ngsplot-2.61
 
=== openfoam 3.0.1 ===
module load openfoam/3.0.1
 
=== openmpi.gnu 1.10.2 ===
module load openmpi.gnu/1.10.2
 
=== openmpi.gnu 2.0.0rc1 ===
module load openmpi.gnu/2.0.0rc1
 
=== openmpi.intel 1.10.2 ===
module load openmpi.intel/1.10.2
 
=== openmpi.intel 2.0.0rc1 ===
module load openmpi.intel/2.0.0rc1
 
=== pbsuite 15.8.24 ===
module load pbsuite/15.8.24
 
=== pcre 8.38 ===
module load pcre/8.38
 
=== perf-reports 6.0.5 ===
module load perf-reports/6.0.5
 
=== pgi 16.5 ===
module load pgi/16.5
 
=== phono3py 1.10.9 ===
module load phono3py/1.10.9
 
=== plink2 1.90b3.31 ===
module load plink2/1.90b3.31
 
=== pmap 1.0.0 ===
module load pmap/1.0.0
 
=== quast 4.2 ===
module load quast/4.2
 
=== rakudo-star 2016.04 ===
module load rakudo-star/2016.04
 
=== samtools 1.3.1 ===
module load samtools/1.3.1
 
=== smrtlink 1.0.5 ===
module load smrtlink/1.0.5
 
=== spades 3.9.0 ===
module load spades/3.9.0
 
=== star 2.5 ===
module load star/2.5
 
=== stringtie 1.2.2 ===
module load stringtie/1.2.2
 
=== swarm 2.1.8 ===
module load swarm/2.1.8
 
=== tophat 2.1.1 ===
module load tophat/2.1.1
 
=== usearch 6.1.544 ===
module load usearch/6.1.544
 
=== usearch 8.1.1861 ===
module load usearch/8.1.1861
 
=== vasp 5.4.1.Feb16 ===
module load vasp/5.4.1.Feb16
 
=== vsearch 2.0.3 ===
module load vsearch/2.0.3

VSEARCH is a USEARCH replacement and has many advantages over the 32-bit version of USEARCH we have on Abel. It was developed by Torbjørn Rognes, Tomáš Flouri, Ben Nichols, Christopher Quince, and Frédéric Mahé at UiO, and we have installed and tested it on Abel. More details can be found here.
 
=== wget 1.17 ===
module load wget/1.17
 
=== xz 5.2.2 ===
module load xz/5.2.2

 

Phew! Questions? Contact hpc-drift@usit.uio.no.

Publication tracker

The USIT Department for Research Computing (RC) is interested in keeping track of publications that involve computation on Abel (or Titan) or the use of any other RC services. We greatly appreciate an email to:

hpc-publications@usit.uio.no

about any publications (including in the general media). If you would like to cite the use of Abel or our other services, please follow this information.

Abel Operations mailing list

To receive extensive system messages and information, please subscribe to the "Abel Operations" mailing list. This can be done by emailing "subscribe abel-operations <Your Name>" to sympa@usit.uio.no.

Follow us on Twitter

Follow us on Twitter as abelcluster. Twitter is the place for short notices about Abel operations.

http://twitter.com/#!/abelcluster


 

Published 18 Aug. 2016 12:10 - Last modified 5 Oct. 2016 14:05