
TSD Operational Log

Published May 10, 2024 1:35 PM

There is currently a login issue with data.tsd.usit.no. We are working on resolving it.

Published Apr. 30, 2024 4:14 PM

The operating system (OS) on Colossus will be upgraded starting Monday June 10th. This is a major upgrade, and will take one week. During the upgrade, Colossus will be unavailable.

The main reason for the upgrade is that the current OS is very old and will soon reach end of life.

During the downtime, we will reorganize the cluster, upgrade the networking and upgrade the OS from CentOS7 to Rocky9. Rocky, like CentOS, is a RedHat clone.

After the upgrade, you will most likely have to recompile any software that you have installed on the submit hosts today (e.g. R packages). If possible, the Linux RedHat9 submit hosts will be made available a little before the rest, so you can start testing and reinstalling software.

The software stack available via "module load" will be reinstalled. We will install toolchain 2021a and newer. This means that not all old versions of all software will be available after the upgrade. If you...

Published Apr. 25, 2024 1:28 PM

The existing hardware and software for the host tsd-fx03 need to be replaced.  The new host is ready to take over and we plan to do the switch over to the new host Monday 2024-04-29.  The swap is expected to take less than 5 minutes and in the ideal case should not be detectable by its users.

The switch took longer than expected, due to issues with routing affecting its services.  The new server was put in production today, 2024-05-13.

Published Apr. 25, 2024 9:16 AM

We're experiencing some issues with our data portal, but we're on it and working to get things back to normal. Thanks for your patience while we fix the problem.

Published Apr. 1, 2024 10:24 AM

Several reservations (including tsd) on Colossus are currently unavailable. The Sigma2 allocation is not affected.

 

Published Mar. 25, 2024 12:44 PM

Dear TSD user,

We need to temporarily halt self-service for maintenance. We apologize for the inconvenience.

TSD Team

Published Mar. 20, 2024 10:59 AM

From 2024-03-20 17:30, the first 100 (of 250) submit hosts will be migrated to a new virtualization cluster overnight. This requires the VMs to be powered off, migrated and powered on. The downtime per host will be about 5-10 minutes.

Update 2024-03-21: The remaining submit hosts will be migrated from 2024-03-21 17:30.

Published Mar. 13, 2024 2:00 PM

Applicants who use TSD Self Service to apply for membership in a TSD project currently end up in a loop when they return from ID-porten, which prevents them from submitting their application.

We are working on the problem.

Published Mar. 12, 2024 3:04 PM

[2024-03-18 11:00: update] The maintenance is now over, and jobs are running again.

 

On Monday 2024-03-18 at 10:00 there will be a short maintenance stop to apply a critical configuration change.

A maintenance reservation has been set in Slurm. Any submitted jobs that cannot complete before the downtime will remain pending until after the downtime.
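As a rough illustration of why some jobs stay pending, here is a minimal Python sketch of the check involved. It assumes a job can only start if its requested time limit fits entirely before the reservation begins. The function name and the example times are hypothetical, and this is a simplification, not Slurm's actual scheduling code.

```python
from datetime import datetime, timedelta


def fits_before_reservation(start: datetime,
                            time_limit: timedelta,
                            reservation_start: datetime) -> bool:
    """Return True if a job starting at `start` with the requested
    `time_limit` would end before the maintenance reservation begins.

    Simplified model (an assumption): the scheduler compares the job's
    latest possible end time (start + time limit) with the reservation start.
    """
    return start + time_limit <= reservation_start


# Hypothetical example: maintenance begins 2024-03-18 10:00.
maintenance = datetime(2024, 3, 18, 10, 0)
now = datetime(2024, 3, 18, 8, 0)

short_job = fits_before_reservation(now, timedelta(hours=1), maintenance)
long_job = fits_before_reservation(now, timedelta(hours=4), maintenance)

print(short_job)  # the 1-hour job can finish before the downtime
print(long_job)   # the 4-hour job would overlap and stays pending
```

In practice this means that requesting a realistic, shorter time limit can let a job run before the downtime instead of waiting until after it.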

Published Mar. 11, 2024 11:07 AM

The TSD IAM system will undergo a quick upgrade today between 15:00 and 15:15. The following services will be unavailable during this period:

1) Selfservice

2) Nettskjema new forms activation

3) Command line UR

Published Feb. 28, 2024 12:17 PM

We're currently facing storage issues causing input/output errors that are affecting many software applications, including Stata. Our team is actively working on resolving this matter.

Published Feb. 15, 2024 3:32 PM

[2024-02-21 17:49 update] All affected jobs have been requeued. 85 jobs had to be cancelled, so please inspect the output of your jobs to see if they're affected.

[2024-02-21 15:45 update] Several jobs that were running at the start of the upgrade did not successfully resume. We're trying to resolve the issue. New jobs are not affected.

[2024-02-21 10:35: update] The upgrade is now done, and seems to have gone well.

[2024-02-21 10:00: update] The upgrade has now started

The queue system on Colossus will be upgraded on Wednesday (February 21) at 10:00. During the upgrade, running jobs will be suspended, and Slurm commands (squeue, sbatch, etc.) will not work. We expect the upgrade to take no more than 20 minutes.

Published Feb. 14, 2024 7:23 AM

TSD is performing network maintenance (on the DNS service) at 10:00 CET today.

Published Feb. 7, 2024 7:30 AM

We're experiencing technical difficulties with our project creation service. Our team is actively working on resolving the issue to restore full functionality as soon as possible. We apologize for any inconvenience this may cause and appreciate your patience during this time.

Published Jan. 31, 2024 1:30 PM

We are currently experiencing technical difficulties with core services, and are working to restore operations. Sorry for the inconvenience caused by this.

Published Jan. 29, 2024 12:50 PM

Slurm has been restarted on several compute nodes to resolve an issue. Please check the output of your jobs to see if they've been affected.

Published Jan. 26, 2024 10:10 AM

We're currently experiencing issues with some nodes on Colossus. Jobs on these nodes might have crashed and been requeued. Please check the output of your jobs to see if they've been affected.

Published Jan. 26, 2024 8:59 AM

Some users are currently facing issues logging in to the Data Portal to export/import data. The specific error message they encounter is "An unexpected error has occurred which may affect the proper functioning of the application." If you also experience this error while attempting to log in to the Data Portal, please notify us by emailing tsd-drift@usit.uio.no.

Published Jan. 11, 2024 12:10 PM

TSD will be upgrading the storage system, which may cause some instability on the Windows and Linux VMs.

Published Jan. 10, 2024 1:35 PM

We've updated our password policy, taking effect on January 8th, 2024. This change is part of our commitment to enhancing security and safeguarding sensitive information.

All TSD users are now required to update their passwords at least once every year. This practice is essential to maintain a high level of security. You may change your password at any time by logging into TSD's Selfservice Portal: https://selfservice.tsd.usit.no/profile/change-password

You will receive an email notification 30 days before your password expiration date, providing sufficient time for a timely update.
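The timing can be sketched as simple date arithmetic. This is only an illustration: it assumes "once every year" means 365 days after the last password change, and the function and constant names are hypothetical rather than anything TSD uses.

```python
from datetime import date, timedelta

# Assumptions for illustration only: a 365-day password lifetime
# and a notification sent 30 days before expiration.
PASSWORD_LIFETIME = timedelta(days=365)
NOTICE_PERIOD = timedelta(days=30)


def password_dates(last_changed: date) -> tuple[date, date]:
    """Return (expiration date, notification date) for a password
    that was last changed on `last_changed`."""
    expires = last_changed + PASSWORD_LIFETIME
    notify = expires - NOTICE_PERIOD
    return expires, notify


expires, notify = password_dates(date(2024, 1, 8))
print(expires)  # 2025-01-07 (2024 is a leap year, so 365 days ends one day early)
print(notify)   # 2024-12-08
```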

Users with overdue password changes will be contacted, with the first group of users contacted December 11th, 2023 and required to complete a mandatory password change by January 8th, 2024.

Accounts that have not complied with the password update requirement by the deadline will be temporarily suspended. Access will be restored upon u...

Published Jan. 8, 2024 12:30 PM

ID-porten has a login problem. Please follow:

https://status.digdir.no/incidents/ctml93xm9lnh

It impacts both TSD and Nettskjema logins.

Published Dec. 22, 2023 9:50 AM

We are currently experiencing some issues with file import through the Data Portal and are looking into the cause of the problem.

Published Dec. 15, 2023 7:58 AM

This affected TSD systems that relied on NFS.

[Update 08:48]

The core problem is resolved and most systems are up again. We are still investigating the reason for the problems, and some systems may still be unstable.

[Update 11:00]

All systems should work as normal.

Published Nov. 28, 2023 10:11 AM

TSD will be upgrading software on the storage system Thursday, 2023-11-30 from 08:00 CET. We expect storage instability on the Windows and Linux VMs throughout the day.

Around 10:00 the storage system will be shut down for an estimated 15 minutes, which means network storage will be inaccessible on all TSD hosts (Windows and Linux) as well as on our central services (file import/export, etc.). To be on the safe side, please close any programs and log off from your VM prior to the downtime.

A maintenance reservation has been set on Colossus from 08:00. This means any jobs that cannot complete before the downtime will remain pending until after the maintenance completes. They'll resume automatically.

Our automation should fix any file system hangs that may occur, and we will be on standby to fix any remaining issues that do not automatically recover.

Apologies for the short notice, we've been in dialogue with IBM to alleviate storage instabi...

Published Nov. 22, 2023 3:00 PM

TSD will be upgrading software on the storage system tomorrow, 2023-11-17 08:00 - 09:00 CET. Our automation should fix any file system hangs that may occur, and we will be on standby to fix any remaining issues that do not automatically recover. Apologies for the short notice, we've been in dialogue with IBM to alleviate storage instability and want to act on their latest recommendations as fast as possible.