[SOLVED] ESS storage issues affecting submit hosts and Colossus

We have been experiencing NFS hangs on many Linux hosts that mount /cluster since 5:55 this morning.

It's also affecting /cluster on the Colossus compute nodes. The majority of compute nodes have been rebooted, which may have affected running jobs.

Update 12:00: The submit hosts and Colossus are currently unavailable.

Update 14:00: The issue has been resolved, and we're rebooting the submit hosts now.

Published Feb. 9, 2021 8:50 AM - Last modified Feb. 9, 2021 2:31 PM