From chudler at cs.uchicago.edu Fri Oct 8 13:48:56 2021
From: chudler at cs.uchicago.edu (Colin Hudler)
Date: Fri, 8 Oct 2021 13:48:56 -0500
Subject: [SLURM] FYI: Node a008 down one card
Message-ID: <79DF47B3-14FC-4FBA-8554-4FB5B4CB6664@cs.uchicago.edu>

Node a008 is now at 3/4 capacity because a card has stopped responding. The scheduler also got weird around this time, but that is influenced a lot by a combination of different user activities.

I'm off campus at the moment, so I will investigate in full on Monday (tentative). Right now the node is schedulable with 1-3 GPUs, but please let me know if it doesn't operate properly (I didn't run my own job on it).