Initial Response to Nov. 5, 2014 Data Center 2 Outage

For Immediate Release

Nov. 6, 2014, 10:30 a.m.

Last night (Wednesday, Nov. 5), the Office of Information Technology (OIT) worked to restore multiple campus IT systems and services that experienced intermittent interruptions due to excessive heat in Data Center 2.

At about 7:45 p.m., the chilled water system that cools Data Center 2 failed. The switchover to the backup cooling system did not occur automatically as it should have. Monitoring systems detected the rising temperature, and OIT and Facilities staff were alerted. The center’s internal temperature rose well above 100 degrees before the backup cooling system could be manually activated.

As a result of the heat build-up, OIT staff began shutting down services in the data center. Several services shut themselves down automatically. Many campus IT services were degraded or went offline, including the primary Web servers and access to ResNet, the campus residential network.

After the temperature stabilized, OIT staff began restarting services. Work continued through the night to bring services back up, to verify the integrity of databases, and to stop scheduled batch updates that would not have had time to run successfully. Most major services were restored by approximately 1 a.m. today. Some services, such as the High Performance Computing cluster, remain unavailable as of 10:30 a.m. today.

A root cause analysis is underway, along with further evaluation of emergency procedures. OIT apologizes for any inconvenience this outage has caused and appreciates the hard work of the Facilities, campus IT, and OIT staff who helped to recover services as quickly as possible.

For updates on the outage and recovery efforts, see Sysnews.

Primary Contacts:
Stan North Martin, OIT Outreach, Communications & Consulting, 919-515-1348
John Black, OIT Infrastructure, Systems and Operations, 919-515-0042