NARA’s 1940 Census Launch Delivers Lessons In Big Data, Cloud Computing

by Judi Hasson | for

When it comes to big data and high public demand, the cloud can be a federal agency’s salvation. That’s what the National Archives and Records Administration learned during the recent and long-anticipated 1940 census launch, the largest-ever release of publicly available data in the federal government.

With 22.5 million requests in the first three hours of April 2, site response slowed to the point that frustrated users abandoned efforts. NARA leaders took steps to avoid a site crash and were able to do so thanks to the right mix of technology and people. But there were lessons learned as well.

“It started to manifest itself almost instantaneously,” NARA’s CIO Michael Wash told AOL Government. “The number of visitors ballooned. There was a lot of advertising that went into the 9 a.m. release date. People were pent up and ready to go to the site. … Things are very normal now. The site is performing very well. We’re getting very good comments from users out there.”

NARA had developed detailed plans, and testing indicated that the agency and its partners (Inflection, which developed the site, and Amazon Web Services, which hosted it) would be able to handle the expected load. However, Wash said, they were anticipating 500,000 visitors an hour, not millions.

The site was overwhelmed almost immediately, at 9:02 a.m. After six hours online, it had logged 67 million hits, with users running into the same problem: images from the census were slow to return or failed to load.
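To put those figures in perspective, the totals reported here can be turned into average rates. The short Python sketch below does only that arithmetic; the function name is illustrative, and it assumes traffic was spread evenly over each window, which it almost certainly was not. Requests and hits are also not the same thing as visitors, so the results are not directly comparable to the 500,000-visitors-an-hour planning figure.

```python
SECONDS_PER_HOUR = 3600


def average_rates(total_requests, hours):
    """Return (requests per hour, requests per second), averaged over the window."""
    per_hour = total_requests / hours
    per_second = per_hour / SECONDS_PER_HOUR
    return per_hour, per_second


# Totals as reported in the article; even spread over time is an assumption.
first_three_hours = average_rates(22_500_000, 3)
first_six_hours = average_rates(67_000_000, 6)

for label, (per_hour, per_second) in [
    ("first 3 hours", first_three_hours),
    ("first 6 hours", first_six_hours),
]:
    print(f"{label}: {per_hour:,.0f} requests/hour, about {per_second:,.0f} per second")
```

On these averages, the site was fielding roughly 7.5 million requests an hour in the first three hours, or a little over 2,000 a second, and the six-hour average was higher still.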

Wash convened a virtual “war room” with seven NARA officials and nearly two dozen from Inflection scattered in many locations. He talked through the problems in conference calls at the National Archives and regularly posted updates on Facebook for the frustrated public to read.

After discovering the network couldn’t handle the load, Amazon added computing power, servers and more processing capacity. Meanwhile, other team members reconfigured how images were processed, shortening the time it took to get an image from the database to a customer’s screen.

Relying on the cloud allowed NARA to solve the problem by enabling adjustments to scaling capacity and increasing the site’s ability to deliver a lot of data quickly, Wash said.