SAN FRANCISCO – Information about as many as 198 million registered voters was left on an open online database and only taken down when it was discovered by a cyber security analyst.
The data was stored on publicly accessible files in an Amazon cloud account used by a data analytics contractor employed by the Republican National Committee to help it identify potential audiences for television ads.
Deep Root Analytics’ main database included names, dates of birth, home addresses, phone numbers, and voter registration details for over 198 million registered voters, as well as data that appeared to be an attempt to guess each voter’s ethnicity and religion based on other information collected about them. The firm said it believes the data was a mix of proprietary information and publicly available voter data.
The RNC said it had halted any further work with the company pending the conclusion of an investigation into its security procedures. No proprietary RNC information was accessed, it said.
The discovery comes after a year of political turmoil during which the servers of the Democratic National Committee were hacked and emails stored on them were leaked. The attack has been attributed by federal intelligence officials to Russia, and former Democratic presidential candidate Hillary Clinton has placed some of the blame for her loss on the release of the stolen emails.
In this case, the RNC files were not hacked but instead were simply posted online without password protection. They were discovered by Chris Vickery, a cyber risk analyst with the company UpGuard who spends his days looking for exposed data online. He said the 25 terabytes of data was the biggest exposed data cache he’s ever found.
“It’s really rare to come across something of this magnitude,” he said.
Vickery discovered the files on June 12. He notified federal officials, and the files were removed from public access on June 14, he said.
The files were stored on Amazon Web Services (AWS) cloud storage, which requires passwords by default. In this case, someone would have had to deliberately change the security settings so that no password was required.
“It’s not the default setting. Somebody had to go in and do that,” Vickery said.
Although no directory of the files was visible online, accessing them required only an understanding of the naming conventions typically used for database files on AWS and the use of wildcards to search for possible hits, Vickery said.
Vickery said companies sometimes remove password protection because it slows down developers or other contractors who need to work with the data, or simply because it is easier. They instead rely on what’s known in the security world as “security through obscurity,” the idea that making something difficult to find in effect keeps it secret.
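The kind of misconfiguration Vickery describes can, in principle, be caught with an automated check. The sketch below is illustrative only: it assumes the storage exposes an access-control list shaped like the one Amazon S3 returns for a bucket, and it flags any permission granted to S3's two predefined broad groups. The function name `open_grants` and the sample data are hypothetical, and nothing here reflects how Deep Root Analytics' account was actually configured.

```python
# URIs that Amazon S3 uses for its two predefined broad groups.
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"
AUTHENTICATED_USERS = "http://acs.amazonaws.com/groups/global/AuthenticatedUsers"
OPEN_GROUPS = {ALL_USERS, AUTHENTICATED_USERS}


def open_grants(acl):
    """Return the permissions an ACL grants to broad, predefined groups.

    `acl` is assumed to be a dict shaped like an S3 bucket ACL response:
    {"Grants": [{"Grantee": {"Type": ..., "URI": ...}, "Permission": ...}]}
    """
    found = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        # Only group grantees carry a URI; owners are "CanonicalUser" entries.
        if grantee.get("Type") == "Group" and grantee.get("URI") in OPEN_GROUPS:
            found.append(grant.get("Permission"))
    return found


# Hypothetical sample data, not taken from the exposed account.
private_acl = {
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
         "Permission": "FULL_CONTROL"},
    ]
}
public_acl = {
    "Grants": private_acl["Grants"] + [
        {"Grantee": {"Type": "Group", "URI": ALL_USERS},
         "Permission": "READ"},
    ]
}

print(open_grants(private_acl))  # owner-only ACL: nothing flagged
print(open_grants(public_acl))   # the public READ grant is flagged
```

In practice the dictionary might come from a call such as boto3's `get_bucket_acl`. An access-control list is only one way storage can be opened to the public, so a check like this would be a starting point rather than a complete audit.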
Deep Root Analytics said in a statement that it took full responsibility for the situation. It has updated the account settings on the files and put protocols in place to prevent further access.