Demographic and Water Data Demonstration

2016 Data Innovation Challenge
Team Members
Cyrus Pinto
Mike Dougherty
Joel Natividad
Sami Baig
Andrew Blythe


Project Description

We set out to examine how water quality relates to the demographics of the surrounding areas. In particular, we considered whether reported water quality and toxicity metrics correlate with the average income levels or ethnic makeup of a region. We examined records from the CEDEN Water Quality and Toxicity databases covering August 1, 2012 through July 31, 2015. The CEDEN data was joined with 2010 Census and income data by zip code.
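
As an illustration of joining measurements to demographics by zip code, here is a minimal Python sketch. It is not the code used for the project. The record layout, the field names, the function name `join_by_zip`, and the choice to average repeated measurements per zip code are all assumptions made for the example; the source data may have been combined differently.

```python
from collections import defaultdict
from statistics import mean


def join_by_zip(measurements, demographics):
    """Join water measurements to demographic records by zip code.

    measurements: iterable of (zip_code, analyte, value) tuples (assumed layout).
    demographics: dict mapping zip_code -> dict of demographic fields (assumed layout).
    Returns one row per (zip_code, analyte) pair that has demographic data.
    """
    # Collect every reading for the same zip code and analyte.
    grouped = defaultdict(list)
    for zip_code, analyte, value in measurements:
        grouped[(zip_code, analyte)].append(value)

    rows = []
    for (zip_code, analyte), values in sorted(grouped.items()):
        demo = demographics.get(zip_code)
        if demo is None:
            # Inner join: skip zip codes with no demographic record.
            continue
        # Assumption: repeated readings are summarized by their mean.
        rows.append(
            {"zip": zip_code, "analyte": analyte, "mean_value": mean(values), **demo}
        )
    return rows
```

Each returned row then carries both the summarized analyte level and the demographic fields for its zip code, which is the shape needed for a per-analyte correlation.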

Each substance or metric in the dataset was tested to see how well it correlated with five variables for the corresponding zip code: mean income; median income; percentage of the population that is White; percentage of the population that is Hispanic; and percentage of the population that is African American. Analytes such as Chlordane, Methoxychlor, Endrin Ketone and PBDE 140 showed very strong negative correlations with mean and median income as well as with the percentage of the population that is White. Meanwhile, the same analytes showed strong positive correlations with the percentages of the population that are African American or Hispanic. This suggests that these substances are found at higher levels in lower-income and/or predominantly African American and/or Hispanic neighborhoods.
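
The per-analyte test described above can be sketched as follows. The description does not say which correlation measure was used, so this example assumes the Pearson coefficient; the function names `pearson` and `correlate_analyte` and the row field names are hypothetical and chosen only for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined when either variable is constant.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def correlate_analyte(rows, value_key, demographic_keys):
    """Correlate one analyte's levels with each demographic variable.

    rows: list of dicts, one per zip code, for a single analyte (assumed layout).
    """
    levels = [row[value_key] for row in rows]
    return {
        key: pearson(levels, [row[key] for row in rows])
        for key in demographic_keys
    }
```

A coefficient near -1 for income together with a coefficient near +1 for a population percentage would correspond to the pattern reported for analytes such as Chlordane. A correlation of this kind describes association across zip codes and does not by itself establish a cause.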

Additional Resources