About This Project

Ethical, effective analysis of real-world data takes practice with the right tools and techniques. To strengthen these skills, I collaborated with two peers from the University of Arizona to analyze several public datasets from the City of Tucson.

Datasets

Findings

We found that lower-income neighborhoods, particularly in Wards 3 and 5, experience higher rates of theft and violent crime, which is consistent with a link between income inequality and crime. The presence of streetlights, however, did not correspond with lower crime rates. Streetlights were in fact more common in high-crime areas, which suggests they are installed in response to crime rather than acting as a deterrent. Machine learning models, especially Random Forest, were effective at predicting high-crime areas, but such predictions raise ethical concerns about over-policing and data bias. These concerns underscore the need for thoughtful, equitable policy interventions.
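As a rough illustration of how the streetlight observation could be checked, the sketch below computes a Pearson correlation between streetlight density and crime rate across areas. It is not the project's actual code: the variable names, the per-area units, and the numbers are all made-up placeholders, chosen only to show the calculation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Placeholder values for five hypothetical areas (NOT from the Tucson datasets).
streetlights_per_km2 = [10, 20, 30, 40, 50]
crimes_per_1000 = [12, 15, 21, 24, 33]

r = pearson(streetlights_per_km2, crimes_per_1000)
print(f"Pearson r = {r:.2f}")
```

A positive coefficient would only show that the two measures rise together. It says nothing about direction, which is why the reading that streetlights follow crime remains an interpretation rather than a measured result.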