Law enforcement agencies are increasingly using predictive policing systems to forecast criminal activity and allocate police resources. Yet in numerous jurisdictions, these systems are built on data produced during documented periods of flawed, racially biased, and sometimes unlawful practices and policies (“dirty policing”). These policing practices and policies shape the environment and the methodology by which data is created, which raises the risk of creating inaccurate, skewed, or systemically biased data (“dirty data”). If predictive policing systems are informed by such data, they cannot escape the legacies of the unlawful or biased policing practices that they are built on. Nor do current claims by predictive policing vendors provide sufficient assurances that their systems adequately mitigate or segregate this data.
In our research, we analyze thirteen jurisdictions that have used or developed predictive policing tools while under government commission investigations or federal court-monitored settlements, consent decrees, or memoranda of agreement stemming from corrupt, racially biased, or otherwise illegal policing practices. In particular, we examine the link between unlawful and biased police practices and the data available to train or implement these systems. We highlight three case studies: (1) Chicago, an example where dirty data was ingested directly into the city’s predictive system; (2) New Orleans, an example where the extensive evidence of dirty policing practices and recent litigation suggests an extremely high risk that dirty data was or could be used in predictive policing; and (3) Maricopa County, where despite extensive evidence of dirty policing practices, a lack of public transparency about the details of various predictive policing systems restricts a proper assessment of the risks. These findings have widespread ramifications for predictive policing writ large. Deploying predictive policing systems in jurisdictions with extensive histories of unlawful police practices presents elevated risks that dirty data will lead to flawed or unlawful predictions, which in turn risk perpetuating additional harm via feedback loops throughout the criminal justice system. The use of predictive policing must be treated with high levels of caution, and mechanisms for the public to know, assess, and reject such systems are imperative.