Key Takeaways: Even though we are amassing loads of data, we're just beginning to figure out how to use it to improve policymaking. Here are some of the main insights shared tonight by faculty and policymakers who are working at the forefront of this challenge. The University of Chicago Harris School of Public Policy hosted the event.
1. Chris Berry, Associate Professor, saw the need for the Harris School's new program in data science and public policy as a natural progression of two emerging, related, but "not yet touching" trends:
RCTs (randomized controlled trials), which have been used to test policy, especially in health and the social sciences.
The data science movement, in which cities especially collect data to improve outcomes in areas such as policing or restaurant inspections.
The future is to bring these fields together, but it is difficult to do so. Why? Similar underlying goals, but:
RCTs try to identify cause and effect. They evaluate a program and ask: does x cause y?
Machine learning asks a different set of questions and makes predictions. If I have a giant set of x's, how can I predict y?
Very different intellectual agendas.
"You will do a bad job in one if you use the tools from the other."
There are three common "areas of collision" between these two approaches:
Using one approach's technique to answer the other approach's question, which leads to bad outcomes.
Serendipity between the two, where elements of each work together for good outcomes. (He dubbed this the "you got chocolate in my peanut butter" approach.)
An emerging area where you gather data with new techniques that can be used in both approaches to identify causal OR predictive relationships.
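To make Berry's distinction between "does x cause y?" and "can x predict y?" a bit more concrete, here is a minimal sketch on made-up data. Nothing in it comes from the talk: the `need` trait, the assumed true effect of 2.0, and the function names are all hypothetical. It compares a coin-flip assignment (as in an RCT) with self-selected enrollment (as in typical observational data) using the same simple difference in averages.

```python
import random
from statistics import mean


def diff_in_means(rows):
    """Average outcome of the treated minus average outcome of the untreated."""
    treated = [y for x, y in rows if x]
    control = [y for x, y in rows if not x]
    return mean(treated) - mean(control)


def simulate(n=20_000, true_effect=2.0, seed=0):
    """Return (rct_estimate, observational_estimate) on synthetic data.

    All numbers here are invented for illustration only.
    """
    rng = random.Random(seed)
    rct_rows, obs_rows = [], []
    for _ in range(n):
        need = rng.gauss(0, 1)   # hidden trait that lowers the outcome y
        noise = rng.gauss(0, 1)
        # RCT: a coin flip decides who gets the program (x).
        x_rct = rng.random() < 0.5
        # Observational data: people with higher need tend to enroll.
        x_obs = need + rng.gauss(0, 1) > 0
        rct_rows.append((x_rct, -3 * need + true_effect * x_rct + noise))
        obs_rows.append((x_obs, -3 * need + true_effect * x_obs + noise))
    return diff_in_means(rct_rows), diff_in_means(obs_rows)


if __name__ == "__main__":
    rct_estimate, obs_estimate = simulate()
    print(f"RCT estimate of the effect:      {rct_estimate:.2f}")
    print(f"Naive observational difference:  {obs_estimate:.2f}")
```

Under these assumptions the randomized estimate should land near the assumed effect of 2.0, while the naive observational difference is pulled below zero, because enrollment is tangled up with need. Enrollment is still useful for predicting y in the observational data; it just does not answer the causal question.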
2. Charlie Catlett, Director, Urban Center for Computation and Data. Catlett works on projects that embed data-collection sensors within cityscapes (think pollution readings and water levels). There is great potential for understanding what is happening all over a city in new ways, and sensors can be added at relatively low cost. Chicago is one of the most progressive cities in using data and sharing it publicly.
Policymakers don't want to know about his charts and graphs. They come to him and ask how they can get more done with limited resources by using data analytics. He helped the City of Chicago manage 32 inspectors for 15,000 restaurants by shifting from random inspections to risk-based predictions. As a result, infractions were found 7 days more quickly.
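The idea of switching from random to risk-based inspections can be sketched in a few lines. This is only an illustration on synthetic data, not the city's actual model: the risk scores, the `per_day` capacity, and the assumption that a restaurant's chance of a violation equals its score are all hypothetical.

```python
import random
from statistics import mean


def mean_days_to_find(order, has_violation, per_day):
    """Average day on which restaurants with violations get inspected."""
    days = [
        position // per_day + 1
        for position, restaurant in enumerate(order)
        if has_violation[restaurant]
    ]
    return mean(days)


def compare_orders(n=1500, per_day=30, seed=1):
    """Return (random_order_days, risk_order_days) on made-up data."""
    rng = random.Random(seed)
    # Hypothetical predicted risk score per restaurant, between 0 and 1.
    risk = [rng.random() for _ in range(n)]
    # Assumption: the chance of a real violation equals the risk score.
    has_violation = [rng.random() < score for score in risk]

    random_order = list(range(n))
    rng.shuffle(random_order)
    # Risk-based order: visit the highest predicted risk first.
    risk_order = sorted(range(n), key=lambda i: risk[i], reverse=True)

    return (
        mean_days_to_find(random_order, has_violation, per_day),
        mean_days_to_find(risk_order, has_violation, per_day),
    )


if __name__ == "__main__":
    random_days, risk_days = compare_orders()
    print(f"Random order:     violations found on day {random_days:.1f} on average")
    print(f"Risk-based order: violations found on day {risk_days:.1f} on average")
```

With a fixed number of inspections per day, visiting the highest-scoring restaurants first moves the violations toward the front of the schedule, so they are found sooner on average. How much sooner depends entirely on how well the scores track real violations, which this toy setup simply assumes.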
Two big challenges they are grappling with currently are:
It's not a question of including privacy protections but rather of including a process for validating those issues and infractions.
Where do you put sensor devices like these? Catlett has found that starting in the community and asking residents which problems they want help solving is key.
Chicago is on the verge of ramping up these sensors around the city.
3. Panel discussion featuring Brenna Berman, the city of Chicago's CIO; Rick Stevens, Professor of Computer Science; and Dan Gaylin, President and CEO of NORC.
Stevens advocated for the use of data-driven models or simulations to explore the implications of policies, with the goal of avoiding bad policy altogether or identifying the implications of a policy before it gets implemented. He gave the example that soon there will be a million driverless cars on the roads; it's impossible to survey our way to an understanding of the implications of that. Models can help. Building on Berry's analysis above, he says that instead of lining up all the x's and asking whether they can predict y, we can integrate many variations of x into the models and investigate their impact that way.
Berman says the city can always find a way to work through the technological challenges of data, but shifting mindsets and processes is much harder. She said computer simulations of possible data projects have helped the community better understand the implications and give their input. [City of Chicago Digital Hub]
Photo: Chris Berry presenting.