In this first week we ask: where do computer science and journalism intersect? CS techniques can help journalism in four different areas: data-driven reporting, story presentation, information filtering, and effect tracking.
Then we jump right in with the concept of data. Specifically, we study feature vectors, which are a fundamental data representation for many algorithms in data mining, language processing, machine learning, and visualization. This week we explore two things: representing objects as vectors, and visualizing high-dimensional spaces.
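To make the idea of representing objects as vectors concrete, here is a minimal Python sketch that turns short texts into word-count vectors and compares them. The sample sentences, the function names, and the choice of cosine similarity are illustrative assumptions, not material from the class.

```python
import math
from collections import Counter

def build_vocabulary(documents):
    """Return a sorted list of every distinct word across all documents."""
    return sorted({word for doc in documents for word in doc.lower().split()})

def to_vector(document, vocabulary):
    """Map a document to a word-count vector, one dimension per vocabulary word."""
    counts = Counter(document.lower().split())
    return [counts[word] for word in vocabulary]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical sample texts, invented for this sketch.
docs = [
    "the council voted on the budget",
    "the budget vote split the council",
    "rain is expected this weekend",
]
vocab = build_vocabulary(docs)
vectors = [to_vector(d, vocab) for d in docs]

print(cosine_similarity(vectors[0], vectors[1]))  # the two budget stories share words
print(cosine_similarity(vectors[0], vectors[2]))  # no words in common
```

Each document becomes a point in a space with one dimension per vocabulary word, so even three short sentences already live in a high-dimensional space. That is why visualization needs some way of reducing dimensions.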
We also explored a principal components analysis of voting data from the UK House of Lords. The R file we ran to produce the output is here. A more sophisticated analysis, using custom distance metrics and multi-dimensional scaling, is here.
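The class analysis was done in R. As a rough illustration of the same idea, the Python sketch below finds the first principal component of a small voting matrix and projects each member onto it. The vote matrix is invented for this example (rows are members, columns are votes, +1 for aye, -1 for no, 0 for absent), and power iteration is just one simple way to find the leading component, not necessarily what the R code does.

```python
import math

def center_columns(matrix):
    """Subtract each column's mean so every vote is centered at zero."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]

def covariance_matrix(centered):
    """Sample covariance between every pair of columns (votes)."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]

def first_principal_component(cov, iterations=200):
    """Power iteration: repeatedly multiply by cov and renormalize.

    Converges to the direction of greatest variance, provided the
    starting vector is not orthogonal to that direction.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v

# Hypothetical data: the first three members tend to vote together,
# as do the last three. The fourth vote cuts across both groups.
votes = [
    [ 1,  1, -1,  1],
    [ 1,  1, -1, -1],
    [ 1,  0, -1,  1],
    [-1, -1,  1,  1],
    [-1, -1,  1, -1],
    [-1,  0,  1,  1],
]

centered = center_columns(votes)
pc1 = first_principal_component(covariance_matrix(centered))

# Each member's position along the first principal component.
scores = [sum(x * w for x, w in zip(row, pc1)) for row in centered]
for member, score in enumerate(scores):
    print(member, round(score, 3))
```

In this toy data the two voting blocs land on opposite sides of zero along the first component. The sign of a principal component is arbitrary, so only the separation is meaningful, not which bloc comes out positive.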
Readings:
- What should the digital public sphere do?, Jonathan Stray
- Computational Journalism, Cohen, Turner, Hamilton
- Precision Journalism, Ch.1, Journalism and the Scientific Tradition, Philip Meyer
Viewed in class:
- The Jobless Rate for People Like You, New York Times
- Dollars for Docs, ProPublica
- What did private security contractors do in Iraq? and document mining methodology, Jonathan Stray