The Art and Science of Data-Driven Journalism

Слайд 0

The Art and Science of Data-Driven Journalism. Alexander B. Howard, Tow Fellow, Columbia University. May 30, 2014

Слайд 1

You know something, John Snow.

Слайд 2

This John Snow knew something.

Слайд 3

Newspapers have used data for centuries. Source: The Guardian

Слайд 4

1960s: computer-assisted reporting (CAR). Image: Bob Woodward, via Cliff1066

Слайд 5

Traditional tools applying tech to journalism: calculators and graphs; mainframes and PCs; spreadsheets; databases; text and code editors; statistics; programming

Слайд 6

In the 1990s, government and civil society spread the Internet globally

Слайд 7

In the 2000s, mobile phones and social networking connected us ever more

Слайд 8

In the 2010s, data creation exploded. Image Credit: Real Time Rome from Senseable.MIT.edu

Слайд 9

“Data-driven journalism is the future” Source: Tim Berners-Lee in the Guardian

Слайд 10

…combined with new tools & context: online spreadsheets and wikis; data visualization tools; open source frameworks; code sharing; agile development; cloud storage and processing (EC2 & Heroku); more data and more access; privacy and security risks

Слайд 11

2014: data journalism is the present. Gathering, cleaning, organizing, analyzing, visualizing and publishing data to support the creation of acts of journalism.

Слайд 12

Слайд 13

Trendy but not new. The collection, protection and interrogation of data as a source, complementing traditional “shoe leather” investigative reporting that relies on witnesses, experts and authorities.

Слайд 14

Слайд 15

Dollars for Docs

Слайд 16

The Guardian

Слайд 17

Chicago Tribune Flame retardants

Слайд 18

Слайд 19

A tangled web

Слайд 20

Слайд 21

Los Angeles Times

Слайд 22

Слайд 23

La Nacion

Слайд 24

Reuters: Connected China

Слайд 25

Слайд 26

Слайд 27

Слайд 28

Best practices?

Слайд 29

Report it out

Слайд 30

Слайд 31

Show people something new about the world

Слайд 32

Слайд 33

Tell a story

Слайд 34

Center for Public Integrity

Слайд 35

Storytelling still matters. “We use these tools to find and tell stories. We use them like we use a telephone. The story is still the thing.” - Anthony DeBarros, USA Today. Source: Data Journalism and the Big Picture

Слайд 36

Make it personal

Слайд 37

Слайд 38

Understand the context for the data

Слайд 39

Слайд 40

Show your data

Слайд 41

Слайд 42

Show your work

Слайд 43

Слайд 44

Share your code

Слайд 45

Слайд 46

Consider ethics

Слайд 47

Questions: Is the data clean? Is the data representative? What biases might be hidden in the data? Was the data legally obtained? Does the data contain personally identifiable information (PII)?
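A few of these questions can be given a rough first pass in code before any reporting starts. The sketch below is only illustrative: the rows, the field names and the email pattern are all assumptions invented for this example, and a real PII review needs far more than a regular expression.

```python
import re

# Hypothetical rows, standing in for a dataset a newsroom has received.
rows = [
    {"name": "A. Smith", "city": "Chicago", "note": "contact: a.smith@example.com"},
    {"name": "B. Jones", "city": "", "note": "no comment"},
    {"name": "A. Smith", "city": "Chicago", "note": "contact: a.smith@example.com"},
]

# A deliberately loose pattern for email-like strings (an assumption, not a standard).
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

# Is the data clean? Count empty fields and exact duplicate rows.
missing = sum(1 for row in rows for value in row.values() if value == "")
duplicates = len(rows) - len({tuple(sorted(row.items())) for row in rows})

# Does the data contain PII? Count fields that look like they hold an email address.
possible_pii = sum(
    1 for row in rows for value in row.values() if EMAIL_PATTERN.search(value)
)

print(f"Empty fields: {missing}")
print(f"Duplicate rows: {duplicates}")
print(f"Fields with email-like text: {possible_pii}")
```

Checks like these only flag where to look; whether the data is representative or legally obtained still has to be reported out.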

Слайд 48

Collection: Who gathered the data? How? Was it clear how the data would be used? Can people opt out of collection or usage? “Notice and consent” is not enough. “Privacy by design” applies to news apps.

Слайд 49

Слайд 50

Data Analysis & Numeracy: N = ? Average vs. median. Statistical significance? Correlation != causation. Regression to the mean.
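The "average vs. median" point is easy to see with a small worked example. The salary figures below are hypothetical, chosen so that a single outlier pulls the average far away from what a typical person earns.

```python
from statistics import mean, median

# Hypothetical annual salaries; one executive salary skews the distribution.
salaries = [38_000, 41_000, 42_000, 45_000, 47_000, 52_000, 400_000]

n = len(salaries)           # always report N alongside any summary
average = mean(salaries)    # sensitive to the single large value
middle = median(salaries)   # the value in the middle of the sorted list

print(f"N = {n}")
print(f"Average: {average:,.0f}")
print(f"Median: {middle:,.0f}")
```

Here the average is 95,000 while the median is 45,000, so a headline built on the "average salary" would describe almost no one in the dataset.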

Слайд 51

Слайд 52


Слайд 53

Bad Data Viz wtfviz.net

Слайд 54

Present data with context, in context

Слайд 55

Be aware of de-anonymization risks
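One rough way to gauge this risk is to count how many records share each combination of quasi-identifiers, the idea behind k-anonymity. This is a minimal sketch, assuming made-up records and an assumed choice of quasi-identifier fields; it is not a complete privacy audit.

```python
from collections import Counter

# Hypothetical "anonymized" records: names removed, quasi-identifiers left in.
records = [
    {"zip": "10027", "birth_year": 1980, "sex": "F"},
    {"zip": "10027", "birth_year": 1980, "sex": "F"},
    {"zip": "10027", "birth_year": 1975, "sex": "M"},
    {"zip": "10031", "birth_year": 1990, "sex": "F"},
    {"zip": "10031", "birth_year": 1990, "sex": "F"},
    {"zip": "10031", "birth_year": 1962, "sex": "M"},
]

# Assumed quasi-identifiers: fields that could be matched against outside data.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def group_sizes(rows, keys):
    """Count how many rows share each combination of the given keys."""
    return Counter(tuple(row[key] for key in keys) for row in rows)


sizes = group_sizes(records, QUASI_IDENTIFIERS)
k = min(sizes.values())  # size of the smallest group
unique_rows = [combo for combo, count in sizes.items() if count == 1]

print(f"k = {k}")
print(f"Combinations that single out one person: {unique_rows}")
```

A result of k = 1 means at least one person is unique on these fields alone and could be re-identified by anyone holding a second dataset with the same fields.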

Слайд 56

Emerging trends

Слайд 57


Слайд 58

Networked reporting of corruption. ICIJ: Offshore Leaks

Слайд 59

International Consortium of Investigative Journalists: Offshoring $. 80 journalists; 40 countries; 260 gigabytes; 2.5 million files

Слайд 60

Create your data. “If Stage 1 of data journalism was ‘find and scrape data,’ then… Stage 2 was ‘ask government agencies to release data’ in easy-to-use formats. Stage 3 is going to be ‘make your own data,’ and those sources of data are going to be automated and updated in real-time.” - Javaun Moradi, Mozilla

Слайд 61

Safecast open source Geiger counter

Слайд 62

Networked accountability

Слайд 63

Bus route in Nairobi, Kenya

Слайд 64

Sensor Journalism

Слайд 65

Слайд 66

Слайд 67

Citizens as Sensors: Andhra Pradesh

Слайд 68

Drones + data collection

Слайд 69

Privacy challenges

Слайд 70

Слайд 71

Open Data, FOIA & Press Freedom

Слайд 72

An expanding number of data sources

Слайд 73

Слайд 74

Слайд 75

Social data and crisis data

Слайд 76

Open government data platforms

Слайд 77

Слайд 78

Слайд 79

Fauxpen Data. In an age of “openwashing,” we need to: evaluate licenses; peruse the Terms of Service; review the governance; look at community; check the format.

Слайд 80

Слайд 81

Слайд 82

Center for Public Integrity

Слайд 83

Accountability for “personalized redlining” Gun map graphic

Слайд 84

Transparency for geographic profiling. WSJ: Websites vary prices, based upon user information

Слайд 85

Monitoring predictive policing. Verge: Chicago crime and profiling. Geekwire: Predictive Policing

Слайд 86

Investigating human tissue trafficking. ICIJ: The data behind skin and bone

Слайд 87

Data + journalism + activism + responsive institutions = social change

Слайд 88

The fun part: predictions, prognostications and recommendations!

Слайд 89

1) Data will become even more of a strategic resource for media.

Слайд 90

2) Better tools will emerge that democratize data skills.

Слайд 91

3) News apps will explode as a primary way people consume data journalism.

Слайд 92

4) Being digital first means being data-centric and mobile-friendly.

Слайд 93

5) Expect more robo-journalism. Human relationships and storytelling still matter.

Слайд 94

6) More journalists will need to study the social sciences and statistics. Source: Ed Yong

Слайд 95

7) There will be higher standards for accuracy and corrections. Source: Jake Harris

Слайд 96

8) Competency in security and data protection will become more important. Source: Jake Harris

Слайд 97

9) Demand for more transparency on reader data collection and use. Source: eConsultancy

Слайд 98

10) More conflicts over public records, data scraping, and ethics will arise. Gun map graphic

Слайд 99

12) Data-driven personalization and predictive news in wearables.

Слайд 100

13) More diverse newsrooms will produce better (data) journalism. A 2013 ASNE survey of 68 online news organizations found that 63% of them had no minorities. Source: The Atlantic

Слайд 101

14) Be mindful of data-ism and bad data. Embrace skepticism.