
Data Science at the New York Times






Slide 0

data science @ The New York Times chris.wiggins@columbia.edu chris.wiggins@nytimes.com @chrishwiggins references: bit.ly/brown-refs


Slide 1

data science @ The New York Times


Slide 2

data science @ The New York Times


Slide 3

“data science” jobs, jobs, jobs


Slide 4

“data science” jobs, jobs, jobs


Slide 5

data science: mindset & toolset drew conway, 2010


Slide 6

modern history: 2009


Slide 7

modern history: 2009


Slide 8

“data science” ancient history: 2001


Slide 9

“data science” ancient history: 2001


Slide 10

data science context


Slide 11

home schooled


Slide 12

B.A. & M.Sc. from Brown


Slide 13

PhD in topology


Slide 14

“By the end of late 1945, I was a statistician rather than a topologist”


Slide 15

invented: “bit”


Slide 16

invented: “software”


Slide 17

invented: “FFT”


Slide 18

“the progenitor of data science.” - @mshron


Slide 19

“The Future of Data Analysis,” 1962 John W. Tukey


Slide 20

introduces: “Exploratory data analysis”


Slide 21

Tukey 1965, via John Chambers


Slide 22

TUKEY BEGAT S WHICH BEGAT R


Slide 23

Tukey 1972


Slide 24

Tukey 1975: In 1975, while at Princeton, Tufte was asked to teach a statistics course to a group of journalists who were visiting the school to study economics. He developed a set of readings and lectures on statistical graphics, which he further developed in joint seminars he subsequently taught with renowned statistician John Tukey (a pioneer in the field of information design). These course materials became the foundation for his first book on information design, The Visual Display of Quantitative Information.


Slide 25

TUKEY BEGAT VDQI


Slide 26

Tukey 1977


Slide 27

TUKEY BEGAT EDA


Slide 28

fast forward -> 2001


Slide 29

“The primary agents for change should be university departments themselves.”


Slide 30

data science histories: 1. slow burn @Bell: as heretical statistics (see also Breiman) 2. caught fire 2009-now: as job description. historical rant: bit.ly/data-rant


Slide 31

biology: 1892 vs. 1995


Slide 32

biology: 1892 vs. 1995 biology changed for good.


Slide 33

biology: 1892 vs. 1995 new toolset, new mindset


Slide 34

genetics: 1837 vs. 2012 ML toolset; data science mindset


Slide 35

genetics: 1837 vs. 2012


Slide 36

genetics: 1837 vs. 2012 ML toolset; data science mindset arxiv.org/abs/1105.5821 ; github.com/rajanil/mkboost


Slide 37

data science: mindset & toolset


Slide 38

1851


Slide 39

news: 20th century church state


Slide 40

church


Slide 41

church


Slide 42

church


Slide 43

news: 20th century church state


Slide 44

news: 21st century church state engineering


Slide 45

newspapering: 1851 vs. 1996


Slide 46

example: millions of views per hour 2015


Slide 47


Slide 48

“...social activities generate large quantities of potentially valuable data...The data were not generated for the purpose of learning; however, the potential for learning is great”


Slide 49

“...social activities generate large quantities of potentially valuable data...The data were not generated for the purpose of learning; however, the potential for learning is great” - J Chambers, Bell Labs, 1993


Slide 50

data science: the web


Slide 51

data science: the web is your “online presence”


Slide 52

data science: the web is a microscope


Slide 53

data science: the web is an experimental tool


Slide 54

newspapering: 1851 vs. 1996 vs. 2008


Slide 55

“a startup is a temporary organization in search of a repeatable and scalable business model” —Steve Blank


Slide 56

every publisher is now a startup


Slide 57

every publisher is now a startup


Slide 58


Slide 59

news: 21st century church state engineering


Slide 60

news: 21st century church state engineering


Slide 61

learnings


Slide 62

learnings: predictive modeling, descriptive modeling, prescriptive modeling


Slide 63

(actually ML, shhhh…): (supervised learning), (unsupervised learning), (reinforcement learning)


Slide 64

learnings: predictive modeling, descriptive modeling, prescriptive modeling (cf. modelingsocialdata.org)


Slide 65

predictive modeling, e.g., cf. modelingsocialdata.org


Slide 66

predictive modeling, e.g., “the funnel” cf. modelingsocialdata.org


Slide 67

super cool stuff interpretable predictive modeling cf. modelingsocialdata.org


Slide 68

super cool stuff interpretable predictive modeling cf. modelingsocialdata.org arxiv.org/abs/q-bio/0701021


Slide 69

optimization & learning, e.g., “How The New York Times Works,” Popular Mechanics, 2015


Slide 70

(some moneys) optimization & prediction, e.g., (some models) “How The New York Times Works,” Popular Mechanics, 2015


Slide 71

recommendation as predictive modeling


Slide 72

recommendation as predictive modeling bit.ly/AlexCTM


Slide 73

descriptive modeling, e.g., cf. daeilkim.com ; import bnpy


Slide 74

modeling your audience bit.ly/Hughes-Kim-Sudderth-AISTATS15


Slide 75

modeling your audience (optimization, ultimately)


Slide 76

modeling your audience also allows insight+targeting as inference


Slide 77

prescriptive modeling


Slide 78

prescriptive modeling cf. modelingsocialdata.org


Slide 79

prescriptive modeling aka “A/B testing”; RCT cf. modelingsocialdata.org
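
To make the "A/B testing; RCT" idea concrete, here is a minimal sketch of how one might compare two randomly assigned variants by conversion rate. It is an illustration only, not the method used at The New York Times: the function name `two_proportion_z_test`, its arguments, and the example counts are all hypothetical, and a pooled two-proportion z-test is just one common choice of analysis.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test for an A/B experiment.

    conv_a, conv_b: number of conversions in variants A and B (hypothetical names)
    n_a, n_b: number of users randomly assigned to each variant
    Returns (z, two-sided p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Under the null hypothesis both variants share one conversion rate.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts, for illustration only.
z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Random assignment is what lets the difference be read causally, which is why the slide pairs "A/B testing" with "RCT".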


Slide 80

prescriptive modeling, e.g.,


Slide 81

prescriptive modeling, e.g.,


Slide 82

prescriptive modeling, e.g.,


Slide 83

descriptive: predictive: Explore Learning Test prescriptive: Optimizing Reporting


Slide 84

descriptive: predictive: Explore Learning Test prescriptive: Optimizing Reporting


Slide 85

common requirements in data science:


Slide 86

common requirements in data science: 1. people 2. ideas 3. things cf. John Boyd, USAF


Slide 87

data science: ideas


Slide 88

data skills data science and… - data data data data engineering embeds product multiliteracies cf. “data scientists at work”, ch 1


Slide 89

data science: ideas - new mindset > new toolset


Slide 90

data science: people


Slide 91

thanks to the data science team!


Slide 92

data science @ The New York Times chris.wiggins@columbia.edu chris.wiggins@nytimes.com @chrishwiggins references: bit.ly/brown-refs


Slide 93

