Big Data, Little Devices: What It Will Do to Us and for Us

What is big data?
0 to 2003: 5 exabytes
2011: 2.5 exabytes per day
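To put the two figures on this slide side by side, here is a small back-of-the-envelope sketch. It assumes both numbers use the same exabyte unit and that the 2011 rate is constant; the variable names are only illustrative.

```python
# Figures taken from the slide, both expressed in exabytes.
total_through_2003_eb = 5.0   # everything created up to 2003
daily_rate_2011_eb = 2.5      # created per day in 2011

# How many days at the 2011 rate does it take to match the pre-2003 total?
days_to_match = total_through_2003_eb / daily_rate_2011_eb

print(f"Days at the 2011 rate to match everything up to 2003: {days_to_match}")
```

Under those assumptions, the 2011 world produces as much data in about two days as existed in total up to 2003.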

Perspectives: 1 MB, 1 GB, 1 TB, 2 PB, 5 EB

Where’s it coming from? Source: domo.com 2012

What does it look like?

Definitions
Big Data: unstructured data; you don't know what the questions are yet
Business Intelligence: structured data; you know what questions you want answered
Statistics: structured data, not real-time, no action taken as a result
Machine Learning: creating algorithms and applying them to data sets in an attempt to learn from the data
Predictive Analytics: extracting information from existing data to predict trends

Why now?
2003: Doug Cutting & Mike Cafarella, Nutch
2004: Google publishes MapReduce
2006: Doug Cutting moves to Yahoo and creates Hadoop
2008: Hadoop becomes a top-level Apache Software Foundation project
2009: Matei Zaharia starts Spark at UC Berkeley
2013: Spark donated to the Apache Software Foundation

Map Reduce
Traditional / Sequential vs. Map Reduce
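The contrast on this slide can be sketched with a word count, the usual introductory example. This is a single-process illustration, not Hadoop itself: the function names and the sample `docs` are assumptions made for the sketch, and on a real cluster the map and reduce calls would run in parallel on many machines.

```python
from collections import Counter, defaultdict

# Hypothetical sample input: each string stands in for one document or file split.
docs = ["big data little devices", "big data big questions"]


def sequential_word_count(documents):
    """Traditional approach: one loop walks every document in order."""
    counts = Counter()
    for doc in documents:
        for word in doc.split():
            counts[word] += 1
    return dict(counts)


def map_phase(doc):
    """Map: turn one document into (word, 1) pairs, independent of other documents."""
    return [(word, 1) for word in doc.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key so each key can be reduced alone."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values collected for one key into a single result."""
    return key, sum(values)


def map_reduce_word_count(documents):
    # Each map_phase call could run on a different machine; here they run in turn.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


sequential_result = sequential_word_count(docs)
map_reduce_result = map_reduce_word_count(docs)
print(sequential_result == map_reduce_result)
```

Both versions give the same counts. The point of the second form is that no map call depends on another document and no reduce call depends on another key, which is what lets the work be spread across a cluster.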

Spark: up to 100x faster than Map Reduce

Cases: What it will do to us

Security - Privacy: NSA PRISM

Vulnerability: Target, Home Depot, Michaels, Blue Cross Blue Shield, Sony Entertainment

Commerce: Amazon Dash

Commerce: Amazon

Cases: What it will do FOR us

Sports: sabermetrics (Moneyball)

Productivity: Google Now

Politics: Obama campaign 2012

Science: Monterey Bay Aquarium Research Institute

Health: Apple ResearchKit
The early partners tell Bloomberg that they got thousands of volunteers within a day of launch, including 11,000 for a Stanford University cardiovascular trial -- for context, Stanford says that it would normally take a national year-long effort to get that kind of scale. The flood of data will theoretically improve the quality of the findings, especially since the automatic, phone-based tracking should prevent people from fibbing about their activity levels.

