
Building Data-Centric Businesses






Slide 0

Daniel Aragao & Simon Hope


Slide 1

Daniel Aragao (@dear_dr_dan), Simon Hope (@mapbutcher)


Slide 2

REALESTATE.COM.AU
• Market Cap: 6B
• Australian Properties: 11M
• Visits in September: 55M
• App Downloads: 4.7M …and counting


Slide 3

3,500 PEOPLE 13 COUNTRIES 34 OFFICES TECHNOLOGY & SOCIAL JUSTICE


Slide 4

THIS IS WHAT THE STORY IS ABOUT
• In the beginning…
• Organising our Data
• Implementation approaches
• Hipster Batches
• Reactify
• Bring Your Own Data
• Finding the Data
• What we have learned so far


Slide 5

SORRY… IT’S OK TO LEAVE NOW
• Nope, we didn’t create a new Hadoop
• No hardcore Data Science
• There are some implementation details
• REA embraced the Cloud. AWS everywhere
• Under construction


Slide 6

IN THE BEGINNING…


Slide 7


Slide 8


Slide 9


Slide 10


Slide 11


Slide 12


Slide 13


Slide 14


Slide 15


Slide 16


Slide 17


Slide 18


Slide 19


Slide 20


Slide 21


Slide 22


Slide 23


Slide 24


Slide 25


Slide 26


Slide 27


Slide 28


Slide 29

ORGANISING OUR DATA
"Increasingly, content is being distributed through search and social platforms... There’s less visiting of publishers as destinations."
Jeff Weiner, CEO, LinkedIn


Slide 30

PROBLEM… Data warehouse Data sources


Slide 31

STRATEGY…


Slide 32

STRATEGY…


Slide 33

STRATEGY…


Slide 34

PROBLEM… Data Warehouse SSIS Staging Dim Fact


Slide 35

PROBLEM… Data Warehouse SSIS Staging Star schema leaky details Dim Fact


Slide 36

STRATEGY… No Data Warehouse SSIS Staging Dim Fact


Slide 37

STRATEGY… Data Warehouse Facade SSIS Staging Dim Fact


Slide 38

WHAT’S IN THE BOX? ???


Slide 39

Good things come in small packages services THE HIPSTER BATCH ??? Hipster Batch


Slide 40

THE HIPSTER BATCH
• Small and short lived
• Decoupled via flat files via S3
• Single purpose
• Idempotent
• Polyglot
• Minimal runtime dependencies
• Discoverable
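
The "idempotent", "single purpose" and "flat files" properties can be illustrated with a minimal sketch. It is not the speakers' code: the job name, the function names and the use of a local directory in place of an S3 bucket are all assumptions made for illustration. The idea shown is that the output location is derived from the input, so running the job twice on the same input leaves exactly one identical output.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def output_key(job_name: str, input_bytes: bytes) -> str:
    """Derive a deterministic output name from the job and its input (hypothetical scheme)."""
    digest = hashlib.sha256(input_bytes).hexdigest()[:16]
    return f"{job_name}/{digest}.csv"


def run_batch(input_file: Path, out_dir: Path, job_name: str = "uppercase-names") -> Path:
    """Single-purpose job: upper-case the content of one flat file.

    out_dir stands in for an S3 bucket (assumption). Re-running with the
    same input rewrites the same output file, so retries are harmless.
    """
    data = input_file.read_bytes()
    target = out_dir / output_key(job_name, data)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write under a temporary name, then rename, so readers never see a partial file.
    tmp = target.with_suffix(".tmp")
    tmp.write_bytes(data.upper())
    tmp.replace(target)
    return target
```

Because the output name depends only on the job and its input, a duplicate trigger or a retry after a crash does not create a second, conflicting file.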


Slide 41

A ‘TYPICAL’ IMPLEMENTATION Hipster Batch SNS, SQS Data


Slide 42

A ‘TYPICAL’ IMPLEMENTATION Hipster Batch SNS, SQS Data ASG, ECS, Lambda


Slide 43

A ‘TYPICAL’ IMPLEMENTATION Hipster Batch SNS, SQS Data KMS ASG, ECS, Lambda


Slide 44

A ‘TYPICAL’ IMPLEMENTATION Hipster Batch SNS, SQS Data KMS ASG, ECS, Lambda Logs


Slide 45

A ‘TYPICAL’ IMPLEMENTATION Hipster Batch Cloudwatch SNS, SQS Data KMS ASG, ECS, Lambda Logs


Slide 46

A ‘TYPICAL’ IMPLEMENTATION Hipster Batch Cloudwatch SNS, SQS Data KMS ASG, ECS, Lambda Logs S3 buckets
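
The slides only name the services involved. One common way to wire them, assumed here and not confirmed by the talk, is for an S3 bucket to publish object-created notifications to an SNS topic that an SQS queue subscribes to (without raw message delivery). Under that assumption, a batch would unwrap each queue message roughly as follows. The function name and bucket names are hypothetical.

```python
from __future__ import annotations

import json
from urllib.parse import unquote_plus


def s3_objects_from_sqs_body(body: str) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs from an SQS message body.

    Assumes the body is an SNS envelope whose "Message" field holds an
    S3 event notification serialised as a JSON string.
    """
    envelope = json.loads(body)
    event = json.loads(envelope["Message"])
    return [
        # Object keys in S3 event notifications are URL-encoded.
        (record["s3"]["bucket"]["name"], unquote_plus(record["s3"]["object"]["key"]))
        for record in event.get("Records", [])
    ]
```

Keeping this unwrapping in one small function leaves the batch itself free of messaging details.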


Slide 47

HIPSTER BATCH DOES SCIENCE
• Behavioural models for targeted marketing
• Recommendation engine
• External channels


Slide 48

SCIENCE! Hipster Batch


Slide 49

SCIENCE! Hipster Batch Stats models x 20


Slide 50

SCIENCE! Hipster Batch Stats models x 20 API


Slide 51

SCIENCE! Hipster Batch Stats models API x 20 API


Slide 52

SCIENCE! Hipster Batch Stats models API x 20 API


Slide 53

SCIENCE! Hipster Batch Stats models API API Google Now x 20 API


Slide 54

From legacy to reactive REACTIFY ??? Reactify


Slide 55

REACTIFY
• Manage Data flow with messages
• Protect consumers and care about isolation
• Resilience is important and Data replication is just fine
• Demand is elastic - and your components should be too
http://www.reactivemanifesto.org


Slide 56

PROBLEM… Reactify Coupling Listings No resilience or elasticity Data coupling


Slide 57

SOLUTION… Reactify Listings


Slide 58

SOLUTION… Reactify Listings Reactify


Slide 59

SOLUTION… Reactify Listings Reactify


Slide 60

SOLUTION… Reactify Listings Reactify Hipster Batch


Slide 61

SOLUTION… Shielded consumers Listings Decoupled Reactify Hipster Batch Isolation Reactify


Slide 62

IMPLEMENTATION… Reactify Listings


Slide 63

IMPLEMENTATION… Reactify Listings REST API


Slide 64

IMPLEMENTATION… Reactify Listings REST API


Slide 65

IMPLEMENTATION… Reactify Listings REST API Dynamo Event Maker Event Differ


Slide 66

IMPLEMENTATION… Reactify Listings REST API 2 Dynamo 2 Event Maker Event Differ Kinesis
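
The slides do not show what the Event Differ does internally. A plausible reading, offered only as a sketch, is that it compares the previous and current state of a listing and emits change events for downstream consumers. The event shapes and the function name below are assumptions, not the speakers' design.

```python
from __future__ import annotations

from typing import Any


def diff_listing(old: dict[str, Any] | None, new: dict[str, Any] | None) -> list[dict[str, Any]]:
    """Compare two snapshots of one listing and return change events (hypothetical shapes)."""
    if old is None and new is None:
        return []
    if old is None:
        return [{"type": "listing-created", "id": new["id"]}]
    if new is None:
        return [{"type": "listing-deleted", "id": old["id"]}]
    events = []
    # Sort field names so the event order is stable between runs.
    for field in sorted(set(old) | set(new)):
        if old.get(field) != new.get(field):
            events.append({
                "type": "field-changed",
                "id": new["id"],
                "field": field,
                "old": old.get(field),
                "new": new.get(field),
            })
    return events
```

Identical snapshots produce no events, so replaying the same state twice does not flood the stream.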


Slide 67

REACTIFY REST API
• Exposes current state only
• Stream of change notifications
• Hypertext Application Language - HAL
• Clear entity types
• Linking over embedding
• Cacheable and discoverable
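
"Linking over embedding" in HAL can be sketched as follows. HAL reserves the `_links` property for link objects with an `href`, and `_embedded` for inlined resources. The sketch links to a related resource and does not embed it. The `agency` relation, its URL and the function name are hypothetical, and the real payload of the listings API is not shown in the slides.

```python
from __future__ import annotations

# Base URL as it appears in the talk's REST API slides.
BASE = "https://feeds.listings.realestate.com.au/combined-listings"


def listing_resource(listing_id: str, state: dict, agency_id: str) -> dict:
    """Build a HAL-style document for one listing (illustrative shape only)."""
    return {
        **state,
        "_links": {
            "self": {"href": f"{BASE}/{listing_id}"},
            # Hypothetical relation: clients follow the link, nothing is embedded.
            "agency": {"href": f"https://example.invalid/agencies/{agency_id}"},
        },
    }
```

Linking keeps each document small and lets every linked entity be cached on its own.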


Slide 68

https://feeds.listings.realestate.com.au/combined-listings/120449689 REST API


Slide 69

https://feeds.listings.realestate.com.au/combined-listings/120449689 REST API


Slide 70

https://feeds.listings.realestate.com.au/combined-listings/120449689 REST API


Slide 71

https://feeds.listings.realestate.com.au/combined-listings/120449689 REST API


Slide 72

https://feeds.listings.realestate.com.au/combined-listings/-/changes REST API Event Maker


Slide 73

https://feeds.listings.realestate.com.au/combined-listings/-/changes REST API Event Maker


Slide 74

https://feeds.listings.realestate.com.au/combined-listings/-/changes REST API Event Maker


Slide 75

https://feeds.listings.realestate.com.au/combined-listings/-/changes REST API Event Maker


Slide 76

Reactify Event Differ


Slide 77

Reactify Event Differ


Slide 78

Reactify Event Differ


Slide 79

Reactify Event Differ


Slide 80

The octopus in the box
"Did you use that data set?"
"Errr… No, we have another one"
BRING YOUR OWN DATA


Slide 81

BRING YOUR OWN DATA - BYOD
• Allow data to flow freely
• Help the business to get what they need when they need it
• Self-service


Slide 82

BYOD


Slide 83

BYOD CSV


Slide 84

BYOD CSV x5


Slide 85

BYOD CSV x5 Smarts on datatypes


Slide 86

BYOD CSV x5 Tableau Server Smarts on datatypes


Slide 87

BYOD CSV x5 Tableau Server Smarts on datatypes


Slide 88

BYOD Audit, auth, share… CSV x5 Tableau Server Smarts on datatypes
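
"Smarts on datatypes" is not explained further in the slides. One simple interpretation, sketched here as an assumption, is to infer a type for each column of an uploaded CSV by trying progressively looser parsers. The type names and function names are illustrative, and a real service would need to handle more formats.

```python
from __future__ import annotations

import csv
import io
from datetime import date


def infer_type(values: list[str]) -> str:
    """Return the strictest type that every non-empty value parses as."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "string"
    for name, parse in (("integer", int), ("float", float), ("date", date.fromisoformat)):
        try:
            for value in non_empty:
                parse(value)
        except ValueError:
            continue  # at least one value failed; try the next, looser type
        return name
    return "string"


def infer_schema(csv_text: str) -> dict[str, str]:
    """Map each column of a CSV (first row is the header) to an inferred type."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return {
        column: infer_type([(row.get(column) or "") for row in rows])
        for column in (reader.fieldnames or [])
    }
```

For example, a column holding `4.5` and `3` would be reported as `float`, because not every value parses as an integer.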


Slide 89

These were the implementation approaches, now to…
FIND THE DATA
Meaningful, automated, and easy-to-search metadata


Slide 90

WE TRIED


Slide 91

MORE THAN DATA Hipster Batch Cloudwatch SNS, SQS KMS ASG, ECS, Lambda Logs


Slide 92

MORE THAN DATA Hipster Batch Cloudwatch SNS, SQS KMS ASG, ECS, Lambda Logs


Slide 93

MORE THAN DATA Hipster Batch Cloudwatch SNS, SQS Dataz Ancestry KMS ASG, ECS, Lambda Logs


Slide 94

MORE THAN DATA Hipster Batch Cloudwatch SNS, SQS Dataz Ancestry KMS ASG, ECS, Lambda Metadata Logs


Slide 95

Ancestry


Slide 96

Ancestry


Slide 97

Ancestry


Slide 98

Ancestry


Slide 99

Ancestry
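
The Ancestry slides carry no text beyond the name. Assuming "Ancestry" refers to data lineage, meaning which datasets a given dataset was derived from, a minimal in-memory sketch might look like the following. The class, its methods and the dataset names are hypothetical and are not taken from the talk.

```python
from __future__ import annotations

from collections import defaultdict


class Ancestry:
    """Record which inputs each dataset was produced from, and by which job."""

    def __init__(self) -> None:
        self._parents: dict[str, set[tuple[str, str]]] = defaultdict(set)

    def record(self, output: str, inputs: list[str], producer: str) -> None:
        """Register that `producer` created `output` from `inputs`."""
        for source in inputs:
            self._parents[output].add((source, producer))

    def ancestors(self, dataset: str) -> set[str]:
        """Return every dataset that `dataset` depends on, directly or indirectly."""
        seen: set[str] = set()
        stack = [dataset]
        while stack:
            current = stack.pop()
            for parent, _producer in self._parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen
```

If each batch records its inputs and outputs when it runs, a lookup like `ancestors()` can answer where a data set came from without asking the team that built it.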


Slide 100

METADATA PIPELINE Producers REST API


Slide 101

METADATA PIPELINE Producers Ancestry REST API Ancestry Ancestry


Slide 102

METADATA PIPELINE Producers Ancestry REST API Ancestry Ancestry


Slide 103

METADATA PIPELINE Producers Ancestry REST API Ancestry Ancestry Scrapy


Slide 104

METADATA PIPELINE Producers Ancestry REST API Ancestry Ancestry Scrapy


Slide 105

METADATA PIPELINE Producers Ancestry REST API Ancestry Ancestry Scrapy


Slide 106

WHAT WE HAVE LEARNED SO FAR
• Consumers create the last-mile data as needed
• We must work with external, independent delivery channels
• Push back to source/producer systems
• Data quality belongs to the entire organisation, not to a single team


Slide 107

I’ll give you my Data Warehouse when you can pry it from my cold dead hands.


Slide 108

THANK YOU Daniel Aragao @dear_dr_dan Simon Hope @mapbutcher REALESTATE.COM.AU


Slide 109

