Leveraging Customer Data to Enhance Relevancy in Personalization


Slide 0

Leveraging Customer Data to Enhance Relevancy in Personalization “Using Apache Data Processing Projects on top of MongoDB” Marc Schwering Sr. Solution Architect – EMEA marc@mongodb.com @m4rcsch


Slide 1

Big Data Analytics Track Driving Personalized Experiences Using Customer Profiles Leveraging Data to Enhance Relevancy in Personalization Machine Learning to Engage the Customer, with Apache Spark, IBM Watson, and MongoDB


Slide 2

Agenda For This Session
- Personalization Process Review
- The Life of an Application
- Separation of Concerns / Real World Architecture
- Apache Spark and Flink Data Processing Projects
- Clustering with Apache Flink
- Next Steps


Slide 3

High Level Personalization Process
1. Profile created
2. Enrich with public data
3. Capture activity
4. Clustering analysis
5. Define personas
6. Tag with personas
7. Personalize interactions
Diagram labels: Batch analytics, Public data
Common technologies: R, Hadoop, Spark, Python, Java, many other options
Personas change much less often than tagging


Slide 4

Evolution of a Profile (1)
{
  "_id" : ObjectId("553ea57b588ac9ef066428e1"),
  "ipAddress" : "216.58.219.238",
  "referrer" : "kay.com",
  "firstName" : "John",
  "lastName" : "Doe",
  "email" : "johndoe@gmail.com"
}
<sample>: Originating IP, Demographic info, Location, Name, Sex, Email


Slide 5

Evolution of a Profile (n+1)
{
  "_id" : ObjectId("553e7dca588ac9ef066428e0"),
  "firstName" : "John",
  "lastName" : "Doe",
  "address" : "229 W. 43rd St.",
  "city" : "New York",
  "state" : "NY",
  "zipCode" : "10036",
  "age" : 30,
  "email" : "john.doe@mongodb.com",
  "twitterHandle" : "johndoe",
  "gender" : "male",
  "interests" : [ "electronics", "basketball", "weightlifting", "ultimate frisbee", "traveling", "technology" ],
  "visitedCounts" : { "watches" : 3, "shirts" : 1, "sunglasses" : 1, "bags" : 2 },
  "purchases" : [
    { "id" : 1, "desc" : "Power Oxford Dress Shoe", "category" : "Mens shoes" },
    { "id" : 2, "desc" : "Striped Sportshirt", "category" : "Mens shirts" }
  ],
  "persona" : "shoe-fanatic"
}
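A grown profile like this one is the raw material for the clustering step of the personalization process. As a minimal sketch, assuming the clustering works on per-category visit shares (the slides do not say which features were used), the `visitedCounts` sub-document could be turned into a numeric vector as follows. The function name `visit_vector` and the fixed category order are hypothetical.

```python
from typing import Dict, List

# Hypothetical fixed category order; the names come from the sample profile.
CATEGORIES: List[str] = ["watches", "shirts", "sunglasses", "bags"]


def visit_vector(profile: Dict, categories: List[str] = CATEGORIES) -> List[float]:
    """Turn a profile's visitedCounts into a vector of visit shares."""
    counts = profile.get("visitedCounts", {})
    raw = [float(counts.get(category, 0)) for category in categories]
    total = sum(raw)
    # A profile without any recorded visits maps to the zero vector.
    return [value / total for value in raw] if total else raw


profile = {
    "firstName": "John",
    "visitedCounts": {"watches": 3, "shirts": 1, "sunglasses": 1, "bags": 2},
}
print(visit_vector(profile))
```

Normalizing to shares keeps heavy and light visitors comparable; whether that is desirable depends on the personas being modeled.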


Slide 6

One size/document fits all?
Profile Data: Preferences, Personal information, Contact information, DOB, gender, ZIP...
Customer Data: Purchase History, Marketing History
"Session Data": View History, Shopping Cart Data
Information Broker Data
Personalisation Data: Persona Vectors, Product and Category recommendations
Application
Batch analytics


Slide 7

Separation of Concerns
Profile Data: Preferences, Personal information, Contact information, DOB, gender, ZIP...
Customer Data: Purchase History, Marketing History
"Session Data": View History, Shopping Cart Data
Information Broker Data
Personalisation Data: Persona Vectors, Product and Category recommendations
Batch analytics Layer


Slide 8

Benefits
- Code does less; document and code stay focused
- Ability to split
  - Different teams
  - New languages
  - Defined dependencies


Slide 9

Result
- Code does less; document and code stay focused
- Ability to split
  - Different teams
  - New languages
  - Defined dependencies
- KISS => Keep it simple and save!
- => Clean Code <= Robert C. Martin: https://cleancoders.com/
- M. Fowler, B. Meyer, et al.: Command Query Separation
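Command Query Separation, cited on this slide, means a method either changes state (a command) or returns data (a query), never both. A minimal sketch, assuming a hypothetical in-memory `ProfileStore` standing in for whatever service owns the profile data:

```python
from typing import Dict, Optional


class ProfileStore:
    """In-memory stand-in for a profile service (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._profiles: Dict[str, Dict] = {}

    # Command: changes state and returns nothing.
    def tag_persona(self, email: str, persona: str) -> None:
        profile = self._profiles.setdefault(email, {"email": email})
        profile["persona"] = persona

    # Query: returns data and changes nothing.
    def persona_of(self, email: str) -> Optional[str]:
        profile = self._profiles.get(email)
        return profile.get("persona") if profile else None


store = ProfileStore()
store.tag_persona("john.doe@mongodb.com", "shoe-fanatic")
print(store.persona_of("john.doe@mongodb.com"))
```

Keeping the two kinds of method apart is one way to let the application and the batch analytics layer depend on narrow, well-defined operations instead of on each other's documents.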


Slide 10

Analytics and Personalization From Query to Clustering


Slide 13

Architecture revised Data Processing


Slide 14

Advice for Developers
- OWN YOUR DATA! (but only relevant data)
- Say no! (to direct data access, i.e. DB access)


Slide 15

Data Processing


Slide 16

Hadoop in a Nutshell
- An open source framework for distributed storage and distributed, batch-oriented processing
- Hadoop Distributed File System (HDFS) to store data on commodity hardware
- YARN as resource management platform
- MapReduce as programming model working on top of HDFS
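The MapReduce programming model mentioned on this slide can be illustrated without a cluster. The sketch below is plain Python, not Hadoop API code: it counts purchases per category over profile documents shaped like the earlier sample. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(profile: Dict) -> Iterator[Tuple[str, int]]:
    """Emit one (category, 1) pair per purchase in a profile."""
    for purchase in profile.get("purchases", []):
        yield purchase["category"], 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as the framework does between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def purchases_per_category(profiles: Iterable[Dict]) -> Dict[str, int]:
    pairs = (pair for profile in profiles for pair in map_phase(profile))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


profiles = [
    {"purchases": [{"category": "Mens shoes"}, {"category": "Mens shirts"}]},
    {"purchases": [{"category": "Mens shoes"}]},
    {},  # a profile without purchases emits nothing
]
print(purchases_per_category(profiles))
```

In a real job the map and reduce functions run on many machines and the shuffle moves data between them; the division of work into the three phases is the same.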


Slide 17

Spark in a Nutshell
- Spark is a top-level Apache project
- Can be run on top of YARN and can read any Hadoop API data, including HDFS or MongoDB
- Fast and general engine for large-scale data processing and analytics
- Advanced DAG execution engine with support for data locality and in-memory computing


Slide 18

Flink in a Nutshell
- Flink is a top-level Apache project
- Can be run on top of YARN and can read any Hadoop API data, including HDFS or MongoDB
- A distributed streaming dataflow engine
  - Streaming and batch
  - Iterative in-memory execution and handling
  - Cost-based optimizer


Slide 19

Latency of query operations


Slide 20

Iterative Algorithms / Clustering


Slide 21

K-Means in Pictures Source: Wikipedia K-Means


Slide 22

K-Means as a Process
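As a process, K-Means alternates two steps until the centroids stop moving: assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. The following is a minimal single-machine sketch of that loop in plain Python, not the Flink job from the demo. Passing the initial centroids in explicitly is an assumption made to keep the example deterministic.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def mean(points: Sequence[Point]) -> Point:
    return tuple(sum(dim) / len(points) for dim in zip(*points))


def k_means(points: Sequence[Point], centroids: Sequence[Point],
            max_iterations: int = 20) -> List[Point]:
    centroids = list(centroids)
    for _ in range(max_iterations):
        # Assignment step: group points by their nearest centroid.
        clusters: List[List[Point]] = [[] for _ in centroids]
        for point in points:
            clusters[nearest(point, centroids)].append(point)
        # Update step: an empty cluster keeps its previous centroid.
        updated = [mean(cluster) if cluster else centroids[i]
                   for i, cluster in enumerate(clusters)]
        if updated == centroids:  # converged
            break
        centroids = updated
    return centroids


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
print(k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)]))
```

In the personalization process, the resulting centroids would play the role of personas, and tagging a profile amounts to calling `nearest` on its feature vector.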


Slide 23

Iterations in Hadoop and Spark


Slide 24

Iterations in Flink
- Dedicated iteration operators
- Tasks keep running for the iterations, not redeployed for each step
- Caching and optimizations done automatically


Slide 25

Demo


Slide 26

Result


Slide 27

More…?


Slide 28

Takeaways
- Stay focussed => Start and stay small
  - Evaluate with BigDocuments but do a PoC focussed on the topic
- Extending functionality is easy
  - Aggregation, MapReduce
  - Hadoop Connector opens a new variety of use cases
- Extending functionality could be challenging
  - Evolution is outpacing help channels
  - A lot of options (Spark, Flink, Storm, Hadoop...)
  - More than just a binary


Slide 29

Next Steps
- Next Session => Hands on Spark and Watson Content!
  - "Machine Learning to Engage the Customer, with Apache Spark, IBM Watson, and MongoDB"
  - RDD Examples
- Try out Spark and Flink
  - http://bit.ly/MongoDB_Hadoop_Spark_Webinar
  - http://flink.apache.org/
  - https://github.com/mongodb/mongo-hadoop
  - https://github.com/m4rcsch/flink-mongodb-example
- Participate and ask questions!
  - @m4rcsch
  - marc@mongodb.com


Slide 30

Thank you! Marc Schwering Sr. Solutions Architect – EMEA marc@mongodb.com @m4rcsch

