Shutterfly Customer Lifetime Value Code-Challenge
- If an event's required data is missing or null, that event is not ingested into the in-memory Data Structure.
- An additional Data Structure has been created to handle events that arrive out of order, for example:
  - A new order arrives before the create event for that customer.
  - An update to an order arrives before the creation of that order.
- Events are processed so that every in-order event is ingested directly into the in-memory Data Structure, and every out-of-order event is inserted into the additional Data Structure.
- Once all in-order events have been processed, the additional Data Structure is processed to insert the out-of-order events into the in-memory Data Structure.
- Since revenue / visit and visits / week are important metrics to the business, there are two functions, revenuePerVisits and visitsPerWeek, that compute them.
- Keeping performance in mind, I have used HashMap for the in-memory Data Structure instead of lists wherever frequent lookups are required, so that lookup time is reduced to O(1) on average.
- I have used a Priority Queue to fetch the top X Customer Lifetime Values, which keeps the cost of selecting them low.
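
The ingestion rules in the list above (skip events with missing data, defer out-of-order events, then replay them) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the class `IngestSketch`, the `Event` fields, and the method names are all hypothetical. It also repeats the replay until no more progress is made, which is an assumption added here so that a deferred update can follow a deferred creation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: class, field, and method names are illustrative only.
class IngestSketch {
    /** Minimal event: type is "CUSTOMER" or "ORDER"; verb "UPDATE" marks an update. */
    static final class Event {
        final String type;
        final String verb;
        final String key;
        final String customerId; // only required for ORDER events

        Event(String type, String verb, String key, String customerId) {
            this.type = type;
            this.verb = verb;
            this.key = key;
            this.customerId = customerId;
        }
    }

    final Map<String, Event> customers = new HashMap<>(); // customer key -> latest event
    final Map<String, Event> orders = new HashMap<>();    // order key -> latest event
    final List<Event> pending = new ArrayList<>();        // out-of-order events

    /** Returns false when required data is missing or null. */
    static boolean isValid(Event e) {
        if (e == null || e.type == null || e.verb == null || e.key == null) return false;
        if ("CUSTOMER".equals(e.type)) return true;
        return "ORDER".equals(e.type) && e.customerId != null;
    }

    /** Applies a valid event; returns false if its prerequisite has not been seen yet. */
    boolean apply(Event e) {
        if ("CUSTOMER".equals(e.type)) {
            customers.put(e.key, e);
            return true;
        }
        // ORDER: the customer must exist, and an update needs the original order.
        if (!customers.containsKey(e.customerId)) return false;
        if ("UPDATE".equals(e.verb) && !orders.containsKey(e.key)) return false;
        orders.put(e.key, e);
        return true;
    }

    void ingestAll(List<Event> events) {
        // First pass: ingest in-order events, defer the rest.
        for (Event e : events) {
            if (!isValid(e)) continue;     // missing data: skip entirely
            if (!apply(e)) pending.add(e); // out of order: defer
        }
        // Replay deferred events until a full pass makes no progress.
        boolean progress = true;
        while (progress && !pending.isEmpty()) {
            List<Event> retry = new ArrayList<>(pending);
            pending.clear();
            progress = false;
            for (Event e : retry) {
                if (apply(e)) progress = true;
                else pending.add(e);
            }
        }
    }
}
```

Events whose prerequisite never arrives simply remain in `pending` in this sketch. What to do with them (drop, log, report) is a design choice the list above does not specify.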
- Average customer lifespan for every customer is 10 years.
- If an order has been placed, it is assumed that the input data also contains a site_visit entry corresponding to that order.
- While updating a customer we update only last_name, adr_city and adr_state, and not event_time, because updating it would lose the customer creation date, which is necessary for the LTV calculation.
- While updating an order we update only event_time and total_amount, and not customer_id, because we need to preserve the customer id corresponding to that order.
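
To show how the two metrics, the 10-year lifespan assumption, and the Priority Queue fit together, here is a small sketch. It assumes the simple LTV formula 52 × (revenue per visit × visits per week) × lifespan, which is not stated in this README. The class `LtvSketch`, its method names, and the idea of passing in precomputed totals are hypothetical. The top X selection uses a bounded min-heap, which is one possible way to use a Priority Queue and not necessarily the one used in the project.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch: names and the LTV formula are assumptions, see the text above.
class LtvSketch {
    static final int WEEKS_PER_YEAR = 52;
    static final int LIFESPAN_YEARS = 10; // assumed average customer lifespan

    static double revenuePerVisit(double totalRevenue, int visits) {
        return visits == 0 ? 0.0 : totalRevenue / visits;
    }

    static double visitsPerWeek(int visits, int weeks) {
        return weeks == 0 ? 0.0 : (double) visits / weeks;
    }

    /** Assumed simple LTV: 52 * (revenue/visit * visits/week) * lifespan in years. */
    static double ltv(double totalRevenue, int visits, int weeks) {
        double weeklyValue = revenuePerVisit(totalRevenue, visits) * visitsPerWeek(visits, weeks);
        return WEEKS_PER_YEAR * weeklyValue * LIFESPAN_YEARS;
    }

    /** Returns the ids of the x customers with the highest LTV, highest first. */
    static List<String> topX(final Map<String, Double> ltvByCustomer, int x) {
        List<String> result = new ArrayList<>();
        if (x <= 0) return result;
        Comparator<String> byLtv =
            (a, b) -> Double.compare(ltvByCustomer.get(a), ltvByCustomer.get(b));
        // Min-heap: the smallest LTV among the current candidates sits at the head.
        PriorityQueue<String> heap = new PriorityQueue<>(x, byLtv);
        for (String id : ltvByCustomer.keySet()) {
            heap.offer(id);
            if (heap.size() > x) heap.poll(); // evict the current smallest
        }
        while (!heap.isEmpty()) result.add(heap.poll()); // ascending by LTV
        Collections.reverse(result);                      // highest first
        return result;
    }
}
```

Keeping at most X entries in the heap means each customer costs O(log X) to consider, so the selection runs in O(n log X) without sorting every customer.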
Gson --> 'gson-2.2.3.jar' http://www.java2s.com/Code/JarDownload/gson/gson-2.2.3.jar.zip