Skip to content

taisakamisarava/GA-purchase-prediction

Repository files navigation

GA purchase prediction

The Challenge

Purchase prediction for Google Merchandise Store visitors. Use a combo of SQL + Python languages.

  1. Conduct short exploratory data analysis geared towards understanding variables needed for modeling
  2. Build a ML model to predict whether a user will make a purchase during the visit
  3. Discover the most important features or combination of features that indicate a user will return
  4. Provide fully commented code and model output for your analysis.
  5. Create a few slides (10-15 minutes of content) about the task - assume content will be used to present to the client. Client is non-technical, but wants to understand not only how well you might be able to predict, but also how they might identify more people like repeated visitors.

The Dataset

The sample dataset contains Google Analytics 360 data from the Google Merchandise Store, a real ecommerce store. The Google Merchandise Store sells Google branded merchandise. The data is typical of what you would see for an ecommerce website and accurately represents some challenges our Data Scientists are facing. It includes the following kinds of information:

● Traffic source data: information about where website visitors originate. This includes data about organic traffic, paid search traffic, display traffic, etc.

● Content data: information about the behavior of users on the site. This includes the URLs of pages that visitors look at, how they interact with content, etc.

● Transactional data: information about the transactions that occur on the Google Merchandise Store website.

The repository includes:

  • Presentation in .pdf format
  • Code in .ipynb
  • .sql query