Web site quality analytics summarization with R Markdown / RStudio Notebook script, SEO Spider, traffic, architecture/governance data
Use this annotated RStudio Notebook script when you want to unite and summarize SEO Spider output and join it with site architecture, governance, and traffic information, to help you locate and prioritize assignments for your web work teams. This script can help you answer the questions, "What kind of trouble are we in?" (after a new web site problem or directive emerges), and "What content should we repair first?"
Annotations in the script make it a "lab notebook" with links to resources useful for modifying this script.
The output of this script supports a kind of web site quality analytics that is enterprise-readable, interactive, and free when viewed through the D3.js tree map I have posted to GitHub. Tableau, Power BI, Excel, or similar tools could be used instead. The script produces two CSV files:
- groupData - Information summarized at the page group / communication package level, by ownership and product/service. It summarizes broken links, the number of pages not recently reviewed, the number of images that may be too large for smart phones, or whatever your web site's "problem of the moment" might be.
- pageData - Information summarized at the page level: the specific broken link data, the specific dates of review, the specific images that are too large for smart phones, etc.
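
To make the two outputs concrete, here is a minimal base R sketch of the join-then-summarize pattern described above. It is not the notebook's actual code: the sample data, the column names (`url`, `broken_links`, `page_group`, `owner`, `last_reviewed`), and the review cutoff date are all assumptions chosen for illustration.

```r
# Hypothetical crawl results, standing in for an SEO Spider export.
crawl <- data.frame(
  url = c("/a/1", "/a/2", "/b/1", "/b/2"),
  broken_links = c(2, 0, 1, 3),
  stringsAsFactors = FALSE
)

# Hypothetical architecture/governance data: which group and owner each page belongs to.
governance <- data.frame(
  url = c("/a/1", "/a/2", "/b/1", "/b/2"),
  page_group = c("Admissions", "Admissions", "Library", "Library"),
  owner = c("Team A", "Team A", "Team B", "Team B"),
  last_reviewed = as.Date(c("2015-01-10", "2017-03-01", "2014-06-30", "2017-02-15")),
  stringsAsFactors = FALSE
)

# Page level: one row per crawled page, joined with its governance data.
pageData <- merge(crawl, governance, by = "url", all.x = TRUE)

# Flag pages not reviewed since a cutoff date (the cutoff is an assumption).
review_cutoff <- as.Date("2016-01-01")
pageData$needs_review <- pageData$last_reviewed < review_cutoff

# Group level: total broken links and count of stale pages per group and owner.
groupData <- aggregate(
  cbind(broken_links, needs_review) ~ page_group + owner,
  data = pageData,
  FUN = sum
)

# Write both tables as CSV; tempdir() is used here only to keep the sketch tidy.
out_dir <- tempdir()
write.csv(pageData, file.path(out_dir, "pageData.csv"), row.names = FALSE)
write.csv(groupData, file.path(out_dir, "groupData.csv"), row.names = FALSE)
```

In this sketch, `pageData` keeps the page-by-page detail a repair team would work from, while `groupData` rolls the same measures up so that owners and page groups can be compared and prioritized.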