
Landing Page Audit Tool

With no data from an existing Landing Page Builder tool, I was unsure how to approach the redesign of landing pages needed across DAZN’s global markets. To gather data and insights, I coded a JavaScript command-line application that analyzed all current and previous DAZN landing pages and stored the results in a database. I then developed a website for the findings, where I could automate the publishing of results using data visualization.

Scope

Aspect   Answer
Time     Approx. 1 week
Team     Solo
Role     FE Engineer
Tools    VS Code / Node / React

Process

The project is a Node.js command-line tool designed to fetch web pages, save them locally, and analyze them using class name attributes to generate component usage statistics. These statistics are then passed into a secondary application to create data visualizations and overall statistics. As separate applications, it’s possible to see changes over time and switch between datasets.

The application is a series of independent scripts, so processing is split into steps that can be run manually or chained to run sequentially. Certain steps create multiple outputs: along with the data, some capture screenshots of pages and components for publishing, and for eyeballing results to both ensure everything is working correctly and understand how components are being used by different markets.

The tool includes commands for setting up directories, fetching pages, scanning pages, generating reports on page statistics, and analyzing component usage. Using this approach, I could effectively and accurately audit all pages and their components, by market and overall, to gain valuable insight.
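The commands above map onto npm scripts; a plausible sketch of that wiring in package.json (the `audit` alias chaining the steps is hypothetical, as are the exact script paths):

```json
{
  "scripts": {
    "setup": "node setup.js",
    "get-pages": "node get-pages.js",
    "scan-pages": "node scan-pages.js",
    "report-pages": "node report-pages.js",
    "report-components": "node report-components.js",
    "search": "node search.js",
    "audit": "npm run setup && npm run get-pages && npm run scan-pages && npm run report-pages && npm run report-components"
  }
}
```

Each step can then be run on its own while iterating, or the whole chain can be replayed in one command for a periodic audit.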

Outcome

  • Analytical data for pages and components broken down and cross-referenced to understand precise statistics per market, region, and globally.
  • Periodic data to understand changes over time.
  • Data visualization to easily communicate, understand, and share for improved buy-in.
  • Data-driven insight to inform the redesign of DAZN’s Landing Pages.

Project

Github repository README

Introduction

This is a Command Line Interface tool to fetch web pages, save them locally, and analyse them using class name attributes to create ‘component usage’ statistics.

Theory

Before running, it’s recommended to understand how the tool works.

Defining A Component

The input is a partial class name. The output is the highest-counted single class name that contains that partial class name.

Example: if we want statistics on a Hero component, we look for any class matching the pattern (at the start, at the end, or anywhere within) of the specified .hero selector defined in component_map.json. So, if we found .hero__inner and .hero__wrapper--123, we would use the class name found most often to create the Hero statistics.
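A minimal sketch of that rule (the function and variable names are illustrative, not taken from the actual source): among all class names containing the partial pattern, pick the one with the highest count.

```javascript
// Illustrative sketch: given a partial class name and a tally of every
// class found across the audited pages, return the matching class name
// with the highest count (or null if nothing matches).
function resolveComponentClass(partial, classCounts) {
  let best = null;
  for (const [className, count] of Object.entries(classCounts)) {
    if (className.includes(partial) && (best === null || count > best.count)) {
      best = { className, count };
    }
  }
  return best;
}

// 'hero' resolves to the most frequently counted matching class.
const counts = { 'hero__inner': 12, 'hero__wrapper--123': 3, 'footer': 40 };
console.log(resolveComponentClass('hero', counts));
// → { className: 'hero__inner', count: 12 }
```

Using `String.prototype.includes` covers all three cases at once: patterns at the start, at the end, or anywhere within the class name.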

Generating Statistics

The tool is broken into a number of steps to create statistics on a set of components.

  0. Before running, the input folder needs to contain two important files:
    • A catalog.json containing the manually created list of pages to audit. Each entry has a unique id and a market category.
    • A component_map.json containing the list of components to generate statistics on. Each entry has a title and a class pattern selector.

  1. Firstly setup.js will create the output folder structure with empty directories ready for files to be written to.
  2. Then get-pages.js will download all pages listed in the catalog.json file into the page_html folder. It also takes a screenshot of each page and adds it to the page_screenshots folder.
    • Once pages are stored locally the search.js script can be used to find all instances of a class to help identify components and where they are used and not used. This is helpful to test, refine and review class names and screen grabs.
  3. Then scan-pages.js will crawl the saved pages and create a report per page in page_reports, listing every class found along with data to trace the market, url, id etc.
  4. Then report-pages.js will pull together statistics on every class across all pages and identify overall and per-market use in page_reports.json.
  5. Lastly report-components.js will pull together statistics on components supplied in component_map.json (please see Defining A Component section above for important details on how this works).
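For reference, the two input files might look like this (field names are inferred from the descriptions above and the URLs are placeholders; the real files may differ):

```javascript
// Hypothetical contents of catalog.json — the manually created list of
// pages to audit, each with a unique id and a market category.
const catalog = [
  { id: 'de-001', url: 'https://example.com/de-DE/welcome', market: 'DE' },
  { id: 'us-001', url: 'https://example.com/en-US/welcome', market: 'US' }
];

// Hypothetical contents of component_map.json — the components to
// generate statistics on, each with a title and a class pattern selector.
const componentMap = [
  { title: 'Hero', pattern: 'hero' },
  { title: 'FAQ', pattern: 'faq' }
];
```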

Setup

Create empty directories ready to populate.

Generates:

  • Empty directory as ./output
  • Empty directory as ./output/search_results
  • Empty directory as ./output/page_screenshots
  • Empty directory as ./output/page_html
  • Empty directory as ./output/page_reports
npm run setup

Get Pages

Take HTML pages offline to make processing faster, safer, and more flexible. Crawl a list of URLs and download each one, also taking a full-page screenshot (confirming the cookie alert first so it is hidden in the screenshot).

Generates:

  • Contents of <body> for each page as ./output/page_html/{id}.html
  • Screenshot of each page as ./output/page_screenshots/{id}.png
npm run get-pages

Scan Pages

Reduce pages down to an array of class names with additional page details. Create a raw data set about each page in turn. Data generated includes:

  • Page URL.
  • Page ID (to cross-reference any data at a later stage).
  • Market.
  • Language.
  • List of every class on the page.
  • Total count of classes.

Generates:

  • A report per page as ./output/page_reports/{id}.json
npm run scan-pages
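The core of this step can be sketched as a class-attribute scan over the saved HTML (illustrative, not the actual implementation):

```javascript
// Collect every class token from the class="…" attributes in a saved
// HTML fragment, preserving duplicates so counts stay meaningful.
function scanClasses(html) {
  const classes = [];
  const attr = /class\s*=\s*"([^"]*)"/gi;
  let match;
  while ((match = attr.exec(html)) !== null) {
    classes.push(...match[1].trim().split(/\s+/).filter(Boolean));
  }
  return classes;
}

const sample = '<div class="hero hero__inner"><p class="copy">Hi</p></div>';
console.log(scanClasses(sample));        // → [ 'hero', 'hero__inner', 'copy' ]
console.log(scanClasses(sample).length); // → 3, the total count of classes
```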

Report Pages

Analyse all pages to get overall statistics. Collate all raw data about each page into one report. Data revealed for each (and every) class used across all pages includes:

  • Class name.
  • Total count across all pages.
  • Total number of pages with class found.
  • Per market:
    • Total count.
    • Total number of pages used.

Generates:

  • A report as ./output/page_reports.json
npm run report-pages
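The collation can be sketched as a fold over the per-page reports (field names are illustrative):

```javascript
// Fold per-page reports into overall and per-market statistics for
// every class: total count, and number of pages the class appears on.
function collatePages(pageReports) {
  const stats = {};
  for (const page of pageReports) {
    const seenOnThisPage = new Set();
    for (const cls of page.classes) {
      const s = (stats[cls] ??= { count: 0, pages: 0, markets: {} });
      const m = (s.markets[page.market] ??= { count: 0, pages: 0 });
      s.count += 1;
      m.count += 1;
      if (!seenOnThisPage.has(cls)) {
        seenOnThisPage.add(cls);
        s.pages += 1;
        m.pages += 1;
      }
    }
  }
  return stats;
}
```

The per-page `Set` is what separates the two totals: raw occurrences versus the number of distinct pages a class appears on.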

Report Components

Analyse all pages to get component statistics. Collate all raw data about each component into one report. Data revealed for each component used across all pages includes:

  • Component name.
  • Class name.
  • Total number of pages.
  • Total count across all pages.
  • Usage percentage across all pages.
  • Per market:
    • Total count.
    • Total number of pages.
    • Usage percentage.

Generates:

  • A report as ./output/component_reports.json
npm run report-components
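Combining the resolved class statistics with the page total gives the usage percentages; a minimal sketch (the names are illustrative):

```javascript
// Turn resolved class statistics into a component report entry with a
// usage percentage across all audited pages.
function componentUsage(name, classStats, totalPages) {
  return {
    component: name,
    totalCount: classStats.count,
    pagesFound: classStats.pages,
    usagePercent: Math.round((classStats.pages / totalPages) * 100),
  };
}

// e.g. a Hero found on 18 of 24 audited pages:
console.log(componentUsage('Hero', { count: 41, pages: 18 }, 24));
// → { component: 'Hero', totalCount: 41, pagesFound: 18, usagePercent: 75 }
```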

Search

Find all instances of a class to investigate its use across all pages. Using a class to deep-dive, generate a report on the pages where it is being used (and optionally not being used), as well as bundle screenshots together for review by eye. Data generated through search includes:

  • Report of pages including/excluding class with each:
    • Page url.
    • Page id.
    • Page market.
  • Screenshot of each page including/excluding the class.

Generates:

  • Directory of search results as ./output/search_results/{name-of-folder}/ with:
    • Directory includes/ with a report.json and a screenshot per page as {id}.png for pages where the class is found.
    • Directory excludes/ with a report.json and a screenshot per page as {id}.png for pages where the class is not found.
npm run search class="{name-of-class}" name="{name-of-folder}" excludes="true"
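The include/exclude split at the heart of the search can be sketched as follows (illustrative, not the actual implementation):

```javascript
// Split scanned pages into includes/excludes buckets for a class name,
// keeping the traceable page details for each bucket's report.
function searchPages(pageReports, className) {
  const includes = [];
  const excludes = [];
  for (const page of pageReports) {
    const bucket = page.classes.includes(className) ? includes : excludes;
    bucket.push({ id: page.id, url: page.url, market: page.market });
  }
  return { includes, excludes };
}
```

The corresponding screenshots can then be copied into the includes/ and excludes/ folders using the same page ids.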
Sample of source code with annotations and structured insights.
Results page shared with stakeholders.