@dgp1130
Created July 25, 2024 06:02
Browser Line Length Metrics

This snippet computes the average line length on a web page based on visual layout and line wrapping. It's basically doing:

const lines = pageText.split('\n');
const lineLengths = lines.map((line) => line.length);
const average = lineLengths.reduce((sum, n) => sum + n, 0) / lineLengths.length;

However, .split('\n') only sees newline characters in the text and ignores the visual line breaks the browser inserts when it wraps text. This snippet accounts for wrapping, so it gives a reasonable estimate of the line lengths a reader actually experiences on the page.
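To illustrate the difference without a browser, here is a minimal sketch that groups characters into visual lines by their vertical position instead of by '\n'. The `glyphs` array and its `top` values are made-up stand-ins for what layout would report, and `groupByTop` is a hypothetical helper, not part of the snippet itself.

```javascript
// Hypothetical helper: groups characters into lines whenever the vertical
// offset increases (the same idea as comparing `offsetTop` in a browser).
function groupByTop(glyphs) {
  const lines = [];
  let currTop = null;
  for (const { char, top } of glyphs) {
    if (lines.length === 0 || top > currTop) {
      // This character sits lower than the current line, so start a new line.
      lines.push('');
      currTop = top;
    }
    lines[lines.length - 1] += char;
  }
  return lines;
}

// "Hello world" contains no '\n', but in this made-up layout it wraps after "Hello ".
const glyphs = [
  ...'Hello '.split('').map((char) => ({ char, top: 0 })),
  ...'world'.split('').map((char) => ({ char, top: 20 })),
];
const visualLines = groupByTop(glyphs).map((line) => line.trim());
console.log(visualLines); // two lines: "Hello" and "world"
console.log('Hello world'.split('\n').length); // 1
```

Splitting on '\n' reports a single 11-character line here, while grouping by position reports two 5-character lines, which is what the reader would see.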

/*
 * Computes metrics about the number of characters per visually rendered line.
 *
 * Usage:
 * 1. Save this as a snippet in DevTools and run it on a web page.
 * 2. In DevTools, select the parent element to measure
 *    (ex. an `<article>` containing a bunch of `<p>` tags).
 * 3. Run `getLineLengthMetrics($0)` in the Console.
 *
 * Note: this rewrites the content of every `<p>` under the selected element (nested markup
 * such as links is flattened to plain text), so reload the page to restore it.
 *
 * Background: HTML elements don't expose line wrapping information, so instead what we do is
 * replace every character with an element containing that text. For example:
 *
 * ```html
 * <p>Hello</p>
 * ```
 *
 * Becomes:
 *
 * ```html
 * <p>
 *   <x-span>H</x-span>
 *   <x-span>e</x-span>
 *   <x-span>l</x-span>
 *   <x-span>l</x-span>
 *   <x-span>o</x-span>
 * </p>
 * ```
 *
 * Then we look at the `offsetTop` of each of these elements to identify which character
 * triggers a line break. We split on those elements, join the text of each line back into a
 * string, then count the number of characters on each line and average them.
 */
function getLineLengthMetrics(root) {
  /**
   * Replaces the `textContent` of the element with `<x-span>` tags, one per character.
   * Any child elements are discarded; only their text is kept.
   *
   * @param {!Element} el
   */
  function replaceTextWithSpans(el) {
    /** Collapse each run of spaces, newlines, and tabs into a single space. */
    function normalizeWhitespace(text) {
      return text.split(/[ \n\t]+/).join(' ');
    }
    const text = normalizeWhitespace(el.textContent);
    const chars = text.split('');
    const spans = chars.map((char) => {
      // Use a custom tag rather than a pre-existing element to avoid changing CSS too much.
      const span = document.createElement('x-span');
      span.textContent = char;
      return span;
    });
    el.textContent = '';
    el.append(...spans);
  }
  /**
   * Given a paragraph tag containing a bunch of `<x-span>` elements, yield one array of
   * `<x-span>` tags per visual line, based on the visual layout (`offsetTop`).
   *
   * @param {!HTMLParagraphElement} paragraph
   * @return {!Generator<!Array<!HTMLElement>, void, void>}
   */
  function* splitLines(paragraph) {
    if (paragraph.children.length === 0) return;
    const first = paragraph.children[0];
    let currLine = [first];
    let currTop = first.offsetTop;
    for (const span of Array.from(paragraph.children).slice(1)) {
      if (span.offsetTop <= currTop) {
        // Same visual line as the previous character.
        currLine.push(span);
      } else {
        // This character sits lower on the page, so it starts a new visual line.
        yield currLine;
        currLine = [span];
        currTop = span.offsetTop;
      }
    }
    yield currLine;
  }
  /**
   * Compute the arithmetic mean of the input numbers.
   *
   * @param {!Array<number>} numbers
   * @return {number}
   */
  function mean(numbers) {
    // The initial value of 0 keeps `reduce` from throwing on an empty array (the result is NaN).
    const sum = numbers.reduce((l, r) => l + r, 0);
    return sum / numbers.length;
  }
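  /**
   * Compute the median of the input numbers.
   *
   * @param {!Array<number>} numbers
   * @return {number}
   */
  function median(numbers) {
    // Sort numerically; the default `sort()` compares elements as strings.
    const sorted = Array.from(numbers).sort((l, r) => l - r);
    const mid = Math.floor(sorted.length / 2);
    // For an even count, average the two middle values.
    return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  }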
  // Replace all the `<p>` tag text content with `<x-span>` elements.
  const paragraphs = Array.from(root.getElementsByTagName('p'));
  const nonEmptyParagraphs = paragraphs.filter((paragraph) => paragraph.textContent.trim() !== '');
  for (const paragraph of nonEmptyParagraphs) {
    replaceTextWithSpans(paragraph);
  }
  // Split the span elements into lines (basically `.split('\n')` based on visual rendering).
  const spanLines = nonEmptyParagraphs.flatMap((paragraph) => Array.from(splitLines(paragraph)));
  // Join the content back into strings, dropping the whitespace at the wrap points.
  const lines = spanLines.map((spans) => spans.map((span) => span.textContent).join('').trim());
  // Compute line length metrics.
  const lineLengths = lines.map((line) => line.length);
  return {
    mean: mean(lineLengths),
    median: median(lineLengths),
    max: Math.max(...lineLengths),
    min: Math.min(...lineLengths),
    longestLine: Array.from(lines).sort((l, r) => l.length - r.length).at(-1),
  };
}
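
The statistics step does not depend on the DOM, so it can be checked on plain arrays. The sketch below is a standalone illustration with hypothetical names (`summarize`, `sampleLengths`), not the snippet's own code. It assumes the median of an even-sized list is the average of the two middle values, and it sorts with an explicit numeric comparator because `Array.prototype.sort()` without one compares elements as strings.

```javascript
// Hypothetical helper: summarizes a list of line lengths.
// Returns null for an empty list rather than NaN or +/-Infinity.
function summarize(lengths) {
  if (lengths.length === 0) return null;
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...lengths].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median = sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
  const mean = sorted.reduce((sum, n) => sum + n, 0) / sorted.length;
  return { mean, median, min: sorted[0], max: sorted[sorted.length - 1] };
}

// Made-up line lengths for illustration.
const sampleLengths = [72, 9, 100, 68];
console.log([...sampleLengths].sort()); // string order: 100, 68, 72, 9
console.log(summarize(sampleLengths)); // mean 62.25, median 70, min 9, max 100
```

With the default string ordering, the "middle" of this list would be picked from 100, 68, 72, 9, which is why the numeric comparator matters for the median.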