@stmswitcher
Created May 16, 2019 12:44
Most requested paths from web server access logs
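#!/usr/bin/env php
<?php
// Counts the most requested paths in a web server access log.
// The log file is passed as the first command-line argument.
$filename = $argv[1] ?? '';
if ($filename === '' || !is_file($filename)) {
    echo "File not found" . PHP_EOL;
    exit(1);
}

$handle = fopen($filename, 'r');
if ($handle === false) {
    echo "Unable to open file" . PHP_EOL;
    exit(1);
}

$counts = [];
while (($line = fgets($handle)) !== false) {
    // Capture the method and request target that follow the timestamp,
    // e.g. "GET /users/42?page=2". The method stays part of the counted key.
    if (!preg_match('/\]\s"(.*?)\sHTTP\//', $line, $matches)) {
        continue;
    }
    // Drop the query string; skip request lines that parse_url() rejects.
    $rawPath = parse_url($matches[1], PHP_URL_PATH);
    if (!is_string($rawPath)) {
        continue;
    }
    // Collapse numeric segments so /users/1 and /users/2 count as one path.
    // The lookahead leaves the next slash in place, so /1/2 becomes /<int>/<int>.
    $path = rtrim(preg_replace('#/\d+(?=/|$)#', '/<int>', $rawPath), '/');
    $counts[$path] = ($counts[$path] ?? 0) + 1;
}
fclose($handle);

// Most requested paths first.
arsort($counts);
foreach ($counts as $path => $count) {
    echo $count . ': ' . $path . PHP_EOL;
}
exit(0);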
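
The path normalization is the least obvious step of the script, so here is a minimal sketch that isolates it. The function name `normalizePath` and the sample paths are hypothetical and not part of the gist. The pattern is a variant of the script's regex that uses a lookahead, on the assumption that consecutive numeric segments should all be collapsed.

```php
<?php
// Hypothetical helper (not in the original gist): replaces purely numeric
// path segments with <int> so that /users/1 and /users/2 are grouped.
function normalizePath(string $path): string
{
    // (?=/|$) checks for a following slash or end of string without
    // consuming it, so back-to-back numeric segments are each matched.
    $normalized = preg_replace('#/\d+(?=/|$)#', '/<int>', $path);

    // Strip trailing slashes so /users/ and /users are counted together.
    return rtrim($normalized, '/');
}

// Sample paths chosen for illustration only.
foreach (['/users/42', '/users/42/posts/7/', '/a/1/2', '/about'] as $sample) {
    echo $sample . ' => ' . normalizePath($sample) . PHP_EOL;
}
```

With these samples, `/users/42/posts/7/` becomes `/users/<int>/posts/<int>` and `/a/1/2` becomes `/a/<int>/<int>`, while `/about` is left unchanged. Note that the root path `/` normalizes to an empty string because of the trailing-slash trim.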