@aleixmorgadas
Created July 8, 2024 06:14
Most common crawler URIs to be ignored by OpenTelemetry SDK to avoid false alarms

Ignored URIs

Crawlers routinely probe well-known URIs, and we don't want our monitoring systems to raise an alert for every resulting GET 404 Not Found.

Here is a list of the most common URIs used by crawlers, to be ignored in the OpenTelemetry configuration.

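import { NodeSDK } from '@opentelemetry/sdk-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
import { HttpInstrumentation } from '@opentelemetry/instrumentation-http';

const sdk = new NodeSDK({
  instrumentations: [
    getNodeAutoInstrumentations({
      // We recommend disabling fs automatic instrumentation because
      // it can be noisy and expensive during startup
      '@opentelemetry/instrumentation-fs': {
        enabled: false,
      },
    }),
    // Note: getNodeAutoInstrumentations also includes the HTTP instrumentation;
    // the same hook can alternatively be set under its
    // '@opentelemetry/instrumentation-http' key.
    new HttpInstrumentation({
      // Skip tracing for requests whose URL exactly matches an ignored URI.
      // incomingRequest.url includes the query string, if any.
      ignoreIncomingRequestHook: (incomingRequest) =>
        ignoredURIs.includes(incomingRequest.url ?? ''),
    }),
    // Other instrumentations
  ],
  // ...
});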
// URIs commonly probed by crawlers; matched exactly against the request URL.
const ignoredURIs = [
  '/',
  '/.env',
  '/favicon.ico',
  '/robots.txt',
  '/sitemap.xml',
  '/app/.git/config',
  '/img../.git/config',
  '/assets../.git/config',
  '/js../.git/config',
  '/lib../.git/config',
  '/images../.git/config',
  '/content../.git/config',
  '/events../.git/config',
  '/css../.git/config',
  '/media../.git/config',
  '/static../.git/config',
  '/wp-json/?rest_route=/wp/v2/USERS',
  '/wp-login.php',
  '/.env.prod.local',
  '/env.dev.js',
  '/.env_1',
  '/.env.dev',
  '/.env.production',
  '/.env.old',
  '/var/.env',
  '/.env.bak',
  '/env.js',
];
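
Because the hook compares the full request URL for an exact match, a probe with an extra query string or a slightly different prefix would still be traced. One possible variation, sketched here as an assumption rather than anything the OpenTelemetry SDK provides, is to match on the path only and to cover the `.git/config` and env-file probes with patterns. The names `exactPaths`, `probePatterns`, and `shouldIgnore` are hypothetical, and the regular expressions are illustrative guesses derived from the list above.

```typescript
// Hypothetical helper, not part of the OpenTelemetry API.
// Exact paths that crawlers commonly probe (a subset of the list above).
const exactPaths = new Set<string>([
  '/',
  '/favicon.ico',
  '/robots.txt',
  '/sitemap.xml',
  '/wp-login.php',
]);

// Assumed patterns: any path ending in /.git/config, and any path whose
// last segment looks like an env file (.env, .env.bak, env.js, .env_1, ...).
const probePatterns: RegExp[] = [
  /\/\.git\/config$/,
  /\/\.?env([._][\w.]+)?$/,
];

function shouldIgnore(url: string | undefined): boolean {
  if (!url) return false;
  // The request URL holds the path plus any query string; match the path only.
  const path = url.split('?')[0] ?? '';
  return exactPaths.has(path) || probePatterns.some((p) => p.test(path));
}
```

Such a function could be passed to the hook as `ignoreIncomingRequestHook: (req) => shouldIgnore(req.url)`. Patterns are broader than an exact list, so a legitimate route such as `/env` would also be ignored; adjust them to the routes your application actually serves.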