@rindeal
Last active September 13, 2024 16:42
Strip JS-like comments (JS, TS, CSS, ...) using GCC preprocessor and AWK (won't handle edge cases like comments in multiline strings, regexes, ..)
#!/usr/bin/awk -f
# SPDX-FileCopyrightText: ANNO DOMINI 2024 Jan Chren ~rindeal <dev.rindeal gmail.com>
# SPDX-License-Identifier: GPL-2.0-only OR GPL-3.0-only
# Homepage: https://gist.github.com/rindeal/067713ba639ed703f60a7c503afd55bb
# Usage: decomment-js.awk < input_file.js > output_file.js
# This script removes comments from JavaScript-like source files using gcc preprocessor.
# It retains shebangs, which would otherwise be interpreted as preprocessor directives.
BEGIN {
    # Command that the remaining input is piped to
    gcc_cmd = "gcc -fpreprocessed -dD -E -P -x c -"
}
# Print the first shebang line unmodified
! shebang_found && /^#!/ {
    shebang_found = 1
    print
    # Flush so the shebang reaches stdout before gcc writes its output
    fflush()
    next
}
# For all other lines, send them to gcc
{
    print | gcc_cmd
}
END {
    # Close the pipe to gcc and wait for it to finish
    close(gcc_cmd)
}
{
  "name": "decomment-js.awk",
  "version": "1.0.0",
  "description": "A simple executable AWK script",
  "main": "",
  "bin": {
    "decomment-js": "./decomment-js.awk"
  },
  "scripts": {
    "start": "awk -f ./decomment-js.awk"
  },
  "keywords": [
    "awk",
    "script",
    "executable"
  ],
  "author": "rindeal",
  "license": "GPL-2.0-only OR GPL-3.0-only"
}

Stripping JavaScript Comments

When working with JavaScript, TypeScript, and similar languages, you might need to strip out comments without altering the code structure. Here's a simple way to do it using the GCC preprocessor.

Command

gcc -fpreprocessed -E -P -x c - < in.js > out.js

Explanation

Unlike tools such as Babel or Prettier, this command doesn't parse the input into an Abstract Syntax Tree (AST). It only removes comments. This makes it particularly useful for unusual files such as code snippets, which those tools may reject because of statements that are invalid on their own (like a return outside of a function body), since they expect whole files.

Benefits

  • No AST Parsing: Avoids the complexity and potential errors associated with AST parsing.
  • Handles Unusual Files: Works with files that might have syntax issues or unconventional structures.
  • Strips Blank Lines: Cleans up your code by removing unnecessary blank lines.

This command is a quick way to strip comments, but it is not a full parser: it won't handle edge cases such as comments in multiline strings or regexes.

Note on Preprocessor Directives

The raw GCC command still interprets lines starting with # as preprocessor directives, and that includes shebangs. To keep a shebang intact, you can use the AWK script above, which prints the first shebang line unmodified and pipes every other line to GCC. You can install it with this command:

npm install -g https://gist.github.com/rindeal/067713ba639ed703f60a7c503afd55bb
# now you can run it like this:
#
#     decomment-js < in.js > out.js
#