This document outlines modifications to how the compiler processes CommonJS require
calls.
These changes are desirable in order to allow Node.js modules to be either included or excluded from the compiler.
Currently, if a project uses a native node module like fs
the source code must omit the require
statement (relying on node --externs
) in order for the code to compile. This is undesirable because it breaks the code for the compiler. It should be possible to keep the require
statement and have the compiler ignore it. This can be achieved currently by aliasing the require
function (e.g var externalRequire = require
) and then using externalRequire
for modules that shouldn't be compiled. This is a useful but hacky solution to the problem.
In addition, it is currently impossible for a node_module
which supports Closure compiler to be built at the same time as the main code as the compiler cannot find the source files from the require statement alone (e.g require("q")
)
Currently, the compiler processes CommonJS require(<modname>)
statements by expanding it like a macro into:
goog.require(module$<modname>)
goog.provide(module$<modname>)
This works well when <modname>
is a file path (like ../foo/bar
) which is also provided to the compiler at compile time.
The problem with this approach is that it assumes that the require
d module code will be supplied to the compiler (e.g via --js
). However, the module may not always be included if:
- it is a native module where providing the source is not possible or practical like
fs
orcrypto
. - it is a
node_module
which is not compatible with Closure compiler or should be excluded for some other reason.
In these cases, the desired behaviour is that the compiler ignores these require
statements and does not expand them.
The proposed modifications would:
- Introduce a new annotation
@externalRequire
which would make the compiler ignore (not expand) the annotatedrequire
statement. - Apply a lookup algorithm specific to Node.js to resolve
node_modules
file paths.
The lookup algorithm would be applied only if the require
statement cannot be resolved (e.g because it isn't a file path). A full description of the algorithm can be found at https://nodejs.org/api/modules.html#modules_loading_from_node_modules_folders . The algorithm would:
- check if it is a native node module like
fs
. If it is, return that (this would be the extern file). Else, continue. - check the parent directory of the current script for a
node_modules
directory. - If found, it would look for a directory within it which matches the name of the module.
- If not found, it would go up a directory and look again for
node_modules
. This process would be repeated until the root of the file system is reached, where it will terminate.
If a matching directory is found, it would:
- Look for a
package.json
file. - If found, it would look for the
main
key and resolve the js file referenced. This would be the file used by Closure compiler for therequire
statement. - If not found (or
package.json
does not have amain
key), it would attempt to resolve the filesindex.js
andindex.node
in that directory.
This change was originally desired in order to compile Node.js code without modifications to the actual source code. Currently, node projects which want to be built using closure compiler have to be pre-processed by resolving require
dependencies and concatenating the output to the js files. This is suboptimal as it makes compiler errors and warnings harder to identify and provides no way to exclude certain require
s. Given there is already some CommonJS support in the compiler, it makes sense to fully support this use case.
For excluding modules, the decision to introduce a new annotation was made in order to clarify the intent of the code. The alias technique described earlier works but makes it hard to understand why this is being done without additional comments. The alias technique also unnecessarily contorts the code by introducing another variable simply for the compiler.
// closure compiler needs this in order to compile
var externalRequire = require;
var crypto = externalRequire("crypto");
vs
/** @externalRequire */
var crypto = require("crypto");
The annotation makes it clear that it is for an external tool rather than for the code being written.
For including modules, the algorithm employed is the same as the Node.js implementation. By using the same algorithm, it makes Closure compiler much more compatible with existing node code without modifications. For example, without this algorithm, developers would need to modify their code from require("q")
to directly reference the source file require("../node_modules/q/index.js")
.
If the policy for including node modules is introduced, the current extern behaviour for node modules could be improved. Currently, the compiler incorrectly assumes global variables for node modules matches the name of the module. For example, you can call methods on an undeclared crypto
variable that will resolve to the crypto
node extern file if you add --extern crypto.js
. However, it is entirely possible that the developer is already using the crypto
variable for something else and is using the crypto
node module under the variable nodeCrypto
. The compiler doesn't handle this use case well currently. The proposed policy could fix this by correctly resolving var nodeCrypto = require("crypto");
to the module.exports
of the extern file crypto.js
, rather than trying to piggy back off the global variable name.
Our biweekly meeting was cancelled this week, so it will be a couple of weeks before we can discuss this more.