The checkglobals module + patch (from LuaPowerPatches: Download Patch for Lua 5.1.3) is a hybrid of the compile-time and run-time approaches to detecting undefined variables. Consider the following trivial Lua module:
-- multiplybyx.lua
local function multiplybyx(y)
  return y * X  -- is X defined???
end
return multiplybyx
Is this code valid? Did we mistype x as X? Well, we can detect at compile time that X is a global variable, but whether X is a "defined" global variable cannot, in general, be known until run-time:
-- main.lua
local multiplybyx = dofile 'multiplybyx.lua'
X = 2
print(multiplybyx(5)) -- multiplybyx is valid
X = nil
print(multiplybyx(5)) -- multiplybyx is now not valid
So, we'll define a function checkglobals that determines whether all the globals "directly" referenced lexically inside the code of a given function (e.g. multiplybyx) are defined at the time checkglobals is called:
-- main.lua
local checkglobals = require 'checkglobals'
local multiplybyx = dofile 'multiplybyx.lua'
X = 2
checkglobals(multiplybyx) -- ok: multiplybyx is valid
print(multiplybyx(5))
X = nil
checkglobals(multiplybyx) -- fails: multiplybyx is not valid
print(multiplybyx(5))
$ lua main.lua
10
lua: main.lua:8: accessed undefined variable "X" at line 3
stack traceback:
[C]: in function 'error'
etc/checkglobals.lua:77: in function 'checkglobals'
main.lua:8: in main chunk
[C]: ?
The function checkglobals(f) operates by retrieving the environment table (env) of function f, which is known at run-time, and the list of all global get and set bytecodes (GETGLOBAL and SETGLOBAL) lexically inside f, which is known at compile-time. checkglobals verifies, for each get or set of a global named varname, that env[varname] ~= nil. If this check fails, checkglobals raises an error. Unless the code was stripped (i.e. luac -s), the error also contains the line number on which the global variable was accessed.
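The check itself is small once the list of global names is available. The following is a minimal sketch of only that step, runnable on stock Lua. The names list is written by hand and stands in for what the patch extracts from the GETGLOBAL and SETGLOBAL bytecodes, and find_undefined is a hypothetical helper name, not part of the module.

```lua
-- Sketch only: `names` is hand-written; in the real module the list comes
-- from the patched debug.getinfo. `find_undefined` is a hypothetical name.
local function find_undefined(names, env)
  for _, name in ipairs(names) do
    if env[name] == nil then
      return name          -- first referenced global that env does not define
    end
  end
  return nil               -- every referenced global is defined
end

local env = { X = 2 }      -- stand-in for the function's environment table
local names = { "X" }      -- globals referenced lexically by multiplybyx

local missing_before = find_undefined(names, env)  -- X is defined here
env.X = nil
local missing_after = find_undefined(names, env)   -- X is no longer defined
```

The result depends only on the state of env at the moment of the call, which is why the same function can pass the check at one point in the program and fail it at another.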
The checkglobals function accepts some additional parameters that make it more flexible; see the comments in the source for details. The implementation of this module (on the Lua side) is basically this:
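local getinfo = debug.getinfo  -- must be the patched version supporting 'g'

local function checkglobals(f, env)
  -- f may be a function or a stack level; the default is the caller
  local fp = f or 1
  if type(fp) == 'number' then fp = fp + 1 end  -- skip checkglobals' own frame
  env = env or getfenv(2)  -- default to the caller's environment
  local gref = getinfo(fp, 'g').globals
  for i = 1, #gref, gref.ncols do
    -- each record is op, name and (unless stripped) line number
    local op, name, linenum = unpack(gref, i, i + gref.ncols - 1)
    if env[name] == nil then
      error('accessed undefined variable "' .. name .. '"' ..
            (linenum and ' at line ' .. linenum or ''), 2)
    end
  end
  return f
end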
This code makes use of a patched debug.getinfo that supports a new "g" ("globals") option, which returns the list of all globals accessed lexically inside the given function (including functions lexically nested inside that function). gref = getinfo(fp, 'g').globals is an array. For each global accessed, the following values are appended to the array: the access type ("GETGLOBAL" or "SETGLOBAL"), the variable name (as a string), and the line number (if the source was not stripped). There is also a field gref.ncols equal to the number of columns (2 or 3) represented in the flat array.
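To make the flat layout concrete, here is a small sketch that decodes such an array into one record per access. The gref table is built by hand to mimic the layout described above (on a patched Lua it would come from debug.getinfo), and its contents are illustrative only.

```lua
-- Sketch only: `gref` is hand-built to mimic the described layout; on a
-- patched Lua it would come from debug.getinfo(f, 'g').globals.
local gref = {
  "GETGLOBAL", "print", 2,
  "GETGLOBAL", "K",     4,
  "SETGLOBAL", "main",  9,
  ncols = 3,
}

-- Walk the flat array one record (ncols entries) at a time.
local records = {}
for i = 1, #gref, gref.ncols do
  records[#records + 1] = {
    op = gref[i],
    name = gref[i + 1],
    -- the line number column exists only when ncols == 3 (unstripped code)
    line = gref.ncols >= 3 and gref[i + 2] or nil,
  }
end
```

Stepping by gref.ncols rather than a fixed 3 keeps the loop correct for stripped code, where each access occupies only two slots.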
Below are some examples of possible ways to use the module:
Example:
-- factorial.lua
function factorial(k)
  if k == 1 then
    return K  -- oops!
  else
    return k * factorial(k-1)
  end
end
function main()
  print(factorial(10))
end
require 'checkglobals' ()  -- fails since K is undefined
main()
Example:
-- factorial.lua
require 'checkglobals' ()  -- fails since K is undefined
-- note: no new globals can be "directly" defined beyond this point
-- (though via _G and getfenv() is ok).
local function factorial(k)
  if k == 1 then
    return K  -- oops!
  else
    return k * factorial(k-1)
  end
end
local function main()
  print(factorial(10))
end
main()
Example:
-- factorial.lua
local M = {}
local function factorial(k)
  if k == 1 then
    return K  -- oops!
  else
    return k * factorial(k-1)
  end
end
M.factorial = factorial
require 'checkglobals' ()
return M
Note the patch made to Lua's debug library (ldblib.c). The patch is rather simple and quite isolated. It only makes additions (no deletions) to lua_getinfo and debug.getinfo to support the new "g" ("globals") option.
The new "g" option may have uses elsewhere, so this might be a useful addition to Lua's debug module. The list of globals that a function accesses can be considered part of the function's interface, which is a fundamental aspect of what the function is. Reflection/introspection is largely about accessing information on interfaces.
This "g" option may alternatively be defined in terms of lhf's bytecode inspector library (lbci, http://www.tecgraf.puc-rio.br/~lhf/ftp/lua/#lbci). See checkglobals-lbci.lua.
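Where neither the patch nor lbci is available, a cruder off-line way to get the same (access type, name, line) triples is to scan a luac -l listing. The sketch below is only an approximation of that idea: the listing string is a hand-written sample in the shape luac -l prints for Lua 5.1 (it was not generated here), and the column layout it assumes may differ between Lua versions.

```lua
-- Sketch only: `listing` is a hand-written sample shaped like `luac -l`
-- output for Lua 5.1; the exact column layout is an assumption.
local listing = table.concat({
  "\t1\t[3]\tGETGLOBAL\t1 -1\t; X",
  "\t2\t[3]\tMUL      \t1 0 1",
  "\t3\t[3]\tRETURN   \t1 2",
}, "\n")

-- Collect op, name, line triples into the same flat layout as gref.
local gref = { ncols = 3 }
for line in listing:gmatch("[^\n]+") do
  local linenum, op, name =
    line:match("%[(%d+)%]%s+([GS]ETGLOBAL)%s+.-;%s*(%S+)")
  if op then  -- lines for other opcodes do not match and are skipped
    gref[#gref + 1] = op
    gref[#gref + 1] = name
    gref[#gref + 1] = tonumber(linenum)
  end
end
```

Unlike the patched debug.getinfo, this works on listing text rather than on a live function, so it cannot be pointed at an arbitrary function value at run-time.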
What are the advantages/disadvantages/caveats to checkglobals? Here are some qualities of it:
- This code is intended to be simple, robust and suitable for general use, with semantics that are fairly easy to understand and free of corner cases.
- It detects global accesses in code that is never executed (similar to static analysis approaches).
- It does not mess with environment metatables (as strict.lua does), which can potentially cause obscure conflicts.
- It makes weaker assumptions about global variable defined-ness than the static analysis approaches, though it makes stronger assumptions than the "strict" approach. Mainly, it assumes that globals aren't created or destroyed between the time that checkglobals is called and the time that the validated function is called. Note that you may call checkglobals more than once (e.g. after creating new globals).
- The checks are normally applied just after code loading (not off-line as with luac -p -l, or on each variable access as with strict.lua), though they may be done later or more frequently.
- The checkglobals approach may be combined with the strict approach for the strongest validation.
- It requires a patch to lua_getinfo and debug.getinfo to support the new "g" ("globals") option used by checkglobals.lua. This patch is entirely backwards compatible and rather isolated, and it might be useful for other purposes as well.
- checkglobals is written entirely in Lua and can be customized.
- checkglobals (like the static analysis approaches) assumes that a function has a single, non-changing environment. It also assumes that lexically nested functions have the same environment as the parent function, although this restriction might be relaxed with an additional parameter that causes checkglobals to ignore lexically nested functions: checkglobals(f,env,'norecurse'). That would also require an extension to the debug.getinfo patch. See LuaList:2008-03/msg00598 for details.
- As this suggestion as written is new (I believe), its design qualities may still need to be verified in practice.
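For contrast with the load-time check, here is a minimal sketch of the per-access style of checking that the "strict" approach relies on. It is not the code of strict.lua: it guards a private table standing in for the global environment, so that it can be run without touching _G.

```lua
-- Sketch only: a strict-style guard on a private table that stands in for
-- the global environment; this is not the code of strict.lua itself.
local env = setmetatable({ X = 2 }, {
  -- __index fires only for keys that are absent from the table
  __index = function(_, name)
    error('accessed undefined variable "' .. name .. '"', 2)
  end,
})

-- Reading a defined name succeeds; reading an undefined one raises.
local ok_defined, value = pcall(function() return env.X end)
local ok_missing, message = pcall(function() return env.Y end)
```

A guard like this fires only when the offending access is actually executed, whereas checkglobals reports references in code that has not run yet, which is why the two approaches complement each other.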
See also the mailing list discussion: LuaList:2008-03/msg00440.html.
--DavidManura
(P.S. I no longer use this but rather prefer semantically aware Lua text editors.)