LuaState (a Lua Wrapper) 4.1 Alpha Distribution
http://workspacewhiz.com/Other/LuaState/LuaState.html
http://workspacewhiz.com/Other/LuaState/LuaState_LuaWrapper41Alpha.zip
Overview
Warning: The following distribution contains modified Lua code. It is not an official distribution of Lua as the authors intended it. Many modifications are superficial, but some, such as the Unicode string type, go quite a bit deeper. It is my hope these changes will be considered for inclusion into the master Lua distribution.
The intent of this package is not to masquerade as an official Lua interpreter. Any references to Lua in this document refer specifically to the modified Lua code contained herein, unless otherwise noted. Only additions have been made to Lua.
The LuaState distribution provides the following functionality:
- All the power of the core Lua distribution.
- Unicode support.
- Custom memory allocators.
- Memory optimizations.
- Can be built into Win32 DLL or LIB form.
- Win32 multithreading.
- Unified methods.
- String formatting enhancements.
- Built-in raw pointer type.
- Serialization of Lua tables.
- An easy to use C++ wrapper.
API documentation is here.
Unicode
Unicode (or rather, wide character) support is built in as a native Lua type. While it is entirely possible to represent a Unicode string using a regular Lua 8-bit clean string, there is then no way to determine whether a given string is Unicode or not. A second problem involves standard string functionality, such as the concatenation operator. If a Unicode string were represented as a normal 8-bit Lua string, special functions would have to be written to operate on it (e.g. to concatenate two 8-bit clean strings that represent Unicode). Rather than confuse the user, a dedicated Unicode string type is provided. The term Unicode is used loosely here (there are several Unicode encodings); for the purposes of this distribution, Unicode refers to 16-bit wide character support.
Unicode strings can be entered into Lua script using the C approach to wide character strings.
L"Wide characters"
By inserting an L in front of the quote, the Lua lexer creates a wide character representation of the above string. If the string was entered as a regular Lua string, the Unicode equivalent would be simulated as follows:
"W\000i\000d\000e\000 \000c\000h\000a\000r\000a\000c\000t\000e\000r\000s\000\000\000"
In the event it is necessary to insert Unicode character codes in the wide character string, an additional property of the L"" approach may be used. 16-bit characters may be entered using hexadecimal notation, in the same way as C:
L"Wide characters: \x3042\x3043"
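The simulated byte layout shown earlier in this section can be reproduced with a few lines of C++. This is only a sketch: the helper name widenAscii is hypothetical (it is not part of LuaState), and it assumes little-endian 16-bit storage, as on Win32, with a two-byte terminator.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of LuaState): expands each 7-bit ASCII
// character into a 16-bit little-endian code unit and appends a two-byte
// terminator, matching the simulated string shown in the text.
std::string widenAscii(const std::string& ascii)
{
    std::string bytes;
    bytes.reserve((ascii.size() + 1) * 2);
    for (unsigned char ch : ascii) {
        assert(ch < 0x80);                       // sketch handles ASCII only
        bytes.push_back(static_cast<char>(ch));  // low byte
        bytes.push_back('\0');                   // high byte
    }
    bytes.append(2, '\0');                       // 16-bit terminator
    return bytes;
}
```

Calling widenAscii("Wide characters") yields 32 bytes: 15 characters of two bytes each, plus the terminator.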
The standard Lua libraries have been upgraded to accept and operate on Unicode strings. Below is a brief list of added functionality.
- LUA_TUSTRING type added. Mirrors LUA_TSTRING.
- print() has been upgraded to take Unicode strings as input.
- tonumber() can take a Unicode string as a parameter.
- If tostring() receives a Unicode string as input, it converts it to ANSI and returns it.
- toustring() added. If toustring() receives an ANSI string as input, it converts it to Unicode and returns it.
- read() can read Unicode files/strings if the input string is Unicode. The file must be opened in binary mode.
- write() can write Unicode strings.
- The following lstrlib.c functions have been added and upgraded to Unicode. All behavior corresponds to the ANSI equivalents.
  - ustrlen(). Mirrors strlen().
  - ustrsub(). Mirrors strsub().
  - ustrlower(). Mirrors strlower().
  - ustrupper(). Mirrors strupper().
  - ustrrep(). Mirrors strrep().
  - ustrbyte(). Mirrors strbyte().
  - ustrchar(). Mirrors strchar().
  - uformat(). Mirrors format().
  - ustrfind(). Mirrors strfind().
  - ugsub(). Mirrors gsub().
- The following public Lua functions have been added or upgraded and mirror their ANSI counterparts:
  - lua_isustring() added. Mirrors lua_isstring().
  - lua_toustring() added. Mirrors lua_tostring().
  - lua_pushlustring() added. Mirrors lua_pushlstring().
  - lua_pushustring() added. Mirrors lua_pushstring().
  - lua_pushuliteral() added. Mirrors lua_pushliteral().
  - lua_strlen() will properly report the length of a Unicode string.
- The following private Lua functions have been added and mirror their ANSI counterparts:
  - luaL_check_lustr() added. Mirrors luaL_checklstr().
  - luaL_check_ustring() added. Mirrors luaL_check_string().
  - luaL_addusize() added. Mirrors luaL_addsize().
  - luaL_addlustring() added. Mirrors luaL_addlstring().
  - luaL_addustring() added. Mirrors luaL_addstring().
  - l_us() define added. Mirrors l_s().
  - l_uc() define added. Mirrors l_c().
  - LUA_NUMBER_USCAN added. Mirrors LUA_NUMBER_SCAN.
  - LUA_NUMBER_UFMT added. Mirrors LUA_NUMBER_FMT.
  - luaL_Buffer can now distinguish whether it is in double byte or single byte form.
    - luaL_putwchar() added. Mirrors luaL_putchar().
    - luaL_buffinit() takes an additional parameter.
  - lua_number2ustr() added.
Memory Allocators
This distribution replaces the #define approach to memory allocation within Lua with a callback mechanism in which the memory allocators can be replaced on a per Lua state basis. This makes it possible to adjust the memory allocation strategy for each state independently.
For purposes of better memory tracking, the realloc() callback allows a void pointer of user data, an allocation name, and allocation flags to be passed along. All of these arguments are optional, but they are available if the memory allocation callback needs them.
The only allocation flag available is LUA_ALLOC_TEMP. A memory manager could react to the LUA_ALLOC_TEMP flag, for instance, by allocating the code for the main function of a Lua file at the top of the heap. If all other Lua allocations happen at the bottom of the heap, no holes will be left in memory when the LUA_ALLOC_TEMP flagged allocation is garbage collected.
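The top-of-heap idea can be illustrated with a tiny two-ended arena. This is a sketch under stated assumptions, not LuaState code: TwoEndedArena and kAllocTemp are hypothetical names (kAllocTemp merely stands in for LUA_ALLOC_TEMP, whose real value comes from the LuaState headers), and alignment and individual frees are ignored for brevity.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the LUA_ALLOC_TEMP flag.
const unsigned int kAllocTemp = 1u;

// Fixed-size arena: normal requests grow up from the bottom, temporary
// requests grow down from the top, so the two kinds never interleave.
// Alignment is ignored to keep the sketch short.
struct TwoEndedArena
{
    static const std::size_t kSize = 1024;
    unsigned char storage[kSize];
    std::size_t bottom;   // first free byte above the normal allocations
    std::size_t top;      // first byte used by the temporary allocations

    TwoEndedArena() : bottom(0), top(kSize) {}

    void* allocate(std::size_t size, unsigned int allocFlags)
    {
        assert(bottom <= top);
        if (size > top - bottom)
            return nullptr;               // out of space
        if (allocFlags & kAllocTemp) {
            top -= size;
            return storage + top;
        }
        void* result = storage + bottom;
        bottom += size;
        return result;
    }

    // Releases every temporary allocation at once, leaving no holes
    // between the normal allocations.
    void releaseTemporaries() { top = kSize; }
};
```

After releaseTemporaries(), the normal allocations remain packed at the bottom of the arena.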
The callbacks look like:
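#include <stdlib.h>  /* realloc(), free() */

/* Reallocation callback. data, allocName, and allocFlags are optional extras. */
static void* luaHelper_ReallocFunction(void* ptr, int size, void* data, const char* allocName, unsigned int allocFlags)
{
    return realloc(ptr, size);
}

/* Free callback. */
static void luaHelper_FreeFunction(void* ptr, void* data)
{
    free(ptr);
}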
The allocation functions must be assigned before a Lua global state is created, in a fashion similar to below. It is good practice to restore the previous realloc() and free() callbacks.
lua_ReallocFunction oldReallocFunc;
lua_FreeFunction oldFreeFunc;
void* oldData;
lua_getdefaultmemoryfunctions(&oldReallocFunc, &oldFreeFunc, &oldData);
lua_setdefaultmemoryfunctions(luaHelper_ReallocFunction, luaHelper_FreeFunction, NULL);
lua_State* state = lua_open(0);
lua_setdefaultmemoryfunctions(oldReallocFunc, oldFreeFunc, oldData);
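As an illustration of what the user data pointer makes possible, here is a sketch of a callback pair that counts calls. The parameter lists copy the callbacks shown above; the AllocStats struct and the Tracking_ names are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical user data handed to the callbacks through the data argument.
struct AllocStats
{
    int reallocCalls;
    int freeCalls;
};

// Same parameter list as the realloc callback shown above; counts calls.
void* Tracking_ReallocFunction(void* ptr, int size, void* data,
                               const char* allocName, unsigned int allocFlags)
{
    (void)allocName;     // unused in this sketch
    (void)allocFlags;
    assert(size >= 0);
    AllocStats* stats = static_cast<AllocStats*>(data);
    if (stats)
        ++stats->reallocCalls;
    return std::realloc(ptr, static_cast<std::size_t>(size));
}

// Same parameter list as the free callback shown above; counts calls.
void Tracking_FreeFunction(void* ptr, void* data)
{
    AllocStats* stats = static_cast<AllocStats*>(data);
    if (stats)
        ++stats->freeCalls;
    std::free(ptr);
}
```

Presumably the address of an AllocStats instance would be passed where the example above passes NULL, so that it arrives in each callback as data.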
Memory Optimizations
A whole host of functionality has been added to facilitate the optimization of memory usage in a tight memory environment.
- lua_setminimumstringtablesize(int numstrings) will ensure the global string table is always of the minimum size specified by numstrings. When garbage collection occurs, the string table will never shrink below numstrings. An application can determine the maximum number of strings it will use and ensure the space is reserved in advance, so as to avoid fragmentation when the string table is resized.
- lua_setdefaulttagtablesize(int numtags) will create a hash table of numtags size for the tag table.
- lua_setminimumglobaltablesize(int numentries) ensures the globals table is capable of holding at least numentries elements without resizing, to avoid fragmenting the heap.
- lua_setminimumauxspace(int size) creates a minimum auxiliary space buffer of size bytes. Auxiliary space is automatically allocated with the allocation flag LUA_ALLOC_TEMP. An application might use this to put auxiliary space at the top of a heap or in some other location, so later freeing of it doesn't cause fragmentation issues.
- lua_setmainfunctionallocflags(flags) sets the allocation flags to be used when reading in the main function of a chunk. The user could pass in LUA_ALLOC_TEMP as the main function allocation flag.
- lua_newtablesize(lua_State* L, int size) is an enhanced version of lua_newtable() that creates a hash table of the given size so as to avoid reallocations.
- luaM_setname() macro added. It is used internally to name groups of allocations. The name is passed into the realloc() callback function and may be used to better categorize allocations.
- lua_loadfile() has been optimized to not push temporary strings to the Lua stack, which could immediately be garbage collected. This is another optimization that avoids "holes" in a memory heap.
- Tables can be created in the form:
table = { &100 1, 2, 3, 4, 5, 6 }
where the &100 creates a table with a hash table of size 100. The &100 may, of course, be any value.
Multithreading
Multithreading is built into the LuaState distribution by default. The function lua_setlockfunctions() can be used to set up the multithreading.
Example:
static void LSLock(void* data)
{
    CRITICAL_SECTION* cs = (CRITICAL_SECTION*)data;
    ::EnterCriticalSection(cs);
}

static void LSUnlock(void* data)
{
    CRITICAL_SECTION* cs = (CRITICAL_SECTION*)data;
    ::LeaveCriticalSection(cs);
}

lua_State* m_state = lua_open(stackSize);
CRITICAL_SECTION* cs = new CRITICAL_SECTION;
::InitializeCriticalSection(cs);
lua_setlockfunctions(m_state, LSLock, LSUnlock, cs);
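Host code that needs the same lock for its own work has to keep every lock call paired with an unlock. One way to do that is a small RAII guard over the callback pair, sketched below. ScopedLock, LockFunction, and the DepthCounter stand-in are hypothetical names, not part of LuaState; the stand-in only records nesting depth so the sketch runs without Win32.

```cpp
#include <cassert>

// Callback type with the same shape as LSLock/LSUnlock above.
typedef void (*LockFunction)(void* data);

// Hypothetical RAII guard: calls lock on construction and unlock on
// destruction, so early returns cannot leave the lock held.
class ScopedLock
{
public:
    ScopedLock(LockFunction lock, LockFunction unlock, void* data)
        : m_unlock(unlock), m_data(data)
    {
        assert(lock && unlock);
        lock(data);
    }
    ~ScopedLock() { m_unlock(m_data); }

private:
    ScopedLock(const ScopedLock&);             // non-copyable
    ScopedLock& operator=(const ScopedLock&);
    LockFunction m_unlock;
    void* m_data;
};

// Stand-in "lock" that only records nesting depth.
struct DepthCounter
{
    int depth;
    int maxDepth;
};

void CounterLock(void* data)
{
    DepthCounter* c = static_cast<DepthCounter*>(data);
    ++c->depth;
    if (c->depth > c->maxDepth)
        c->maxDepth = c->depth;
}

void CounterUnlock(void* data)
{
    --static_cast<DepthCounter*>(data)->depth;
}
```

With the Win32 callbacks above, the guard would be constructed as ScopedLock guard(LSLock, LSUnlock, cs).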
Fatal Error Handler
Having exit() be called in non-command line apps is generally a bad thing. In some environments, exit() can't be called at all. Rather than have the application blow up in an undesirable fashion, the LuaState distribution allows the fatal error exit() function in Lua to be overridden through a call to lua_setfatalerrorfunction(). The default fatal error callback runs the exit() function. It can be replaced as desired.
Other Optimizations
- lua_gettop() has been inlined in lua.h. LuaState calls lua_gettop() a lot, and with the function inlined most C++ optimizers will optimize the calls out.
- TObject's structure format has been rearranged to fit in less memory. This works well in an environment where few tags are created. Otherwise, the default implementation of TObject is better.
New String Formatting Enhancements
format has been extended with the following control types. The use of these control types makes it easy to write binary file formats to disk.
- format("%bb", 255) creates a string containing the 8-bit byte 255.
- format("%bw", 1000) creates a string containing the 16-bit word 1000.
- format("%bd", 1000000) creates a string containing the 32-bit dword 1000000.
- format("%bf", 1.0f) creates a string containing the 32-bit float 1.0.
- format("%bF", 1.0) creates a string containing the 64-bit double 1.0.
- format("%Q", str) turns a binary string into a printable string.
Additionally, ANSI strings can use the hexadecimal character notation to insert bytes into the string:
str = "Hel\x80o"
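To show what the integer control types (%bb, %bw, %bd) conceptually produce, here is a C++ sketch that packs integers into a byte string. The function names are hypothetical, and little-endian byte order is an assumption (natural for Win32); the document does not state the order LuaState uses.

```cpp
#include <cassert>
#include <string>

// Appends the low byteCount bytes of value, least significant byte first.
// Little-endian order is an assumption made for this sketch.
void appendLittleEndian(std::string& out, unsigned long value, int byteCount)
{
    assert(byteCount >= 1 && byteCount <= 4);
    for (int i = 0; i < byteCount; ++i) {
        out.push_back(static_cast<char>(value & 0xFF));
        value >>= 8;
    }
}

// Hypothetical counterparts of %bb, %bw, and %bd.
std::string packByte(unsigned long v)
{
    std::string s;
    appendLittleEndian(s, v, 1);
    return s;
}

std::string packWord(unsigned long v)
{
    std::string s;
    appendLittleEndian(s, v, 2);
    return s;
}

std::string packDword(unsigned long v)
{
    std::string s;
    appendLittleEndian(s, v, 4);
    return s;
}
```

Under that assumption, packWord(1000) produces the two bytes 0xE8 0x03, since 1000 is 0x03E8.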
Built-in Pointer Type
The LuaState distribution offers a built-in pointer type. The pointer type is used solely to pass a raw pointer into Lua and back out to a C function. There are some advantages offered by the pointer type over the user data type:
- Handing off a pointer to Lua is "free." For user data, there is a memory cost associated with creating a user data object. For simple pointer passing, the pointer type is a much better alternative.
- Since the mantissa of a double is large enough to hold a 32-bit pointer without data loss, a Lua double could be used to hand off pointers. The pointer interface is much cleaner than the double one and far more portable.
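The claim about doubles can be checked directly. The sketch below round-trips 32-bit address values through a double; the function names are hypothetical. A double has a 53-bit significand, so every 32-bit value survives, while a 64-bit pointer would not be guaranteed to, which is one reason the double approach is less portable.

```cpp
#include <cassert>
#include <cstdint>

// Stores a 32-bit address value in a double. Exact, because a double's
// 53-bit significand can represent every 32-bit integer.
double addressToDouble(std::uint32_t address)
{
    return static_cast<double>(address);
}

// Recovers the 32-bit address value from the double.
std::uint32_t doubleToAddress(double value)
{
    assert(value >= 0.0 && value <= 4294967295.0);
    return static_cast<std::uint32_t>(value);
}
```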
A pointer is represented by the Lua type LUA_TPOINTER. The following functions are available for pointer access and mirror their Lua type counterparts:
- lua_ispointer()
- lua_getpointer() (named this way because lua_topointer() is already in use)
- lua_pushpointer()
Unified Methods
Unified methods are based heavily on Edgar Toernig's Sol implementation of unified methods (note: some text is taken verbatim from the Sol documentation).
Every object in Lua has an attached method table. For C++ users, the method table is most similar to a v-table. For Lua's simple types (nil, number, string, ustring, and function), there is one method table for all objects of the given type. Table and userdata objects have the ability to have method tables on a per object basis.
Unlike Edgar's Sol implementation, the colon operator for Lua's automatic self functions is not replaced with an alternate implementation. This is done in an effort to keep LuaState functionality identical to the original Lua distribution. Instead, two new function operators are introduced. The pointer symbol (->) behaves like the colon operator, but it looks up the function to call in the method table. The second operator is the double colon operator, which behaves like the regular dot operator (no self is passed in).
The biggest advantage of unified methods is memory savings. When dealing with many Lua objects (say, tables) of the same type, the functions don't have to be duplicated for each and every one. Significant amounts of memory may be saved by the use of the shared method table.
Every data type has methods, even numbers. It is possible to write code that looks like:
print(4->sqrt())
str = "Hello"
print(str->len())
For the majority of cases, the use of tag methods can more or less be forgotten.
The default method tables have been put into global names for convenience. They are named like the type but with a capital first letter (Nil, Number, String, UString, Function, Table, Userdata).
Method tables may be accessed from within Lua using the methods() function.
table = {}

-- Save the old methods.
tableMethods = methods(table)

newMethods =
{
    doNothing = function(self)
    end
}

-- Set the new methods.
methods(table, newMethods)

table->doNothing()
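The memory argument for shared method tables can be pictured in C++ terms. The sketch below is an analogy only, not how LuaState is implemented: Object, MethodTable, and doubleIt are hypothetical names. Each object carries one pointer to a table of functions, so any number of objects share a single copy of the functions.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef int (*Method)(int self);
typedef std::map<std::string, Method> MethodTable;

// Example method: receives the object's value as self.
int doubleIt(int self) { return self * 2; }

// Each object stores only a pointer to a method table, so the functions
// are not duplicated per object.
struct Object
{
    int value;
    const MethodTable* methods;

    // Rough analogue of obj->name(): look the function up in the method
    // table and pass the object's value as self.
    int call(const std::string& name) const
    {
        MethodTable::const_iterator it = methods->find(name);
        assert(it != methods->end());
        return it->second(value);
    }
};
```

Two objects built with the same table pointer resolve their calls through the same table, which is the saving described in this section.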
In C, methods may be retrieved using the following functions:
LUA_API void lua_getmethods(lua_State *L, int index);
LUA_API void lua_getdefaultmethods(lua_State *L, int type);
LUA_API void lua_setmethods(lua_State *L, int index);
Serializing
The LuaState distribution can write out a Lua table in a nice, formatted file. The only downside to LuaState's approach is that the table can't currently be cyclic.
A table can be written both from Lua and C++. The function prototypes are:
function WriteLuaFile(fileName, objectName, valueToWrite, indentLevel, writeAll, alphabetical, maxIndentLevel)
function WriteLuaGlobalsFile(fileName, writeAll, alphabetical, maxIndentLevel)
function WriteLuaObject(filePtr, objectName, valueToWrite, indentLevel, writeAll, alphabetical, maxIndentLevel)
- fileName - The name of the file to write to disk.
- filePtr - A pointer from the Lua function openfile().
- objectName - The name of the initial Lua table to write to disk.
- valueToWrite - The Lua table to write.
- indentLevel - The number of tabs to indent each line. Passing in -1 will cause special formatting to occur that assumes the globals() table is being written.
- writeAll - If 1, writes all Lua objects out, including function and user data information.
- alphabetical - If 1, each table written is sorted.
- maxIndentLevel - May be nil. The maximum number of nested tables allowed in the write. If this value is exceeded, then no carriage returns are inserted.
The C++ functionality is very similar in form.
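The general shape of such a writer, with tab indentation and optional alphabetical ordering, can be sketched in standalone C++. This is not the LuaState implementation: the Node type, the helper names, and the exact output layout are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a Lua table entry.
struct Node
{
    std::string name;
    std::string value;            // used when isTable is false
    bool isTable;
    std::vector<Node> children;   // used when isTable is true
};

Node makeLeaf(const std::string& name, const std::string& value)
{
    Node n;
    n.name = name;
    n.value = value;
    n.isTable = false;
    return n;
}

Node makeTable(const std::string& name)
{
    Node n;
    n.name = name;
    n.isTable = true;
    return n;
}

bool nameLess(const Node& a, const Node& b) { return a.name < b.name; }

// Writes node and, recursively, its children, one tab per nesting level.
// node is taken by value so sorting does not disturb the caller's copy.
void writeNode(std::ostream& out, Node node, int indentLevel, bool alphabetical)
{
    assert(indentLevel >= 0);
    const std::string indent(static_cast<std::size_t>(indentLevel), '\t');
    if (!node.isTable) {
        out << indent << node.name << " = " << node.value << ",\n";
        return;
    }
    if (alphabetical)
        std::sort(node.children.begin(), node.children.end(), nameLess);
    out << indent << node.name << " =\n" << indent << "{\n";
    for (std::size_t i = 0; i < node.children.size(); ++i)
        writeNode(out, node.children[i], indentLevel + 1, alphabetical);
    out << indent << "}" << (indentLevel > 0 ? "," : "") << "\n";
}

// Convenience wrapper that returns the formatted text as a string.
std::string toLuaText(const Node& root, bool alphabetical)
{
    std::ostringstream out;
    writeNode(out, root, 0, alphabetical);
    return out.str();
}
```

Because each Node owns its children by value, this sketch cannot represent a cyclic structure at all, so the recursion always terminates.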
Standard Unified Method Callbacks
The basic types have the following unified method callbacks applied to them.
Table =
{
    function foreach(self, func)
    function foreachi(self, func)
    function next(self, [index])
    function rawget(self, index)
    function rawset(self, index, value)
    function getn(self)
    function sort(self [, comp])
    function insert(self [, pos], value)    -- tinsert
    function remove(self [, pos])           -- tremove
    function unpack(self)
}
File =
{
    function close(self)                        -- closefile
    function flush(self)                        -- flush
    function open(filename, mode)               -- openfile
    function read([self,] format1, ...)         -- read
    function seek(self [, whence] [, offset])   -- seek
    function write([self,] value1, ...)         -- write
    function execute(command)                   -- execute
    function remove(filename)                   -- remove
    function rename(name1, name2)               -- rename
    function tmpname()                          -- tmpname
    stdin
    stdout
    stderr
}
String =
{
    function len(self)                              -- strlen
    function sub(self, i [, j])                     -- strsub
    function lower(self)                            -- strlower
    function upper(self)                            -- strupper
    function char(i1, i2, ...)                      -- strchar
    function rep(self, n)                           -- strrep
    function byte(self [, i])                       -- strbyte
    function format(formatstring, e1, e2, ...)      -- format
    function find(self, pattern [, init [, plain]]) -- strfind
    function gsub(self, pat, repl [, n])            -- gsub
}
UString =
{
    function len(self)                              -- ustrlen
    function sub(self, i [, j])                     -- ustrsub
    function lower(self)                            -- ustrlower
    function upper(self)                            -- ustrupper
    function char(i1, i2, ...)                      -- ustrchar
    function rep(self, n)                           -- ustrrep
    function byte(self [, i])                       -- ustrbyte
    function format(formatstring, e1, e2, ...)      -- uformat
    function find(self, pattern [, init [, plain]]) -- ustrfind
    function gsub(self, pat, repl [, n])            -- ugsub
}
Number =
{
    function abs()
    function sin()
    function cos()
    function tan()
    function asin()
    function acos()
    function atan()
    function atan2()
    function ceil()
    function floor()
    function mod()
    function frexp()
    function ldexp()
    function sqrt()
    function min()
    function max()
    function log()
    function log10()
    function exp()
    function deg()
    function rad()
    function random()
    function randomseed()
}
Bonus Functions
FileFind =
{
    -- Returns a handle representing the first file matching fileName.
    function First(fileName)

    -- Retrieves the next file matching fileName.
    function Next(self)

    -- Closes the file search.
    function Close(self)

    -- Gets the file name of the currently matched file.
    function GetFileName(self)

    -- Determines if the currently matched file is a directory.
    function IsDirectory(self)
}
-- Added to File:
File =
{
    -- Returns the size of fileName.
    function GetFileSize(fileName)

    -- Returns as two numbers the last write time for fileName.
    function GetWriteTime(fileName)

    -- Sets the write time for fileName.
    function SetWriteTime(fileName, timeLo, timeHi)

    -- Same as the C function _access.
    function access(fileName, type)
}
-- Returns a new table with a hash table of size.
function NewTableSize(size)

-- Copies a non-cyclic table recursively.
function CopyTable(tableToCopy)

-- Looks up a table entry by string name: Table1.Table2.3.Value2
function FullLookup(table, lookupStr)

-- Processes all the files matching wildcard in the directory [path] and calls func(path, name) on each one.
function DirProcessFiles(path, wildcard, func)

-- Recursively processes all the files in the directory [path], optionally matching [ext] and calls func(path, name) on each one.
function DirProcessFilesRecursive(path, func, ext)
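The lookup string accepted by FullLookup is a dot-separated path. The C++ sketch below shows one way such a path could be broken into components. Both function names are hypothetical, and treating all-digit components such as "3" as numeric indices is an assumption about FullLookup's behavior, not something the listing above confirms.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits "Table1.Table2.3.Value2" into its dot-separated components.
std::vector<std::string> splitLookupPath(const std::string& lookupStr)
{
    assert(!lookupStr.empty());
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type dot = lookupStr.find('.', start);
        if (dot == std::string::npos) {
            parts.push_back(lookupStr.substr(start));
            break;
        }
        parts.push_back(lookupStr.substr(start, dot - start));
        start = dot + 1;
    }
    return parts;
}

// True when a component consists only of digits and could therefore be
// used as a numeric index (an assumption, see the text above).
bool isNumericKey(const std::string& part)
{
    if (part.empty())
        return false;
    for (std::string::size_type i = 0; i < part.size(); ++i)
        if (part[i] < '0' || part[i] > '9')
            return false;
    return true;
}
```

A lookup would then walk the tables one component at a time, indexing by number or by string depending on isNumericKey.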