mirror of https://github.com/gnosygnu/xowa.git synced 2024-10-27 20:34:16 +00:00
gnosygnu_xowa/home/wiki/App/Xtn/Mediawiki/Scribunto/Luaj.html
2016-06-26 02:10:12 -04:00

<!DOCTYPE html>
<html dir="ltr">
<head>
<meta http-equiv="content-type" content="text/html;charset=UTF-8" />
<title>App/Xtn/Mediawiki/Scribunto/Luaj - XOWA</title>
<link rel="shortcut icon" href="https://gnosygnu.github.io/xowa/xowa_logo.png" />
<link rel="stylesheet" href="https://gnosygnu.github.io/xowa/xowa_common.css" type="text/css">
</head>
<body class="mediawiki ltr sitedir-ltr ns-0 ns-subject skin-vector action-submit vector-animateLayout" spellcheck="false">
<div id="mw-page-base" class="noprint"></div>
<div id="mw-head-base" class="noprint"></div>
<div id="content" class="mw-body">
<h1 id="firstHeading" class="firstHeading"><span>App/Xtn/Mediawiki/Scribunto/Luaj</span></h1>
<div id="bodyContent" class="mw-body-content">
<div id="siteSub">From XOWA: the free, open-source, offline wiki application</div>
<div id="contentSub"></div>
<div id="mw-content-text" lang="en" dir="ltr" class="mw-content-ltr">
<div id="toc" class="toc">
<div id="toctitle">
<h2>
Contents
</h2>
</div>
<ul>
<li class="toclevel-1 tocsection-1">
<a href="#Source"><span class="tocnumber">1</span> <span class="toctext">Source</span></a>
</li>
<li class="toclevel-1 tocsection-2">
<a href="#Modification"><span class="tocnumber">2</span> <span class="toctext">Modification</span></a>
</li>
<li class="toclevel-1 tocsection-3">
<a href="#luaj_xowa_changes"><span class="tocnumber">3</span> <span class="toctext">luaj_xowa changes</span></a>
<ul>
<li class="toclevel-2 tocsection-4">
<a href="#Luaj_2.0.3_errors_fixed_in_3.0"><span class="tocnumber">3.1</span> <span class="toctext">Luaj 2.0.3 errors fixed in 3.0</span></a>
<ul>
<li class="toclevel-3 tocsection-5">
<a href="#os.time_doesn.27t_handle_dates_before_1970"><span class="tocnumber">3.1.1</span> <span class="toctext">os.time doesn't handle dates before 1970</span></a>
</li>
<li class="toclevel-3 tocsection-6">
<a href="#pairs.next_fails_when_setting_val_to_null"><span class="tocnumber">3.1.2</span> <span class="toctext">pairs.next fails when setting val to null</span></a>
</li>
</ul>
</li>
<li class="toclevel-2 tocsection-7">
<a href="#Luaj_2.0.3_features_removed_from_3.0"><span class="tocnumber">3.2</span> <span class="toctext">Luaj 2.0.3 features removed from 3.0</span></a>
<ul>
<li class="toclevel-3 tocsection-8">
<a href="#string.gfind_deprecated"><span class="tocnumber">3.2.1</span> <span class="toctext">string.gfind deprecated</span></a>
</li>
<li class="toclevel-3 tocsection-9">
<a href="#math.log10_deprecated"><span class="tocnumber">3.2.2</span> <span class="toctext">math.log10 deprecated</span></a>
</li>
<li class="toclevel-3 tocsection-10">
<a href="#math.mod_deprecated"><span class="tocnumber">3.2.3</span> <span class="toctext">math.mod deprecated</span></a>
</li>
<li class="toclevel-3 tocsection-11">
<a href="#table.maxn_deprecated"><span class="tocnumber">3.2.4</span> <span class="toctext">table.maxn deprecated</span></a>
</li>
<li class="toclevel-3 tocsection-12">
<a href="#table.getn_deprecated"><span class="tocnumber">3.2.5</span> <span class="toctext">table.getn deprecated</span></a>
</li>
<li class="toclevel-3 tocsection-13">
<a href="#automatic_arg_variable_in_varargs_function_deprecated"><span class="tocnumber">3.2.6</span> <span class="toctext">automatic arg variable in varargs function deprecated</span></a>
</li>
</ul>
</li>
<li class="toclevel-2 tocsection-14">
<a href="#Luaj_3.0_defects"><span class="tocnumber">3.3</span> <span class="toctext">Luaj 3.0 defects</span></a>
<ul>
<li class="toclevel-3 tocsection-15">
<a href="#os.date_does_not_accept_UTC_format"><span class="tocnumber">3.3.1</span> <span class="toctext">os.date does not accept UTC format</span></a>
</li>
<li class="toclevel-3 tocsection-16">
<a href="#string.gsub_fails_with_out_of_bounds_error"><span class="tocnumber">3.3.2</span> <span class="toctext">string.gsub fails with out_of_bounds error</span></a>
</li>
<li class="toclevel-3 tocsection-17">
<a href="#string.gsub_fails_if_string_is_empty"><span class="tocnumber">3.3.3</span> <span class="toctext">string.gsub fails if string is empty</span></a>
</li>
<li class="toclevel-3 tocsection-18">
<a href="#string.format_ignores_precision_for_double_args"><span class="tocnumber">3.3.4</span> <span class="toctext">string.format ignores precision for double args</span></a>
</li>
<li class="toclevel-3 tocsection-19">
<a href="#string.gmatch_issues"><span class="tocnumber">3.3.5</span> <span class="toctext">string.gmatch issues</span></a>
</li>
<li class="toclevel-3 tocsection-20">
<a href="#string.tonumber_should_trim_all_whitespace"><span class="tocnumber">3.3.6</span> <span class="toctext">string.tonumber should trim all whitespace</span></a>
</li>
<li class="toclevel-3 tocsection-21">
<a href="#multi-byte_strings_not_fully_supported"><span class="tocnumber">3.3.7</span> <span class="toctext">multi-byte strings not fully supported</span></a>
</li>
</ul>
</li>
<li class="toclevel-2 tocsection-22">
<a href="#build.xml"><span class="tocnumber">3.4</span> <span class="toctext">build.xml</span></a>
</li>
<li class="toclevel-2 tocsection-23">
<a href="#Luaj_tests"><span class="tocnumber">3.5</span> <span class="toctext">Luaj tests</span></a>
</li>
</ul>
</li>
<li class="toclevel-1 tocsection-24">
<a href="#Scribunto_related"><span class="tocnumber">4</span> <span class="toctext">Scribunto related</span></a>
<ul>
<li class="toclevel-2 tocsection-25">
<a href="#getfenv.2Fsetfenv_deprecated"><span class="tocnumber">4.1</span> <span class="toctext">getfenv/setfenv deprecated</span></a>
</li>
<li class="toclevel-2 tocsection-26">
<a href="#loadString_deprecated"><span class="tocnumber">4.2</span> <span class="toctext">loadstring deprecated</span></a>
</li>
<li class="toclevel-2 tocsection-27">
<a href="#table.unpack_deprecated"><span class="tocnumber">4.3</span> <span class="toctext">unpack deprecated</span></a>
</li>
</ul>
</li>
<li class="toclevel-1 tocsection-28">
<a href="#Miscellaneous_changes"><span class="tocnumber">5</span> <span class="toctext">Miscellaneous changes</span></a>
</li>
</ul>
</div>
<h2>
<span class="mw-headline" id="Source">Source</span>
</h2>
<p>
The luaj_xowa.jar was built using the source at <a href="http://sourceforge.net/projects/luaj/files/luaj-3.0/3.0-beta2/luaj-3.0-beta2.zip/download" rel="nofollow" class="external free">http://sourceforge.net/projects/luaj/files/luaj-3.0/3.0-beta2/luaj-3.0-beta2.zip/download</a>.
</p>
<p>
Its source is not currently included with XOWA. It is available at the following location: <a href="https://sourceforge.net/projects/xowa/files/support/luaj/" rel="nofollow" class="external free">https://sourceforge.net/projects/xowa/files/support/luaj/</a>
</p>
<p>
<br>
</p>
<h2>
<span class="mw-headline" id="Modification">Modification</span>
</h2>
<p>
The luaj_xowa.jar was created for the following reasons:
</p>
<ul>
<li>
Backward compatibility:
</li>
</ul>
<dl>
<dd>
Scribunto currently targets Lua 5.1, whereas luaj 3.0 implements Lua 5.2.
</dd>
<dd>
Lua 5.2 is not backward-compatible with Lua 5.1; several functions were deprecated or removed (for example, table.maxn).
</dd>
<dd>
The luaj_xowa.jar was tailored to support Scribunto's 5.1 environment.
</dd>
</dl>
<ul>
<li>
Patches / bug fixes:
</li>
</ul>
<dl>
<dd>
Luaj 3.0 has a handful of minor defects, which are listed below.
</dd>
</dl>
<p>
<br>
</p>
<h2>
<span class="mw-headline" id="luaj_xowa_changes">luaj_xowa changes</span>
</h2>
<h3>
<span class="mw-headline" id="Luaj_2.0.3_errors_fixed_in_3.0">Luaj 2.0.3 errors fixed in 3.0</span>
</h3>
<h4>
<span class="mw-headline" id="os.time_doesn.27t_handle_dates_before_1970">os.time doesn't handle dates before 1970</span>
</h4>
<ul>
<li>
fix : incorrect Birth / Date; EX: ru.w:Пушкин,_Александр_Сергеевич
</li>
<li>
file: /src/core/org/luaj/vm2/lib/OsLib.java
</li>
</ul>
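<p>
As a rough illustration of what handling dates before 1970 means (this is not the OsLib.java patch, whose code is not reproduced on this page): os.time is expected to return a negative number of seconds for such dates. The sketch below uses java.time, which needs Java 8 or later; the class and method names are hypothetical.
</p>

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical helper, for illustration only; not part of luaj or luaj_xowa.
final class EpochSketch {
    // Seconds since 1970-01-01T00:00:00Z. The result is negative for earlier dates.
    static long osTime(int year, int month, int day, int hour, int min, int sec) {
        return LocalDateTime.of(year, month, day, hour, min, sec)
                .toEpochSecond(ZoneOffset.UTC);
    }
}
```

<p>
For example, EpochSketch.osTime(1969, 12, 31, 23, 59, 59) is one second before the epoch, and any date in 1799 yields a large negative value.
</p>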
<h4>
<span class="mw-headline" id="pairs.next_fails_when_setting_val_to_null">pairs.next fails when setting val to null</span>
</h4>
<ul>
<li>
fix : Finnish declension table; EX:d:Latvia
</li>
<li>
file: /src/core/org/luaj/vm2/LuaTable.java
</li>
</ul>
<h3>
<span class="mw-headline" id="Luaj_2.0.3_features_removed_from_3.0">Luaj 2.0.3 features removed from 3.0</span>
</h3>
<h4>
<span class="mw-headline" id="string.gfind_deprecated">string.gfind deprecated</span>
</h4>
<ul>
<li>
fix : missing Video_game_reviews; EX: w:Sonic_Heroes
</li>
<li>
file: /src/core/org/luaj/vm2/lib/StringLib.java
</li>
<li>
code: call
</li>
</ul>
<pre>
old:
"sub"} );
new:
"sub", "gfind"} );
</pre>
<ul>
<li>
code: invoke
</li>
</ul>
<pre>
add:
case 4:
case 9: return StringLib.gmatch( args );
</pre>
<h4>
<span class="mw-headline" id="math.log10_deprecated">math.log10 deprecated</span>
</h4>
<ul>
<li>
fix : blank references; EX:w:Earth
</li>
<li>
file: /src/jse/org/luaj/vm2/lib/JseMathLib.java
</li>
<li>
code:
</li>
</ul>
<pre>
math.set("log10", new log10());
static final class log10 extends UnaryOp { protected double call(double d) { return Math.log10(d); } }
</pre>
<h4>
<span class="mw-headline" id="math.mod_deprecated">math.mod deprecated</span>
</h4>
<ul>
<li>
fix : missing table; EX:d:աղբիւր
</li>
<li>
file: /src/core/org/luaj/vm2/lib/MathLib.java
</li>
<li>
code:
</li>
</ul>
<pre>
fmod fmod_func = new fmod();
math.set("mod", fmod_func);
math.set("fmod", fmod_func);
</pre>
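<p>
The aliasing idea above (one function object registered under both the old and the new name) can be sketched outside luaj as follows. The map-based registry and the class name are assumptions made for this example; luaj itself registers functions on a LuaTable.
</p>

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleBinaryOperator;

// Hypothetical stand-in for a library table; not luaj code.
final class MathAliasSketch {
    static Map<String, DoubleBinaryOperator> mathLib() {
        Map<String, DoubleBinaryOperator> math = new HashMap<>();
        // Java's % on doubles truncates toward zero, like C's fmod.
        DoubleBinaryOperator fmod = (a, b) -> a % b;
        math.put("fmod", fmod); // Lua 5.2 name
        math.put("mod", fmod);  // Lua 5.1 alias pointing at the same instance
        return math;
    }
}
```

<p>
Because both keys hold the same instance, scripts written against either name behave identically.
</p>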
<h4>
<span class="mw-headline" id="table.maxn_deprecated">table.maxn deprecated</span>
</h4>
<p>
file: /src/core/org/luaj/vm2/lib/TableLib.java
</p>
<ul>
<li>
code:
</li>
</ul>
<pre>
public LuaValue getn() {
int len = length();
for (int n = len; n &gt; 0; --n )
if ( !rawget(n).isnil() )
return LuaInteger.valueOf(n);
return ZERO;
}
</pre>
<p>
file: /src/core/org/luaj/vm2/lib/TableLib.java
</p>
<pre>
table.set("maxn", new maxn());
static class maxn extends OneArgFunction {
public LuaValue call(LuaValue arg) {
return LuaValue.valueOf(arg.checktable().maxn());
}
}
</pre>
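<p>
For reference, Lua 5.1 defines table.maxn as the largest positive numeric key of a table, or zero if there is none. A minimal sketch of that rule over a plain java.util.Map follows; the map stands in for a Lua table and is an assumption of this example, not how LuaTable stores its entries.
</p>

```java
import java.util.Map;

// Hypothetical illustration of the table.maxn rule; not luaj code.
final class MaxnSketch {
    // Returns the largest positive numeric key, or 0 when no such key exists.
    static double maxn(Map<Object, Object> table) {
        double max = 0;
        for (Object key : table.keySet()) {
            if (key instanceof Number) {          // non-numeric keys are ignored
                double n = ((Number) key).doubleValue();
                if (n > max) max = n;
            }
        }
        return max;
    }
}
```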
<h4>
<span class="mw-headline" id="table.getn_deprecated">table.getn deprecated</span>
</h4>
<ul>
<li>
fix : missing text; EX: d:aceite d:Module:pt-verb-form-of
</li>
<li>
file: /src/core/org/luaj/vm2/LuaTable.java
</li>
</ul>
<p>
code:
</p>
<pre>
public LuaValue getn() {
int len = length();
for (int n = len; n &gt; 0; --n )
if ( !rawget(n).isnil() )
return LuaInteger.valueOf(n);
return ZERO;
}
</pre>
<ul>
<li>
file: /src/core/org/luaj/vm2/lib/TableLib.java
</li>
<li>
code:
</li>
</ul>
<pre>
table.set("getn", new getn());
static class getn extends OneArgFunction {
public LuaValue call(LuaValue arg) {
return arg.checktable().getn();
}
}
</pre>
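<p>
The backward scan in getn() above can be tried in isolation with an Object array standing in for the table's array part (an assumption of this sketch; null plays the role of nil and slot 0 is Lua index 1).
</p>

```java
// Hypothetical illustration of the getn backward scan; not luaj code.
final class GetnSketch {
    // Returns the 1-based index of the last non-null slot, or 0 if all slots are null.
    static int getn(Object[] slots) {
        for (int n = slots.length; n > 0; --n) {
            if (slots[n - 1] != null) return n;
        }
        return 0;
    }
}
```

<p>
Trailing nil slots are skipped, but a nil in the middle does not end the scan: an array holding "a", nil, "c" still reports 3.
</p>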
<h4>
<span class="mw-headline" id="automatic_arg_variable_in_varargs_function_deprecated">automatic arg variable in varargs function deprecated</span>
</h4>
<ul>
<li>
fix : Horizontal timeline; EX: w:Cretaceous%E2%80%93Paleogene_extinction_event
</li>
<li>
file: /src/core/org/luaj/vm2/LuaClosure.java
</li>
</ul>
<p>
code:
</p>
<pre>
case Lua.OP_GETTABUP: /* A B C R(A) := UpValue[B][RK(C)] */
// stack[a] = upValues[i&gt;&gt;&gt;23].getValue().get((c=(i&gt;&gt;14)&amp;0x1ff)&gt;0xff? k[c&amp;0x0ff]: stack[c]);
// HACK: handle deprecated "arg" for "..."
int OP_GETTABUP_c = (i&gt;&gt;14)&amp;0x1ff;
boolean OP_GETTABUP_b = OP_GETTABUP_c&gt;0xff;
LuaValue OP_GETTABUP_idx = OP_GETTABUP_b ? k[OP_GETTABUP_c&amp;0x0ff]: stack[OP_GETTABUP_c];
stack[a] = upValues[i&gt;&gt;&gt;23].getValue().get(OP_GETTABUP_idx);
// HACK: handle deprecated "arg"
if ( p.is_vararg == 1
&amp;&amp; stack[a] == NIL
&amp;&amp; OP_GETTABUP_b
&amp;&amp; "arg".equals(OP_GETTABUP_idx.tojstring())
)
stack[a] = new LuaTable(varargs);
continue;
</pre>
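<p>
In Lua 5.1's compatibility mode, a vararg function sees its extra arguments in a table named arg, with the argument count stored in the field n. The sketch below shows that shape using a Java varargs method and a map; the class, the method, and the use of a map are assumptions for illustration, whereas the patch above builds a LuaTable from the Varargs.
</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the Lua 5.1 "arg" table; not luaj code.
final class ArgTableSketch {
    static Map<Object, Object> argTable(Object... extra) {
        Map<Object, Object> arg = new LinkedHashMap<>();
        for (int i = 0; i < extra.length; i++) {
            arg.put(i + 1, extra[i]); // Lua indexes start at 1
        }
        arg.put("n", extra.length);   // number of extra arguments
        return arg;
    }
}
```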
<h3>
<span class="mw-headline" id="Luaj_3.0_defects">Luaj 3.0 defects</span>
</h3>
<h4>
<span class="mw-headline" id="os.date_does_not_accept_UTC_format">os.date does not accept UTC format</span>
</h4>
<ul>
<li>
fix : incorrect age in ym; EX:w:Supreme_Court_of_the_United_States
</li>
<li>
file: /src/core/org/luaj/vm2/lib/OsLib.java
</li>
<li>
proc: invoke.DATE
</li>
</ul>
<pre>
boolean utc = false;
if (s.startsWith("!")) {
utc = true;
s = s.substring(1);
}
if (s.equals("*t")) {
Calendar d = Calendar.getInstance();
long time_in_ms = (long)(t*1000);
if (utc) {
java.util.TimeZone current_tz = d.getTimeZone();
int offset_from_utc = current_tz.getOffset(time_in_ms);
time_in_ms += -offset_from_utc;
}
</pre>
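<p>
The "!" prefix asks os.date for UTC fields instead of local-time fields. The patch above achieves this by shifting the timestamp by the local zone's offset. A different way to get the same fields, shown here only as a sketch, is to read them from a calendar that is already set to UTC. The class and method names are hypothetical.
</p>

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical illustration of the "!" (UTC) prefix; not the OsLib.java code.
final class UtcDateSketch {
    // Returns {year, month (1-12), day, hour} for t seconds since the epoch.
    static int[] dateFields(String format, long t) {
        boolean utc = format.startsWith("!");
        Calendar cal = utc
                ? new GregorianCalendar(TimeZone.getTimeZone("UTC"))
                : new GregorianCalendar(); // default (local) time zone
        cal.setTimeInMillis(t * 1000L);
        return new int[] {
            cal.get(Calendar.YEAR),
            cal.get(Calendar.MONTH) + 1,   // Calendar months are 0-based
            cal.get(Calendar.DAY_OF_MONTH),
            cal.get(Calendar.HOUR_OF_DAY)
        };
    }
}
```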
<h4>
<span class="mw-headline" id="string.gsub_fails_with_out_of_bounds_error">string.gsub fails with out_of_bounds error</span>
</h4>
<ul>
<li>
fix : blank references; EX:w:Earth
</li>
<li>
file: /src/core/org/luaj/vm2/lib/StringLib.java
</li>
<li>
proc: gsub
</li>
</ul>
<pre>
old:
if ( anchor )
break;
new:
if ( anchor )
break;
if (soffset &gt;= srclen) break; // assert soffset is in bounds, else will throw ArrayIndexOutOfBounds exception;
</pre>
<h4>
<span class="mw-headline" id="string.gsub_fails_if_string_is_empty">string.gsub fails if string is empty</span>
</h4>
<ul>
<li>
fix : blank references; EX:w:Woburn,_Massachusetts
</li>
<li>
file: /src/core/org/luaj/vm2/lib/StringLib.java
</li>
<li>
proc: gsub
</li>
</ul>
<pre>
static Varargs gsub( Varargs args ) {
LuaString src = args.checkstring( 1 );
final int srclen = src.length();
if (srclen == 0) return varargsOf(src, LuaValue.ZERO); // exit early
</pre>
<h4>
<span class="mw-headline" id="string.format_ignores_precision_for_double_args">string.format ignores precision for double args</span>
</h4>
<ul>
<li>
fix : Convert calls will show full precision for numbers; EX:w:Tomato
</li>
<li>
file: /src/core/org/luaj/vm2/lib/StringLib.java
</li>
</ul>
<pre>
FormatDesc fdsc = new FormatDesc(args, fmt, i );
int fdsc_bgn = i;
</pre>
<pre style='overflow:auto'>
old:
case 'G':
fdsc.format( result, args.checkdouble( arg ) );
new:
case 'G':
String fmt_str = new String(fmt.m_bytes, fdsc_bgn - 1, fdsc.length + 1); // -1 to include %; +1 to account for included %; basically get everything between % and f; EX: a%.1fb -&gt; %.1f
fdsc.format( result, fmt_str, args.checkdouble( arg ));
</pre>
<ul>
<li>
proc: format
</li>
</ul>
<pre style='overflow:auto'>
old:
buf.append( String.valueOf( x ) );
new:
// buf.append( String.valueOf( x ) );
if (fmt.startsWith("%0."))
fmt = "%" + fmt.substring(2); // remove leading 0, else MissingFormatWidthException
int fmt_len = fmt.length();
if (fmt_len &gt; 1 &amp;&amp; fmt.charAt(fmt_len - 2) == '.') // penultimate char is "."
fmt = fmt.substring(0, fmt_len - 1) + "0" + fmt.charAt(fmt_len - 1); // add trailing 0, else UnknownFormatConversionException; EX: "02.f" -&gt; "02.0f"
buf.append( String.format(fmt, v) ); // call String.format
</pre>
<ul>
<li>
note: also fixes format failures
<ul>
<li>
%0.1f -&gt; remove leading 0
</li>
<li>
%02.f -&gt; add trailing 0 after .
</li>
</ul>
</li>
<li>
note: requires a Java 1.5 JRE or later (instead of 1.3), because String.format was introduced in Java 1.5
</li>
</ul>
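<p>
The two format-string adjustments listed above can be exercised on their own. The following sketch repeats the same string manipulation outside StringLib; the class and method names are hypothetical, and Locale.ROOT is an addition of this example so that the decimal separator does not depend on the machine's locale.
</p>

```java
import java.util.Locale;

// Hypothetical illustration of the format-spec adjustments; not the StringLib.java code.
final class FormatSpecSketch {
    static String normalize(String spec) {
        if (spec.startsWith("%0.")) {
            spec = "%" + spec.substring(2); // "%0.1f" -> "%.1f": a 0 flag needs a width in Java
        }
        int len = spec.length();
        if (len > 1 && spec.charAt(len - 2) == '.') {
            // "%02.f" -> "%02.0f": Java requires digits after the "."
            spec = spec.substring(0, len - 1) + "0" + spec.charAt(len - 1);
        }
        return spec;
    }

    static String format(String spec, double value) {
        return String.format(Locale.ROOT, normalize(spec), value);
    }
}
```

<p>
With these adjustments, FormatSpecSketch.format("%0.1f", 1.23) and FormatSpecSketch.format("%02.f", 1.23) both succeed instead of throwing.
</p>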
<h4>
<span class="mw-headline" id="string.gmatch_issues">string.gmatch issues</span>
</h4>
<ul>
<li>
fix : Multiple pages in enwiki's Wikipedia namespace; EX: Wikipedia:CS1/test_basics
</li>
<li>
file: /src/core/org/luaj/vm2/lib/StringLib.java
</li>
<li>
proc: GmatchAux.invoke
</li>
</ul>
<pre>
old:
for ( ; soffset&lt;srclen; soffset++ ) {
new:
for ( ; soffset&lt;=srclen; soffset++ ) {
</pre>
<pre>
old:
soffset = res;
new:
int soffset_adj = res == soffset ? 1 : 0;
soffset = res + soffset_adj;
</pre>
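<p>
Both changes above concern empty matches: the loop must also try the position just past the last character, and after an empty match the offset must move forward by one or the iterator would return the same empty match forever. The sketch below reproduces that loop shape with java.util.regex standing in for Lua patterns (an assumption of this example; the syntax differs from Lua's). The class and method names are hypothetical.
</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the gmatch offset handling; not the StringLib.java code.
final class GmatchSketch {
    static List<String> allMatches(String regex, String src) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(src);
        int offset = 0;
        while (offset <= src.length()) {          // <= so an empty match can succeed at the very end
            m.region(offset, src.length());
            if (!m.lookingAt()) {                   // nothing matches at this position; try the next one
                offset++;
                continue;
            }
            out.add(m.group());
            int end = m.end();
            offset = (end == offset) ? end + 1 : end; // empty match: step forward one position
        }
        return out;
    }
}
```

<p>
For the pattern a* and the input baaac, this yields an empty string, then aaa, then two more empty strings, which is the sequence produced by the loop structure described above.
</p>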
<h4>
<span class="mw-headline" id="string.tonumber_should_trim_all_whitespace">string.tonumber should trim all whitespace</span>
</h4>
<ul>
<li>
fix : Population tables; EX: w:Woburn,_Massachusetts
</li>
<li>
file: /src/core/org/luaj/vm2/LuaString.java
</li>
<li>
proc: scannumber
</li>
</ul>
<pre>
// trim ws
int idx = i;
while (idx &lt; j) {
switch (m_bytes[idx]) {
case 9: case 10: case 13: case 32:
++idx;
i = idx;
break;
default:
idx = j;
break;
}
}
idx = j - 1;
while (idx &gt;= i) {
switch (m_bytes[idx]) {
case 9: case 10: case 13: case 32:
j = idx;
--idx;
break;
default:
idx = i -1;
break;
}
}
</pre>
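<p>
The trimming above works on the raw bytes and treats only tab (9), line feed (10), carriage return (13) and space (32) as whitespace. A compact stand-alone version of the same idea, with hypothetical names and returning the trimmed text rather than adjusting the i and j indexes, might look like this:
</p>

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the whitespace trim; not the LuaString.java code.
final class TrimSketch {
    static boolean isTrimmedByte(byte b) {
        return b == 9 || b == 10 || b == 13 || b == 32; // tab, LF, CR, space
    }

    static String trim(byte[] bytes) {
        int bgn = 0;
        int end = bytes.length;
        while (bgn < end && isTrimmedByte(bytes[bgn])) bgn++;     // skip leading whitespace
        while (end > bgn && isTrimmedByte(bytes[end - 1])) end--; // skip trailing whitespace
        return new String(bytes, bgn, end - bgn, StandardCharsets.UTF_8);
    }
}
```

<p>
Whitespace inside the text is left alone, so "1 2" is still passed to the number scanner as is and still fails to parse.
</p>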
<h4>
<span class="mw-headline" id="multi-byte_strings_not_fully_supported">multi-byte strings not fully supported</span>
</h4>
<ul>
<li>
fix : Thai calendar; EX:th.w:เหตุการณ์ปัจจุบัน
</li>
<li>
file: /src/core/org/luaj/vm2/LuaString.java
</li>
</ul>
<pre style='overflow:auto'>
public static LuaString valueOf(char[] chars, int off, int len) {
// COMMENTED: does not handle 2+ byte chars; assumes 1 char = 1 byte
// byte[] b = new byte[len];
// for ( int i=0; i&lt;len; i++ )
// b[i] = (byte) chars[i + off];
// return valueOf(b, 0, len);
int bry_len = 0;
for (int i = 0; i &lt; len; i++) { // iterate over chars to sum all single / multi-byte chars
int b_len = LuaString.Utf16_Len_by_char((int)(chars[i + off]));
if (b_len == 4) ++i; // 4 bytes; surrogate pair; skip next char;
bry_len += b_len;
}
byte[] bry = new byte[bry_len];
int bry_idx = 0;
int i = 0;
while (i &lt; len) {
char c = chars[i + off];
int b_len = Utf16_Encode_char(c, chars, i, bry, bry_idx);
bry_idx += b_len;
i += b_len == 4 ? 2 : 1; // 4 bytes; surrogate pair; skip next char;
}
return valueOf(bry, 0, bry_len);
}
public static String decodeAsUtf8(byte[] bytes, int offset, int length) {
// COMMENTED: does not handle 3+ byte chars
// int i,j,n,b;
// for ( i=offset,j=offset+length,n=0; i&lt;j; ++n ) {
// switch ( 0xE0 &amp; bytes[i++] ) {
// case 0xE0: ++i;
// case 0xC0: ++i;
// }
// }
// char[] chars=new char[n];
// for ( i=offset,j=offset+length,n=0; i&lt;j; ) {
// chars[n++] = (char) (
// ((b=bytes[i++])&gt;=0||i&gt;=j)? b:
// (b&lt;-32||i+1&gt;=j)? (((b&amp;0x3f) &lt;&lt; 6) | (bytes[i++]&amp;0x3f)):
// (((b&amp;0xf) &lt;&lt; 12) | ((bytes[i++]&amp;0x3f)&lt;&lt;6) | (bytes[i++]&amp;0x3f)));
// }
// return new String(chars);
return new String(bytes, offset, length, java.nio.charset.Charset.forName("UTF-8"));
}
public static int lengthAsUtf8(char[] chars) {
// COMMENTED: does not handle 3+ byte chars
// int i,b;
// char c;
// for ( i=b=chars.length; --i&gt;=0; )
// if ( (c=chars[i]) &gt;=0x80 )
// b += (c&gt;=0x800)? 2: 1;
// return b;
int len = chars.length;
int rv = 0;
for (int i = 0; i &lt; len; i++) {
int b_len = LuaString.Utf16_Len_by_char(chars[i]);
if (b_len == 4) ++i; // 4 bytes; surrogate pair; skip next char;
rv += b_len;
}
return rv;
}
public static int encodeToUtf8(char[] chars, int nchars, byte[] bytes, int off) {
// COMMENTED: does not handle 4+ byte chars; already using Encode_by_int, so might as well be consistent
// char c;
// int j = off;
// for ( int i=0; i&lt;nchars; i++ ) {
// if ( (c = chars[i]) &lt; 0x80 ) {
// bytes[j++] = (byte) c;
// } else if ( c &lt; 0x800 ) {
// bytes[j++] = (byte) (0xC0 | ((c&gt;&gt;6) &amp; 0x1f));
// bytes[j++] = (byte) (0x80 | ( c &amp; 0x3f));
// } else {
// bytes[j++] = (byte) (0xE0 | ((c&gt;&gt;12) &amp; 0x0f));
// bytes[j++] = (byte) (0x80 | ((c&gt;&gt;6) &amp; 0x3f));
// bytes[j++] = (byte) (0x80 | ( c &amp; 0x3f));
// }
// }
// return j - off;
int bry_idx = off;
int i = 0;
while (i &lt; nchars) {
char c = chars[i];
int bytes_read = Utf16_Encode_char(c, chars, i, bytes, bry_idx);
bry_idx += bytes_read;
i += bytes_read == 4 ? 2 : 1; // 4 bytes; surrogate pair; skip next char;
}
return nchars; // NOTE: code returned # of bytes which is wrong; Globals.UTF8Stream.read caches rv as j which is used as index to char[] not byte[]; will throw out of bounds exception if bytes returned
}
private static int Utf16_Len_by_char(int c) {
if ((c &gt; -1)
&amp;&amp; (c &lt; 128)) return 1; // 1 &lt;&lt; 7
else if (c &lt; 2048) return 2; // 1 &lt;&lt; 11
else if((c &gt; 55295) // 0xD7FF; i.e. c &gt;= 0xD800
&amp;&amp; (c &lt; 56320)) // 0xDC00; c is a high (leading) surrogate
return 4; // a surrogate pair encodes to 4 UTF-8 bytes
else if (c &lt; 65536) return 3; // 1 &lt;&lt; 16
else throw new RuntimeException("UTF-16 int must be between 0 and 2097152; char=" + c);
}
public static int Utf16_Len_by_int(int c) {
if ((c &gt; -1)
&amp;&amp; (c &lt; 128)) return 1; // 1 &lt;&lt; 7
else if (c &lt; 2048) return 2; // 1 &lt;&lt; 11
else if (c &lt; 65536) return 3; // 1 &lt;&lt; 16
else if (c &lt; 2097152) return 4;
else throw new RuntimeException("UTF-16 int must be between 0 and 2097152; char=" + c);
}
public static int Utf8_Len_of_char_by_1st_byte(byte b) {// SEE:w:UTF-8
int i = b &amp; 0xff; // PATCH.JAVA:need to convert to unsigned byte
switch (i) {
case 0: case 1: case 2: case 3: case 4: case 5: case 6: case 7: case 8: case 9: case 10: case 11: case 12: case 13: case 14: case 15:
case 16: case 17: case 18: case 19: case 20: case 21: case 22: case 23: case 24: case 25: case 26: case 27: case 28: case 29: case 30: case 31:
case 32: case 33: case 34: case 35: case 36: case 37: case 38: case 39: case 40: case 41: case 42: case 43: case 44: case 45: case 46: case 47:
case 48: case 49: case 50: case 51: case 52: case 53: case 54: case 55: case 56: case 57: case 58: case 59: case 60: case 61: case 62: case 63:
case 64: case 65: case 66: case 67: case 68: case 69: case 70: case 71: case 72: case 73: case 74: case 75: case 76: case 77: case 78: case 79:
case 80: case 81: case 82: case 83: case 84: case 85: case 86: case 87: case 88: case 89: case 90: case 91: case 92: case 93: case 94: case 95:
case 96: case 97: case 98: case 99: case 100: case 101: case 102: case 103: case 104: case 105: case 106: case 107: case 108: case 109: case 110: case 111:
case 112: case 113: case 114: case 115: case 116: case 117: case 118: case 119: case 120: case 121: case 122: case 123: case 124: case 125: case 126: case 127:
case 128: case 129: case 130: case 131: case 132: case 133: case 134: case 135: case 136: case 137: case 138: case 139: case 140: case 141: case 142: case 143:
case 144: case 145: case 146: case 147: case 148: case 149: case 150: case 151: case 152: case 153: case 154: case 155: case 156: case 157: case 158: case 159:
case 160: case 161: case 162: case 163: case 164: case 165: case 166: case 167: case 168: case 169: case 170: case 171: case 172: case 173: case 174: case 175:
case 176: case 177: case 178: case 179: case 180: case 181: case 182: case 183: case 184: case 185: case 186: case 187: case 188: case 189: case 190: case 191:
return 1;
case 192: case 193: case 194: case 195: case 196: case 197: case 198: case 199: case 200: case 201: case 202: case 203: case 204: case 205: case 206: case 207:
case 208: case 209: case 210: case 211: case 212: case 213: case 214: case 215: case 216: case 217: case 218: case 219: case 220: case 221: case 222: case 223:
return 2;
case 224: case 225: case 226: case 227: case 228: case 229: case 230: case 231: case 232: case 233: case 234: case 235: case 236: case 237: case 238: case 239:
return 3;
case 240: case 241: case 242: case 243: case 244: case 245: case 246: case 247:
return 4;
default: throw new RuntimeException("invalid initial utf8 byte; byte=" + b);
}
}
public static int Utf16_Decode_to_int(byte[] ary, int pos) {
byte b0 = ary[pos];
if ((b0 &amp; 0x80) == 0) {
return b0;
}
else if ((b0 &amp; 0xE0) == 0xC0) {
return ( b0 &amp; 0x1f) &lt;&lt; 6
| ( ary[pos + 1] &amp; 0x3f)
;
}
else if ((b0 &amp; 0xF0) == 0xE0) {
return ( b0 &amp; 0x0f) &lt;&lt; 12
| ((ary[pos + 1] &amp; 0x3f) &lt;&lt; 6)
| ( ary[pos + 2] &amp; 0x3f)
;
}
else if ((b0 &amp; 0xF8) == 0xF0) {
return ( b0 &amp; 0x07) &lt;&lt; 18
| ((ary[pos + 1] &amp; 0x3f) &lt;&lt; 12)
| ((ary[pos + 2] &amp; 0x3f) &lt;&lt; 6)
| ( ary[pos + 3] &amp; 0x3f)
;
}
else throw new RuntimeException("invalid utf8 byte: byte=" + b0);
}
public static int Utf16_Encode_int(int c, byte[] src, int pos) {
if ((c &gt; -1)
&amp;&amp; (c &lt; 128)) {
src[ pos] = (byte)c;
return 1;
}
else if (c &lt; 2048) {
src[ pos] = (byte)(0xC0 | (c &gt;&gt; 6));
src[++pos] = (byte)(0x80 | (c &amp; 0x3F));
return 2;
}
else if (c &lt; 65536) {
src[pos] = (byte)(0xE0 | (c &gt;&gt; 12));
src[++pos] = (byte)(0x80 | (c &gt;&gt; 6) &amp; 0x3F);
src[++pos] = (byte)(0x80 | (c &amp; 0x3F));
return 3;
}
else if (c &lt; 2097152) {
src[pos] = (byte)(0xF0 | (c &gt;&gt; 18));
src[++pos] = (byte)(0x80 | (c &gt;&gt; 12) &amp; 0x3F);
src[++pos] = (byte)(0x80 | (c &gt;&gt; 6) &amp; 0x3F);
src[++pos] = (byte)(0x80 | (c &amp; 0x3F));
return 4;
}
else throw new RuntimeException("UTF-16 int must be between 0 and 2097152; char=" + c);
}
public static int Utf16_Encode_char(int c, char[] c_ary, int c_pos, byte[] b_ary, int b_pos) {
if ((c &gt; -1)
&amp;&amp; (c &lt; 128)) {
b_ary[ b_pos] = (byte)c;
return 1;
}
else if (c &lt; 2048) {
b_ary[ b_pos] = (byte)(0xC0 | (c &gt;&gt; 6));
b_ary[++b_pos] = (byte)(0x80 | (c &amp; 0x3F));
return 2;
}
else if((c &gt; 55295) // 0xD7FF; i.e. c &gt;= 0xD800
&amp;&amp; (c &lt; 56320)) { // 0xDC00; c is a high (leading) surrogate
if (c_pos + 1 &gt;= c_ary.length) // the low surrogate is read from c_pos + 1
throw new RuntimeException("incomplete surrogate pair at end of string; char=" + c);
int nxt_char = c_ary[c_pos + 1];
int v = Utf16_Surrogate_merge(c, nxt_char);
b_ary[b_pos] = (byte)(0xF0 | (v &gt;&gt; 18));
b_ary[++b_pos] = (byte)(0x80 | (v &gt;&gt; 12) &amp; 0x3F);
b_ary[++b_pos] = (byte)(0x80 | (v &gt;&gt; 6) &amp; 0x3F);
b_ary[++b_pos] = (byte)(0x80 | (v &amp; 0x3F));
return 4;
}
else {
b_ary[b_pos] = (byte)(0xE0 | (c &gt;&gt; 12));
b_ary[++b_pos] = (byte)(0x80 | (c &gt;&gt; 6) &amp; 0x3F);
b_ary[++b_pos] = (byte)(0x80 | (c &amp; 0x3F));
return 3;
}
}
private static int Utf16_Surrogate_merge(int hi, int lo) { // REF: http://perldoc.perl.org/Encode/Unicode.html
return 0x10000 + (hi - 0xD800) * 0x400 + (lo - 0xDC00);
}
</pre>
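<p>
To see why the commented-out code was replaced, compare a byte-per-char narrowing cast with a real UTF-8 encoding. The sketch below uses the JDK's own encoder instead of the hand-written routines above, purely for comparison; the class and method names are hypothetical.
</p>

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not the LuaString.java code.
final class Utf8Sketch {
    // Lossy: keeps only the low 8 bits of each UTF-16 code unit (1 char = 1 byte).
    static byte[] narrow(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = (byte) s.charAt(i);
        }
        return out;
    }

    // Correct: 1 to 4 bytes per code point.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Same arithmetic as Utf16_Surrogate_merge above.
    static int mergeSurrogates(int hi, int lo) {
        return 0x10000 + (hi - 0xD800) * 0x400 + (lo - 0xDC00);
    }
}
```

<p>
The Thai character U+0E40 needs three UTF-8 bytes but is narrowed to a single byte by the cast, and the surrogate pair D83D DE00 merges to the code point U+1F600, which needs four bytes.
</p>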
<ul>
<li>
file: /src/core/org/luaj/vm2/compiler/LexState.java
</li>
<li>
proc: read_string
</li>
</ul>
<pre style='overflow:auto'>
if (c &gt; UCHAR_MAX)
lexerror("escape sequence too large", TK_STRING);
save(c, false); // NOTE: specify that c is integer and does not need conversion; EX: \128 -&gt; 128 -&gt; (char)128, not Utf8_encode(128)
</pre>
<ul>
<li>
file: /src/core/org/luaj/vm2/compiler/LexState.java
</li>
</ul>
<pre style='overflow:auto'>
void save(int c) {save(c, true);}
void save(int c, boolean c_might_be_utf8) {
int bytes_len = c_might_be_utf8 ? LuaString.Utf8_Len_of_char_by_1st_byte((byte)c) : 1;
if (bytes_len &gt; 1) { // c is 1st byte of utf8 multi-byte sequence; read required number of bytes and convert to char; EX: left-arrow is serialized in z as 226,134,144; c is currently 226; read 134 and 144 and convert to left-arrow
temp_bry[0] = (byte)c;
for (int i = 1; i &lt; bytes_len; i++) {
nextChar();
temp_bry[i] = (byte)current;
}
c = LuaString.Utf16_Decode_to_int(temp_bry, 0);
}
if ( buff == null || nbuff + 1 &gt; buff.length )
buff = LuaC.realloc( buff, nbuff*2+1 );
buff[nbuff++] = (char)c;
}
private static byte[] temp_bry = new byte[6];
</pre>
<p>
<br>
</p>
<h3>
<span class="mw-headline" id="build.xml">build.xml</span>
</h3>
<ul>
<li>
note: this change is needed to get luaj to compile with the String.format(String, double) call
</li>
<li>
file: build.xml
</li>
</ul>
<pre>
old:
&lt;javac destdir="build/jme/classes" encoding="utf-8" source="1.3" target="1.2" bootclasspathref="wtk-libs"
srcdir="build/jme/src"/&gt;
&lt;javac destdir="build/jse/classes" encoding="utf-8" source="1.3" target="1.3"
classpath="lib/bcel-5.2.jar"
srcdir="build/jse/src"
excludes="**/script/*,**/Lua2Java*,lua*"/&gt;
&lt;javac destdir="build/jse/classes" encoding="utf-8" source="1.5" target="1.5"
classpath="build/jse/classes"
srcdir="build/jse/src"
includes="**/script/*,**/Lua2Java*"/&gt;
&lt;javac destdir="build/jse/classes" encoding="utf-8" source="1.3" target="1.3"
classpath="build/jse/classes"
srcdir="build/jse/src"
includes="lua*"/&gt;
new:
&lt;javac destdir="build/jse/classes" encoding="utf-8" source="1.5" target="1.5"
classpath="lib/bcel-5.2.jar"
srcdir="build/jse/src"
excludes="**/script/*,**/Lua2Java*,lua*"/&gt;
&lt;javac destdir="build/jse/classes" encoding="utf-8" source="1.5" target="1.5"
classpath="build/jse/classes"
srcdir="build/jse/src"
includes="**/script/*,**/Lua2Java*"/&gt;
&lt;javac destdir="build/jse/classes" encoding="utf-8" source="1.5" target="1.5"
classpath="build/jse/classes"
srcdir="build/jse/src"
includes="lua*"/&gt;
</pre>
<h3>
<span class="mw-headline" id="Luaj_tests">Luaj tests</span>
</h3>
<pre style='overflow:auto'>
package org.luaj.vm2;
import org.luaj.vm2.lib.StringLib;
import junit.framework.*;
public class Xowa_tst extends TestCase {
private Xowa_fxt fxt = new Xowa_fxt();
public void test_tonumber_ws() {
fxt.Test_tonumber_int("123" , 123);
fxt.Test_tonumber_int("\t\n\r 123\t\n\r" , 123);
fxt.Test_tonumber_nil("1a");
fxt.Test_tonumber_nil("1 2");
fxt.Test_tonumber_nil("");
fxt.Test_tonumber_nil("\t\n\r \t\n\r");
}
public void test_gsub() {
fxt.Test_gsub("abc", "a", "A", "Abc"); // basic
fxt.Test_gsub("a#b", "#", "", "ab"); // match() fails when shortening string
fxt.Test_gsub("", "%b&lt;&gt;", "A", ""); // balance() fails with out of index when find is blank
}
public void test_format() {
fxt.Test_format("%.1f" , "1.23", "1.2"); // apply precision; 1 decimal place
fxt.Test_format("(%.1f)", "1.23", "(1.2)"); // handle substring; format_string should be "%.1f" not "(%.1f)"
fxt.Test_format("%0.1f" , "1.23", "1.2"); // handle invalid padding of 0
fxt.Test_format("%02.f" , "1.23", "01"); // handle missing precision
}
}
class Xowa_fxt {
public void Test_tonumber_int(String raw, int expd) {
LuaString actl_str = LuaString.valueOf(raw);
LuaInteger actl_int = (LuaInteger)actl_str.tonumber();
Assert.assertEquals(expd, actl_int.v);
}
public void Test_tonumber_nil(String raw) {
LuaString actl_str = LuaString.valueOf(raw);
Assert.assertEquals(LuaValue.NIL, actl_str.tonumber());
}
public void Test_gsub(String text, String regx, String repl, String expd) {
Varargs actl_args = StringLib.gsub_test(LuaValue.varargsOf(new LuaValue[] {LuaValue.valueOf(text), LuaValue.valueOf(regx), LuaValue.valueOf(repl)}));
Assert.assertEquals(expd, actl_args.checkstring(1).tojstring());
}
public void Test_format(String fmt, String val, String expd) {
Varargs actl_args = StringLib.format_test(LuaValue.varargsOf(new LuaValue[] {LuaValue.valueOf(fmt), LuaValue.valueOf(val)}));
Assert.assertEquals(expd, actl_args.checkstring(1).tojstring());
}
}
</pre>
<h2>
<span class="mw-headline" id="Scribunto_related">Scribunto related</span>
</h2>
<p>
None of these changes affect the luaj_xowa.jar. They are noted here for completeness.
</p>
<p>
Note that the $engines variable refers to /xowa/bin/any/lua/mediawiki/extensions/Scribunto/engines/
</p>
<h3>
<span class="mw-headline" id="getfenv.2Fsetfenv_deprecated">getfenv/setfenv deprecated</span>
</h3>
<ul>
<li>
add lua-compat-env
</li>
</ul>
<dl>
<dd>
source: <a href="https://github.com/davidm/lua-compat-env" rel="nofollow" class="external free">https://github.com/davidm/lua-compat-env</a>
</dd>
<dd>
target: $engines/LuaCommon/lualib/lua-compat-env .lua
</dd>
</dl>
<ul>
<li>
add alias to $engines/Luaj/mw_main.lua
</li>
</ul>
<dl>
<dd>
_G.getfenv = require 'compat_env'.getfenv
</dd>
<dd>
_G.setfenv = require 'compat_env'.setfenv
</dd>
</dl>
<ul>
<li>
change xowa.jar to load debugLibrary
</li>
</ul>
<dl>
<dd>
Globals.load(new DebugLib());
</dd>
</dl>
<h3>
<span class="mw-headline" id="loadString_deprecated">loadstring deprecated</span>
</h3>
<ul>
<li>
add alias to $engines/Luaj/mw_main.lua
</li>
</ul>
<dl>
<dd>
_G.loadstring = load
</dd>
</dl>
<h3>
<span class="mw-headline" id="table.unpack_deprecated">unpack deprecated</span>
</h3>
<ul>
<li>
add alias to $engines/Luaj/mw_main.lua
</li>
</ul>
<dl>
<dd>
_G.unpack = table.unpack
</dd>
</dl>
<p>
<br>
</p>
<h2>
<span class="mw-headline" id="Miscellaneous_changes">Miscellaneous changes</span>
</h2>
<ul>
<li>
2015-10-11: Increased LUAI_MAXVARS from 200 to 249; otherwise &lt;ref&gt; fails, because the function citation0 in en.wikipedia.org/wiki/Module:Citation/CS1 uses more than 200 local variables.
<ul>
<li>
See: <a href="http://www.lua.org/source/5.1/luaconf.h.html" rel="nofollow" class="external free">http://www.lua.org/source/5.1/luaconf.h.html</a>. LUAI_MAXVARS is the maximum number of local variables per function (must be smaller than 250).
</li>
</ul>
</li>
</ul>
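<p>
One way to probe the local-variable limit of a given Lua build is to compile a generated chunk that declares n locals. The sketch below is illustrative only: the helper name compiles_with_locals is hypothetical and is not part of XOWA or Luaj. On a build with the limit raised as described above, declarations of more than 200 locals would be expected to compile.
</p>
<pre>
-- Hypothetical helper: returns true if a chunk declaring n local variables compiles.
local function compiles_with_locals(n)
  local names = {}
  for i = 1, n do
    names[i] = 'v' .. i
  end
  -- Builds source such as: local v1, v2, v3 = 0
  local src = 'local ' .. table.concat(names, ', ') .. ' = 0'
  -- loadstring exists on 5.1-style runtimes; load accepts strings from 5.2 onward.
  local loader = loadstring or load
  -- The loader returns nil plus an error message when compilation fails.
  return loader(src) ~= nil
end

print('200 locals compile:', compiles_with_locals(200))
print('249 locals compile:', compiles_with_locals(249))
</pre>
<p>
Comparing the two printed lines on a stock build and on luaj_xowa.jar gives a quick indication of whether the raised limit is in effect.
</p>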
</div>
</div>
</div>
<div id="mw-head" class="noprint">
<div id="left-navigation">
<div id="p-namespaces" class="vectorTabs">
<h3>Namespaces</h3>
<ul>
<li id="ca-nstab-main" class="selected"><span><a id="ca-nstab-main-href" href="index.html">Page</a></span></li>
</ul>
</div>
</div>
</div>
<div id='mw-panel' class='noprint'>
<div id='p-logo'>
<a style="background-image: url(https://gnosygnu.github.io/xowa/xowa_logo.png);" href="http://xowa.org/" title="Visit the main page"></a>
</div>
<div class="portal" id='xowa-portal-home'>
<h3>XOWA</h3>
<div class="body">
<ul>
<li><a href="http://xowa.org/index.html" title='Visit the main page'>Main page</a></li>
<li><a href="http://xowa.org/screenshots.html" title='See screenshots of XOWA'>Screenshots</a></li>
<li><a href="https://www.youtube.com/watch?v=q0qbXYXEH6M" title="See a video of XOWA Desktop in action">Video</a></li>
<li><a href="http://xowa.org/home/wiki/Help/Download_XOWA.html" title='Download the XOWA application'>Download XOWA</a></li>
<li><a href="http://xowa.org/home/wiki/Dashboard/Image_databases.html" title='Download offline wikis and image databases'>Download wikis</a></li>
</ul>
</div>
</div>
<div class="portal" id='xowa-portal-started'>
<h3>Getting started</h3>
<div class="body">
<ul>
<li><a href="http://xowa.org/home/wiki/App/Setup/System_requirements.html" title='Get XOWA&apos;s system requirements'>Requirements</a></li>
<li><a href="http://xowa.org/home/wiki/App/Setup/Installation.html" title='Get instructions for installing XOWA'>Installation</a></li>
<li><a href="http://xowa.org/home/wiki/App/Import/Simple_Wikipedia.html" title='Learn how to set up Simple Wikipedia'>Simple Wikipedia</a></li>
<li><a href="http://xowa.org/home/wiki/App/Import/English_Wikipedia.html" title='Learn how to set up English Wikipedia'>English Wikipedia</a></li>
<li><a href="http://xowa.org/home/wiki/App/Import/Other_wikis.html" title='Learn how to set up other Wikipedias'>Other Wikipedias</a></li>
</ul>
</div>
</div>
<div class="portal" id='xowa-portal-android'>
<h3>Android</h3>
<div class="body">
<ul>
<li><a href="http://xowa.org/home/wiki/Android/Setup.html" title='Setup XOWA on your Android device'>Setup</a></li>
<li><a href="https://www.youtube.com/watch?v=jsMTBxGweUw" title="See a video of XOWA Android in action">Video</a></li>
</ul>
</div>
</div>
<div class="portal" id='xowa-portal-help'>
<h3>Help</h3>
<div class="body">
<ul>
<li><a href="http://xowa.org/home/wiki/Help/About.html" title='Get more information about XOWA'>About</a></li>
<li><a href="http://xowa.org/home/wiki/Help/Contents.html" title='View a list of help topics'>Contents</a></li>
<li><a href="http://xowa.org/home/wiki/Help/Media.html" title='Read what others have written about XOWA'>Media</a></li>
<li><a href="http://xowa.org/home/wiki/Help/Feedback.html" title='Questions? Comments? Leave feedback for XOWA'>Feedback</a></li>
</ul>
</div>
</div>
<div class="portal" id='xowa-portal-blog'>
<h3>Blog</h3>
<div class="body">
<ul>
<li><a href="http://xowa.org/home/wiki/Blog.html" title="Follow XOWA's development process">Current</a></li>
</ul>
</div>
</div>
<div class="portal" id='xowa-portal-links'>
<h3>Links</h3>
<div class="body">
<ul>
<li><a href="http://dumps.wikimedia.org/backup-index.html" title="Get wiki database dumps directly from Wikimedia">Wikimedia dumps</a></li>
<li><a href="https://archive.org/search.php?query=xowa" title="Search archive.org for XOWA files">XOWA @ archive.org</a></li>
<li><a href="http://en.wikipedia.org" title="Visit Wikipedia (and compare to XOWA!)">English Wikipedia</a></li>
</ul>
</div>
</div>
<div class="portal" id='xowa-portal-donate'>
<h3>Donate</h3>
<div class="body">
<ul>
<li><a href="https://archive.org/donate/index.php" title="Support archive.org!">archive.org</a></li><!-- listed first due to recent fire damages: http://blog.archive.org/2013/11/06/scanning-center-fire-please-help-rebuild/ -->
<li><a href="https://donate.wikimedia.org/wiki/Special:FundraiserRedirector" title="Support Wikipedia!">Wikipedia</a></li>
<!-- <li><a href="" title="Support XOWA! (but only after you've supported archive.org and Wikipedia)">XOWA</a></li> -->
</ul>
</div>
</div>
</div>
</body>
</html>