Wikipedia:Guide to Scribbling

From Wikipedia

This is the Guide to Scribbling. Scribbling is the act of writing, or converting to, a template that uses the Scribunto extension to MediaWiki that was developed by Tim Starling and Victor Vasiliev.[notes 1] This Guide aims to give you a broad overview of Scribbling, and pointers to further information in various places.

Scribbled templates come in two parts: the template itself and one or more back-end modules — in the Module: namespace — that contain programs that are run on the wiki servers to generate the wikitext that the template expands to. The template invokes a function within a module using a new parser function named {{#invoke:}}.

The idea of Scribbling is to improve template processing performance. Scribbling eliminates any need for template parser function programming using parser functions such as {{#if:}}, {{#ifeq:}}, {{#switch:}}, and {{#expr:}}. All of this is instead done in the module, in a language that was actually designed to be a programming language, rather than a template system onto which was bolted various extensions over time to try to make it into a programming language.[notes 2] Scribbling also eliminates any need for templates to expand to further templates and potentially hit the expansion depth limit. A fully Scribbled template should never need to transclude other templates.[notes 3]

Lua

The language in which modules are written is Lua. Unlike the template parser function system, Lua was actually designed not only to be a proper programming language, but also to be a programming language that is suitable for what is known as embedded scripting. Modules in MediaWiki are an example of embedded scripts. There are several embedded scripting languages that could have been used, including REXX and Tcl; and indeed the original aim of Scribunto was to make available a choice of such languages. At the moment, however, only Lua is available.

The official reference manual for Lua is Ierusalimschy, de Figueiredo, & Celes 2006. It's a reference, not a tutorial. Consult it if you want to know the syntax or semantics of something. For a tutorial, see either Ierusalimschy 2006 (Ierusalimschy 2003 is also available, although it is of course out of date) or Jung & Brown 2007. The downside to these books is that quite a lot of what they tell you has no bearing upon using Lua in MediaWiki modules. You don't need to know how to install Lua, how to integrate its interpreter into a program, or how to run it standalone. The MediaWiki developers have done all of that. Similarly, a lot of the Lua library functions are, for security, not available in modules. (It's not possible to do any file I/O or make operating system calls in MediaWiki modules, for example.) So a lot of what these books tell you about the standard library functions and variables that come with the language is either irrelevant or untrue in this case.

The original API specification (the Lua standard library functions and variables that are supposed to be available in modules) is given at MW:Extension:Scribunto/API specification. However, even that is untrue. What you'll actually have available is documented in MW:Extension:Scribunto/Lua reference manual, which is a cut-down version of the 1st Edition Lua manual, modified by Tim Starling to bring it more into line with the reality of Scribbling. Again, though, this is a reference manual, not a tutorial.

The things in Lua that you will mostly be concerned with when writing Scribbled templates are tables, strings, numbers, booleans, nil, if then else end, while do end, for in do end (generic for), for do end (numeric for), repeat until, function end, local, return, break, expressions and the various operators (including #, .., and the arithmetic operators +, -, *, /, ^, and %), and the string, math, and mw global tables (i.e. libraries).
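
As a rough illustration of several of these constructs together, here is a small sketch in plain Lua. The function name summarize and its input are invented for this example and do not come from any real module.

```lua
-- A small tour of some of the constructs listed above.
-- The name "summarize" is made up for this illustration.
local function summarize(words)
    local count = #words                 -- # is the length operator
    local longest = nil                  -- nil means "no value yet"
    for _, word in ipairs(words) do      -- generic for
        if longest == nil or #word > #longest then
            longest = word
        end
    end
    local total = 0
    for i = 1, count do                  -- numeric for
        total = total + #words[i]
    end
    local average = 0
    if count > 0 then
        average = math.floor(total / count)   -- lookup in the math table
    end
    -- .. is the string concatenation operator
    return count .. " words, longest " .. tostring(longest)
        .. ", average length " .. average
end

print(summarize({ "tables", "strings", "nil" }))
-- prints: 3 words, longest strings, average length 5
```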

Template structure

This is simple. Your template comprises one expansion of {{#invoke:}} in the usual case. Here is {{ref label}} for example:

<includeonly>{{#invoke:citation|reflabel}}</includeonly><noinclude>{{documentation}}</noinclude>

Even {{Harvard citation}}, one of the longest Scribbled templates, is still one expansion of {{#invoke:}}:

<includeonly>{{#invoke:citation|Harvard
|BracketLeft=(
|BracketRight=)
|PageSep=, p.&nbsp;
|PagesSep=, pp.&nbsp;
}}</includeonly><noinclude>{{documentation}}</noinclude>

If you find yourself wanting to use other templates within your template, or to use template parser functions, or indeed anything at all other than {{#invoke:}} and possibly some variables as its arguments, then you are using the wrong approach.

Module basics

Overall structure

Let's consider a hypothetical module, Module:Population. (See Module:Population clocks for a similar real, but more complex, module.) It can be structured in one of two ways:

A named local table

  1. local p = {}
  2.  
  3. function p.India(frame)
  4.     return "1,21,01,93,422 people at (nominally) 2011-03-01 00:00:00 +0530"
  5. end
  6.  
  7. return p

An unnamed table generated on the fly

  1. return {
  2.     India = function(frame)
  3.         return "1,21,01,93,422 people at (nominally) 2011-03-01 00:00:00 +0530"
  4.     end
  5. }

Execution

The execution of a module by {{#invoke:}} is actually twofold:

  1. The module is loaded and the entire script is run. This loads up any additional modules that the module needs (using the require() function), builds the (invocable) functions that the module will provide to templates, and returns a table of them.
  2. The function named in {{#invoke:}} is picked out of the table built in phase 1 and called, with the arguments supplied to the template and the arguments supplied to {{#invoke:}} (more on which later).

The first Lua script does phase 1 fairly explicitly. It creates a local variable named p on line 1, initialized to a table; builds and adds a function to it (lines 3–5), by giving the function the name India in the table named by p (function p.India being the same as saying p["India"] = function[notes 4]); and then returns (on line 7) the table as the last line of the script. To expand such a script with more (invocable) functions, one adds them between the local statement at the top and the return statement at the bottom. (Non-invocable local functions can be added before the local statement.) The local variable doesn't have to be named p. It could be named any valid Lua variable name that you like.[notes 5] p is simply conventional for this purpose, and is also the name that you can use to test the script in the debug console of the Module editor.

The second Lua script does the same thing, but more "idiomatically". Instead of creating a named variable as a table, it creates an anonymous table on the fly, in the middle of the return statement, which is the only (executed during the first phase) statement in the script. The India = function(frame) end on lines 2–4 creates an (also anonymous) function and inserts it into the table under the name India. To expand such a script with more (invocable) functions, one adds them as further fields in the table. (Non-invocable local functions can, again, be added before the return statement.)

In both cases, the template code that one writes is {{#invoke:Population|India}} to invoke the function named India from the module Module:Population. Also note that function builds a function, as an object, to be called. It doesn't declare it, as you might be used to from other programming languages, and the function isn't executed until it is called.
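
The two phases can be imitated in a plain Lua interpreter, which may make the distinction easier to see. This is only a sketch: the helper invoke and the stand-in mock_frame are assumptions made for illustration, and Scribunto's real machinery is considerably more involved.

```lua
-- Imitation of {{#invoke:Population|India}} outside MediaWiki.
-- "invoke" and "mock_frame" are hypothetical stand-ins, not Scribunto APIs.
local source = [[
local p = {}
function p.India(frame)
    return "1,21,01,93,422 people at (nominally) 2011-03-01 00:00:00 +0530"
end
return p
]]

-- Lua 5.1 compiles strings with loadstring(); later versions use load().
local compile = loadstring or load

-- Phase 1: run the whole script once and keep the table that it returns.
local chunk = assert(compile(source))
local mod_table = chunk()

-- Phase 2: pick the named function out of that table and call it.
local function invoke(mod, name, frame)
    local fn = mod[name]
    if type(fn) ~= "function" then
        error("no such function: " .. tostring(name))
    end
    return fn(frame)
end

local mock_frame = { args = {} }  -- stand-in for the real frame object
print(invoke(mod_table, "India", mock_frame))
```

Note that nothing inside p.India runs during phase 1; the function is merely built and stored in the table.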

One can do more complex things than this, of course. For example, one can declare other local variables in addition to p, to hold tables of data (such as lists of language or country names) that the module uses. But this is the basic structure of a module. You make a table full of stuff, and return it.
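
A module that keeps its data in a local table might look something like the following sketch. The function name lookup and the wrapper load_population_module are hypothetical; in a real module the body would simply end with return p, and the wrapper is used here only so that the sketch also runs in a plain Lua interpreter.

```lua
-- Sketch of a module with a local data table.
-- Inside MediaWiki the module body would not be wrapped in a function.
local function load_population_module()
    local p = {}

    -- Data table, built once when the module is loaded (phase 1).
    local populations = {
        India = "1,21,01,93,422 people at (nominally) 2011-03-01 00:00:00 +0530",
    }

    -- Invocable function (hypothetical name), intended to be used as
    -- {{#invoke:Population|lookup|India}}.
    function p.lookup(frame)
        local country = frame.args[1]
        return populations[country] or ("no data for " .. tostring(country))
    end

    return p
end

local Population = load_population_module()
print(Population.lookup({ args = { "India" } }))
```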

Receiving template arguments

An ordinary function in Lua can take an (effectively) arbitrary number of arguments. Witness this function from Module:Wikitext that can be called with anywhere between zero and three arguments:

function z.oxfordlist(args,separator,ampersand)

Functions called by {{#invoke:}} are special. They expect to be passed exactly one argument, a table that is called a frame (and so is conventionally given the parameter name frame in the parameter list of the function). It's called a frame because, unfortunately, the developers chose to name it for their convenience. It's named after an internal structure within the code of MediaWiki itself, which it sort of represents.[notes 6]

This frame has a (sub-)table within it, named args. It also has a means for accessing its parent frame (again, named after a thing in MediaWiki). The parent frame also has a (sub-)table within it, also named args.

  • The arguments in the (child, one supposes) frame — i.e. the value of the frame parameter to the function — are the arguments passed to {{#invoke:}} within the wikitext of your template. So, for example, if you were to write {{#invoke:Population|India|a|b|class="popdata"}} in your template then the arguments sub-table of the child frame would be (as written in Lua form) { "a", "b", class="popdata" }.
  • The arguments in the parent frame are the arguments passed to your template when it was transcluded. So, for example, were the user of your template to write {{Population of India|c|d|language=Hindi}} then the arguments sub-table of the parent frame would be (as written in Lua form) { "c", "d", language="Hindi" }.

A handy programmers' idiom that you can use, to make this all a bit easier, is to have local variables named (say) config and args in your function, that point to these two argument tables. See this, from Module:Citation:

-- This is used by template {{efn}}.
function z.efn(frame)
    local pframe = frame:getParent()
    local config = frame.args -- the arguments passed BY the template, in the wikitext of the template itself
    local args = pframe.args -- the arguments passed TO the template, in the wikitext that transcludes the template

Everything in config is thus an argument that you have specified, in your template, that you can reference with code such as config[1] and config["class"]. These will be things that tell your module function its "configuration" (e.g. a CSS class name that can vary according to what template is used).

Everything in args is thus an argument that the user of the template has specified, where it was transcluded, that you can reference with code such as args[1] and args["language"]. These will be the normal template arguments, as documented on your template's /doc page.

See {{reflist}} and {{notelist}} for two templates that both do {{#invoke:Citation|reflist|x}} but do so with different arguments in place of the x, thereby obtaining different results from one single common Lua function.
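
The following sketch shows the general shape of that arrangement with mock frames. The helper make_frame, the function p.list, and the class names are all invented for illustration; this is not the code of Module:Citation.

```lua
-- Stand-ins for the frame objects that Scribunto would supply.
-- make_frame and p.list are hypothetical names.
local function make_frame(invoke_args, template_args)
    local parent = { args = template_args }
    local frame = { args = invoke_args }
    function frame:getParent()
        return parent
    end
    return frame
end

local p = {}

function p.list(frame)
    local config = frame.args             -- passed BY the template
    local args = frame:getParent().args   -- passed TO the template
    local class = config.class or "plain"
    return '<div class="' .. class .. '">' .. (args[1] or "") .. "</div>"
end

-- Two templates sharing one function, differing only in configuration.
print(p.list(make_frame({ class = "references" }, { "first" })))
print(p.list(make_frame({ class = "notes" }, { "first" })))
```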

For both sets of arguments, the name and value of the argument are exactly as in the wikitext, except that leading and trailing whitespace is discounted. This has an effect on your code if you decide to support or employ transclusion/invocation argument names that aren't valid Lua variable names. You cannot use the "dot" form of table lookup in such cases. For instance, args.author-first is not a reference to an |author-first= argument, but a reference to an |author= argument and a first variable with the subtraction operator in the middle. To access such an argument, use the "square bracket" form of table lookup: args["author-first"].
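
A quick way to convince yourself of the difference is to try both forms on an ordinary table. In this sketch the table contents and the local variable first are made up for illustration.

```lua
-- An ordinary table standing in for an args table.
local args = { ["author-first"] = "Ada" }
local first = 1   -- hypothetical local that happens to be named "first"

-- Square-bracket form: finds the argument.
local bracket_value = args["author-first"]
print(bracket_value)

-- Dot form: parsed as (args.author) - first. There is no "author" entry,
-- so this is arithmetic on nil, which raises an error.
local ok = pcall(function()
    return args.author - first
end)
print(ok)
```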

Named arguments are indexed in the args table by their name strings, of course. Positional arguments (whether as the result of an explicit 1= or otherwise) are indexed in the args tables by number, not by string. args[1] is not the same as args["1"], and the latter is effectively unsettable from wikitext.

Finally, note that Lua modules can differentiate between arguments that have been used in the wikitext and simply set to an empty string, and arguments that aren't in the wikitext at all. The latter don't exist in the args table, and any attempt to index them will evaluate to nil, whereas the former do exist in the table and evaluate to an empty string, "".
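
The last two points can be sketched with an ordinary table playing the part of args. The template name in the comment and the helper describe are hypothetical.

```lua
-- Mock args table, as if a (hypothetical) template had been transcluded
-- as {{Example|c|language=}}.
local args = { "c", language = "" }

-- Positional arguments are keyed by number, not by string.
print(args[1])      -- the positional argument
print(args["1"])    -- a different key, which is not set here

-- Distinguish "set but empty" from "not supplied at all".
local function describe(value)
    if value == nil then
        return "absent"
    elseif value == "" then
        return "empty"
    else
        return "set"
    end
end

print(describe(args.language))
print(describe(args.title))
print(describe(args[1]))
```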

Errors

Let's get one thing out of the way right at the start: Script error is a hyperlink. You can put the mouse pointer on it and click.

We've become so conditioned by our (non-Scribbled) templates putting out error messages in red that we think that the Scribunto "Script error" error message is nothing but more of the same. It isn't. If you have JavaScript enabled in your WWW browser, it will pop up a window giving the details of the error, a call backtrace, and even hyperlinks that will take you to the location of the code where the error happened in the relevant module.

You can cause an error to happen by calling the error() function.
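
As a minimal sketch of how error() behaves, the following plain Lua raises an error from a helper function and catches it with the standard pcall() function so that the message can be inspected. The function population_of is hypothetical. In a module you would normally not catch the error at all, and (it is assumed here) the uncaught error is what surfaces as the Script error message.

```lua
-- Hypothetical helper that refuses input it has no data for.
local function population_of(country)
    if country ~= "India" then
        error("no population data for " .. tostring(country))
    end
    return "1,21,01,93,422"
end

-- pcall() runs the function in protected mode: it returns false plus the
-- error message instead of aborting the script.
local ok, message = pcall(population_of, "Atlantis")
print(ok)
print(message)   -- the message, prefixed with position information
```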

Tips and tricks

Arguments tables are "special".

For reasons that are out of the scope of this Guide,[notes 7] the args sub-table of a frame is not quite like an ordinary table. It starts out empty, and it is populated with arguments as and when you execute code that looks for them.[notes 8] (It's possible to make tables that work like this in a Lua program, using things called metatables. That, too, is outwith the scope of this Guide.)

An unfortunate side-effect of this is that neither the pairs() nor the ipairs() functions work on an args table, unless you've actually queried all arguments by their actual names already — in which case there's little point in using pairs() or ipairs() to obtain information that you already know.
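
To see why, it may help to look at a toy table that is populated lazily in the same general manner. This is only a sketch of the idea, under the assumption that a metatable with an __index function is involved; the backing table supplied and the helper count_pairs are invented, and Scribunto's actual implementation is not shown here.

```lua
-- A lazily populated table, sketched with a metatable.
local supplied = { title = "Lua", year = "2006" }

local lazy = setmetatable({}, {
    __index = function(t, key)
        local value = supplied[key]
        if value ~= nil then
            rawset(t, key, value)   -- remember it once it has been asked for
        end
        return value
    end,
})

-- Counts the entries that pairs() can actually see.
local function count_pairs(t)
    local n = 0
    for _ in pairs(t) do
        n = n + 1
    end
    return n
end

local before = count_pairs(lazy)   -- nothing has been looked up yet
local title = lazy.title           -- this lookup populates one entry
local after = count_pairs(lazy)
print(before, title, after)
```

Here pairs() sees nothing until an entry has been asked for by name, which mirrors the behaviour described above.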

Scribunto supplies the frame:argumentPairs() function,[notes 9] that is almost a drop-in replacement for pairs(). There is, however, no drop-in replacement, almost or otherwise, for ipairs(). You have to roll your own numerical loop. Here's one from Module:Headnote that transfers all of the numbered arguments into a newargs table (so that another function can just use ipairs()):

local newargs = {}
local index = 1
while args[index] ~= nil do
    local arg = args[index]
    newargs[index] = arg
    index = index + 1
end

Copy table contents into local variables.

A name in Lua is either an access of a local variable or a table lookup.[3] math.floor is a table lookup (of the string "floor") in the (global) math table, for example. Table lookups are slower, at runtime, than local variable lookups. Table lookups in tables such as the args table with its "specialness" are a lot slower.

A function in Lua can have up to 200 local variables.[4] So make liberal use of them:

  • If you call math.floor many times, copy it into a local variable and use that instead:[4]
    local floor = math.floor
    local a = floor((14 - date.mon) / 12)
    local y = date.year + 4800 - a
    local m = date.mon + 12 * a - 3
    return date.day + floor((153 * m + 2) / 5) + 365 * y + floor(y / 4) - floor(y / 100) + floor(y / 400) - 2432046
  • Don't use args.something over and over. Copy it into a local variable and use that:
    local Tab = args.tab
    (Even the args variable itself is a way to avoid looking up "args" in the frame table over and over.)
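
For reference, the math.floor fragment above can be wrapped into a complete function as follows. The function name is invented, and the shape of the date table (fields year, mon, and day) is assumed from the fragment. With the constant shown, the arithmetic appears to yield a Modified Julian Day number.

```lua
-- math.floor copied into a local once, then reused.
local floor = math.floor

-- Hypothetical wrapper around the fragment; "date" is assumed to be a
-- table with numeric fields year, mon and day.
local function modified_julian_day(date)
    local a = floor((14 - date.mon) / 12)
    local y = date.year + 4800 - a
    local m = date.mon + 12 * a - 3
    return date.day + floor((153 * m + 2) / 5) + 365 * y
        + floor(y / 4) - floor(y / 100) + floor(y / 400) - 2432046
end

print(modified_julian_day({ year = 2000, mon = 1, day = 1 }))    -- 51544
print(modified_julian_day({ year = 1858, mon = 11, day = 17 }))  -- 0
```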

When copying arguments into local variables there are two useful things that you can do along the way:

  • The alternative names for the same argument trick. If a template argument can go by different names — such as uppercase and lowercase forms, or different English spellings — then you can use Lua's or operator to pick the highest priority name that is actually supplied:
    local Title = args.title or args.encyclopaedia or args.encyclopedia or args.dictionary
    local ISBN = args.isbn13 or args.isbn or args.ISBN
    This works for two reasons:
    • nil is the same as false as far as or is concerned.
    • Lua's or operator has what are known as "shortcut" semantics. If the left-hand operand evaluates to something that isn't false or nil, it doesn't bother even working out the value of the right-hand operand. (So whilst that first example may at first glance look like it does four lookups, in the commonest case, where |title= is used with the template, it in fact only actually does one.)
  • The default to empty string trick. Sometimes the fact that an omitted template argument is nil is useful. Other times, however, it isn't, and you want the non-Scribbled template behaviour of missing arguments being empty strings. A simple or "" at the end of an expression suffices:
    local ID = args.id or args.ID or args[1] or ""
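
Both tricks can be tried out on a mock args table. In this sketch the table counts how many lookups are made, purely to make the "shortcut" behaviour visible; the argument values and the variable names lookups and title_lookups are invented for illustration.

```lua
-- Mock args table that counts how many lookups are performed on it.
local lookups = 0
local supplied = { title = "Some Title", ID = "X1" }
local args = setmetatable({}, {
    __index = function(_, key)
        lookups = lookups + 1
        return supplied[key]
    end,
})

-- Alternative names: |title= is present, so only one lookup happens.
local Title = args.title or args.encyclopaedia or args.encyclopedia or args.dictionary
local title_lookups = lookups
print(Title, title_lookups)

-- args.id is absent (nil), so evaluation moves on to args.ID and stops there.
local ID = args.id or args.ID or args[1] or ""
print(ID)

-- Default to empty string: |page= is absent, so the result is "".
local Page = args.page or ""
print(Page == "")
```

One caveat worth keeping in mind: an empty string is not false in Lua, so an argument that is present but empty still wins over the names to its right.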

Don't expand templates, even though you can.

If local variables are cheap and table lookups are expensive, then template expansion is way above your price bracket.

Avoid frame:preprocess() like the plague. Nested template expansion using MediaWiki's preprocessor is what we're trying to get away from, after all. Most things that you'd do with that are done more simply, more quickly, and more maintainably, with simple Lua functions.

Similarly, avoid things like using w:Template:ISO 639 name aze to store what is effectively an entry in a database. Reading it would be a nested parser call with concomitant database queries, all to map a string onto another string. Put a simple straightforward data table in your module, like the ones in Module:Language.
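
Such a data table could be as simple as the following sketch. The table and function names are hypothetical and the three entries are only a sample; this is not the actual content of Module:Language.

```lua
-- A plain lookup table mapping ISO 639 codes to language names.
-- Hypothetical sample data, not copied from any real module.
local language_names = {
    aze = "Azerbaijani",
    deu = "German",
    fra = "French",
}

-- Returns the name for a code, or an empty string if the code is unknown.
local function language_name(code)
    return language_names[code] or ""
end

print(language_name("aze"))
```

A lookup like this is a single table access, with no template expansion involved.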

Footnotes

References

Cross-references

Citations

Further reading

Lua