Module:Wt/haw/languages/doc
This is the documentation page for Module:Wt/haw/languages
This module is used to retrieve and manage the languages that can have Wiktionary entries, and the information associated with them. See Wiktionary:Languages for more information.
For the languages and language varieties that may be used in etymologies, see Module:Wt/haw/etymology languages. For language families, which sometimes also appear in etymologies, see Module:Wt/haw/families.
This module provides access to other modules. To access the information from within a template, see Module:Wt/haw/languages/templates.
The information itself is stored in the various data modules that are subpages of this module. They are listed in Category:Language data modules. These modules should not be used directly by any other module; the data should only be accessed through the functions provided by Module:Wt/haw/languages.
Finding and retrieving languages
The module exports a number of functions that are used to find languages.
getByCode
getByCode(code)
Finds the language whose code matches the one provided. If it exists, it returns a Language object representing the language. Otherwise, it returns nil.
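Because getByCode can return nil, callers usually need to check the result before calling methods on it. The following is a minimal sketch of that check, not part of the module itself: nameForCode and its fallback argument are hypothetical names introduced for illustration, and m_languages stands for the table returned by require("Module:Wt/haw/languages").

```lua
-- Hypothetical helper (not provided by the module): returns the canonical
-- name for a language code, or `fallback` when getByCode returns nil.
-- `m_languages` is assumed to be the languages module table.
local function nameForCode(m_languages, code, fallback)
    local lang = m_languages.getByCode(code)
    if lang == nil then
        -- Unknown code: avoid indexing nil.
        return fallback
    end
    return lang:getCanonicalName()
end
```

Passing the module in as an argument is only a convenience for the sketch; code on the wiki would normally require the module once at the top of the file.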
getByCanonicalName
getByCanonicalName(name)
Finds the language whose canonical name (the name used to represent that language on Wiktionary) matches the one provided. If it exists, it returns a Language object representing the language. Otherwise, it returns nil.
The canonical name of languages should always be unique (it is an error for two languages on Wiktionary to share the same canonical name), so this is guaranteed to give at most one result.
This function and the following are powered by Module:Wt/haw/languages/by name, which goes through the Category:Language data modules for non-etymology languages and creates one table of canonical names with their language codes and another table with both canonical names and other names.
getByName
getByName(name)
Like getByCanonicalName(), except it also looks at the otherNames listed in the non-etymology language data modules.
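The three lookup functions can be chained when the input might be either a code or a name. This is only a sketch under assumptions: resolveLanguage is a hypothetical helper, getByName is assumed to take a single name argument like getByCanonicalName, and m_languages stands for the languages module table.

```lua
-- Hypothetical helper: tries the most specific lookup first and falls back
-- to broader ones. Returns a Language object, or nil if nothing matches.
local function resolveLanguage(m_languages, text)
    return m_languages.getByCode(text)
        or m_languages.getByCanonicalName(text)
        or m_languages.getByName(text)
end
```

The order matters: other names are not guaranteed to be unique, so the code and the canonical name are tried before getByName.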
getAll
getAll()
- This function is expensive
Returns a table containing Language objects for all languages, sorted by code.
This function searches through the whole database of languages, and is therefore relatively resource-intensive. It should be used sparingly.
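Since getAll is expensive, it is best called once and its result reused within a single pass. The sketch below counts languages by type in one traversal; countByType is a hypothetical helper, and it assumes the returned table is an ordinary sequence that can be walked with ipairs.

```lua
-- Hypothetical helper: calls getAll() exactly once and tallies the result
-- of :getType() for every language. Assumes getAll() returns a sequence.
local function countByType(m_languages)
    local counts = {}
    for _, lang in ipairs(m_languages.getAll()) do
        local languageType = lang:getType()
        counts[languageType] = (counts[languageType] or 0) + 1
    end
    return counts
end
```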
Language objects
A Language object is returned from one of the functions above. It is a Lua representation of a language and the data associated with it. It has a number of methods that can be called on it, using the : syntax. For example:
local m_languages = require("Module:Wt/haw/languages")
local lang = m_languages.getByCode("fr")
local name = lang:getCanonicalName()
-- "name" will now be "French"
Language:getCode
:getCode()
Returns the language code of the language. Example: "fr" for French.
Language:getCanonicalName
:getCanonicalName()
Returns the canonical name of the language. This is the name used to represent that language on Wiktionary, and is guaranteed to be unique to that language alone. Example: "French" for French.
Language:getAllNames
:getAllNames()
Returns a table of all names that the language is known by, including the canonical name. The names are not guaranteed to be unique, as more than one language is sometimes known by the same name. Example: {"French", "Modern French"} for French.
Language:getType
:getType()
Returns the type of language, which can be "regular", "reconstructed" or "appendix-constructed".
Language:getWikimediaLanguages
:getWikimediaLanguages()
Returns a table containing WikimediaLanguage objects (see Module:Wt/haw/wikimedia languages), which represent languages and their codes as they are used in Wikimedia projects for interwiki linking and such. More than one object may be returned, as a single Wiktionary language may correspond to multiple Wikimedia languages. For example, Wiktionary's single code sh (Serbo-Croatian) maps to four Wikimedia codes: sh (Serbo-Croatian), bs (Bosnian), hr (Croatian) and sr (Serbian).
The code for the Wikimedia language is retrieved from the wikimedia_codes property in the data modules. If that property is not present, the code of the current language is used. If none of the available codes is actually a valid Wikimedia code, an empty table is returned.
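Callers should be prepared for zero, one or several results. The following sketch collects the codes into a plain list; wikimediaCodes is a hypothetical helper, and the :getCode() method on WikimediaLanguage objects is an assumption here, since those objects are documented in Module:Wt/haw/wikimedia languages rather than on this page.

```lua
-- Hypothetical helper: gathers the code of every WikimediaLanguage object.
-- ASSUMPTION: WikimediaLanguage objects expose a :getCode() method.
local function wikimediaCodes(lang)
    local codes = {}
    for _, wikimediaLang in ipairs(lang:getWikimediaLanguages()) do
        codes[#codes + 1] = wikimediaLang:getCode()
    end
    -- An empty table means no valid Wikimedia code was available.
    return codes
end
```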
Language:getWikipediaArticle
:getWikipediaArticle()
Returns the name of the Wikipedia article for the language. This is either the wikipedia_article property in the data modules, or the category name returned by :getCategoryName.
Language:getScripts
:getScripts()
Returns a table of Script objects for all scripts that the language is written in. See Module:Wt/haw/scripts.
Language:getFamily
:getFamily()
Returns a Family object for the language family that the language belongs to. See Module:Wt/haw/families.
Language:getAncestors
:getAncestors()
Returns a table of Language objects for all languages that this language is directly descended from. Generally this is only a single language, but creoles, pidgins and mixed languages can have multiple ancestors.
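Because the result is a table of Language objects, the methods described on this page can be called on each ancestor. A minimal sketch, in which ancestorNames is a hypothetical helper and the result is assumed to be a sequence:

```lua
-- Hypothetical helper: joins the canonical names of all direct ancestors
-- into one comma-separated string. Returns "" when there are no ancestors.
local function ancestorNames(lang)
    local names = {}
    for _, ancestor in ipairs(lang:getAncestors()) do
        names[#names + 1] = ancestor:getCanonicalName()
    end
    return table.concat(names, ", ")
end
```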
Language:getCategoryName
:getCategoryName()
Returns the name of the main category of that language. Example: "French language" for French, whose category is at Category:French language.
Language:makeEntryName
:makeEntryName(term)
Converts the given term into the form used in the names of entries. This removes diacritical marks that are not considered part of the normal written form of the language and are therefore not permitted in page names. It also removes certain punctuation characters, such as final question marks or periods, which are never present in page names. Example for Latin: "amō" → "amo" (macron is removed).
The replacements made by this function are defined by the entry_name setting for each language in the data modules.
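A typical use is to link to the entry page while still displaying the term with its diacritics. The sketch below builds plain wikilink syntax by hand; makeLink is a hypothetical helper written for illustration and is not a function of this module.

```lua
-- Hypothetical helper: links to the page name produced by :makeEntryName
-- while displaying the original term.
local function makeLink(lang, term)
    local pageName = lang:makeEntryName(term)
    if pageName == term then
        -- No piped link needed when the page name equals the display form.
        return "[[" .. term .. "]]"
    end
    return "[[" .. pageName .. "|" .. term .. "]]"
end
```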
Language:makeSortKey
:makeSortKey(term)
Creates a sort key for the given term, following the rules appropriate for the language. This removes diacritical marks from the term if they are not considered significant for sorting, and may perform some other changes. Any initial hyphen is also removed, and anything in parentheses is removed as well.
The replacements made by this function are defined by the sort_key setting for each language in the data modules.
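Sort keys are meant to be compared in place of the terms themselves. The following sketch sorts a list of terms by their keys, computing each key only once; sortTerms is a hypothetical helper, and plain string comparison of the keys is an assumption about how they should be ordered.

```lua
-- Hypothetical helper: returns a new list of terms ordered by sort key.
local function sortTerms(lang, terms)
    -- Pair each term with its key so makeSortKey runs once per term.
    local keyed = {}
    for i, term in ipairs(terms) do
        keyed[i] = { term = term, key = lang:makeSortKey(term) }
    end
    table.sort(keyed, function(a, b)
        return a.key < b.key
    end)
    local sorted = {}
    for i, item in ipairs(keyed) do
        sorted[i] = item.term
    end
    return sorted
end
```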
Language:transliterate
:transliterate(text, sc, module_override)
Transliterates the text from the given script into the Latin script (see Wiktionary:Transliteration and romanization). The language must have the translit_module property for this to work; if it is not present, nil is returned.
The sc parameter is handled by the transliteration module, and how it is handled is specific to that module. Some transliteration modules may tolerate nil as the script; others require it to be one of the scripts that the module can transliterate, and will show an error if it is not. For this reason, the sc parameter should always be provided when writing non-language-specific code.
The module_override parameter is used to override the default module that is used to provide the transliteration. This is useful in cases where you need to demonstrate a particular module in use, but there is no default module yet, or you want to demonstrate an alternative version of a transliteration module before making it official. It should not be used in real modules or templates, only for testing. All uses of this parameter are tracked by Template:Wt/haw/tracking/module override.
Language:hasTranslit
:hasTranslit()
Returns true if the language has a transliteration module, false if it doesn't.
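:hasTranslit and :transliterate are often used together when a transliteration should be shown only if one is available. A minimal sketch under assumptions: withTranslit is a hypothetical helper, the "text (transliteration)" output format is invented for illustration, and sc is passed through unchanged because the value it must hold depends on the transliteration module.

```lua
-- Hypothetical helper: appends a transliteration in parentheses when the
-- language can provide one, and returns the text unchanged otherwise.
local function withTranslit(lang, text, sc)
    if not lang:hasTranslit() then
        return text
    end
    -- Always pass sc explicitly; module_override is deliberately not used.
    local translit = lang:transliterate(text, sc)
    if translit == nil or translit == "" then
        return text
    end
    return text .. " (" .. translit .. ")"
end
```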
Language:getRawData
:getRawData()
- This function is not for use in entries or other content pages.
Returns a blob of data about the language. The format of this blob is undocumented, and perhaps unstable; it's intended for things like the module's own unit-tests, which are "close friends" with the module and will be kept up-to-date as the format changes.
Error function
err(lang, param, text)
Looks at a supposed language code passed through a template parameter and returns a helpful error message depending on whether the language code has a valid form (two or three lowercase basic Latin letters, or two or three groups of three lowercase basic Latin letters separated by hyphens).
Add the parameter value in argument #1 and the parameter name in argument #2. For instance, if parameter 1 of the template is supposed to be a language code, this function can be called the following way:
local m_languages = require("Module:Wt/haw/languages")
local lang = m_languages.getByCode(frame.args[1]) or m_languages.err(frame.args[1], 1)
If you would like the error message to say something other than "language code", place the phrase in argument #3.
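As a sketch of the three-argument form, the helper below validates a second template parameter and asks for the wording "source language code" in the message. getSourceLanguage, the choice of parameter 2 and the phrase itself are hypothetical; only the err(lang, param, text) signature comes from this page.

```lua
-- Hypothetical helper: looks up the language given in template parameter 2
-- and hands off to err() with a custom phrase when the lookup fails.
local function getSourceLanguage(m_languages, args)
    return m_languages.getByCode(args[2])
        or m_languages.err(args[2], 2, "source language code")
end
```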