Vikipedio:Lua/Moduloj/URLutil/en
URLutil – Module with functions for strings in the context of internet addressing (URL; IP address, including IPv4 and IPv6; as well as e-mail). Internationalized addresses (IRI) are also supported.
Since the module is meant to benefit a wiki project, only persistent open access on the World Wide Web is supported. Some special cases are not implemented, but they are hardly relevant.
Functions for templates
All functions expect exactly one unnamed parameter (which should be provided to get a meaningful answer). Leading and trailing whitespace is ignored.
The return value is an empty string (“nothing”) if the parameter value does not meet the expectations. If there is a result or the query condition is true, at least one visible character will be returned. The result does not begin or end with a space, and HTML entities will be decoded.
- getAuthority
- Extract server access from a resource URL (lowercase result)
- nothing – if invalid
- getFragment
- Extract fragment (if any) from a resource URL
- Parameter 2 – (optional) decoding
- 2=% – URL is %-coded
- 2=WIKI – URL is Wiki-coded with dots and underscore
- Result:
- nothing – if not present
- starting with # – if present
- getHost
- Extract domain or IP address from a resource URL (lowercase result)
- nothing – if invalid
- getLocation
- Extract resource URL without a fragment, if any
- getPath
- Extract path from a resource URL without any query or fragment.
- Beginning with / as basic resource identification.
- getPort
- Extract port number from a resource URL (numeric result)
- nothing – if not present or invalid
- getQuery
- Extract query from a resource URL
- Parameter 2 – (optional) single parameter name
- Parameter 3 – alternative separator like ; – default: &
- Result:
- nothing – if not present
- single value, if single parameter requested
- getRelativePath
- Extract path and query including fragment (if any) from a resource URL but relative to host.
- getScheme
- Extract scheme from a resource URL (lowercase result, including double slashes)
- // – relative protocol
- https:// – protocol
- nothing – if beginning of URL is invalid
- getTLD
- Extract top level domain from a resource URL (lowercase result)
- nothing – if invalid, or IP
- getTop2domain
- Extract first two top levels of domain from a resource URL (lowercase result)
- nothing – if invalid, or IP
- getTop3domain
- Extract three top levels of domain from a resource URL (lowercase result)
- nothing – if invalid, or IP
- isAuthority
- Is it a server address (also IP) of a resource, including port?
- 1 – yes
- isDomain
- Is it a named domain, including subdomains?
- 1 – yes
- isDomainExample
- Is it an example domain defined in RFC 2606 (example.com example.edu example.net example.org)?
- 1 – yes
- isDomainInt
- Is it an Internationalized Domain Name (non-ASCII or Punycode)?
- 1 – yes
- isHost
- Is it a server address without port (also IP)?
- 1 – yes
- isIPlocal
- Is it an IPv4 address that is presumably local (RFC 1918, RFC 1122)? This also covers addresses like 0.0.0.0 (RFC 5735).
- 1 – yes
- isIPv4
- Is it an IPv4 address in common notation (segmentation by dots, decimal)?
- 1 – yes
- isIPv6
- Is it an IPv6 address?
- 1 – yes
- isMailAddress
- Is it an e-mail address?
- 1 – yes
- isMailLink
- Is it an e-mail link (mailto:)?
- 1 – yes
- isProtocolMW
- Is it a URL or scheme keyword that is commonly recognized by the MediaWiki software?
- For the current list see [1]
- 1 – yes
- isProtocolDialog
- Is it a URL or scheme keyword that could be used to initiate a dialog in a Wiki?
- mailto, irc, ircs, ssh, telnet
- 1 – yes
- isProtocolWiki
- Is it a URL or scheme keyword that could point to a resource in a Wiki?
- Relative protocol and ftp ftps git http https mms nntp sftp svn worldwind
- Not desired here: gopher, wais, as well as mailto, irc, ircs, ssh, telnet.
- 1 – yes
- isResourceURL
- Is it a URL that provides general access to a resource? These are: relative protocol, http, https, ftp; a valid host is required as well. Other URLs might be used on project or functional pages, but not in an encyclopedic context.
- 1 – yes
- isSuspiciousURL
- Is it a URL that might be syntactically problematic and might trigger a warning?
- 1 – yes
- isUnescapedURL
- Is it a URL in which the wikisyntax characters [ | ] still need to be escaped?
- 1 – yes
- isWebURL
- Is it a valid address for a resource (any protocol)?
- 1 – yes
- wikiEscapeURL
- Wikisyntax-safe escaping of [ | ] characters.
- Identical with parameter, if no problematic character is present.
- Otherwise [ | ] are replaced by webserver-safe HTML entities. A pipe is not possible in plain template syntax.
Examples (test page)
A test page illustrates practical use.
Functions for Lua modules (API)
All functions described above can be used by other modules:
local lucky, URLutil = pcall( require, "Module:URLutil" )
if type( URLutil ) == "table" then
    URLutil = URLutil.URLutil()
else
    -- failure; URLutil is the error message
    return "<span class='error'>" .. URLutil .. "</span>"
end
Subsequently the following are available:
- URLutil.getAuthority()
- URLutil.getFragment()
- URLutil.getHost()
- URLutil.getLocation()
- URLutil.getPath()
- URLutil.getPort() – numerical value, or false
- URLutil.getQuery()
- URLutil.getQueryTable(url, separator) – table with all assignments key=value
- URLutil.getRelativePath()
- URLutil.getScheme()
- URLutil.getTLD()
- URLutil.getTop2domain()
- URLutil.getTop3domain()
- URLutil.isAuthority()
- URLutil.isDomain()
- URLutil.isDomainExample()
- URLutil.isDomainInt()
- URLutil.isHost()
- URLutil.isIP() – numerical 4, 6, or false
- URLutil.isIPlocal()
- URLutil.isIPv4()
- URLutil.isIPv6()
- URLutil.isMailAddress()
- URLutil.isMailLink()
- URLutil.isProtocolMW()
- URLutil.isProtocolDialog()
- URLutil.isProtocolWiki()
- URLutil.isResourceURL()
- URLutil.isSuspiciousURL()
- URLutil.isUnescapedURL()
- URLutil.isWebURL()
- URLutil.wikiEscapeURL()
On success, the URLutil.get*() functions return a string and the URLutil.is*() functions return true (unless an exception is mentioned above); on failure they always return false.
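Because failure is signalled by false rather than by an error, callers usually test a result before using it. The following is only a sketch of that pattern: hostAndPort and listQuery are hypothetical helpers that are not part of URLutil, both expect the already loaded URLutil table as their first argument, and listQuery assumes that getQueryTable() maps each key to a single value and accepts "&" (the default separator named for getQuery) as its second argument.

```lua
-- Hypothetical helpers, not part of URLutil itself. Both take the
-- already loaded URLutil table as their first argument.

-- Return "host" or "host:port" for a URL, or false if no host is found.
local function hostAndPort( URLutil, url )
    local host = URLutil.getHost( url )
    if not host then
        -- get*() functions signal failure with false
        return false
    end
    local port = URLutil.getPort( url )    -- numerical value, or false
    if port then
        return host .. ":" .. tostring( port )
    end
    return host
end

-- Return the query assignments of a URL as a sorted list of
-- "key=value" strings; an empty list if there is no usable query.
-- Assumption: getQueryTable() maps each key to a single value.
local function listQuery( URLutil, url )
    local assigned = URLutil.getQueryTable( url, "&" )
    local lines = {}
    if type( assigned ) == "table" then
        for key, value in pairs( assigned ) do
            table.insert( lines, tostring( key ) .. "=" .. tostring( value ) )
        end
        table.sort( lines )
    end
    return lines
end
```

Passing the module table as an argument keeps the helpers independent of how the module was loaded, so the pcall( require, ... ) check shown earlier stays in one place.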
Furthermore there are two string constants:
- URLutil.serial – current version ID (date)
- URLutil.suite – "URLutil"
Usage
General library; no limitations.
Dependencies
None.
See also
- mw: Uri library – other functionality for general URIs, but particularly helpful for wiki URLs.
Antetype
- de:Vorlage:URLutil – 2016-01-01, which was partially created from en:Module:IPAddress – 2013-03-01
- Unit tests: en:Module:IPAddress/tests