This is a list of the different language editions of Wikipedia. As of March 2020, Wikipedia articles have been created in 309 languages, of which 299 are active and 10 are closed.[1]
Distribution of the 51,542,076 articles in different language editions (as of 3 April 2020);[1] the majority of the articles in Swedish, Cebuano, and Waray were created by Lsjbot.[2]
Each Wikipedia has a code, which is used as a subdomain below wikipedia.org. Interlanguage links are sorted by that code. The codes generally follow the language codes defined by ISO 639-1 and ISO 639-3, and the choice of which code to use is usually determined by the IETF language tag policy. Wikipedias also vary in how finely they divide dialects and variants. For example, the English Wikipedia includes most modern varieties of English (American English, British English, Indian English, South African English, etc.) but excludes related languages such as Scots and Anglo-Saxon, both of which have separate Wikipedias. The Spanish Wikipedia includes both Peninsular Castilian and Latin American Spanish, the Malay Wikipedia includes a large number of Malay languages, and so on.
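As a rough illustration of how a code maps to a subdomain, the sketch below builds an edition's base URL and sorts a handful of codes. The function name `edition_url` is hypothetical, the `https` scheme is an assumption, and plain lexicographic sorting is only an assumed stand-in for whatever ordering the software actually applies to interlanguage links.

```python
def edition_url(wp_code: str) -> str:
    """Build the base URL of a language edition from its Wikipedia code.

    Assumes the code is used verbatim as the subdomain below wikipedia.org.
    """
    return f"https://{wp_code}.wikipedia.org/"


# A few codes that appear in this article.
codes = ["simple", "be-tarask", "als", "zh-yue", "be"]

# Assumption: a plain lexicographic sort by code.
sorted_codes = sorted(codes)

for code in sorted_codes:
    print(code, edition_url(code))
```

Note that `als` here yields the Alemannic edition, not Tosk Albanian, because the subdomain follows the Wikipedia code rather than the ISO code.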
Differences between the ISO mappings and Wikipedia codes include:
WP code | WP edition name | ISO 639 code for this language | Notes |
---|---|---|---|
sq | Albanian | 'sq' is the ISO code for the Albanian macrolanguage, which includes four individual languages. | |
als | Alemannic | 'gsw' for Swiss German, Alemannic German, and Alsatian; 'gct' for Colonia Tovar dialect; 'swg' for Swabian German, and 'wae' for Walser German | 'als' is actually the ISO code for Tosk Albanian.[3] |
roa-rup | Aromanian | rup | 'roa' is the ISO code for Romance (Other). |
map-bms | Banyumasan | 'map' is the ISO code for Austronesian (Other). | |
nds-nl | Dutch Low Saxon | The Low Saxon dialects in the Netherlands have their own ISO codes. | In ISO, nds is 'Low Saxon', restricted to Germany in Ethnologue. |
bh | Bihari | ISO collective code 'bih' is a macrolanguage which includes Bhojpuri (bho), Maithili (mai), Magahi (mag), and nine others.[4] | Bihari Wikipedia excludes Maithili (mai) and Fiji Hindi (hif) which exist as independent Wikipedias. |
zh-yue | Cantonese | yue | zh is the ISO 639-1 code for Chinese in general. |
zh-classical | Classical Chinese | lzh | As above. |
ms | Malay | ISO collective code 'ms' is a macrolanguage that includes more than 30 individual languages and dialects. However, the wiki excludes Indonesian because Indonesian Wikipedia (id) exists independently. | |
zh-min-nan | Min Nan | nan | As with the other "zh" codes above. Written not in Chinese characters but in Pe̍h-ōe-jī or a derived romanization, as is the Hakka Wikipedia (hak). |
no | Norwegian Bokmål | nb, nob | ISO uses no for Norwegian in general. (Norwegian Nynorsk is at 'nn' in both ISO and Wikipedia.) |
ksh | Ripuarian | none | ISO ksh is for the Kölsch language, the most prominent dialect of the Ripuarian language group. The other variants (e.g. the Aachen dialect) do not have ISO codes. |
bat-smg | Samogitian | sgs | 'bat' is the ISO code for Baltic (Other). |
simple | Simple English | none | |
roa-tara | Tarantino | none | 'roa' is the ISO code for Romance (Other). Neapolitan in general is nap. |
fiu-vro | Võro | vro | fiu is the ISO code for Finno-Ugric languages. |
cbk-zam | Zamboanga Chavacano | none | ISO cbk is for the Chavacano language. The individual variants do not have ISO codes. |
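
The rows above that give definite ISO codes can be read as a lookup table. The following is a minimal sketch, not an official mapping: the names `WP_TO_ISO` and `iso_codes` are hypothetical, only rows with explicit codes (or an explicit "none") are included, and any code outside the table is reported as unknown rather than guessed.

```python
from typing import Optional, Tuple

# Wikipedia code -> ISO 639 code(s), copied from the table rows above.
# An empty tuple means the table lists "none" for that edition.
WP_TO_ISO = {
    "als": ("gsw", "gct", "swg", "wae"),
    "roa-rup": ("rup",),
    "zh-yue": ("yue",),
    "zh-classical": ("lzh",),
    "zh-min-nan": ("nan",),
    "no": ("nb", "nob"),
    "ksh": (),
    "bat-smg": ("sgs",),
    "simple": (),
    "roa-tara": (),
    "fiu-vro": ("vro",),
    "cbk-zam": (),
}


def iso_codes(wp_code: str) -> Optional[Tuple[str, ...]]:
    """Return the ISO codes listed for a Wikipedia code.

    Returns None when the code is not covered by this sketch, so callers
    can tell "no ISO code" (empty tuple) apart from "not in the table".
    """
    return WP_TO_ISO.get(wp_code)


print(iso_codes("zh-yue"))
print(iso_codes("simple"))
print(iso_codes("en"))
```

Returning `None` for unlisted codes is a deliberate choice: many editions use their ISO code directly, but the table shows enough exceptions that assuming so silently would be unsafe.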
Additionally, Wikipedias sometimes vary in orthography. The Chinese Wikipedia automatically converts modern Mandarin Chinese text into five standard forms: mainland China and Singapore in simplified Chinese characters, and Taiwan, Hong Kong, and Macau in traditional Chinese characters. Belarusian, by contrast, has separate Wikipedias for the 'normative' orthography (be) and Taraškievica (be-tarask).
The table below lists the language editions of Wikipedia roughly sorted by the number of active users (registered users who have made at least one edit in the last thirty days); in particular, the "power of ten" of the count of active users (i.e., the common logarithm rounded down to a whole number) is used: so "5" means at least 10⁵ (or 100,000), "4" means at least 10⁴ (10,000), and so on. Script names are listed as their ISO codes.
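The "power of ten" bucketing can be sketched as follows. The function name `activity_magnitude` and the sample counts are hypothetical, and counting digits is used here as an exact integer equivalent of rounding down the common logarithm, which avoids floating-point edge cases.

```python
def activity_magnitude(active_users: int) -> int:
    """Return floor(log10(active_users)) for a positive integer count.

    A positive integer with d decimal digits lies in [10**(d-1), 10**d),
    so the rounded-down common logarithm is d - 1.
    """
    if active_users < 1:
        raise ValueError("the magnitude is undefined for counts below 1")
    return len(str(active_users)) - 1


# Hypothetical active-user counts, for illustration only.
for count in (1, 9_999, 10_000, 99_999, 100_000):
    print(count, activity_magnitude(count))
```

Under this scheme an edition with 99,999 active users and one with 10,000 share the value 4, which is why the resulting order is only rough.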