|Spoken in:||Albania, Serbia (mainly Kosovo), Montenegro, FYROM, Greece, Turkey, Italy, and other countries|
|Total speakers:||6,000,000 (Ethnologue, 2005)|
|Language family:|| Indo-European|
|Writing system:||Latin alphabet (Albanian variant)|
|Official language of:||Albania, Serbia (Kosovo), Montenegro, and Republic of Macedonia|
|Regulated by:||no official regulation|
|ISO 639-2:||alb (B)||sqi (T)|
|ISO 639-3:||sqi (Albanian, generic), aln (Gheg), aae (Arbëreshë), aat (Arvanitika), als (Tosk)|
Albanian (gjuha shqipe IPA /ˈɟuˌha ˈʃciˌpɛ/) is a language spoken by about 6 million people, primarily in Albania, Serbia (Kosovo), Montenegro, and the Republic of Macedonia, but also in other parts of the Balkans, along the eastern coast of Italy and in Sicily, as well as by a significant diaspora in Scandinavia, Germany, the United Kingdom, Egypt, Australia, Turkey, and the United States. The language forms its own distinct branch of the Indo-European language family.
Albanian was proven to be an Indo-European language in 1854 by the German philologist Franz Bopp. It forms its own independent branch of the Indo-European language family and has no close living relatives, although it has many dialects, some of them quite divergent from one another. There is no scholarly consensus over its origin. Some scholars maintain that it derives from the Illyrian language, while others claim that it derives from Daco-Thracian (Illyrian and Daco-Thracian, however, might have been closely related languages; see Thraco-Illyrian). One linguist has even stated that Illyrian and Thracian may have been as close as Czech is to Slovak (Paliga, 2002).
How Albanian compares with other languages
|English||month||new||mother||sister||night||nose||three||black||red||yellow||green||wolf|
|Albanian||muaj||i ri / e re||nënë||motër||natë||hundë||tre||i zi / e zezë||i kuq||i verdhë||i blertë / i gjelbër||ujk|
|Other Indo-European languages|
|Latin||mēnsis||novus||māter||soror||nox||nasus||trēs||āter, niger||ruber||flāvus, gilvus||viridis||lupus|
|Welsh||mis||newydd||mam||chwaer||nos||trwyn||tri||du (/di/)||coch, rhudd||melyn||gwyrdd, glas||blaidd|
Note: Aside from kuq, verdhë, and gjelbër, these Albanian words are directly inherited from Proto-Indo-European. Albanian motër is cognate with the Indo-European root for "mother", unique in having the meaning shift to "sister".
Note: The Albanian words for "yellow" (verdhë) and "green" (gjelbër) closely resemble the Romanian and Italian words for "green" and "yellow" respectively, and several other languages show a similarity with one of the two words (for example, German gelb, but not grün). It is unclear whether this reflects a connection with Latin or with other languages of the Balkans. The inversion must have occurred when "green" and "yellow" were not considered distinct colours, much as we consider both "turquoise" (cyan) and "indigo" (primary blue) to be "blue", with the result that the Albanian words for "green" and "yellow" appear switched relative to the other languages. Today verdhë means "yellow", in line with the distinction most languages make, although some Gheg speakers of older generations may use it to refer to "green", or even "blue". Note also that the Albanian and Welsh words for "red" were both borrowed from Latin coccum "scarlet".
Albanian is spoken by about 6 million people, mainly in Albania, Kosovo and the FYROM, and by immigrant communities in many countries, such as Belgium, Egypt, Germany, Greece, Italy, Sweden, Turkey (Europe), Ukraine, the UK and the USA.
Albanian has hundreds of dialects, which fall into two main groups, Gheg and Tosk. The Shkumbin river is roughly the dividing line: Gheg is spoken north of the Shkumbin and Tosk south of it. The Gheg literary language has been documented since 1555. Until the communists took power in Albania, the standard was based on Gheg. Although the literary versions of Tosk and Gheg are mutually intelligible, many of the regional dialects are not.
Tosk is divided into many dialects. The main groups are Northern Tosk (Berat, Pojan, Vlorë) and Labërisht (Labëria). In Greece, the Çam and the Arvanites speak different Tosk dialects, and the dialect of the Arvanites is only partially intelligible to speakers of other Tosk dialects. The Tosk dialects are spoken by most members of the large Albanian immigrant communities of Ukraine, Turkey, Egypt, and the United States. Tosk dialects called Arbërisht are spoken by the Arbëreshë, descendants of 15th- and 16th-century immigrants to southeastern Italy, who live in small communities in the regions of Sicily, Calabria, Basilicata, Campania, Molise, Abruzzi, and Puglia.
Gheg (or Geg) is spoken in Northern Albania, Macedonia, Kosovo, and in parts of Montenegro. Each area of northern Albania has its own dialect, and these can be divided into dialect groups: Tirana, Durrës, Elbasan and Kavaja; Kruja and Laci; Mati, Dibra and Mirdita; Lezha, Shkodra, Kraja, Ulqinj; etc. Malësia e Madhe, Rugova, and villages scattered along the Adriatic coast form the northernmost dialect area of Albanian today, although Albanian was spoken in Dalmatia until recently. There are many other dialects in Kosovo, in parts of southern Montenegro, and in FYROM. The dialects of Malësia e Madhe and Dukagjini near Shkodra are being lost because the younger generations prefer to speak the dialect of Shkodra.
Gheg and Tosk differ mainly by:
- rhotacism - Gheg has n where Tosk has r
- late Proto-Albanian â + tautosyllabic nasal > Gheg low-central or low-back vowel; > Tosk mid-central, or low-front-to-central vowel
- Proto-Albanian ô > uo > Gheg vo, Tosk va
- infinitival use of verbal adjective preceded in Gheg by me and in Tosk by për të
- difference in lexemes, noun plurals, suppletion of the aorist system of the verb
Subdialects may vary based on:
- retention or loss of final schwa (-ë)
- devoicing of final voiced segments
- treatment of intervocalic and final nj
- treatment of clusters of nasal + voiced stop
- development of anaptyctic homorganic stops after nasals that follow a stressed vowel and precede unstressed -ël or -ër
- treatment of vowel clusters ie, ye, and ua
- treatment of stressed /e/ before a nasal
Notable lexicological differences between Tosk and Gheg
|Standard form||Tosk form||Gheg form||Translation|
|një||një||nji / njo||one|
|është||është||âsht / â||is|
|për të punuar||për të punuar||me punue||to work|
|qenë||qënë||kjênë / kânë||been (part.)|
|baltë||llum||bâltë / lloç||mud|
( ˆ ) denotes nasal vowels, which are a common feature of Gheg.
Albanian has 7 vowels and 29 consonants. Gheg has a set of nasal vowels, which are absent in Tosk. Another peculiarity is that the mid-central vowel ë is reduced at the end of a word. Stress falls mainly on the penultimate syllable.
|plosive||p b||t d||c ɟ||k ɡ|
|fricative||f v||θ ð||s z||ʃ ʒ||h|
|affricate||ʦ ʣ||ʧ ʤ|
|lateral approximant||l ɫ|
- The affricates are pronounced as one sound (a stop and a fricative at the same point).
- The palatal stops q and gj are completely unknown to English, so the pronunciation guide is approximate. Palatal stops can be found in other languages, for example, in Hungarian (where these sounds are spelt ty and gy respectively).
- The palatal nasal nj corresponds to the sound of the Spanish ñ or the French or Italian digraph gn (as in gnocchi). It is pronounced as one sound, not a nasal plus a glide.
- The ll sound is a velarised lateral, close to English "dark L".
- The contrast between flapped r and trilled rr is the same as in Spanish. English has neither of the two sounds phonemically (but the tt in butter is pronounced as a flap in most American dialects).
- (1) The letter ç may be spelt ch on American English keyboards, partly because of its English sound but mainly by analogy with Albanian xh, sh and zh, and with the relation of c/ch (IPA: ʦ/ʧ) to x/xh (IPA: ʣ/ʤ). Usually, however, it is spelt simply c, which may cause confusion, although the meaning is generally understood from context.
Albanian nouns have gender (masculine, feminine and neuter) and are inflected for number (singular and plural) and case. There are 4 declensions with 6 cases (nominative, accusative, genitive, dative, ablative and vocative), although the vocative only occurs with a limited number of words. The cases apply to both definite and indefinite nouns, and there are numerous instances of syncretism. The equivalent of a genitive is formed by using the linking particles i/e/të/së with the dative.
The following shows the declension of the masculine noun mal (mountain):
|Case||Indefinite Singular||Indefinite Plural||Definite Singular||Definite Plural|
|Nominative||mal (mountain)||male (mountains)||mali (the mountain)||malet (the mountains)|
|Genitive||i/e/të/së mali||i/e/të/së maleve||i/e/të/së malit||i/e/të/së maleve|
The following table shows the declension of the feminine noun vajzë (girl):
|Case||Indefinite Singular||Indefinite Plural||Definite Singular||Definite Plural|
|Nominative||vajzë (girl)||vajza (girls)||vajza (the girl)||vajzat (the girls)|
|Genitive||i/e/të/së vajze||i/e/të/së vajzave||i/e/të/së vajzës||i/e/të/së vajzave|
- The definite article can be in the form of noun suffixes, which vary with gender and case.
- For example in singular nominative, masculine nouns add -i or -u:
- mal (mountain) / mali (the mountain);
- libër (book) / libri (the book);
- zog (bird) / zogu (the bird).
- Feminine nouns take the suffix -(j)a:
- veturë (car) / vetura (the car);
- shtëpi (house) / shtëpia (the house);
- lule (flower) / lulja (the flower).
- Neuter nouns take -t.
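The suffix patterns listed above can be sketched as a small Python function. This is only an illustration that reproduces the six example nouns: the function name, the rule that stems ending in k, g or plain h take -u, and the dropping of ë in final -ër are simplifying assumptions rather than a full account of Albanian morphology, and neuter nouns are not modelled.

```python
# Illustrative sketch: definite nominative singular for the example nouns above.
# The rules below are simplifying assumptions, not a complete grammar.

# Digraphs that end in the letter h but are not the consonant h itself.
H_DIGRAPHS = ("dh", "sh", "th", "xh", "zh")


def takes_u(stem: str) -> bool:
    """Assumption: masculine stems ending in k, g or plain h take -u."""
    if stem.endswith(("k", "g")):
        return True
    return stem.endswith("h") and not stem.endswith(H_DIGRAPHS)


def definite_nominative_singular(noun: str, gender: str) -> str:
    """Attach the suffixed definite article (nominative singular)."""
    if gender == "masculine":
        # Assumption: final -ër loses its ë before the suffix (libër -> libri).
        stem = noun[:-2] + "r" if noun.endswith("ër") else noun
        return stem + ("u" if takes_u(stem) else "i")
    if gender == "feminine":
        if noun.endswith("ë"):
            return noun[:-1] + "a"   # veturë -> vetura
        if noun.endswith("e"):
            return noun[:-1] + "ja"  # lule -> lulja
        return noun + "a"            # shtëpi -> shtëpia
    raise ValueError("only masculine and feminine nouns are sketched here")


for noun, gender in [("mal", "masculine"), ("libër", "masculine"),
                     ("zog", "masculine"), ("veturë", "feminine"),
                     ("shtëpi", "feminine"), ("lule", "feminine")]:
    print(noun, "->", definite_nominative_singular(noun, gender))
```

Nouns outside these patterns would need a lexicon of exceptions, which this sketch deliberately leaves out.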
Albanian has developed an analytical verbal structure in place of the earlier synthetic system inherited from Proto-Indo-European. Its complex system of moods (6 types) and tenses (3 simple and 5 complex constructions) is distinctive among Balkan languages. There are two general types of conjugation. The constituent order is subject-verb-object, and negation is expressed by the particles nuk or s' in front of the verb, for example:
- Toni nuk flet anglisht "Tony doesn't speak English";
- s'e di "I don't know".
In imperative sentences, the particle mos is used:
- mos harro "do not forget!".
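The placement of the three negative particles can be illustrated with a short Python sketch. The function name and its parameters are hypothetical and exist only for this example; the sketch simply puts the particle in front of the verb phrase, as in the examples above, and does not model when speakers prefer nuk over s'.

```python
# Illustrative sketch: where the negative particles go relative to the verb.
# Names and parameters are hypothetical, chosen only for this example.

def negate(verb_phrase: str, *, imperative: bool = False,
           short_form: bool = False) -> str:
    """Return the verb phrase with a negative particle placed in front of it."""
    if imperative:
        return f"mos {verb_phrase}"   # imperative sentences use mos
    if short_form:
        return f"s'{verb_phrase}"     # s' attaches directly to the next word
    return f"nuk {verb_phrase}"       # default particle


print("Toni " + negate("flet anglisht"))
print(negate("e di", short_form=True))
print(negate("harro", imperative=True))
```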
There are Albanian words that have cognates (of non-Latin origin) in Romanian, and there is a theory that Dacian, the language spoken by the Dacians before Romanisation, was related to Proto-Albanian.
The attested Illyrian vocabulary is very small and consists mainly of personal names and place names. Only a few Illyrian words are cited in Classical sources by Roman or Greek writers:
- brisa, "husk of grapes"; cf. Alb bërsi
- mantía, "bramblebush"; cf. Alb (Tosk) mën "mulberry bush", (Gheg) mandë
- oseriates, "lakes"; akin to Old Church Slavonic ozero (Serb-Croat jezero), Lith ẽžeras, OPruss assaran, Gk Akéroun "river in the underworld"
- rhinos, "fog, cloud"; cf. OAlb ren, mod. Alb re "cloud"
- sabaia, sabaium, sabaius, "a type of beer"; akin to Eng sap, Lat. sapere "to taste", Skt sabar "sap, juice, nectar", Avest. višāpa "having poisonous juices", Arm ham, Greek apalós "tender, delicate", Old Church Slavonic sveptǔ "bee's honey"
- sibina (Lat.), sibyna (Lat.), sybina (Lat.); σιβυνη (Gk.), σιβυνης (Gk.), συβινη (Gk.), ζιβυνη (Gk.): "a hunting spear", more generally "a spear", "pike"; an Illyrian word according to Festus, citing Ennius; it is compared to συβηνη (Gk.), "flute case", a word found in Aristophanes' Thesmophoriazusai, where it appears in the context of a barbarian speaking. Akin to Farsi zôpîn, Arm səvīn "spit"
- tertigio, "merchant"; akin to Alb tregë "market", Old Church Slavonic trĭgĭ (Serb-Croat trg), Lith tirgus
Some additional words have been extracted by linguists from toponyms, hydronyms, anthroponyms, etc.:
- lugo, "a pool"
- teuta, from the Illyrian personal name Teuta < PIE *teuta-, "people"
- Bosona, "running water" (possible origin of the name "Bosnia"; in Bosnian, Bosna)
The earliest accepted documentation of the Albanian language is from the 15th century AD, although claims have recently been made that documents dating from the late 12th century have been found, one in the Vatican Library and another in the Athos Monastery in Greece. Church documents in Latin also contain passages mentioning "la Lingua Albanesca" in the 12th century. This is the time when Albanian principalities begin to be mentioned and to expand inside and outside the Byzantine Empire. It is assumed that Greek and Balkan Latin (the ancestor of Romanian and other Balkan Romance languages) exerted a great influence on Albanian. Examples of words borrowed from Latin: qytet < civitas (city), qiell < caelum (sky), mik < amicus (friend). Note, however, that the Illyrian god of friendship is said to have been called "Mikon". The Illyrian goddess of hunting, "Zana", "Thana", or "Dhiana", reflected in the Roman deity Diana, is always pictured with a goat; "Dhiana" has been read as "Lady of the Goats" in Albanian.
After the Slavs arrived in the Balkans, the Slavic languages, especially Bulgarian, became another source of Albanian vocabulary. The rise of the Ottoman Empire brought an influx of Turkish words, and with them Persian and Arabic words borrowed through Turkish. Surprisingly, the Persian words seem to have been absorbed the most. Some loanwords from Modern Greek also exist, especially in the south of Albania. Many of the loanwords have since been replaced by words with Albanian roots or by modern Latinized (international) words. The casual use of loanwords is steadily declining; it is regarded as "villagers' talk" and is made fun of.
Even though many foreign influences have come and gone, Albanian has survived and forms its own branch of the Indo-European family tree. Its structure marks it as an old language, and linguists today remain divided over whether it descends from an Illyrian dialect.
- Full article: Albanian alphabet
Albanian has been written with many different alphabets since the 15th century. Originally, the Tosk dialect was written with the Greek alphabet and the Gheg dialect was written with the Latin alphabet. They have both also been written with the Ottoman Turkish version of the Arabic alphabet, the Cyrillic alphabet, and some local alphabets.
The Albanian language has been variously linked to Illyrian and Messapian, which were probably related to each other. Only the latter has left any evidence, and only to a small extent, that may liken it to Albanian. Consider the Messapic words bilia (Alb bijë "daughter"), brendon "deer" (Alb bri, brî "horn", pl. brirë, brinë), klaohi "listen!" (Alb quaj, quej "to call, give a name"), kos (Alb kush "who"), veinan (Alb vehte "self"), venas (Alb uri, û "hunger"), etc. Messapian settlements are known to have existed along the Adriatic in both Italy and Illyria, especially around Durrës. Indeed, Messapian has left several words in Italian or in neighboring Italo-Roman languages, including manzo "ox" (cf. Alb mëz, mâz "pony"), northern bagola, bagula (cf. Alb bajgë "dung"), dialectal musso "ass" (cf. Alb mushk "mule").
Even the name Albanian is disputed. It appears in the 9th c. in Greek as Arvanoi, and thereafter under similar names, including obsolete Albanian arbër or arbën; it had been presumed to stem from Vulgar Latin Albanus, from the southern Illyrian tribal name Albanoí. However, others, such as Orel, derive it instead from a slight corruption of Labëri "Laberia", from South Slavic labanĭja, from olbanĭja. The name Tosk, Alb toskë, was borrowed from Venetian tosko "rough, crude", literally "Tuscan".
The question of a homeland for the Albanians is all the more problematic. Despite Albanian nationalist claims to the contrary, the Albanians almost certainly came from farther north and inland than the present borders of Albania would suggest. First, Albanian has few early Greek borrowings, most of which are from the Northwest, e.g. WGk (Doric) mākhaná gave Alb mokër "mill" and WGk drápanon gave Alb drapër "sickle". Indeed, the very word for Greek, gërk, was borrowed from South Slavic; cf. Bulg. grŭk, Serb-Croat grk. Second, the Illyrian coast is not a likely source, since Albanian has no inherited nautical or indigenous seafaring terminology and has instead filled this gap with later borrowings from Latin or Greek or with recent metaphorical lexical creations. Third, toponyms along the coast, in contrast with the native penultimate accent (e.g. mbësë "niece" < PA nepô'tia), often show substratal antepenultimate accent (e.g. Durrës < Dúrrhachium; Pojanë < Apóllonia), though there are some exceptions (Vlorë < Aulónâ vs. Greek Aúlon). In addition, Albanian is believed to be the source of a number of grammatical and lexical similarities shared by otherwise dissimilar languages, including Romanian, Bulgarian, Serbo-Croatian, and to some extent Greek. There is also a lack of Proto-Albanian place names in Illyria. Likewise, the word shqa, from Lat Sclavus "Slav", refers only to Bulgarians.
Instead, given the overwhelming amount of shepherding and mountaineering vocabulary, as well as the extensive influence of Latin, it is more likely that the Albanians came from north of the Jireček line, on the Latin-speaking side, perhaps from the late Roman province of Dardania in the western Balkans. The Northern Albanian Alps are referred to as Bjeshkët e Namena, and this region's name is believed by some to come from Proto-Albanian beškai tâi, giving Alb bjeshkë "mountain", borrowed ultimately from Vulgar Latin pastica "pasture".
Yet one area in the late Roman province of Praevalitana (modern northern Albania) appears to be where a primarily shepherding, transhumant population of Illyrians retained their culture. This area was centred on the Mat district and the high mountains of Northern Albania, as well as Dukagjin, Mirditë, and the mountains of the Drin, from where the population would descend in the summer to the lowlands of western Albania, the Black Drin (Drin i zi) river valley, and parts of Old Serbia. Indeed, the region's complete lack of Latin place names seems to imply little Latinization of any kind and makes it a more likely spot for the origin of Albanian.
The period in which Proto-Albanian and Latin interacted was protracted, lasting some six centuries, from the 1st c. AD to the 6th or 7th c. AD. This is borne out by roughly three layers of borrowings, the largest number belonging to the second or middle layer. The first, with the fewest borrowings, was a time of less important interaction. The final period, probably preceding the Slavic or Germanic invasions, also has a notably smaller number of borrowings. Each layer is characterized by a different treatment of most vowels, the first layer having several that follow the evolution of Early Proto-Albanian into Albanian; later layers reflect vowel changes endemic to Late Latin and presumably Proto-Romance. Other formative changes include the syncretism of several noun case endings, especially in the plural, as well as large-scale palatalization.
This was followed by a period, from the 7th c. AD to the 9th c. AD, in which Slavic borrowings were most common, some of which predate the o-a shift in Southern Slavic, though evidently fewer than Romanian had taken in. Following this period was a stage of protracted contact with the Proto-Romanians, though the borrowing seems to have been mostly one-sided, from Albanian into Romanian. This indicates that the Romanians interacted longer with the Slavs and then moved into an area with a majority of Albanian speakers, which would presumably explain the one-way borrowing. This places the Albanians in the Western or Central Balkans, probably in the center, and the Romanians further to the East, perhaps close to the Bulgarians. Indeed, the best match for the Slavic cognates borrowed into Romanian is Middle Bulgarian.
Combined with archaeology and history, it seems likely that the core of Albanian territory lay in a quadrilateral with vertices at Bar, Prizren, Ohrid, and Vlorë during the Middle Ages. Indeed, the center of the Albanians remained the river Mat, and in 1079 AD they are recorded in the territory between Ohrid and Thessalonika as well as in Epirus; Albanian place names from a large portion of Macedonia and parts of Serbia indicate former Albanian territories.
Furthermore, the major Tosk-Gheg dialect division is based on the course of the Shkumbin River, a seasonal stream that lay near the old Via Egnatia. Since rhotacism postdates the dialect division, it is reasonable that the major dialect division occurred after the Christianization of the Roman Empire (4th c. AD) and before the eclipse of the East-West land-based trade route by Venetian seapower (10th c. AD).
References to the existence of Albanian as a distinct language survive from the 1300s, but without recording any specific words. The oldest surviving documents written in Albanian are the "Formula e Pagëzimit" (baptismal formula), "Un'te paghesont' pr'emenit t'Atit e t'Birit e t'Spirit Senit." ("I baptize thee in the name of the Father, and the Son, and the Holy Spirit"), recorded in the Gheg dialect by Pal Engjelli, Bishop of Durrës, in 1462, and some New Testament verses from that period.
The oldest known Albanian printed book, Meshari, or missal, was written by Gjon Buzuku, a Roman Catholic cleric, in 1555. The first Albanian school is believed to have been opened by Franciscans in 1638 in Pdhanë. In 1635, Frang Bardhi wrote the first Latin-Albanian dictionary.