Albanian is an Indo-European language spoken by nearly 6 million people,[1] primarily in Albania and Kosovo but also in other areas of the Balkans with an Albanian population, including the west of the Republic of Macedonia, Montenegro, and southern Serbia. Albanian is also spoken by communities in Greece, along the eastern coast of southern Italy, and on the island of Sicily. Speakers of Albanian can also be found elsewhere in the latter two countries as a result of a modern diaspora from the Balkans, which also extends to Scandinavia, Switzerland, Germany, the United Kingdom, Turkey, Australia, New Zealand, Canada and the United States. The diaspora is estimated at about 3 million Albanians, concentrated mostly in Western Europe and North America.
Albanian was proven to be an Indo-European language in 1854 by the philologist Franz Bopp. It constitutes its own branch of the Indo-European language family.
In attempts to establish more distant relationships, Albanian is often compared to Balto-Slavic on the one hand and Germanic on the other, both of which share a number of isoglosses with Albanian. Moreover, Albanian has undergone a vowel shift in which stressed long o has fallen to a, much as in the former and the opposite of the latter. Likewise, Albanian has taken the old relative jos and innovatively used it exclusively to qualify adjectives, much as Balto-Slavic has used this word to provide the definite ending of adjectives.
Other linguists link Albanian with Greek and Armenian, while placing Germanic and Balto-Slavic in another branch of Indo-European.[5][6][7]
Albanian is spoken by nearly 6 million people,[1] mainly in Albania, Kosovo, Italy (Arbëreshë), the Republic of Macedonia, Montenegro, Greece (Arvanites or Arvanitians), Turkey, Bulgaria and Romania, and by immigrant communities in many countries such as Belgium, Egypt, Germany, Greece, Italy, Sweden, Turkey (Europe), Russia, Ukraine, the United Kingdom, Canada, the United States, and Australia.
Official status
Albanian, in a revised form of the Tosk dialect, is the official language of Albania and Kosovo, and is official in those municipalities of the Republic of Macedonia where ethnic Albanians make up more than 20% of the inhabitants. It is also an official language of Montenegro, where it is spoken in the municipalities with ethnic Albanian populations.
Dialects
Albanian can be divided into two main dialects, Gheg and Tosk, with a transitional dialect zone between them.[8]
The Shkumbin river is roughly the dividing line, with Gheg spoken north of the Shkumbin and Tosk south of it. The Gheg literary language has been documented since 1462. Until the Communists took power in Albania, the standard was based on Gheg. Although the literary versions of Tosk and Gheg are mutually intelligible, many of the regional dialects are not.
Gheg is divided into four sub-dialects: Northwest Gheg, Northeast Gheg, Central Gheg, and Southern Gheg. Northwest Gheg is spoken in all of Montenegro, Lezhë, Mirditë, Pukë and Shkodër. Northeast Gheg is spoken in all of Kosovo, Has, Kukës and Tropojë. Central Gheg is spoken in Debar, Gostivar, Krujë, Peshkopi, Mat, Struga and Tetovo. Southern Gheg is spoken in Durrës, Elbasan, Kavajë and Tirana.
The transitional dialects are spoken in Cërrik, Dumresë, Polisit, Lushnjë, Rajcë, Shpatit, Sulovë and Vërçës. They have features of both Tosk and Gheg, including the rhotacism of Tosk and the nasal vowels of Gheg.
Tosk is divided into five sub-dialects: Northern Tosk, Labërisht, Çam, Arvanitika and Arbërisht. Northern Tosk is spoken in Berat, Fier, Gramsh, Kolonjë, Korçë, Ohrid, Pogradec, Prespa and northern Vlorë. Labërisht is spoken in southern Vlorë, Dukat, Tepelenë, Himarë, Mallakastër, Përmet, Delvinë, Gjirokastër and Sarandë. Çam is spoken in extreme southern Albania, such as Xarrë, and in northern Greece. Arvanitika is spoken in southern Greece by the Arvanites in Joanina, Paramithia, Filat, Margarit, Arta, Preveza, Kastoria, Florina, Parga. Arbërisht is spoken by the Arbëreshë, descendants of 15th- and 16th-century immigrants in southeastern Italy, in small communities in the regions of Sicily, Calabria, Basilicata, Campania, Molise, Abruzzi, and Puglia. Tosk sub-dialects are spoken by most members of the large Albanian immigrant communities of Egypt, Turkey, Ukraine, and all of Europe.
Gheg and Tosk differ mainly by:
rhotacism - Gheg has n where Tosk has r
late Proto-Albanian ? + tautosyllabic nasal > Gheg low-central or low-back vowel; > Tosk mid-central, or low-front-to-central vowel
Proto-Albanian ? > uo > Gheg vo, Tosk va
infinitival use of verbal adjective preceded in Gheg by me and in Tosk by për të
difference in lexemes, noun plurals, suppletion of the aorist system of the verb
Subdialects may vary based on:
retention or loss of final schwa (-ë)
devoicing of final voiced segments
treatment of intervocalic and final nj
treatment of clusters of nasal + voiced stop
development of anaptyctic homorganic stops after nasals that follow a stressed vowel and precede unstressed -ël or -ër
treatment of vowel clusters ie, ye, and ua
treatment of stressed /e/ before a nasal
Notable phonological and lexicological differences between Tosk and Gheg
Standard form | Tosk form | Gheg form | Translation
Shqipëri | Shqipëri | Shqypni/Shipni/Shqipni | Albania
një | një | nji/njâ/njo | a/one
nëntë | nëntë/nëndë | nândë/nant/non | nine
është | është | âsht/osht/â | is
bëj | bëj | bâj/boj | do
emër | emër | êmën | name
pjekuri | pjekuri | pjekuni | maturity
gjendje | gjëndje | gjêndje | situation
zog | zok | zog | bird
mbret | mbret | mret | king
për të punuar | për të punuar | me punue/me punau | to work
rërë | rërë | rânë/zall | sand
qenë | qënë | kênë/kânë | been (part.)
dëllinjë | enjë | bërshê | juniper
baltë | baltë | bâltë/lloç | mud
cimbidh | mashë | danë | tongs
sy | sy/si | sy/sö | eye
The circumflex ( ˆ ) denotes nasal vowels, which are a common feature of Gheg.
Sounds
Standard Albanian has 7 vowels and 29 consonants. Gheg uses long and nasal vowels, which are absent in Tosk. Another peculiarity is the mid-central vowel "ë", which is reduced at the end of the word. The stress is fixed mainly on the penultimate syllable.
The palatal stops q and gj have no English equivalent, so the pronunciation guide is approximate. Palatal stops can be found in other languages, for example in Hungarian (where these sounds are spelled ty and gy respectively).
The palatal nasal nj corresponds to the sound of the Spanish ñ or the French or Italian digraph gn (as in gnocchi). It is pronounced as one sound, not a nasal plus a glide.
The ll sound is a velarised lateral, close to English dark L.
The contrast between flapped r and trilled rr is the same as in Spanish. English does not have either of the two sounds phonemically (but tt in butter is pronounced as a flap r in most American dialects).
The letter ç can be spelled ch on American English keyboards, on account of its English sound. (Usually, however, it is spelled simply c or, more rarely, q, which may cause confusion; however, the meaning is usually understood.)
The definite article can be in the form of noun suffixes, which vary with gender and case.
For example, in the nominative singular, masculine nouns add -i, while those ending in -g or -k take -u (to avoid palatalization):
mal (mountain) / mali (the mountain);
libër (book) / libri (the book);
zog (bird) / zogu (the bird).
Feminine nouns take the suffix -(j)a:
veturë (car) / vetura (the car);
shtëpi (house) / shtëpia (the house);
lule (flower) / lulja (the flower).
Neuter nouns take -t.
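The suffix rules above can be sketched in a few lines of Python. This is only an illustrative heuristic, not a full model of Albanian noun morphology: the function name `definite_nominative_singular` and the gender codes are hypothetical, and the handling of final -ë, final -e and the dropped ë in -ër is inferred solely from the examples given here (libër/libri, veturë/vetura, lule/lulja).

```python
def definite_nominative_singular(noun: str, gender: str) -> str:
    """Return the definite nominative singular of a noun.

    gender is "m", "f" or "n" (hypothetical codes for this sketch).
    The rules cover only the patterns shown in the examples above.
    """
    if gender == "m":
        # Masculine nouns ending in -g or -k take -u instead of -i.
        if noun.endswith(("g", "k")):
            return noun + "u"
        # Assumption inferred from libër -> libri: the ë of final -ër drops.
        if noun.endswith("ër"):
            return noun[:-2] + "ri"
        return noun + "i"
    if gender == "f":
        # Assumption inferred from veturë -> vetura: final -ë is replaced by -a.
        if noun.endswith("ë"):
            return noun[:-1] + "a"
        # Assumption inferred from lule -> lulja: final -e becomes -ja.
        if noun.endswith("e"):
            return noun[:-1] + "ja"
        # Otherwise the suffix is simply appended (shtëpi -> shtëpia).
        return noun + "a"
    if gender == "n":
        # Neuter nouns take -t.
        return noun + "t"
    raise ValueError(f"unknown gender code: {gender!r}")


if __name__ == "__main__":
    for word, g in [("mal", "m"), ("libër", "m"), ("zog", "m"),
                    ("veturë", "f"), ("shtëpi", "f"), ("lule", "f")]:
        print(word, "->", definite_nominative_singular(word, g))
```

Real nouns show further patterns (other cases, plurals, irregular stems) that this sketch deliberately ignores.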
Albanian has developed an analytical verbal structure in place of the earlier synthetic system inherited from Proto-Indo-European. Its complex system of moods (6 types) and tenses (3 simple and 5 complex constructions) is distinctive among Balkan languages. There are two general types of conjugation. In Albanian the constituent order is subject-verb-object, and negation is expressed by the particles nuk or s' in front of the verb, for example:
Toni nuk flet anglisht "Tony does not speak English";
Toni s'flet anglisht "Tony doesn't speak English";
Nuk e di "I do not know";
S'e di "I don't know".
In imperative sentences, the particle mos is used:
Mos harro "do not forget!".
However, with verbs in the non-active form (forma joveprore), the verb is often in sentence-initial position:
Parashikohet një ndërprerje "An interruption is anticipated".
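The placement of the negative particles described above can be illustrated with a small Python sketch. The function `negate` and its parameters are hypothetical names introduced for illustration; the sketch simply puts nuk, s' or mos in front of the verb phrase, as in the examples, and makes no attempt to analyse the verb itself.

```python
def negate(verb_phrase: str, subject: str = "", particle: str = "nuk",
           imperative: bool = False) -> str:
    """Build a negated clause by placing a particle before the verb phrase.

    particle is "nuk" or "s" for declarative clauses; imperatives always
    use "mos". The verb phrase is taken as given (no conjugation is done).
    """
    if imperative:
        words = ["mos", verb_phrase]
    elif particle == "nuk":
        words = [subject, "nuk", verb_phrase]
    elif particle == "s":
        # s' attaches directly to the following word.
        words = [subject, "s'" + verb_phrase]
    else:
        raise ValueError(f"unknown particle: {particle!r}")
    sentence = " ".join(w for w in words if w)
    # Capitalize the first letter of the clause.
    return sentence[0].upper() + sentence[1:]


if __name__ == "__main__":
    print(negate("flet anglisht", subject="Toni"))
    print(negate("flet anglisht", subject="Toni", particle="s"))
    print(negate("e di"))
    print(negate("e di", particle="s"))
    print(negate("harro", imperative=True))
```

Run as a script, this reproduces the five example clauses given above.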
Some Gothic loanwords were borrowed through Late Latin, while others came from the Ostrogothic expansion into parts of Praevalitana around Nakšić and the Gulf of Kotor in Montenegro.
shkulkë "branch indicating a pasture is off limits" < Goth skulka "guardian"
shkumë "foam" < Goth scūma
tirq "trousers" < Late Latin tubrucus < Goth *þiobroc "knee-britches"; cf. OHG dioh-bruoh
The earliest accepted document in the Albanian language is from the 15th century AD.
The earliest reference to a Lingua Albanesca is from a 1285 document of Ragusa. This is a time when Albanian Principalities start to be mentioned and expand inside and outside the Byzantine Empire. It is assumed that Greek and Balkan Latin (the ancestor of Romanian and other Balkan Romance languages) exerted a great influence on Albanian. Examples of words borrowed from Latin: qytet < civitas (city), qiell < caelum (sky), mik < amicus (friend).
After the Slavs arrived in the Balkans, the Slavic languages, especially Bulgarian, became another source of Albanian vocabulary. The rise of the Ottoman Empire meant an influx of Turkish words; this also entailed the borrowing of Persian and Arabic words through Turkish. Surprisingly, the Persian words seem to have been absorbed the most. Some loanwords from Modern Greek also exist, especially in the south of Albania. Many of the loanwords have since been replaced by words of Albanian origin or by modern Latinized (international) words.
Albanian has been written using many different alphabets since the 15th century. The earliest written Albanian records come from the Gheg area in makeshift spellings based on Italian or Greek and sometimes in Turko-Arabic characters. Originally, the Tosk dialect was written in the Greek alphabet and the Gheg dialect was written in the Latin alphabet. They have both also been written in the Ottoman Turkish version of the Arabic alphabet, the Cyrillic alphabet, and some local alphabets.
In 1908 an official, standardized Albanian spelling was developed, based on a Gheg dialect and using the Latin alphabet with the addition of the letters ë, ç, and nine digraphs. After World War II the official language changed in that it adopted the Tosk dialect as its model.
History
Linguistic affinities
The Albanian language is a distinct Indo-European language that does not belong to any other existing branch. Sharing lexical isoglosses with Greek, Balto-Slavic, and Germanic, the word stock of Albanian is quite distinct. Hastily tied to Germanic and Balto-Slavic by the merger of PIE *? and *? into *? in a supposed "northern group",[11] Albanian has proven to be distinct from the other two groups as this vowel shift is only part of a larger push chain that affected all long vowels.[12] Albanian does share with Balto-Slavic two features: a lengthening of syllabic consonants before voiced obstruents and a distinct treatment of long syllables ending in a sonorant.[13] However, Albanian is best known for its singular conservatism, having retained the distinction between active and middle voice, present and aorist, three series of tectal consonants before front vowels (e.g., palatals, velars, and labio-velars), and initial PIE *h4 as an h.[14]
The origin of the ethnonym Albanian is a matter of some dispute. It appears for the first time in the 2nd c. AD in Late Greek as Albanoí (later Byz Gk Arbanitai) and thereafter in similar forms, including obsolete Albanian arbër/arbën "Albanian"; however, these last two stem directly from Vulgar Latin *Albanus, most likely borrowed from Greek Albanoí; the adjectives too, arbëresh/arbënesh, are derived from Latin albanensis. This same name appears in Slavic and was used to name the town of Labëri "Laberia", from South Slavic laban?ja, from earlier *olban?ja.
While it is considered established that the Albanians originated in the Balkans, the exact location from which they spread out is hard to pinpoint. Despite varied claims, it seems that the Albanians came from slightly farther north (Kosovo) and inland (Northwest Skopje) than the present borders of Albania would suggest, with a homeland concentrated in the mountains. The purely linguistic reasons are listed below.
The Jireček Line divides the areas of the Balkans which were under Latin (North) and Greek (South) influence.
First, Albanian has few early Greek borrowings, most of which are from the Northwest dialect, probably via the islands off the coast of Albania, e.g. WGk (Doric) mākhaná gave Alb mokër "mill" and WGk drápanon gave Alb drapër "sickle".
Second, the Illyrian coast is not a likely source, since Albanian has no inherited nautical or indigenous seafaring terminology and has instead made up for this absence with later borrowings from Latin or Greek or with recent metaphorical lexical creations.
Third, toponyms along the coast, in contrast with native penultimate accent (e.g. mbësë "niece" < PA nep?'tia), often show substratal antepenultimate accent (e.g. Durrës < Dúrrhachium; Pojanë < Apóllonia), though there are some exceptions (Vlorë < Aulón? vs. Greek Aúlon).
Also, some consider Albanian to be the source for a small number of grammatical and lexical similarities shared by otherwise dissimilar languages including Romanian, Bulgarian, Serbo-Croatian, and to some extent Greek. Based on their extent of grammaticalization, these include: the postposition of articles, the presence and grammatical use of schwa, object reduplication, admirative through verbal constructions, and the loss of infinitives.
Finally, few if any Proto-Albanian place names exist in what was the former Roman province of Illyria.
Instead, given the overwhelming amount of shepherding and mountaineering vocabulary as well as the extensive influence of Latin, it is more likely that the Albanians come from north of the Jireček Line, on the Latin-speaking side, perhaps in part from the late Roman province of Dardania in the western Balkans. However, archaeology has more convincingly pointed to the early Byzantine province of Praevalitana (modern northern Albania), an area where a primarily shepherding, transhumant population of Illyrians retained their culture. This area was based in the Mat district and the region of high mountains in Northern Albania, as well as in Dukagjin, Mirditë, and the mountains of Drin, from where the population would descend in the summer to the lowlands of western Albania, the Black Drin (Drin i zi) river valley, and into parts of Old Serbia. Indeed, the region's complete lack of Latin place names seems to imply little latinization of any kind and makes it a more likely spot for the early medieval heart of Albanian territory, following the collapse of the Illyrian province.
Linguistic influences
The period during which Proto-Albanian and Latin interacted was protracted, extending over six centuries from the 1st c. AD to the 6th or 7th c. AD. This is borne out by roughly three layers of borrowings, the largest number belonging to the second layer. The first, with the fewest borrowings, was a time of less important interaction. The final period, probably preceding the Slavic or Germanic invasions, also has a notably smaller number of borrowings. Each layer is characterized by a different treatment of most vowels, the first layer having several that follow the evolution of Early Proto-Albanian into Albanian; later layers reflect vowel changes endemic to Late Latin and presumably Proto-Romance. Other formative changes include the syncretism of several noun case endings, especially in the plural, as well as a large-scale palatalization.
A brief period followed, between 7th c. AD and 9th c. AD, that was marked by heavy borrowings from Southern Slavic, some of which predate the "o-a" shift common to the modern forms of this language group. Starting in the latter 9th c. AD, a period followed characterized by protracted contact with the Proto-Romanians, or Vlachs, though lexical borrowing seems to have been mostly one sided - from Albanian into Romanian. Such borrowing indicates that the Romanians migrated from an area where the majority was Slavic (i.e. Middle Bulgarian) to an area with a majority of Albanian speakers, i.e. Dardania, where Vlachs are recorded in the 10th c. AD. Their movement is probably related to the expansion of the Bulgarian empire into Albania around that time. This fact places the Albanians at a rather early date in the western or central Balkans.
Historical considerations
Indeed, the center of the Albanians remained the river Mat, and in 1079 AD they are recorded in the territory between Ohrid and Thessalonika as well as in Epirus.
Furthermore, the major Tosk-Gheg dialect division is based on the course of the Shkumbin River, a seasonal stream that lay near the old Via Egnatia. Since rhotacism postdates the dialect division, it is reasonable that the major dialect division occurred after the Christianization of the Roman Empire (4th c. AD) and before the eclipse of the East-West land-based trade route by Venetian seapower (10th c. AD).
References to the existence of Albanian as a distinct language survive from the 1300s, but without recording any specific words. The oldest surviving documents written in Albanian are the "Formula e Pagëzimit" (Baptismal formula), "Un'te paghesont' pr'emenit t'Atit e t'Birit e t'Spirit Senit." (I baptize thee in the name of the Father, and the Son, and the Holy Spirit), recorded by Pal Engjelli, Bishop of Durrës, in 1462 in the Gheg dialect, and some New Testament verses from that period.
The oldest known Albanian printed book, Meshari or missal, was written by Gjon Buzuku, a Roman Catholic cleric, in 1555. The first Albanian school is believed to have been opened by Franciscans in 1638 in Pdhanë. In 1635, Frang Bardhi wrote the first Latin-Albanian dictionary.