With the events in Iran today, people are more and more exposed to the Persian language, in signs and speeches.  Some people might, therefore, be interested in learning more about the language (or even in learning the language itself).  I thought those people might be interested in an introductory essay about the language.

The name of the language

The common name of the language in English is "Persian", after Persia, an older name for the country of Irân.  Very often you will see the language called Fârsi, however.

Fârsi is the name given to the language by the Persian-speaking people of Irân.  However, Persian is not only spoken in Irân; it is also spoken in Afghânistân (where it is called Dari) and in Tâjikistân (where it is called Tâjiki).  The differences are comparable to the differences between Australian, British, Irish, and American English -- they are all the same language, with some differences in pronunciation and usage.

Fârsi is an adjective formed from the name of the province of Fârs -- formerly Pârsa -- which is the part of Irân that lies along the coast of the Persian Gulf (khalij-e Fârs), including the city of Shirâz, famous for its poets.  In classical times Fârs was called Persis (a modification of Pârsa), and was famous as the region from which both Cyrus the Great, founder of the first Persian Empire (550-330 BCE) and Ardashir, founder of the second Persian Empire (226-651 CE) originated -- hence the entire country came to be called Persia, after the province of Persis.  Fârsi and Persian are therefore, for most purposes, equivalent terms.

The role of the language today

Persian is the most widely spoken language in Irân today: it is used in TV broadcasts, newspapers, in public debates and speeches, and of course online.  More than half of all Iranians use it as a native language; it is a second language for almost everybody else in Irân.  There is probably nowhere in Irân where you cannot find someone who will understand you if you speak Persian.  There are, however, significant areas where other languages are spoken: Azeri Turkish and Kurdish in the northwest, Baluchi in the southeast, Arabic in regions bordering southern Iraq, and various other languages in different pockets and corners of the country.  Even in these regions, however, there will be a significant Persian-speaking population.

History of the language
Persian is an Indo-European language, that is, it is part of one of the largest language superfamilies on Earth, one which includes English, Russian, Spanish, Gaelic, Greek, Lithuanian, Albanian, Armenian, and Hindi.  Because Persian is related to all of these languages, many of its words will seem very familiar to an English-speaker.  For instance, the Persian word for "mother" is mâdar and the word for "brother" is barâdar.  These resemblances are not for the most part accidental (a few of them are, like the resemblance between English "bad" and the Persian word bad, which means the same thing) -- they mostly arise from the fact that English and Persian have a common ancestor, and in some cases the words have not changed very much over the intervening millennia since the languages began to drift apart.

The first speakers of an Indo-European language (called Proto-Indo-European) are thought to have lived some five or six thousand years ago in the grasslands north of the Black Sea.  They were totally illiterate, so everything we know about their language is reconstructed on the basis of the surviving languages (first written down no earlier than c. 1400 BCE); but we do have some archeological evidence.  This shows that they were among the first peoples to tame and breed horses, and to attach carts and chariots to them.  This gave them a mobility that other peoples of that age could not match; and as a result, they moved very quickly into regions both to their west (including almost all the regions of Europe) and to their east.  As the various Indo-European peoples were cut off from each other by the vast distances separating them, their languages also came to differ from each other, until at last -- probably not later than 2000 BCE -- they were totally unintelligible to each other.

The Indo-European peoples who moved to the East were, in part, the ancestors of most of the peoples of Irân and Afghânistân and some of the peoples of India.  They are therefore called "Indo-Iranians".  They moved initially from the steppes to the region north of the Caspian and Aral Seas, and then down into the river valleys that in those days fed the Aral Sea.  Moving up the rivers they came into the hill country of Tâjikistân and Afghânistân, and then filtered west into Irân.  By the 16th century BCE, one group had come as far west as northern 'Irâq, where we have the earliest evidence of an Indo-Iranian language preserved in cuneiform.

Another group crossed the mountains and entered northern India and Pakistân, in what is now the Panjâb.  They spoke an early version of what we now call Sanskrit; and some of their hymns and poems have been preserved, through a careful oral tradition, down to the present day.  They lost all contact with their relatives back on the other side of the mountains, and soon became very different in language and culture: Indian, as opposed to Iranian.

Some centuries after the departure of these early Sanskrit speakers, a religious prophet arose in what is now Afghânistân, whose name was Zarathushtra.  He preached a doctrine in which two powerful forces, Good and Evil, Truth and Lies, battled each other for dominion over the world and the souls of mankind.  His teaching brought about a remarkable cultural reformation among all the Iranian peoples.  How long it took his doctrines to spread is unknown, but in the end there was no land where Iranian languages were spoken that was unaffected.  Today we call the religion he propagated "Zoroastrianism" (after a Greek alteration of his name, Zoroaster).

The earliest evidence we have for an Iranian language is in certain ancient hymns (Gâthas) attributed to Zarathushtra.  These are in a language called "Avestan", after a later name for the collection of religious writings in which they were included.  The Avestan language preserved in these hymns is remarkably similar to Sanskrit in overall structure and in much of its vocabulary, though the sounds of both languages had changed so much that it is unlikely that they were mutually intelligible.  However, the religious ideas of the hymns of the two languages could not be more different: the Sanskrit Vedas teach a polytheistic religion with a vast pantheon of devas, or deities; the Avestan Gâthas preach the worship of one god, the "Wise Lord" Ahura Mazda, and regard the daêvas (so the word appears in Avestan) as the wicked servants of Angra Mainyush, the Evil Spirit.

In these Gâthas we also see the first use of the word airya to describe the Iranian peoples.  The corresponding Sanskrit word means something like "noble" or "praiseworthy"; but in Irân it came to be used as the name of the whole people, contrasted to the anairya, the non-Iranians.  From the genitive plural airyânem ("of the Airyas") came the name Êrân (i.e., land of the Airyas), which in modern Fârsi is pronounced Irân.

About four centuries after Zarathushtra (though we have no firm historical dates to fix his time) the Iranian peoples further west began to organize themselves into powerful political entities that historians call "empires": first the Medians (centered around the area of northern Irân, including modern Tehrân), around 650 BCE, and then the Persians under Cyrus (Kurush) the Great, in 550 BCE.  Cyrus conquered the Medes, and also Mesopotamia, the Levant, and Asia Minor, forming the first "Persian Empire".

Old Persian
Up to this point the Iranian peoples had been wholly illiterate; the Gâthas of Zarathushtra were transmitted orally.  Under Cyrus, or perhaps one of his successors, however, the Persians adopted a much-simplified form of the cuneiform script, and used it for inscriptions on everything from massive monuments to tableware.  This script is essentially syllabic: it represents combinations of consonant+vowel, and though ambiguous in some ways fairly accurately represents the sounds of the language.  The Old Persian language revealed in these inscriptions is the direct ancestor of modern Persian (the related Avestan language, written down centuries later, is more like a great-granduncle).  It is, however, very different, having a complex system of declensions and conjugations, as well as a much more complex system of sounds that was greatly simplified later on.  To take one example: the Old Persian word for "king" was khshâyathya.  After many centuries, it eventually arrived at the modern Persian word, shâh -- a vast reduction!

The cuneiform writing fell out of use after the conquest of the first Persian Empire by Alexander the Great.  For over a century after Alexander, the Iranian peoples were under the control of Macedonian Greek overlords; then came the Parthian dynasty, ruled by a people from eastern Irân who spoke a language related to but distinct from Persian.  In this period literate people in the region tended to use the Aramaic language and writing to keep records.  Gradually, however, the Aramaic script was modified to allow the writing of Persian -- a Persian which, however, had changed a great deal from the Old Persian of Cyrus and Darius, and had moved far in the direction of modern Persian.

Middle Persian
This "Middle Persian" language flourished under the Sâsâni dynasty, which emerged from Persia to rule an area somewhat larger than modern Irân in 226 CE. The modified Aramaic script used to write this language is called "Pahlavi" (literally: "Parthian") and the language is sometimes called Pahlavi too.  Unfortunately, the script is often unclear, as many letters representing different sounds have the same or nearly the same shapes, resulting in a high degree of ambiguity.  The later holy books of the Zoroastrian religion are written in a form of this script.  Luckily, we have texts in Middle Persian written down in another script, this one designed by the Manichaeans (members of a religion that was founded by the prophet Mani in the 3rd century CE) which is far less ambiguous; this helps us understand the sounds of Middle Persian much better.

In its latest form, Middle Persian was not immensely different from modern Persian; many words are nearly identical. Among its archaic features are the presence of an initial "w" sound, which has since become "b" or "g" in modern Persian; e.g. wuzurg "big" (from Old Persian wazrka) has become modern Persian bozorg. Similarly, the Middle Persian "g" at the ends of words has been lost from modern Persian when it followed a vowel (the letter h is written instead, but not pronounced): e.g. frêshtag "angel", becomes modern Persian fereshteh.  This example also illustrates another characteristic of modern Persian, the tendency to split up the consonants of an initial consonant cluster by putting a vowel between them; that even happened with words borrowed much more recently from western languages, e.g. France -> Ferânseh.

There are many other differences in the quality of the vowels, but a lot of these belong to the succeeding state of the language.

New Persian
The second Persian Empire was overthrown in the mid-7th century by the Arabs of the early Caliphate.  These Arabs brought in both Islâm and their own alphabet, which, as modified and improved in succeeding centuries, became a much more effective tool for writing than the Pahlavi alphabet had been.

For a long time, however, very little literature was written in Persian; the exceptions were in the religious literature of the Zoroastrians, but they were a dwindling sect without power.  There were many highly educated Persian Muslims, but for centuries they preferred to write in Arabic, to the literature of which they contributed very significantly.

Nonetheless, Persian continued to be the dominant spoken language of the Iranian region, in this respect differing greatly from the languages of other areas where Arabic had been introduced.  In Mesopotamia and the Levant, the dominant Aramaic language was almost entirely replaced by its sister Arabic; it survives today only in a few villages.  In Egypt, where Coptic was the native language in the 7th century, Arabic was less successful at first but more so in the long run; by the end of the Middle Ages, Coptic was dead as a spoken language, only remaining in the religious writings of the Coptic Christians.

The predominance of literary Arabic in Irân continued through the Umayyad Caliphate and well into the Abbasid Caliphate, down to the 10th century.  At that time, the Abbasid Caliphate was breaking up; its furthest provinces were breaking away, and among these was the Persian-speaking province of Transoxiana (approximately in the area of modern Uzbekistân).  Here, in the mid-900s, the Sâmânids, a dynasty of Iranian origin, began to cultivate the writing of Persian poetry.  One of them called for a huge project: the writing of a vast epic which would celebrate all of Iranian history, as recorded in the surviving records of the Sâsânians and the Zoroastrian writings, from the beginning of civilization down to the end of the Sâsânid dynasty.  This was started by the poet Daqiqi, and completed by the poet Ferdowsi; and by the time it was done (CE 1010), the Sâmâni dynasty had been overthrown by the Turkish Ghaznavids.  It is called the Shâhnâmeh, and is probably the most famous of all Persian poems.

But the loss of Persian independence did not lead to an end of the Persian renaissance; in fact, the new style of Persian, written in Arabic letters, containing many Arabic words, flourished and attained a remarkable degree of literary uniformity which it maintains to this day.  Texts 1000 years old can be read by the modern speaker of Persian with only modest difficulty; contrast the complete unintelligibility of an Old English text like Beowulf (a poem of comparable age) to modern English speakers.

Persian spread even beyond Irân itself.  The Turks adopted it as a literary language, along with several Persian customs, and they brought Persian with them into Turkey and the Balkans.  Mehmet the Conqueror was quoting a Persian poet when he rode into fallen Constantinople, and there are to this day people as far away as Bosnia and Albania who have names of Persian origin.

On the other side of Irân, Persian was carried into northern India by the conquering armies of various Muslim peoples, and became the administrative language of Moghal India for several centuries.  One can find Persian words and names throughout India, and a great deal of Hindi (and even more of its sister dialect, Urdu) consists of Persian loanwords.

Modern Persian
Despite its slow rate of change, Persian has changed in pronunciation in less striking ways.  The short i and u of New Persian are now pronounced e and o; the long ê has merged with long î and likewise long ô has merged with long û (in Irân; the distinction is maintained in Afghânistân and Tâjikistân).  The diphthongs ay and aw are now pronounced ey and ow.  In colloquial speech, the sequence ân is now generally pronounced un (oon).  But on the whole, there is very little difference in pronunciation.
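
For readers who like to see rules written out, here is a small Python sketch of the vowel changes just listed (Irân pronunciation only).  The function name modernize and the convention of marking the older long vowels with a circumflex are my own assumptions for illustration, and the sample words are ones I am adding, not forms discussed above.

```python
import re

# Older New Persian sound -> modern pronunciation in Iran.
# The circumflex marks an older long vowel; modern long i and u are
# written without it, following the transliteration used in this essay.
SOUND_CHANGES = {
    "ay": "ey", "aw": "ow",   # diphthongs
    "i": "e", "u": "o",       # old short vowels
    "ê": "i", "î": "i",       # long ê merges with long î
    "ô": "u", "û": "u",       # long ô merges with long û
}

# One pass over the word, so that an output vowel is never changed twice.
_PATTERN = re.compile("ay|aw|[iuêîôû]")

def modernize(word):
    """Apply the listed changes to a word in the older pronunciation."""
    return _PATTERN.sub(lambda m: SOUND_CHANGES[m.group(0)], word)

print(modernize("buzurg"))                   # bozorg ("big")
print(modernize("shêr"), modernize("shîr"))  # shir shir ("lion", "milk")
```

The last line shows the merger at work: two words that were once distinct come out identical in the Irân pronunciation, while speakers in Afghânistân and Tâjikistân still keep them apart.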

There are, however, a host of colloquial forms and contractions of words and inflections which stand side-by-side with the literary forms: e.g. -in for -id, -e for -ad, -an for -and, migam for miguyam, miyâm for miyâyam, mirim for miravim, and so forth.   These are commonly used, but less commonly represented in writing. The more literary forms will always be understood, even if they sound a bit stiff.  So we can say that the Persian of 2009 is one and the same language as the Persian of 1009, in the same way that our English is the same language as that of Shakespeare.

Written Persian

Persian is written in the Arabic alphabet, with the addition of four letters to represent sounds which did not exist in Arabic.  All of these sounds, however, are found in English: they are p, ch (as in chess), zh (the s-sound in measure) and g (as in go).  They were placed in the Arabic alphabet after the most similar sound existing in Arabic: p after b, ch after j, zh after z, and g after k.  They are written similarly to the preceding Arabic letter, but with the addition of three dots, or (in the case of g) with a line drawn on top.  Another change from the Arabic alphabet is the sequence of the last three letters; in Arabic they go h, w, y but in Persian they go v, h, y.

The Persian alphabet therefore has 32 letters, and is written from right to left.  In the transliteration I'm using, these letters are:

'(1), b, p, t(1), s(1), j, ch, h(1), kh, d, z(1), r, z(2), zh, s(2), sh, s(3), z(3), t(2), z(4), '(2), gh, f, q, k, g, l, m, n, v, h(2), y.

You will notice something funny about this at once: there are 2 t's, 2 h's, 3 s's, and 4 z's!  And what's that apostrophe about?

The reason for there being so many letters with the same sounds is that the Persians could not pronounce Arabic properly; and when they came across a difficult sound, they substituted the closest sound they had.  As a result, it's a lot easier for a Westerner to pronounce Arabic the Persian way than it is to do it the Arabic way!

With a tiny number of exceptions, when there are multiple letters for the same sound, one of them is used in all words of Persian origin, and the rest are used in Arabic words wherever it is etymologically appropriate.  Thus t(1) is Persian, t(2) is Arabic; h(2) is Persian, h(1) is Arabic; s(2) is Persian, while s(1) and s(3) are Arabic; z(2) is Persian, and z(1,3,4) are Arabic; '(1) is Persian and '(2) is Arabic.  (All the "Persian" letters can, of course, be used in Arabic too.)  

Each of the letters has its own name, so it's easy to tell them apart when a word is spelled out; in Fârsi, the names of the letters are:

alef, beh, peh, teh, seh, jim, cheh, heh, kheh, dâl, zâl, reh, zeh, zheh, sin, shin, sâd, zâd, tâ, zâ, eyn, gheyn, feh, qâf, kâf, gâf, lâm, mim, nun, vâv, heh, yeh.

The only two letters with the same name are the two heh's: the first ح is sometimes called heh-jimi, because it looks similar to the letter jim ج, only without the dot; the second ه is called heh-do-chashm or "two-eyed heh" because in some forms it consists of two loops.
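
As a rough aid to memory, the list of names can be paired with the letters themselves.  The Python sketch below is only an illustration: the variable names are my own, and the letter shapes are the standard isolated forms as I know them, in the same order as the names above.

```python
from collections import Counter

# (name, isolated letter shape), in alphabetical order.
ALPHABET = [
    ("alef", "ا"), ("beh", "ب"), ("peh", "پ"), ("teh", "ت"),
    ("seh", "ث"), ("jim", "ج"), ("cheh", "چ"), ("heh", "ح"),
    ("kheh", "خ"), ("dâl", "د"), ("zâl", "ذ"), ("reh", "ر"),
    ("zeh", "ز"), ("zheh", "ژ"), ("sin", "س"), ("shin", "ش"),
    ("sâd", "ص"), ("zâd", "ض"), ("tâ", "ط"), ("zâ", "ظ"),
    ("eyn", "ع"), ("gheyn", "غ"), ("feh", "ف"), ("qâf", "ق"),
    ("kâf", "ک"), ("gâf", "گ"), ("lâm", "ل"), ("mim", "م"),
    ("nun", "ن"), ("vâv", "و"), ("heh", "ه"), ("yeh", "ی"),
]

# The four letters added to the Arabic alphabet for Persian sounds.
ADDED_NAMES = {"peh", "cheh", "zheh", "gâf"}
added_letters = [letter for name, letter in ALPHABET if name in ADDED_NAMES]

# Names shared by more than one letter.
name_counts = Counter(name for name, _ in ALPHABET)
shared_names = [name for name, count in name_counts.items() if count > 1]

print(len(ALPHABET))    # 32
print(added_letters)    # the letters for p, ch, zh, g
print(shared_names)     # ['heh']
```

Checking the list this way confirms the count of 32 and shows that heh is the only name that has to be disambiguated.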

Almost all the sounds of Persian can be found in English.  The apostrophe ' represents either the letter alef or the letter eyn, and, when not completely silent (as those letters are when put at the beginning of a word), is pronounced like the catch in one's throat that one can feel at the end of the "uh" in the exclamation "uh-oh!"

The only consonant sounds of Persian that don't occur in English are kh, gh, and q.  Kh is the same sound as the ch in the name of Bach, the composer, when pronounced in the German way, or the ch in the name of the Jewish holiday Chanukah, when pronounced as in Hebrew.  Gh is the same sound, but with additional voice; that is, it has the same relationship to kh as a hard g (as in go) does to k.

The q-sound was originally a kind of k pronounced very far back in the throat, but by most Iranians it is pronounced the same as gh.

The vowel sounds of Persian are almost all found in English, or close enough.  They are:

a: the sound of a in "hat"
â: the sound of aw in "law"
e: the sound of e in "pet"
i: the sound of i in "machine"
o: the sound of o in "only" (but even shorter)
u: the sound of u in "lunar"

There are also two diphthongs:
ey: the sound of ey in "hey"
ow: the sound of ow in "grow"

According to traditional grammars, the vowels a, e, and o are "short vowels", and the vowels â, i, and u are "long vowels".  This contrast has very little to do with the modern pronunciation, but two things should be noticed: one, when there is variation in the pronunciation of a word, it is usually between the sounds a, e and o (for instance, chashm "eye" is also pronounced cheshm); two, the long vowels are all written out as full letters, but the short vowels are not written at all.

â is written with the letter alef;
i is written with the letter yeh;
u is written with the letter vâv.

The diphthongs ey and ow are also written with yeh and vâv respectively.

This means that in order to read Persian, you need to memorize both the spelling and the pronunciation of a word separately.  E.g., the word bozorg ("big") is actually written bzrg -- and only experience tells you that it is pronounced bozorg and not bazarg or bezorg (neither of which exists) or something else.  However, this is mastered fairly easily; Persian texts for children often use small diacritics next to the letter to indicate whether it is followed by a, e, or o; most tools for teaching Persian to Westerners write out the word in Roman-letter transcription along with the Arabic script.  In fact you could learn to speak Persian from a Roman-letter transcription alone, though your ability to function in society would be severely compromised (as you'd be effectively illiterate).
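
The bzrg example can be turned into a small Python sketch that strips a transliterated word down to what is actually written.  This is a simplification under my own assumptions: the function name written_skeleton is invented for illustration, â is left as â to stand for a written alef, and words that begin with a vowel are not handled.

```python
SHORT_VOWELS = {"a", "e", "o"}                 # not written at all
LONG_VOWELS = {"â": "â", "i": "y", "u": "v"}   # written with alef, yeh, vâv
DIPHTHONGS = {"ey": "y", "ow": "v"}            # written with yeh, vâv

def written_skeleton(word):
    """Return the letters that remain once the short vowels are dropped."""
    out = []
    pos = 0
    while pos < len(word):
        pair = word[pos:pos + 2]
        if pair in DIPHTHONGS:
            # A diphthong is written with a single letter.
            out.append(DIPHTHONGS[pair])
            pos += 2
            continue
        ch = word[pos]
        if ch not in SHORT_VOWELS:
            # Long vowels become their letter; consonants pass through.
            out.append(LONG_VOWELS.get(ch, ch))
        pos += 1
    return "".join(out)

for candidate in ("bozorg", "bazarg", "bezorg"):
    print(candidate, "->", written_skeleton(candidate))   # all give bzrg
```

All three candidate readings collapse to the same written form, which is exactly why the reader has to know the word already in order to pronounce it.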

About learning Persian

One thing to stress to anybody who decides to learn a little Persian is this: it is an easy language.  It is far, far easier than Arabic; perhaps that's one reason that Persian became so successful at spreading all the way from the Balkans to Bangladesh, even though those areas were never under the rule of any actual Iranians!  It has some irregularities, as all languages do; but in structure and in syntax, it is quite a bit more regular than English is.

Like French, Persian has an inflected verb system, but the system is far simpler than that of French.  One only has to remember six endings (one for each person and number) and a few prefixes.  There are some irregular verb formations (comparable to English go, went, gone; eat, ate, eaten; see, saw, seen) but one only has to remember two forms for each verb (the present stem and the past stem), not three as in English.
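
To give a feel for how little there is to memorize, here is a Python sketch of the literary present and simple past built from the two stems.  The function names are my own, and I am supplying the stems rav- and raft- ("go") and the full set of six endings from my own knowledge; the sketch ignores the y that is inserted after stems ending in a vowel (as in miguyam).

```python
# Six personal endings: I, you (sg.), he/she, we, you (pl.), they.
PERSONAL_ENDINGS = ["am", "i", "ad", "im", "id", "and"]

def present(present_stem):
    """Present tense: prefix mi- + present stem + personal ending."""
    return ["mi" + present_stem + ending for ending in PERSONAL_ENDINGS]

def past(past_stem):
    """Simple past: past stem + ending; the third person singular has none."""
    forms = []
    for index, ending in enumerate(PERSONAL_ENDINGS):
        forms.append(past_stem if index == 2 else past_stem + ending)
    return forms

print(present("rav"))
# ['miravam', 'miravi', 'miravad', 'miravim', 'miravid', 'miravand']
print(past("raft"))
# ['raftam', 'rafti', 'raft', 'raftim', 'raftid', 'raftand']
```

Once the two stems of a verb are known, every form in both tenses follows from the same short list of endings.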

Persian nouns and pronouns are not inflected at all, not even to the extent that they are in English (John, John's; I, me, my; he, him, his).  There are a few suffixes used optionally to mark possession or relation, as in barâdar-am "my brother" next to barâdar-e man, literally "brother-(of)-me".  But there is very little in the grammar of Persian that is a strain on the memory; one does not have to memorize page after page of paradigms, and there are no arbitrarily varying declensions or conjugations. There isn't even any grammatical gender; u, the third person pronoun, means both "he" and "she".

The word order is simple, straightforward, and quite consistent, though different from English, being Subject-Object-Verb rather than Subject-Verb-Object.  The last word in a sentence is generally an inflected verb.

One can thus progress very quickly from beginnings to reading or speaking fairly complex sentences.  The biggest difficulty in mastering more complicated Persian communications, particularly writing, is the abundance of Arabic loanwords.  These words play the same role in Persian that words of Latin and Greek origin do in English -- they are longer, often (though not always) have a more "educated" feel, and carry with them a whole different set of rules.  Not only are there occasional plural forms as unfamiliar (and frequently optional) as radii beside radiuses, or indices beside indexes -- for example madâres "schools", plural of madraseh, or olum "sciences", plural of elm; but there are complex relations between verbal nouns, e.g. one must learn the relationship between ettehâd "union" and mottahed "united" or entekhâb "choice" and montakhab "chosen", at least for a deeper understanding of the relations between words.  But you can also just learn these as individual words, without worrying about their structure or relationships.

      are one of the best things about DK.  We have a really broad wealth of knowledge within this community.

      Had to correct someone in a previous diary who said that Israelis and Persians were alike because they were both Semitic.

      Most look at the language and think it is related to Arabic, and thus a Semitic language, simply because they use the same script.

      That way it will show up in the DK University diary series that comes out every week.

      My first mother-in-law was Persian.  My father-in-law went away to fight WWII and came home with a beautiful bride of some distinction.  So my children are 1/4 Persian.  I learned a few words in Farsi from her, and some nursery rhymes and songs that mostly had to do with the scatological functions of infants.  I can't really pull them up in my memory banks after 30+ years, although my children can.  Still, they don't know nearly enough about where their grandmother came from or the language she spoke.

      I'm fairly well versed in Arabic, but not Farsi.  Your diary is very intriguing to me.  Thank you so much for writing it.


      by Got a Grip on Thu Jun 18, 2009 at 03:05:17 PM PDT

      I'm learning Persian at the moment and am getting a ton of practice in due to recent events, and am very much looking forward to one day visiting Iran.

      The first time I heard Persian was after putting a lot of time into Turkish, and in the beginning it sounded like 20% Turkish, but spoken softer and farther back in the throat, and with words in between that I couldn't make out (because I only knew common vocab to Turkish and Persian at the time).

      Anyway, definite rec for the diary and fully agree with Jake McIntyre's comment just below yours.

  •  Languages in Iran (11+ / 0-)

    Linked is one of the better maps highlighting the linguistic diversity and cosmopolitan nature of Iran:

  •  Relationship between Vedism and Zoroastrianism (13+ / 0-)

    is more complicated than you state. Both belief systems arise out of a common Indo-European religious substrate.

    Vedism, however, is the religion of the warrior caste, with the endorsement of the religious/scholar caste - apropos for a religion whose rites and hymns were recorded by a conquering elite.

    Zoroastrianism appears to have arisen out of a religious revolt on the part of the third, farmer, class. The roles of the figures are inverted: in Vedic language, devas are gods and asuras demons. In Zoroastrian parlance, ahuras are beneficent spirits (Ahura Mazda being supreme among them), and devas are demons.

    It's a rather fascinating split, actually. The gods of the conquerors are, to the conquered, demons.

  •  very cool. allow me to add some general notes (10+ / 0-)

    Some of your readers might have wondered why the land is called Persia and the language is Farsi. We can see that these words are similar. This is because there is a basic pattern to ancient languages and language transmission.

    When reading an ancient word, one should keep in mind that the vowels are somewhat flexible and change over time. The consonants, even more interestingly, have pairs that they can alternate between, usually paired between the "hard" and "soft" pronunciation of the same basic exhalatory movement. Once you know about this it is intuitive. So for example, the pair for f is p, b is v, s is sh, d is t, k is kh, g is h, l is r (ever wonder why Asians substitute l for r?), m is n, and so on. Over time as words are transmitted in oral tradition, and writings where the vowel sound is implied by the consonant (what the diarist refers to as the syllabic structure), vowels change randomly while the consonants tend to oscillate between their pairings.

    So you get the divergence of Pers and Fars from the same root, and once you realize these rules you quickly realize it's the same word. The best known Persian word that is in the English lexicon is the chess term "check" which is descended from "shakh" - as in, your king is threatened. Checkmate comes from the combination of shakh and mat (dead). Shakh of course is also rendered as "Shah" and in the Arabic as "sheikh." And so on. I find it fascinating how the consonants and vowels mutate with transmission.


    Pater to Father, for instance.

      I had always recollected that, in another odd language note, Australian English is actually much closer to Elizabethan English in terms of pronunciation than the Oxfordian English the plays tend to get produced with in "authentic" productions. Looking at Wikipedia, I'm no longer certain about that, because I thought that was part of the great vowel shift, but their chronology suggests that was much earlier for the most part than the deportations to Australia. (I want to say that there's a line in Jonson, for instance, where it's useful to know he would have pronounced the word "chine" rather than "chain," but unfortunately, it's been ages since graduate school, and most of this stuff is unused and thus mostly forgotten.)

        But the deportees to Australia may have tended to come from a part of England where the pronunciation was more conservative in some respects. Much of the U.S., for instance, was settled by people from areas of England that had conservative aspects of pronunciation (East Anglians, for instance).

        It's a pretty common pattern for a colonial language to be more conservative in some respects than the language in the parent country. This has happened to both English and Spanish in the (so-called) New World.


        by AlanF on Thu Jun 18, 2009 at 02:26:32 PM PDT

          I always believed that the peculiarities of the Australian pronunciation of English are derived from Cockney.


          Australian English

            Australian English diverged from British English in the early 19th century.  There is thus virtually no chance of it preserving any trace of "Elizabethan English" that was not conserved in 19th-century British.  Most of the peculiar pronunciations that we think of as "Australian" were actually developed in Australia, and are not conserved from any English dialect, though there might have been some existing tendencies that were exaggerated in Australian.  For instance, the Australian pronunciation of the i in time, with the back low vowel at the beginning of the diphthong, is just an exaggeration of changes already in progress and followed (to a lesser extent) in American and British English.  

            It is in fact the opposite of the Elizabethan pronunciation, in which the first vowel of the diphthong was much further forward.  If we heard a speaker of Elizabethan English say "time", we would think he or she was saying "tame".

            Likewise, the pronunciation of the "long a" in a word like face is not conserved in modern English dialects; it rhymed, at the time, with grass, though the vowel was somewhat longer (i.e. [fæ:s]).

            The word chain, by coincidence, would be pronounced almost the same in Elizabethan as in Australian English.  But this is an accident of history, and doesn't represent conservation.  It was in Elizabethan [tSæIn] (more or less); it then developed in the early 17th century to [tSæ:n] (merging with the long a of "face"), then became [tSE:n], [tSEIn], and at length (in Australian) back to [tSæIn].  That this is not conservation of the original sound we can see from the fact that the vowel of "face" has gone through the same changes.

            We have pretty good documentation of English pronunciation from the beginning of the 1600s on, so these developments are not hypothetical.  We actually have a very good idea of how Shakespearean English was pronounced -- much more certain than our ideas of, say, Chaucer's English.

    As a linguist

      I'm afraid I can't agree with your generalizations.  There is a very specific set of regular developments for each language, and they depend upon what the current shape of the language is at any given time, and upon intricate interrelationships between languages in contact.  It's not as free-and-easy as you make it seem, and requires a good deal of study to determine the true connections.

      To take two of your examples:

      1. Fârs is not a natural development from Pârs.  Ancient Persian had a p and modern Persian has a p, and at the beginning of the word they correspond.  So Old Persian pancha becomes modern Persian panj.  This is regular; in fact, we never see p -> f in related pairs of Old Persian and modern Persian words.

      So, how to explain it?  You have to look at Arabic.  Arabic does not have the p sound, and did not in the 7th century when the Arabs conquered Persia.  The closest sounds in Arabic are b and f (both sounds produced with the lips).  Although nowadays b is felt to be closer to p (so Arabs will often say Bakistan for Pakistan), in the 7th century f was felt to be closer, and so we have, for instance, Filastîn for Palæstina.  And so the Arabs naturally pronounced Pârs (the final a in Pârsa was lost by the 7th century) as Fârs.  Since they were rulers of the country for centuries, it was naturally their provincial names that stuck.  End of story.

      2. Despite the superficial similarity of shape, Arabic shaikh is not related to Persian shâh.  Shâh comes from khshâyathya, as I said, and has always meant "king"; it comes from a root khshi-, meaning "govern, rule, have power", of Indo-European origin; originally the root was *tkê-.

      Shaikh on the other hand is from an Arabic verb shâkha meaning "to grow old", and simply means "old man" (hence "elder, patriarch").  Nothing to do with ruling or governing, and no relation to the Persian word.

      a lot of consonant switches
        Recommended by:
        marina, Larsstephens

        Are caused by translation from language to language, as with the Persian-to-Arabic rendering of Pars as Fars which you mention. As for the sheikh, that's a hell of a coincidence then, very interesting.

        Of course we will have Fascism in America, but we will call it Democracy. - Senator Huey Long

        by Marcion on Thu Jun 18, 2009 at 03:00:21 PM PDT

    But, of course, I also want to learn Swedish, Dutch, Russian, Arabic, Mandarin, Swahili, Cherokee, and every other language on the planet. Trouble is, there just ain't enough time.

  Very interesting! In the modern Persian section,
    Recommended by:
    AlanF, WIds, Larsstephens

    where I believe you're talking about infixes or suffixes, some things have strike through. You can avoid that with a backslash: \

    so this \-infix or \-suffix

    turns into -infix or -suffix

    You'll need a backslash in front of each hyphen.
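A minimal sketch of that escaping step in Python, assuming a markup where a hyphen can start strike-through and a backslash in front of it turns that off. The function name `escape_hyphens` is made up for illustration and is not part of any site's tooling:

```python
import re


def escape_hyphens(text: str) -> str:
    """Put a backslash in front of every hyphen that is not already escaped."""
    # (?<!\\) skips hyphens that already have a backslash before them,
    # so running the function twice does not double the backslashes.
    return re.sub(r"(?<!\\)-", r"\\-", text)


if __name__ == "__main__":
    print(escape_hyphens("this -infix or -suffix"))
```

Pasting the escaped text into the editor should then show the hyphens as plain characters, if the site's markup behaves as described above.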

  So I have the most ignorant question ever...
    Recommended by:
    WIds, Old Gardener, Larsstephens

    and I will admit that I only skimmed your treatise here, because I am lazy.

    How different is Arabic from Persian?

    If you speak Persian, can you understand Arabic and vice versa?

    My husband works a lot in Egypt, and is thinking about learning Arabic, but it seems really hard--maybe he should learn Persian (which according to you is easier) and just make do?

    What do you think?

    Happiness is not a potato... -Charlotte Bronte

    They are two completely different languages
      Recommended by:
      WIds, marina, Inland, Larsstephens, coquiero

      Persian, as the diarist says, is an Indo-European language. Arabic is a Semitic language -- completely different ancestry, different (and rather strange when you first start learning it) grammar, different vocabulary. It's about as similar to Persian as Hungarian is to English.

      Due to the Arabic conquest and Islamicization of Persia, Persian borrowed the alphabet and a lot of vocabulary (esp. govt, scientific, and religious words) from Arabic. However, non-borrowed vocabulary, phonology, grammar, etc. are, like I said, completely different. If you want to talk to Egyptians, I would suggest you learn Arabic, as Persian will emphatically NOT be helpful there.

      Fire breathing Liberal.

      As someone who has studied both languages
        Recommended by:
        WIds, marina, Anna M, Larsstephens, coquiero

        Though I am nowhere close to fluent in either, I will add that Persian is easier to learn -- it's an Indo-European language like English, so the phonology (that is, the sounds) and grammar are closer to what you're used to. Arabic has about 6 different sounds that simply do not exist in English, and the grammar will seem rather strange and difficult at first.

        Fire breathing Liberal.

        Well, it was an idea...
          Recommended by:
          WIds, Larsstephens

          a bad one, it turns out!

          Thanks.  I guess he's going to have to slog through the Arabic lessons.

          Luckily, he likes to learn new languages.

          Happiness is not a potato... -Charlotte Bronte

          There are ten different consonant sounds that exist in Arabic but not in English: ˀalif, ḥâ, khâ, ṣâd, ḍâd, ṭâ, ẓâ, ˁayn, ghayn, and qâf.

          Alif, khâ, and ghayn aren't that big a deal, but the uvular and emphatic sounds are a b---- to pronounce, and don't sound terribly nice when you've succeeded.  Well, except ˁayn, which sounds really nice when you've got it right, but that takes lots of practice.  Otherwise it sounds like your lungs are trying to escape through your nostrils.

    Persian vs. Arabic
      Recommended by:
      WIds, Mithridates, Larsstephens, coquiero

      As the diary says, Persian is an Indo-European language, completely different than Arabic. It's just written in the Arabic alphabet, due to the conquest of the Persian empire by the Caliphate.

      I'm wondering if it's somewhat like Japanese in its relationship to Chinese. They both can be written using the same characters, but the languages don't have a common ancestor.

      So I think your husband would be out of luck going to Egypt yet only knowing Persian.

  Thanks -- better than Wikipedia! n/t
    Recommended by:
    WIds, marina, rhutcheson, Larsstephens
  The only work penned by a Persian on my
    Recommended by:
    PeterHug, WIds, Larsstephens

    shelves is in Arabic - Avicenna's Al-Ilahiyyat.

    We're shocked by a naked nipple, but not by naked aggression.

  This is so great!

    I'm going to read the whole thing later, when I have more time. I love languages, and I learned a few Persian phrases for fun years ago, when I was visiting a friend in London who knew a few Iranian students (they called themselves Persians). This is more background than I ever had, though! I hope you write more on this...

  a question for the linguists

    where does Amharic fit into the great families of languages? I've always been fascinated by the beauty of the alphabet and wonder as to its derivation

    We're shocked by a naked nipple, but not by naked aggression.

    Amharic

      Is an Ethiopic language; Ethiopic is one of the larger branches of the Semitic family.  The Ethiopic languages, together with Old South Arabian (which is not a form of Arabic), form a branch of the Semitic languages opposite the branch that includes Aramaic, Arabic, and Hebrew.

      The ancestors of the Semitic peoples may well have come out of Africa (they are distantly related to a large number of groups all over Africa, from the Copts to the Berbers to the Hausa-speakers of northern Nigeria); however, the Ethiopians were not Semitic peoples who stayed in Africa, but Semitic peoples who came back to Africa, coming out of Southern Arabia.

      Links between Ethiopia and southern Arabia were always very close: they lie opposite each other at the mouth of the Red Sea.  The southern Arabians (who were, I repeat, not Arabs) developed their own alphabet at a very early date (on the basis of an alphabet used by other Semitic peoples, which was also ancestral to the Phoenician and Hebrew alphabets).  Monumental inscriptions in Old South Arabian still exist; they show a highly geometric form of the alphabet that looks a little like Runic (but is, of course, wholly unrelated).  They show a peculiar alphabetical order quite unlike the Phoenician order: the first letter of the alphabet was h and the second l.  The order is as ancient as the Phoenician 'alp-bêt; inscriptions in Ugaritic cuneiform have been found with a very similar alphabetical order.  As in the Phoenician and Hebrew alphabets, vowels were not indicated.

      This South Arabian alphabet was imported to Ethiopia and modified in order to represent the sounds of Ge'ez, a now extinct Ethiopic language.  In the process, signs to indicate the vowels were added.  This Ge'ez alphabet is, with some further modifications, the alphabet used to write Amharic (also an Ethiopic language, though not very closely related to Ge'ez).

  I thoroughly enjoyed this
    Recommended by:
    WIds, where4art, Larsstephens

    I thought I'd throw in a couple more things.

    With two exceptions, all European languages can be traced back to Indo-European. The two exceptions are Basque (in the Pyrenees mountains) and the Finno-Ugric languages (Finnish and Hungarian, which Attila and his Huns brought to Europe from Asia).

    Before the 1940s, the family was sometimes called 'Indo-Aryan,' but after WWII, the word 'aryan' was not very popular. The word 'aryan' traces back to the same ancient root-word as 'Iran.'

    William Jones, in 1756, was the first person to put forth the proposition that Latin, Greek, and Sanskrit were somehow related:

    The Sanscrit language, whatever be its antiquity, is of a wonderful structure; more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either, yet bearing to both of them a stronger affinity, both in the roots of verbs and the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists; there is a similar reason, though not quite so forcible, for supposing that both the Gothic and the Celtic, though blended with a very different idiom, had the same origin with the Sanscrit; and the old Persian might be added to the same family.

    Later, Jakob Grimm (of fairy tale fame) worked out some of the vowel and consonant shifts (which you can see in the list below).

    bhratar (Sanskrit)
    frater (Latin)
    phrater (Greek)
    frere (French)
    brother (Modern English)
    brothor (Saxon)
    bruder (German)
    broeder (Dutch)
    bratu (Old Slavic)
    brathair (Old Irish)

    Sure, understanding today's complex world of the future is a little like having bees live in your head. But, there they are.

    William Jones

      ...actually wrote more to the last sentence of those remarks (made in 1786, not 1756):

      and the old Persian might be added to the same Family, if this were the Place for discussing any Question concerning the Antiquities of Persia.

      This was a bit of a joke on Jones' part, and probably elicited a few chuckles from his audience, the Asiatick Society.  Jones was well-known as a scholar in Persian; in fact, his knowledge of Persian was largely responsible for his relocation to India, where he served as a magistrate. This required him to be familiar with Moghal law, as the British were administering a former Moghal province, amid the chaotic wreck of the disintegrating Moghal empire.  Such laws were frequently written in Persian.  His interest turned to Sanskrit because he wanted to learn more about the traditional laws of the non-Muslim peoples of India.

      Sadly, Jones had gotten into an intemperate quarrel over the authenticity of a translation of the Avesta made by the French scholar Anquetil du Perron, the first into a Western language.  Jones, with some other scholars, had wrongly argued that the work was a forgery (it did not match the assumptions that people of that period had about what the teachings of "Zoroaster" should look like).  That was about 15 years before Jones' speech, but his knack for getting into wrangles about "the Antiquities of Persia" was evidently well-known.  Jones, it seems, did not mind having some fun at his own expense.

  Well presented
    Recommended by:
    WIds, Larsstephens

    Thank you for the information and details. This is the kind of in-depth writing that we need in media.

  a wonderful (literally) thread!

    We're shocked by a naked nipple, but not by naked aggression.

  By the way

    This article on the demonstrations explains what a few of the chants mean:

