With the events in Iran today, people are more and more exposed to the Persian language, in signs and speeches. Some people might, therefore, be interested in learning more about the language (or even in learning the language itself). I thought those people might be interested in an introductory essay about the language.
The name of the language
The common name of the language in English is "Persian", after Persia, an older name for the country of Irân. Very often you will see the language called Fârsi, however.
Fârsi is the name given to the language by the Persian-speaking people of Irân. However, Persian is not only spoken in Irân; it is also spoken in Afghânistân (where it is called Dari) and in Tâjikistân (where it is called Tâjiki). The differences are comparable to the differences between Australian, British, Irish, and American English -- they are all the same language, with some differences in pronunciation and usage.
Fârsi is an adjective formed from the name of the province of Fârs -- formerly Pârsa -- which is the part of Irân that lies along the coast of the Persian Gulf (khalij-e Fârs), including the city of Shirâz, famous for its poets. In classical times Fârs was called Persis (a modification of Pârsa), and was famous as the region from which both Cyrus the Great, founder of the first Persian Empire (550-330 BCE), and Ardashir, founder of the second Persian Empire (226-651 CE), originated -- hence the entire country came to be called Persia, after the province of Persis. Fârsi and Persian are therefore, for most purposes, equivalent terms.
The role of the language today
Persian is the most widely spoken language in Irân today: it is used in TV broadcasts, newspapers, public debates and speeches, and of course online. More than half of all Iranians use it as a native language; it is a second language for almost everybody else in Irân. There is probably nowhere in Irân where you cannot find someone who will understand you if you speak Persian. There are, however, significant areas where other languages are spoken: Azeri Turkish and Kurdish in the northwest, Baluchi in the southeast, Arabic in regions bordering southern Iraq, and various other languages in different pockets and corners of the country. Even in these regions, however, there will be a significant Persian-speaking population.
History of the language
Persian is an Indo-European language, that is, it is part of one of the largest language superfamilies on Earth, one which includes English, Russian, Spanish, Gaelic, Greek, Lithuanian, Albanian, Armenian, and Hindi. Because Persian is related to all of these languages, many of its words will seem very familiar to an English-speaker. For instance, the Persian word for "mother" is mâdar and the word for "brother" is barâdar. These resemblances are not for the most part accidental (a few of them are, like the resemblance between English "bad" and the Persian word bad, which means the same thing) -- they mostly arise from the fact that English and Persian have a common ancestor, and in some cases the words have not changed very much over the intervening millennia since the languages began to drift apart.
The first speakers of an Indo-European language (called Proto-Indo-European) are thought to have lived some five or six thousand years ago in the grasslands north of the Black Sea. They were totally illiterate, so everything we know about their language is reconstructed on the basis of the surviving languages (first written down no earlier than c. 1400 BCE); but we do have some archeological evidence. This shows that they were among the first peoples to tame and breed horses, and to attach carts and chariots to them. This gave them a mobility that other peoples of that age could not match; and as a result, they moved very quickly into regions both to their west (including almost all the regions of Europe) and to their east. As the various Indo-European peoples were cut off from each other by the vast distances separating them, their languages also came to differ from each other, until at last -- probably not later than 2000 BCE -- they were totally unintelligible to each other.
The Indo-European peoples who moved to the East were, in part, the ancestors of most of the peoples of Irân and Afghânistân and some of the peoples of India. They are therefore called "Indo-Iranians". They moved initially from the steppes to the region north of the Caspian and Aral Seas, and then down into the river valleys that in those days fed the Aral Sea. Moving up the rivers they came into the hill country of Tâjikistân and Afghânistân, and then filtered west into Irân. By the 16th century BCE, one group had come as far west as northern 'Irâq, where we have the earliest evidence of an Indo-Iranian language preserved in cuneiform.
Another group crossed the mountains and entered northern India and Pakistân, in what is now the Panjâb. They spoke an early version of what we now call Sanskrit; and some of their hymns and poems have been preserved, through a careful oral tradition, down to the present day. They lost all contact with their relatives back on the other side of the mountains, and soon became very different in language and culture: Indian, as opposed to Iranian.
Some centuries after the departure of these early Sanskrit speakers, a religious prophet arose in what is now Afghânistân, whose name was Zarathushtra. He preached a doctrine in which two powerful forces, Good and Evil, Truth and Lies, battled each other for dominion over the world and the souls of mankind. His teaching brought about a remarkable cultural reformation among all the Iranian peoples. How long it took his doctrines to spread is unknown, but in the end there was no land where Iranian languages were spoken that was unaffected. Today we call the religion he propagated "Zoroastrianism" (after a Greek alteration of his name, Zoroaster).
The earliest evidence we have for an Iranian language is in certain ancient hymns (Gâthas) attributed to Zarathushtra. These are in a language called "Avestan", after a later name for the collection of religious writings in which they were included. The Avestan language preserved in these hymns is remarkably similar to Sanskrit in overall structure and in much of its vocabulary, though the sounds of both languages had changed so much that it is unlikely that they were mutually intelligible. However, the religious ideas of the hymns of the two languages could not be more different: the Sanskrit Vedas teach a polytheistic religion with a vast pantheon of devas, or deities; the Avestan Gâthas preach the worship of one god, the "Wise Lord" Ahura Mazda, and regard the daêvas (so the word appears in Avestan) as the wicked servants of Angra Mainyush, the Evil Spirit.
In these Gâthas we also see the first use of the word airya to describe the Iranian peoples. The corresponding Sanskrit word means something like "noble" or "praiseworthy"; but in Irân it came to be used as the name of the whole people, contrasted to the anairya, the non-Iranians. From the genitive plural airyânem ("of the Airyas") came the name Êrân (i.e., land of the Airyas), which in modern Fârsi is pronounced Irân.
About four centuries after Zarathushtra (though we have no firm historical dates to fix his time) the Iranian peoples further west began to organize themselves into powerful political entities that historians call "empires": first the Medians (centered around the area of northern Irân, including modern Tehrân), around 650 BCE, and then the Persians under Cyrus (Kurush) the Great, in 550 BCE. Cyrus conquered the Medes, and also Mesopotamia, the Levant, and Asia Minor, forming the first "Persian Empire".
Up to this point the Iranian peoples had been wholly illiterate; the Gâthas of Zarathushtra were transmitted orally. Under Cyrus, or perhaps one of his successors, however, the Persians adopted a much-simplified form of the cuneiform script, and used it for inscriptions on everything from massive monuments to tableware. This script is essentially syllabic: it represents combinations of consonant+vowel, and though ambiguous in some ways, it represents the sounds of the language fairly accurately. The Old Persian language revealed in these inscriptions is the direct ancestor of modern Persian (the related Avestan language, written down centuries later, is more like a great-granduncle). It is, however, very different, having a complex system of declensions and conjugations, as well as a much more complex system of sounds that was greatly simplified later on. To take one example: the Old Persian word for "king" was khshâyathya. After many centuries, it eventually arrived at the modern Persian word, shâh -- a vast reduction!
The cuneiform writing fell out of use after the conquest of the first Persian Empire by Alexander the Great. For over a century after Alexander, the Iranian peoples were under the control of Macedonian Greek overlords; then came the Parthian dynasty, ruled by a people from eastern Iran who spoke a language related to but distinct from Persian. In this period literate people in the region tended to use the Aramaic language and writing to keep records. Gradually, however, the Aramaic script was modified to allow the writing of Persian -- a Persian which, however, had changed a great deal from the Old Persian of Cyrus and Darius, and had moved far in the direction of modern Persian.
This "Middle Persian" language flourished under the Sâsâni dynasty, which emerged from Persia to rule an area somewhat larger than modern Irân in 226 CE. The modified Aramaic script used to write this language is called "Pahlavi" (literally: "Parthian") and the language is sometimes called Pahlavi too. Unfortunately, the script is often unclear, as many letters representing different sounds have the same or nearly the same shapes, resulting in a high degree of ambiguity. The later holy books of the Zoroastrian religion are written in a form of this script. Luckily, we have texts in Middle Persian written down in another script, this one designed by the Manichaeans (members of a religion that was founded by the prophet Mani in the 3rd century CE) which is far less ambiguous; this helps us understand the sounds of Middle Persian much better.
In its latest form, Middle Persian was not immensely different from modern Persian; many words are nearly identical. Among its archaic features are the presence of an initial "w" sound, which has since become "b" or "g" in modern Persian; e.g. wuzurg "big" (from Old Persian wazrka) has become modern Persian bozorg. Similarly, the Middle Persian "g" at the ends of words has been lost from modern Persian when it followed a vowel (the letter h is written instead, but not pronounced): e.g. frêshtag "angel", becomes modern Persian fereshteh. This example also illustrates another characteristic of modern Persian, the tendency to split up the consonants of an initial consonant cluster by putting a vowel between them; that even happened with words borrowed much more recently from western languages, e.g. France -> Ferânseh.
There are many other differences in the quality of the vowels, but a lot of these belong to the succeeding stage of the language.
The second Persian Empire was overthrown in the mid-7th century by the Arabs of the early Caliphate. These Arabs brought in both Islâm and their own alphabet, which, as modified and improved in succeeding centuries, became a much more effective tool for writing than the Pahlavi alphabet had been.
For a long time, however, very little literature was written in Persian; the exceptions were in the religious literature of the Zoroastrians, but they were a dwindling sect without power. There were many highly educated Persian Muslims, but for centuries they preferred to write in Arabic, to the literature of which they contributed very significantly.
Nonetheless, Persian continued to be the dominant spoken language of the Iranian region, in this respect differing greatly from the languages of other areas where Arabic had been introduced. In Mesopotamia and the Levant, the dominant Aramaic language was almost entirely replaced by its sister Arabic; it survives today only in a few villages. In Egypt, where Coptic was the native language in the 7th century, Arabic was less successful at first but more so in the long run; by the end of the Middle Ages, Coptic was dead as a spoken language, only remaining in the religious writings of the Coptic Christians.
The predominance of literary Arabic in Irân continued through the Umayyad Caliphate and well into the Abbasid Caliphate, down to the 10th century. At that time, the Abbasid Caliphate was breaking up; its furthest provinces were breaking away, and among these was the Persian-speaking province of Transoxiana (approximately in the area of modern Uzbekistân). Here, in the mid-900s, the Sâmânids, a dynasty of Iranian origin, began to cultivate the writing of Persian poetry. One of them called for a huge project: the writing of a vast epic which would celebrate all of Iranian history, as recorded in the surviving records of the Sâsânians and the Zoroastrian writings, from the beginning of civilization down to the end of the Sâsânid dynasty. This was started by the poet Daqiqi, and completed by the poet Ferdowsi; and by the time it was done (CE 1010), the Sâmâni dynasty had been overthrown by the Turkish Ghaznavids. It is called the Shâhnâmeh, and is probably the most famous of all Persian poems.
But the loss of Persian independence did not lead to an end of the Persian renaissance; in fact, the new style of Persian, written in Arabic letters, containing many Arabic words, flourished and attained a remarkable degree of literary uniformity which it maintains to this day. Texts 1000 years old can be read by the modern speaker of Persian with only modest difficulty; contrast the complete unintelligibility of an Old English text like Beowulf (a poem of comparable age) to modern English speakers.
Persian spread even beyond Irân itself. The Turks adopted it as a literary language, along with several Persian customs, and they brought Persian with them into Turkey and the Balkans. Mehmet the Conqueror was quoting a Persian poet when he rode into fallen Constantinople, and there are to this day people as far away as Bosnia and Albania who have names of Persian origin.
On the other side of Irân, Persian was carried into northern India by the conquering armies of various Muslim peoples, and became the administrative language of Moghal India for several centuries. One can find Persian words and names throughout India, and a great deal of Hindi (and even more of its sister dialect, Urdu) consists of Persian loanwords.
Despite its slow rate of change, Persian has changed in pronunciation in less striking ways. The short i and u of New Persian are now pronounced e and o; the long ê has merged with long î and likewise long ô has merged with long û (in Irân; the distinction is maintained in Afghânistân and Tâjikistân). The diphthongs ay and aw are now pronounced ey and ow. In colloquial speech, the sequence ân is now generally pronounced un (oon). But on the whole, there is very little difference in pronunciation.
There are, however, a host of colloquial forms and contractions of words and inflections which stand side-by-side with the literary forms: e.g. -in for -id, -e for -ad, -an for -and, migam for miguyam, miyâm for miyâyam, mirim for miravim, and so forth. These are commonly used, but less commonly represented in writing. The more literary forms will always be understood, even if they sound a bit stiff. So we can say that the Persian of 2009 is one and the same language as the Persian of 1009, in the same way that our English is the same language as that of Shakespeare.
Persian is written in the Arabic alphabet, with the addition of four letters to represent sounds which did not exist in Arabic. All of these sounds, however, are found in English: they are p, ch (as in chess), zh (the s-sound in measure) and g (as in go). They were placed in the Arabic alphabet after the most similar sound existing in Arabic: p after b, ch after j, zh after z, and g after k. They are written similarly to the preceding Arabic letter, but with the addition of three dots, or (in the case of g) with a line drawn on top. Another change from the Arabic alphabet is the sequence of the last three letters; in Arabic they go h, w, y but in Persian they go v, h, y.
The Persian alphabet therefore has 32 letters, and is written from right to left. In the transliteration I'm using, these letters are:
'(1), b, p, t(1), s(1), j, ch, h(1), kh, d, z(1), r, z(2), zh, s(2), sh, s(3), z(3), t(2), z(4), '(2), gh, f, q, k, g, l, m, n, v, h(2), y.
You will notice something funny about this at once: there are 2 t's, 2 h's, 3 s's, and 4 z's! And what's that apostrophe about?
The reason for there being so many letters with the same sounds is that the Persians could not pronounce Arabic properly; and when they came across a difficult sound, they substituted the closest sound they had. As a result, it's a lot easier for a Westerner to pronounce Arabic the Persian way than it is to do it the Arabic way!
With a tiny number of exceptions, when there are multiple letters for the same sound, one of them is used in all words of Persian origin, and the rest are used in Arabic words wherever it is etymologically appropriate. Thus t(1) is Persian, t(2) is Arabic; h(2) is Persian, h(1) is Arabic; s(2) is Persian, while s(1) and s(3) are Arabic; z(2) is Persian, and z(1,3,4) are Arabic; '(1) is Persian and '(2) is Arabic. (All the "Persian" letters can, of course, be used in Arabic too.)
Each of the letters has its own name, so it's easy to tell them apart when a word is spelled out; in Fârsi, the names of the letters are:
alef, beh, peh, teh, seh, jim, cheh, heh, kheh, dâl, zâl, reh, zeh, zheh, sin, shin, sâd, zâd, tâ, zâ, eyn, gheyn, feh, qâf, kâf, gâf, lâm, mim, nun, vâv, heh, yeh.
The only two letters with the same name are the two heh's: the first ح is sometimes called heh-jimi, because it looks similar to the letter jim ج, only without the dot; the second ه is called heh-do-chashm or "two-eyed heh" because in some forms it consists of two loops.
Almost all the sounds of Persian can be found in English. The apostrophe ' represents either of the letters alef and eyn, and, when not completely silent (as those letters are when put at the beginning of a word), is pronounced like the catch in one's throat that one can feel at the end of the "uh" in the exclamation "uh-oh!"
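For readers who like to check such things mechanically, here is a small Python sketch of the alphabet as described above. It pairs each letter with the transliteration and the name used in this essay. The script forms are the usual Persian letters as I understand them to be encoded in Unicode, and the names ALPHABET and duplicated_sounds are simply labels chosen for this illustration.

```python
from collections import Counter

# (letter, transliteration used in this essay, letter name), in Persian
# alphabetical order. The script forms are an assumption of this sketch.
ALPHABET = [
    ("ا", "'", "alef"), ("ب", "b", "beh"), ("پ", "p", "peh"),
    ("ت", "t", "teh"), ("ث", "s", "seh"), ("ج", "j", "jim"),
    ("چ", "ch", "cheh"), ("ح", "h", "heh"), ("خ", "kh", "kheh"),
    ("د", "d", "dâl"), ("ذ", "z", "zâl"), ("ر", "r", "reh"),
    ("ز", "z", "zeh"), ("ژ", "zh", "zheh"), ("س", "s", "sin"),
    ("ش", "sh", "shin"), ("ص", "s", "sâd"), ("ض", "z", "zâd"),
    ("ط", "t", "tâ"), ("ظ", "z", "zâ"), ("ع", "'", "eyn"),
    ("غ", "gh", "gheyn"), ("ف", "f", "feh"), ("ق", "q", "qâf"),
    ("ک", "k", "kâf"), ("گ", "g", "gâf"), ("ل", "l", "lâm"),
    ("م", "m", "mim"), ("ن", "n", "nun"), ("و", "v", "vâv"),
    ("ه", "h", "heh"), ("ی", "y", "yeh"),
]


def duplicated_sounds(alphabet):
    """Return {sound: number of letters} for sounds written by 2+ letters."""
    counts = Counter(sound for _, sound, _ in alphabet)
    return {sound: n for sound, n in counts.items() if n > 1}


def letters_for(sound, alphabet):
    """Return the names of all letters transliterated with the given sound."""
    return [name for _, s, name in alphabet if s == sound]
```

Calling duplicated_sounds(ALPHABET) confirms the count given above: two letters each for ', t and h, three for s, and four for z. letters_for("z", ALPHABET) lists the four z's by name (zâl, zeh, zâd, zâ), which is how a Persian speaker would tell them apart when spelling a word aloud.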
The only consonant sounds of Persian that don't occur in English are kh, gh, and q. Kh is the same sound as the ch in the name of Bach, the composer, when pronounced in the German way, or the ch in the name of the Jewish holiday Chanukah, when pronounced as in Hebrew. Gh is the same sound, but with additional voice; that is, it has the same relationship to kh as a hard g (as in go) does to k.
The q-sound was originally a kind of k pronounced very far back in the throat, but by most Iranians it is pronounced the same as gh.
The vowel sounds of Persian are almost all found in English, or close enough. They are:
a: the sound of a in "hat"
â: the sound of aw in "law"
e: the sound of e in "pet"
i: the sound of i in "machine"
o: the sound of o in "only" (but even shorter)
u: the sound of u in "lunar"
There are also two diphthongs:
ey: the sound of ey in "hey"
ow: the sound of ow in "grow"
According to traditional grammars, the vowels a, e, and o are "short vowels", and the vowels â, i, and u are "long vowels". This contrast has very little to do with the modern pronunciation, but two things should be noticed: one, when there is variation in the pronunciation of a word, it is usually between the sounds a, e and o (for instance, chashm "eye" is also pronounced cheshm); two, the long vowels are all written out as full letters, but the short vowels are not written at all.
â is written with the letter alef;
i is written with the letter yeh;
u is written with the letter vâv.
The diphthongs ey and ow are also written with yeh and vâv respectively.
This means that in order to read Persian, you need to memorize both the spelling and the pronunciation of a word separately. E.g., the word bozorg ("big") is actually written bzrg -- and only experience tells you that it is pronounced bozorg and not bazarg or bezorg (neither of which exists) or something else. However, this is mastered fairly easily. Persian texts for children often use small diacritics next to the letter to indicate whether it is followed by a, e, or o, and most tools for teaching Persian to Westerners write out the word in Roman-letter transcription along with the Arabic script. In fact you could learn to speak Persian from a Roman-letter transcription alone, though your ability to function in society would be severely compromised (as you'd be effectively illiterate).
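The rule just described (short vowels dropped, long vowels and diphthongs written with alef, yeh, and vâv) can be sketched in a few lines of Python. This is only an illustration working on the Roman-letter transcription used in this essay; the function names are my own, and the sketch deliberately refuses words that begin with a vowel or end in a short vowel, since those involve spelling conventions not covered here. It also assumes that the letter pairs ch, kh, gh, zh, sh, ey, and ow always stand for single sounds.

```python
# Letter pairs treated as single sounds in the transcription (an assumption:
# an s followed by a separate h, for instance, would be misread as "sh").
DIGRAPHS = ("ch", "kh", "gh", "zh", "sh", "ey", "ow")
SHORT_VOWELS = {"a", "e", "o"}
# Long vowels and diphthongs, with the name of the letter that writes them.
VOWEL_LETTERS = {"â": "alef", "i": "yeh", "u": "vâv", "ey": "yeh", "ow": "vâv"}


def tokenize(word):
    """Split a transcribed word into sounds, keeping digraphs together."""
    tokens = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            tokens.append(pair)
            i += 2
        else:
            tokens.append(word[i])
            i += 1
    return tokens


def written_letters(word):
    """Return the letters actually written for a transcribed word.

    Consonants are returned as transcribed; vowel letters are returned
    by name (alef, yeh, vâv). Short vowels are left out entirely.
    """
    tokens = tokenize(word.lower())
    if not tokens:
        return []
    if tokens[0] in SHORT_VOWELS or tokens[0] in VOWEL_LETTERS:
        raise ValueError("words beginning with a vowel are not handled")
    if tokens[-1] in SHORT_VOWELS:
        raise ValueError("words ending in a short vowel are not handled")
    return [VOWEL_LETTERS.get(t, t) for t in tokens if t not in SHORT_VOWELS]
```

For example, written_letters("bozorg") gives b, z, r, g, the bare skeleton mentioned above, while written_letters("barâdar") gives b, r, alef, d, r, with only the long â surviving as a written letter. Going the other way, from bzrg back to bozorg, is exactly the part that no rule can do for you.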
About learning Persian
One thing to stress to anybody who decides to learn a little Persian is this: it is an easy language. It is far, far easier than Arabic; perhaps that's one reason that Persian became so successful at spreading all the way from the Balkans to Bangladesh, even though those areas were never under the rule of any actual Iranians! It has some irregularities, as all languages do; but in structure and in syntax, it is quite a bit more regular than English is.
Like French, Persian has an inflected verb system, but the system is far simpler than that of French. One only has to remember six endings (one for each person and number) and a few prefixes. There are some irregular verb formations (comparable to English go, went, gone; eat, ate, eaten; see, saw, seen) but one only has to remember two forms for each verb (the present stem and the past stem), not three as in English.
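To give an idea of how little there is to memorize, here is a rough Python sketch of the present and simple past tenses in Roman-letter transcription. The present stems rav- "go", guy- "say", and ây- "come" and the endings -am, -ad, -im, -id, -and can be read off the forms quoted earlier (miravim, miguyam, miyâyam); the second-person singular ending -i, the past stem raft-, and the absence of an ending in the third-person singular past are standard grammar-book facts not spelled out in this essay. The function names are my own, and the sketch only handles stems that end in a consonant.

```python
# Personal endings in the order: I, you (sg.), he/she, we, you (pl.), they.
PRESENT_ENDINGS = ("am", "i", "ad", "im", "id", "and")
# The simple past uses the same endings, except that the third person
# singular takes no ending at all.
PAST_ENDINGS = ("am", "i", "", "im", "id", "and")

VOWELS = "aâeiou"


def present(stem):
    """Conjugate the present tense: mi- + present stem + ending.

    A linking y is inserted when the stem begins with a vowel
    (mi + ây + am -> miyâyam). Stems ending in a vowel are not handled.
    """
    prefix = "miy" if stem[0] in VOWELS else "mi"
    return [prefix + stem + ending for ending in PRESENT_ENDINGS]


def past(stem):
    """Conjugate the simple past: past stem + ending."""
    return [stem + ending for ending in PAST_ENDINGS]
```

So present("rav") yields miravam, miravi, miravad, miravim, miravid, miravand, and past("raft") yields raftam, rafti, raft, raftim, raftid, raftand. The only "irregular" part of the verb "to go" is that one has to know the two stems, rav- and raft-; everything else is the same for every verb.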
Persian nouns and pronouns are not inflected at all, not even to the extent that they are in English (John, John's; I, me, my; he, him, his). There are a few suffixes used optionally to mark possession or relation, as in barâdar-am "my brother" next to barâdar-e man, literally "brother-(of)-me". But there is very little in the grammar of Persian that is a strain on the memory; one does not have to memorize page after page of paradigms, and there are no arbitrarily varying declensions or conjugations. There isn't even any grammatical gender; u, the third person pronoun, means both "he" and "she".
The word order is simple, straightforward, and quite consistent, though different from English, being Subject-Object-Verb rather than Subject-Verb-Object. The last word in a sentence is generally an inflected verb.
One can thus progress very quickly from beginnings to reading or speaking fairly complex sentences. The biggest difficulty in mastering more complicated Persian communications, particularly writing, is the abundance of Arabic loanwords. These words play the same role in Persian that words of Latin and Greek origin do in English -- they are longer, often (though not always) have a more "educated" feel, and carry with them a whole different set of rules. Not only are there occasional plural forms as unfamiliar (and frequently optional) as radii beside radiuses, or indices beside indexes -- for example madâres "schools", plural of madraseh, or olum "sciences", plural of elm; but there are also complex relations between verbal nouns, e.g. one must learn the relationship between ettehâd "union" and mottahed "united" or entekhâb "choice" and montakhab "chosen", at least for a deeper understanding of the relations between words. But you can also just learn these as individual words, without worrying about their structure or relationships.