I’m toying with a Sanskrit-esque conlang. At the moment this is likely to be just a naming language, but there’s a good chance that I’m going to need to expand it later, so I want to make sure I get off on the right foot.

But this poses the question: what is Sanskrit-esque? I’m mostly concerned with phonology and mouth-feel, not syntax or morphology—which is convenient, since I know basically nothing about Sanskrit beyond its phonology. A little brainstorming suggests the following characteristics:

  1. A four-way stop contrast, with all combinations +/- voice and +/- aspirated for most places of articulation
  2. Palatal and retroflex consonant series
  3. a as the most common vowel, followed by i
  4. Syllabic sonorants, especially r
  5. Lack of w, but v and y very common.
  6. Onset clusters of the form Cr, but few/no other onset clusters
  7. Vowel length distinction
  8. Relatively few word-final consonants, and those that occur are usually nasals or h
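
A couple of these items are mechanical enough to check on a romanized name. What follows is only a rough sketch: the function name flavor_notes, the five-letter vowel set, and the choice to test just items 5 and 8 are all assumptions made for illustration, and the results are soft hints, not verdicts.

```python
# Hypothetical helper: flags two of the listed traits on a romanized name.
VOWELS = set("aeiou")
ALLOWED_FINALS = {"m", "n", "h"}  # item 8: finals are usually nasals or h


def flavor_notes(name):
    """Return a list of hints about where a name strays from the list."""
    notes = []
    lowered = name.lower()

    # Item 5: w is absent, v is common.
    if "w" in lowered:
        notes.append("contains w (item 5 prefers v)")

    # Item 8: collect the consonant letters after the last vowel.
    end = len(lowered)
    while end > 0 and lowered[end - 1] not in VOWELS:
        end -= 1
    final = lowered[end:]
    if final and final not in ALLOWED_FINALS:
        notes.append(f"ends in -{final} (item 8 prefers a vowel, nasal, or h)")

    return notes


for candidate in ["Wyrnas", "Corath", "Corattha", "Gocam"]:
    print(candidate, flavor_notes(candidate))
```

A digraph such as th is treated as a single final here, so a name ending in -th is flagged even though its last letter is h. Since item 8 says "relatively few" and not "none", a flagged final is a prompt to look again, not a rejection.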

I found this Sanskrit text as a good language sample, from which I drew most of the preceding observations. Obviously some of these are generalizations about Sanskrit romanization and not necessarily about phonology per se, but since my end-goal here is to create a Sanskrit-flavored naming language, observing the romanization conventions is part of the deal.

Now I further complicate my requirements by noting that I already have a decent number of names in use for this setting, which I have to retrofit without completely destroying them. Let’s start with the city formerly named Wyrnas, a grotesquely cliché pseudo-Welsh name. My initial concept of this language used the digraph yr to indicate a syllabic [r], so this name can be changed to Vrnas with almost no change in actual pronunciation. But what a wonderful difference in flavor! I’m off to a good start.

Next is Corath. This name doesn’t violate any of our rules outright, but that final -ath doesn’t sit right. Obvious alternatives would be Coratha or Corathi, which are merely okay. While looking at these names I thought of simply geminating the th to Corattha, which seems just right.

On to Gocem. I’m pretty sure that CoCeC is not a possible word-shape in Sanskrit, so we have to change at least one of the vowels. But the most minimal change here seems like the best: Gocam.

(Note that I’m editing purely for flavor here, without any concern for the morphology or phonotactics of the target language. This is fine as a first step, though later of course I’ll have to figure such things out.)

I won’t go through the rest of the 20-ish names that would have to be retrofitted, since this is just a preliminary sketch. But I’m heartened that the retrofit seems to be possible.

While listening to Conlangery recently I was impressed by a suggestion from one of the hosts to "try different things and see what works." It’s an anodyne, nearly cliché bit of advice, but it prompted me to wonder: how, exactly, does something in a constructed language "not work?" After all, you’re inventing a language. You get to make up the rules; whatever you invent works if you say it does. Doesn’t it?

Yet anyone who has been conlanging for more than a few months will recognize that this isn’t actually the case. I felt this intuitively, but I still had trouble articulating what exactly it meant for something to "not work" in an invented language. So I’ve compiled a list below of common conlang failure modes. These are things which frequently happen to conlang creators and cause their languages to go awry, and I think that almost all conlangers should be able to recognize many of the items on the list. More importantly, I’ve tried to articulate why each of the mishaps represents a failure, and how it detracts from the artistic integrity of the conlang.

Relex: A relex is a language whose vocabulary exists in a close correspondence with the language creator’s native language. Relexes are a hallmark of new and naive conlangers, and they mostly represent a failure of imagination. Languages differ enormously in the structure of their vocabulary — why repeat what you already know?

Syntactic relex: A more sophisticated version of the relex, the syntactic relex mimics not the vocabulary but the grammar of the creator’s first language. This, likewise, is a failure of imagination, but it’s much subtler and is something that even sophisticated conlangers fall into periodically.

Excessive ambiguity: The language, as designed, is not really usable for communication because it presents too many opportunities for misunderstanding. No matter how creative or artistic a conlang is, it must still function (at least in theory) as a medium of communication. This is actually a hard error for most conlangers to fall into, since conlangers tend to hate unclarity, and the human language facility is remarkably tolerant of ambiguity.

Excessive specificity: The language, as designed, is not really usable for communication because it requires the user to encode too many different aspects of the utterance. There are myriad different things that languages can encode, but there is no language that encodes all of them. Many conlangers get overly excited about all of the different things that their language could mark, and try to throw in all of them. This results in a bulky, unwieldy morphology and syntax that resembles no real-world language.

Unlearnability: The language can’t be learned by ordinary humans, because it uses structures that aren’t natural for human minds. Occasionally, as with Fith, this is a deliberate design choice, but more often the language creators intended for the language to be human-speakable, but failed by choosing an unsuitable underlying model. Lojban seems to suffer from this flaw.

Incoherence: The parts of the language don’t fit together, and there are gaps or clashes where the mismatched parts meet. The language’s phonology might make mutually exclusive demands of its word-shapes, or the syntax might contain valency-changing operations which are superfluous in a non-configurational grammar. This usually indicates that the conlanger has included an interesting linguistic feature without understanding how to use it or considering its consequences for the rest of the grammar.

Artificiality: The language is unnatural in some way or another, and this unnaturalness marks the language as artificial. Any aspect of the language (phonology, morphology, syntax, vocabulary) can show these telltale signs of artifice. Occasionally this is a deliberate artificial choice, but more often this is a result of the conlanger attempting to create a naturalistic language and failing.

Aesthetic failure: The most difficult and subjective failure. Most conlangers begin their languages with a particular artistic purpose in mind, whether that goal is phonoaesthetic, morphosyntactic, or lexical. But sometimes the language fails to meet its creator’s goals. This failure is so idiosyncratic that almost nothing can be said about it in general, but anyone who has ever tried and failed at an artistic endeavor will understand what it is like.

Lately I’ve been thinking about the origins of the Yivrian passive (and how it’s related to the Praseo passive, and if Praseo even has a passive). Eventually I wrote up the following, which I am very pleased with:


In Common Yivrian (CY), the thematic vowel of the verb has three grades [1], which are reflexed in Yivrian (Y) but with different semantics:

Base        -yā
Focus       -yō
Intensive   -yū

The base form is most commonly used and has unmarked semantics. The focus form is used when the verb itself carries the discourse focus. This form becomes the Yivrian passive, and is the topic of the following discussion. (The intensive form in CY is outside of the scope of this discussion, but it eventually becomes the Yivrian reflexive.)

In CY the default word order was SVO, though the case-marking allowed for mild nonconfigurationality, with the order of subjects and objects relative to each other and to the verb being unconstrained. This word order was exploited for discursive purposes, with the utterance-initial position serving to indicate focus. Nouns had no additional marking for focus, but as mentioned above when you wished to focus the verb, the thematic vowel of the verb ending changed:

Unmarked word order:

[CY]    Daθu   leθθyā  nawimu.
        Bird    eat     worm-ACC
        "The bird eats the worm."

Verb-focused word order:

[CY]    Leθθyō      nawimu      daθu.
        eat-FOCUS   worm-ACC    bird
        "It *eats* the worm, that's what the bird does."

This could be used with stative verbs as well:

[CY]    Hāðiyōhi            yīse.
        Be-beautiful-FOCUS  woman.
        "The woman is *beautiful*."

This latter case has survived essentially unchanged into Yivrian, where an intransitive verb can be marked for focus by being marked with -o and moved to the beginning of the sentence.

[Y]     Harayoa             nayiise.
        Be-beautiful-PASS   that-woman.
        "That woman is *beautiful*."

However, the semantics and syntax of this sentence are somewhat changed from what they were in CY, due to the concomitant changes to the transitive case. The transitive verb-focused statements were reinterpreted as passives due to the following three changes:

  1. The accusative case marker was lost. Since previously verb-focused sentences could have VSO or VOS word order, this caused an ambiguity.

  2. To disambiguate agents from patients in verb-focused transitive sentences, the agent was marked with the instrumental case.

  3. Once the agent was marked by the instrumental case, the now-unmarked object was reinterpreted as the syntactic subject, and the verb-focus marker was reinterpreted as a passive marker.

Yivrian retains vestiges of this system in its word order. The unmarked word order for active transitive sentences is SVO, but passives are VS(A), with the agent optionally indicated by an oblique argument in the ablative case. (The Yivrian ablative conflates the old instrumental and locative cases.) For examples of each type:

[Y]     Doth    lethya  na.  
        Bird    eat     worm.
        "The bird eats the worm."

        Lethyo      na      dathun.
        Eat-PASS    worm    bird-ABL.
        "The worm is eaten by the bird."

The VS(A) word order for passives was the normal word order throughout the classical period, and passives with the SV(A) word order that more closely mimicked the active word order were rare. They become more common in post-Classical Yivrian, as the passive significance of the verb marking becomes more salient and the verb-focusing origins of the construction are lost.

For intransitive verbs the passive -o retained its role as a marker of verbal focus. However, once -o was understood primarily as a passive transformation, the argument-reducing aspect of the passive voice was applied to intransitives as well. Thus, if you wish to omit the subject of an intransitive verb, you must apply the passive morphology to it as well. This is a special case of the verb-focused intransitives discussed above.

[Y]     Volassumyoa.
        INTENSIVE-be-stupid-PASS-PROG
        "Someone sure is being stupid."

Finally, ordinarily subject-less verbs such as weather verbs were influenced by this pattern. In CY (and in both Praseo and Tzingrizhil), such verbs take the ordinary verbal ending -a, but in early classical Yivrian we find them occurring oftentimes with -o, and by the late classical period the passive marking of such verbs has become obligatory.

[Y]     Lavyon          kayana.
        rain-PASS-FUT   tomorrow.
        "It will rain tomorrow."

  1. These grades are very similar to the ā/ō/ū grades found in the stative nouns (Yivrian nouns of emotion), and probably are etymologically related.

Yivrian is a historical conlang, designed with a proto-language and a set of sound changes that derive it, and with parent and sister languages. But I made it and its family backwards. Yivrian itself was conceived first (and it was not originally designed as a historical conlang), and only after the language was originally designed did I begin to speculate on what its parent language was like, and begin to design its sisters. This is not how you’re supposed to do these things, but it worked out reasonably well.

The biggest difficulty that I encounter with this approach is that Yivrian is too similar to its parent Common Yivrian, and the other sister languages are too different. Since Yivrian came first and retains its pride of place, everything about Common Yivrian that I didn’t specifically intend to be different defaults to being the same as Yivrian, while the other languages (Praseo and Tsingrizhil) wind up with a much greater distance from the proto-lang.

Fleshing out Praseo for The Wedding of Earth and Sky forced me to confront this problem anew. It also presented a different problem: while the Yivrian-like proto-forms work fine for deriving Yivrian, when I take those forms and put them through the sound changes to create Praseo, the result is often very ugly.

For instance, for Wedding I had to consider what to call the deity that in Yivrian is named Aratelor. If we extrapolate backwards into Common Yivrian by the most direct route, we would reconstruct something like *arātelōra, which as you can see is very similar to the Yivrian form, and not very interesting. Worse, the Praseo generated from that proto-form is Arotlura, which I don’t like at all.

So I did some speculating. First, the Yivrian ending -elor is commonly attached to the names of deities, and for that reason it may be innovative or analogical. Furthermore we know that the stem from which this name is formed is arat- (which appears in several other words), so it’s reasonable to assert that the CY name is *arāti or something similar, and the Y -elor is an innovation.

The second step was a new sound change. I had long known that CY contained /*ð/, which has disappeared in all of the daughter languages but left behind traces. In Yivrian the normal reflexes were (I thought) /d/ and 0, but about this time I began to speculate that there had been a sound change of *ð => r. Yivrian has a lot of r’s, and I find so many r’s to be unpleasant outside of the particular phonoaesthetic context of Yivrian, so this seemed like a good chance to turn a certain number of Yivrian r sounds into something that wouldn’t be reflexed as r in the other sister languages.

Applying that to this case, I changed the proto-form to *aðāti, and this was paydirt. The Praseo reflex of *aðāti is Azatsi, and I loved the sound of that! I liked it so much that the name became canon: in Wedding the name Azatsi appears as the name of the deity in question, and that’s unlikely to change in the future.
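
The mechanics here (take a proto-form, run it through an ordered list of sound changes, look at what comes out) can be sketched in a few lines. This is a hedged illustration, not the actual rule set: only the *ð => r change for Yivrian comes from the discussion above. Every other rule, along with the names apply_rules, YIVRIAN_RULES, and PRASEO_RULES, is an assumption invented to reproduce the single pair *aðāti => Azatsi, and these guessed rules would not necessarily reproduce other forms such as Arotlura.

```python
import re
import unicodedata

# Rules are (pattern, replacement) pairs applied in order.
YIVRIAN_RULES = [
    ("ð", "r"),  # from the post: CY *ð is reflexed as r in Yivrian
    ("ā", "a"),  # assumed: vowel length is simply dropped in this sketch
]

PRASEO_RULES = [
    ("ð", "z"),        # assumed
    ("t(?=i)", "ts"),  # assumed: t becomes ts before i
    ("ā", "a"),        # assumed
]


def apply_rules(proto_form, rules):
    """Apply each sound change in order to a starred proto-form."""
    # Normalize so that a precomposed ā and a + combining macron match.
    form = unicodedata.normalize("NFC", proto_form.lstrip("*"))
    for pattern, replacement in rules:
        form = re.sub(unicodedata.normalize("NFC", pattern), replacement, form)
    return form


yivrian = apply_rules("*aðāti", YIVRIAN_RULES)
praseo = apply_rules("*aðāti", PRASEO_RULES)
print(yivrian, praseo.capitalize())
```

Keeping the rules as ordered data makes it cheap to try a different proto-form or a new sound change and immediately see what every daughter language does with it, which is the kind of experiment described above.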

Yivrian has been conceived as a member of a language family, though none of the other members of the family have ever seen significant development. This presented me with a challenge and an opportunity when I sat down to write my current WiP, since it’s set in a part of the world that doesn’t speak Yivrian, but rather one of its sister languages, Praseo.

Praseo, known as Praçí in an earlier incarnation, is the language of the city of Prasa (formerly Praç) and its environs. Praseo was originally conceived as a Portuguese-like relative of Yivrian, and it retains several features from that early stage: nasal vowels, several syllable reductions, and vocalization of coda -l to create lots of diphthongs ending in -o. However, over time that Portuguese flavor has largely been lost, partly because my conception of the conculture of the Yivrian cultural area changed quite a bit, and partly because the Yivrian lexical base was hard to warp into something that felt Romance-like. The current incarnation has phonoaesthetic elements of Japanese and Pacific Northwest Native American languages to go with the Portuguese substrate, but it retains enough similarity to Yivrian to feel like a member of the same family.

But I may be getting ahead of myself. Let me show you a chart of Yivrian’s close relatives:

There is not a lot of breadth here, as the near relatives of Yivrian number only two (plus a possible, unnamed third language which I haven’t included in the chart above, since it’s little more than conjecture at this point). For brevity, I often refer to the languages mentioned above with the following abbreviations:

  • PY (Proto-Yivril)
  • CY (Common Yivrian)
  • OY (Old Yivrian)
  • OTz (Old Tzingrizil)
  • Y (Yivrian)
  • Pr (Praseo)
  • Tz (Tzingrizil)

[Speaking from an in-universe perspective:] Yivrian, Praseo, and Tzingrizil are all attested, literary languages that had hundreds of thousands of speakers during the classical period and are quite well-documented. The “old” languages Old Yivrian and Old Tzingrizil are also attested, though much more sparsely, while the ancestral languages of Common Yivrian and Proto-Yivril are only known through reconstruction. There is a pretty high level of intelligibility between OY and OTz, though the daughter languages are all mutually unintelligible.

[Speaking from a conlanging perspective:] My Yivrian lexicon includes the ancestral Common Yivrian forms for all of its entries (except those that are borrowings or later coinages, obviously), which means that I have a pretty large CY lexical base to use for deriving other sister languages. The challenge that this presents, though, is to create forms that are phonoaesthetically pleasing and linguistically plausible for the other daughter languages. Y itself doesn’t have this problem, since all of the CY forms in the lexicon were created by extrapolating backwards from Y, but I find that the sound changes from CY forward to the other daughters require a lot of tweaking. Next post I may go into detail on a few of the problems I encountered and some solutions that I worked up for them.

But using Praseo for my WiP actually presented a bigger problem. Namely, the people in the book don’t actually speak Praseo yet.

The book I’m working on is set in an early part of the Yivrian history, at the stage of Old Yivrian and Old Tzingrizil. The story takes place in and around Prasa, but at the time of the story Prasa had been settled by explorers from Tsingris for only about two generations. So the characters are largely speaking OTz. But I don’t want to use OTz for my names and language snippets, because OTz is ugly. I have strong phonoaesthetic expectations for my conlangs, especially those that are going to go into books. I consider OY and OTz to be intermediate stages, and I don’t much worry about tuning them, but this means that they aren’t suitable for use as the main conlang sources in a novel. Furthermore, I have to keep my readers in mind — it would be really confusing if I publish three different novels in different time periods, and all of the place names were slightly different in each book due to linguistic shifts.

To get around these problems, I established a policy which I intend to follow from here on out. All place names and language snippets appear in the canonical, classical form of each language, which is generally the latest developed form of the language in con-historical time. Furthermore, wherever possible place names are given in the form of the local language, regardless of the language of the speaker or POV character in the story. That is, the city of Prasa will always be called Prasa, since that’s its name in Praseo, despite the fact that in Y the name is Parath, and in Tz it’s something else yet.

But all of this is just backstory and extra-literary justification for my linguistic decisions. The actual work of creating Praseo is still underway, and next week I’ll talk about some of the challenges.

In conlang parlance, a naming language is a language sketch which is designed only for generating names in a work of fiction. Naming languages are sometimes held in low regard by conlangers as not being "real" languages, but this is an unnecessary bias. A naming language is like a minimalist painting: it only consists of a few strokes, but it should suggest the shape of something much bigger, and when done well it has a beauty and an elegance of its own.

Also, often you just don’t have time to create a full language. And that’s how it was with Yakhat: I needed a language to provide placenames and personal names for one of the tribes in the story, but I didn’t have the time or the interest to develop a full-blown lang for them. So I made a naming language.

All you need for a good naming language is two things:

  1. A phonology
  2. Some basic morphology

Yakhat phonology is very simple. I want the language to be reminiscent of the languages of Southeast Asia, so I pick out the following consonant phonemes:

p   t   tʃ  k
b   d   dʒ  g
bʱ  dʱ      
            kʰ
    s   ʃ
m   n
    l, r 
        j

Some unusual things to note: we have a single series of aspirated stops, but the labial and dental members are phonetically voiced, while the velar member is voiceless. At a featural level, all of these stops are unspecified for voice, but the labial and dental members are phonetically voiced because they lie further forward in the oral cavity and thus easily fall prey to spontaneous voicing. And why is the aspirated affricate missing? Here I imagine that there once was an aspirated affricate /tʃʰ/, but that this member became deaffricated, giving the /ʃ/ phoneme shown above.

Meta-linguistic concerns actually drive most of the decisions above. I like the digraphs bh and dh, but I dislike ph and th, since English speakers are likely to pronounce those as [f] and [θ] respectively. Furthermore, /tʃʰ/ is nearly impossible to romanize well, as you either choose the abominable chh, or you use ch and then find some other way to indicate /tʃ/. The conjectured sound change above justifies me avoiding it, and gives me an excuse to include /ʃ/, which I had already used in several names that I liked very much.

To this basic phoneme set, I add a few basic phonotactic constraints and some phonological processes, which I won’t cover in detail here. You’ll see some of them in action below.

On to morphology. For the purposes of my language, I created exactly two morphemes: a patronymic suffix -lik, and a reduplicative suffix for collective plurals. The patronymic is unremarkable. The primary character from this tribe is named Keshlik /ˈkɛʃlɪk/, the son of Keishul /ˈke:ʃul/. In the derivation of that name you can observe a few phonological processes at work, such as syncope of an unstressed vowel, but otherwise there’s little to say.

The reduplicative plural is much more interesting. The hometown of the primary character is Khaat Ban [kʰa:t ban], and the people from his town are known as the Khaatat [kʰa:tat]. This collective plural is formed by reduplicating the vowel and final consonant of the stem: Khaat Ban gives Khaatat, those from Louk Ban are the Lougok, and those from Bhut Ban the Bhudhut, etc. You can observe several phonological changes in these forms. For example, voice and aspiration are both neutralized in codas, so that Bhut has the underlying form /bʱudʱ/ which is realized as [bʱut] in the simple name, but the underlying form of the final consonant reasserts itself in the reduplicated form.
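
The reduplication rule and the coda neutralization can be sketched together. This is a minimal illustration under stated assumptions: stems are written in romanized underlying form, where bhudh follows the /bʱudʱ/ given above and loug is inferred from Lougok. The function names are hypothetical, only stops are neutralized, and the rule that the copy takes just the first vowel letter (aa gives a, ou gives o) is read off the three examples, not stated anywhere as a rule.

```python
VOWELS = "aeiou"

# Coda neutralization: voice and aspiration are lost in codas.
# Only stops are covered; digraphs come first so "dh" is tried before "d".
CODA_NEUTRALIZATION = [
    ("bh", "p"), ("dh", "t"), ("kh", "k"),
    ("b", "p"), ("d", "t"), ("g", "k"),
]


def neutralize_coda(form):
    """Devoice and deaspirate a word-final stop, if there is one."""
    for underlying, surface in CODA_NEUTRALIZATION:
        if form.endswith(underlying):
            return form[: -len(underlying)] + surface
    return form


def split_stem(stem):
    """Split a one-syllable stem into (onset, vowel letters, final consonant)."""
    end = len(stem)
    while end > 0 and stem[end - 1] not in VOWELS:
        end -= 1
    start = end
    while start > 0 and stem[start - 1] in VOWELS:
        start -= 1
    return stem[:start], stem[start:end], stem[end:]


def simple_name(stem):
    """Surface form of the bare stem, as used before Ban."""
    return neutralize_coda(stem).capitalize()


def collective_plural(stem):
    """Copy the vowel and final consonant, then neutralize the new coda."""
    _, vowel, final = split_stem(stem)
    # Assumption: only the first vowel letter is copied (aa -> a, ou -> o).
    reduplicant = vowel[0] + final
    return neutralize_coda(stem + reduplicant).capitalize()


for stem in ["khaat", "loug", "bhudh"]:
    print(simple_name(stem), "Ban ->", collective_plural(stem))
```

Because the reduplicant is attached to the underlying stem and neutralization runs only at the end of the word, the stem-final consonant keeps its voicing and aspiration in the middle of the plural, which is the reassertion effect described above.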

And that’s it! With a relatively simple phonology, a few phonological rules, and some morphemes I have a naming language, but one that has just enough depth to suggest that a complete language underlies it. I don’t know what the stems of the names mean, and I don’t need to. If I ever decide that I need to elaborate Yakhat further, I’ll already have the groundwork laid down to create something fuller.

Next time: Praseo, and the challenges of developing something for a language family you already have.

One of the longest-lasting and most rewarding friendships of my life began in the sixth grade. I had just transferred to a new school, and being a shy, unathletic kid, I naturally gravitated to the other shy, unathletic kids, which in this case included Brett: a tall, skinny boy with glasses, allergies, and a gloriously nerdy set of interests. We played chess and read books together at recess. He got me to read Tolkien. And he got me into language.

In sixth grade Brett had already studied Latin and Old English, and his enthusiasm for arcane and obscure linguistic trivia infected me. I started studying Hebrew, we both dabbled in Tolkien’s languages, and we both tried to make our own languages. His languages were initially much better than mine, as he had a big head start on linguistics, and having two foreign languages already under his belt was a tremendous advantage for his initial language-construction forays. He taught me the International Phonetic Alphabet and the basics of phonology and historical linguistics. I don’t exaggerate much to say that my friendship with Brett changed my life: the interest in linguistics that he sparked never died out; Linguistics became my major in college, which led indirectly into my current day job; and my linguistic training was part of what motivated and prepared me to go to Romania where I met my wife.

He’s still better than me at linguistics, too, since he is in the last stages of finishing his PhD in linguistics, while I have a lowly B.A.

However, I do have one thing over him: I kept up the hobby of language creation (conlanging, as we call it), while he seemed to abandon it in high school. I’ve continued to develop languages for my fictional settings and my private amusement, and just the other day I completed an application for an actual paid conlanging gig. At this point I have at least one well-documented language, Yivrian, and a whole slew of sketches, planned languages, and notes.

I’ve also put a lot of work lately into Praseo, the language used in my current WIP. And with the confluence of conlang-y things going on in my life right now, this seems like a good time to write about that aspect of my writing process, talking about how I use and create languages for my fictional settings, with pointers to how you can do the same if you’re interested.

Next week: a naming language.