Understanding Morphology - Introduction and Basic Concepts

paper fragments with different words on them
Jason Leung

What is Morphology

Definition 1

Morphology is the study of systematic covariation in the form and meaning of words.

Morphology is the study of the internal structure of words [1]. Words have internal structure in two very different senses.

  1. Words are made up of sequences of sounds (or gestures in sign language), i.e. they have internal phonological structure.

  2. Formal variations in the shapes of words correlate systematically with semantic changes. Morphological structure exists if there are groups of words that show identical partial resemblances in both form and meaning.

Definition 2

Morphology is the study of the combination of morphemes to yield words.

Morphological analysis consists of the identification of parts of words, or, more technically, constituents of words.

The smallest meaningful constituents of words that can be identified are called morphemes.

Morphology in Different Languages

Linguists use the terms analytic and synthetic to describe the degree to which a language makes use of morphology.

  • When a language has almost no morphology and thus exhibits an extreme degree of analyticity, it is also called isolating.

  • When a language has an extraordinary amount of morphology and many compound words, it is called polysynthetic.

The distinction between analytic and (poly)synthetic languages is not a bipartition or tripartition, but a continuum.

Basic Concepts

Consider the examples below.

\begin{array}{llll}
\textit{read} & \textit{read-s} & \textit{read-er} & \textit{read-able} \\
\textit{write} & \textit{write-s} & \textit{writ-er} & \textit{writ-able} \\
\textit{kind} & \textit{kind-ness} & \textit{un-kind} \\
\textit{happy} & \textit{happi-ness} & \textit{un-happy}
\end{array}

These words are easily segmented, i.e. broken up into individually meaningful parts. These parts are called morphemes. Morphemes can be defined as the smallest meaningful constituents of a linguistic expression.

Words like \textit{chameleon} cannot be segmented into several morphemes; these words are monomorphemic.
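As a rough illustration of what "identical partial resemblances" means, the hyphenated segmentations listed above can be indexed by their parts. This is only a sketch: the function name `morph_index` and the list `SEGMENTED` are made up for this example, and the segmentation is taken as given rather than discovered.

```python
from collections import defaultdict

# Hand-segmented examples from the text; hyphens mark morpheme boundaries.
SEGMENTED = [
    "read", "read-s", "read-er", "read-able",
    "write", "write-s", "writ-er", "writ-able",
    "kind", "kind-ness", "un-kind",
    "happy", "happi-ness", "un-happy",
]

def morph_index(words):
    """Map each segment (as spelled) to the words that contain it."""
    index = defaultdict(list)
    for word in words:
        for segment in word.split("-"):
            index[segment].append(word)
    return dict(index)

index = morph_index(SEGMENTED)
print(index["ness"])  # ['kind-ness', 'happi-ness']
print(index["un"])    # ['un-kind', 'un-happy']
```

A monomorphemic word such as "chameleon" yields a single segment. Note that the index treats the spellings "write"/"writ" and "happy"/"happi" as different segments, because it compares strings only.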

Different Notions of 'word'

The Problem

The most basic concept of morphology is of course the concept 'word'. But how many words are there in the first sentence of this paragraph? In written English, words are separated by blank spaces, so a word may be considered a sequence of letters. On that view, \textit{live, lives, lived, living} are different sequences of letters and thus different words. But when a dictionary is made, not all of these words are listed. The dictionary user is expected to know that \textit{live, lives, lived, living} are all concrete instantiations of the 'same' word LIVE.

From the examples above, we can see that even the term 'word' may refer to different notions in different contexts. There are three different notions of 'word'.

word token

When a word is used in some text or speech, that occurrence of the word is sometimes referred to as a word token.


lexeme

A lexeme is a word in an abstract sense. It is usually written in small capital letters.

Lexemes are abstract entities that have no phonological form of their own.

LIVE is a verb lexeme. It represents the core meaning shared by forms such as \textit{live, lives, lived, living}.


word-form

A word-form is a word in a concrete sense. It is a sequence of sounds that expresses the combination of a lexeme and a set of grammatical meanings appropriate to that lexeme.

Lexemes are like sets of word-forms, and every word-form belongs to one lexeme. The set of word-forms that belongs to the same lexeme is called a paradigm.
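The three notions can be told apart by counting them separately. The sketch below assumes a toy lexicon (`PARADIGMS`) and a naive whitespace tokenizer, both invented for illustration; real text would need a proper tokenizer and a much larger lexicon.

```python
# Toy lexicon: each lexeme is modelled as the set of its word-forms (its paradigm).
# The entries are an assumption for this example, not a complete description.
PARADIGMS = {
    "LIVE": {"live", "lives", "lived", "living"},
    "READ": {"read", "reads", "reading"},
}

# Reverse lookup: every word-form belongs to one lexeme.
FORM_TO_LEXEME = {
    form: lexeme for lexeme, forms in PARADIGMS.items() for form in forms
}

PUNCTUATION = ".,;:!?'\""

def tokenize(text):
    """Split on blank spaces, strip surrounding punctuation, lowercase."""
    stripped = (chunk.strip(PUNCTUATION).lower() for chunk in text.split())
    return [token for token in stripped if token]

def summarize(text):
    """Return (number of word tokens, number of distinct forms, lexemes found)."""
    tokens = tokenize(text)
    lexemes = {FORM_TO_LEXEME[t] for t in tokens if t in FORM_TO_LEXEME}
    return len(tokens), len(set(tokens)), lexemes

text = "She lives where he lived, and they live there too; living there is easy."
print(summarize(text))  # (14, 13, {'LIVE'})
```

The sentence has 14 word tokens but only 13 distinct forms, because "there" occurs twice, and four of those forms belong to the single lexeme LIVE.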

A set of related lexemes is called a word family. For example, here are two English word families.


Why Different Morphological Relationships?

The Problem

From the example of the two English word families above, we can see that READ and READABLE are treated as two different lexemes, rather than \textit{readable} being a word-form of the lexeme READ. That is to say, words like \textit{readable, reader} get their own entries in a dictionary, but \textit{reads, readings} do not. Thus, the difference between word-forms and lexemes, and between paradigms and word families, is well established in the practice of dictionary-makers. Why is it that the different morphological relationships are treated in different ways?

Complex lexemes (such as READER or LOGICIAN) generally denote new concepts that are different from the concepts of the corresponding simple lexemes; word-forms often exist primarily to satisfy the requirements of the syntactic machinery of the language.

Complex lexemes are less predictable than word-forms, so they must be listed separately in dictionaries. For example, we cannot predict whether \textit{logician} or \textit{*logicist} is the correct form, and the meaning of a complex lexeme is often unpredictable, too.

Kinds of Morphological Relationships

  • inflection: the relationship between word-forms of a lexeme
  • derivation: the relationship between lexemes of a word family
  • compounding: some morphologically complex words belong to more than one word family simultaneously, like FIREWOOD, which belongs to the families of both FIRE and WOOD

Figure: morphological relationships
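The difference between inflection and derivation can be sketched as two lookups: one from word-form to lexeme, and one from lexeme to word family. The tables `LEXEME_OF` and `FAMILY_OF` below are small hypothetical samples built from the examples in this section, and compounding is deliberately left out.

```python
# Hypothetical sample data: word-form -> lexeme, and lexeme -> word family.
LEXEME_OF = {
    "read": "READ", "reads": "READ",
    "reader": "READER", "readers": "READER",
    "readable": "READABLE",
    "logic": "LOGIC", "logician": "LOGICIAN",
}
FAMILY_OF = {
    "READ": "READ family", "READER": "READ family", "READABLE": "READ family",
    "LOGIC": "LOGIC family", "LOGICIAN": "LOGIC family",
}

def relationship(form_a, form_b):
    """Classify how two word-forms are related (compounds are not handled)."""
    lexeme_a, lexeme_b = LEXEME_OF[form_a], LEXEME_OF[form_b]
    if lexeme_a == lexeme_b:
        return "inflection"   # same lexeme, i.e. same paradigm
    if FAMILY_OF[lexeme_a] == FAMILY_OF[lexeme_b]:
        return "derivation"   # different lexemes, same word family
    return "unrelated"

print(relationship("read", "reads"))      # inflection
print(relationship("read", "reader"))     # derivation
print(relationship("reads", "logician"))  # unrelated
```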

Affixes, Bases and Roots

The Problem

In both inflection and derivation, morphemes have various kinds of meanings. Some meanings are very concrete and can be described easily (like the meanings of the morphemes \textit{wash, logic, read}), but others are abstract and more difficult to describe. For instance, the morpheme \textit{-al} in \textit{logic-al} can be said to mean 'relating to'. But some meanings are even more abstract, like English \textit{-s} in \textit{read-s}. It is required only when the subject is a third person singular noun phrase, and it is unclear whether it can be said to have a meaning at all.

In such cases, these morphemes are said to have certain grammatical functions. Word-forms in an inflectional paradigm generally share one longer morpheme with a concrete meaning and are distinguished from each other in that they additionally contain different shorter morphemes, called affixes.

An affix attaches to a word or a main part of a word. It usually has an abstract meaning, and an affix cannot occur by itself.

The part of the word that an affix is attached to is called the base. A base is also called a stem, especially in an inflectional morphological relationship.

Different kinds of affixes are distinguished by the position of the affix within a word.

  • prefix: precedes the base, e.g. English \textit{un-} in \textit{un-happy}
  • suffix: follows the base, e.g. English \textit{-ful} in \textit{hand-ful}
  • infix: occurs inside the base, e.g. Tagalog \textit{-um-} in \textit{s-um-ulat} 'write'
  • circumfix: occurs on both sides of the base, e.g. German \textit{ge-...-en} in \textit{ge-fahr-en} 'driven'
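The four positions can be checked mechanically when the base and the full word are both known. The function below is a minimal sketch that compares spellings only; the name `classify_affix` is invented here, and it ignores spelling changes in the base (such as \textit{happy} / \textit{happi-ness}).

```python
def classify_affix(base: str, word: str) -> str:
    """Guess the affix type from where the extra material sits relative to the base."""
    if len(word) <= len(base):
        return "none"
    if word.endswith(base):
        return "prefix"      # extra material precedes the base
    if word.startswith(base):
        return "suffix"      # extra material follows the base
    if word.find(base) > 0:
        return "circumfix"   # base is intact, with material on both sides
    extra = len(word) - len(base)
    for i in range(1, len(base)):
        # Does removing `extra` letters at position i give back the base?
        if word[:i] == base[:i] and word[i + extra:] == base[i:]:
            return "infix"   # extra material interrupts the base
    return "unknown"

print(classify_affix("happy", "unhappy"))  # prefix
print(classify_affix("hand", "handful"))   # suffix
print(classify_affix("sulat", "sumulat"))  # infix
print(classify_affix("fahr", "gefahren"))  # circumfix
```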

Bases or stems can be complex themselves. For example, in \textit{activity}, \textit{-ity} is a suffix attached to the base \textit{active}, which itself consists of the suffix \textit{-ive} and the base \textit{act}.

A base that cannot be analyzed any further into constituent morphemes is called a root.
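Because a base can itself contain a base, word structure is naturally recursive. The sketch below models this with a small hypothetical class `Affixed` and peels off affixes until only the root remains; it does not record whether an affix is a prefix or a suffix.

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass
class Affixed:
    """A base combined with one affix; the base may itself be complex."""
    base: Union["Affixed", str]
    affix: str

def root_of(node: Union[Affixed, str]) -> str:
    """Strip affixes layer by layer until an unanalyzable base is left."""
    while isinstance(node, Affixed):
        node = node.base
    return node

def affixes_of(node: Union[Affixed, str]) -> List[str]:
    """List the affixes from the outermost layer inwards."""
    found = []
    while isinstance(node, Affixed):
        found.append(node.affix)
        node = node.base
    return found

# activity = [[act + -ive] + -ity]
activity = Affixed(base=Affixed(base="act", affix="-ive"), affix="-ity")
print(root_of(activity))     # act
print(affixes_of(activity))  # ['-ity', '-ive']
```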

A base may or may not be able to function as a word-form. For example, in English, \textit{cat} is both the base of the inflected form \textit{cats} and itself a word-form. However, the Spanish word-form \textit{gato} ('cat') can be broken up into the suffix \textit{-o} ('masculine') and the base \textit{gat-}, but \textit{gat-} is not a word-form. Bases that cannot function as word-forms are called bound stems.

Roots and affixes can generally be distinguished quite easily, but in some languages they cannot. For example, the Salishan language Bella Coola has a number of suffix-like elements that nevertheless have concrete meanings.

\begin{array}{llll}
\textit{-us} & \text{'face'} & \textit{-lik} & \text{'body'} \\
\textit{-an} & \text{'ear'} & \textit{-altwa} & \text{'sky, weather'} \\
\textit{-uc} & \text{'mouth'} & \textit{-lt} & \text{'child'}
\end{array}

English also has a number of morphemes that are similarly difficult to classify as roots or affixes.

\begin{array}{ll}
\textit{biogeography} & \textit{aristocrat} \\
\textit{bioethics} & \textit{autocrat} \\
\textit{bioengineering} & \textit{democrat}
\end{array}

The elements \textit{bio-} and \textit{-crat} could be regarded as affixes because they do not occur as independent lexemes. But their meanings are very concrete, so they could equally be regarded as bound stems that have the special property of occurring only in compounds.

Morphemes and Allomorphs

Morphemes may have different phonological shapes under different circumstances. The term allomorph is used in this situation.

For example, the plural suffix for nouns in Turkish has two forms, \textit{-ler} and \textit{-lar}, due to vowel harmony.

\begin{array}{ll}
\textit{ev - evler} & \textit{dağ - dağlar} \\
\textit{fil - filler} & \textit{yıl - yıllar} \\
\textit{göl - göller} & \textit{top - toplar} \\
\textit{gün - günler} & \textit{pul - pullar}
\end{array}
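The pattern in these pairs follows the last vowel of the stem: front vowels (e, i, ö, ü) take \textit{-ler} and back vowels (a, ı, o, u) take \textit{-lar}. A minimal sketch of that rule, assuming lowercase input and ignoring exceptional loanwords (the function name `turkish_plural` is made up for this example):

```python
FRONT_VOWELS = set("eiöü")
BACK_VOWELS = set("aıou")

def turkish_plural(noun: str) -> str:
    """Pick -ler or -lar according to the last vowel of the stem."""
    for letter in reversed(noun):
        if letter in FRONT_VOWELS:
            return noun + "ler"
        if letter in BACK_VOWELS:
            return noun + "lar"
    raise ValueError(f"no vowel found in {noun!r}")

for noun in ["ev", "dağ", "fil", "yıl", "göl", "top", "gün", "pul"]:
    print(noun, "-", turkish_plural(noun))
```

The single function body corresponds to one suffix with two shapes: the choice between them is computed from the stem rather than stored with each noun.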

Types of Allomorphs

Phonological Allomorphs

From the example above, we can see that the two different forms of the suffix have the same meaning and are phonologically similar. Being phonologically similar is a common property of allomorphs, but is not a necessary one. Allomorphs that have this property are phonological allomorphs. The formal relation between phonological allomorphs is called an alternation. Phonological allomorphs occur in different environments in complementary distribution.

It's convenient to think about phonological allomorphy in terms of a single underlying representation that is manipulated by rules under certain conditions, resulting in the actually pronounced surface representation.

Suppletive Allomorphs

Morphemes may also have allomorphs that are not at all similar in pronunciation. These are called suppletive allomorphs, as in English \textit{go/went} and \textit{good/better}. The term suppletion is most often used for stem shapes, but affixes are also potentially suppletive.

Whether an alternation is phonological or suppletive is not a clear-cut binary distinction.

  • weak suppletion: the allomorphs still share some sounds, as in English \textit{buy/bought}, \textit{teach/taught}
  • strong suppletion: the allomorphs share no sounds at all, as in English \textit{go/went}, \textit{good/better}

Conditioning of the Allomorphy

Which allomorph is selected depends on a conditioning factor, and there are three kinds of conditioning.

  • phonological conditioning: the choice of allomorph depends on the phonological context, e.g. the English plural depends on the final sound of the stem
  • morphological conditioning: the choice of allomorph depends on the morphological context, e.g. the Spanish verb \textit{ir} has the stem \textit{va-} or \textit{fu-} depending on tense
  • lexical conditioning: the choice of allomorph depends on the individual lexical item, e.g. the English past participle \textit{-en/-ed} is unpredictable and depends on the individual verb
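The practical difference between phonological and lexical conditioning is that the first can be computed from the stem, while the second has to be looked up. The sketch below uses the usual textbook description of the English plural (/ɪz/ after sibilants, /s/ after other voiceless consonants, /z/ elsewhere); the phoneme sets, the stem transcriptions, and the tiny participle table are simplifying assumptions, and some analyses transcribe the first allomorph as /əz/.

```python
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ"}  # voiceless non-sibilant consonants

def plural_allomorph(phonemes):
    """Phonological conditioning: computed from the stem-final sound."""
    final = phonemes[-1]
    if final in SIBILANTS:
        return "ɪz"
    if final in VOICELESS:
        return "s"
    return "z"

# Lexical conditioning: nothing in the sound of the verb predicts the
# choice, so it is stored per verb (hypothetical sample entries).
PARTICIPLE_SUFFIX = {"take": "-en", "eat": "-en", "walk": "-ed", "bake": "-ed"}

print(plural_allomorph(["k", "æ", "t"]))  # s   (cats)
print(plural_allomorph(["d", "ɒ", "g"]))  # z   (dogs)
print(plural_allomorph(["b", "ʌ", "s"]))  # ɪz  (buses)
print(PARTICIPLE_SUFFIX["take"], PARTICIPLE_SUFFIX["bake"])  # -en -ed
```

\textit{take} and \textit{bake} rhyme, yet they select different participle suffixes, which is why a lookup table is needed there and a rule is not enough.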


  1. We should define what a 'word' is before giving a definition of morphology. But to keep things simple, we appeal to a loose, intuitive concept of 'word' for now.