April 24, 2016

Remembering Life Stories (5): Biographical Expertise

our ability to “chunk” elaborate sequences into single units of mnemonic content is dramatically enhanced by extreme familiarity with diverse variations of common biographical patterns

Remembering is constructive, but in more ways than one. Beyond the famous “schema theory” of Frederic Bartlett (in which “gaps” are filled in due to subconscious prejudice) we’ve also discussed “memory for the time of events” according to William Friedman (in which “dots” are connected via contextual implications). Now, in my last post, I’ve introduced what might be considered a third manner of mnemonic reconstruction, in which a familiar serial pattern is remembered constructively, piece by piece, and yet recognized as a whole. Even if this typology is not yet as crisp as I’d like, one thing does seem clear. Common patterns of Biographical Temporality can provide ready-made sequences which can help structure a reader’s mnemonic fabula of a Life Story. A familiar serial pattern is mnemonically compressed into one “chunk”, one single “unit” of content. Because of this cognitive process, any reader familiar with Biographical Patterns of Temporality is naturally prone to find greater coherence in Biographical Storylines.

This contention demands further scrutiny. One challenging question we’ve not yet asked has to do with complexity. For a given reader, how many Life Story patterns can be chunked as units (and thus, efficiently reconstructable)? How many of these patterns can individually be recognized as separate and distinct whole units? If “unit-izing” a single pattern requires strong familiarity, how can a general readership be “familiar” with a broad diversity of various patterns?

In part one of this series, I said we should never suppose all readers’ minds generically or uniformly employ the same general pattern of biographical content, such as Burridge’s stages of composed ancient ‘bioi’. In parts two and three, I suggested that probability and statistical awareness can support any particular pattern a given mind might so happen to employ. In today’s post my goal is to mediate between these two points. How can a multitude of distinct biographical patterns be available, each one being as unique as some particular life story might require, and yet also being general enough that a given biography can be remembered coherently by a broad audience of individual readers? How many biographical patterns can the mind possibly recognize as unified life stories?

The answer depends not on the strength of an individual mind, but its orientation. The volume and variety of familiar sequences a mind can learn (i.e., “chunk” into long-term memory) depends on the studied interests of that particular mind. A depth of interest and years of paying attention to one subject - the redundant encoding of repeated observation over countless iterations - produces what psychologists call “expertise”. In short, we differentiate biographical patterns in memory because we have all become expert observers of basic human behavior.

As Carlyle first pointed out, people are obsessed with people. There is no other subject to which we pay such close and constant attention. Thus, generally by adulthood, biographical patterns become our area of shared expertise.


Psychological discussions of expertise usually begin with the famous studies of chess masters, begun by Adriaan de Groot in the 1940s and followed by William Chase & Herbert Simon in the 1970s. Surprisingly, instead of higher IQs or superior mnemonic ability in general, de Groot found that what distinguished chess masters from novice players was their studied familiarity with the patterns of the game. For example, when asked to memorize a chess board with some chess pieces laid out in random positions, a grand master and a beginner would fare equally well. However, when asked to reconstruct board layouts from actual game play, the expert players were far superior in both speed and accuracy. It was Chase & Simon (drawing on G.A. Miller) who first realized that experienced players were reconstructing the board in chunks, because they remembered the placement of pieces by groupings, rather than individually. By contrast, without a sense of the structural contexts involved in situations of actual gameplay, an inexperienced player can only attempt to memorize pieces one at a time. To a novice, every placement seems random. To the expert, complex patterns are not only familiar, but integral.

What is most significant here are the contrasts. First, it takes familiarity in a domain to observe patterns amidst the arbitrary. Chess is definitely arbitrary. The total number of possible layouts during actual chess games begins with “1” and ends with dozens of zeroes. Within those possibilities, however, the structural contours and basic rules of gameplay guarantee statistically that some layouts are going to be far more common than others. That means frequent patterns are there to be found, amidst all the arbitrariness, but those patterns cannot be noticed until players develop expert-level familiarity. Second, the phenomenon of expertise is domain-specific. Chess masters are not exceptional at remembering in general. They only possess superior chunking and reconstructive abilities in their area of expertise. When psychologists discuss “expertise”, this domain-specificity is always implied.

All this was well summarized in the Oxford Handbook of Memory by Robert Lockhart. “It indicates that memory superiority is limited to the area of expertise and that the impact of expertise on memory lies in the expert’s ability to extract meaningful patterns from stimuli that to the novice are effectively random.” In short, we most easily recognize things we already know.

Aside from Chess, the cognitive benefits of specific expertise have been documented in other areas such as music, science, sports, and medicine, most famously by the “expert expert” Anders Ericsson. While Ericsson’s work has focused on task oriented performance, rather than information oriented recognition, and although he prefers to write in terms of knowledge and learning rather than memory and chunking, the core principles stand in alignment. For instance, in one 1999 article, Ericsson notes that experts demonstrate superior quality in their mental representations, and these “acquired representations appear to be essential for experts’ ability to monitor and evaluate [and improve] their own performance” (MITECS p.299). Surely this process of acquiring and leveraging “mental representations” is equivalent to cognitive chunking and its applications in constructive remembering, because expert level performance relies on first having built up an extensive knowledge of the domain. This is self-evident, for instance, as much in the number of chords and progressions learned by musicians before they surpass Ericsson’s “ten thousand hour” benchmark as it is for the detailed variations learned by experienced dancers which enable them to pick up new choreography quickly. Indeed, the same holds for learning a second language; those who pay rigorous attention to mastering patterns of grammar and syntax will acquire the ability to apply that knowledge (to employ their memory of those serial chunks) in speaking and reading fluently at advanced levels.

In addition to Chess, and other areas discussed by researchers like Ericsson, I submit that expertise applies also to this domain of Biography, and at this point we can once again start to get practical.

The reason Biographical Expertise is such an advantage for Remembering Life Stories is simply that readers have already done the hard work of learning various sequences before they ever begin reading. That is, such readers have previously spent their whole lives cognitively chunking a multitude of acutely differentiated domain-specific serial patterns which represent commonly observed sequences in human behavior, and this includes all the “time patterns” Friedman accounted for as “general knowledge”. For example, all the illustrations of temporal patterns I offered in post #3 are most likely embedded right now in the remembering minds of whoever buys next month’s most popular biography.

It’s worth noting here, from a hermeneutical point of view, that whether the arbitrary sequence of a particular Biography can be said to contain non-random subsequences is dependent upon prior audience knowledge, which means the “Unity” of a Life Story depends on information not in the text. Traditionally, such a prospect has been hermeneutically daunting, and I cannot help suggesting this may be one factor that helps explain why critical theorists of historical narrative have traditionally quarantined Biography as a genre.

What’s most challenging about all this, at the moment, is attempting to grasp just how immense and how vastly differentiated this collection of biographical patterns must be. How can it possibly be true that so many distinct sequences can become “familiar”, even to experts?

Consider the label “education & career”. Whether this counts as one concept or as two put together, the number of actual sequences entailed by “education and career” can be endlessly diversified. On the education side, there is elementary school, middle school, high school. Or perhaps some particular mind focused more specifically on each individual grade (K, 1st, 2nd, etc). There are also variations less common but not uncommon, such as dropping out and returning to school, or dropping out and passing the general equivalency test, or being homeschooled, or being educated on various army bases during a parent’s multiple deployments, or just taking a year off due to illness or circumstance. And so on and so on, ad infinitum. On the career side, again, variations abound. Many careers begin with graduation, but some careers begin before commencement. Entrepreneurs and professional athletes are only two of the more high profile examples. So there is sometimes “startup year, graduation, second year…” Then there are internships, a distinctly different pattern of unpaid labor in between graduation and entry-level career opportunities, which becomes yet another unique sequence when a summer internship takes place one or two years prior to graduation. There is also an endless variety of patterns within individual careers. For some blue-collar job tracks, there are strict thresholds for “rookie, journeyman, veteran”. In other fields, sadly, it can often be typical for an entire career sequence to be defined by a series of entry level jobs that never result in advancement. You, dear reader, may not have known such workers intimately, but you’ve interacted with literally thousands of cashiers, and you’ve easily had hundreds of chances to observe similar low-level workers interact with their managers.

This incredible diversity of biographical sequences seems to defy formal classification as a collection of patterns. From a wide lens perspective, it looks more like chaos. But then, consider how chaotic the English language must have seemed before Samuel Johnson. How many domain-specific patterns can one expert learn to recognize? Well, how many words do you know in the dictionary? If you can easily see how the letters “a-m-a-z-i-n-g-l-y” have been chunked in your own memory, then multiply that experience by the hundreds of thousands. The Oxford English Dictionary contains over 171,000 active words, 47,000 archaic words, and nearly ten thousand more with derivative sub-entries. If we also count plurals and forms in the past tense as distinct variations (which of course technically they are), then an avid reader might conceivably chunk 300,000 letter sequences in the course of a lifetime, if not perhaps even more. Just consider the variations “farm, farmed, farms, farmers, farmer’s, farmers’, farming, farmland” and ask yourself if the diversity of biographical sequences is comparatively worse.

How many variations of serial patterns can the mind actually establish as familiar? How many biographical patterns can we realistically remember? In theory, given enough patterned content and domain-specific expertise, we are each equally capable of recognizing (and “unit-izing”) as many serial patterns of familiar human behavior as the numbers of words we can spell. English writers long before Samuel Johnson had formalized the spelling of several thousand words. Over 30,000 distinct “chunks” of unitized content are displayed in Shakespeare’s writing alone, all of which - without a dictionary - he had to spell by heart. Critically, it’s beside the point to wonder if Shakespeare’s audience could do more than sound out a few basic spellings. The Bard himself had long passed through the requisite stages of acquiring language. His expertise was established and his chunked vocabulary was massive. In theory, we can all develop a similar volume of chunks in our repertoire. We can all develop increased powers of remembering, even of multiple groupings and sequences as whole units, provided that such patterned content belongs to a personal area of expertise.

More to the point, we have already developed (to our own personalized degree) our own relative expertise in observing patterns of biographical development.

Going back to the “farm, farmed, farms, farmers…” example, it might be considered helpful to realize that a large portion of the new patterns we typically chunk are no more than slight variations of patterns we’ve previously chunked. Along these lines, some psychologists expand chunking theory into something called Template Theory. The basic idea behind template theory is that a familiar sequence (“On top of Old Smokey, all covered in snow”) can facilitate the quick learning of a slightly modified sequence (“On top of spaghetti, all covered in cheese”). Test yourself quickly, please, by attempting to memorize these four strings of roughly 17 words each:

(1) Beulah asked Millicent to expunge the problematic automotive accounts and to validate Jerilyn’s mediocre draft proposal. (2) Take Johnson Highway until Henderson Trail. Take the next two left turns and you’re five miles away. (3) I came, I saw, I conquered, and I’d appreciate a little courtesy and respect for my trouble. (4) Ask not what your mother can do for you. Ask what you can do for your mother.

Notice how each verbal sequence is more patterned than the one before it. In the third string, you had the first six words memorized as soon as you’d read them, and the remaining portion was made up of common phrases that seem familiar together. However, the fourth string offered you an entire “Template” with one slight change. Where JFK had said “country”, we insert “mother”. I dare say you should have memorized that last one somewhat instantly. To round off this digression, template theory is one way of accounting for the more acute differentiations involved in learning a multitude of differentiated patterns.

There’s no question, dear reader, that your own expertise for such things has grown continuously during each decade of your own life experience. You’ve had to learn literally millions of patterns, but you’ve also had approximately two hundred million seconds of waking attention per decade to devote to such learning. You’ve watched, noticed, and thought about other people’s lives, consciously and subconsciously, probably more than you’ve done anything else. Thus, to some degree or another, you are all certifiable experts, by now, in observing biographical sequences.
Your biographical expertise makes it easy to diversify your own knowledge of biographical patterns, and on some level you have already done this for lots and lots and lots and lots of these patterns. It may seem incredible to think we have learned countless variations in biographical sequence, since we tend to think of each one needing to be learned individually before it can be individually familiar, but that’s not how the process actually works. In practice, a lot of patterns are similar.
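
For the curious, the “two hundred million seconds” figure above can be checked with some quick arithmetic. The sketch below assumes roughly sixteen waking hours per day, which is an assumption made here purely for illustration (the post does not specify one); the constant and function names are hypothetical.

```python
# Rough check of "two hundred million seconds of waking attention per decade".
# ASSUMPTION (not stated in the post): about 16 waking hours per day.
WAKING_HOURS_PER_DAY = 16
SECONDS_PER_HOUR = 60 * 60
DAYS_PER_DECADE = 3652.5  # ten years of 365.25 days each


def waking_seconds_per_decade(waking_hours_per_day=WAKING_HOURS_PER_DAY):
    """Return the approximate number of waking seconds in ten years."""
    return waking_hours_per_day * SECONDS_PER_HOUR * DAYS_PER_DECADE


if __name__ == "__main__":
    # With 16 waking hours per day this comes to about 210 million seconds.
    print(f"{waking_seconds_per_decade():,.0f}")
```

Under that assumption the total lands a little above two hundred million, so the round figure holds up as an order-of-magnitude estimate. A different guess about waking hours shifts the total proportionally.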

You learned to read and to recognize spellings of words as the result of a complicated process (about which psychologists don’t entirely agree) but the point at which you began to master spelling as a discipline was after you had thousands of opportunities to recognize common patterns in word structure. Think back. Nobody sat down and explained to you, then, that our most common vowel is E, or that T is most often followed by H, or that N is most often followed by a T or a D. They might once have mentioned that Q is invariably followed by U, but written English displays hundreds of relevant digraphs and trigraphs. What actually happened was consciously imperceptible, but on some level the frequent pairings and triplets were a benefit in your learning process. The common digraphs (TH, QU, NT, ND, etc) and trigraphs (THA, ENT, ION, TIO, etc) helped you learn to read and spell with proficiency.
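
The digraph and trigraph regularities mentioned above are easy to observe directly by counting letter sequences in any sample of text. The following is only a minimal sketch: the function name `ngram_counts` and the sample sentence are hypothetical, invented for illustration, and a sample this small says nothing reliable about English as a whole.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count letter sequences of length n that occur inside each word."""
    counts = Counter()
    for word in text.lower().split():
        # Keep letters only, so punctuation does not create false sequences.
        letters = "".join(ch for ch in word if ch.isalpha())
        for i in range(len(letters) - n + 1):
            counts[letters[i:i + n]] += 1
    return counts


# A made-up sample sentence; any larger body of text could be used instead.
sample = "the other thing that the nation then sent"
digraphs = ngram_counts(sample, 2)
trigraphs = ngram_counts(sample, 3)

if __name__ == "__main__":
    print(digraphs.most_common(3))
    print(trigraphs.most_common(3))
```

Run on a long enough text, a count like this should surface the same common pairings and triplets a young reader absorbs without ever being told about them.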

So it has been, all your life, with biographical sequences.

General readers already possess expertise in observing human experience, which allows readers’ minds to recognize various longitudinal patterns of behavior in narrative literature. Because our minds have long ago chunked a collection of similar patterns as sequentially ordered groupings of mnemonic content, our constructive remembering of (the chronological fabula of) narrated life stories is significantly advantaged. Even when life stories eschew narrativized emplotment, we remain capable of receiving such stories coherently despite their lengthy, elaborate, and arbitrary storylines. We can do this because biographical content is essentially self-structuring, and because the vast bulk of biographical structures are themselves self-contained (pre-structured, organized, chunked, “unit-ized”).
In short, our ability to constructively remember lengthy, elaborate sequences is greatly advantaged when working with biographical content.

As it so happens, Thomas Carlyle had inadvertently tapped into more than he knew.


It must be emphasized once again that chunking serial patterns (or “templates”) is a profoundly different experience than utilizing frames or scripts or schemata. Frederic Bartlett’s signature experiment was based on his careful and deliberate selection of a bizarre little story, “The War of the Ghosts”. Shrewdly counting on the forgettability of that weirdness is the primary reason why Bartlett was able to observe such a wealth of contrived rationalizations. To be sure, his conclusions (when observed rightly) remain unassailable. Readers’ minds tend to replace unfamiliar details with other material that aligns better with personal expectations or conforms more with general & conventional experiences. We tend to remember creatively when story content is lost, and we often fail to preserve data for which we have no prior context.

By the same token, however, our ability to reproduce stories mnemonically should increase when material happens to fit well with familiar contexts. Indeed, Bartlett’s own summary of conclusions from that signature study includes statements like “Detail is outstanding when it fits in with a subject’s pre-formed interests and tendencies.” and “the reduction of material to a form that can be readily and ‘satisfyingly’ dealt with is very prominent. [This process] gives the whole dealt with that specific ground, frame, or setting, without which it will not be persistently remembered” (Remembering, p.92). In other words, one could technically argue that chunking serial patterns is very much like schema theory in their core concepts and functions. However, I must insist we continue to emphasize their distinctions, and for two particular reasons.

First, the connotations of scholarly discourse about schema theory have overwhelmingly focused on what happens when information is lost from memory, rather than what happens when information is preserved (Ost & Costall, 2002). People think of schema theory as imposing familiar contexts that don’t necessarily apply. That’s not remotely what happens with chunking and specific domain expertise. Second, schema theory typically speaks of stereotypes and generalities. Schemas are gist memory contexts that may or may not include details. Likewise, “cognitive scripts” are far more generalized as sequential experiences than the types of biographical sequences our minds tend to chunk as serial patterns. Scripts and schemata do not offer the kind of variation we find with expert chunking (with or without “template theory”).

In sum, Bartlett was right to teach us that unfamiliar content is usually replaced, but one must therefore also suppose familiar content is often retained. That is, especially if by “familiar” we mean the levels of expertise illustrated in this post, above.

The first point to recap is that sequential “chunks” of story content can be mnemonically unified. A familiar serial pattern enables multiple pieces of information to combine as one single unit of memory. In other words, defying Aristotle’s opinion, we may justifiably declare that biographical patterns possess a degree of Unity. Although less comprehensive than the unity of a classical plot, Life Stories can and do enter audience memory with a sense of wholeness. This is a cognitive phenomenon, occurring when readers recognize a familiar serial pattern in material that reflects biographical temporality, enabling that pattern to be more easily remembered altogether - a sequential Unit which incorporates many parts in one particular whole.

The most critical problem this post has tried to address is that all this requires decades of learning. This kind of remembering requires not common knowledge, but focused expertise, which is where Thomas Carlyle’s intuition comes back into play. We have all been obsessively focused on learning about other humans’ experiences.

In principle, a biographical storyline can be individually arbitrary and yet resemble a temporal pattern that is well known to some readers. As often as this happens, the “arbitrary” storyline can be remembered coherently - at least by those particular readers. We must note once again how none of this reflects any sense of causality. As we’ve seen in posts #2 and #3, the reader’s familiarity can be purely statistical. If some generic “event sequence” is easily recognizable as a frequent set of consecutive outcomes, then any storyline evoking that sequence is more efficiently rememberable. That is to say, patterns increase coherence.

Because our Biographical Expertise enables us to recognize many diverse patterns as individual units, we have an increased ability to maintain coherence when Remembering Life Stories.


In my next post, we’ll reassess biographical patterns from a statistical point of view. If we redefine common patterns as collections of frequencies, we can re-examine the coherence of biographical sequences in more informational terms, as statistical regularities.

It has been said that Plots (in hindsight) are always predictable, Biographies are arbitrary, and Chronicles are random. But my next post will declare these are not three distinct categories. They are portions of a continuum. The truth is, Plots, Biographies, and Chronicles all present various degrees of coherence, but that coherence is not literary. It is mnemonic. It is, more precisely, mnemonically reconstructive. Whether aided by causality or probability, some discoursed stories are more coherent (i.e., more reconstructable) than others.

Altogether, this suggests that mnemonic encoding of stories (storylines) can be hypothetically measurable, according to (what I’m going to call) Narrative Redundancy.



Addendum: on Biographical Patterns and Historical Theory

We might here briefly revisit the conflict between Louis Mink and David Carr. With Mink, I say the non-fiction stories we reconstruct in our minds are not accurate accounts of the past, but against Mink, I say that life experiences are cognitively compressed into a storyline by the natural process of autobiographical memory. Against Carr, I say that human experience itself does not possess an inherent narrative structure, but with sympathy for Carr (whose arguments could be modified without too much trouble into cognitive terms) I say that our minds naturally create “chunks” of meaning and structure from whatever information gets past the “filtering and boosting” function of our “attentional system” (Daniel Bor, The Ravenous Brain, p.126-7).

In other words, when I say that “our minds have long ago chunked patterns as sequentially ordered groupings of mnemonic content”, I do not mean to suggest that these patterns are necessarily accurate models of any human experience in particular, or even in general. What I am saying is that these patterns are accurate representations of the way that similar samples of lived experience (first, second or third hand experience) have been processed over time by our cognitive faculties. The familiar sequence of “high school, college, entry-level career opening” is not strictly an accurate account of anyone’s life. Technically, that sequence admits an astounding level of distortion and commits countless sins of omission. We cannot stress strongly enough that Louis O. Mink was absolutely correct on this point.

On the other hand, we do not contradict Mink if we say the familiar sequence of “high school, college, entry-level career opening” is an accurate account of a common distortion, a mnemonic compression which occurs frequently in human remembering. In short, while Carr remains absolutely mistaken to insist “human experience” has an inherent narrative structure, he would not have been incorrect to insist that life often seems that way in retrospect.

Carr was incorrect because he tried to identify stories with reality. It might have helped him to realize how significantly his own view of reality had been distorted by personal remembering.

Some biographical sequences become familiar precisely because the activities they distort and the experiences they compress display a common pattern in long term human behavior, which is thus a common source of our resulting memories. These sequences thus do represent the past “accurately”, albeit within some degree of distortion. Human activity does not have an inherent narrative structure, but our cognitive nature - the consciousness which must observe things in linear order and the encoded content that implies temporality in constructive remembering - is consistently at work making experience seem like a narrative in retrospect.

Lived experience is indeed very much unlike a story…
Except as it seems afterwards, on the inside of our minds.


April 8, 2016

Remembering Life Stories (4): Familiar Serial Patterns & Cognitive Chunking

This series has been trying to answer one question: Why is it so easy to remember life stories despite storylines which arbitrarily narrate “one thing after another”? For answers, we need to bring cognitive theory on constructive remembering into conversation with narrative theory about “story content” and “story structure”. How does a discoursed sequence find coherence in an audience memory?

Previously: Biographies lack a Plot, but their unique narrative dynamics provide other mnemonic efficiencies; see Introduction (post #1). The first is self-sequencing content, a.k.a. Temporal Content (post #2), which helps the remembering mind construct Biographical Temporality (post #3) by recognizing aspects of “necessary causality”, statistical probability, and familiar “time patterns”. The latter term is from William Friedman, whose psychological research considered these patterns as general knowledge, the context which gives content a sequence.

Today: We move beyond self-sequencing content and consider self-contained sequences. Having seen one level of efficiency when individual pieces of content imply their place in a larger contextual pattern, we will now examine a greater efficiency, which occurs when any pre-structured pattern of information is remembered as if it were a single unit of content. In cognitive terms, one trace memory can do more than seek its place in a larger framework. As it turns out, a single trace memory may comprise many individual pieces of content which are already in sequence. To explain this, today’s post will introduce the cognitive science of “chunking”.


Today’s post is about Familiar Serial Patterns. Before we get into technical details about Cognitive Psychology, let me first illustrate the basic idea in plain terms.

Familiar patterns are all around, from basic facial features and the human form to highway sign markings and the layout of most retail stores. With plenty of room for variation, the recognition (and construction) of basic patterns is a major part of how our minds cope with the daily bombardment of unceasing informational input. Repetition and redundancy increase recall and remembering, and so patterns just naturally (by definition!) become easy things to remember. Beyond the concrete patterns visible in the world, we also find (and form) patterns with abstract information, like letters, numbers, names, and dates. And of course we have already discussed temporal patterns, like days of the week, lunar cycles, seasons, and the Gregorian calendar. All types of patterns aid human remembering, but this last category brings up an important distinction.

Familiar SERIAL patterns involve any recurring sequence of objects, or symbols, or identifiable outcomes. Most patterns are not serial patterns. Faces, chairs, stop signs, and ranch style houses are non-serial patterns because they do not involve linear order. The most familiar serial patterns in our lives are common daily routines, like tying your shoelaces, observing the cycle of traffic lights, normal procedures at work, or scrolling through the menu on a frequently visited website. Other serial patterns include memorized cooking recipes, musical compositions, song lyrics, rehearsed speeches, and any famous phrase or catch-phrase. In a more abstract direction, mathematics features an endless variety of serial patterns, like primes, odds, evens, and the multiples of any given number. Numbers can also be used to make non-ordinal serial patterns. We also used to memorize area codes and local dialing prefixes, tv channels, and zip codes. Today, you probably use an assigned Logon ID or procedure codes at work. Basically, anything repeatedly seen or done in the same particular sequence is a familiar serial pattern, or eventually becomes one in your mind.

Perhaps the largest and most varied category of familiar serial patterns is written language. Of course, the alphabet itself is a serial pattern and so is every word in the dictionary. In fact, at the present word count, this blog post has presented you with over 500 familiar serial patterns. Amazingly, each single word struck you not as a collection of individual letters, but as a whole. Each word is a unified collection in your mind. Even more amazingly, if I were reading this out loud, you could probably spell out every collection of letters. The word “amazingly” strikes you as a single unit of information, and yet your mind can break it down into nine distinct pieces of data. This everyday common experience is a textbook example of information compression.

School teachers use this same mnemonic dynamic to invent helpful acronyms. Kids who memorize “PEMDAS” and “Roy G. Biv” find they have learned six or seven times as much information, and that in serial order no less! By the same principle, “My Very Excellent Mother Just Served Us Nine Pizzas” becomes a stock phrase, memorized as a single item in the brain, which encodes the serial order of nine planets by distance, from Mercury to Pluto. (Long live Pluto!) Just like the word “amazingly”, once those nine words become a single unit in the mind, that single unit becomes an easy way to compress nine times as much data, which, again, includes serial order! (If you don’t think it’s difficult to memorize random data in sequence, test yourself sometime. Try memorizing phone numbers. But then, the key here is “random”.)

You experience this with time patterns as well. If we invoke “the academic calendar”, you can begin rattling off the basic sequence of annual events, which is filled with familiar event-chains like mid-August registration, Fall mid-terms after Columbus Day and winter break after exams. Sitting down with pencil and paper, you could probably reconstruct most every important detail with its approximate annual placement on the calendar. Psychologically, some of that reconstruction might come together by associative remembering, but a great deal of the overall framework is, in fact, what your mind recognizes as (the extensive definition of) “the academic calendar”.

In this example, we illustrate that it is possible to evoke an elaborate sequence of events with a single mnemonic encoding. To really accomplish this, however, requires a high level of familiarity with “the academic calendar”. The cognitive dynamic is precisely the same, but to remember that much content as a single “chunk” of information, you have to become an “expert” on the material. In fact, that’s the point of today’s post. The science of “Chunking” becomes more powerful and more broadly effective whenever a subject possesses extreme “Expertise”.

Now, before we apply this dynamic to Remembering Life Stories, let’s explore in more technical detail what cognitive psychology really says about “Chunking” and “Expertise”. If you don’t care about the scientific details, skip the next section and we’ll get back to talking about Biographies. If you care very much, a general bibliography is appended at bottom.



To explain “chunking”, psychologists typically begin with the work of George A. Miller, whose famous 1956 paper, “The Magical Number Seven, Plus or Minus Two”, attempted to quantify the amount of new content his laboratory subjects could retain in their short-term memory (a.k.a. “stm”, and today more often called “working memory”). When Miller’s subjects organized information into units, a handful of three digit numbers proved as easy to memorize as a handful of one digit numbers, and Miller found this perplexing because the longer string obviously preserved triple the payload. That is, by memorizing five three digit numbers, instead of five one digit numbers, you could actually memorize fifteen digits. The limitations of short term memory were consistent, but stm’s effectiveness could be expanded if the target information was compressed.
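For readers who like to see the arithmetic, here is a minimal Python sketch of that grouping effect. The function name `chunk_digits` and the fifteen-digit sample string are my own illustrative choices, not anything taken from Miller’s paper.

```python
def chunk_digits(digits, size):
    """Split a string of digits into consecutive groups of `size` digits."""
    return [digits[i:i + size] for i in range(0, len(digits), size)]

# Hypothetical fifteen-digit string, chosen only for illustration.
digits = "314159265358979"

singles = chunk_digits(digits[:5], 1)  # five one-digit units
triples = chunk_digits(digits, 3)      # five three-digit units

# Same number of units to hold in mind, three times as many digits retained.
print(len(singles), sum(len(unit) for unit in singles))  # 5 5
print(len(triples), sum(len(unit) for unit in triples))  # 5 15
```

In both cases the list holds five units, which is the point: the count of units stays constant while the payload triples.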

Miller thus began referring to “chunks” (single units of data) and “bits” (the amount of information conveyed by each unit). Ten numbers were ten “chunks”, no matter how many digits they held, but a three digit number contained more “bits” of raw information. To justify this innovation, Miller invoked Claude Shannon’s recent paper “A Mathematical Theory of Communication” (1948), in which Shannon described the benefits of distinguishing between data and information. Despite the benefit of standing on Shannon’s shoulders, Miller could see that quantifying human memory was a different kettle of fish. Unsurprisingly, few psychologists followed Miller and Shannon in measuring informational “bits”, which requires complex algebraic equations and reams of statistical data, but the concept of chunking remains prominent in psychological research.
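To make the chunk/bit distinction a little more concrete, here is a rough Python sketch. It assumes, purely for illustration, that every possible number is equally likely, which is an idealization of my own and not a claim about Miller’s actual data. The helper name `bits_per_unit` is likewise hypothetical.

```python
import math

def bits_per_unit(num_digits):
    """Bits carried by one decimal number of `num_digits` digits,
    assuming every possible value is equally likely (an idealization)."""
    return math.log2(10 ** num_digits)

one_digit = bits_per_unit(1)
three_digit = bits_per_unit(3)

print(round(one_digit, 2))    # 3.32
print(round(three_digit, 2))  # 9.97
```

Under that assumption, each unit is still one “chunk”, but the three digit chunk carries three times the “bits”.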

In practice, the term “chunk” soon came to reference only those items of memory with increased informational value. Thus, in the parlance of cognitive science, the string “1, 2, 3, 4, 5, 6, 7, 8, 9, 10” would be referred to as ten “items” but “1 - 10” could be called a “chunk”, essentially a shorter label denoting a long string of data. In effect, today, a chunk is a grouping or a category. In discussion, therefore, it is possible to compare larger or smaller “chunks”. The small chunk just mentioned denotes only ten items but “odd numbers” is a large chunk denoting an infinite string. This also demonstrates, depending on context and observer, that the recall of one item can be more informative than the recall of five items. Such is the nature of information - or perhaps more accurately - such is the nature of human remembering. Yes, this may go that deep, or perhaps even deeper. According to Daniel Bor, cognitive chunking may provide the best way to understand human consciousness itself! Obviously, the research has progressed far beyond measuring a handful of three digit numbers.
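Programmers may recognize the same idea in how a short label can stand in for a long sequence. The following is only a loose analogy in Python, with variable names of my own choosing: a `range` plays the part of “1 - 10”, and an endless counter plays the part of “odd numbers”.

```python
from itertools import count, islice

one_to_ten = range(1, 11)   # a short label standing in for "1 - 10"
items = list(one_to_ten)    # unpacks into ten separate items

odd_numbers = count(1, 2)   # a label for a sequence with no end
first_five_odds = list(islice(odd_numbers, 5))

print(len(items))        # 10
print(first_five_odds)   # [1, 3, 5, 7, 9]
```

The small label and the large one cost about the same to write down, even though one denotes ten items and the other an infinite string.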

For our purposes today, the phenomenon of chunking explains how our mind can remember familiar serial patterns as single units of information. Chunking provides mnemonic efficiency.

Consider the acronym “IRS”, a single piece of data which implies (embeds, condenses, compresses, encodes into memory) a string of three words, “Internal Revenue Service”, which expands again to be 22 characters; 24, if you’re typing. Those characters altogether comprise, by definition, an arbitrary string. Yet, most readers of English, and Americans especially, will find this string to seem entirely non-random. For non-English speakers, memorizing those 24 characters would be very difficult, and virtually impossible in sequence, but for educated Americans, the acronym “IRS” evokes that string of 24 items on an unconscious level. The lengthy and arbitrary string can be reconstructed instantly when the mind recognizes “IRS” and “Internal Revenue Service” as familiar serial patterns.
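As a toy illustration of that last paragraph, the Python sketch below treats each reader’s prior knowledge as a small lookup table. The two dictionaries and the `unpack` function are hypothetical stand-ins of my own, and a lookup table is of course far simpler than whatever a real mind does.

```python
# Hypothetical stand-ins for what two different readers already know.
AMERICAN_READER = {"IRS": "Internal Revenue Service"}
UNFAMILIAR_READER = {}

def unpack(label, known_chunks):
    """Return what a label evokes for a reader; an unknown label evokes only itself."""
    return known_chunks.get(label, label)

expanded = unpack("IRS", AMERICAN_READER)
letter_count = sum(ch.isalpha() for ch in expanded)

print(len("IRS"), letter_count, len(expanded))  # 3 22 24
print(unpack("IRS", UNFAMILIAR_READER))         # IRS
```

The same three letters unpack to 24 characters for one reader and to nothing more than themselves for the other, which is why familiarity matters so much here.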

Most literature on “chunking” has been written about knowledge and learning, but I dare say we can re-state things in terms of constructive remembering. If we consider “IRS” as a single trace memory then “Internal Revenue Service” would be a reconstruction - and yet not an assembling of associated bits but rather an unfolding of the embedded content and its intrinsic context. Somehow, all of this information is implicitly encoded within a mind that understands the meaning of the acronym “IRS”. Recalling that trace somehow initiates a constructive remembering process that - literally - decodes the acronym. Beyond that, I'm not sure precisely what else to say. (However, see my addendum at bottom for an attempt to think more precisely about how “de-chunking” actually qualifies as constructive remembering.)

Getting back to the main point, we are talking about mnemonic efficiency - in particular, for reconstructing a sequence. A familiar serial pattern is something you’ve observed repeatedly over time. It’s not a sequence that you must now begin to rehearse. It’s a sequence which has become familiar, already. A common sequence of things has become general knowledge through the natural redundancy of experiencing repetition. And when you embed that general knowledge in a single self-contained concept, a large volume of sequential data can be reconstructed with ease.

Now recall Friedman’s idea of time patterns. The “days of the week” is one idea that labels a sequence of seven repeating events. “The Seasons” reminds you of a pattern with four periods in sequence. If I simply say “draw a calendar”, you can begin writing out a detailed list of 365 particular code names in precise order, all with slight variations, and you can even give it a unique structure if you begin working from today’s date. Through years of repetition and redundancy, you learned that May follows April, that Monday precedes Tuesday, and how the advent of each season always makes you anticipate its own corresponding holidays and other annual events. That massive trove of sequenced information is self-contained within your grasp of “calendar”.

Recalling a familiar serial pattern can spark a remembering of the whole pattern, including each and every one of its parts. In precisely this way, the Temporal Patterns we looked at in posts 2 and 3 can each be recognized as singular cognitive chunks, which convey serial order and a unified wholeness. This goes beyond merely extrapolating the implications of a single trace memory (e.g., recalling one bit of life story content after reading a biography) in order to remember what probably came next, or necessarily came earlier, or what general time pattern the trace content might properly evoke. The prospect of chunking can offer a greater advantage in reconstructing the larger patterns of Biographical Temporality (assuming as always that temporal content is preserved rather than lost). 

Clearly, this brings us right back to Narrative Theory. If coherence or unity depends on the remembering of a whole, the Unity of chunking with Familiar Serial Patterns inarguably offers a relative degree of such wholeness. One may not be able to unify an entire story, as would an Aristotelian Plot, but one can unify large portions of a Life Story by remembering elaborate patterns as biographical chunks. (See post #3 for several examples.) Evoking multiple points of story structure with a single “chunk” of story content is more efficient than recalling multiple points individually, even when they effectively sequence themselves.

When the arbitrary sequence happens to be familiar, our minds can bring coherence to larger volumes of data. But there is still a problem. How many “Life Story” patterns can a single mind possibly “chunk”? How can it be possible to remember a dozen or two dozen biographical storylines, each distinctly unique, merely by recourse to the same knowledge of general patterns? For the answers to these vital questions, we must advance to the next subject in this survey of cognitive psychology: expertise.

As George A. Miller observed in 1956, “a great deal of learning has gone into the formation of these familiar units”.  And as Thomas Carlyle observed way back in 1832, we have all spent our lifetimes obsessing about other people, and consequently, to some degree we have all become experts at observing diverse variations in basic patterns of human growth and development.

To be continued...


Addendum on the nature of “un-chunking” as constructive remembering:

With chunking (e.g., words and acronyms), our minds do not learn to associate multiple pieces of information, nor do they build a neural network (per se) of related memory traces. Rather, a collection of information becomes integrated as one new data point (engram, memory trace, piece of mnemonic content). But does this “integration” get recalled as a whole? My feeling is that it must be seen as another instance of constructive remembering.

However, at the moment I do not know whether to say this is like Bartlett’s model (filling in gaps of lost content, according to schemas, e.g., patterns of spelling) or like my own recent model of self-sequencing content based on Friedman’s work (connecting the dots of preserved content, according to implicit context, e.g., each bit connects to its place in the whole chunk). Perhaps this is another unnoticed type of constructive remembering, which does not involve connecting dots or filling gaps. If we consider the way the mind works with “IRS” the information seems to “un-chunk”, or as Shannon might say, “decompress”. We do not retrieve a pristine file from storage which spits out the right 24 characters in order. Perhaps we construct a mental image of the word, visually. Perhaps we mentally sound it out afresh. I do not know how this occurs, but I dare say there is clearly a process.

Now, for inexperienced spellers, and for unfamiliar words, perhaps this process is just like schema theory, as the mind searches for the right pieces of content to fill in the missing part of a word. In these cases, a mind works to assemble associated trace memories. However - and this is today’s key idea - when the expert’s mind is extremely familiar with a chunk, it somehow recognizes every piece in the unified chunk. In such cases, it seems the mind unpacks something that was previously packed, as if some “engrams” can be decoded as a precise inversion of the way they were encoded. At least, with a chunk like “IRS”, we seem to “reconstruct” or “decompress” these 24 characters by inverting whatever cognitive process allowed us to embed the information as one “chunk” in the first place.

What seems to matter the most is familiarity. A small child may initially struggle to decompress the serialized chunk “c-a-t” but she will hopefully move on to the challenge of reconstructing sequences like “h-e-m-i-s-p-h-e-r-e”, “v-e-r-t-e-b-r-a-t-e”, and “g-e-r-a-n-i-u-m”. With each new and unfamiliar word, learning ensues. The chunk, itself, is constructed via a process. It takes time. The details that will later re-sequence themselves so automatically are not embedded as quickly. However long it may take, a serial pattern is successfully chunked when its parts are re-encoded as one whole, creating a brand new engram (trace memory), which essentially compresses a large amount of contextual information. Now, in all that, by definition, the chunk itself has become a familiar item. It has been, we tend to say, “memorized”.

By the way, this is where the subjective aspect of compression (in information theory, more properly) may become less perplexing. In ancient Rome, “SPQR” was a familiar serial pattern, a single chunk compressing 26 characters: “Senatus_Populusque_Romanus”. Today, “SPQR” is meaningless to most observers, a single chunk denoting only itself. Thus, the science of Information seems beholden to systems of memory. It’s not subjective at all. It’s cognitive. There is no “information” in the universe anywhere. DNA is structured by (and RNA transmits) complex genetic sequences that strike us as “information”, and they can certainly be analyzed most fruitfully as such, but isn’t this merely a biochemical phenomenon? Anyway, now I’m definitely in the deep waters, so let’s end this excursus right here.

Anon, then...
