April 8, 2016

Remembering Life Stories (4): Familiar Serial Patterns & Cognitive Chunking

This series has been trying to answer one question: Why is it so easy to remember life stories despite storylines which arbitrarily narrate “one thing after another”? For answers, we need to bring cognitive theory on constructive remembering into conversation with narratative theory about “story content” and “story structure”. How does a discoursed sequence find coherence in an audience memory?

Previously: Biographies lack a Plot, but their unique narrative dynamics provide other mnemonic efficiencies; see Introduction (post #1). The first is self-sequencing content, a.k.a. Temporal Content (post #2), which helps the remembering mind construct Biographical Temporality (post #3) by recognizing aspects of “necessary causality”, statistical probability, and familiar “time patterns”. The latter term is from William Friedman, whose psychological research considered these patterns as general knowledge, the context which gives content a sequence.

Today: We move beyond self-sequencing content and consider self-contained sequences. Having seen one level of efficiency when individual pieces of content imply their place in a larger contextual pattern, we will now examine a greater efficiency, which occurs when any pre-structured pattern of information is remembered as if it were a single unit of content. In cognitive terms, one trace memory can do more than seek its place in a larger framework. As it turns out, a single trace memory may comprise many individual pieces of content which are already in sequence. To explain this, today’s post will introduce the cognitive science of “chunking”.

 ~~~~~~~~~~~~~~~

Today’s post is about Familiar Serial Patterns. Before we get into technical details about Cognitive Psychology, let me first illustrate the basic idea in plain terms.

Familiar patterns are all around, from basic facial features and the human form to highway sign markings and the layout of most retail stores. With plenty of room for variation, the recognition (and construction) of basic patterns is a major part of how our minds cope with the daily bombardment of unceasing informational input. Repetition and redundancy increase recall and remembering, and so patterns just naturally (by definition!) become easy things to remember. Beyond the concrete patterns visible in the world, we also find (and form) patterns with abstract information, like letters, numbers, names, and dates. And of course we have already discussed temporal patterns, like days of the week, lunar cycles, seasons, and the Gregorian calendar. All types of patterns aid human remembering, but this last category brings up an important distinction.

Familiar SERIAL patterns involve any recurring sequence of objects, or symbols, or identifiable outcomes. Most patterns are not serial patterns. Faces, chairs, stop signs, and ranch style houses are non-serial patterns because they do not involve linear order. The most familiar serial patterns in our lives are common daily routines, like tying your shoelaces, observing the cycle of traffic lights, normal procedures at work, or scrolling through the menu on a frequently visited website. Other serial patterns include memorized cooking recipes, musical compositions, song lyrics, rehearsed speeches, and any famous phrase or catch-phrase. In a more abstract direction, mathematics features an endless variety of serial patterns, like primes, odds, evens, and the multiples of any given number. Numbers can also be used to make non-ordinal serial patterns. We also used to memorize area codes and local dialing prefixes, tv channels, and zip codes. Today, you probably use an assigned Logon ID or procedure codes at work. Basically, anything repeatedly seen or done in the same particular sequence is a familiar serial pattern, or eventually becomes one in your mind.

Perhaps the largest and most varied category of familiar serial patterns is written language. Of course, the alphabet itself is a serial pattern and so is every word in the dictionary. In fact, at the present word count, this blog post has presented you with over 500 familiar serial patterns. Amazingly, each single word struck you not as a collection of individual letters, but as a whole. Each word is a unified collection in your mind. Even more amazingly, if I were reading this out loud, you could probably spell out every collection of letters. The word “amazingly” strikes you as a single unit of information, and yet your mind can break it down into nine distinct pieces of data. This everyday common experience is a textbook example of information compression.

School teachers use this same mnemonic dynamic to invent helpful acronyms. Kids who memorize “PEMDAS” and “Roy G. Biv” find they have learned six or seven times as much information, and that in serial order no less! By the same principle, “My Very Excellent Mother Just Served Us Nine Pizzas” becomes a stock phrase, memorized as a single item in the brain, which encodes the serial order of nine planets by distance, from Mercury to Pluto. (Long live Pluto!) Just like the word “amazingly”, once those nine words become a single unit in the mind, that single unit becomes an easy way to compress nine times as much data, which, again, includes serial order! (If you don’t think it’s difficult to memorize random data in sequence, test yourself sometime. Try memorizing phone numbers. But then, the key here is “random”.)
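For readers who like to see the mechanics, here is a minimal sketch in Python of how the planet mnemonic carries serial order. It is only an illustration: the variable names are my own, and the code simply checks that each word's first letter cues the planet in the same position.

```python
mnemonic = "My Very Excellent Mother Just Served Us Nine Pizzas"
planets = ["Mercury", "Venus", "Earth", "Mars", "Jupiter",
           "Saturn", "Uranus", "Neptune", "Pluto"]

# Pair each word of the phrase with the planet in the same position.
pairs = list(zip(mnemonic.split(), planets))

# The first letter of each word should cue its planet's first letter.
initials_match = all(word[0] == planet[0] for word, planet in pairs)

print(len(pairs))
print(initials_match)
```

Notice that “M” appears twice (Mercury and Mars). The letter alone is ambiguous; it is the position within the familiar phrase that settles which planet is meant.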

You experience this with time patterns as well. If we invoke “the academic calendar”, you can begin rattling off the basic sequence of annual events, which is filled with familiar event-chains like mid-August registration, Fall mid-terms after Columbus Day, and winter break after exams. Sitting down with pencil and paper, you could probably reconstruct most every important detail with its approximate annual placement on the calendar. Psychologically, some of that reconstruction might come together by associative remembering, but a great deal of the overall framework is, in fact, what your mind recognizes as (the extensive definition of) “the academic calendar”.

In this example, we illustrate that it is possible to evoke an elaborate sequence of events with a single mnemonic encoding. To really accomplish this, however, requires a high level of familiarity with “the academic calendar”. The cognitive dynamic is precisely the same, but to remember that much content as a single “chunk” of information, you have to become an “expert” on the material. In fact, that’s the point of today’s post. The science of “Chunking” becomes more powerful and more broadly effective whenever a subject possesses extreme “Expertise”.

Now, before we apply this dynamic to Remembering Life Stories, let’s explore in more technical detail what cognitive psychology really says about “Chunking” and “Expertise”. If you don’t care about the scientific details, skip the next section and we’ll get back to talking about Biographies. If you care very much, a general bibliography is appended at bottom.

 ~~~~~~~~~~~~~~~

Chunking

To explain “chunking”, psychologists typically begin with the work of George A. Miller, whose famous 1956 paper, “The Magical Number Seven”, attempted to quantify the amount of new content his laboratory subjects could retain in their short-term memory (a.k.a. “stm”, and today more often called “working memory”). When Miller’s subjects organized information into units, a handful of three-digit numbers proved as easy to memorize as a handful of one-digit numbers, and Miller found this perplexing because the longer string obviously preserved triple the payload. That is, by memorizing five three-digit numbers, instead of five one-digit numbers, you could actually memorize fifteen digits. The limitations of short-term memory were consistent, but stm’s effectiveness could be expanded if the target information was compressed.

Miller thus began referring to “chunks” (single units of data) and “bits” (the amount of information conveyed by each unit). Ten numbers were ten “chunks”, no matter how many digits they held, but a three-digit number contained more “bits” of raw information. To justify this innovation, Miller invoked Claude Shannon’s recent paper “A Mathematical Theory of Communication” (1948), where Shannon described the benefits of distinguishing between data and information. Despite the benefit of standing on Shannon’s shoulders, Miller could see that quantifying human memory was a different kettle of fish. Unsurprisingly, few psychologists followed Miller and Shannon in measuring informational “bits”, which requires complex algebraic equations and reams of statistical data, but the concept of chunking remains prominent in psychological research.
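To make the chunk/bit distinction concrete, here is a rough sketch in Python. It assumes, purely for illustration, that every decimal digit is equally likely, so that a number of n digits carries n × log2(10) bits. The function names are hypothetical and nothing here is taken from Miller’s paper.

```python
import math

def bits_per_number(digits: int) -> float:
    """Bits carried by one number of the given length, assuming each
    decimal digit is equally likely (an illustrative assumption)."""
    return digits * math.log2(10)

def payload(chunks: int, digits_per_chunk: int) -> tuple[int, float]:
    """Return (total digits, total bits) for a list of same-length numbers."""
    total_digits = chunks * digits_per_chunk
    total_bits = chunks * bits_per_number(digits_per_chunk)
    return total_digits, total_bits

one_digit = payload(5, 1)    # five one-digit numbers: five chunks
three_digit = payload(5, 3)  # five three-digit numbers: still five chunks

print(one_digit)
print(three_digit)
```

Both lists contain five chunks, but under this assumption the second list holds fifteen digits and three times as many bits as the first.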

In practice, the term “chunk” soon came to reference only those items of memory with increased informational value. Thus, in the parlance of cognitive science, the string “1, 2, 3, 4, 5, 6, 7, 8, 9, 10” would be referred to as ten “items” but “1 - 10” could be called a “chunk”, essentially a shorter label denoting a long string of data. In effect, today, a chunk is a grouping or a category. In discussion, therefore, it is possible to compare larger or smaller “chunks”. The small chunk just mentioned denotes only ten items but “odd numbers” is a large chunk denoting an infinite string. This also demonstrates, depending on context and observer, that the recall of one item can be more informative than the recall of five items. Such is the nature of information - or perhaps more accurately - such is the nature of human remembering. Yes, this may go that deep, or perhaps even deeper. According to Daniel Bor, cognitive chunking may provide the best way to understand human consciousness itself! Obviously, the research has progressed far beyond measuring a handful of three digit numbers.

For our purposes today, the phenomenon of chunking explains how our mind can remember familiar serial patterns as single units of information. Chunking provides mnemonic efficiency.

Consider the acronym “IRS”, a single piece of data which implies (embeds, condenses, compresses, encodes into memory) a string of three words, “Internal Revenue Service”, which expands again to be 22 characters; 24, if you’re typing. Those characters altogether comprise, by definition, an arbitrary string. Yet, most readers of English, and Americans especially, will find this string to seem entirely non-random. For non-English speakers, memorizing those 24 characters would be very difficult, and virtually impossible in sequence, but for educated Americans, the acronym “IRS” evokes that string of 24 items on an unconscious level. The lengthy and arbitrary string can be reconstructed instantly when the mind recognizes “IRS” and “Internal Revenue Service” as familiar serial patterns.
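The same point can be sketched as a simple lookup: the label is short, and the expansion lives in the reader’s memory, not in the label. In the Python below, the two dictionaries standing in for what different readers know, and the `unchunk` function, are hypothetical names invented for this illustration. They are not a model of how the mind actually stores anything.

```python
# Hypothetical stand-ins for what two different readers already know.
american_reader = {"IRS": "Internal Revenue Service"}
unfamiliar_reader = {}

def unchunk(label: str, memory: dict[str, str]) -> str:
    """Expand a label if the reader knows it; otherwise the label
    denotes only itself."""
    return memory.get(label, label)

expanded = unchunk("IRS", american_reader)
letters_only = expanded.replace(" ", "")

print(len(letters_only))                  # letters only
print(len(expanded))                      # letters plus the two spaces
print(unchunk("IRS", unfamiliar_reader))  # no expansion available
```

For the first reader, three characters unfold into 22 letters (24 characters with spaces). For the second reader, the same three characters unfold into nothing more than themselves.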

Most literature on “chunking” has been written about knowledge and learning, but I dare say we can re-state things in terms of constructive remembering. If we consider “IRS” as a single trace memory then “Internal Revenue Service” would be a reconstruction - and yet not an assembling of associated bits but rather an unfolding of the embedded content and its intrinsic context. Somehow, all of this information is implicitly encoded within a mind that understands the meaning of the acronym “IRS”. Recalling that trace somehow initiates a constructive remembering process that - literally - decodes the acronym. Beyond that, I'm not sure precisely what else to say. (However, see my addendum at bottom for an attempt to think more precisely about how “de-chunking” actually qualifies as constructive remembering.)

Getting back to the main point, we are talking about mnemonic efficiency - in particular, for reconstructing a sequence. A familiar serial pattern is something you’ve observed repeatedly over time. It’s not a sequence that you must now begin to rehearse. It’s a sequence which has become familiar, already. A common sequence of things has become general knowledge through the natural redundancy of experiencing repetition. And when you embed that general knowledge in a single self-contained concept, a large volume of sequential data can be reconstructed with ease.

Now recall Friedman’s idea of time patterns. The “days of the week” is one idea that labels a sequence of seven repeating events. “The Seasons” reminds you of a pattern with four periods in sequence. If I simply say “draw a calendar”, you can begin writing out a detailed list of 365 particular code names in precise order, all with slight variations, and you can even give it a unique structure if you begin working from today’s date. Through years of repetition and redundancy, you learned that May follows April, that Monday precedes Tuesday, and how the advent of each season always makes you anticipate its own corresponding holidays and other annual events. That massive trove of sequenced information is self-contained within your grasp of “calendar”.

Recalling a familiar serial pattern can spark a remembering of the whole pattern, including each and every one of its parts. In precisely this way, the Temporal Patterns we looked at in posts 2 and 3 can each be recognized as singular cognitive chunks, which convey serial order and a unified wholeness. This goes beyond merely extrapolating the implications of a single trace memory (e.g., recalling one bit of life story content after reading a biography) in order to remember what probably came next, or necessarily came earlier, or what general time pattern the trace content might properly evoke. The prospect of chunking can offer a greater advantage in reconstructing the larger patterns of Biographical Temporality (assuming as always that temporal content is preserved rather than lost). 

Clearly, this brings us right back to Narrative Theory. If coherence or unity depends on the remembering of a whole, the Unity of chunking with Familiar Serial Patterns inarguably offers a relative degree of such wholeness. One may not be able to unify an entire story, as would an Aristotelian Plot, but one can unify large portions of a Life Story by remembering elaborate patterns as biographical chunks. (See post #3 for several examples.) Evoking multiple points of story structure with a single “chunk” of story content is more efficient than recalling multiple points individually, even when they effectively sequence themselves.

When the arbitrary sequence happens to be familiar, our minds can bring coherence to larger volumes of data. But there is still a problem. How many “Life Story” patterns can a single mind possibly “chunk”? How can it be possible to remember a dozen or two dozen biographical storylines, each distinctly unique, merely by recourse to the same knowledge of general patterns? For the answers to these vital questions, we must advance to the next subject in this survey of cognitive psychology: expertise.

As George A. Miller observed in 1956, “a great deal of learning has gone into the formation of these familiar units”.  And as Thomas Carlyle observed way back in 1832, we have all spent our lifetimes obsessing about other people, and consequently, to some degree we have all become experts at observing diverse variations in basic patterns of human growth and development.

To be continued...


xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx


Addendum on the nature of “un-chunking” as constructive remembering:

With chunking (e.g., words and acronyms), our minds do not learn to associate multiple pieces of information, nor do they build a neural network (per se) of related memory traces. Rather, a collection of information becomes integrated as one new data point (engram, memory trace, piece of mnemonic content). But does this “integration” get recalled as a whole? My feeling is that it must be seen as another instance of constructive remembering.

However, at the moment I do not know whether to say this is like Bartlett’s model (filling in gaps of lost content, according to schemas, e.g., patterns of spelling) or like my own recent model of self-sequencing content based on Friedman’s work (connecting the dots of preserved content, according to implicit context, e.g., each bit connects to its place in the whole chunk). Perhaps this is another unnoticed type of constructive remembering, which does not involve connecting dots or filling gaps. If we consider the way the mind works with “IRS”, the information seems to “un-chunk”, or as Shannon might say, “decompress”. We do not retrieve a pristine file from storage which spits out the right 24 characters in order. Perhaps we construct a mental image of the word, visually. Perhaps we mentally sound it out afresh. I do not know how this occurs, but I dare say there is clearly a process.

Now, for inexperienced spellers, and for unfamiliar words, perhaps this process is just like schema theory, as the mind searches for the right pieces of content to fill in the missing part of a word. In these cases, a mind works to assemble associated trace memories. However - and this is today’s key idea - when the expert’s mind is extremely familiar with a chunk, it somehow recognizes every piece in the unified chunk. In such cases, it seems the mind unpacks something that was previously packed, as if some “engrams” can be decoded as a precise inversion of the way they were encoded. At least, with a chunk like “IRS”, we seem to “reconstruct” or “decompress” these 24 characters by inverting whatever cognitive process allowed us to embed the information as one “chunk” in the first place.

What seems to matter the most is familiarity. A small child may initially struggle to decompress the serialized chunk “c-a-t” but she will hopefully move on to the challenge of reconstructing sequences like “h-e-m-i-s-p-h-e-r-e”, “v-e-r-t-e-b-r-a-t-e”, and “g-e-r-a-n-i-u-m”. With each new and unfamiliar word, learning ensues. The chunk, itself, is constructed via a process. It takes time. The details that will later re-sequence themselves so automatically are not embedded as quickly. However long it may take, a serial pattern is successfully chunked when its parts are re-encoded as one whole, creating a brand new engram (trace memory), which essentially compresses a large amount of contextual information. Now, in all that, by definition, the chunk itself has become a familiar item. It has been, we tend to say, “memorized”.

By the way, this is where the subjective aspect of compression (in information theory, more properly) may become less perplexing. In ancient Rome, “SPQR” was a familiar serial pattern, a single chunk compressing 26 characters: “Senatus_Populusque_Romanus”. Today, “SPQR” is meaningless to most observers, a single chunk denoting only itself. Thus, the science of Information seems beholden to systems of memory. It’s not subjective at all. It’s cognitive. There is no “information” in the universe anywhere. DNA is structured by (and RNA transmits) complex genetic sequences that strike us as “information”, and they can certainly be analyzed most fruitfully as such, but isn’t this merely a biochemical phenomenon? Anyway, now I’m definitely in the deep waters, so let’s end this excursus right here.

Anon, then...
