Instructional techniques that are highly effective with inexperienced learners can lose their effectiveness and even have negative consequences when used with more experienced learners
This week’s paper is The Expertise Reversal Effect (2003) from Sweller, J., Ayres, P. L., Kalyuga, S. & Chandler, P. A.
It walks through a number of pedagogical strategies based on Cognitive Load Theory and demonstrates that most of them have a marked ‘flip’ in results depending on the level of expertise of the learner.
The full paper is only 8 pages, so it’s a quick read.
This paper describes lots of instructional techniques. The cognitive science explanations all come down to limits on working memory. Cognitive Load Theory says that there are only a few “slots” available for “chunks” of information to fit in working memory at a time. If you try to fit in too many chunks, you get reduced processing ability.
The fix is schemas. Learners build schemas that make concepts fit into less working memory.
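To make the slots-and-chunks idea concrete, here is a toy model in Python. It is only a sketch of the intuition, not something from the paper: the capacity of 4 slots, the function names, and the grammar example are all assumptions chosen for illustration.

```python
from typing import Dict, FrozenSet, List, Set

# Assumption: 4 slots is illustrative; the post only says "a few" slots.
WORKING_MEMORY_SLOTS = 4


def count_chunks(elements: List[str], schemas: Dict[str, FrozenSet[str]]) -> int:
    """Count the chunks needed to hold `elements`, given a learner's schemas.

    Each schema collapses all of its member elements that are present into a
    single chunk. Elements not covered by any schema cost one chunk each.
    """
    remaining: Set[str] = set(elements)
    chunks = 0
    for members in schemas.values():
        covered = remaining & members
        if covered:
            chunks += 1
            remaining -= covered
    return chunks + len(remaining)


def is_overloaded(
    elements: List[str],
    schemas: Dict[str, FrozenSet[str]],
    capacity: int = WORKING_MEMORY_SLOTS,
) -> bool:
    """True when the material needs more chunks than there are slots."""
    return count_chunks(elements, schemas) > capacity


# Hypothetical lesson material and learner schemas.
sentence_parts = ["subject", "verb", "object", "tense", "agreement", "word_order"]
novice_schemas: Dict[str, FrozenSet[str]] = {}
expert_schemas = {
    "clause": frozenset({"subject", "verb", "object", "word_order"}),
}

print(count_chunks(sentence_parts, novice_schemas), is_overloaded(sentence_parts, novice_schemas))
print(count_chunks(sentence_parts, expert_schemas), is_overloaded(sentence_parts, expert_schemas))
```

In this toy model the novice needs six chunks for the same material that the expert holds in three, because the expert's "clause" schema packs four elements into one slot.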
Because of the limited capacity working memory, the proper allocation of available cognitive resources is essential to learning. If a learner has to expend limited resources on activities not directly related to schema construction and automation, learning may be inhibited.
In Cognitive Load Theory, all of learning is framed in light of working memory limits. Learners acquire new schemas, then through practice, make schema use automatic instead of effortful. This reduces the working memory burden. Expertise means having lots of schemas, practiced to the level of automatic use. Experts can handle complex tasks because their schemas reduce working memory demands.
The goal of instruction is to scaffold the construction of schemas.
Novices have fewer schemas in place - therefore, less ability to organize new information. Effective instruction can substitute for missing schemas by structuring new information - pre-chewing the tough new knowledge to make it easier to digest. That instructional structuring can also model the schemas for learners, and through example help learners build schemas faster. Without that structuring, learners are more prone to cognitive overload, which limits learning.
Experts already have schemas in place to guide them in dealing with a new task. If instruction provides the schema-construction guidance that’s helpful for novices, it may be redundant for experts. Experts still need to process the redundant information - it still requires attention, i.e. it takes up working memory.
The overlap of schema-based guidance (already in experts’ heads) and guidance from instruction can lead to cognitive overload. The rest of the paper explores situations like these, where guidance that is useful for novices can be negative for experts.
The Split Attention Effect and the Redundancy Effect flip-flop depending on how experienced the learner is.
Separating sources adds a cognitive load of searching and matching between representations. If you have a diagram and explanatory text side-by-side, readers have to scan back and forth to match up concepts. This adds cognitive load, limiting learning. If you integrated the text with the diagram, it would reduce the load from the searching and matching.
This effect holds up similarly for text shown now and text shown later. If you have to e.g. think about a previous slide in a deck and compare it to the current slide, that adds cognitive load compared to showing them at once. Spatially and chronologically integrated materials reduce cognitive load.
If multiple sources of info are all necessary for learning a concept, integrating them is good. However, if each source could stand on its own, eliminating the redundant one is better.
This is surprising! One source alone is better than redundant sources. This is counterintuitive, because we expect that repetition and presenting information in multiple modalities makes something easier to learn. The authors address this below, but I think the way to understand it might be to think about cognitive load in a particular moment - at any given time, adding redundant sources of information will increase cognitive load. Over time, seeing something repeated or presented in different ways might aid learning, but not if it leads to cognitive overloading.
A source of information that is essential for a novice may be redundant for someone with more domain-specific knowledge.
inexperienced trainees benefitted from textual explanations integrated into the diagrams (to reduce split attention). However, more experienced trainees performed significantly better with the diagram-only format. For these more knowledgeable learners, the textual information, rather than being essential and so best integrated with the diagram, was redundant and so best eliminated.
This is the Expertise Reversal Effect.
Verbose, detailed explanations can help learners who are reading about new concepts. Learners with more expertise get distracted by the additional explanatory text, and benefit more from minimal text.
Less knowledgeable learners benefitted from additional explanatory material, but more knowledgeable learners were better able to process the material without the additions
Text that is minimally coherent for novices may well be fully coherent for experts. Providing additional text is redundant for experts and will have negative rather than positive effects.
Expertise Reversal again!
We mentioned above that having multiple modalities of presentation might contradict the (slightly surprising) Redundancy Effect. The authors have another CLT-based explanation for why multi-modal learning might work:
capacity to process information is distributed over several partly independent subsystems
Then, spreading the load across more systems might tap into increased total processing power or working memory.
Many studies have demonstrated that learners can integrate words and diagrams more easily when the words are presented in auditory form rather than visually
Seeing and hearing at the same time seems to work. It spreads the load across more cognitive resources.
auditory explanations may also become redundant when presented to more experienced learners
But interestingly, whether it helps depends on learners’ level of experience! Again, experts get distracted by additional material that would benefit novices. If you’re an instructor or experience designer, adding explanatory material or additional modalities will make things better for novices, but can hurt your more advanced learners.
Worked examples are problems presented with their solutions and the solution steps. Worked examples are often more effective than other problem solving based learning situations. For instance, a guided tutorial often works better than unstructured exploration for introducing a concept to beginners.
However, for experts, studying worked examples adds cognitive load compared to working through the problems on their own.
Inexperienced mechanical trade apprentices were presented with either a series of worked examples to study or problems to solve. On subsequent tests, inexperienced trainees benefitted most from the worked examples condition. Trainees who studied worked examples performed better with lower ratings of mental load than similar trainees who solved problems, duplicating a conventional worked example effect. With more experience in the domain, the superiority of worked examples disappeared. Eventually, with sufficient experience, additional learning was facilitated more by problem solving than through studying worked examples. The worked examples became redundant and problem solving proved superior, demonstrating another expertise reversal effect.
There’s an intuitive connection here - before you’ve seen someone do something, it’s overwhelming to have to be ‘thrown in the deep end’ and try to solve things on your own. However, after you’ve seen someone else demonstrate a skill, you benefit most from trying it on your own.
inexperienced learners benefitted most from an instructional procedure that placed a heavy emphasis on guidance. Any additional instructional guidance (e.g., indicating a goal or subgoals associated with a task, suggesting a strategy to use, providing solution examples, etc.) should reduce cognitive load for inexperienced learners, especially in the case of structurally complex instructional materials. At the same time, additional instructional guidance might be redundant for more experienced learners and require additional working memory resources to integrate the instructional guidance.
Many types of support reduce cognitive load for beginners but add cognitive load for experts, because for experts the support is redundant. The implication here is that there’s also a missed opportunity to allow experts to practice their schemas.
Unsurprisingly, experience is a gradient. For all of these effects, we see a gradual fading out and then crossing over, not a sudden flip.
a fading out procedure was superior to an abrupt switch from worked examples to problems.
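One way to picture a fading-out procedure is as a schedule that shows fewer worked steps on each successive problem. The sketch below is an assumption-laden illustration, not the procedure used in the studies the paper cites: the linear shape, the choice to fade the last steps first, and the names `fading_schedule` and `render_problem` are all hypothetical.

```python
from typing import List


def fading_schedule(num_steps: int, num_problems: int) -> List[int]:
    """Return how many solution steps are shown as worked for each problem.

    The first problem is fully worked and the last has no worked steps, with a
    linear decrease in between (the linear shape is an assumption).
    """
    if num_steps < 1 or num_problems < 2:
        raise ValueError("need at least 1 step and 2 problems")
    return [
        round(num_steps * (num_problems - 1 - i) / (num_problems - 1))
        for i in range(num_problems)
    ]


def render_problem(steps: List[str], worked: int) -> List[str]:
    """Show the first `worked` steps and leave the rest for the learner."""
    return [
        step if i < worked else "(your turn)"
        for i, step in enumerate(steps)
    ]


# Hypothetical four-step solution, faded across five practice problems.
solution = ["read the diagram", "pick the formula", "substitute values", "solve"]
for worked in fading_schedule(len(solution), 5):
    print(render_problem(solution, worked))
```

An abrupt switch would correspond to a schedule that jumps straight from all steps worked to none; the gradual version hands over one more step at a time.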
Systems with lots of interacting elements are hard to learn. There’s a double bind - you need to keep all the pieces in your head in order to understand how any of the pieces work, but you become cognitively overloaded if you try to keep all the new things in your head at once.
Experienced learners’ schemas reduce the working memory needed for each of the interacting parts. Concepts or systems with many interacting parts present a chicken-and-egg problem: you need the schemas to reduce cognitive load, but you can’t build the schemas while you’re overloaded.
For example, learning foreign language syntax requires you to keep track of the relations between all the different parts of speech in your head. By contrast, learning new vocabulary requires keeping only a few new items in working memory.
The instructional solution is to present a simplified (but false) model to help learners build a partial schema. That way learners have some tools in place before they encounter the full system with all its interacting parts.
Interestingly, this instructional strategy did not result in the full expertise reversal effect. Experienced learners showed no difference in effectiveness between the mixed approach (isolated elements followed by interacting elements) and the conventional method (interacting elements instruction during both stages).
This suggests that it’s safe to explore false-but-suggestive isolated elements models with students of all levels as a way of building up to the true interacting elements model.
Asking learners to imagine the contents of instruction is more effective than asking them to study… so long as they have enough experience to imagine effectively. The imagination effect “turns on” as learners gain more experience. When students don’t have sufficient experience, worked examples are more effective.
Once again, the explanation is in terms of working memory overload - imagining works if schemas are in place. Without the schemas, learners’ working memory gets overloaded, because they have to process too many individual components.
The benefits of imagination are also couched in terms of constructing and automating schemas. Imagining instruction encourages automation of schemas.
Instructional design should take into account the expertise of learners. This seems like an obvious insight, but there are subtleties in the particulars of how to design for novices and for experts.
More than anything, instructional design should account for cognitive load. If materials recognize the level of the learner in terms of 1) the schemas they have incorporated and 2) the level of automation with those schemas, they can reduce the chances of overload.
This suggests instructional design tools that audit the number of new concepts learners have to hold in memory as well as learners’ automaticity of schemas. If curriculum designers and teachers had visibility into cognitive load, they could take advantage of the most appropriate instructional strategy for that level of expertise.
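As a rough sketch of what such an audit tool might look like, the Python below counts the concepts in a lesson that a learner has not yet automated and suggests a strategy from the ones discussed above. Everything here is hypothetical: the 0-to-1 automaticity scores, the 0.8 threshold, the capacity of 4, and the mapping from counts to recommendations are assumptions for illustration, not findings from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Audit:
    new_concepts: List[str]
    recommendation: str


def audit_lesson(
    lesson_concepts: List[str],
    automaticity: Dict[str, float],
    capacity: int = 4,          # assumed number of working memory slots
    automatic_at: float = 0.8,  # assumed score at which a schema counts as automatic
) -> Audit:
    """Flag concepts the learner has not automated and suggest a strategy."""
    # Concepts missing from the learner profile are treated as entirely new.
    new = [c for c in lesson_concepts if automaticity.get(c, 0.0) < automatic_at]
    if len(new) > capacity:
        rec = "reduce load: isolate elements and use integrated worked examples"
    elif new:
        rec = "use faded worked examples"
    else:
        rec = "use problem solving with minimal guidance"
    return Audit(new_concepts=new, recommendation=rec)


# Hypothetical learner profile for a lesson on simple circuits.
profile = {"voltage": 0.9, "current": 0.85, "resistance": 0.4}
report = audit_lesson(["voltage", "current", "resistance", "ohms_law"], profile)
print(report.new_concepts)
print(report.recommendation)
```

The hard part in practice would be measuring automaticity at all; this sketch simply assumes those scores exist.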