The Efficacy of Learning Modalities

Introduction

At McCoy, we believe education rests on two pillars: entertainment-driven learning, which sustains engagement and motivation, and research-driven learning, which ensures knowledge sticks. This paper focuses on the latter: what rigorous studies show about the efficacy of different learning modalities, techniques, and settings.

We review traditional and modern methods, including reading, video, classroom learning, tutoring, and audio, while also defining the technical concepts (such as effect size, Cohen's d, and Hedges' g) that researchers use to measure effectiveness. Our goal: to make research on learning transparent, practical, and actionable.

Understanding Educational Research

When we talk about “what works” in education, the answers often come from decades of carefully controlled studies. But to make sense of those studies, researchers rely on a common statistical language—one that can feel unfamiliar to practitioners, investors, or even learners themselves. Terms like effect size, Cohen’s d, and meta-analysis are the tools researchers use to translate thousands of data points into clear conclusions about whether a learning method truly makes a difference. In this section, we break down these concepts into simple definitions so that anyone—regardless of statistical background—can follow along. With this foundation, the rest of the paper moves beyond abstract numbers into practical insights: what the evidence actually tells us about the effectiveness of different learning methods.

What Actually Works in Learning: Modalities, Techniques, and Evidence-Based Efficacy

A concise, evidence-graded snapshot of durable learning across ages and subjects. We separate delivery modes (tutoring, in-class, video, audio, reading) from learning techniques (retrieval, spacing, feedback), because outcomes depend more on what learners do with content than on the medium that carries it.

McCoy Universe Inc.
September 2025

Table of contents
  1. How to read the ratings
  2. A) Delivery modes
  3. B) Learning techniques
  4. C) Medium effects
  5. D) Age groups & domains
  6. E) Practical stack
  7. F) Quick reference
  8. Caveats & scope

How to read the ratings

  • Effect sizes are standardized mean differences (Hedges' g / Cohen's d); a short computational sketch follows this list. Rule of thumb: ~0.20 is small, 0.50 moderate, and 0.80 large.
  • High/Medium/Low are practical ratings for long-term retention and near-term knowledge acquisition, based on meta-analyses and large reviews.
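
For readers who want the arithmetic behind those labels, here is a minimal sketch in Python, using only the standard library, of how the two statistics are computed for a treatment and a control group. The means, SDs, and group sizes in the example are hypothetical, not drawn from any study cited here.

```python
import math

def cohens_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2)
                          / (n_t + n_c - 2))
    return (mean_t - mean_c) / pooled_sd

def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Hedges' g: Cohen's d with a small-sample bias correction."""
    d = cohens_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c)
    j = 1 - 3 / (4 * (n_t + n_c) - 9)  # approximate correction factor
    return d * j

# Hypothetical tutoring study: treated group scores 78 vs. 70 for controls,
# both SDs 20, 50 students per arm.
print(round(cohens_d(78, 70, 20, 20, 50, 50), 2))  # 0.4 -> small-to-moderate
print(round(hedges_g(78, 70, 20, 20, 50, 50), 3))  # ~0.397 after correction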

A) Delivery modes (format & setting)

| Delivery mode | Practical efficacy (retention → acquisition) | Typical impact / notes | When it shines |
| --- | --- | --- | --- |
| High-dosage tutoring (1:1 or 1:2–4) | High → High | Average 0.37–0.42 SD across PreK–12 RCTs; teacher/paraprofessional tutors > volunteers/parents; in-school scheduling > after-school. | Catch-up/gap-closing; early reading; later-grade math; structured 3–5×/week programs. |
| In-class active learning vs. lecture | High → High | In undergrad STEM, exams up 0.47 SD; failure risk ~1.5× lower than lecture; effects hold across class sizes & disciplines. | Problem-solving, discussions, clickers, peer instruction; biggest gains in sections ≤50 students. |
| Traditional lecture (mostly listening) | Low → Medium | Underperforms active learning on exams and failure rates. | Efficient for broadcasting information; add interaction to convert exposure into learning. |
| Blended learning (online + face-to-face) | Medium → Medium | U.S. Dept of Ed meta: blended modestly > face-to-face; newer syntheses mixed but generally positive; design quality matters more than the modality label. | Use online time for retrieval practice, spaced work, and feedback; in-person time for application. |
| Online-only courses | Low/Medium → Medium | On average roughly comparable to face-to-face, with huge variance; asynchronous > synchronous in some reviews; success hinges on embedding high-impact techniques. | Adult/professional learning, flexibility contexts; add frequent quizzes and feedback. |
| Video lessons (watching) | Low → Medium | Video ≈ text for comprehension when passive; adding graphics to text aids comprehension (g ≈ 0.39); embedded questions/pauses (retrieval + feedback) move the needle. | Short, focused videos with built-in low-stakes questions; pair with practice. |
| Audio-only (podcasts/audiobooks) | Low → Medium | Reading and listening broadly comparable for gist; reading often better for inference-heavy tasks; reading-while-listening gives a trivial edge (g ≈ 0.18) when pacing is externally controlled. | Commutes, review; add visuals for technical content; pair with quizzing. |
| Reading (solo) | Low → Medium | Mediocre for retention if it's just rereading; add retrieval, spacing, and graphics to make it strong. | Self-paced study with structured practice and spaced schedules. |

Takeaway: Delivery mode matters far less than whether the experience contains active ingredients like retrieval practice, spacing, feedback, interleaving, and worked examples.


B) Learning techniques (the “active ingredients”)

| Technique | Practical efficacy (retention → acquisition) | Typical impact / key notes | Best uses |
| --- | --- | --- | --- |
| Retrieval practice (practice testing, low-stakes quizzing) | High → High | Meta-analyses: ~0.5–0.6 SD vs. restudy; bigger with feedback; robust from K–12 to adults and across subjects. | Frequent low-stakes quizzes, flashcards with feedback, end-of-lesson questions (see the sketch after this table). |
| Spaced practice (distributed review) | High → Medium | Optimal gaps depend on the retention goal; an optimal spaced review can double long-term recall vs. cramming at equal study time. Rule-of-thumb gap ≈ 5–10% of the target interval. | Plan reviews days → weeks → months; use SRS tools (Anki-style) or calendar nudges. |
| Feedback (formative/corrective) | High → Medium | Large, consistent benefits; average d ≈ 0.48 across hundreds of effects; specific, task-focused feedback > generic praise. | Show correct answers on quizzes, targeted hints, error analysis. |
| Interleaving (mixing problem types/topics) | Medium → Medium | Overall g ≈ 0.42; strongest in category learning & problem solving (e.g., math); weaker or negative for some simple tasks. | Mix 2–4 skills/problems per session; avoid long blocked sets once basics are learned. |
| Worked examples (studying step-by-step solutions) | Medium/High → High | In math, g ≈ 0.48 (≈0.44 after publication-bias adjustment); correct examples beat "incorrect" variants; pairing with forced self-explanations sometimes reduced impact in this meta. | Early skill acquisition; alternate example → problem; fade steps over time. |
| Concept mapping / graphic organizers | Medium → Medium | Meta across 142 effects: g ≈ 0.58; helps organize and connect ideas; complements (doesn't replace) retrieval. | Consolidation after initial exposure; planning essays; unit reviews. |
| Mnemonics (e.g., method of loci, keyword) | High for verbatim recall → Low/Medium for transfer | Method-of-loci RCTs: g ≈ 0.65 in adults; very large effects in special-education populations (≈1.6 SD); best for names, terms, and lists, less for conceptual transfer. | Languages, anatomy, legal elements; combine with spaced retrieval. |
| Self-explanation / elaborative interrogation | Medium → Medium | Helpful when guided; weaker than retrieval/spacing but valuable for sense-making. | "Explain it like I'm teaching it"; "why does this step work?" prompts. |
| Summarizing (unguided) | Low/Medium → Medium | Benefits depend on training and summary quality; novices tend to write superficial summaries. | Use structured prompts (e.g., a 1-minute paper plus a quiz). |
| Highlighting / rereading | Low → Low | Popular but consistently weak for retention versus active techniques. | Only as a first pass before retrieval practice. |
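
To make the top rows concrete, below is a minimal sketch, assuming nothing beyond the Python standard library, of low-stakes quizzing with immediate corrective feedback. The deck items are hypothetical placeholders, not a prescribed question bank, and the exact-match grading is deliberately simplistic.

```python
import random

# Hypothetical prompt/answer pairs; in practice these come from the
# learner's own material.
DECK = [
    ("Conventional label for an effect size of ~0.20", "small"),
    ("Technique with a ~0.5-0.6 SD advantage over restudying", "retrieval practice"),
    ("Rule-of-thumb spacing gap as a share of the retention interval", "5-10%"),
]

def quiz(deck):
    """Low-stakes retrieval practice with immediate corrective feedback."""
    random.shuffle(deck)  # vary order so recall isn't cued by sequence
    for prompt, answer in deck:
        response = input(f"{prompt}? ").strip().lower()
        if response == answer:
            print("Correct.")
        else:
            # Corrective feedback: show the target answer right away.
            print(f"Not quite. Answer: {answer}")

if __name__ == "__main__":
    quiz(DECK)
```

The same loop pairs naturally with the spacing row: rather than running the deck once, re-run it on a spaced schedule.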

C) Medium effects: reading vs. video vs. audio—what do studies actually say?

  • Reading vs. listening: Overall comprehension is comparable; reading often edges out listening on inference-heavy tasks. Reading-while-listening adds a trivial edge (g≈0.18), mainly when pacing is externally controlled. For technical/spatial content, add visuals.
  • Graphics help text: Adding well-designed graphics to text yields a moderate comprehension gain (g≈0.39).
  • Video alone ≠ magic: Passive watching ≈ reading; gains appear when you embed retrieval and feedback (in-video questions, recall pauses, follow-up quizzes).
  • Learning styles: Matching to “visual/auditory/kinesthetic” preference does not improve outcomes; choose the medium that fits the content and layer active techniques.

D) Age groups & domains—what generalizes?

  • K–12: Tutoring (especially in-school, high-dosage) is powerful (~0.37–0.42 SD). Retrieval practice, spacing, and worked examples generalize from early grades through high school.
  • Undergraduate / adult: Active learning beats lecture across STEM, lowering failure and boosting exams (~0.47 SD). Retrieval and spacing remain robust; interleaving aids transfer.
  • Haptic/sensory skills: Purely online/video approaches are weaker; blended with hands-on practice usually required; design quality dominates.

E) What this means in practice (a simple “stack” you can deploy anywhere)

  1. Acquire efficiently: Use worked examples or a short reading/video to grasp the model solution or core schema. Add graphics if the content is spatial.
  2. Get it out of your head: Within minutes, do retrieval practice—2–6 questions, flashcards, or a one-minute free recall—with corrective feedback.
  3. Space it out: Schedule reviews at increasing intervals. Heuristic: for a ~100-day target, an early gap of ~7–10 days (≈5–10% of the interval); see the scheduling sketch after this list.
  4. Mix it: Interleave 2–4 problem types or topics per session to improve discrimination and transfer.
  5. Consolidate understanding: Build a concept map or a brief self-explanation, then quiz again.
  6. Heavy factual loads: Use mnemonics (e.g., method of loci) but still space and test them.
  7. High stakes or wide gaps: Add tutoring (3–5×/week; small groups; in-school/structured).
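
Steps 3 and 4 are easy to mechanize. The sketch below, again a non-authoritative illustration, turns the ~5–10% gap heuristic into concrete review dates and interleaves a few skills per session; the 7.5% starting gap, the 1.5× growth factor, and the skill names are assumptions chosen for the example.

```python
from datetime import date, timedelta
import random

def review_schedule(start, target_days, first_gap_fraction=0.075, growth=1.5):
    """Spaced reviews at increasing gaps out to the retention horizon.

    The first gap is ~5-10% of the target interval (here 7.5%); later
    gaps expand by an assumed growth factor, spacing reviews out from
    days to weeks to months.
    """
    gap = max(1, round(first_gap_fraction * target_days))
    day = gap
    dates = []
    while day <= target_days:
        dates.append(start + timedelta(days=day))
        gap = round(gap * growth)
        day += gap
    return dates

# Step 4: interleave 3 of 4 hypothetical skills per review session.
skills = ["fractions", "ratios", "percentages", "proportions"]
for session in review_schedule(date(2025, 9, 1), target_days=100):
    mix = random.sample(skills, 3)
    print(session.isoformat(), "->", ", ".join(mix))
```

For a 100-day target this yields sessions roughly 8, 20, 38, and 65 days out, matching the "days → weeks → months" cadence in the techniques table.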

F) Quick reference — efficacy ratings at a glance

Highest confidence: Retrieval practice, Spaced practice, Active learning, High-dosage tutoring, Feedback, Worked examples.

Helpful / context-dependent: Interleaving, Concept mapping, Self-explanation (guided).

Low yield unless combined: Rereading, Highlighting, Passive watching/listening.


Caveats & scope

  • Effect sizes vary with outcome type (immediate vs delayed), subject matter, implementation fidelity, and who delivers instruction. Treat these as typical impacts from meta-analyses, not guarantees.
  • Large “meta-meta” compilations are debated; this summary relies on primary meta-analyses for each technique.