Special Education Assessments 101: A Parent's Decoder Ring

A Parent’s Guide to Special Education Assessments

When your kid is referred for a full evaluation, the special education assessments the school plans to run are usually listed on a consent form you sign. That list might include five tests. It might include twelve. Most of them come with acronyms like CTOPP-2, WIAT-4, WJ-IV, KTEA, BASC, BRIEF, or WISC-V.

It’s a lot. Nobody sits you down and explains what each special education assessment actually measures before you sign the form.

This page is that explanation. For each test your school is likely to use, you will find what domain it covers (reading, math, phonological processing, executive function), what a strong or weak score looks like, and what it does and does not tell you about your child’s learning profile.

The goal is simple. By the time you walk into the IEP meeting to review results, you should know what each number represents, which patterns to flag, and what to ask the evaluator directly.

Cognitive ability (IQ)

Cognitive tests measure how your child reasons, remembers, processes information, and works under time pressure. They are the foundation of most special education evaluations because they establish what your child is capable of, which then becomes the comparison point for academic achievement. A meaningful gap between cognitive ability and academic achievement is one of the patterns that supports a specific learning disability finding.

WISC-V — the cognitive test you are most likely to see in your child’s evaluation report if they are between 6 and 16. Five composite scores, plus the Full Scale IQ and the General Ability Index (GAI). The GAI in particular is worth asking about.

Academic achievement

Academic achievement tests measure what your child has actually learned. Reading, writing, math, and oral language are the standard domains. Schools usually pick one of two batteries; both cover similar ground but with different subtests and slightly different normative samples.

KTEA-3 — the Kaufman Test of Educational Achievement. Strong reading composites and a clean dyslexia-screening setup when paired with the WISC-V.
WIAT-4 — the Wechsler Individual Achievement Test. The other standard achievement battery. Watch for grade equivalents in the report; they are not what you think they are.

If you see both KTEA-3 and WIAT-4 in your child’s evaluation, that is unusual; one or the other is the norm.

Reading-specific tests

The achievement batteries above include reading subtests, but evaluators sometimes add reading-specific tests when the question is whether a child has dyslexia or a fluency-only profile. These zero in on word-level reading and oral reading.

TOWRE-2 — the Test of Word Reading Efficiency. Times a child reading lists of real words (SWE) and pseudowords (PDE). Time pressure is the test; it is designed to expose decoding that does not yet feel automatic.
GORT-5 — the Gray Oral Reading Test. Oral reading of passages with rate, accuracy, fluency, and comprehension scores. Slow does not always mean struggling; the four subscores tell a more nuanced story.

Phonological awareness

Phonological processing is the foundation of reading. Most kids with dyslexia have a measurable weakness here, and the pattern shows up in this one test more reliably than anywhere else in a typical battery.

CTOPP-2 — the Comprehensive Test of Phonological Processing. Measures phonological awareness, phonological memory, and rapid naming. Look for the double deficit pattern (weak phonological awareness AND weak rapid naming); it is the strongest pattern indicator of dyslexia in this test battery.

Language

Language tests measure receptive language (how well your child understands what they hear) and expressive language (how clearly they can use words to communicate). Language disorders often co-occur with dyslexia and ADHD, and they affect reading comprehension in ways that phonics-only interventions cannot reach.

CELF-5 — the Clinical Evaluation of Language Fundamentals. The standard language battery in school evaluations. The Core Language Score is a composite; pay attention to the underlying subtests because the composite can mask a real receptive vs expressive split.

Articulation

GFTA-3 — the Goldman-Fristoe Test of Articulation. Measures how clearly your child produces the sounds of English in single words and connected speech. Often paired with the CELF-5 if speech is a concern.

Visual-motor integration

Beery VMI — the Beery-Buktenica Developmental Test of Visual-Motor Integration. Measures the coordination between visual perception and motor output (how well your child sees a shape and can reproduce it with a pencil). Often part of an OT evaluation; relevant for handwriting concerns.

Behavior, executive function, and ADHD

These three tests are rating scales, not direct measures of your child. Parents, teachers, and sometimes the child themselves answer questions about what they observe. The score profiles capture patterns that show up across raters and across settings.

BASC-3 — the Behavior Assessment System for Children. Wide net across emotional, social, and adaptive functioning. Look at the difference between “Clinically Significant” and “At-Risk” thresholds; both matter, but they mean different things.
BRIEF-2 — the Behavior Rating Inventory of Executive Function. T-scores at or above 65 are the threshold to take seriously. The BRIEF-2 catches what classroom observation often misses, especially in bright kids who hide their executive function struggles.
Conners-3 — the standard ADHD rating scale. One rater is a snapshot; the real signal is in the parent vs teacher disagreement and what each rater saw differently.
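To see why a T-score of 65 is a meaningful line, it helps to place it on the curve. Rating scales report T-scores, which are built around an average of 50 and a standard deviation of 10. The sketch below assumes a perfectly normal curve, which is only an approximation: each publisher's own norm tables are the authority, and behavior ratings are often skewed. The function name is made up for illustration.

```python
from statistics import NormalDist

# Assumption: T-scores follow a normal curve with mean 50 and SD 10.
# Real rating scales use published norm tables that can differ from this.
T_SCORE_CURVE = NormalDist(mu=50, sigma=10)

def describe_t_score(t_score: float) -> tuple[float, float]:
    """Return (standard deviations above average, approximate percentile)."""
    sds_above_average = (t_score - 50) / 10
    approx_percentile = T_SCORE_CURVE.cdf(t_score) * 100
    return sds_above_average, approx_percentile

for t in (50, 60, 65, 70):
    sds, pct = describe_t_score(t)
    print(f"T = {t}: {sds:+.1f} SD, roughly the {pct:.0f}th percentile")
```

Under that assumption, a T-score of 65 sits one and a half standard deviations above average, meaning the rater reported more concerns than for roughly 93 out of 100 children in the comparison group.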

Screeners vs diagnostics

Not every test in a school evaluation is a full diagnostic instrument. Screeners are shorter, designed to flag kids who need a closer look. They are not designed to rule dyslexia in or out on their own.

SDQA — the San Diego Quick Assessment. Reading screener, not a diagnostic. Useful as a first pass; not enough on its own.
How accurate are screeners, really? — sensitivity, specificity, and why a passed screener does not mean clear of dyslexia.
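The arithmetic behind "a passed screener is not an all-clear" can be sketched in a few lines. Sensitivity is the share of children with dyslexia the screener flags; specificity is the share of children without dyslexia it correctly passes. Every number below is hypothetical and chosen only for illustration; none of them describes the SDQA or any other real screener.

```python
def screener_outcomes(sensitivity, specificity, base_rate, n_children=1000):
    """Split a group of screened children into the four possible outcomes."""
    with_dyslexia = n_children * base_rate
    without_dyslexia = n_children - with_dyslexia
    flagged_correctly = with_dyslexia * sensitivity        # true positives
    missed = with_dyslexia - flagged_correctly             # false negatives
    passed_correctly = without_dyslexia * specificity      # true negatives
    flagged_wrongly = without_dyslexia - passed_correctly  # false positives
    return {
        "flagged_correctly": flagged_correctly,
        "missed": missed,
        "passed_correctly": passed_correctly,
        "flagged_wrongly": flagged_wrongly,
    }

# Hypothetical values: 80% sensitivity, 90% specificity, 10% of children affected.
result = screener_outcomes(sensitivity=0.80, specificity=0.90, base_rate=0.10)
passed = result["missed"] + result["passed_correctly"]
print(f"Children with dyslexia who passed the screener: {result['missed']:.0f}")
print(f"Share of passing children who have dyslexia: {result['missed'] / passed:.1%}")
```

With these made-up inputs, 20 of the 100 children with dyslexia pass the screener. That is the group a "passed" result can overlook.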

How to read the scores

Every test on this page produces standard scores, percentiles, and confidence intervals. Once you know how those three things relate, you can read any of these reports.

What you need to know about tests — standard scores, percentiles, confidence intervals, in plain English.
Confidence intervals and IEP qualification — why a single score is rarely the whole story.
Why your reading report says “on track” — when you can see otherwise.
How to read your child’s PLAAFP — the IEP section where the test scores get translated into a description of your child.
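As a rough illustration of how standard scores and percentiles relate, here is a minimal sketch. It assumes the common standard-score scale (average of 100, standard deviation of 15) and a perfectly normal curve. Publishers' norm tables are the real authority and can differ slightly, and the function name is made up for this example.

```python
from statistics import NormalDist

# Assumption: standard scores follow a normal curve with mean 100 and SD 15.
STANDARD_SCORE_CURVE = NormalDist(mu=100, sigma=15)

def percentile_from_standard_score(standard_score: float) -> float:
    """Approximate percentile rank (0 to 100) for a standard score."""
    return STANDARD_SCORE_CURVE.cdf(standard_score) * 100

for score in (70, 85, 100, 115):
    pct = percentile_from_standard_score(score)
    print(f"Standard score {score}: roughly the {pct:.0f}th percentile")
```

Under these assumptions, a standard score of 85 lands near the 16th percentile and 115 near the 84th. Percentiles bunch up around the middle of the curve, so the same 15-point step means a much bigger percentile change near 100 than it does out at the extremes.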

Free tools to translate the report

Three interactive tools live on this site to help you translate evaluation numbers into plain English.

Confidence Interval Calculator — paste in a test score and see the range your child’s actual score sits inside.
Reading Profile Builder — drop in your child’s evaluation scores and see where the gaps are.
Spelling Under Load — a working-memory simulator that shows why kids with dyslexia spell fine in isolation and fall apart in a paragraph.

How a special education evaluation gets put together

School psychologists and educational diagnosticians do not pick random tests. The battery is built to answer specific questions about your child. Understanding which test is doing which job helps you read the report and helps you ask the evaluator better questions before you sign off.

The standard structure of a thorough evaluation is roughly:

One cognitive test — usually the WISC-V. Establishes overall reasoning ability and the cognitive profile.
One academic achievement battery — usually the KTEA-3 or the WIAT-4. Measures what your child has actually learned to do in reading, writing, and math.
One or more area-specific tests when there is a focused concern. Examples: CTOPP-2 if dyslexia is on the table, GORT-5 or TOWRE-2 if reading fluency is the question, CELF-5 if language is the concern, GFTA-3 if articulation is the concern, Beery VMI if handwriting or visual-motor concerns are flagged.
Behavior and executive-function rating scales filled out by parents and teachers. Usually the BASC-3 (broad behavior), BRIEF-2 (executive function), and Conners-3 (ADHD-specific). Each scale is often completed by more than one rater so the evaluator can compare across raters.
A reading screener like the SDQA, sometimes used before the deeper diagnostic tests are ordered, sometimes used as a quick check during the evaluation.

Your child’s evaluation will not include every test on this page. A typical eligibility evaluation includes 5 to 8 tests. A focused reevaluation might include 2 to 4. If a test you expected to see is missing, that is a fair question to ask the evaluator: why did you choose this battery, and what question is each test answering?

Common evaluation pairings

Tests rarely stand alone. The pattern across two or three tests is usually more meaningful than any single score. The pairings below are the ones you are most likely to see together.

Cognitive + academic (the standard SLD pair)

WISC-V + KTEA-3 (or WIAT-4). The cognitive test tells you what your child is capable of; the achievement battery tells you what they have learned. A meaningful gap between the two is the classic pattern that supports a specific learning disability finding. The evaluator will calculate this gap explicitly; you should be able to read it on the report.
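To make the idea of a gap concrete, here is a deliberately simple sketch that subtracts one standard score from the other and expresses the difference in standard deviations. This is an assumption-laden illustration, not the method your evaluator uses: states and districts apply their own rules, some use more involved statistical comparisons, and no eligibility cutoff is built in here. The scores and the function name are hypothetical.

```python
def ability_achievement_gap(cognitive_score, achievement_score, sd=15.0):
    """Return the gap in points and in standard deviations.

    Assumes both scores are on the same mean-100, SD-15 scale.
    A positive gap means achievement is below cognitive ability.
    """
    gap_points = cognitive_score - achievement_score
    gap_in_sds = gap_points / sd
    return gap_points, gap_in_sds

# Hypothetical example: cognitive score of 105, reading composite of 82.
points, sds = ability_achievement_gap(105, 82)
print(f"Gap: {points} points, about {sds:.1f} standard deviations")
```

A question worth asking at the meeting: which comparison method did the evaluator use, and how large a gap does the district treat as meaningful?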

The dyslexia screening triad

CTOPP-2 + TOWRE-2 + GORT-5. Phonological awareness, timed word reading, and oral reading. If all three come in below average, that is a strong dyslexia signal. If only one or two are weak, the report should explain which specific reading skill is breaking down and which is intact. The CTOPP-2 in particular catches dyslexia early because phonological processing is the foundation that reading fluency builds on.

The behavior and executive function triad

BASC-3 + BRIEF-2 + Conners-3. Three rating scales that cover overlapping but distinct ground. The BASC-3 casts a wide net (emotional, social, adaptive). The BRIEF-2 zeroes in on executive function. The Conners-3 is ADHD-specific. The most useful pattern is comparing parent and teacher ratings on the same test; large disagreements are informative in their own right, because they show how your child’s behavior differs across settings.

The language and speech pair

CELF-5 + GFTA-3. The CELF-5 measures language broadly (receptive and expressive); the GFTA-3 measures articulation. Often given together when speech or language is a concern. Pay attention to the gap between receptive and expressive scores on the CELF-5; the composite Core Language Score can mask a real split.

What to push back on

A few patterns that come up often enough in evaluation reports to flag in advance.

A single Full Scale IQ used to dismiss concerns. The FSIQ is rarely the most informative number in the report. Ask for the index scores and the General Ability Index.
Grade equivalents reported as if they were diagnostic. Grade equivalents are loose approximations and not what the test was normed to measure. Standard scores and percentiles are the meaningful numbers.
A reading composite reported without the underlying subtests. A child can have an average reading composite while being a full standard deviation behind on word-reading fluency. The composite hides that. Ask for the subtest breakdown.
A passed screener used as evidence the child does not need a full evaluation. Screeners are designed to catch most kids who need a closer look, not all of them. Read the post on screener accuracy for the full picture.
Confidence intervals not reported. Every score has a margin of error. A reported standard score of 85 might really mean a true score anywhere from 79 to 91. The confidence interval can change whether a score counts as “average” or “below average,” especially near eligibility cutoffs.
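For the curious, the margin of error around a score can be sketched with one common textbook approach: the standard error of measurement. The reliability value of 0.96 below is a hypothetical figure chosen for illustration; each test manual reports its own, and some manuals center the interval on an estimated true score rather than on the reported score.

```python
import math

def confidence_interval(score, reliability, sd=15.0, z=1.96):
    """Approximate confidence interval around a reported standard score.

    Assumptions: the score is on a mean-100, SD-15 scale, the standard error
    of measurement is sd * sqrt(1 - reliability), and z = 1.96 gives a
    95% interval centered on the reported score.
    """
    sem = sd * math.sqrt(1.0 - reliability)
    margin = z * sem
    return score - margin, score + margin

# Hypothetical reliability of 0.96 for a reported standard score of 85.
low, high = confidence_interval(85, reliability=0.96)
print(f"95% interval: about {low:.0f} to {high:.0f}")
```

With these assumed inputs, the interval comes out at about 79 to 91. A less reliable test gives a wider range, which is one reason to ask for the interval and not just the single number.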

Where to go next

If you have a specific test you need to look up right now, every cheat sheet is also indexed in the Assessments Library. If you would rather think about reading instead of testing, start with Phonics fixed everything, right? for the science of reading framing that sits behind every test on this page.