Assessment & Feedback: Lesson 19.1
Testing and Assessment in English Language Teaching
This unit examines the essential role of assessment in EFL and ESL teaching - covering formative and summative assessment, the main categories of language tests, assessment design principles, approaches to evaluating different language skills, and strategies for giving feedback that supports real learning progress.
Assessment is one of the most important parts of teaching English as a foreign or second language. Without proper assessment, how can we know if our students are actually learning? How can we identify what they're struggling with? How do we measure progress over time? In this unit, we're going to explore the various types of assessment used in EFL and ESL classrooms, examine how to design effective tests, look at ways to evaluate different language skills, and discuss how to give meaningful feedback that helps students improve.
Assessment is broader than the tests and quizzes that happen at scheduled intervals. Useful evidence about learning also comes from classroom observation, student participation patterns, written work completed during lessons, teacher feedback records, and the informal monitoring that occurs whenever a teacher circulates during a task or listens to a learner's response. When teachers treat these everyday classroom moments as assessment data, they develop a clearer and more continuous picture of what learners can and cannot yet do.
A teacher who listens carefully during pair work, notices which students avoid initiating interaction, and records patterns of error across several written tasks is gathering assessment evidence continuously - not only at the end of a unit. This broader view of assessment matters particularly in language teaching, where a learner may demonstrate strong control of a form in one skill but uncertain or inconsistent use of the same form in another. A single test score captures one moment in one context; a wider observational record captures development across time and skill areas, giving a more complete and instructionally useful picture of where each learner actually is.
Check Your Understanding
According to the introduction, which of the following most accurately reflects the stated scope of this unit?
- The unit focuses primarily on IELTS and TOEFL test preparation, covering standardized assessment instruments used in academic and professional contexts alongside strategies for raising student scores on international proficiency examinations
- The unit will explore the various types of assessment used in EFL and ESL classrooms, examine how to design effective tests, look at ways to evaluate different language skills, and discuss how to give meaningful feedback
- The unit provides a comprehensive historical overview of language testing theory and practice, tracing the development of major assessment frameworks from the grammar-translation era through to current communicative assessment models
- The unit introduces teachers to a range of self-assessment and peer assessment tools specifically designed for use in the EFL context, alongside practical scoring rubrics and sample assessments that can be adapted for different teaching contexts and levels
The introduction states directly: "In this unit, we're going to explore the various types of assessment used in EFL and ESL classrooms, examine how to design effective tests, look at ways to evaluate different language skills, and discuss how to give meaningful feedback that helps students improve." Option (b) reflects this exactly. Option (a) focuses the unit on IELTS and TOEFL, which are never mentioned in the introduction - these are specific test types, not the subject of the unit. Option (c) describes a historical overview of testing theory, which is also not stated in the introduction. Option (d) describes self-assessment and rubric design as the unit's primary focus, which misrepresents the broader scope described in the introduction.
Understanding Assessment: Formative vs. Summative
Before we look at specific types of tests, we need to revisit, in more depth, two fundamental categories of assessment that serve very different purposes in the classroom.
Formative assessment is an ongoing evaluation that happens during the learning process. Think of it as checking the temperature while you're cooking rather than waiting until the dish is finished. Formative assessment helps teachers understand what students know right now, what they're confused about, and what needs more attention before moving forward. This type of assessment is informal, flexible, and designed to inform instruction rather than assign grades.
When teachers first start, assessment often follows a predictable pattern: vocabulary quizzes on Fridays and a comprehensive unit test at the end of each month. It feels organized and systematic. But this approach has a significant blind spot: teachers don't actually know whether students understand the material until the test results come back, often revealing widespread confusion that could have been caught and addressed weeks earlier.
A teacher carefully explains the past perfect tense, sees students nodding and completing worksheets, and assumes comprehension. Then the unit test arrives, and nearly half the class fails. The teacher is shocked: "I explained this so clearly! What went wrong?"
What went wrong is the absence of formative assessment throughout the learning process. Without regular checks for understanding, teachers are essentially flying blind. They don't know which students are confused about verb tense changes, which ones missed the nuance of reported speech, or which ones are just copying answers from their neighbors.
Consider This
The formative strategies described in this unit - exit tickets, thumbs signals, brief pair-shares - share some important qualities: they are brief, low-stakes, and designed to be part of regular classroom routine rather than separate, scheduled events. Consider which of these strategies you could realistically incorporate into a typical lesson without significantly disrupting the planned activity sequence.
Think also about the difference between a formative check that confirms what you already expected and one that reveals something you did not anticipate. What would it mean for your practice if a quick check revealed, as the classroom scenario above illustrates, that a language point you felt confident you had explained clearly had not been understood by a large portion of the class? How would that change the way you plan the following lesson, and what would you do differently next time to build in a check before reaching that point in the sequence?
Check Your Understanding
According to the text, what does it mean for a teacher to be "essentially flying blind" when it comes to student understanding?
- It describes the difficulty of planning future lessons when a coursebook's unit structure does not clearly signal which grammar or vocabulary points will be assessed at the end of the term, leaving teachers without a clear instructional target to work toward
- It refers to teaching under poor physical conditions such as limited classroom resources or the absence of proper audio equipment that prevents teachers from checking whether students have heard instructions clearly enough to complete the task
- It characterizes a teacher who reacts emotionally rather than analytically to poor test results, assigning blame to students rather than reflecting on the quality of their own explanation and the feedback routines built into the lesson
- Without regular checks for understanding, teachers don't know which students are confused about verb tense changes, which ones missed the nuance of reported speech, or which ones are copying answers
The text states: "Without regular checks for understanding, teachers are essentially flying blind. They don't know which students are confused about verb tense changes, which ones missed the nuance of reported speech, or which ones are just copying answers from their neighbors." Option (d) reflects this directly. Option (a) describes a coursebook planning problem unrelated to the concept of flying blind as described. Option (b) introduces physical classroom conditions such as audio equipment, which have no connection to the text's use of the phrase. Option (c) describes emotional teacher reaction, which is a separate issue and not how the text describes teaching without formative checks.
Outcomes can improve markedly when teachers start incorporating quick formative checks into every lesson: exit tickets in which students write one thing they understood and one thing they're confused about, thumbs-up/down signals for understanding, or brief pair-shares where the teacher circulates and listens. These take just five or ten minutes but provide crucial information.
Interestingly, when a teacher hands out index cards at the end of a lesson on reported speech and asks students to identify one point of confusion, the results often shock them. Nearly half the class might be mixing up verb tense changes - something the teacher was certain had been explained clearly. Instead of moving on to the next grammar point as planned, the teacher spends the next lesson addressing the confusion, using students' own questions to guide instruction.
Two weeks later, when the unit test arrives, the results are dramatically different. The formative checks caught problems early, before they solidified into bad habits. The summative test no longer surprises the teacher with unexpected failures - it confirms what they already know about their students' learning. This is the real power of formative assessment: it transforms teaching from a guessing game into an informed, responsive process.
From the Field
The formative assessment strategies that tend to work best in practice depend partly on class size, learner age, and the degree of comfort students have with self-disclosure in the classroom. In larger classes, written exit cards work effectively because they allow every student to respond at once without requiring individual verbal interaction with the teacher. In smaller groups, brief oral sweeps - where the teacher asks each student to complete one sentence or answer one question before leaving - can provide equally rich information. In lower-level classes, visual signals such as thumbs-up indicators or traffic light cards provide quick, non-verbal feedback that does not require learners to articulate their confusion in the target language before they are ready to do so.
The key principle across all these contexts is consistent: formative data needs to be collected frequently enough to inform the next lesson, and in a way that is truly useful to the teacher rather than simply reassuring. When the feedback channel between learner and teacher is kept open throughout a unit, the summative test at the end becomes a confirmation rather than a revelation.
On the other hand, summative assessment evaluates what students have learned at the end of a learning period. It's the final grade on the report card, the end-of-term exam, the proficiency test that determines university admission. Summative assessment measures achievement against a standard or set of learning objectives. Unlike formative assessment, summative assessment is formal, structured, and typically high-stakes. Students are compared against benchmarks, and the results often have real consequences for their academic or professional futures.
Both types of assessment are essential in language teaching. Formative assessment helps you adjust your teaching in real time and helps students understand their progress. Summative assessment provides accountability, measures overall achievement, and certifies that students have reached certain levels. The challenge for teachers is to balance these two types appropriately and ensure that summative assessments actually measure what we've been teaching.
Research suggests that when teachers rely too heavily on summative assessment, students can become anxious, demotivated, and focused only on grades rather than learning. Likewise, when summative assessments overshadow the constructive feedback provided by formative assessment, students may miss crucial guidance for improvement. Therefore, effective language teaching requires a thoughtful mix of both approaches, with formative assessment happening regularly and summative assessment used at appropriate intervals to measure cumulative learning.
Beyond the Basics
When evaluating any assessment tool - whether one you have designed yourself or one provided by an institution - it helps to consider it against a set of core qualities. A valid assessment measures what it claims to measure: a test claiming to assess speaking ability should require students to speak, not merely answer grammar questions about speaking situations. A fair assessment gives all students an equal opportunity to demonstrate their ability, regardless of their cultural familiarity with the test format or their access to preparation resources. A practical assessment is one that can actually be administered, marked, and interpreted within the real constraints of time, class size, and available resources - even a well-designed test loses its value if it cannot be implemented consistently. And a supportive assessment is one that helps learners understand where they are and where they need to go next, because scores that produce no usable information for the learner are limited in their instructional value.
These qualities do not always sit comfortably together, and designing assessments frequently involves trade-offs among them - but keeping them as guiding principles helps teachers make more deliberate and defensible decisions about the assessments they use.
Check Your Understanding
According to the text, what specific risk is identified when summative assessments overshadow the constructive feedback provided by formative assessment?
- Students may miss crucial guidance for improvement when summative assessments overshadow the constructive feedback that formative assessment provides
- When summative assessments are introduced too frequently they begin to function as formative assessments instead, blurring the boundary between the two approaches and undermining the purpose of both types of evaluation in the language classroom over time
- Summative assessments become statistically unreliable when used without accompanying formative data, making it impossible for teachers to determine whether individual student scores represent real proficiency gains or primarily reflect test-taking strategies learned over time
- The administrative demands of frequent summative testing prevent teachers from having sufficient planning time to design the communicative tasks and interaction activities that language learners need most at each stage of their development
The text states: "when summative assessments overshadow the constructive feedback provided by formative assessment, students may miss crucial guidance for improvement." Option (a) reproduces this directly. Option (b) introduces a claim that frequent summative testing turns into formative testing - the text makes no such claim and does not describe this as the risk of over-relying on summative assessment. Option (c) introduces statistical reliability as the concern, which the text does not mention in this context. Option (d) describes teacher workload and planning time as the consequence, which also has no basis in the text's description of this particular risk.
Types of Language Tests
Like assessment in general, language tests can be categorized by their purpose. Understanding these categories helps teachers choose or design appropriate assessments for different situations.
Placement tests are used when students first enter a language program. The goal is to determine which class level is most appropriate for each student. A good placement test assesses general language ability across multiple skills, typically reading, writing, listening, and sometimes speaking. The results help administrators group students with similar proficiency levels, making teaching more effective by allowing lessons to be targeted appropriately.
Most placement tests use a format that becomes progressively more difficult. They might start with very basic questions about the present simple tense and common vocabulary, then gradually increase in complexity to advanced structures and academic vocabulary. Students naturally reach a point where questions become too difficult, and that's where their current level is identified. Many language schools use standardized placement tests, while others develop their own based on the levels they offer.
However, placement tests aren't perfect. They provide a snapshot of general ability but might not capture specific strengths and weaknesses. A student might be placed into the Intermediate level based on strong reading and grammar skills, but actually struggles significantly with listening or speaking. To take a typical example, a placement test might include 60 multiple-choice questions covering grammar, vocabulary, and reading comprehension, with clear cutoffs: students scoring 80% or above go to advanced, 60-79% to intermediate, and below 60% to beginner.
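The cutoff rule in the example above can be sketched as a short Python function. This is only an illustration under stated assumptions: the function name `place_by_score` is hypothetical, the 60-question total and the 80% and 60% thresholds are taken from the example, and scores between 79% and 80% (which the example leaves unspecified) are treated here as intermediate.

```python
def place_by_score(correct: int, total: int = 60) -> str:
    """Map a raw multiple-choice score to a class level.

    Hypothetical helper: thresholds follow the example in the text
    (80% or above = advanced, 60-79% = intermediate, below 60% = beginner).
    """
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    percent = 100 * correct / total
    if percent >= 80:
        return "advanced"
    if percent >= 60:
        # Assumption: anything from 60% up to (but not including) 80%
        return "intermediate"
    return "beginner"


# Print the placement for a few raw scores out of 60
for raw in (51, 48, 36, 35):
    print(raw, place_by_score(raw))
```

Note what the function cannot see: it reduces a learner to a single number from a written test, so a student with 51 correct answers (85%) is labelled advanced regardless of how well they can speak or listen.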
The placement test scenario described above highlights a broader limitation that arises whenever assessment isolates individual language skills into separately measured components. In actual communicative situations, language users rarely deploy only one skill at a time: reading a text and responding to it in writing, listening to a task description and speaking a response, or following a conversation that simultaneously demands both reception and production are all conditions that combine skills in real use. Assessments that treat each skill as an entirely separate ability may therefore give a less complete picture of a learner's actual communicative capacity than tasks that require the integration of two or more skills in combination.
A writing task that asks students first to read or listen to source material and then compose a response is one example of this approach. Such integrated tasks tend to reflect more authentic conditions of language use and can surface different patterns of learner ability than single-skill formats alone. Integrated-skills assessment is not appropriate for every purpose - a placement test may reasonably assess skills separately for practical reasons - but understanding its potential value helps teachers design assessments that go beyond measuring language in isolated, artificial conditions.
Check Your Understanding
According to the text, why are placement tests described as providing only a "snapshot" of student ability?
- Placement tests use only multiple-choice formats, which are well-suited for measuring grammar and vocabulary but structurally unable to assess the oral fluency and spontaneous listening comprehension that determine real communicative competence in any classroom interaction setting
- They require students to perform under artificial time constraints that are unlike the conditions of real classroom learning, producing scores that may not accurately reflect how students will perform once settled into a regular teaching programme with familiar materials and routines
- They provide a snapshot of general ability but might not capture specific strengths and weaknesses, potentially placing students based on strong reading and grammar performance while overlooking significant struggles in listening or speaking
- Placement tests are typically produced by external publishers who may not be familiar with the specific curriculum taught at each institution, resulting in test items that measure general language features rather than the particular vocabulary and structures covered at each level of the programme
The text states: "placement tests aren't perfect. They provide a snapshot of general ability but might not capture specific strengths and weaknesses. A student might be placed into the Intermediate level based on strong reading and grammar skills, but actually struggles significantly with listening or speaking." Option (c) reflects this reasoning exactly. Option (a) claims placement tests use only multiple-choice formats - the text does not say this; it gives multiple-choice questions as an example of one format but does not claim that format is universal. Option (b) introduces the idea of artificial time pressure, which is not mentioned as the reason placement tests are limited. Option (d) describes external publisher misalignment with curriculum, which is also not part of the text's explanation of why placement tests provide only a snapshot.
The problem emerges when a student scores 85% on the written test and gets placed in an advanced conversation class, only to freeze when called on to speak. This student completes written exercises perfectly and clearly understands complex grammar rules, but they struggle to form simple sentences aloud. They studied English grammar intensively but rarely spoke it. They can ace any written test but panic when speaking.
This student isn't actually advanced - they're advanced in grammar, vocabulary, and reading but intermediate (or lower) in speaking. The placement test measured only grammar, vocabulary, and reading, completely ignoring speaking and listening. The student ends up in a class far above their actual speaking ability, struggling while other students grow frustrated waiting for responses during pair work.
The solution is to recognize that language proficiency isn't one-dimensional. A revised placement process should include multiple measures: a written test covering grammar, vocabulary, and reading; a 10-minute oral interview in which students answer questions, describe pictures, and engage in brief role-plays; and a short listening comprehension task. This takes longer than a single written test, but placement accuracy is likely to improve considerably. Students end up in classes where they can actually succeed across all skills, rather than being misplaced based on a single measure of ability.
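One way to picture a multi-measure placement decision is the rough Python sketch below. Everything in it is an assumption made for illustration: the text does not say how the three measures should be combined, so the rule used here (place the student at the lowest level indicated by any single measure, and flag uneven profiles for teacher review) is just one plausible choice. The names `band` and `recommend_placement`, the percentage scale for the oral interview, and the reuse of the 80% and 60% cutoffs are all hypothetical.

```python
LEVELS = ["beginner", "intermediate", "advanced"]  # ordered lowest to highest


def band(percent: float) -> str:
    """Convert a percentage score to a level (assumed 80/60 cutoffs)."""
    if percent >= 80:
        return "advanced"
    if percent >= 60:
        return "intermediate"
    return "beginner"


def recommend_placement(written: float, oral: float, listening: float) -> dict:
    """Combine three measures into one placement recommendation.

    Assumed rule (not from the text): use the lowest band across the
    measures, so the student is not placed above their weakest skill.
    """
    bands = {
        "written": band(written),
        "oral": band(oral),
        "listening": band(listening),
    }
    lowest = min(bands.values(), key=LEVELS.index)
    # An uneven profile is worth a closer look by a teacher
    uneven = len(set(bands.values())) > 1
    return {"level": lowest, "bands": bands, "uneven_profile": uneven}


# The student from the scenario: strong on paper, weaker when speaking
print(recommend_placement(written=85, oral=62, listening=70))
```

Under this rule the student who scored 85% on the written test would be recommended for an intermediate class, with the uneven profile flagged. A school might reasonably prefer a different rule, such as placing by speaking level for conversation classes only.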
Progress tests, on the other hand, are designed to measure how much students have learned during a specific period of instruction. These tests focus on the material that's been taught recently, whether that's a particular unit in the coursebook, a set of grammar structures, or specific vocabulary themes. Progress tests are typically given monthly or after completing a certain amount of material, and they help both teachers and students understand whether learning objectives are being met.
The content of progress tests should directly reflect what's been covered in class. If you've spent three weeks teaching past tenses through stories, reading comprehension, and speaking activities, your progress test should assess students' ability to use and recognize past tenses in similar contexts. This alignment between teaching and testing is crucial for validity, which we'll discuss more later.
Consider This
The alignment described here between what is taught and what is tested is closely tied to what assessment research calls washback - the influence that a test exerts on the teaching and learning that precedes it. When progress tests accurately reflect the content and methods used in class, the washback is positive: students practice communicatively in class because they expect the test to reward that kind of practice. When the test misrepresents what was taught, however, the washback can work against the teacher's intentions. Students who recognize that the test will only measure grammar rules may disengage from the speaking and reading tasks the teacher designed, redirecting their preparation toward memorization of forms instead.
Thinking carefully about progress test design is therefore not just about measuring learning accurately - it is about ensuring that the test reinforces the quality of the learning that happened in class. As you design or review progress tests, ask yourself: if students prepared only for this test, would that preparation align with or undermine the kind of language use you have been building toward in your lessons?
Beyond the Basics
Returning a marked progress test with a score alone provides limited information for a learner who wants to improve. Effective feedback on a progress test should identify what the student has done well - not only where marks were lost - because recognition of successful performance reinforces productive language habits and supports continued effort. It should also guide the student toward specific, actionable next steps rather than general encouragement, so that they know what to review, practise, or attempt differently before the next assessment point arrives.
In larger classes where written commentary on every script is not always practical, focused written feedback on two or three high-value points - combined with a class discussion of common patterns noticed across student scripts - can deliver much of the same value more efficiently. The principle underlying both approaches is the same: assessment data is most useful when it helps learners understand precisely where they are and what they should do next, not simply how well they performed on one occasion.
Check Your Understanding
According to the text, what specific example is given to illustrate the alignment between teaching and testing in progress assessments?
- A valid progress test should be administered under the same physical conditions as the end-of-term summative assessment so that students become familiar with the examination environment, making their summative performance a more accurate reflection of their underlying language ability rather than their anxiety about the format
- If students have spent three weeks learning past tenses through stories, reading comprehension, and speaking activities, the progress test should assess their ability to use and recognize past tenses in similar contexts
- Progress tests should include all four language skills in every administration to ensure that the results reflect balanced communicative ability rather than strong performance in only the skill areas that happened to be the focus during the most recent unit of instruction
- Alignment between teaching and testing is achieved when students are given access to the assessment criteria in advance, since transparent marking standards allow learners to self-assess during their preparation and approach the test with an accurate understanding of what correct performance looks like at their level
The text states: "If you've spent three weeks teaching past tenses through stories, reading comprehension, and speaking activities, your progress test should assess students' ability to use and recognize past tenses in similar contexts." Option (b) reproduces this example accurately and directly. Option (a) introduces the idea of matching physical test conditions between progress and summative tests, which is not mentioned in the text's discussion of alignment. Option (c) claims progress tests must always include all four skills - the text's discussion of alignment focuses on reflecting taught content, not on ensuring all four skills appear in every test. Option (d) describes providing criteria to students in advance, which the text does not raise as the mechanism for achieving alignment in progress tests.
