The ongoing COVID-19 crisis may be claiming another victim in one of Canada’s leading education provinces: sound, reliable, standards-based and replicable summative student assessment. After thwarting the 2017-18 ‘Learning Province’ plan to subvert the province’s Grade 3 provincial student assessment and broaden the ‘measures of success,’ Ontario’s Doug Ford government and its education authorities appear to be falling into a similar trap.
What’s most unexpected is that the latest lubricant on the slippery slope toward ‘accountability-free’ education may well have been applied in Doug Ford’s Ontario under a government ostensibly committed to ‘back-to-basics’ and ‘measurable standards’ in the K-12 school system.

All K-12 provincial tests, administered by the Education Quality and Accountability Office (EQAO), were the first to go, rationalized as a response to the pandemic and its impact upon students, teachers, and families. More recently, Ontario’s education ministry opened the door to cancelling final exams by giving school boards the right to replace exam days with in-class instructional time.
Traditional examinations, the long-established benchmark for assessing student achievement, simply disappeared for the second assessment cycle in a row, dating back to the onset of the COVID-19 outbreak. Major metropolitan school districts, led by the Toronto District School Board, the Peel District School Board and their coterminous Catholic boards, jumped in quickly to suspend exams in favour of what were loosely termed “culminating tasks” or “demonstrations of learning.”
Suspending exams was hailed in a Toronto Star news report as “a rare bright spot” for Ontario high school students. Elsewhere, the decision to eliminate exams, once again, elicited barely a whimper, even from the universities. “Nobody’s missed standardized tests or final exams,” University of Ottawa professor Andy Hargreaves noted rather gleefully during the October 29-30 Canadian EdTech Summit.
Suspending examinations has hidden and longer-term consequences not only for students and teachers, but for what remains of school-system accountability. What’s most surprising, here in Canada, is that such decisions are rarely evidence-informed or predicated on the existence of viable, proven and sustainable alternatives.
Proposing to substitute culminating projects labelled “demonstrations of learning” for final exams rests upon the fallacious assumption that teacher assessments are better than exams. Cherry-picking a recent sympathetic research study, such as a May 2019 Journal of Child Psychology and Psychiatry article highlighting exam stress, may satisfy some, but it is no substitute for serious research into the effectiveness of previous competency-based “culminating activity” experiments.
Sound student evaluation is based upon a mix of assessment strategies, ranging from formative assessment (daily interaction and feedback) to standardized tests and examinations (summative assessment). It is highly desirable to base student assessment upon a suitable combination of reasonably objective testing instruments and teacher-driven subjective assessment. UK student assessment expert Daisy Christodoulou puts it this way: “Tests are inhuman – and that is what is good about them.”
Teacher-made and teacher-evaluated assessments appear, on the surface, to be gentler and fairer than exams, but such assumptions can be misleading, given the weight of research supporting “level playing field” evaluations. The reality is that teacher assessments tend to be more impressionistic, not always reliable, and can produce outcomes less fair to students.
Eliminating provincial tests and examinations puts too much emphasis on teacher assessment, a form of student evaluation with identified biases. A rather extensive 2015 student assessment literature review, conducted by Professor Rob Coe at the Durham University Centre for Evaluation and Monitoring, identifies the typical biases. Compared to standardized tests, teacher assessment tends to exhibit biases against exceptional students, specifically those with special needs, challenging behaviour, language difficulties, or personality types different from their teacher’s. Teacher-marked evaluations also tend to reinforce stereotypes, such as the notions that boys are better at math or that racialized students underperform in school.
Replacing final exams with teacher-graded ‘exhibitions’ or ‘demonstrations of learning mastery’ sounds attractive, but is fraught with potential problems, judging from their track record since their inception in the late 1980s. Dreamed up by Dr. William Spady, the North American father of Outcome-Based Education (OBE), assessing student competencies through ‘demonstrations of learning’ has a checkered history. Grappling with the OBE system and its time-consuming measurement of hundreds of competencies ultimately finished it off with classroom teachers.

A more successful version of DOLM (Demonstration of Learning Mastery), developed by Deborah Meier, Theodore Sizer and the Coalition of Essential Schools (1988-2016), was piloted in small schools with highly-trained teachers. Such exhibitions were far from improvisational; rather, they were “high stakes, standards aligned assessments” aimed at securing “commitment, engagement and high-level intellectual achievement” and conceived as “a fulcrum for school transformation.” Systemic distrust, aggravated by testing and accountability, Meier conceded, “rendered attempts to create such contexts infertile.”
Constructing summative evaluation models to replace final exams is not easy, and it has defeated waves of American assessment reformers. The Kentucky Commonwealth Accountability and Testing System (CATS, 2007-2008) and its predecessor, KIRIS (1992-1998), serve as a case in point. Like most of these first-generation reforms, the KIRIS experiment was widely considered a failure. Its performance-based tools were found to be unreliable, professional development costs proved too high, and two elements of the program, Mathematics Portfolios and Performance Events, were summarily abandoned. Writing portfolios continued under CATS, but a 2008 audit revealed wide variations in marking standards and lengthy delays in returning the marked results of open-answer questions.
Most of the recent generation of initiatives were sparked by a January 2015 white paper, “Performance Assessments: How State Policy Can Advance Assessments for 21st Century Learning,” produced by two leading American educators, Linda Darling-Hammond and Ace Parsi. Seven American states were granted a waiver under the Every Student Succeeds Act (ESSA) to experiment with such competency-based assessment alternatives.
Constructing a state model compliant with established national standards in New Hampshire proved to be an insurmountable challenge. While supported by Monty Neill and Fair Test Coalition advocacy forces, New Hampshire’s Performance Assessments for Competency Education (PACE) system ran into significant problems trying to integrate Classroom-Based Evidence (CBE) with state testing criteria and expectations. Establishing evaluation consistency and “comparability” across schools and districts ultimately sank the experiment. Anchored in state standards, PACE required external moderation, including the re-scoring of classroom-based work; serving two masters created heavier teacher marking loads and made the system unsustainable. Federal funding for such competency-based assessment experiments was cut in December 2019, effectively ending support for the initiative.
Provincial tests and exams exist for a reason: they ensure that we do not fly blind into the future. Replacing final exams with a patchwork solution is not a wise option this school year. Simply throwing together culminating student activities to replace examinations is, judging from past experiments, most likely a recipe for inconsistency, confusion, and ultimate failure.
Teachers will, as always, do their best and especially so given the current turbulent circumstances. Knowing what we know about student assessment, let’s not pretend that the crisis measures are better than traditional and more rigorous systems that have stood the test of time.
What are the fundamental purposes of summative student assessment? Should provincial tests and final exams be suspended during the second year of the COVID-19 pandemic? Where’s the research to support the effectiveness of alternative ‘demonstration of learning’ strategies? Are we now on the slippery slope toward ‘accountability-free’ education?