Oxford University Centre for Educational Assessment
The meaning of examination standards
Jo-Anne Baird
3/1/20

Measuring educational standards
Input factors
- Pre-school: learning at home, child health, child diet
- Schooling: attendance, pupil:teacher ratio, teacher pay, funding per pupil, % GDP on education, curriculum
Output factors
- Assessment outcomes
- Literacy & numeracy rates
- Progression to Higher Education
- Economically active
- Well-being

Assessment outcomes
Children take a wide variety of assessments
The content of assessments varies
- in different years
- for different subjects
- for different qualifications
- in different countries
How can we tell if assessments are of the same standard?
Why would we want to do this?

Equivalent assessment standards are required for
- People applying for the same job with a qualification taken in different years
- Applications for a course from students who have taken a qualification in different subjects
- Creating grade point averages
- Calculating school league tables
- Allocating funds to schools on the basis of assessment outcomes
- Certification of language proficiency
- Comparisons of countries' levels of education
- Others?

So how do we know if qualifications are the same standard?
- What do you think examination standards are?
- What evidence would you need to collect to measure those standards?
- How would you know if that evidence was reliable and valid?

Demands of the assessment?
"I have a collection of old GCE O-level mathematics papers dated 1972. I doubt if many of today's A-level students would be able to answer them. Students are rarely taught how to think anymore."
C. Bishop (BBC website, 13 Aug 02)
"Compare an A level paper from 30 years ago to one from 2011 and there is no comparison."
Katharine Birbalsingh, teacher (Telegraph on-line, 18 August 2011)
Royal Society of Chemistry (2008)

Syllabus, question paper & marking scheme demands
- What is the volume of the course? How much curriculum time would it take?
- What is the syllabus content and how demanding is it?
- Are higher order skills being assessed, as well as rote learning?
- What is the weighting given to different assessment criteria?
- How clear are the question papers?
- Do diagrams assist or get in the way?
- What impact does the layout of the questions have upon demand?
- What instructions are given to examiners and how will that affect the easiness/difficulty of scoring marks?
- How would time demands affect scores?
- What is the right demand? What are you comparing with?

Pass marks?
Michael Gove says that the standards for exams for 16-year-olds is too low. He indicates that he wants the pass mark to rise from 20 per cent in some cases to closer to 40 per cent, saying it was common sense. He said: "I know it's too low at the moment. I know that it doesn't compare with international best practice. That's where we've got to get to.
"My view is that at the moment you need to ensure that exams command public confidence. That means having a passmark that people trust."
Telegraph, 7 November 2008

GCSE Mathematics grade C pass marks: what do they tell us about standards?

Maximum available mark: 100
Higher tier question paper: grades A* to D available

Grade C pass mark   Examination session
25                  Jun-10
25                  Nov-09
31                  Jun-09
34                  Nov-08
                    (Different examination)
28                  Jun-08
26                  Nov-07
                    (Different examination)
28                  Jun-07

Marks are not necessarily meaningful
Teachers urge return to percentage exams
Teenagers taking A-levels and GCSEs should be given percentage marks instead of A, B and C grades, teachers urged today. Former PAT [Professional Association of Teachers] chairman Barry Matthews said he was concerned that pass marks changed every year. The public would have more faith that exam standards were being maintained if the pass mark stayed the same every year. Wesley Paxton put forward a motion demanding a return to numerical marks. Delegates passed the motion.

Pre-testing using IRT or Rasch
- Examination 1: we know how difficult this is through standard setting
- Link: this must either be through common items or students
- Examination 2: estimate difficulty of this examination, or design the exam to have particular pass mark

Raising standards: a sporting analogy
Do we want to raise the height of the bar? Standards = what you have to do.
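Comparing the height of the bar across sessions presupposes that difficulty in each examination is expressed on one scale, which is what the link on the pre-testing slide is meant to provide. A minimal sketch of that link follows, assuming the Rasch model and mean-mean linking through common items. The slides do not say which linking method is used, and every number and function name below is hypothetical.

```python
import math

def rasch_probability(ability, difficulty):
    """Rasch model: chance of a correct answer, with ability and difficulty in logits."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def linking_shift(common_on_exam1, common_on_exam2):
    """Mean-mean linking: constant that moves exam 2 difficulties onto exam 1's scale."""
    mean1 = sum(common_on_exam1) / len(common_on_exam1)
    mean2 = sum(common_on_exam2) / len(common_on_exam2)
    return mean1 - mean2

def expected_score(ability, difficulties):
    """Expected raw mark on a set of one-mark items for a candidate of given ability."""
    return sum(rasch_probability(ability, b) for b in difficulties)

# Hypothetical difficulty estimates for the same three common items,
# calibrated separately within each examination (illustrative values only).
common_exam1 = [-0.5, 0.0, 0.8]
common_exam2 = [-0.8, -0.3, 0.5]
shift = linking_shift(common_exam1, common_exam2)

# Hypothetical items unique to examination 2, re-expressed on examination 1's scale.
exam2_items = [b + shift for b in [-1.0, 0.2, 1.0, 1.5]]

# Hypothetical borderline ability fixed by standard setting on examination 1.
borderline_ability = 0.4
equivalent_pass_mark = expected_score(borderline_ability, exam2_items)
print(f"shift={shift:.2f}, expected mark at the borderline={equivalent_pass_mark:.2f}")
```

Under these assumptions, the expected mark of a borderline candidate on examination 2 plays the role of the pass mark that carries the examination 1 standard forward. The same machinery could be run in reverse to choose items so that a particular pass mark results.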

Do we want to count how many get over the bar? Standards = how many people can do it.

Pass rates?
The A-level pass rate has risen for 29 years in a row and is now around 98 per cent, fuelling concerns about dumbing down of exams.
The Telegraph (12 Aug 2012)

GCSE Mathematics grade C pass rates: what do they tell us about standards?
Maximum available mark: 100
Higher tier question paper: grades A* to D available

Grade C pass mark   Examination session   % reaching at least grade C   Number of examination candidates
25                  Jun-10                54.9                          49,332
25                  Nov-09                41.2                          12,221
31                  Jun-09                53.3                          56,699
34                  Nov-08                41.8                          8,581
28                  Jun-08                52.6                          64,420
26                  Nov-07                51.4                          6,302
28                  Jun-07                54                            86,805

Standards are falling when pass rates go up
Height of the bar lowered in England?
The pass rate has gone up for the 21st year in a row and more pupils are getting A grades than ever before, about 20% of those taking the exams. Chris Woodhead believes public exams like GCSEs and A-levels are getting easier. "When you look at the rate of increase and the fact that each year each new generation does do better It can't all be down to better teaching, greater dedication, more intelligent students."
(BBC News Online, 14 August 2003)
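The two readings of "standards" in the sporting analogy can be checked against the GCSE Mathematics figures quoted in these slides. The sketch below groups sessions by grade C pass mark (the height of the bar in raw marks) and reports how far the pass rate (how many got over) varied within each group. The figures are transcribed from the pass-rate table, the row pairing is our reading of the flattened slide, and the function names are illustrative.

```python
# (session, grade C pass mark out of 100, % reaching at least grade C, candidates),
# transcribed from the pass-rate table in these slides.
sessions = [
    ("Jun-10", 25, 54.9, 49332),
    ("Nov-09", 25, 41.2, 12221),
    ("Jun-09", 31, 53.3, 56699),
    ("Nov-08", 34, 41.8, 8581),
    ("Jun-08", 28, 52.6, 64420),
    ("Nov-07", 26, 51.4, 6302),
    ("Jun-07", 28, 54.0, 86805),
]

def rates_by_pass_mark(rows):
    """Group the pass rates of sessions that shared the same grade C pass mark."""
    groups = {}
    for session, pass_mark, pass_rate, _candidates in rows:
        groups.setdefault(pass_mark, []).append((session, pass_rate))
    return groups

def spread(rates):
    """Gap in percentage points between the highest and lowest pass rate."""
    values = [rate for _session, rate in rates]
    return max(values) - min(values)

grouped = rates_by_pass_mark(sessions)
for pass_mark in sorted(grouped):
    print(pass_mark, grouped[pass_mark], round(spread(grouped[pass_mark]), 1))
```

In these figures a pass mark of 25 accompanied pass rates of both 54.9% and 41.2%, with very different entry sizes in the June and November sessions. Neither the raw pass mark nor the pass rate settles the standard on its own.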

Standards are falling when pass rates go down
Too few students getting over the bar?
But there was concern last night when it emerged that a fall in the pass rate for Higher English is even worse than feared, with four out of 10 failing the exam. 2% lower than has previously been suggested and represents a 12% fall over the past two years ... called for smaller class sizes and a greater focus in schools on literacy to drive up standards in English, following allegations that some sitting the exam are "barely literate".
(Scotland on Sunday, 10 August 2003)

Cohort referencing approach

Definitions of educational standards
Statistical
- Cohort referencing
- Catch-all
- Item Response Theory or Rasch modelling
Judgmental
- Criterion referencing
- Standards referencing (weak criterion)
- Conferred power

Catch-all approach
Two examinations are of comparable standards if students with the same characteristics are awarded the same grades on average, no matter which examination they entered.
Cresswell (1996)
Yellis (Durham University test)

Measurement problems for catch-all
- What does it mean to be equally prepared for different exams? How would we measure this?
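One deliberately simplified way to see what the catch-all definition asks for, and why the measurement questions here are hard, is to compare mean grades on two examinations among students matched on a single control. The sketch below assumes hypothetical records and one hypothetical control, a prior attainment band. None of the values come from the slides, and a real analysis would need many more controls and a statistical model.

```python
from collections import defaultdict

# Hypothetical records: (examination, prior attainment band, grade points).
records = [
    ("A", "low", 3), ("A", "low", 4), ("A", "high", 6), ("A", "high", 7),
    ("B", "low", 4), ("B", "low", 5), ("B", "high", 7), ("B", "high", 8),
]

def mean_grade_by_exam_and_band(rows):
    """Average grade points for each (examination, band) cell."""
    totals = defaultdict(lambda: [0, 0])  # cell -> [sum of grades, count]
    for exam, band, grade in rows:
        totals[(exam, band)][0] += grade
        totals[(exam, band)][1] += 1
    return {cell: total / count for cell, (total, count) in totals.items()}

def grade_gap(rows, exam_x, exam_y):
    """Mean grade on exam_y minus mean grade on exam_x, within each band.

    Assumes every band has candidates on both examinations.
    """
    means = mean_grade_by_exam_and_band(rows)
    bands = sorted({band for _exam, band, _grade in rows})
    return {band: means[(exam_y, band)] - means[(exam_x, band)] for band in bands}

gaps = grade_gap(records, "A", "B")
print(gaps)
```

A non-zero gap within a band would count as evidence of different standards only if the band captured everything relevant to performance, which is the difficulty these questions press on.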

- How could we collect data on ALL factors that influence exam performance?
- Given the above, would the model be sensitive enough to detect differences in standards?

Interpretation problems for catch-all
- Which variables are valid controls? Attractiveness, anger, comfort of clothing, gender, ethnicity, ?
- Controls should have the same effect on each examination. Or should they?
- Value judgment predicament for researchers

Interaction problems for catch-all
[Figure labels: Mathematics, English, control; Grade; Hours of study]

Item response theory & Rasch modelling
- Latent trait assumption: ability & difficulty are on the same scale
- Claims to be population-independent
- Claims to be able to adjust for the difficulty of items/tests when calculating ability estimates

Arguments against IRT
- Too complex & obfuscatory
- The assumptions do not hold: it is not population-independent
- Assumptions are rarely tested and/or tests are not reported
- When the data do not fit the model, items are rejected, which affects the construct being tested
- Test performance is curriculum-dependent

Criterion referencing
More absolute standards
Sir Keith Joseph (1984)

"define precisely, for each subject, the skills, competences, understanding and areas of knowledge which a candidate must have covered and the minimum level of attainment he must demonstrate in each of them, if he is to be awarded a particular grade."
See also Popham (1978)

This is a pass, but at what level?
- respond to texts in a critical way rather than just offering personal reactions;
- develop a line of argument supported by appropriate textual references;
- show understanding of how a writer conveys meaning;
- show insight into aspects of language, structure and theme.
- adapt style and form to suit purpose and audience;
- use a range of appropriately punctuated sentence structures to convey meaning effectively;
- make vocabulary choices for deliberate effect and to sustain a reader's interest;
- write in an impersonal style when appropriate;
- offer logical and coherent explanations in paragraphs that are linked effectively;
- create shaped narratives with developed characters.

Exceptional performance in English, at what level?
Writing has shape and impact and shows control of a range of styles maintaining the interest of the reader throughout. Narratives use structure as well as vocabulary for a range of imaginative effects, and non-fiction is coherent, reasoned and persuasive. A variety of grammatical constructions and punctuation is used accurately and appropriately and with sensitivity. Paragraphs are well constructed and linked in order to clarify the organisation of the writing as a whole.

Conferred power definition
- Empower individual(s) to make the decisions
- Decision is a speech act (Searle)
- Due process is observed
Cresswell 1996; Wiliam 1996

Who is empowered to judge?
- Sociological, value judgment definition of standards: empower experts to decide
- Only those with subject expertise considered legitimate experts
- How do we achieve commonality across committees?
- Mistrust of experts & administrators with power
- Implicit models of decision making

Weak criterion referencing
"students' performances are said to be equivalent if they are of equal merit, in the judgment of senior examiners, after they have taken into account any changes in demand of the assessment"
Baird, Newton & Cresswell, 2000

Standards referencing

Stated simply, the assessor's grading task is to find the class or grade description which best fits the object in question, in the knowledge that no description is likely to fit it perfectly (Sadler, 1987, p. 206)

Definitions of educational standards

Definition                                          Takes account of                                    Adjusts for
Statistical
  Cohort referencing                                Students' rank order                                Nothing
  Catch-all                                         Students' grades                                    Student, teacher and institutional characteristics
  Item response theory                              Ability                                             Item difficulty
Judgmental
  Criterion referencing                             Candidates' performances                            Nothing
  Weak criterion referencing/Standards referencing  Candidates' performances and the assessment itself  Difficulty of assessment
  Conferred power                                   Specified by due process                            Specified by due process

Systemic examination failures
- Scotland: Higher Still examination results were not sent out on results day (2000)
- England: A-level results disputed (2002)
- New Zealand: Scholarship examination results wildly different from previous years (2004)

Policy implementation: management incompetence
- Lack of planning and monitoring at an operational level
- Leadership and delegation issues

- Problems relating to management skills (inability to manage change, expertise in education not management)
- Politically driven changes without scoping of the projects

Politics & assessment policy
- Evolving policy & competing perspectives
- Lack of role clarity & diffusion of responsibility
- Unclear decision making structures
- Implementation time shrinks
- Unclear what to do
- Protracted negotiations
- Different understandings
- Reactive management
- Practitioner concerns not always connected with policy decisions
- Quality & timeliness affected
- Power & responsibility not always coupled

Managing the risks
- What are the expectations of results from the new examinations?
- How are those results used by stakeholders?
- Where are the logistics issues likely to arise? (Moderation, grade inflation, etc.)
- How will you know if those expectations are being met?
- Where will the likely challenges come from? How can you identify those and tackle them?
