Transcription

TOEIC & TOEFL Vocabulary Secrets Revealed
JALT 2013 Kobe, Oct 26, 2013
Presenter: Guy Cihi
Lexxica R&D, 2-7-8 Shibuya 5F, Shibuya-ku, Tokyo
[email protected] 2013 Lexxica

Presentation Outline
1) What is “Coverage”?
2) Corpus Analysis - TOEIC and TOEFL
3) Secrets of TOEIC and TOEFL vocabulary
4) How and why ETS uses esoteric vocabulary
5) How graded readers can best support TOEIC and TOEFL score increases

Coverage
There are specific words that occur most frequently within a particular subject domain.
The most frequently occurring words provide the greatest amount of coverage for a domain.
Focusing on learning missing high frequency words is the fastest way to increase coverage of a domain.
Copyright 2013 Lexxica
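
The idea of coverage can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `tokenize` and `coverage` helpers, the sample sentence, and the known-word set are all hypothetical and are not Lexxica's actual tooling or data.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def coverage(tokens, known_words):
    """Share of running words (tokens) that appear in known_words."""
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)

# Hypothetical sample text and learner vocabulary (illustration only).
sample = "the meeting is at noon and the agenda is on the table"
tokens = tokenize(sample)
known = {"the", "is", "at", "and", "on"}

print(f"{coverage(tokens, known):.0%}")                # 67%
# Learning one more word raises coverage by however often it occurs.
print(f"{coverage(tokens, known | {'meeting'}):.0%}")  # 75%
```

Coverage is counted over running words, so a missing word that occurs often adds more coverage than a missing word that occurs rarely.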

We do our own corpus analysis work
We study exactly which words are required to master each subject area.
All General English: 13,384 words
Business English: 8,742 words
TOEFL: 7,501 words
TOEIC: 6,480 words
IELTS: 5,870 words
College Entrance: 5,435 words
High School: 3,552 words
Elementary (Core): 2,000 basic words

TOEIC Corpus Analysis
1,250,000 total words
14,652 different words
6,480 different words constitute 99% of all occurrences.
982 different words constitute 90% of all occurrences.
These 982 are the absolutely essential Super High Frequency words of TOEIC.

TOEFL Corpus Analysis
1,250,000 total words
16,736 different words
7,501 different words constitute 99% of all occurrences.
1,513 different words constitute 90% of all occurrences.
These 1,513 are the absolutely essential Super High Frequency words of TOEFL.
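
Figures such as "982 different words constitute 90% of all occurrences" come from ranking word types by frequency and counting how many are needed to reach a coverage target. The sketch below shows one way to do that count. The function name `types_needed` and the toy corpus are assumptions made for illustration, not the procedure or data used in the analysis above.

```python
from collections import Counter

def types_needed(tokens, target):
    """Smallest number of most-frequent word types whose occurrences
    add up to at least `target` (a fraction from 0 to 1) of all tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    running = 0
    for rank, (_, freq) in enumerate(counts.most_common(), start=1):
        running += freq
        if running / total >= target:
            return rank
    return len(counts)

# Hypothetical toy corpus: 100 tokens spread over 6 word types.
toy = ["a"] * 50 + ["b"] * 30 + ["c"] * 10 + ["d"] * 5 + ["e"] * 3 + ["f"] * 2

print(types_needed(toy, 0.90))  # 3
print(types_needed(toy, 0.99))  # 6
```

Even in the toy data, the last few percent of coverage cost as many word types as the first 90%, which mirrors the gap between the 90% and 99% figures on the slides.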

Secret #1
TOEIC and TOEFL are Item Response Theory proficiency tests, not English ability diagnostic tests. These tests are not designed to provide meaningful advice for improving English ability.
Students are scored based on their correct responses to questions having known difficulty metrics. The difficulty metrics are established through statistical analysis of all prior uses of each question.
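
To make "scored based on correct responses to questions having known difficulty" concrete, here is a minimal sketch using the one-parameter (Rasch) model, the simplest Item Response Theory model. The slides do not say which IRT model ETS uses, so the model choice, the function names, and the difficulty values are all assumptions for illustration.

```python
import math

def p_correct(ability, difficulty):
    """Rasch (1PL) probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def estimate_ability(responses, difficulties, iterations=25):
    """Maximum-likelihood ability estimate via Newton-Raphson.

    responses: list of 0/1 answers; difficulties: known item difficulties.
    Needs at least one correct and one incorrect response.
    """
    if not 0 < sum(responses) < len(responses):
        raise ValueError("all-correct or all-wrong has no finite estimate")
    theta = 0.0
    for _ in range(iterations):
        probs = [p_correct(theta, b) for b in difficulties]
        gradient = sum(x - p for x, p in zip(responses, probs))
        information = sum(p * (1.0 - p) for p in probs)
        theta += gradient / information
    return theta

# Hypothetical example: two test takers each answer 2 of 3 items correctly,
# but the second test taker's items are harder by one unit.
easier = estimate_ability([1, 1, 0], [-1.0, 0.0, 1.0])
harder = estimate_ability([1, 1, 0], [0.0, 1.0, 2.0])
print(round(harder - easier, 3))
```

Under this model the same raw score on harder questions yields a higher ability estimate, which is why each question's difficulty has to be known before it can count toward a score.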

Secret #2
Without a full range of questions from easy to difficult, Educational Testing Service (ETS) would be unable to maintain its bell curve and generate ‘reliable’ scores.
It is impossible to write statistically difficult questions. Only field testing can identify the difficulty of questions.

Secret #3
95% of test questions are recycled. 5% are new questions that are in the process of being measured for difficulty.
The 95% recycling requirement means that vocabulary on the tests can be accurately predicted.

Secret #4
ETS has never issued, and likely never will issue, a vocabulary guide for any of its major tests, including TOEIC, TOEFL, SAT and GRE.
Why?

Secret #4
Because using difficult words and irregular definitions is the best way to create a wide variety of questions at all levels of difficulty.
Publishing an official vocabulary guide would both expose a scoring system vulnerability and defeat the purpose of their tests, which is to measure familiarity and proficiency with authentic English.

TOEIC, TOEFL (and IELTS) versus General English
1/3 of the words in all parts of TOEIC and TOEFL are not common, high frequency words in General English.
(¼ of the words in IELTS.)

What kinds of words?

Top 2000 high frequency words of TOEIC and General
[Word list excerpt: gaze, gear, gene, general]
Legend: Frequent only in the TOEIC corpus. Frequent only in the General corpus.
Our general corpus contains 850 million words from all genres.
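
A comparison like this one amounts to taking the top N words of each corpus and looking at the set differences. The sketch below assumes both corpora are available as token lists. The helper names and the tiny toy corpora are hypothetical and only stand in for the real TOEIC and General corpora.

```python
from collections import Counter

def top_n(tokens, n):
    """Set of the n most frequent word types in a token list."""
    return {word for word, _ in Counter(tokens).most_common(n)}

def compare_top_n(domain_tokens, general_tokens, n):
    """Split the two top-n lists into shared and corpus-specific words."""
    domain_top = top_n(domain_tokens, n)
    general_top = top_n(general_tokens, n)
    return {
        "shared": domain_top & general_top,
        "domain_only": domain_top - general_top,
        "general_only": general_top - domain_top,
    }

# Hypothetical toy corpora (illustration only, not real frequency data).
domain = ["the"] * 9 + ["invoice"] * 5 + ["gear"] * 3 + ["gaze"] * 1
general = ["the"] * 9 + ["gaze"] * 4 + ["gear"] * 2 + ["invoice"] * 1

result = compare_top_n(domain, general, n=2)
print(result["shared"])        # {'the'}
print(result["domain_only"])   # {'invoice'}
print(result["general_only"])  # {'gaze'}
```

With real data, `n` would be 2000 and the `domain_only` set would hold the words that are frequent on the test but not in general English.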

What does this mean?
EFL students can’t learn the words they need because those words aren’t in their study and reading materials.
(Because study materials are simplified.)

I used to say:
Educational Testing Service (ETS) purposefully uses difficult words and seldom-used meanings of common words because otherwise their scoring system fails.
(Then I talked to ETS authors and editors.)

Now I say:
Educational Testing Service (ETS) purposefully uses difficult words and seldom-used meanings of common words because otherwise their scoring system fails.

To create new test questions:
Authors are told to search through authentic materials to find texts and dialogs to adapt for the different types of test questions.

To evaluate new test questions:
When finished, the authors and editors do not know how difficult their new questions are.
The only way to find out is for ETS to put them into actual tests alongside questions for which they do know the difficulty.

Testing the test questions:
On every TOEIC and TOEFL test, 5% of the questions are new questions that have no effect on scoring.
95% are recycled questions that have known and reliable difficulties that can be used for scoring.

ETS’s Primary Concern
ETS’s primary concern is the consistency with which their test scores reflect each respondent’s relative proficiency with authentic English.

From corpus analysis we confirm:
1/3 of the words on TOEIC and TOEFL tests are low frequency ‘authentic’ vocabulary words.
Vocabulary is the primary reason that one test question is more or less difficult than another.

Note that many of the 1/3 low frequency words have multiple r
[Word list excerpt: gaze, gear, gene, general]
Legend: Frequent only in the TOEIC corpus. Frequent only in the General corpus.
Our general corpus contains 850 million words from all genres.

Typical low frequency definition: crack
A line along which something has split without breaking into separate parts: “a crack in the surface.”
An illegal street drug: “possession of crack.”
Very good, esp. at a specified activity: “He’s a crack shot.”
To open something after making a concerted effort: “to crack a safe.”

Typical low frequency definition: crack
A line along which something has split without breaking into separate parts: “a crack in the surface.”
An illegal street drug: “possession of crack.”
Very good, esp. at a specified activity: “He’s a crack shot.”
ETS used this:
“it took several years for Apple to _____ the market.”
A: crack
B: break open
C: secure
D: invert

Why use low frequency definitions?
They are difficult and they are authentic.
(ETS doesn’t promise practical English.)

ETS’s advice for scoring higher on TOEIC and TOEFL is to read authentic texts.
(Graded readers can’t help because the vocabulary is simplified.)

How much authentic text?
Based on incidence of occurrence research by Rob Waring, students will need to read authentic text for 6,250 hours in order to meet the lower frequency test words often enough to learn them.

Reading at 70 authentic words per minute, 2 hours each day, for 8.5 years
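
The arithmetic behind the 8.5-year figure can be checked with a short calculation. The 6,250 hours, 2 hours per day, and 70 words per minute come from the slides. Reading every day of a 365-day year is an assumption made here, and the `reading_plan` helper is hypothetical.

```python
HOURS_NEEDED = 6250      # from the slide
HOURS_PER_DAY = 2        # from the slide
WORDS_PER_MINUTE = 70    # from the slide
DAYS_PER_YEAR = 365      # assumption: reading every day, no breaks

def reading_plan(hours, hours_per_day, wpm):
    """Return (days, years, total words read) for a reading schedule."""
    days = hours / hours_per_day
    years = days / DAYS_PER_YEAR
    total_words = hours * 60 * wpm
    return days, years, total_words

days, years, total_words = reading_plan(HOURS_NEEDED, HOURS_PER_DAY, WORDS_PER_MINUTE)
print(f"{days:.0f} days")         # 3125 days
print(f"{years:.1f} years")       # 8.6 years
print(f"{total_words:,} words")   # 26,250,000 words
```

At that pace a student would read roughly 26 million running words, which takes about eight and a half years at two hours a day.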

Reading at 70 graded words per minute

Graded readers are general English
All General English: 18,000 semantemes
Advanced Graded Readers: 9,000 semantemes
Core (99% of Graded Readers): 4,000 semantemes

TOEIC and TOEFL are not general English
All General English: 18,000 semantemes
TOEFL: 9,000 semantemes
Advanced Graded Readers: 9,000 semantemes
TOEIC: 8,000 semantemes
Core (99% of Graded Readers): 4,000 semantemes

TOEIC and TOEFL are not general English
(Diagram marker: EFL students are here)
All General English: 18,000 semantemes
TOEFL: 9,000 semantemes
Advanced Graded Readers: 8,000 semantemes
TOEIC: 8,000 semantemes
Core (99% of Graded Readers): 4,000 semantemes

How can graded reading help EFL students prepare for TOEIC and TOEFL?

90% of the words that occur in beginner and intermediate level graded readers are also Super High Frequency words in the TOEIC and TOEFL domains. Because the tests are timed, students who can process the Super High Frequency words faster enjoy a huge scoring advantage. Graded readers can’t teach vocabulary they don’t contain, but they can help students develop automaticity (instant recognition) for the Super High Frequency words occurring in every TOEIC and TOEFL.

What is the best way to use existing graded readers to improve reading and listening?

Repeated timed aural readings.

Example of a repeated, timed, spoken reading approach. This method is highly effective!
[Chart: Spoken Reading Speed (WPM) over repeated readings of “Good Dog, Bad Dog” (75 headwords), plotted against a goal line. Words per min.: 62, 69, 78, 78, 78, 89, 89, 96, 96, 96, 96]

Implemented properly, a graded speed-reading program can help develop automaticity for the SHF core.
All General English: 18,000 semantemes
TOEFL: 9,000 semantemes
Advanced Graded Readers: 8,000 semantemes
TOEIC: 8,000 semantemes
Core (99% of Graded Readers): 4,000 semantemes

The WordEngine high speed vocabulary system has been proven to develop automaticity for all of the words.
18,000 semantemes
8,000 semantemes
TOEFL: 9,000 semantemes
TOEIC: 8,000 semantemes
Core

When improved outcomes are important, professionals trust WordEngine to get results!

Average TOEIC score increases 86%

Average TOEFL score increases 135%

Contact Lexxica to start a trial program at your school.
Lexxica R&D, 2-7-8 Shibuya 5F, Shibuya-ku, Tokyo
[email protected] 2013 Lexxica