Most language tests are local, devised by national or regional exam boards, or by university language departments. They reflect the language needs of a particular body of students and address, in one way or another, the goals of a national or local curriculum. The tests that the ELT community is most familiar with – IELTS, TOEFL, TOEIC, FCE, PTE and all the other acronyms – represent, by contrast, a tiny minority of the English tests taken annually. They do, however, play a very important complementary role. Unlike the school and university tests, one of the key objectives of these global tests is to determine the extent to which a learner can operate in an international English environment.
This argues for a test in which a plurilingual, pluricontextual learner is assessed through exposure to a variety of international accents and situations. This doesn’t always happen in practice: OUP, for example, publishes separate British and American versions of the Oxford Online Placement Test. But even if the difference between the two relates mainly to accent in the Listening section, it’s difficult to visualise circumstances in the 2020s in which a language learner would be exposed only to British or American voices. Surely, to be valid, a test must expose the candidate to the variety of accents they are likely to hear in real life.
However, while this may be broadly true, judgements still have to be made. Most British and American accents are mutually intelligible, but not all are. And what about other varieties of English? A long-term expatriate resident of Hong Kong can listen to a Hong Kong speaker of English and understand every word, while a visitor might pick out only one word in three. (I’ve seen it happen.) This means a Hong Kong speaker of English may be comfortably intelligible within their own community while being unable to operate in the same way globally. (Let’s be clear, incidentally, that this is just an example: it doesn’t apply to all Hong Kong speakers, by any means.) This neatly illustrates how an accent may be appropriate in a local exam, but not in a global one. So for international tests we need to switch from the familiar metric of ‘comfortable intelligibility’ to a new one: ‘international intelligibility’.
Vocabulary is similar. Consider these sentences:
– My faucet is leaking. (North American English)
– Let’s go and sit on the stoep. (South African English)
– Is it OK if we prepone the meeting? (Indian English)
The words faucet, stoep and prepone would be comprehensible to English speakers at the appropriate CEFR level within their own communities, but would not be recognised by equivalent-level learners in other parts of the world. Indeed, they would probably not be understood by native speakers in other communities either. It is therefore important to identify words that have no place in the lingua franca and to exclude them.
Issues of cultural appropriacy are well-rehearsed. As Laura Edwards explains in this post, test items need to be both culturally fair and culturally sensitive. If a reading passage is about the French Revolution, test takers in France are likely to have an unfair advantage over those in Indonesia. Similarly, learners in Saudi Arabia would be disadvantaged by a test item whose topic is culturally or politically taboo in their society, because the subject matter forces them to engage with their feelings as well as with the language. It is generally accepted in ELT publishing that some topics are off the table.
Finally, item writers can be too keen to make their content diverse and international. To avoid needlessly engaging the critical faculties of test takers, we should steer clear of questions that pointlessly mix cultures. It is jarring, for example, to locate a Disneyland in Morocco, or to give speakers Australian accents in a dialogue clearly set in Canada. A test should aim for at least superficial authenticity.
In practice, how is an item writer to avoid all these pitfalls? Fortunately, a number of tools are readily available. For phonology, we can use the Lingua Franca Core to gauge whether a given accent is internationally intelligible. It tells us, for example, that in vowels the long-short contrast is important (which partly explains the issue with Hong Kong English). For vocabulary, we have two useful tools with which to gauge whether a lexical item is internationally intelligible: standard dictionaries and national corpora. Chambers Dictionary (UK), for example, labels faucet as a US term; Webster’s (US) does not include prepone at all; and Macquarie Dictionary (Australia) does not include stoep. This tells us that none of these words is appropriate in an international test of English. A corpus can be even more useful: it can show us, for example, whether a word is in current usage. And for cultural and situational issues, there is no substitute for a team of editors with cross-cultural experience.
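To make the corpus check concrete, here is a minimal sketch of how an item writer’s toolchain might flag regionally restricted vocabulary. The frequency figures, variety labels and thresholds below are invented for illustration; a real implementation would draw on actual national corpora such as the BNC (UK) or COCA (US).

```python
# A sketch of the corpus check described above, assuming we already have
# word-frequency lists for several national varieties of English.
# All figures and thresholds here are illustrative, not real corpus data.

# Occurrences per million words in each (hypothetical) national corpus.
FREQ_PER_MILLION = {
    "faucet":  {"US": 14.2, "UK": 0.3,  "AU": 0.2,  "IN": 0.1,  "ZA": 0.2},
    "tap":     {"US": 21.0, "UK": 25.6, "AU": 24.1, "IN": 18.3, "ZA": 22.7},
    "prepone": {"US": 0.0,  "UK": 0.0,  "AU": 0.0,  "IN": 6.8,  "ZA": 0.0},
    "stoep":   {"US": 0.0,  "UK": 0.0,  "AU": 0.0,  "IN": 0.0,  "ZA": 9.4},
}

MIN_FREQ = 1.0       # below this, treat the word as unknown in a variety
MIN_VARIETIES = 4    # word must be current in at least this many varieties


def is_internationally_usable(word: str) -> bool:
    """Return True if the word is in current use across enough varieties."""
    freqs = FREQ_PER_MILLION.get(word, {})
    varieties_in_use = sum(1 for f in freqs.values() if f >= MIN_FREQ)
    return varieties_in_use >= MIN_VARIETIES


for word in ["faucet", "tap", "prepone", "stoep"]:
    verdict = ("OK for an international test"
               if is_internationally_usable(word)
               else "flag for review: regionally restricted")
    print(f"{word}: {verdict}")
```

Run on the sample data, this accepts tap and flags faucet, prepone and stoep, mirroring the editorial rule of thumb that a word earns its place in an international test only if it is in current use across most major varieties.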
The focus of this post has been on avoiding factors which may make the test less valid or less fair, but there are positive measures that can be taken too. One approach, which applies particularly to speaking, is to give credit to test takers who are able to demonstrate ‘accommodation strategies’ (such as intelligibility, clarification, self-repair and turn-taking) – even if this is done using language that would be judged incorrect in native English (Jenkins & Leung, 2013). These measures are a topic for a future post.
References:
Jenkins, J., & Leung, C. (2013). English as a Lingua Franca. In A. J. Kunnan (Ed.), The Companion to Language Assessment. Wiley-Blackwell.