
=ASSESSING SPEAKING, Sari Luoma (2004)= Book review by Liesbeth Meulemans

When I ordered Weigle's //Assessing writing// I noticed that there was an entire series about assessment: **The Cambridge Language Assessment Series**. Assessing speaking sounded just as interesting as assessing writing, so I ordered both books. And I'm glad I did. I find Luoma's book more accessible than Weigle's, and it gave me more practical tips to use straight away. Luoma is a Finnish language testing specialist with several university degrees. I'll describe what the chapters include, highlight what I found most interesting, and finish off by stating my opinion on some issues raised in chapter 1.

In __chapter one__ we get an overview of the nature of speaking. Luoma points out that people are judged, often on first impression, by their **pronunciation**. She questions the practice of rating people's speech against the standard of the native speaker, since defining //**'the'**// **native speaker** seems impossible. Luoma also defines what pronunciation consists of (pitch, stress, volume...) and sketches how difficult it is to assess the whole or parts of it. She states that grammar is easier to assess in speaking than pronunciation, since it is clearly detectable, but she strongly separates written grammar from **spoken grammar**. She further analyses the internal structures of spoken language, introducing terms like **idea unit**, **topicalisation** and **tails**, and points out which words are markers of a speaker's fluency. Many of these theories are based upon Brown (1984).

In __chapter 2__ Luoma discusses speaking tasks that take into account the points she made in chapter 1. Deciding what to include when creating a **context** for speaking tasks is difficult, since not all aspects of context can be controlled. Especially important is the relationship between the speakers, which includes the relation between speaker and assessor! Essential in defining the task is choosing a kind of talk. Bygate (1987) makes fine distinctions between talks in two main categories: **factually-oriented talks** and **evaluative talks**. A lot in this chapter reflects what we discussed with Inger last time: that you should think about what and how you're going to assess before you start teaching!

This chapter contains a lot of aspects that we can use as a good base for developing oral examination tasks!
**Macrofunctions** and **microfunctions** of language are explained from the CEF's viewpoint in this chapter. In accordance with the communicative function chosen for the task, different task formats (settings) are compared: individual, pair or group tasks.

In __chapter 3__ a range of rating scales is presented and analysed. I recognize both rubrics and criteria and the problems in creating them. I liked the way North (1996) describes working on this: «trying to describe complex phenomena in a small number of words on the basis of incomplete theory.» Some scales presented in this chapter are **holistic** in form: divided into superior, advanced, intermediate and novice levels, each of them sub-divided into high, medium and low performance.
**Behavioural scales** (like the ACTFL) are distinguished from **theory-derived analytic scales** (e.g. TSE), the main difference being the presence or absence of context for the described use of language. The CEF speaking scales contain both behavioural and analytical components, since they are not developed for one specific test but aim to provide a basis for creating one's own scales. Furthermore, **numerical rating scales** are discussed: their scores have been shown to vary a lot due to raters' interpretation of the numbers. Apart from the scales' format, the definition of levels and criteria is analysed, among other things with respect to **quantitative or qualitative testing**.

I found great help in the many examples of criteria and rubrics, which made the content very accessible and usable, as well as easy to compare. The __next three chapters__ are about different **models of tasks**, not assessment. First the theory is presented, and then examples are offered and discussed. This brings the theory to a classroom level and opens up for teachers to put into practice what they've learned in the first half of the book. I especially liked the example of the comparing/contrasting task on p. 146-147, where pupils not only have to state their opinion, but at the same time have to base their input on another pupil's output. That way their concentration, and hopefully their interest, will be better maintained, which should lead to better grades.

The __last chapter__ focuses on **reliability** and **validity**. Relevant for me is the difference between **intra-rater** reliability (the rater stands by the grade given) and **inter-rater** reliability (how different raters regard the scales). It makes me wonder...
 * Should I really change my rating practice?
 * What if I'm the only one adhering to the theories Luoma offers?
 * How effective will my rating be compared to others'?
 * How can a school's rating culture be changed?


**Thoughts on chapter 1**

I benefited most from the first, introductory chapter, where you get all the basic terms. I liked that **comprehensibility** in discourse is the ultimate goal and should weigh most in assessment, as I already practise. Good to get some confirmation :) but I also learned a lot and refreshed some of the things I studied a long, long time ago... That we speak in idea units and not in sentences was a good reminder! As a matter of fact, we speak mostly in groups of seven words, with pauses and no fancy conjunctions. Pauses, repetitions and reformulations are normal even for the most advanced speaker. I think I have always made the mistake of grading my pupils on written grammar when they speak...

Another wake-up call: while I assumed that using accurate words and specific terms is what defines a great speaker, it turns out that it's the use of the most **simple words** that makes you an advanced speaker! The plain words and structures that keep a conversation going are markers of speaking levels. These are generic words, hesitation markers and fillers that give you time to think...

(Here a funny interpretation of the use of fillers) [|use of fillers]

So I get the feeling we should rate these **interactional phrases** more than we do. After all, a lot of goals in K06 demand discussion and conversation, not only well-prepared monologues! This affects rating at the oral examinations. In the **oral examinations**, part 2 (the teacher-pupil discussion) is just as important as part 1 (the presentation). These parts should be equally weighted, yet each of them rated differently! From what I have experienced and, I'm sorry to say, have practised, these 'discussions' look a lot like interviews. Pupil and rater tend not to be equal parties in these conversations. Not only should pupils be made aware of their right and duty to take the initiative, but raters should also value the conversation as such.

Furthermore, the importance of **social interaction** is not always visible in school tests. If a pupil of advanced level presents a topic to the class in fine lexical terms, I tend to rate it well, even though most listeners couldn't understand the content. Brown characterises dimensions of kinds of talk and explains that for **information-related talks** (not only by teachers in everyday class situations, but also by pupils presenting themes) «getting the message across and **confirming** that the listener has understood it» is essential. «Establishing common ground, giving the information in bite-sized chunks, logical progression, questions, repetitions and comprehension checks help speakers reach this aim.» This should be rewarded more in the assessment of pupils' presentations!

On more than one occasion I have commented to pupils that they were a bit vague. Now I read that Luoma states vagueness can be a clear sign of advanced speaking, since maintaining an open meaning is a natural characteristic of spoken interaction. It seems I have to be more critical of my own rating practice. Which brings me back to my reliability question: if I change my rating practice, should everyone?