Assessing Language in Content and Language Integrated Learning: A Review of the Literature towards a Functional Model

Evaluación del lenguaje en aprendizaje integrado de contenidos y lenguas extranjeras: una revisión de la literatura hacia un modelo funcional
Avaliação da linguagem em CLIL: uma revisão da literatura para um modelo funcional
Assessment is one of the most contested topics in Content and Language Integrated Learning (CLIL) because of the duality between content and language, and the lack of official guidelines and research on this matter. Furthermore, as CLIL is an umbrella term portraying different realities, it is essential to consider the educational contexts in which this methodological approach is set. These various settings make each CLIL program unique concerning general aspects such as the educational level, the amount of exposure to the foreign language, the students’ age and level in the foreign language, and the different subjects being taught through it. The aim of this article is to discuss existing research on CLIL assessment and to offer a preliminary functional model for practitioners to deal with language issues. Through a systematic analysis of the literature, the concepts of discrete and integrated assessment are revisited, and the importance of considering students’ limited language proficiency and errors is examined more closely. It is hoped that the suggested functional model and the recommendations derived from it can serve as an aid to teachers in assessing language in a variety of CLIL subjects and contexts.
To reference this article (APA) / Para citar este artículo (APA) / Para citar este artigo (APA): Otto, A. (2018). Assessing language in CLIL: A review of the literature towards a functional model. LACLIL, 11(2), 308-325. DOI: 10.5294/laclil.2018.11.2.6
Received: 23/11/2018 Approved: 18/02/2019


Introduction

Assessing the language in CLIL
Content and Language Integrated Learning (CLIL) assessment focuses primarily on measuring students' progress in content (Coyle et al., 2010) and is thus more closely related to assessment in non-linguistic subjects than in foreign languages. However, the dual focus of CLIL can complicate assessment, as teachers are often unsure whether to focus on content, on language, or on both. In fact, because language in CLIL is the vehicle for expressing content knowledge and skills, language-related assessment issues are among the most contested aspects of the CLIL literature (Raitbauer, Fürstenberg, Kletzenbauer, & Marko, 2018; Lo & Fung, 2018; Morton, 2018; Aiello, Di Martino, & Di Sabato, 2017; Llinares, Morton, & Whittaker, 2012; Massler, 2011; Kiely, 2009, 2011; Serragiotto, 2007). When it comes to deciding whether and how to assess language in CLIL, several questions commonly arise. First, do we assess content, language, or both? Do we sometimes assess one and not the other?
Second, should we assess the language in CLIL (Morton, 2018; Llinares, Morton, & Whittaker, 2012) and, if so, which aspects of language should be assessed, and who is responsible for that: the language teacher, the content teacher, or both? Third, research has also focused on how to compensate for limited language proficiency, i.e., what happens with those students who are weak in language skills but good at content?
In this regard, questions arise as to whether students should be allowed to use their mother tongue as a communication strategy (Coyle, 2010; Kiely, 2009), the effect this might have on their grades (if any), and whether an overt focus on form favors language skills (Pérez-Vidal, 2007; Pica, 2002).
As opposed to foreign language teaching, where language objectives are at the forefront, the attention given to language in CLIL can vary among practitioners depending on their profile, the teachers' expectations, and its relative priority within CLIL objectives (Coyle et al., 2010). Consequently, concerning the treatment of language-related issues, we find two approaches to assessment: discrete assessment and integrated assessment.
LACLIL ISSN: 2011-6721 e-ISSN: 2322-9721 VOL. 11, No. 2, JULY-DECEMBER 2018 DOI: 10.5294/laclil.2018 PP. 308-325

Discrete assessment
Discrete assessment (Barbero & Clegg, 2005; Järvinen, 2009; Serragiotto, 2007), which is the most popular approach to CLIL assessment (García, as cited in Wewer, 2014; Mohan, 1986), considers language and content separately. According to advocates of discrete assessment, language should be given special attention so that it is not downgraded in the subject. Thus, since language inevitably interferes with content as the vehicle of expression, it is important to distinguish the language-related aspects from the disciplinary ones to prevent "muddied assessment" (Weir, 1990). Muddied assessment results from the overlapping of tasks, for instance, when the performance of one task depends on language skills such as understanding a reading or listening extract. Therefore, "assessment must be structured in such a way that there remain no doubts as to whether missing elements or mistakes are linguistic-oriented, content-related or both" (Serragiotto, 2007, p. 271). But should language be taken into account in the grade? Frigols, in Megías-Rosa (2012), asserts that foreign language proficiency should be kept apart from content proficiency and skills so that it neither contaminates the grade nor is marked down in the task or exam. She advocates, then, assessing both content and language, and informing students about the language they need to focus on to improve: "We should not assess or mark down content in the subject of English as well as we should not assess or mark down English in Math or Science" (Frigols, as cited in Megías-Rosa, 2012, p. 13).

Integrated assessment
On the other hand, teachers can also use the integrated assessment recommended in The CLIL Compendium (2001), where content and language are assessed simultaneously. In this type of assessment, language is used as an instrument through which learners can show "the breadth of their knowledge and skills in relation to both content and language" (Marsh, Marshland, & Stenberg, 2001, p. 12). In this sense, Coyle et al. (2010) consider that language objectives may serve several functions in relation to content objectives. First, they might relate to the effective communication of content or include notions, such as specific vocabulary or Cognitive Academic Language Proficiency (CALP), or functions, such as the ability to communicate and use language to conduct practical discussion on the subject. Second, language objectives might also focus on form, but in relation to the type of academic discourse in question, such as the ability to use tenses correctly depending on the subject and discipline. Following this instrumental approach to language issues, language is used to improve content communication, i.e., to ensure the message in the foreign language is clear enough and that it fulfills its expected function in the academic discourse of the subject.
UNIVERSIDAD DE LA SABANA DEPARTMENT OF FOREIGN LANGUAGES AND CULTURES
Besides, language-related skills are necessary to "make the language more visible and give students the chance to progress in academic language" (McKay, as cited in Massler, 2011, p. 34). Thus, although students need to master the language that allows them to express skills and knowledge in content subjects, language-related issues are measured in relation to content objectives.
In any case, regardless of the teachers' approach to language, teachers should be clear about why they are assessing language as well as content, and how they would like to do so (Coyle, Hood, & Marsh, 2010, p. 11) so that they can communicate their intentions to students.
Likewise, they should also consider the changes needed to implement formative assessment when it is not present in mainstream education. Besides, as stated before, considerations about assessment in CLIL need to take into account several factors, such as the CLIL model, which shapes the amount of language present in the curriculum and program, and the students' level in the foreign language. In immersion programs and high-exposure or hard CLIL, where lesson objectives are content-driven, for instance, objectives predominantly address either both content and language or content only, which facilitates the focus on content-related issues. Conversely, low-exposure or soft CLIL models are more language-focused (Bentley, as cited in Wewer, 2014) and, thus, teachers tend to give more prominence to linguistic aspects.
In general, and despite suggestions by researchers (Coyle et al., 2010), national recommendations tend to emphasize the language proficiency that students are meant to acquire over content knowledge (Eurydice, 2006, p. 56). Nevertheless, for those assessing language-related aspects, the biggest problem, as Cushing Weigle and Jensen (1997), Hönig (2010), and Wewer (2014) point out, lies in the lack of a CLIL curriculum specifying the role and weight of language in CLIL assessment. This curriculum could help to determine "the extent of English language exposure in subjects other than language, the subjects which follow the CLIL curriculum, the contents instructed through the foreign language, and the desired level of English in all four skills plus cultural skills" (Wewer, 2014, p. 234). To compensate for the lack of a curriculum, official regulations and established criteria, Cushing Weigle and Jensen (1997) suggest anchoring the proportion of target language in CLIL (say, 25%). This way, practitioners could have a rule of thumb or an approximate idea of the weight given to the target language, i.e., 25%, and proceed accordingly. Other authors, such as Gottlieb (2006), recommend aligning language proficiency with academic achievement so that content objectives can help us define the academic language required for achieving content standards. In this sense, teacher collaboration on the aspects that should be considered, and the weight they are given (if any), can facilitate the content teachers' work and make language visible in the content class. Likewise, as Bentley (2010, p. 124) explains, in considering linguistic aspects, we help to narrow the focus of assessment depending on the subjects, and help in the design of assessment instruments that pinpoint essential language features for the topics and subjects in question.
For instance, subjects like Art require limited language production, while in Social Science, where language is needed for the correct expression of content knowledge, both content and language-related issues are subject to assessment if the teacher decides to assess language at all.
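The 25% rule of thumb mentioned above can be illustrated with a short calculation. The following is only a sketch: the linear weighted average, the function name `combined_grade`, and the sample scores are assumptions made for illustration and are not a procedure described by Cushing Weigle and Jensen (1997).

```python
def combined_grade(content_score: float,
                   language_score: float,
                   language_weight: float = 0.25) -> float:
    """Return a weighted average of content and language scores.

    language_weight is the share of the final grade given to language;
    0.25 mirrors the 25% example in the text. The linear weighting
    itself is an assumption made for illustration.
    """
    if not 0.0 <= language_weight <= 1.0:
        raise ValueError("language_weight must be between 0 and 1")
    content_weight = 1.0 - language_weight
    return content_weight * content_score + language_weight * language_score


# Hypothetical student: 8 out of 10 for content, 6 out of 10 for language.
grade = combined_grade(content_score=8.0, language_score=6.0)
print(grade)
```

Setting `language_weight` to 0 reproduces a content-only grade, which corresponds to the case in which the teacher decides not to assess language at all.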

Functional assessment
As was pointed out before, language is an essential part of CLIL instruction and, as such, it deserves specific attention as the primary evidence that teachers use to judge students' achievement in certain subjects. Nevertheless, for the integrated assessment of content and language, a new vision of language literacy, emerging from the systemic-functional model of language (Halliday & Hasan, 1985), is required. Considering the relevance of language in CLIL, a model to assess language registers following a functional approach is now provided. For this purpose, the students' level in the foreign language across the different skills and the choice of code will also be considered.
Systemic Functional Linguistics (SFL), an approach to linguistics centered around the notion of language function, views language not just as a way to communicate and function in society, but also as a resource for creating meaning in a range of contexts (Coyle, 2005). In defining the language in CLIL, SFL helps us to consider how each subject makes use of different genres along with academic vocabulary to serve academic discourse and, thus, to express content knowledge. SFL also helps to ascertain how the language in CLIL can be assessed by taking into consideration specific domains and genres. Nevertheless, because of the nature of CLIL, the way language proficiency is considered deserves closer attention. In CLIL contexts, students do not need to master the vehicular language before instruction, and thus this new language literacy, which is incomplete and subject to change, should be viewed as limited if compared to native-like proficiency in monolingual and immersion contexts (Lasagabaster & Sierra, 2010; García & Lin, 2014).
In fact, due to the students' limited language proficiency, learning a subject through the vehicle of a foreign language is not the same as learning it in a first language. If a student is not able to express herself or himself in this foreign language, the grade she or he receives might be lower than that of a more proficient student.
Thus, as language expectations are often embedded in the assessment criteria, when language is not assessed appropriately, it can threaten the validity of assessment, and fail to provide an accurate picture of students' content knowledge and skills (Boscardin, Jones, Nishimura, Madsen, & Park, 2008, p. 4). To prevent this, the language needed for the competent performance of content learning needs to be clearly visible.
First, it should be linked to the achievement of content-based learning objectives (Coyle et al., 2010; Llinares, Morton, & Whittaker, 2012, pp. 284-285). Second, it should be adapted to the students' proficiency levels to determine the desired level of English (Wewer, 2014, p. 234). Third, these language-related goals should be shown to students. In this sense, teachers need, on the one hand, to be aware of the students' language proficiency and be familiar with the different levels in the CEFR. On the other hand, teachers are also encouraged to know about the specific language competence descriptors intrinsic to CLIL. The following are the main aspects of language competence content teachers need to take into account when assessing language in CLIL:
Among the descriptors displayed above, productive skills such as the ability to recall academic vocabulary, operate using functions, present, discuss, and reason in the vehicular language demand a high level of English proficiency on the part of the learners. To overcome and compensate for limited language skills, which can compromise (some) students' scores, Massler (as cited in Ioannou-Georgiou & Pavlou, 2011) suggests that teachers use the most direct method of assessment, one that uses the least language, such as completing grids and drawing diagrams or pictures, to boost students' comprehension. However, although reducing the level and amount of language present in assessment tasks can be beneficial for pre-primary and primary students, at higher educational levels cognitively challenging content requires more advanced language use and skills to support content expression. So, if CLIL is aimed at developing both content and linguistic skills, diminishing the presence of language in assessment tasks is unlikely to succeed in the long term, especially in those subjects and contexts in which speaking and writing tasks prevail over reading and listening. For students to be language-competent in CLIL, they need to be able to express themselves in both the written and the spoken form, along with any specific aspects of foreign language grammar and vocabulary that help them to communicate that content knowledge (Hargett, 1998). Regardless of the weight given to linguistic aspects in CLIL, if teachers decide to assess them, they should define the construct, that is, specify which aspects of language should be assessed.
According to the CLIL Compendium (2001), for students to be able to function in CLIL contexts, they first need to improve their overall target language competence; second, develop communicative skills; and third, deepen their awareness of both their mother tongue and the target language. The problem arises when students fail to improve their target language competence, the output they produce is not adequate or correct for the context in question, and the teacher is unsure which types of mistakes to correct (if any). In this regard, Mohan and Huang (2002) suggest that, since language is not learnt separately from content knowledge in CLIL, mistakes should be considered not in terms of grammatical correctness but in functional terms. As they point out: "the question is not whether a language form is grammatically correct but whether a form is used appropriately to convey a meaning in functional contexts" (Mohan & Huang, 2002, p. 240). Although an overt focus on form is believed to have a positive impact on the development of students' linguistic production in immersion programs and CLIL contexts (Pérez-Vidal, 2007; Pica, 2002), language mistakes should be judged differently from EFL mistakes, i.e., by taking into account their communicative intention in terms of linguistic functions rather than language accuracy or grammatical correctness. Thus, contrary to traditional practice in a foreign language lesson, the question of assessment in CLIL does not concern the students' ability to use a linguistic form correctly but their ability to use the appropriate form to express meaning in the particular academic context. For instance, in history, we need to focus on whether the student was successful in using factorial explanation, causal language and simple language forms to express degree of certainty (The war was probably caused by…) rather than focusing on accuracy and spelling in verb tenses (Llinares, Morton, & Whittaker, 2012, p. 294).
As regards limited language proficiency, which should always be taken into consideration when analyzing the students' output, the types of language mistakes deserve special attention, since their treatment differs depending on their nature. Errors are important in that they help differentiate among CLIL assessment practices. In fact, individual differences usually lie in the approach teachers take to error correction, which inevitably has a profound impact on how students perceive assessment. It seems that, in general, a large number of CLIL teachers tend to assess language with an apparent emphasis on lexical errors, particularly regarding the use of target academic vocabulary (Fuentes-Arjona, 2013), over pronunciation errors, which are usually ignored (Dalton-Puffer, 2008). However, a closer look at different practices in CLIL usually reveals that decisions about whether to assess language-related issues and, if so, which criteria to use, depend greatly on individual teachers and not on departments or institutions.
Currently, errors are considered part of the process of acquiring a language and, as such, teachers have to undertake specific pedagogical procedures to reduce their number and promote reflective attitudes among their students to help them develop their linguistic skills. The approach to errors is, consequently, different from that to mistakes, so errors should be corrected in such a way that the correction does not interfere with communication, while encouraging students and providing clear feedback and correct models (Council of Europe, 2001, p. 27).
Regarding error typology, Ernst (as cited in Hönig, 2010) divides errors into the following categories in the context of the error correction typical of form-focused instruction (FFI) (Pawlak, 2014). FFI, or the set of instructional activities intended to focus on language forms, is broadly understood as any attempt on the part of the teacher to encourage learners to pay attention to, reflect on, and gain control over targeted language features, whether they are grammatical, phonological, lexical or pragmalinguistic in nature, in a planned or spontaneous way (Pawlak, 2014, p. 2). This typology can help teachers to identify the kind of errors which should be corrected in the CLIL context depending on the extent to which understanding is impeded or impaired, i.e., considering language as the vehicle for expressing content knowledge. The first type refers to phonological, morphological, syntactic, semantic or pragmatic errors that impede or impair understanding, which, in the context of CLIL assessment, should be corrected and assessed. The second type comprises pragmatic errors or errors of register, which are considered inappropriate to both culture and situation, and which should be corrected. The third type comprises errors of form, i.e., deviations from grammar rules that do not impede understanding and that could be treated differently than in language lessons, as will be explained below. Finally, the fourth type comprises errors in content-specific terminology, particularly terminology previously dealt with in class, which impede understanding and prevent students from progressing in content subject knowledge due to the absence of specific academic vocabulary or CALP; these should be corrected and assessed (Hönig, 2010, p. 29).
Finally, regarding the choice of vehicular language in CLIL assessment, and again due to the lack of clear guidelines or specifications about CLIL assessment in general and the use of the L1 in particular (Lin, 2015), options vary among CLIL practitioners. Although instruction should be mediated in English, the teacher should be open to using the L1 moderately, and allow students to do the same occasionally (Gablasova, 2014; González & Barbero, 2013; Massler, 2011; Hönig, 2010). This moderate use of the students' L1 is especially recommended in monolingual contexts, and when students need to engage in "exploratory talk" to co-construct knowledge and understanding of the topic, check comprehension, and promote interlingual work by exploring the two languages (Kiely, 2011, p. 62), and thus support learning. When students are given the choice of using their mother tongue or the language of instruction, they benefit from an explicit, clear and plurilingual approach that deepens their awareness of both the target language and the mother tongue, and they develop plurilingual interests and attitudes (Marsh, Marshland, & Stenberg, 2001). The use of the L1 is particularly relevant in some CLIL contexts such as Primary Bilingual Schools in the Spanish CAM Bilingual Project, in which official guidelines recommend the reinforcement of academic vocabulary in both Spanish and English.
In an attempt to assess learning in subject matter, the model proposed by Polias (2006), based on SFL (Halliday & Hasan, 1985; Halliday & Matthiessen, 2004; Bachman & Palmer, 2010), can be useful for teachers to assess language effectively in specific CLIL genres (Unsworth, 2000; Whittaker, O'Donnell, & McCabe, 2006). This model is functionally organized so as to operate in all three manifestations of register (field, tenor, and mode), register being what distinguishes different types of genre. The genre refers to the text type and structure, i.e., the purpose, stages, organization, and phases in the text. The field deals with the type of lexis, i.e., how varied it is and its degree of technicality and abstraction. The tenor describes whether the text is consistent with the roles taken on by the language user, i.e., the degree of expertise and objectivity the text shows. Finally, the mode refers to whether the information in the text is organized in a coherent and cohesive way, along with spelling and punctuation patterns (Polias, 2006, p. 59). According to Polias, the more able students are to operate successfully in the register continua, the better and more appropriate their production becomes.

Genre
Stages and phases of the text are logically organized according to the genre and the task
All the stages and phases are included
Each of the stages and phases achieve their purpose

Field
The text includes all the field knowledge expected
Students' vocabulary is varied and adapted to their level
The student has expanded the nominal groups in relation to his/her level
The level of technicality and/or abstraction in the text is appropriate

Tenor
The student shows appropriate level of expertise in the academic field
Appropriate level of uncertainty is used
Appropriate level of objectivity is used

Mode
The student chooses theme (orientation) appropriately
Conjunctions are well selected and facilitate readability
Text is presented in a cohesive way
Grammatical elements are accurate
Spelling is accurate
Punctuation is accurate and facilitates the text readability

Source: Adapted from Polias (2006).
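For teachers who keep their assessment records electronically, the criteria above could be stored as a simple checklist. The following sketch is only one possible representation: the names `RUBRIC` and `summarize`, and the idea of reporting the share of criteria met per dimension, are assumptions introduced here for illustration and are not part of the model adapted from Polias (2006).

```python
from typing import Dict, List

# Criteria transcribed from the table above, grouped by register dimension.
RUBRIC: Dict[str, List[str]] = {
    "Genre": [
        "Stages and phases are logically organized for the genre and task",
        "All the stages and phases are included",
        "Each stage and phase achieves its purpose",
    ],
    "Field": [
        "The text includes all the field knowledge expected",
        "Vocabulary is varied and adapted to the student's level",
        "Nominal groups are expanded in relation to the student's level",
        "The level of technicality and/or abstraction is appropriate",
    ],
    "Tenor": [
        "Appropriate level of expertise in the academic field",
        "Appropriate level of uncertainty",
        "Appropriate level of objectivity",
    ],
    "Mode": [
        "Theme (orientation) is chosen appropriately",
        "Conjunctions are well selected and facilitate readability",
        "Text is presented in a cohesive way",
        "Grammatical elements are accurate",
        "Spelling is accurate",
        "Punctuation is accurate and facilitates readability",
    ],
}


def summarize(checked: Dict[str, List[bool]]) -> Dict[str, float]:
    """Return, per dimension, the share of criteria the teacher ticked.

    `checked` maps each dimension to one True/False mark per criterion,
    in the same order as RUBRIC. This share-based summary is a
    hypothetical convention, not a scoring rule from the source.
    """
    summary: Dict[str, float] = {}
    for dimension, criteria in RUBRIC.items():
        marks = checked.get(dimension, [])
        if len(marks) != len(criteria):
            raise ValueError(f"expected {len(criteria)} marks for {dimension}")
        summary[dimension] = sum(marks) / len(criteria)
    return summary


# Hypothetical marks for one student essay.
example = {
    "Genre": [True, True, False],
    "Field": [True, True, True, True],
    "Tenor": [False, False, False],
    "Mode": [True, True, True, False, False, False],
}
print(summarize(example))
```

Keeping the four dimensions separate, rather than collapsing them into one number, preserves the functional focus of the model: a teacher can see at a glance whether a student's difficulties lie in genre, field, tenor, or mode.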

One of the strengths of the model is that teachers can use it not only for product-based assessment, such as essays, project work and oral presentations but also for the process-based assessment tasks recommended in CLIL contexts. That is, for instance, the case of portfolio work, in which students can reflect on their work at distinct periods of time, and thus comment on their improvements. In fact, the focus on long-term work can be helpful for students in the first years of secondary education who often lack academic language or Higher Language Cognition (HLC) (Hulstijn, 2015) to produce high-quality academic explanations in subjects like Science and History (Aguirre-Muñoz, Park, Amabisca, & Boscardin, 2009), and whose production should be judged following a process-based approach.

Conclusions
As this paper has demonstrated, assessment in CLIL varies depending on several factors such as the CLIL model, the extent to which teachers have been trained to deal with linguistic aspects, and the subjects taught. After having described the different options to deal with the language in CLIL, and in the absence of standard assessment criteria, this paper aims to engage practitioners in reflection on which model best suits their purposes, and to offer a functional vision of language they can use to abandon the focus on form which is typical in some contexts. Thus, the vision supported here advocates the assessment of language issues depending on the CLIL model, context and subjects in particular. In hard CLIL, and in those subjects requiring less language production, the focus should be on content, and language should be assessed as integrated with content knowledge (Coyle et al., 2010). Conversely, in soft CLIL, and in subjects demanding more language production, the language should be treated as a separate component. Regardless of the choice, an appropriate treatment of language following a functional approach, and highlighting the role of language in the construction of academic discourse, is still essential.
This way, we can avoid language becoming an invisible part of instruction (Morton, 2018; Llinares, Morton, & Whittaker, 2012) and use