TY - JOUR
T1 - Test-taker perception of what test items measure
T2 - a potential impact of face validity on student learning
AU - Sato, Takanori
AU - Ikeda, Naoki
N1 - Publisher Copyright: © 2015, Sato and Ikeda.
PY - 2015/12/1
Y1 - 2015/12/1
AB - Background: High-stakes tests have an immense washback effect on what students learn. However, if students fail to recognize the abilities that the test developers intend to measure, they are less likely to learn what the test developers wish them to learn. This study investigates test-taker perception of the ability being measured by items (i.e., face validity) in high-stakes tests and examines the extent to which test-taker perception and test developer intention agree. Methods: University students in Japan and Korea (N = 179) were given past entrance examinations administered in the respective countries and asked to read test items and record what ability they thought each item was measuring. Results: Although the overall agreement rate was moderately high, items aiming to measure an ability to read between the lines were perceived to be measuring an ability to understand the content objectively. Furthermore, many participants perceived items designed to indirectly measure writing ability as those tapping into reading ability. Conclusions: Face validity could be integrated into test development with the ultimate aim of promoting positive washback on students, which should be one of the intentions of test developers. In order to obtain the positive and intended washback effect on English learning, the present study suggests that the Japanese and Korean test committees need to (a) widely inform test-takers of the ability measured by each test item and (b) incorporate performance testing that measures test-takers’ productive skills more directly.
KW - Entrance examination
KW - Face validity
KW - Test-taker perception
KW - Washback
UR - https://www.scopus.com/pages/publications/85014119268
U2 - 10.1186/s40468-015-0019-z
DO - 10.1186/s40468-015-0019-z
M3 - Article
AN - SCOPUS:85014119268
SN - 2229-0443
VL - 5
JO - Language Testing in Asia
JF - Language Testing in Asia
IS - 1
M1 - 10
ER -