ISSN: 1226-9654
Traditionally, researchers have used rating accuracy and rater bias (severity, centrality, and randomness) as individual-level indicators of rating quality. These indicators have been studied mostly in expert raters, and little research has examined whether evaluation capacity is domain-general across two or more different tasks. We therefore investigated the two indicators in undergraduate raters. In two studies, undergraduates scored outputs from a verbal-linguistic task and a visual-spatial task. The results showed that students who were proficient in one domain were also likely to be proficient in the other, in terms of both rating accuracy and use of the rating scale. In addition, students with lower rating accuracy were more strongly affected by the difference between domains than more accurate students were. We also discuss the implications and limitations of our findings for measuring student evaluation capacity.