- James E Houston: Department of Educational Psychology, College of Education, University of Illinois at Chicago, Chicago, Illinois 60607, USA. houstonje@aol.com
PURPOSE: To determine (1) whether judges differed in the levels of severity they exercised when rating candidates' performance in an oral certification exam, (2) to what extent candidates' clinical competence ratings were related to their organization/communication ratings, and (3) to what extent clinical competence ratings could predict organization/communication ratings.
METHOD: Six hundred eighty-four physicians participated in a medical specialty board's 2002 oral examination. Ninety-nine senior members of the medical specialty served as judges, rating candidates' performances. Candidates' clinical competence ratings were analyzed using multifaceted Rasch measurement to investigate judge severity. A Pearson correlation was calculated to examine the relationship between ratings of clinical competence and organization/communication. Logistic regression was used to determine to what extent clinical competence ratings predicted organization/communication ratings.
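The two simpler analyses named above (Pearson correlation and logistic regression) can be sketched as follows. This is only an illustrative sketch on synthetic data: the abstract does not describe the rating scales, the software, or the model specification the authors used, so the variable names, the simulated scores, the binary coding (1 = acceptable, 0 = unacceptable), and the 0.5 classification cutoff are all assumptions. The multifaceted Rasch analysis is not attempted here.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) by batch gradient ascent
    on the mean log-likelihood (one predictor, no regularization)."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def classification_rates(xs, ys, b0, b1, cutoff=0.5):
    """Share of correct predictions among y=1, among y=0, and overall."""
    hits = {0: 0, 1: 0}
    totals = {0: 0, 1: 0}
    for x, y in zip(xs, ys):
        pred = 1 if sigmoid(b0 + b1 * x) >= cutoff else 0
        totals[y] += 1
        hits[y] += int(pred == y)
    overall = (hits[0] + hits[1]) / len(ys)
    return hits[1] / totals[1], hits[0] / totals[0], overall


# Synthetic stand-in data (NOT the study's data): centered clinical
# competence scores and a binary organization/communication rating.
random.seed(0)
clinical = [random.uniform(-2, 2) for _ in range(200)]
org_comm = [1 if random.random() < sigmoid(1.5 * c) else 0 for c in clinical]

r = pearson_r(clinical, org_comm)
b0, b1 = fit_logistic(clinical, org_comm)
acc_acceptable, acc_unacceptable, acc_overall = classification_rates(
    clinical, org_comm, b0, b1
)
print(f"r = {r:.3f}, slope = {b1:.3f}")
print(f"correct: acceptable {acc_acceptable:.1%}, "
      f"unacceptable {acc_unacceptable:.1%}, overall {acc_overall:.1%}")
```

A positive slope corresponds to the pattern reported in the results, where higher clinical competence ratings go with an acceptable organization/communication rating; the per-class and overall rates are the kind of figures a logistic regression classification table reports.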
RESULTS: Judge severity fell into approximately three statistically distinct strata, so judges were not interchangeable. There was a moderately strong positive relationship between the two sets of candidate ratings: higher clinical competence ratings were associated with an organization/communication rating of acceptable, whereas lower clinical competence ratings were associated with a rating of unacceptable. The judges' clinical competence ratings correctly predicted 61.9% of the acceptable and 88.3% of the unacceptable organization/communication ratings. Overall, the clinical competence ratings correctly predicted 80% of the organization/communication ratings.
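The three classification percentages are linked by a weighted average: the overall rate equals the share of acceptable ratings times 61.9% plus the share of unacceptable ratings times 88.3%. The short sketch below solves that identity for the share of acceptable ratings. The abstract does not report this share, and the 80% figure is rounded, so the result is a back-of-envelope estimate (the function name is hypothetical), not a finding of the study.

```python
def implied_share_acceptable(acc_acceptable, acc_unacceptable, acc_overall):
    """Solve overall = p*acc_acceptable + (1 - p)*acc_unacceptable for p,
    the share of ratings that were 'acceptable'."""
    return (acc_overall - acc_unacceptable) / (acc_acceptable - acc_unacceptable)


# Percentages reported in the abstract, expressed as proportions.
share = implied_share_acceptable(0.619, 0.883, 0.80)

# Bounds allowing for the overall figure having been rounded to 80%.
share_low = implied_share_acceptable(0.619, 0.883, 0.805)
share_high = implied_share_acceptable(0.619, 0.883, 0.795)

print(f"implied share acceptable: {share:.3f} "
      f"(range {share_low:.3f} to {share_high:.3f})")
```

Under these assumptions, unacceptable ratings would be the larger group, which is why the overall rate sits closer to 88.3% than to 61.9%.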
CONCLUSIONS: The close association between the two sets of ratings was possibly due to a "halo" effect. The authors explored several explanations for this relationship and considered its implications for understanding how judges carry out this complex rating task.