[IPAC-List] Good BARS Inter-rater Agreement?

Richard Joines mpscorp at value.net
Fri Jul 15 14:45:34 EDT 2016


Hi Joel,

You indicated that there is no communication between the two panels during 
the rating process, but you didn't say whether each member of a panel makes 
his/her ratings before any communication with the other panel members.  Are 
they allowed to discuss their views, then make their ratings?  Or do they 
first make independent ratings which cannot be changed, then have a 
discussion and resolution, with each panel member then able to make a final 
rating?  You can't assess inter-rater reliability on any such final ratings 
made after discussion.  Basically, that's an LGD (leaderless group 
discussion), and someone is likely to have more influence.

Without a restriction ensuring that the ratings are independent (no 
discussion), you can't assess the inter-rater reliability of a panel.

However, if there is discussion before the ratings are made, I believe you 
could treat each panel as a single rater and correlate the scores the two 
panels assign.  Again, I think you can use the intra-class correlation, with 
level differences counted as error.  The result will give you the 
single-rater reliability (where a panel is treated as a single rater).  
Since your final results will be the sum (or mean) of the two panels' 
ratings, you can then use the Spearman-Brown formula to get the reliability 
of the sum of two raters (i.e., the same as doubling a test in length).
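
In case it helps, here is a minimal sketch of that computation in Python 
with NumPy.  The data values are made up, the helper names (icc_2_1, 
spearman_brown) are mine, and the ICC(2,1) form below counts rater level 
differences as error, per Shrout & Fleiss (1979):

    import numpy as np

    def icc_2_1(x):
        # ICC(2,1): two-way random effects, absolute agreement, single
        # rater.  x is an (n_subjects, k_raters) array; level differences
        # between raters are counted as error.
        n, k = x.shape
        grand = x.mean()
        msb = k * ((x.mean(axis=1) - grand) ** 2).sum() / (n - 1)  # subjects
        msc = n * ((x.mean(axis=0) - grand) ** 2).sum() / (k - 1)  # raters
        sse = ((x - grand) ** 2).sum() - (n - 1) * msb - (k - 1) * msc
        mse = sse / ((n - 1) * (k - 1))                            # residual
        return (msb - mse) / (msb + (k - 1) * mse + k * (msc - mse) / n)

    def spearman_brown(r1, k):
        # Reliability of the mean (or sum) of k raters, given the
        # single-rater reliability r1.
        return k * r1 / (1 + (k - 1) * r1)

    # Hypothetical data: one row per candidate, one column per panel.
    panels = np.array([[3.2, 3.5], [4.0, 3.8], [2.1, 2.6], [3.7, 3.9]])
    r1 = icc_2_1(panels)        # single-panel ("single-rater") reliability
    r2 = spearman_brown(r1, 2)  # reliability of the two-panel sum or mean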

Now, if you're actually getting independent ratings to start with, then it 
seems to me that you can do an intra-class correlation for 6 raters -- 
again using level differences as error -- and when you get the result (the 
single-rater reliability), you can use the S-B formula to extend it to six 
raters, since your final scores are essentially the sum or mean of all six 
raters.  I say use all 6 raters because, if there's no discussion on either 
panel, the panel assignment is irrelevant.  You're just determining the 
reliability of independent ratings where you have 6 raters.
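
Continuing the sketch above (same hypothetical helpers, made-up data), the 
six-rater case is just a wider matrix, with the Spearman-Brown step 
extended to k = 6:

    # Hypothetical: rows = candidates, columns = all 6 raters; the panel
    # each rater sat on is ignored.
    ratings = np.array([[3.1, 3.4, 3.0, 3.3, 3.6, 3.2],
                        [4.2, 3.9, 4.0, 4.1, 3.8, 4.3],
                        [2.2, 2.5, 2.0, 2.4, 2.7, 2.3]])
    r1 = icc_2_1(ratings)       # single-rater reliability
    r6 = spearman_brown(r1, 6)  # reliability of the 6-rater sum or mean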

It occurs to me that you may also want to correlate each panel's actual 
final consensus ratings with the results based on its members' independent 
ratings.  This would tell you whether the discussion process radically 
changed the independent rating results or whether the consensus was pretty 
much in line with them.  If you get an odd result here (a correlation less 
than .80), you can review the data to see what happened.
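
That check is a simple Pearson correlation; sketched here with made-up 
numbers for one panel:

    # Hypothetical: one panel's final consensus score per candidate, and
    # the mean of that panel's pre-discussion independent ratings.
    consensus = np.array([3.5, 4.0, 2.5, 3.8])
    independent = np.array([[3.2, 3.5, 3.4], [4.0, 3.8, 4.1],
                            [2.1, 2.6, 2.4], [3.7, 3.9, 3.8]]).mean(axis=1)
    r = np.corrcoef(consensus, independent)[0, 1]  # worth a look if r < .80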

Anyhow, Joel, maybe I'm missing something, but these are my thoughts.

Good luck,
Richard Joines


----- Original Message ----- 
From: "Joel Wiesen" <jwiesen at appliedpersonnelresearch.com>
To: "IPAC-List" <IPAC-List at ipacweb.org>
Sent: Friday, July 15, 2016 8:58 AM
Subject: [IPAC-List] Good BARS Inter-rater Agreement?


> Perhaps you can share your experience or thoughts concerning inter-rater 
> agreement.
>
> Based on your experience (or published literature) what level of agreement 
> is reasonable to expect between 2 independent rating panels, each 
> consisting of 3 raters?
>
> Assume:
>
> 1. The exercise being graded is part of a public safety promotional 
> process and includes written instructions, with a one hour prep time 
> followed by a 15 min role play.
>
> 2. Each panel rates each candidate on 4 managerial/supervisory dimensions, 
> using one BARS scale per dimension.
>
> 3. Each rating panel bases its ratings on a video recording of each 
> candidate.
>
> 4. The two panels are trained together before the rating of candidates 
> begins.  There is no communication between the two panels during the 
> rating of candidates.
>
> We can consider the scores at two levels: at the dimension level and 
> overall.  For each rating panel, a candidate's score on a dimension is the 
> mean of 3 ratings (one rating by each of the three raters), and a 
> candidate's overall score is the mean of 12 ratings (3 ratings for each of 
> 4 dimensions).
>
> We can consider agreement of two types: agreement of the 3 raters within 
> each panel, and the agreement of the 2 panels.
>
> In short, what level(s) of agreement among the raters within each panel 
> and across the two panels is reasonable to expect: for the 4 dimension 
> scores and for the overall scores?
>
> Your comments will be greatly appreciated.
>
> Joel
>
>
>
>
> -- 
> Joel P. Wiesen, Ph.D., Director
> Applied Personnel Research
> 62 Candlewood Road
> Scarsdale, NY 10583-6040
> http://www.linkedin.com/in/joelwiesen
> (617) 244-8859
> http://appliedpersonnelresearch.com
>
>
>
>
> Note: This e-mail and any attachments may contain confidential and/or 
> legally privileged information. Please do not forward any contents without 
> permission. If you have received this message in error please destroy all 
> copies, completely remove it from your computer, and notify the sender. 
> Thank you.
>
> _______________________________________________________
> IPAC-List
> IPAC-List at ipacweb.org
> https://pairlist9.pair.net/mailman/listinfo/ipac-list
> 



