The Reliability of ISO/IEC PDTR 15504 Assessments
During phase two of the SPICE trials, the Proposed Draft Technical Report version of ISO/IEC 15504 is being empirically evaluated. This document set is intended to become an international standard for Software Process Assessment. One thread of evaluations being conducted during these trials is the extent of reliability of assessments based on ISO/IEC PDTR 15504. In this paper we present the first evaluation of the reliability of assessments based on the PDTR version of the emerging international standard. In particular, we evaluate the interrater agreement of assessments. Our results indicate that interrater agreement is considerably high, both for individual ratings at the capability attribute level, and for the aggregated capability levels. In general, these results are consistent with those obtained using the previous version of the Software Process Assessment document set (known as SPICE version 1.0), where capability ratings were also found to have generally high interrater agreem ent. Furthermore, it was found that the current 4-point scale cannot be improved substantially by reducing it to a 3-point or to a 2-point scale.