It is also crucial to bear in mind that the data resource contained only mental health records and that general medical notes from other providers were not available for review. However, the nature of the syndrome is such that nearly all patients received active management during the course of their illness. Furthermore, in most cases mental health records were maintained during and after periods of care on general medical units, so relatively little information was lost.

Since Gurrera and colleagues [Gurrera et al. 1992] compared the three main sets of diagnostic criteria for NMS, three new sets have been published: those of Caroff and colleagues [Caroff et al. 1991], DSM-IV [American Psychiatric Association, 1994] and those of Adityanjee and colleagues [Adityanjee et al. 1999], who proposed research diagnostic criteria. Gurrera and colleagues [Gurrera et al. 1992] found ‘only modest agreement’ among the criteria of Levenson [Levenson, 1985], Addonizio and colleagues [Addonizio et al. 1986] and Pope and colleagues [Pope et al. 1986]. Our comparison, also based on a retrospective review of medical notes, likewise found only modest agreement, and if anything rather less. Gurrera and colleagues [Gurrera et al. 1992] derived κ and ICC statistics of between 0.41 and 0.65, and specifically modified the criteria of Levenson and of Addonizio and colleagues so as to conform to the ‘probable’ category allowed by Pope and colleagues. Their lowest ICC of 0.41 applied to a three-way comparison of the unmodified versions and Pope’s probable category, while the highest ICC applied to a three-way comparison of the modified versions and Pope’s probable category. Our study, while broadly in line with the conclusions of Gurrera and colleagues, showed some differences [Gurrera et al. 1992]. In particular, our measures of agreement were generally lower for overall and pairwise comparisons. Gurrera and colleagues reported κ values of 0.51 between the criteria of Levenson and those of Addonizio and colleagues, 0.60 between those of Pope and colleagues and those of Addonizio and colleagues, and 0.48 between those of Pope and colleagues and those of Levenson. In comparison, we found κ statistics for these comparisons of 0.51, 0.24 and 0.26 respectively.

Subsequent to the completion of the study reported here, Delphi consensus criteria for NMS were published [Gurrera et al. 2011]. However, we believe that these criteria would have little utility for retrospective analyses such as those carried out here because, like those of Sachdev [Sachdev, 2005], they assume that relatively specific sets of information are recorded in clinical records and are potentially better suited to prospective, more specific studies. Also of note is that Delphi methodology simply reflects the agreement of experts on the basis of the best evidence available.
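For readers less familiar with the pairwise κ statistics compared in this discussion, the following is a minimal sketch of how Cohen's κ can be computed for two sets of diagnostic criteria applied to the same cases. The function name `cohen_kappa` and the two case lists are hypothetical and purely illustrative; they are not the data or the software used in this study, and the ICC calculations are not shown.

```python
def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)

    # Observed agreement: proportion of cases classified identically.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: sum over labels of the product of marginal proportions.
    labels = set(ratings_a) | set(ratings_b)
    expected = sum(
        (ratings_a.count(label) / n) * (ratings_b.count(label) / n)
        for label in labels
    )

    if expected == 1.0:
        # Both criteria assigned the same single label to every case.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical classifications of ten cases (1 = criteria met, 0 = not met).
criteria_a = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
criteria_b = [1, 1, 0, 0, 0, 1, 1, 0, 0, 0]

kappa = cohen_kappa(criteria_a, criteria_b)
print(f"kappa = {kappa:.2f}")
```

Because κ subtracts the agreement expected by chance, two sets of criteria can classify most cases identically and still yield a modest κ when one classification predominates.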

