The present study assessed the test–retest and inter-observer reliability of diffusion tensor imaging (DTI) in cervical spondylotic myelopathy (CSM), as well as the agreement among measurement methods. A total of 34 patients (12 men, 22 women; mean age, 58.7 [range 45–79] years) who underwent surgical decompression for CSM and had pre-operative DTI scans available were retrospectively enrolled. Four observers independently measured fractional anisotropy (FA) values twice, using three different measurement methods. Test–retest and inter-observer reliability were assessed using intraclass correlation coefficients (ICCs). Overall, inter-observer agreement varied with spinal cord level and measurement method, ranging from poor to excellent (ICC = 0.374–0.821), with relatively lower agreement for the sagittal region of interest (ROI) method. The radiology resident and neuro-radiologist group showed excellent test–retest reliability at almost every spinal cord level (ICC = 0.887–0.997), but inter-observer agreement within this group ranged only from fair to good (ICC = 0.404–0.747). Thus, despite the excellent test–retest reliability of the ROI measurements, FA measurements in patients with CSM varied widely in terms of inter-observer reliability. DTI parameter data should therefore be interpreted carefully when applied clinically.
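
For readers unfamiliar with the statistic, the following is a minimal sketch of how an ICC can be computed from a subjects-by-observers table of FA values. The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is assumed here purely for illustration. The function name `icc_2_1` and the `demo` array are hypothetical; the numbers are the textbook example of Shrout and Fleiss (1979), not data from this study.

```python
import numpy as np

def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is an (n_subjects, k_raters) array with no missing values.
    Assumed form for illustration only; the study's ICC model is not stated.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()

    # Sums of squares from a two-way ANOVA without replication
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)   # between subjects
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)   # between raters
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols  # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Illustrative data: 6 subjects rated by 4 raters (Shrout & Fleiss, 1979)
demo = np.array([
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
])
print(round(icc_2_1(demo), 2))  # approximately 0.29
```

Under this assumed form, systematic offsets between observers lower the coefficient. This is one way observers can each be highly repeatable on test–retest yet agree only moderately with one another.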