Comparison of Korean vs. American Thyroid Imaging Reporting and Data System in Malignancy Risk Assessment of Indeterminate Thyroid Nodules
Abstract
Background
The management of cytologically indeterminate thyroid nodules is challenging for clinicians. This study aimed to compare the diagnostic performance of the Korean Thyroid Imaging Reporting and Data System (K-TIRADS) with that of the American College of Radiology (ACR)-TIRADS for predicting the malignancy risk of indeterminate thyroid nodules.
Methods
Thyroid nodules diagnosed by fine-needle aspiration (FNA) followed by surgery or core needle biopsy at a single referral hospital were enrolled.
Results
Among 200 thyroid nodules, 78 (39.0%) nodules were classified as indeterminate by FNA (Bethesda category III, IV, and V), and 114 (57.0%) nodules were finally diagnosed as malignancy by surgery or core needle biopsy. The area under the curve (AUC) was higher for FNA than for either TIRADS system in all nodules, while all three methods showed similar AUCs for indeterminate nodules. However, for Bethesda category III nodules, applying K-TIRADS 5 significantly increased the risk of malignancy compared to a cytological examination alone (50.0% vs. 26.5%, P=0.028), whereas applying ACR-TIRADS did not lead to a change.
Conclusion
K-TIRADS and ACR-TIRADS showed similar diagnostic performance in assessing indeterminate thyroid nodules, and K-TIRADS had beneficial effects for malignancy prediction in Bethesda category III nodules.
INTRODUCTION
Ultrasonography (US) is the first-line imaging modality for the diagnosis of thyroid nodules. Currently, major guidelines recommend fine-needle aspiration (FNA), which is an essential diagnostic tool for assessing the malignancy risk of thyroid nodules, based on US imaging characteristics [1]. After FNA, it is recommended that cytologic results should be reported according to the Bethesda System for Reporting Thyroid Cytopathology, and further management is chosen based on the results of cytology [2]. However, the reported risks of malignancy in each cytology category vary across studies, especially for indeterminate nodules [2]. Thus, adding US findings has been recommended as an additional diagnostic tool for assessing the malignancy risk of indeterminate thyroid nodules [1–6].
US risk classification systems have been established by several societies: the American Thyroid Association (ATA) [1], the American College of Radiology (ACR) [7], the European Thyroid Association [8], and the Korean Society of Thyroid Radiology [9]. Several recent studies have compared the diagnostic performance of different US risk stratification systems, including a comparison of the diagnostic value of three different Thyroid Imaging Reporting and Data Systems (TIRADS): the Korean, European, and ACR-TIRADS [10]. The Korean-TIRADS (K-TIRADS) showed the best specificity, while the ACR-TIRADS presented the best sensitivity. Meanwhile, another study showed that the ACR-TIRADS outperformed the K-TIRADS in terms of a higher area under the curve (AUC) and lower false-negative rate [11].
In recent years, several changes have been made in the cytopathologic diagnosis of thyroid nodules. First, the entity of noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP), which shows an excellent prognosis and an extremely low risk of adverse outcomes, has been introduced [12]. It is mainly placed in Bethesda cytology categories III, IV, and V [13], and a meta-analysis showed that adjusting for NIFTP as a benign disease significantly reduced the risk of malignancy in categories II to V [14]. Second, a subclassification of atypia (cytologic versus architectural) has been suggested to refine the risk of malignancy in category III nodules by the 2017 Bethesda system [2]. Studies have shown that atypia of undetermined significance or follicular lesion of undetermined significance (AUS/FLUS) nodules with cytologic atypia have a significantly higher risk of malignancy than those with architectural atypia [15,16].
Therefore, it is worthwhile to re-evaluate the diagnostic performance of US risk stratification systems in light of the recently revised cytopathologic diagnostic system. This study aimed to compare the diagnostic performance of the K-TIRADS and ACR-TIRADS in terms of predicting the risk of malignancy in each cytology category of the Bethesda system, especially focusing on indeterminate nodules (Bethesda cytology categories III, IV, and V).
METHODS
Study population
A total of 1,153 consecutive thyroid nodules from 854 patients (female, 79.9%; mean age, 53.6±13.1 years) who underwent FNA at the Department of Endocrinology of Seoul National University Hospital from January to December 2017 were retrospectively screened. Thyroid nodules from patients aged 20 to 85 years that were pathologically confirmed by core needle biopsy (CNB) or surgery were included. Finally, 200 thyroid nodules from 160 patients were enrolled. This study was approved by the Institutional Review Board of Seoul National University Hospital (IRB No. 1911-039-1076). Informed consent was waived because of the retrospective nature of the study and because the analysis used anonymized clinical data.
Ultrasonography risk stratification
High-resolution US was performed using a 5 to 12 MHz linear-array transducer (Philips Affiniti 50 G, Philips Ultrasound Inc., Bothell, WA, USA). After enrollment, US images of all thyroid nodules were retrospectively reviewed and scored using two different systems, the K-TIRADS and ACR-TIRADS, by two experienced endocrinologists who were blinded to the patients’ clinical information and pathology results. Any discrepancy in scoring was resolved by discussion between the two endocrinologists.
ACR-TIRADS is composed of five categories of US features, as previously described [7], and the TIRADS level is determined by the total score summed across all five categories: highly suspicious (TR5), ≥7 points; moderately suspicious (TR4), 4–6 points; mildly suspicious (TR3), 3 points; not suspicious (TR2), 2 points; benign (TR1), 0 points. The K-TIRADS is presented in the form of a tree diagram and classifies nodules into four categories, as previously described [9]: high suspicion, intermediate suspicion, low suspicion, and benign, which are designated as K-TR5, K-TR4, K-TR3, and K-TR2, respectively. To compare the diagnostic performance between K-TIRADS and ACR-TIRADS in each cytology category, malignancy risk was assessed using the combined results of cytology (the six categories of the Bethesda system [2]) and US pattern.
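The point thresholds listed above can be written as a small lookup. The following Python sketch is illustrative only: the function name `acr_tirads_level` is hypothetical, and treating a total of 1 point as TR1 is an assumption, because the thresholds given in the text do not assign that value.

```python
def acr_tirads_level(total_points: int) -> str:
    """Map a summed ACR-TIRADS point total to a TR level.

    Thresholds follow the text: TR5 >= 7, TR4 4-6, TR3 3, TR2 2, TR1 0.
    Assumption (not stated in the text): a total of 1 point is treated as TR1.
    """
    if total_points < 0:
        raise ValueError("total_points must be non-negative")
    if total_points >= 7:
        return "TR5"
    if total_points >= 4:
        return "TR4"
    if total_points == 3:
        return "TR3"
    if total_points == 2:
        return "TR2"
    return "TR1"  # 0 points, or 1 point under the stated assumption
```

For example, a nodule scoring 5 points in total would be returned as "TR4" (moderately suspicious).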
US-guided FNA or CNB procedures and management of cytologically indeterminate nodules
According to the standard practice protocol of Seoul National University Hospital, FNA was performed according to the indications recommended in the K-TIRADS scoring system [9]. Repeated FNA or CNB was considered for nodules with non-diagnostic or indeterminate cytology (Bethesda cytology category III or IV) and highly suspicious US features. Of the 1,153 screened thyroid nodules, 101 (8.8%) and seven (0.6%) were identified as Bethesda category III and IV, respectively. Supplemental Fig. S1 shows the clinical course of these nodules. Among the 101 nodules in Bethesda category III, three underwent repeated FNA and 55 underwent CNB. Of the 55 nodules that underwent CNB, 16 subsequently underwent thyroid surgery, and their pathologic results were as follows: seven papillary thyroid carcinoma (PTC); one follicular variant of papillary thyroid carcinoma (FVPTC); three NIFTP; and five follicular adenoma. Nine nodules underwent surgery without repeated FNA or CNB for the following reasons: large nodule size (n=6); the simultaneous presence of other nodules confirmed as malignant (n=2); and the presence of compressive symptoms (n=1). Repeated FNA in three nodules showed Bethesda cytology category I (n=1) or II (n=2). The remaining 34 nodules were followed with regular US without repeated FNA, CNB, or surgery. Of the seven Bethesda category IV nodules, six underwent surgery. Finally, of the 108 nodules diagnosed as Bethesda category III or IV by the first FNA, 56 were enrolled in this study.
Statistical analysis
The clinical characteristics of thyroid nodules were compared using Student’s t test. The chi-square test was used to compare the malignancy risk across cytology categories and across subgroups defined by the combined results of cytology and US pattern (K-TIRADS or ACR-TIRADS). The binomial test was used to evaluate whether the malignancy risk in each cytology category was significantly changed by adding the US pattern (K-TIRADS or ACR-TIRADS). To compare the diagnostic performance between K-TIRADS and ACR-TIRADS, receiver operating characteristic (ROC) curve analysis was used. Statistical analysis was performed using STATA version 13.1 (StataCorp, College Station, TX, USA), and P<0.05 was considered to indicate statistical significance for all tests.
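As an illustration of the binomial comparison described above, the following Python sketch computes an exact two-sided binomial P value for an observed subgroup count against a reference malignancy risk. It is a minimal sketch under stated assumptions, not necessarily the procedure run in STATA: the function names are hypothetical, and the two-sided P value is defined here as the total probability of all outcomes no more likely than the observed one.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k malignant nodules among n at risk p."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def binomial_test_two_sided(k: int, n: int, p0: float) -> float:
    """Exact two-sided binomial P value for k of n against reference risk p0.

    Sums the probabilities of every outcome that is no more likely than the
    observed one under the null hypothesis (risk equal to p0).
    """
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")
    observed = binomial_pmf(k, n, p0)
    cutoff = observed * (1.0 + 1e-7)  # small tolerance for floating-point ties
    probabilities = [binomial_pmf(i, n, p0) for i in range(n + 1)]
    return min(1.0, sum(q for q in probabilities if q <= cutoff))
```

With purely hypothetical counts, `binomial_test_two_sided(9, 18, 0.265)` would test whether 9 malignant nodules out of 18 in a combined cytology and US subgroup departs from a cytology-only risk of 26.5%.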
RESULTS
Clinical and cytopathologic characteristics of thyroid nodules
The baseline clinical characteristics of the thyroid nodules are presented in Table 1. Of the 200 thyroid nodules with a final diagnosis, 114 (57.0%) were confirmed as malignant. Patients with benign thyroid nodules were younger at diagnosis than those with malignant nodules (48.3±13.8 years vs. 54.9±14.6 years, P<0.001). The benign thyroid nodules were, on average, larger than the malignant nodules (2.1±1.3 cm vs. 1.4±0.9 cm, P<0.001). One hundred and ten (96.5%) malignant nodules were diagnosed by surgery, while 37 (43.0%) benign nodules were diagnosed by CNB (P<0.001). Among the malignant nodules, PTC, FVPTC, and follicular thyroid carcinoma (FTC) were diagnosed in 93.9%, 3.5%, and 2.6% of cases, respectively. Of the benign nodules, 11.6% and 10.5% were NIFTP and follicular adenomas, respectively.
Malignancy risk of thyroid nodules assessed by the Bethesda classification
First, the malignancy risk of all nodules was assessed by cytologic examination using the Bethesda categories. Among the 200 nodules, eight (4.0%), 34 (17.0%), 49 (24.5%), seven (3.5%), 22 (11.0%), and 80 (40.0%) were classified as Bethesda cytology category I, II, III, IV, V, and VI, respectively (Table 2). Furthermore, AUS/FLUS nodules were subclassified according to the type of atypia. Among these 49 nodules, 32 (65.3%), 10 (20.4%), and seven (14.3%) were subclassified as cytologic, architectural, and Hürthle cell atypia, respectively (Table 2).
Based on the final pathologic diagnosis by surgery or biopsy, the risk of malignancy was 26.5%, 28.6%, 81.8%, and 97.5% for Bethesda cytology category III, IV, V, and VI, respectively. The risk of malignancy in category III and V nodules was somewhat higher than that previously reported for the 2017 Bethesda system [2]. Among the 49 AUS/FLUS nodules, the risk of malignancy was higher in nodules with cytologic atypia than in those with architectural atypia, but the difference was not statistically significant (37.5% vs. 10.0%, P=0.101). None of the seven nodules with Hürthle cell atypia was malignant.
Malignancy risk of thyroid nodules assessed by the two US scoring systems
Next, the malignancy risk was evaluated using the two US scoring systems, K-TIRADS and ACR-TIRADS (Table 3, Supplemental Tables S1, S2). Based on the final pathologic diagnosis, the malignancy risk was 8.7%, 36.4%, and 86.2% for K-TR3, K-TR4, and K-TR5, respectively, and 14.3%, 33.3%, and 85.3% for ACR-TIRADS TR3, TR4, and TR5, respectively. Among all 200 nodules, 37 (18.5%) showed discordance in the estimated malignancy risk between K-TIRADS and ACR-TIRADS. The ROC analysis showed that the AUC of FNA was higher than that of both US scoring systems (0.921 vs. 0.855 for K-TIRADS, P=0.029; vs. 0.842 for ACR-TIRADS, P=0.014) (Fig. 1A). The AUCs of K-TIRADS and ACR-TIRADS showed no significant difference overall (0.855 vs. 0.842, P=0.332) or in each cytologic category. In cytologically indeterminate nodules (Bethesda categories III, IV, and V), the diagnostic performance of FNA, K-TIRADS, and ACR-TIRADS showed no significant differences, with AUCs of 0.731, 0.754, and 0.745, respectively (Fig. 1B). All enrolled nodules were divided into two groups according to the final diagnostic method, CNB (n=41) or surgery (n=159), and the diagnostic performance of the two US scoring systems was compared in each group. The diagnostic performance was similar between the two TIRADS systems in both the CNB and the surgery groups. The AUCs of K-TIRADS versus ACR-TIRADS were 0.865 versus 0.838 in the CNB group (P=0.151) and 0.900 versus 0.903 in the surgery group (P=0.819).
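For an ordinal score such as a TIRADS level, the AUC reported above can be read as the probability that a randomly chosen malignant nodule receives a higher score than a randomly chosen benign nodule, with ties counted as one half. The Python sketch below computes that quantity directly; the function name and the example data are hypothetical, and the sketch does not reproduce the statistical comparison of correlated AUCs behind the reported P values.

```python
from typing import Sequence


def auc_from_scores(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC of an ordinal score: P(malignant score > benign score), ties = 0.5.

    labels: 1 for malignant, 0 for benign (final pathologic diagnosis).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    malignant = [s for s, y in zip(scores, labels) if y == 1]
    benign = [s for s, y in zip(scores, labels) if y == 0]
    if not malignant or not benign:
        raise ValueError("need at least one malignant and one benign nodule")
    wins = 0.0
    for m in malignant:
        for b in benign:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5  # tied scores count as half a concordant pair
    return wins / (len(malignant) * len(benign))


# Hypothetical example: TIRADS levels of four nodules and their final diagnoses.
example_auc = auc_from_scores([5, 5, 4, 3], [1, 0, 1, 0])
```

Because TIRADS levels take only a few distinct values, ties are frequent, which is why the half-credit rule matters for these scores.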
Effects of US scoring systems on risk assessment of indeterminate thyroid nodules based on FNA
The diagnostic performance of K-TIRADS and ACR-TIRADS was compared for indeterminate thyroid nodules based on FNA. The sensitivity, specificity, positive predictive value, and negative predictive value were similar between the two systems in each cytology category (Table 4). Next, the predictive values for malignancy in each cytology category were calculated for the combination of FNA results with each US scoring system (Table 5). Interestingly, in nodules of Bethesda category III, adding the US finding of K-TR5 significantly increased the risk of malignancy compared to cytologic findings alone (50.0% vs. 26.5%, P=0.028), while adding the US findings of ACR-TIRADS did not (Table 5). Within the nodules of Bethesda category III, spiculated margin (46.2% vs. 5.6%, P=0.001), microcalcification (38.5% vs. 5.7%, P=0.004), and non-parallel orientation (30.8% vs. 8.3%, P=0.048) were significantly more frequent in malignant than in benign nodules, while hypoechogenicity and macro- or rim calcification were not. In nodules of Bethesda category V, adding the US finding of K-TR3 (81.8% vs. 0%, P=0.033) or ACR-TR4 (81.8% vs. 40.0%, P=0.045) significantly decreased the risk of malignancy, whereas adding ACR-TR5 significantly increased the risk of malignancy (81.8% vs. 100.0%, P=0.04). There was no additional effect of applying K-TIRADS scores for nodules of Bethesda category IV.
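The four performance measures named above all derive from one 2×2 table of US result against final pathology. The Python sketch below shows the arithmetic; the class name `TwoByTwo` and the counts in the usage example are hypothetical and are not taken from Table 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of US-positive/negative nodules by final pathology."""
    tp: int  # US positive, malignant
    fp: int  # US positive, benign
    fn: int  # US negative, malignant
    tn: int  # US negative, benign

    @staticmethod
    def _ratio(numerator: int, denominator: int) -> float:
        # Return NaN rather than raising when a margin of the table is empty.
        return numerator / denominator if denominator else float("nan")

    def sensitivity(self) -> float:
        return self._ratio(self.tp, self.tp + self.fn)

    def specificity(self) -> float:
        return self._ratio(self.tn, self.tn + self.fp)

    def ppv(self) -> float:
        return self._ratio(self.tp, self.tp + self.fp)

    def npv(self) -> float:
        return self._ratio(self.tn, self.tn + self.fn)


# Hypothetical counts for one cytology category and one US cutoff.
table = TwoByTwo(tp=8, fp=2, fn=2, tn=8)
```

Which TIRADS level counts as "US positive" is a choice of cutoff, so each cutoff yields its own table and its own set of four values.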
DISCUSSION
In this study, we demonstrated that two US scoring systems, the ACR-TIRADS and K-TIRADS, had similar diagnostic performance for indeterminate thyroid nodules diagnosed by FNA. The overall sensitivity and specificity were similar between K-TIRADS and ACR-TIRADS in nodules belonging to Bethesda categories III, IV, and V. However, in nodules belonging to Bethesda category III, adding K-TIRADS 5 significantly increased the risk of malignancy, while adding ACR-TIRADS did not. Spiculated margin, microcalcification, and non-parallel orientation were significantly more frequent in malignant than in benign nodules in Bethesda category III. Therefore, the K-TIRADS system may have further beneficial effects in predicting malignancy risk for nodules classified as Bethesda category III.
The management of indeterminate thyroid nodules based on FNA is one of the most challenging topics in the field. Several guidelines recommend repeated FNA and/or lobectomy or molecular diagnostics for indeterminate nodules, but it remains difficult to make an optimal decision because of the wide range of malignancy risk [2]. Thus, several studies have tried to re-assess the malignancy risks of thyroid nodules with indeterminate cytology using various US scoring systems. Recent studies demonstrated that several US scoring systems, including ACR-TIRADS, K-TIRADS, and the ATA guidelines, were useful for refining the malignancy risk of indeterminate thyroid nodules [17–23].
Furthermore, studies have compared the diagnostic performance of different US scoring systems in predicting the malignancy risk of indeterminate thyroid nodules. The present study showed a beneficial effect of applying K-TIRADS to Bethesda category III nodules in predicting malignancy risk. This result is concordant with that of a recently published study comparing the usefulness of thyroid sonographic risk-stratification systems in nodules with indeterminate, suspicious, or unequivocal cytology. In that study, K-TIRADS showed a higher AUC than ACR-TIRADS among AUS/FLUS nodules (0.692 vs. 0.655, P<0.05) [24]. The proportion of PTC was high (79.2%) in that study, and was even higher in our study (93.9%). Since US scoring systems are more sensitive to PTC than to FTC or FVPTC [25,26], K-TIRADS might be more beneficial for diagnosing thyroid cancer in PTC-dominant areas.
Both ACR-TIRADS and K-TIRADS have unique strengths for evaluating the risk of malignancy for thyroid nodules. ACR-TIRADS is composed of five categories, including composition, echogenicity, shape, margin, and echogenic foci, and all of those parameters must be evaluated for a final decision [7]. Meanwhile, K-TIRADS is presented in a tree diagram form using two decision steps, wherein malignancy risk is first categorized by echogenicity and solidity, and then by the presence of suspicious US features [9]. Therefore, K-TIRADS is more intuitive and easily applicable in daily clinical settings, while ACR-TIRADS focuses more on accuracy than on ease of application. Since the current evidence, including the present study, shows similar diagnostic performance between these two US scoring systems, an optimal selection needs to be made considering aspects of the real-world environment, including the prevalence of PTC.
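The two-step structure of K-TIRADS described above can be sketched as a short decision function. This is a simplified reading of the 2016 K-TIRADS tree [9] and not a substitute for the guideline: the function and parameter names are hypothetical, the three suspicious features are taken to be microcalcification, non-parallel orientation, and spiculated or microlobulated margin, and all benign patterns (for example spongiform nodules or pure cysts) are collapsed into a single assumed flag.

```python
def k_tirads_category(
    solid: bool,
    hypoechoic: bool,
    microcalcification: bool = False,
    nonparallel_orientation: bool = False,
    spiculated_margin: bool = False,
    benign_pattern: bool = False,
) -> int:
    """Return a K-TIRADS category (2-5) under the simplifying assumptions above."""
    if benign_pattern:
        return 2  # assumed flag covering all benign US patterns
    has_suspicious_feature = (
        microcalcification or nonparallel_orientation or spiculated_margin
    )
    # Step 1: solidity and echogenicity define the branch of the tree.
    if solid and hypoechoic:
        # Step 2: any suspicious feature raises the category.
        return 5 if has_suspicious_feature else 4
    # Partially cystic or iso-/hyperechoic nodules.
    return 4 if has_suspicious_feature else 3
```

Only two questions are asked in sequence, which is the sense in which the tree form is easier to apply at the bedside than a summed point score.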
The present study was performed in an extremely PTC-dominant area, and 94% of the histologically confirmed malignant nodules were PTC. Because of its typical cellular morphology [27], the cytologic diagnosis of PTC is expected to be easier than that of FTC or other follicular neoplasms, so fewer AUS/FLUS diagnoses might be anticipated. However, the overall prevalence of AUS/FLUS nodules in this study was 8.8%, which is similar to that of other countries [28] rather than lower. A recent meta-analysis reported an interesting comparison of AUS/FLUS nodules between Asian and non-Asian populations [28]. In that study, the frequency of AUS/FLUS diagnoses was not significantly different between the Asian and non-Asian cohorts (8.8% vs. 9.1%, P=0.69), while the malignancy risk (43.2% vs. 26.8%), the prevalence of cytologic atypia (70.3% vs. 33.5%), and the proportion of PTC in surgically resected tumors (46.3% vs. 29.1%) of AUS nodules were significantly higher in the Asian studies than in the non-Asian studies. The present study showed a higher prevalence of cytologic atypia in AUS/FLUS nodules in association with a high prevalence of PTC. The risk of malignancy was higher in nodules with cytologic atypia than in those with architectural atypia, and adding the US finding of either K-TR5 or ACR-TR5 increased the risk of malignancy compared to cytologic findings alone. However, these findings did not reach statistical significance, possibly because of the small number of nodules.
The overall malignancy rate of AUS/FLUS nodules in our study was 26.5%, similar to that reported for non-Asian cohorts. This relatively low malignancy risk of AUS/FLUS nodules, compared to other Asian cohorts, may be explained by the 10 cases of NIFTP, which were counted as benign disease in the present study. Collectively, the present study and a recent meta-analysis [28] showed that the prevalence of AUS/FLUS nodules was not lower in PTC-dominant Asian cohorts. Further study is needed to explain this intriguing finding.
The present study included four cases of misdiagnosis: two false positives and two false negatives. Two nodules initially diagnosed as malignant (Bethesda category VI) on FNA were finally diagnosed as benign hyalinizing trabecular tumor (HTT) after surgery. One was a hypoechoic, parallel-shaped solid nodule and was categorized as 4 in both K-TIRADS and ACR-TIRADS. The other showed the same features, but also had an irregular margin, and was categorized as 5 in K-TIRADS and 4 in ACR-TIRADS. HTT is known to mimic medullary thyroid carcinoma or PTC on cytology, which can lead to the misdiagnosis of these benign tumors as malignancies [29–31]. Jang et al. [32] reported that the most common US features of 12 cases of HTT were hypo- or marked hypo-echogenicity (83.4%), absence of calcification (91.7%), and parallel shape (100.0%), which are consistent with our cases. Further study is needed to identify ways to prevent false-positive diagnoses in cases of HTT.
Conversely, two cases were categorized as non-diagnostic at initial FNA, but were finally diagnosed as malignancies after CNB. Both nodules showed suspicious US features at the time of initial FNA (hypo-echogenicity and a solid component). One case had rim calcification and the other had both micro- and macro-calcifications. Although the risk of malignancy of initially reported non-diagnostic thyroid nodules is not high (5% to 10%) [1], nodules with macro-calcification have a high likelihood of being categorized as non-diagnostic on FNA [33]. In these cases, CNB can be helpful for making an accurate diagnosis.
There are several limitations in our study. First, we included patients who ultimately underwent CNB or surgery for pathologic confirmation, which means the nodules posed relatively high concerns of malignancy risks. Therefore, our data may overestimate the risk of malignancy. Second, the small sample size may explain the non-significant changes in malignancy risk shown in Table 4. Further research with a larger sample size will be needed to compare different US scoring systems.
In conclusion, both ACR-TIRADS and K-TIRADS had similar diagnostic performance for assessing the malignancy risk of indeterminate thyroid nodules, and K-TIRADS showed beneficial effects on malignancy prediction for nodules belonging to Bethesda cytology category III. Therefore, K-TIRADS may be useful in assessing the malignancy risk of cytologically indeterminate nodules in PTC-prevalent areas.
SUPPLEMENTARY INFORMATION
Malignancy Risk by Bethesda Cytology Category and the Korean Thyroid Imaging Reporting and Data System (K-TIRADS)
Malignancy Risk by Bethesda Cytology Category and American College of Radiology Thyroid Imaging Reporting and Data System (ACR-TIRADS)
Flow diagram of thyroid nodules of Bethesda cytology category III and IV. FNA, fine-needle aspiration; AUS/FLUS, atypia of undetermined significance or follicular lesion of undetermined significance; FN/SFN, follicular neoplasm or suspicious for a follicular neoplasm; US, ultrasonography; CNB, core needle biopsy; FA, follicular adenoma; NIFTP, noninvasive follicular thyroid neoplasm with papillary-like nuclear features.
Notes
CONFLICTS OF INTEREST
No potential conflict of interest relevant to this article was reported.
AUTHOR CONTRIBUTIONS
Conception or design: Y.J.P., S.W.C. Acquisition, analysis, or interpretation of data: S.K., S.K.K., H.S.C., M.J.K. Drafting the work or revising: S.K. Final approval of the manuscript: S.K., Y.J.P., D.J.P., S.W.C.