The Validity of Ultrasonography-Guided Fine Needle Aspiration Biopsy in Thyroid Nodules 4 cm or Larger Depends on Ultrasonography Characteristics
Abstract
Background
The objective of this study was to evaluate the validity of fine needle aspiration biopsy (FNAB) according to ultrasonography (US) characteristics in thyroid nodules 4 cm and larger.
Methods
We retrospectively reviewed the cases of 263 patients who underwent thyroid surgery for thyroid nodules 4 cm or larger between January 2001 and December 2010.
Results
The sensitivity of US-FNAB was significantly higher in nodules with calcifications (micro- or macro-) than in those without (97.9% vs. 87.8%; P<0.05). The accuracy of US-FNAB was higher in large thyroid nodules with US features suspicious of malignancy, such as a solid component, ill-defined margin, hypoechogenicity or marked hypoechogenicity, or any calcifications (micro- or macro-), than in thyroid nodules with none of these features. Furthermore, the accuracy improved as the number of these features increased. The overall false negative rate (FNR) was 11.9%. The FNR of thyroid nodules that appeared benign on US, such as mixed nodules (7.7%) or nodules without calcification (9.8%), tended to be lower than that of solid nodules (17.9%) or nodules with any microcalcification or macrocalcification (33.3%). In nodules without suspicious features of malignancy, the FNR of US-FNAB was 0% (0/15).
Conclusion
We suggest individualized strategies for large thyroid nodules according to US features. Patients with benign FNAB can be followed in the absence of any malignant features in US. However, if patients exhibit any suspicious features, potential false negative results of FNAB should be kept in mind and surgery may be considered.
INTRODUCTION
As the prevalence of thyroid nodules increases, there is an increasing need to identify the nature of these nodules more precisely and efficiently [1,2]. Ultrasonography (US)-guided fine needle aspiration biopsy (FNAB) is the standard diagnostic modality for differential diagnosis of thyroid nodules. It is safe, simple, cost effective, and reliable, with high sensitivity, specificity, and accuracy for malignancies, allowing patients to avoid unnecessary surgery [3,4,5]. However, there are limitations of FNAB for large thyroid nodules, particularly those 4 cm and larger, and the diagnostic validity of FNAB is controversial in thyroid nodules of this size [6,7,8,9,10,11,12,13]. Some authors [6,7,8,9] recommend surgery for nodules ≥4 cm irrespective of FNAB results, while others advocate the reliability of FNAB even in nodules larger than 4 cm [10,11,12,13]. Thus, identifying benign nodules and accurately diagnosing and managing malignant nodules are challenges in large thyroid nodules, especially those 4 cm and larger.
US is also a useful, noninvasive tool for detecting nodules at risk of malignancy. US features suggestive of malignancy include the presence of a solid component, ill-defined margin, hypoechogenicity or marked hypoechogenicity, and microcalcification or macrocalcification [14,15,16,17,18,19], although considerable overlap between benign and malignant characteristics has been found in some studies [20,21]. In particular, a combination of these features provides better prediction for malignancy than only a single feature [14].
The objective of this study is to evaluate the validity of FNAB in large thyroid nodules (4 cm and larger) according to US characteristics and propose a management strategy for these patients.
METHODS
Study populations
We retrospectively reviewed 263 consecutive patients who underwent thyroid surgery for thyroid nodules ≥4 cm (either single nodules or the largest of multiple nodules) with preoperative US-FNAB at Samsung Medical Center between January 2001 and December 2010. Thyroid nodule size was determined based on final surgical histopathology. Data were collected from a review of medical records, preoperative US imaging, preoperative US-FNAB reports, and final histopathology.
Review of US images
All US images of thyroid nodules (solitary nodules ≥4 cm or dominant nodules ≥4 cm on histopathology) were retrospectively reviewed by one endocrinologist with 7 years of experience in thyroid US who was blinded to FNAB and histopathology results. For each nodule, US features were categorized according to internal component (mixed, solid), margin (well-defined, ill-defined), echogenicity (isoechogenic, hypoechogenic, or markedly hypoechogenic), and calcification (no, microcalcification, or macrocalcification).
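The four US descriptors above lend themselves to a simple per-nodule count of suspicious features. The following is a minimal sketch, not the authors' analysis code: the class name `NoduleUS`, the function `suspicious_feature_count`, and the string labels are hypothetical, and the set of suspicious findings (solid component, ill-defined margin, hypoechogenicity or marked hypoechogenicity, any calcification) is taken from the Introduction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoduleUS:
    """US descriptors for one nodule (labels are illustrative assumptions)."""
    component: str      # "mixed" or "solid"
    margin: str         # "well-defined" or "ill-defined"
    echogenicity: str   # "iso", "hypo", or "marked-hypo"
    calcification: str  # "none", "micro", or "macro"


def suspicious_feature_count(nodule: NoduleUS) -> int:
    """Count how many of the four suspicious US features are present (0 to 4)."""
    return sum([
        nodule.component == "solid",
        nodule.margin == "ill-defined",
        nodule.echogenicity in ("hypo", "marked-hypo"),
        nodule.calcification in ("micro", "macro"),
    ])


# A mixed, well-defined, isoechogenic nodule without calcification has no
# suspicious features; a solid, ill-defined, hypoechogenic, calcified one has four.
benign_looking = NoduleUS("mixed", "well-defined", "iso", "none")
suspicious = NoduleUS("solid", "ill-defined", "marked-hypo", "micro")
print(suspicious_feature_count(benign_looking), suspicious_feature_count(suspicious))
```

Treating each descriptor as a binary "suspicious or not" flag mirrors how the Results group nodules by the number of suspicious features.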
Classification of FNAB results and histopathology
FNAB results were reclassified by a cytopathologist with over 15 years of experience in thyroid FNAB who was blinded to US features. Samples were categorized as inadequate, benign, atypia of undetermined significance/follicular lesion of undetermined significance (AUS/FLUS), follicular neoplasm/Hürthle cell neoplasm (FN/HN), suspicious for malignancy (SM), or malignancy according to the Bethesda System for Reporting Thyroid Cytopathology [22].
Statistical analysis
For statistical analysis, we reclassified the FNAB report according to the Bethesda criteria into benign, indeterminate, and SM/malignancy. Benign included samples determined to be benign by the Bethesda criteria, indeterminate included those deemed AUS/FLUS or FN/HN by the Bethesda criteria, and SM/malignancy included those designated SM or malignancy by the Bethesda criteria. A true positive was defined as a thyroid nodule that was indeterminate or SM/malignant by FNAB and was confirmed to be malignant on final histopathology. A true negative was defined as a thyroid nodule that was benign on FNAB and was diagnosed as benign on final histopathology. A false positive was a thyroid nodule that was indeterminate or SM/malignant on FNAB, but was confirmed benign on final histopathology. A false negative was defined as a thyroid nodule that was benign on FNAB, but was diagnosed as malignant on final histopathology. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy were calculated, with the exception of thyroid nodules with inadequate FNAB reports. The chi-square test or Fisher exact test was used to compare the categorical data. One-way analysis of variance was used for continuous variables. Statistical analyses were performed using SPSS version 18.0 (IBM Co., Armonk, NY, USA), and a P<0.05 was considered statistically significant.
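The reclassification and outcome definitions above can be sketched as a small lookup plus one function. This is an illustrative sketch only: the names `BETHESDA_TO_GROUP` and `classify_outcome` and the category labels are hypothetical, and the logic simply restates the definitions in this section (indeterminate and SM/malignancy count as test-positive; inadequate samples are excluded).

```python
from typing import Optional

# Three-tier grouping of Bethesda categories described in the text.
# None marks inadequate samples, which are excluded from the analysis.
BETHESDA_TO_GROUP = {
    "inadequate": None,
    "benign": "benign",
    "AUS/FLUS": "indeterminate",
    "FN/HN": "indeterminate",
    "SM": "SM/malignancy",
    "malignancy": "SM/malignancy",
}


def classify_outcome(bethesda: str, malignant_on_histology: bool) -> Optional[str]:
    """Return "TP", "FP", "TN", or "FN", or None for an inadequate sample."""
    group = BETHESDA_TO_GROUP[bethesda]
    if group is None:
        return None
    # Indeterminate and SM/malignancy are both treated as a positive FNAB.
    if group != "benign":
        return "TP" if malignant_on_histology else "FP"
    return "FN" if malignant_on_histology else "TN"


print(classify_outcome("AUS/FLUS", True))   # indeterminate, malignant histology
print(classify_outcome("benign", True))     # benign FNAB, malignant histology
```

Counting indeterminate cytology as test-positive is a design choice of the study; it raises sensitivity at the cost of specificity compared with counting only SM/malignancy as positive.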
RESULTS
Study population demographics
The mean age of the patients was 45.6±15.5 years (range, 19 to 77). There were 183 women (69.6%) and 80 men (30.4%). The mean size of the nodules was 5.0±1.2 cm (range, 4.0 to 11.0). There were 154 malignant and 109 benign nodules on final histopathology. Age (45.5±16.9 years vs. 45.7±13.4 years; P=0.91), sex (68.8% female vs. 70.6% female; P=0.78), and size (5.1±1.4 cm vs. 4.8±0.9 cm; P=0.13) did not differ significantly between malignant and benign nodules, respectively.
FNAB results and final histopathology
On US-FNAB of thyroid nodules ≥4 cm, 26 samples (9.9%) were inadequate, 67 nodules (25.5%) were benign, 55 nodules (20.9%) were AUS/FLUS, four nodules (1.5%) were FN/HN, 19 nodules (7.2%) were SM, and 92 nodules (35.0%) were malignancies, as defined by the Bethesda criteria (Table 1). Among 263 nodules in the study, 154 (58.6%) were confirmed malignant and 109 (41.4%) were benign on final histopathology. Of 154 malignant nodules, there were 98 classical papillary thyroid carcinomas (PTCs), 19 variants of PTC, 25 follicular thyroid carcinomas (FTCs), six medullary thyroid carcinomas, five anaplastic carcinomas, three Hürthle cell carcinomas, and three poorly differentiated carcinomas. Among 109 benign nodules, there were 80 nodular hyperplasias, 25 follicular adenomas, and four Hürthle cell adenomas. Eight of 67 nodules with benign FNAB reports were malignant on final histology and thus the false negative rate (FNR) was 11.9% (8/67). Those nodules diagnosed as benign in preoperative FNAB included seven FTCs and one follicular variant of PTC. Furthermore, 40% (22/55) of AUS/FLUS and 100% (4/4) of FN/HN turned out to be malignant on final histopathology.
Validity of FNAB according to US characteristics
FNAB alone had a sensitivity, specificity, PPV, NPV, and accuracy of 94.4% (135/143), 62.8% (59/94), 79.4% (135/170), 88.1% (59/67), and 81.9% (194/237), respectively (Table 2). After excluding nodules with inadequate FNAB results, the overall malignancy rate was 60.3% (143/237).
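The validity measures can be recomputed from the fractions reported above. The sketch below is an illustration under stated assumptions: the names `Counts` and `validity_metrics` are hypothetical, and the false positive count of 35 is not stated directly in the text but is derived from the PPV denominator (170 − 135). Note that the FNR used in this article is the proportion of benign FNAB results that proved malignant (8/67), which equals 100% minus the NPV rather than 100% minus the sensitivity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    tp: int  # positive FNAB (indeterminate or SM/malignancy), malignant histology
    fp: int  # positive FNAB, benign histology
    tn: int  # benign FNAB, benign histology
    fn: int  # benign FNAB, malignant histology


def validity_metrics(c: Counts) -> dict:
    """Return diagnostic validity measures as percentages."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": 100 * c.tp / (c.tp + c.fn),
        "specificity": 100 * c.tn / (c.tn + c.fp),
        "ppv": 100 * c.tp / (c.tp + c.fp),
        "npv": 100 * c.tn / (c.tn + c.fn),
        "accuracy": 100 * (c.tp + c.tn) / total,
        # FNR as defined in this article: malignant nodules among benign FNAB results.
        "fnr_among_benign_fnab": 100 * c.fn / (c.fn + c.tn),
    }


# Counts read from the reported fractions; fp=35 is derived (170 - 135).
overall = Counts(tp=135, fp=35, tn=59, fn=8)
for name, value in validity_metrics(overall).items():
    print(f"{name}: {value:.1f}%")
```

The same function can be applied to any subgroup (for example, nodules with or without calcification) once its four counts are known.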
In our analysis of the validity of FNAB according to US characteristics, FNAB sensitivity in nodules with either microcalcification or macrocalcification was significantly higher than that of nodules without calcification (97.9% vs. 87.8%; P=0.02). The PPV of FNAB was significantly increased in nodules with a solid component, ill-defined margin, hypoechogenicity or marked hypoechogenicity, and either microcalcification or macrocalcification. The overall accuracy of FNAB was also improved in the presence of any of the US features of malignancy mentioned above. The FNR varied according to US features. The FNR of mixed nodules was 7.7% (3/39), while that of solid nodules was 17.9% (5/28), but this difference did not reach statistical significance. Similarly, nodules without calcification demonstrated a lower FNR of 9.8% (6/61) compared to nodules with calcifications at 33.3% (2/6), but this was not statistically significant.
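The FNR comparisons above involve small cell counts, for which an exact test is a natural choice. The following sketch implements a two-sided Fisher exact test with the standard library only and applies it to the two 2×2 tables implied by the reported fractions. It is an assumption that this matches the test the authors used for these particular comparisons (the Methods mention both the chi-square and Fisher exact tests), and the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Rows: nodule group; columns: false negative, true negative (benign FNAB only).
p_component = fisher_exact_two_sided(3, 36, 5, 23)      # mixed 3/39 vs. solid 5/28
p_calcification = fisher_exact_two_sided(6, 55, 2, 4)   # none 6/61 vs. any 2/6
print(f"component: p={p_component:.3f}, calcification: p={p_calcification:.3f}")
```

With only six calcified nodules among those with benign FNAB, a sizeable difference in FNR can still fail to reach significance, which is consistent with the cautious wording in the text.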
Validity of FNAB according to the number of suspicious malignant features
The diagnostic accuracy of FNAB improved as the number of suspicious US features increased (Table 3). FNAB had 100% (24/24) PPV and 100% (24/24) accuracy in nodules with all four suspicious features. In addition, the malignancy rate increased significantly as the number of these US features increased. In nodules with none of these features, the FNR of FNAB was 0% (0/15).
To evaluate the malignancy rate according to FNAB results, we reclassified all nodules into three conventional FNAB categories (benign, indeterminate, and SM/malignancy) and classified the US features according to internal component (mixed, solid), margin (well-defined, ill-defined), echogenicity (iso-, hypo-, or markedly hypoechogenic), and calcification (no, microcalcification, or macrocalcification). Nodules with benign results on FNAB and a solid component on US were more likely to be malignant (17.9%) than nodules that were benign on FNAB and had a mixed component on US (7.7%). Nodules with benign FNAB results and calcifications on US had a higher malignancy rate (33.3%) than nodules with benign FNAB results without calcification on US (9.8%), but this did not reach statistical significance. Nodules with indeterminate FNAB and a solid component, ill-defined margin, hypoechogenicity or marked hypoechogenicity, and microcalcification or macrocalcification on US had a higher malignancy rate than nodules without these features (Fig. 1). The malignancy rate varied according to the number of these US features exhibited. As the number of these features increased, the malignancy rate also tended to increase in nodules with benign and indeterminate FNAB results (Fig. 2).
DISCUSSION
The main finding of this study is that the validity of FNAB differs according to US characteristics in large thyroid nodules (4 cm and larger). The accuracy of FNAB was higher in thyroid nodules with certain US features suggestive of malignancy: a solid component, ill-defined margin, hypoechogenicity or marked hypoechogenicity, and either microcalcification or macrocalcification. Furthermore, the accuracy improved as the number of these suspicious features increased.
US-FNAB is the most efficient and reliable diagnostic procedure for determining the nature of thyroid nodules [3,4,5]. It plays a crucial role in guiding appropriate management in patients with thyroid nodules and avoiding unnecessary surgery. However, the application of FNAB results for determining appropriate follow-up management of patients with large thyroid nodules remains controversial.
A few previous studies, mostly focusing on FNRs, have reported conflicting results in thyroid nodules ≥4 cm. Carrillo et al. [6] reported that the FNR was 20% in 35 thyroid nodules ≥4 cm. McCoy et al. [7] also found an unacceptably high FNR (13%) in 149 patients with thyroid nodules ≥4 cm; when multifocal micropapillary carcinomas were included, the FNR increased to 16%. Pinchot et al. [8] reported a FNR of 7.7% in 97 thyroid nodules ≥4 cm and a missed follicular lesion rate of 42% in patients with reportedly benign FNAB. Wharry et al. [9] recently showed a FNR of 10.4% in 382 thyroid nodules ≥4 cm. However, Kuru et al. [10] reported that FNRs were similar between nodules <4 and ≥4 cm, at 1.3% and 4.3%, respectively. Rosario et al. [11] also suggested that the FNR (3.6%) was not high enough to justify routine surgery in 151 thyroid nodules ≥4 cm. Raj et al. [12] reported a FNR of 0.84% in 223 thyroid nodules ≥4 cm. Recently, Shrestha et al. [13] found that the FNR was 15.8% in nodules 0.5 to 0.9 cm, 6.3% in nodules 1.0 to 3.9 cm, and 7.1% in nodules ≥4 cm. They confirmed that thyroid nodule size ≥4 cm did not diminish the accuracy of FNAB.
Unlike previous studies, we compared the reliability of FNAB according to the US characteristics of thyroid nodules 4 cm and larger. We observed a relatively high overall FNR of 11.9%. However, this result varied according to US features. The FNR of benign-appearing nodules, such as mixed-component or noncalcified nodules, was lower than that of nodules with suspicious US features, though this did not reach statistical significance. In nodules without suspicious malignant features, that is, nodules with a mixed component, well-defined margin, isoechogenicity, and no calcifications, the FNR of FNAB was 0%.
A solid component, ill-defined margin, hypoechogenicity or marked hypoechogenicity, and microcalcification or macrocalcification on US are features suggestive of malignancy [14,15,16,17,18,19]. In this study, the malignancy rate was higher in nodules with these features, and this rate increased significantly as the number of suspicious features increased in thyroid nodules classified as benign or indeterminate on FNAB. In particular, the presence or absence of calcifications strongly affected malignancy risk, leading to a higher FNR in nodules with calcifications. Similar to our results, Yoon et al. [23] showed a greater prevalence of suspicious US features in malignant nodules ≥3 cm compared with those that were benign. In contrast, Wharry et al. [9] reported that the presence of suspicious US features did not distinguish malignant nodules from benign lesions ≥4 cm.
FNAB previously demonstrated reliable accuracy of 95% with sensitivity of 83%, specificity of 92%, PPV of 75%, and a FNR of 5% [24]. In our study, the diagnostic validity of FNAB consisted of accuracy of 81.9%, sensitivity of 94.4%, specificity of 62.8%, and PPV of 79.4%, and we found that these results varied with the presentation of different US features. FNAB sensitivity in nodules with microcalcification or macrocalcification was significantly higher (97.9%) than in nodules without calcification (87.8%). The PPV and accuracy of FNAB were also increased significantly when at least one suspicious US feature was present. Furthermore, as the number of those US features increased, the PPV and accuracy of FNAB also increased significantly, reaching 100% in nodules with all four suspicious US features. Consequently, we suggest that the US characteristics of thyroid nodules are useful in predicting malignancy, helping to overcome the limitations of FNAB, and selecting appropriate management strategies for thyroid nodules larger than 4 cm.
Of note, 88.1% of patients who were classified as benign on FNAB and underwent surgery were also benign on histopathology. Moreover, 74.6% of this subgroup of patients exhibited nodular hyperplasia on histopathology. It is clinically important to identify the most appropriate candidates for surgery to avoid unnecessary procedures. Based on this study, it appears reasonable for patients with thyroid nodules 4 cm or larger that are benign on FNAB to be considered candidates for conservative follow-up or other less invasive methods, such as radiofrequency ablation, if US does not reveal any suspicious malignant features. Conversely, if patients with thyroid nodules ≥4 cm have any suspicious malignant US features, surgery should be considered even when FNAB results are benign.
This study was limited by its retrospective nature. Not all FNABs were performed by the same doctor, and not all US examinations were performed by the same radiologist. However, an experienced pathologist reviewed the results using the Bethesda criteria and one endocrinologist who had 7 years of experience in thyroid US reviewed all US images in a blinded manner to ensure reliable and homogeneous classifications. The lack of consideration of the reasons leading to surgery, including symptoms, nodule growth, and changes on US, is another limitation. Finally, our results may not be generalizable because of patient selection bias. Our institute is a tertiary university hospital. Thus, it is possible that more patients with aggressive thyroid nodules seek care at our hospital. Despite these limitations, this study is meaningful because it provides the most current data available on the differential reliability of FNAB according to US features in thyroid nodules ≥4 cm.
In conclusion, the validity of FNAB differs according to US characteristics in thyroid nodules 4 cm and larger. These results indicate the need for an individualized approach in interpreting FNAB results and managing large thyroid nodules according to US characteristics.
ACKNOWLEDGMENTS
This work was supported by a grant from Samsung Biomedical Research Institute (SBRI CR 0113031).
Notes
No potential conflict of interest relevant to this article was reported.