SCAR-Net: Spine Segmentation in MRI based on Cross Attention and Recognition-assisted Label Fusion
DOI: https://doi.org/10.53469/wjimt.2025.08(02).05
Keywords: NA
Abstract
Segmenting multiple vertebrae and intervertebral discs in magnetic resonance imaging (MRI) plays a crucial role in diagnosing and treating spinal disorders. However, the inherent complexity of the spine, together with the need to balance inter-class similarity against intra-class variation, complicates the task. Improving the generalization ability, learning rate, and accuracy of spine segmentation also remains difficult. To address these challenges, this paper proposes SCAR-Net, a spine segmentation method based on cross attention and recognition-assisted label fusion. The approach introduces a multi-channel cross attention (MCCA) mechanism that generates a comprehensive description of the spine by fusing inter-class and intra-class features. In addition, a key-points recognition-assisted learner (KRAL) is designed that incorporates mixed-supervision recognition-assisted label fusion (RALF) to reduce reliance on a single dataset and improve network generalization. Experiments on T2-weighted volumetric MRI datasets show that SCAR-Net achieves a mean Dice similarity coefficient (DSC) of 96.12% for 5 vertebral bodies and 95.07% for 5 intervertebral discs. These results indicate that the proposed method is effective for both the localization and segmentation of intervertebral discs in spine MRI.
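For readers unfamiliar with the reported metric, the following is a minimal sketch of how a per-structure and mean Dice similarity coefficient can be computed from two label maps. It is not the authors' evaluation code. The function names, the label numbering, and the convention that a label absent from both maps scores 1.0 are assumptions made for illustration.

```python
from typing import Dict, Iterable, Sequence


def dice_per_label(pred: Sequence[int], truth: Sequence[int],
                   labels: Iterable[int]) -> Dict[int, float]:
    """Return DSC = 2|P ∩ T| / (|P| + |T|) for each requested label.

    pred and truth are flattened label maps of equal length, with one
    integer label per voxel (0 is assumed to be background).
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of voxels")
    scores = {}
    for label in labels:
        p = sum(1 for v in pred if v == label)    # voxels predicted as label
        t = sum(1 for v in truth if v == label)   # voxels annotated as label
        overlap = sum(1 for a, b in zip(pred, truth)
                      if a == label and b == label)
        # Assumed convention: a label absent from both maps scores 1.0.
        scores[label] = 1.0 if p + t == 0 else 2.0 * overlap / (p + t)
    return scores


def mean_dice(scores: Dict[int, float]) -> float:
    """Average the per-label scores (unweighted mean over structures)."""
    return sum(scores.values()) / len(scores)


# Toy data with hypothetical labels: 1 = one vertebral body, 2 = one disc.
truth = [0, 1, 1, 1, 2, 2, 0, 0]
pred = [0, 1, 1, 0, 2, 2, 2, 0]
scores = dice_per_label(pred, truth, labels=[1, 2])
print(scores, mean_dice(scores))
```

In this toy case each structure overlaps on 2 voxels out of 5 counted across prediction and annotation, so both scores and their mean come out at 0.8. A figure such as the mean DSC over 5 vertebral bodies would be an average of this kind over the corresponding labels, though whether the paper averages per structure, per subject, or both is not stated in the abstract.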