Researchers introduce a multistage dual-branch network to improve accuracy and efficiency
A research team from the Wuhan University of Science and Technology has developed a novel multistage dual-branch image projection network for OCT-based segmentation that enhances the accuracy of detecting and mapping retinal geographic atrophy and the foveal avascular zone. (Image credit: AdobeStock/suphamit)
A research team from China has proposed a novel optical coherence tomography (OCT) retinal geographic atrophy (GA) segmentation method that uses a multistage, 2-branch network structure. According to coauthors Xiaoming Liu, PhD, and Jieyang Li, PhD, from the Wuhan University of Science and Technology in Wuhan, China, the method proved more effective than other methods for both the GA segmentation task and foveal avascular zone (FAZ) segmentation.
Their multistage dual-branch image projection network (DIPN) learns feature information from B-scan images to assist GA segmentation through the introduction of additional components: a convolutional long short-term memory network (ConvLSTM), a projection attention module, an adaptive pooling module, and a contrastive learning enhancement (CLE) module, according to the researchers, who reported their work in Scientific Reports.1
The challenge with other existing GA segmentation methods is that they use only the 3-dimensional (3D) volume data, ignoring the fact that a large number of individual B-scan images also contain lesion information, the researchers explained.
There is an increasing prevalence of GA worldwide with the aging of the population,2 and lesion development and enlargement can result in irreversible loss of visual function, thus underscoring the importance of accurate segmentation of the lesion area for preventing progression and guiding subsequent treatment.3
OCT is a valuable, noninvasive, rapid biomedical imaging technology.4 It can image biologic tissue at the micron level and generate high-resolution, 3D, cross-sectional images, which are widely used in clinical ophthalmology and are vital to diagnosing and monitoring retinal diseases.5-9
Previous efforts at GA segmentation have relied on traditional methods. The authors cited a study10 that created OCT projection images by applying constrained subvolume projection to 3D OCT data, another11 that used U-Net, a deep-learning network, to automatically segment GA lesions, and a third12 that used U-Net and Y-Net to automatically segment GA lesions on fundus autofluorescence images. However, the authors pointed out that those studies used only the features of the en face images and did not exploit the spatial information in the volumetric data.
In response, the new method uses a 2D network framework while incorporating ConvLSTM to capture the information shared between adjacent slices of the volumetric data. Because the low contrast of edge pixels can cause mis-segmentation at GA edges, where the network finds such samples difficult to classify, the researchers introduced a projection attention module to focus the network’s attention on the projection direction and capture contextual relationships. And because the unidirectional pooling operation used by current projection networks to achieve feature projection ignores multiscale features and channel information, they added an adaptive pooling module that reduces feature dimensions while grasping multiscale features and channel information. Finally, they introduced a CLE module to mitigate the effect of image contrast on the network’s segmentation performance.
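To make the projection attention idea concrete, the sketch below implements column-wise self-attention along the projection (A-scan) direction of a B-scan feature map in PyTorch. It is a minimal interpretation of the published description; the class name, tensor shapes, and reduction factor are illustrative assumptions, not the authors’ code.

```python
import torch
import torch.nn as nn

class ProjectionAttention(nn.Module):
    """Illustrative column-wise self-attention along the projection (A-scan)
    direction of a B-scan feature map. Hypothetical sketch; the paper's
    exact module may differ."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual weight

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Treat each image column (fixed width index) as an independent
        # sequence of h pixels along the projection direction.
        q = self.query(x).permute(0, 3, 2, 1).reshape(b * w, h, -1)  # (b*w, h, c')
        k = self.key(x).permute(0, 3, 1, 2).reshape(b * w, -1, h)    # (b*w, c', h)
        v = self.value(x).permute(0, 3, 2, 1).reshape(b * w, h, c)   # (b*w, h, c)
        # Affinity between every pair of pixels within the same column.
        attn = torch.softmax(torch.bmm(q, k), dim=-1)                # (b*w, h, h)
        out = torch.bmm(attn, v)                                     # (b*w, h, c)
        out = out.reshape(b, w, h, c).permute(0, 3, 2, 1)            # (b, c, h, w)
        return self.gamma * out + x

# Example: a batch of 2 feature maps from 256 x 64 B-scan crops
feats = torch.randn(2, 32, 256, 64)
print(ProjectionAttention(32)(feats).shape)  # torch.Size([2, 32, 256, 64])
```

The residual weight starts at zero, so the module initially passes features through unchanged and gradually learns how much column-wise context to mix in, a common design choice in attention modules of this kind.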
They summarized their efforts as follows: “Specifically, we proposed a multi-stage DIPN that can obtain pretraining weights using many B-scan images during the pretraining stage. In addition, inspired by Liu et al,13 we proposed using a projection attention module to integrate long-range dependencies by calculating the affinity between 2 different pixels on each projection column in the B-scan. An adaptive pooling module focused on the channels while extracting and fusing multi-scale features, thus effectively improving the feature utilization. Finally, to ensure that the spatial information in the volumetric data is fully utilized during the segmentation process, we incorporated ConvLSTM to capture the neighborhood information between images in the fine-tuning stage. Utilizing a contrastive learning module enhanced the network’s ability to distinguish boundary features.”
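The ConvLSTM component they describe can be pictured as a recurrent cell whose gates are convolutions, run across neighboring B-scans so that each slice’s features absorb context from its neighbors in the volume. Below is a minimal, self-contained sketch under that assumption; the channel counts, kernel size, and sequence loop are illustrative, not the paper’s exact configuration.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal ConvLSTM cell (in the style of Shi et al., 2015); a sketch of
    how adjacent B-scan features could be linked, not the authors' code."""

    def __init__(self, in_ch: int, hid_ch: int, k: int = 3):
        super().__init__()
        self.hid_ch = hid_ch
        # One convolution produces all 4 gates (input, forget, output, cell).
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

# Run the cell over a sequence of per-slice feature maps so each slice's
# representation carries context from its neighbours in the OCT volume.
cell = ConvLSTMCell(in_ch=32, hid_ch=32)
slices = torch.randn(8, 1, 32, 64, 64)   # 8 adjacent B-scan feature maps
h = torch.zeros(1, 32, 64, 64)
c = torch.zeros(1, 32, 64, 64)
outputs = []
for t in range(slices.shape[0]):
    h, c = cell(slices[t], (h, c))
    outputs.append(h)
```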
To validate the effectiveness of their proposed method, the researchers conducted experiments on 2 data sets. They explained, “The first was a retinal geographic atrophy data set containing 44 OCT volumes and 2823 GA B-scan images that were used in the pretraining phase. To explore the cross-domain generalizability of our method,14 the second data set is the public data set OCTA500,15,16 that included 3D FAZ segmentation labels and retinal vessel segmentation labels.”
They reported the success of their method as follows: “Our network effectively combines the incorporated components, using a large number of individual B-scan images to pretrain the network, and experimental validation on 2 data sets demonstrates the soundness and effectiveness of our approach. The segmentation results show that our method is more effective than other methods in the GA segmentation task and the FAZ segmentation task.”
The researchers plan to continue fine-tuning the method to reduce labeling time while preserving segmentation quality, and to collect larger retinal OCT data sets.