Memory-efficient 2.5D convolutional transformer networks for multi-modal deformable registration with weak label supervision applied to whole-heart CT and MRI scans
Purpose Despite its potential for improvements through supervision, deep learning-based registration approaches are difficult to train for large deformations in 3D scans due to excessive memory requirements. Methods We propose a new 2.5D convolutional transformer architecture that enables us to learn a memory-efficient weakly supervised deep learning model for multi-modal image registration. Furthermore, we firstly integrate a volume change control term into the loss function of a deep learning-based registration method to penalize occurring foldings inside the deformation field. Results Our approach succeeds at learning large deformations across multi-modal images. We evaluate our approach on 100 pair-wise registrations of CT and MRI whole-heart scans and demonstrate considerably higher Dice Scores (of 0.74) compared to a state-of-the-art unsupervised discrete registration framework (deeds with Dice of 0.71). Conclusion Our proposed memory-efficient registration method performs better than state-of-the-art conventional registration methods. By using a volume change control term in the loss function, the number of occurring foldings can be considerably reduced on new registration cases.