Document Type

Article

Publication Date

1-14-2026

Abstract

The rapid deterioration of global infrastructure calls for precise, automated crack detection to support proactive maintenance. However, deep learning-based segmentation models are often limited by a scarcity of diverse, high-quality labeled datasets. This study proposes StyleSPADE, a conditional image generation model that combines semantic masks with style images to synthesize realistic crack data with diverse background textures while preserving precise geometric morphology. To validate the generated data, we ran semantic segmentation experiments using Transformer-based (Mask2Former, Swin-UPerNet) and CNN-based (K-Net) models. Experimental results show that models trained with StyleSPADE-based augmentation significantly outperform the baseline models, achieving a Crack IoU of 0.6376 and an F1-score of 0.7586. Furthermore, a Stacking Ensemble strategy that combines high-recall and high-precision models improved performance further, to a Crack IoU of 0.6452. These findings indicate that StyleSPADE effectively mitigates the data scarcity problem and makes crack detection more robust under complex environmental conditions. The framework contributes to more efficient and safer infrastructure management by enabling reliable damage assessment in data-limited environments.
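For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how a crack-class IoU and F1-score can be computed from a pair of binary masks. It is an illustration only: the function name `crack_iou_and_f1` and the toy masks are hypothetical, and the abstract does not state the paper's exact evaluation protocol (for example, whether scores are averaged per image or accumulated over the whole dataset), so this should not be read as the authors' implementation.

```python
def crack_iou_and_f1(pred, target):
    """Return (IoU, F1) for the crack class.

    pred and target are equal-shaped 2D lists of 0/1 values,
    where 1 marks a crack pixel (an assumed encoding).
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1      # crack predicted and present
            elif p and not t:
                fp += 1      # crack predicted but absent
            elif t and not p:
                fn += 1      # crack present but missed

    union = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    iou = tp / union if union else 1.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 1.0
    return iou, f1


# Toy 3x4 masks (hypothetical data, not from the paper).
pred = [[0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]
target = [[0, 1, 0, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]

iou, f1 = crack_iou_and_f1(pred, target)
print(f"Crack IoU = {iou:.4f}, F1 = {f1:.4f}")  # Crack IoU = 0.5000, F1 = 0.6667
```

Here two pixels are true positives, one is a false positive, and one is a false negative, so IoU = 2/4 and F1 = 4/6. Because thin cracks occupy few pixels, both metrics are sensitive to small boundary errors, which is one reason crack-class scores tend to be lower than scores for larger objects.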

Comments

This article was originally published in Applied Sciences, volume 16, issue 2, in 2026. https://doi.org/10.3390/app16020837

Copyright

The authors

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
