Proceedings of the Southwest State University

ISSN 2223-1560, 2686-6757

Southwest State University (SWSU)

10.21869/2223-1560-2025-29-4-53-69

Research Article

COMPUTER SCIENCE, COMPUTER ENGINEERING AND CONTROL

Hybrid two-level method for automatic detection of face substitution in an image

M. D. Haleev

Mikhail D. Haleev, Cand. of Sci. (Engineering), Junior Research Fellow

18, Korpusnaya str., St. Petersburg 199178

Haleev.M@iias.spb.su

St. Petersburg Federal Research Center of the Russian Academy of Sciences

2025; 29(4): 53-69

08.01.2026

© Haleev M.D., 2026

This work is licensed under a Creative Commons Attribution 4.0 License.

https://izvestswsu.elpub.ru/jour/article/view/1516

Purpose of research

To develop a hybrid two-level method that improves both the accuracy and the robustness of detecting operator face substitution in images, a pressing task given the steady growth and increasing sophistication of deepfake threats.

Methods

An architecture is proposed that combines the EfficientNet convolutional neural network, used to extract deep patterns, with an ensemble of four classifiers. These classifiers target distinct feature groups (expert-based, textural, statistical, and facial-landmark coordinates), which makes it possible to detect specific synthesis artifacts. For training and testing, a combined dataset of 34,000 images was compiled, including deepfakes generated by several modern tools as well as public datasets.
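As an illustration only, the second-level combination of the network output with the four feature-based classifiers can be sketched as a late fusion of per-model probabilities. The abstract does not state the fusion rule, so the weighted average, the 0.5 threshold, the scorer names, and the example scores below are all assumptions rather than the author's implementation.

```python
from typing import Dict, Optional, Tuple

# Hypothetical labels for the five first-level scorers: the CNN plus the
# four feature groups named in the abstract.
SCORERS = ("cnn", "expert", "texture", "statistical", "landmarks")


def fuse_scores(
    scores: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
    threshold: float = 0.5,
) -> Tuple[float, bool]:
    """Combine per-model 'fake' probabilities into a single decision.

    Assumption: a weighted average of probabilities. The source does not
    specify how the two levels are actually combined.
    """
    if set(scores) != set(SCORERS):
        raise ValueError(f"expected exactly these scorers: {SCORERS}")
    if weights is None:
        # Equal weights unless the caller supplies tuned ones.
        weights = {name: 1.0 for name in SCORERS}
    total = sum(weights[name] for name in SCORERS)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    fused = sum(weights[name] * scores[name] for name in SCORERS) / total
    return fused, fused >= threshold


# Made-up scores for one image, for demonstration only.
example = {
    "cnn": 0.9,
    "expert": 0.4,
    "texture": 0.7,
    "statistical": 0.6,
    "landmarks": 0.8,
}
fused, is_fake = fuse_scores(example)
print(round(fused, 2), is_fake)
```

In this sketch a single low-scoring model (here the hypothetical "expert" scorer) is outvoted by the others, which is the kind of compensation between classifiers the abstract describes.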

Results

Experiments confirmed the effectiveness of the proposed method: accuracy reached 0.921 and the F1-score 0.914. Both values significantly exceed those of each constituent model used on its own, which indicates a practically significant synergistic effect from combining them.
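For reference, the two reported metrics follow from a binary confusion matrix in the standard way. The helper below is a minimal sketch; the function name and the counts in the example are illustrative and are not taken from the study.

```python
from typing import Tuple


def accuracy_and_f1(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (accuracy, F1) for the positive class from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Precision and recall fall back to 0.0 when their denominators are zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Illustrative counts only (not the paper's data).
acc, f1 = accuracy_and_f1(tp=80, fp=10, tn=90, fn=20)
print(f"accuracy={acc:.3f}, f1={f1:.3f}")
```

Accuracy counts all correct decisions, while F1 balances precision and recall on the fake class, which is why the two reported values can differ.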

Conclusion

The work shows that combining deep learning with classical feature-based models yields a more reliable and accurate detector. The proposed method raises overall accuracy and robustness by compensating for the weaknesses of the individual classifiers. This supports the hypothesis that a neural network's ability to extract complex, implicit patterns complements the ability of feature-based models to analyze specific, previously known artifacts (for example, geometric distortions).

Keywords: face swapping; computer vision; artificial intelligence; deep learning; deepfake; information security

This research was carried out under state budget project FFZF-2025-0003.

The author declares no conflict of interest.