ST-VFD: Spatio-Temporal Video Forgery Detection Using Multi-Scale Convolutional Neural Networks
Abstract
Video forgery detection is an important task in digital forensics, especially now that forgeries are becoming more advanced and video formats more sophisticated. This work proposes a deep-learning framework that detects both global and local forgeries by exploiting spatio-temporal inconsistencies, including in low-resolution or highly compressed video. The framework combines a Multi-Scale CNN (MS-CNN), Motion Aware Temporal Modeling (MAT), and a Spatio-Temporal Attention (SAT) mechanism, which allows it to handle videos of varying quality. By combining optical flow analysis with deep multi-scale spatial features, the system captures rich spatial detail and irregular motion between frames, and it improves accuracy in detecting tampered content without requiring region-level annotations. The framework was evaluated on FaceForensics++ and a customized Kaggle dataset, where it attained 97.9% training accuracy and 94.5% validation accuracy at the frame level. It is also efficient, with an average processing time of only 0.06 seconds per frame. The work addresses binary forgery detection in video and aims to provide a generalizable, real-world-ready detection pipeline that also lays the groundwork for future work on forgery localization and forgery type classification.
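To make the idea of attending to irregular motion more concrete, the following is a minimal sketch of how per-frame forgery scores might be pooled into one video-level decision, with frames that show abrupt motion changes weighted more heavily. It is an illustration under stated assumptions and not the framework's actual SAT or MAT implementation: the function names, the use of a single motion magnitude per frame as a stand-in for optical flow, the softmax weighting, and the 0.5 threshold are all hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def motion_irregularity(motion):
    """Absolute change in motion magnitude between consecutive frames.

    `motion` holds one hypothetical motion magnitude per frame (for
    example, a mean optical-flow norm). The first frame has no
    predecessor, so its irregularity is set to 0.
    """
    return [0.0] + [abs(b - a) for a, b in zip(motion, motion[1:])]


def video_score(frame_scores, motion, temperature=1.0):
    """Attention-weighted average of per-frame forgery scores.

    Frames with larger motion irregularity receive larger weights.
    This weighting scheme is an assumption made for illustration.
    """
    irregularity = motion_irregularity(motion)
    weights = softmax([v / temperature for v in irregularity])
    return sum(w * s for w, s in zip(weights, frame_scores))


def is_forged(frame_scores, motion, threshold=0.5):
    """Binary video-level decision; the threshold is an assumed value."""
    return video_score(frame_scores, motion) >= threshold
```

With constant motion every frame gets the same weight, so the video score reduces to the plain mean of the frame scores. When motion changes abruptly, the frames at and after the change carry more weight than the rest.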