Title 強化深偽偵測:以統計方法辨識影像的圖像特徵
Enhancing Deepfake Detection: Statistical Analysis of Frame Features with Extension to Video
Author GAO, CHONG-ZHE (高崇哲)
Advisor YU, QING-XIANG (余清祥)
Keywords Deepfake videos; Dimensionality reduction; Data leakage; Data structuring; Inter-frame homogeneity
Date 2025
Uploaded 4-Aug-2025 15:11:34 (UTC+8)
Abstract The rapid advancement of artificial intelligence and deep learning has brought significant benefits and innovations. However, these technologies are also increasingly misused, particularly in the creation of deepfake media, which severely undermines the credibility of visual information. Most existing detection methods rely on deep learning models that achieve high accuracy but suffer from limited interpretability and substantial computational cost. This study presents a lightweight, interpretable statistical approach to deepfake detection that achieves competitive performance with fewer than 1% of the features typically used in deep learning models. Building upon the work of Chen (2023), we enhance both global and local texture representation by applying large-scale block-based gradient extraction combined with first-order differencing to suppress outliers. To address the angular discontinuity of the hue channel in the HSV color space, hue is transformed by sine and cosine decomposition (sin H and cos H). Beyond texture variation, we also exploit the distribution of texture types through two families of statistical features: (1) the Angular Second Moment (ASM) computed from gray-level co-occurrence matrices, and (2) summary statistics extracted from Histograms of Oriented Gradients (HOG). These features serve as inputs to statistical and machine-learning classifiers. To avoid data leakage, the data are split at the video level rather than the frame level, so that frames from the same video never appear in both the training and test sets; video-level decisions are then made by majority voting and median aggregation over frame-level predictions, reflecting practical deployment scenarios. Experiments on the Celeb-DF-v2 dataset, evaluated with 500 repetitions of cross-validation, show that the proposed method reaches a detection accuracy of 69.55% using only 31 features, an improvement of 4.91 percentage points over the baseline, demonstrating that the method combines good performance with interpretability.
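Several of the feature and aggregation ideas in the abstract can be sketched in a few lines. The snippet below is an illustrative sketch, not the thesis code: the function names, the number of gray levels, and the 0.5 voting threshold are assumptions. It shows (1) the sin/cos decomposition of the angular hue channel, (2) the Angular Second Moment of a gray-level co-occurrence matrix, and (3) video-level majority voting over frame-level predictions.

```python
import numpy as np

def hue_to_sin_cos(h_deg):
    """Map the angular H channel (in degrees) to (sin H, cos H),
    removing the artificial 0/360 discontinuity of the hue axis."""
    h = np.deg2rad(np.asarray(h_deg, dtype=float))
    return np.sin(h), np.cos(h)

def glcm_asm(gray, levels=8):
    """Angular Second Moment (energy) of the horizontal-neighbor
    co-occurrence matrix; `gray` is a 2-D int array in [0, levels)."""
    left, right = gray[:, :-1].ravel(), gray[:, 1:].ravel()
    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (left, right), 1.0)   # count co-occurring pairs
    p = glcm / glcm.sum()                 # normalize to probabilities
    return float((p ** 2).sum())          # ASM = sum of squared probs

def video_label(frame_preds, threshold=0.5):
    """Aggregate per-frame 0/1 predictions into one video-level label
    by majority vote."""
    return int(np.mean(frame_preds) >= threshold)
```

A uniform patch gives the maximum ASM of 1.0 (all co-occurrence mass in one cell), while noisier textures spread the mass and drive ASM down, which is why ASM can separate texture types. Splitting train/test by video before computing such features is what prevents the data leakage the abstract describes.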
References
[1] 陳慧霜 (2023). "Image Analysis and the Detection of Deepfake Videos" (影像分析與深偽影片的偵測, in Chinese), Master's thesis, Department of Statistics, National Chengchi University.
[2] Ahmed, N., Natarajan, T., & Rao, K. R. (1974). "Discrete Cosine Transform", IEEE Transactions on Computers, C-23(1), 90–93.
[3] Akiba, T., Sano, S., Yanase, T., Ohta, T., & Koyama, M. (2019). "Optuna: A Next-Generation Hyperparameter Optimization Framework", Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2623–2631.
[4] Amari, S. (1967). "A Theory of Adaptive Pattern Classifiers", IEEE Transactions on Electronic Computers, EC-16(3), 299–307.
[5] Bertasius, G., Wang, H., & Torresani, L. (2021). "Is Space-Time Attention All You Need for Video Understanding?", Proceedings of the 38th International Conference on Machine Learning (ICML).
[6] Blanz, V., & Vetter, T. (2023). "A Morphable Model for the Synthesis of 3D Faces", Seminal Graphics Papers: Pushing the Boundaries, Volume 2, 157–164.
[7] Breiman, L. (2001). "Random Forests", Machine Learning, 45(1), 5–32.
[8] Chen, T., & Guestrin, C. (2016). "XGBoost: A Scalable Tree Boosting System", Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794.
[9] Cortes, C., & Vapnik, V. (1995). "Support-Vector Networks", Machine Learning, 20, 273–297.
[10] Cover, T., & Hart, P. (1967). "Nearest Neighbor Pattern Classification", IEEE Transactions on Information Theory, 13(1), 21–27.
[11] Cox, D. R. (1958). "The Regression Analysis of Binary Sequences", Journal of the Royal Statistical Society, Series B, 20(2), 215–232.
[12] Dalal, N., & Triggs, B. (2005). "Histograms of Oriented Gradients for Human Detection", Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 1, 886–893.
[13] Dolhansky, B., Bitton, J., Pflaum, B., Lu, J., Howes, R., Wang, M., & Ferrer, C. C. (2020). "The Deepfake Detection Challenge (DFDC) Dataset", arXiv preprint arXiv:2006.07397.
[14] Gabor, D. (1946). "Theory of Communication. Part 1: The Analysis of Information", Journal of the Institution of Electrical Engineers – Part III: Radio and Communication Engineering, 93(26), 429–441.
[15] Haralick, R. M., Shanmugam, K., & Dinstein, I. H. (1973). "Textural Features for Image Classification", IEEE Transactions on Systems, Man, and Cybernetics, SMC-3(6), 610–621.
[16] Horn, B. K. P., & Schunck, B. G. (1981). "Determining Optical Flow", Artificial Intelligence, 17(1–3), 185–203.
[17] Kim, H., Garrido, P., Tewari, A., Xu, W., Thies, J., Niessner, M., ... & Theobalt, C. (2018). "Deep Video Portraits", ACM Transactions on Graphics (TOG), 37(4), 1–14.
[18] LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). "Gradient-Based Learning Applied to Document Recognition", Proceedings of the IEEE, 86(11), 2278–2324.
[19] Li, L., Bao, J., Zhang, T., Yang, H., Chen, D., Wen, F., & Guo, B. (2020). "Face X-Ray for More General Face Forgery Detection", Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 5001–5010.
[20] Li, Y., Yang, X., Sun, P., Qi, H., & Lyu, S. (2020). "Celeb-DF: A Large-Scale Challenging Dataset for Deepfake Forensics", Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Liu, Y., Zhang, K., Li, Y., Yan, Z., Gao, C., Chen, R., ... & Sun, L. (2024). "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models", arXiv preprint arXiv:2402.17177.
[22] Matern, F., Riess, C., & Stamminger, M. (2019). "Exploiting Visual Artifacts to Expose Deepfakes and Face Manipulations", 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW), 83–92.
[23] Nirkin, Y., Keller, Y., & Hassner, T. (2019). "FSGAN: Subject Agnostic Face Swapping and Reenactment", Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 7184–7193.
[24] Pérez, P., Gangnet, M., & Blake, A. (2023). "Poisson Image Editing", Seminal Graphics Papers: Pushing the Boundaries, Volume 2, 577–582.
[25] Polyak, A., Zohar, A., Brown, A., Tjandra, A., Sinha, A., Lee, A., ... & Du, Y. (2024). "Movie Gen: A Cast of Media Foundation Models", arXiv preprint arXiv:2410.13720.
[26] Rossler, A., Cozzolino, D., Verdoliva, L., Riess, C., Thies, J., & Nießner, M. (2019). "FaceForensics++: Learning to Detect Manipulated Facial Images", Proceedings of the IEEE/CVF International Conference on Computer Vision, 1–11.
[27] Siarohin, A., Lathuilière, S., Tulyakov, S., Ricci, E., & Sebe, N. (2019). "First Order Motion Model for Image Animation", Advances in Neural Information Processing Systems, 32.
[28] Tolosana, R., Vera-Rodriguez, R., Fierrez, J., Morales, A., & Ortega-Garcia, J. (2020). "Deepfakes and Beyond: A Survey of Face Manipulation and Fake Detection", Information Fusion, 64, 131–148.
[29] Wiles, O., Koepke, A., & Zisserman, A. (2018). "X2Face: A Network for Controlling Face Generation Using Images, Audio, and Pose Codes", Proceedings of the European Conference on Computer Vision (ECCV), 670–686.
[30] Yang, X., Li, Y., & Lyu, S. (2019). "Exposing Deep Fakes Using Inconsistent Head Poses", ICASSP 2019 – IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 8261–8265.
[31] Zhao, H., Zhou, W., Chen, D., Wei, T., Zhang, W., & Yu, N. (2021). "Multi-Attentional Deepfake Detection", Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2185–2194.
[32] Zhang, K., Zhang, Z., Li, Z., & Qiao, Y. (2016). "Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks", IEEE Signal Processing Letters, 23(10), 1499–1503.
[33] Zhao, T., Xu, X., Xu, M., Ding, H., Xiong, Y., & Xia, W. (2021). "Learning Self-Consistency for Deepfake Detection", Proceedings of the IEEE/CVF International Conference on Computer Vision, 15023–15033.
Description Master's thesis, Department of Statistics, National Chengchi University (112354020)
Source http://thesis.lib.nccu.edu.tw/record/#G0112354020
Type thesis
Identifier G0112354020
URI https://nccur.lib.nccu.edu.tw/handle/140.119/158714
Table of Contents
Chapter 1 Introduction 1
  1.1 Research Motivation 1
  1.2 Research Objectives 3
Chapter 2 Literature Review and Data Description 5
  2.1 Literature Review 5
  2.2 Data Description 9
Chapter 3 Methodology 12
  3.1 Extending Detection from Frames to Videos 12
  3.2 The Challenge of Video Data Leakage 14
  3.3 Structuring Methods 16
  3.4 Color Spaces 24
  3.5 Classification Models 26
Chapter 4 Exploratory Data Analysis 33
Chapter 5 Confirmatory Data Analysis 44
  5.1 Variable Selection 44
  5.2 Extending Frame-Level Results to Videos 48
  5.3 Method Comparison 51
  5.4 Data Dependency 56
Chapter 6 Conclusions and Suggestions 58
  6.1 Conclusions 58
  6.2 Discussion and Suggestions 60
References 62
Format application/pdf, 3995317 bytes