Publications-Theses
Title: 使用圖像和深度學習了解社交互動 (Understanding Social Interaction Using Images and Deep Learning)
Author: Abdeo, Fatma Said Abousaleh (艾費瑪)
Contributors: Tsao, Yu (曹昱); Yu, Neng-Hao (余能豪); Fatma Said Abousaleh Abdeo (艾費瑪)
Keywords: Social Network (社交); Deep Learning (深度學習)
Date: 2021
Uploaded: 2-Mar-2021 15:02:02 (UTC+8)
Abstract:
Human beings generally interact with each other easily and without obvious effort, and social signals are a natural product of this effective communication. Giving computers an equivalent capability, so that they can analyze and understand social interactions and properly represent human social signals, remains one of the greatest scientific challenges in the field of social signal processing (SSP). Social interactions can take place in two different ways: face-to-face or cyber. In face-to-face interactions, people commonly rely on observable nonverbal behavioral cues (e.g., gestures, facial expressions, vocalizations, postures, and interpersonal distance) to understand the social signals and behavior of others and to interact with them. Recognizing social interactions from face images has recently received significant attention from the research community, because facial images carry a variety of traits that can convey information about an individual’s age, gender, emotions, and physical health. This information is known to play a key role both in describing individuals and in social communication. In particular, age is one of the most fundamental attributes affecting our daily social interactions. Automatic age estimation from face images has therefore become a significant task in numerous applications of artificial intelligence. Despite major advances in recent years, it remains a challenging problem because facial appearance varies widely with factors such as genetic traits, lifestyle, facial expressions, and aging. Cyber interactions, on the other hand, concern how users interact with each other through social media websites such as Facebook, Twitter, Instagram, and Flickr.
Most social networks allow users to create and share content and to interact with content generated by other users in different ways (e.g., by viewing, liking, or commenting). This results in massive amounts of social content that provide information about users’ interests, opinions, daily activities, and interactions. Because of the explosive growth of social media content and the interactive online behavior of users, only a limited amount of social media content attracts a great deal of user attention and becomes popular, while the vast majority is completely ignored. Among the different types of user-generated content on social media, images have become an important medium for communication between users, and they vary widely in the number of views they receive, that is, in their social popularity. This phenomenon has attracted researchers from the computer vision and multimedia domains to explore why certain photos are considered popular and how to predict their popularity automatically. However, it is still difficult to measure, predict, or even define image popularity on social media, because it depends on users’ preferences and on many other factors that affect users’ social interactions on social media websites. To this end, this dissertation proposes a framework for understanding social interaction in the real and online worlds to address these challenges.

First, this dissertation addresses the problem of automatic age estimation from facial images. Conventional methods for facial age estimation normally determine a person’s age directly from his or her facial image by analyzing facial information (e.g., the nose, mouth, and eyes). This means that only the input image is used to estimate the person’s age. However, telling someone’s precise age at a glance without any reference information is a challenging task even for humans.
To address this problem, and inspired by human cognitive processes, this dissertation proposes a comparative deep learning framework that estimates age from a facial image by comparing the input image with a set of selected reference images (labeled baseline samples) to determine whether the input face is younger or older than each baseline sample. A specific deep learning architecture, namely a region-convolutional neural network (R-CNN), is used to extract facial information from both the input image and the baseline samples. Then, an energy function is used to aggregate the information extracted from the fully connected layer in order to estimate the age comparisons. This results in a set of hints, each of which represents a comparative relationship (younger or older). Finally, the estimation stage aggregates all the hints and counts the votes they cast for each age label; the person’s age is estimated as the label that receives the most votes. The experimental results on the FG-NET, MORPH, and IoG databases demonstrate that the proposed model outperforms the state-of-the-art methods, with relative improvements of 13.24% (on FG-NET) and 23.20% (on MORPH) in terms of mean absolute error, and 4.74% (on IoG) in terms of age group classification accuracy.

Second, this dissertation addresses the problem of image popularity prediction on social media websites. With the growing number of social networks such as Flickr and Facebook, users often interact with each other by sharing photos of their daily lives. Although billions of images are uploaded to the internet every minute, only a few of them receive millions of views and become popular, while the others are completely ignored. Even different images posted by the same user receive different numbers of views.
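Returning to the age-estimation stage described above, the hint-and-vote step can be illustrated with a small Python sketch. It is not the dissertation's implementation: the function name `estimate_age_by_voting`, the rule that an "older" hint votes for every label above the baseline age (and a "younger" hint for every label below it), and the median tie-break are all assumptions made for illustration. In the actual framework the hints come from the R-CNN features and the energy function; here they are supplied by hand.

```python
def estimate_age_by_voting(hints, age_labels):
    """Estimate an age from comparative hints by majority voting.

    hints: list of (baseline_age, relation) pairs, where relation is
           "older" or "younger" (the input face relative to the baseline).
    age_labels: sorted list of candidate age labels.

    Assumed voting rule (hypothetical, for illustration only):
    an "older" hint votes for every label above the baseline age,
    a "younger" hint votes for every label below it.
    """
    votes = {label: 0 for label in age_labels}
    for baseline_age, relation in hints:
        for label in age_labels:
            if relation == "older" and label > baseline_age:
                votes[label] += 1
            elif relation == "younger" and label < baseline_age:
                votes[label] += 1

    # Collect the labels with the highest vote count.
    best = max(votes.values())
    winners = [label for label in age_labels if votes[label] == best]
    # Assumed tie-break: take the middle of the tied labels.
    return winners[len(winners) // 2], votes


# Toy hints: the input looks older than baselines aged 25 and 28,
# and younger than baselines aged 35 and 32.
toy_hints = [(25, "older"), (28, "older"), (35, "younger"), (32, "younger")]
toy_labels = list(range(20, 41))
estimated_age, vote_table = estimate_age_by_voting(toy_hints, toy_labels)
print(estimated_age)  # 30: labels 29, 30, and 31 tie with four votes each
```

With these toy hints, only the labels between the oldest "older" baseline and the youngest "younger" baseline collect a vote from every hint, which is why the estimate falls inside that interval.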
This raises the problem of image popularity prediction, which has become a key challenge in social media analytics because it offers opportunities to reveal individual preferences and public attention. However, it remains challenging to identify the crucial factors that influence image popularity and to model and predict the evolution of image popularity on social media. To this end, this dissertation proposes a multimodal deep learning model that predicts the popularity of images on social media by using various types of visual and social features associated with image popularity. The proposed model uses two dedicated CNNs to learn high-level representations separately from the input features and then merges them into a unified network for popularity prediction. The performance of the model was evaluated through a series of experiments on a real-world dataset from Flickr. The evaluation results reveal that the proposed prediction model outperforms four traditional machine learning schemes, two CNN-based models, and other state-of-the-art methods, with relative performance improvements of more than 2.33%, 7.59%, and 14.16% in terms of the Spearman rank correlation coefficient, mean absolute error, and mean squared error, respectively.
In ICWSM, 2011.[146] Niyati Aggrawal, Archit Ahluwalia, Prashi Khurana, and Anuja Arora. Brand analysis framework for online marketing: ranking web pages and analyzing popularity of brands on social media. Social Network Analysis and Mining, 7(1):21, 2017.[147] Marcos André Gonçalves, Jussara M Almeida, Luiz GP dos Santos, Alberto HF Laender, and Virgílio Almeida. On popularity in the blogosphere. IEEE Internet Computing, 14(3):42–49, 2010.[148] Aboul-Ella Hassanien and Ajith Abraham. Computational Intelligence in Multimedia Processing: Recent Advances, volume 96. Springer, 2008.[149] Marko Heikkila and Matti Pietikainen. A texture-based method for modeling the background and detecting moving objects. IEEE transactions on pattern analysis and machine intelligence, 28(4):657–662, 2006.[150] Timo Ojala, Matti Pietikainen, and Topi Maenpaa. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on pattern analysis and machine intelligence, 24(7):971–987, 2002.[151] Di Huang, Caifeng Shan, Mohsen Ardabilian, Yunhong Wang, and Liming Chen. Local binary patterns and its application to facial image analysis: a survey. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 41(6):765–781, 2011.[152] Aude Oliva and Antonio Torralba. Modeling the shape of the scene: A holistic representation of the spatial envelope. International journal of computer vision, 42(3):145–175, 2001.[153] Yiwen Luo and Xiaoou Tang. Photo and video quality evaluation: Focusing on the subject. In European Conference on Computer Vision, pages 386–399. Springer, 2008.[154] Yan Ke, Xiaoou Tang, and Feng Jing. The design of high-level features for photo quality assessment. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), volume 1, pages 419–426. IEEE, 2006.[155] Xiaoou Tang, Wei Luo, and Xiaogang Wang. 
Content-based photo quality assessment.IEEE Transactions on Multimedia, 15(8):1930–1943, 2013.[156] Xiaoqiao Chen, Qingyi Zhang, Manhui Lin, Guangyi Yang, and Chu He. No-reference color image quality assessment: from entropy to perceptual quality. EURASIP Journal on Image and Video Processing, 2019(1):77, 2019.[157] Congcong Li, Alexander C Loui, and Tsuhan Chen. Towards aesthetics: A photo quality assessment and photo selection system. In Proceedings of the 18th ACM international conference on Multimedia, pages 827–830. ACM, 2010.[158] https://keras.io/.[159] Charles Spearman. The proof and measurement of association between two things.The American journal of psychology, 15(1):72–101, 1904.[160] Ian T Jolliffe and Jorge Cadima. Principal component analysis: a review and recent developments. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374(2065):20150202, 2016.[161] JVN Lakshmi. Stochastic gradient descent using linear regression with python. Inter- national Journal of Advanced Engineering Research and Applications, 2(8):519–525, 2016.[162] Nello Cristianini, John Shawe-Taylor, et al. An introduction to support vector machines and other kernel-based learning methods. Cambridge university press, 2000.[163] Martin Hofmann. Support vector machines-kernels and the kernel trick. Notes, 26(3):1–16, 2006.[164] Leo Breiman. Classification and regression trees. Routledge, 2017.[165] Souhaib Ben Taieb and Rob J Hyndman. A gradient boosting approach to the kaggle load forecasting competition. International journal of forecasting, 30(2):382–394, 2014.[166] Tianqi Chen and Carlos Guestrin. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining, pages 785–794. 
ACM, 2016.[167] http://scikit-learn.org/stable/.[168] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. Scikit-learn: Machine learning in python. Journal of machine learning research, 12(Oct):2825–2830, 2011.[169] https://social-media-prediction.github.io/MM17PredictionChallenge/index.html.[170] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization.arXiv preprint arXiv:1412.6980, 2014.[171] Meeyoung Cha, Haewoon Kwak, Pablo Rodriguez, Yong-Yeol Ahn, and Sue Moon. I tube, you tube, everybody tubes: analyzing the world’s largest user generated content video system. In Proceedings of the 7th ACM SIGCOMM conference on Internet measurement, pages 1–14, 2007.[172] Jinna Lv, Wu Liu, Meng Zhang, He Gong, Bin Wu, and Huadong Ma. Multi-feature fusion for predicting social media popularity. In Proceedings of the 25th ACM international conference on Multimedia, pages 1883–1888, 2017.[173] Xiaowen Huang, Yuqi Gao, Quan Fang, Jitao Sang, and Changsheng Xu. Towards SMP challenge: stacking of diverse models for social image popularity prediction. In Proceedings of the 25th ACM international conference on Multimedia, pages 1895–1900, 2017.[174] Omkar M Parkhi, Andrea Vedaldi, and Andrew Zisserman. Deep face recognition. 2015.[175] Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan. Show and tell: Lessons learned from the 2015 MSCOCO image captioning challenge. IEEE transactions on pattern analysis and machine intelligence, 39(4):652–663, 2016.[176] Baoyuan Wu, Weidong Chen, Peng Sun, Wei Liu, Bernard Ghanem, and Siwei Lyu. Tagging like humans: Diverse and distinct image annotation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7967–7975, 2018. 描述 博士
國立政治大學
社群網路與人智計算國際研究生博士學位學程(TIGP)
103761506
資料來源 http://thesis.lib.nccu.edu.tw/record/#G0103761506
資料類型 thesis
dc.contributor.advisor 曹昱; 余能豪 zh_TW
dc.contributor.advisor Tsao, Yu; Yu, Neng-Hao en_US
dc.contributor.author (Authors) 艾費瑪 zh_TW
dc.contributor.author (Authors) Fatma Said Abousaleh Abdeo en_US
dc.creator (作者) 艾費瑪 zh_TW
dc.creator (作者) Abdeo, Fatma Said Abousaleh en_US
dc.date (日期) 2021 en_US
dc.date.accessioned 2-Mar-2021 15:02:02 (UTC+8) -
dc.date.available 2-Mar-2021 15:02:02 (UTC+8) -
dc.date.issued (上傳時間) 2-Mar-2021 15:02:02 (UTC+8) -
dc.identifier (Other Identifiers) G0103761506 en_US
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/134227 -
dc.description (描述) 博士 zh_TW
dc.description (描述) 國立政治大學 zh_TW
dc.description (描述) 社群網路與人智計算國際研究生博士學位學程(TIGP) zh_TW
dc.description (描述) 103761506 zh_TW
dc.description.abstract (摘要) Human beings can generally interact with each other without any obvious effort, and social signals are the natural result of this effective communication. Providing computers with an equivalent capability, so that they can analyze and understand social interactions and properly represent human social signals, remains one of the greatest scientific challenges in the field of social signal processing (SSP). Social interactions can take place in two different ways: face-to-face or cyber. In face-to-face interactions, people commonly use observable nonverbal behavioral cues (e.g., gestures, facial expressions, vocalizations, postures, and interpersonal distance) to understand and interact with the social signals and behavior of others. The problem of recognizing social interactions from face images has recently received significant attention from the research community, because facial images carry a variety of traits that can convey information about an individual’s age, gender, emotions, and physical health. These types of information are known to play a key role both in the description of individuals and in social communication.
In particular, age is one of the most fundamental attributes affecting our daily social interactions. Automatic age estimation from face images has therefore become a significant task in numerous applications of artificial intelligence. Despite major advances in recent years, it remains a challenging problem because facial appearance varies widely with factors such as genetic traits, lifestyle, facial expressions, and aging. Cyber interactions, on the other hand, concern how users interact with each other through social media websites such as Facebook, Twitter, Instagram, and Flickr. Most social networks allow users to create and share content and to interact with content generated by other users in different forms (e.g., by viewing, liking, or commenting). This results in massive amounts of social content that provide information about users’ interests, opinions, daily activities, and interactions. Because of the explosive growth of social media content and the interactive online behavior of users, only a limited amount of content attracts a great deal of user attention and becomes popular, while the vast majority is completely ignored. Among the different types of user-generated content on social media, images have become an important medium of communication between users, and they vary widely in the number of views they receive, that is, in their social popularity. This phenomenon has attracted researchers from the computer vision and multimedia domains to explore why certain photos become popular and how to predict their popularity automatically. However, it is still difficult to measure, predict, or even define image popularity on social media, because it depends on users’ preferences and on many other factors that affect users’ social interactions on social media websites and lead to the popularity of content.
To this end, this dissertation proposes a framework for understanding social interaction in both the real and online worlds in order to address these challenges.

First, this dissertation addresses the problem of automatic age estimation from facial images. Conventional methods for facial age estimation determine a person’s age directly from his or her facial image by analyzing facial information (e.g., nose, mouth, and eyes), so only the input image is used for the estimate. However, telling someone’s precise age at a glance without any reference information is a challenging task even for humans. Inspired by human cognitive processes, this dissertation proposes a comparative deep learning framework that estimates age by comparing the input image with a set of selected reference images (labeled baseline samples) to determine whether the input face is younger or older than each baseline sample. A specific deep learning architecture, namely a region-convolutional neural network (R-CNN), is used to extract facial information from both the input image and the baseline samples. An energy function then aggregates the information extracted from the fully connected layer in order to estimate the age comparisons. This yields a set of hints, each of which represents a comparative relationship (younger or older). Finally, the estimation stage aggregates all the hints and counts the votes for each age label; the label that receives the most votes is taken as the person’s estimated age.
The experimental results on the FG-NET, MORPH, and IoG databases demonstrate that the proposed model outperforms state-of-the-art methods, with relative improvements of 13.24% (on FG-NET) and 23.20% (on MORPH) in terms of mean absolute error, and 4.74% (on IoG) in terms of age group classification accuracy.

Second, this dissertation addresses the problem of image popularity prediction on social media websites. With the growing number of social networks such as Flickr and Facebook, users often interact with each other by sharing photos of their daily lives. Although billions of images are uploaded to the internet every minute, only a few of these images receive millions of views and become popular, while others are completely ignored. Even different images posted by the same user receive different numbers of views. This raises the problem of image popularity prediction, which has become a key challenge in social media analytics, as it offers opportunities to reveal individual preferences and public attention. However, identifying the crucial factors that influence image popularity, and modeling and predicting the evolution of image popularity on social media, remain open challenges. To this end, this dissertation proposes a multimodal deep learning model that predicts the popularity of images on social media by using various types of visual and social features that are associated with image popularity. The proposed model uses two dedicated CNNs to learn high-level representations separately from the input features and then merges them into a unified network for popularity prediction. The performance of the model was evaluated in a series of experiments on a real-world dataset from Flickr.
The evaluation results reveal that the proposed prediction model outperforms four traditional machine learning schemes, two CNN-based models, and other state-of-the-art methods, with a relative performance improvement of more than 2.33%, 7.59%, and 14.16% in terms of the Spearman rank correlation coefficient, mean absolute error, and mean squared error, respectively. en_US
dc.description.tableofcontents
Chapter one - Introduction
1.1 Background of the Study
1.1.1 Social Interaction and Facial Age Estimation
1.1.2 Social Interaction across Social Media and Popularity Prediction
1.2 Motivation
1.3 Contribution
1.4 Dissertation Organization
Chapter two - Literature Review
2.1 Introduction
2.2 Human Age Estimation from Face Images
2.2.1 Aging-Related Facial Feature Extraction
2.2.2 Age Estimation Techniques
2.3 Image Popularity Prediction on Social Media
2.3.1 Features Influencing the Image Popularity
2.3.2 Prediction Models
Chapter three - Comparative Deep Learning Framework for Facial Age Estimation
3.1 Overview
3.2 Introduction
3.3 Proposed Method: CRCNN Framework
3.3.1 Preliminary Definitions
3.3.2 Overview of Our CRCNN Framework
3.3.3 CRCNN Formulations
3.3.4 Learning Method for the Comparative Stage
3.4 Experimental Results and Discussions
3.4.1 Experimental Setup
3.4.2 Optimization of Our CRCNN Framework
3.4.3 Discussions and Comparisons with State-of-the-art Methods
3.5 Summary
Chapter four - Multimodal Deep Learning Framework for Image Popularity Prediction on Social Media
4.1 Overview
4.2 Introduction
4.3 Features
4.3.1 Visual Content Features
4.3.2 Social Context Features
4.4 Methodology
4.4.1 Overview of Proposed Framework
4.4.2 Training the VSCNN Model
4.4.3 Baseline Models
4.5 Experiments and Results
4.5.1 Experimental Setup
4.5.2 Results
4.6 Summary
Chapter five - Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References
Appendix A Publications
zh_TW
dc.format.extent 4828069 bytes -
dc.format.mimetype application/pdf -
dc.source.uri (資料來源) http://thesis.lib.nccu.edu.tw/record/#G0103761506 en_US
dc.subject (關鍵詞) 社交 zh_TW
dc.subject (關鍵詞) 深度學習 zh_TW
dc.subject (關鍵詞) Social Network en_US
dc.subject (關鍵詞) Deep Learning en_US
dc.title (題名) 使用圖像和深度學習了解社交互動 zh_TW
dc.title (題名) Understanding Social Interaction Using Images and Deep Learning en_US
dc.type (資料類型) thesis en_US
Histograms of oriented gradients for human detection. In 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR’05), volume 1, pages 886–893. IEEE, 2005.[79] Rajeev Ranjan, Sabrina Zhou, Jun Cheng Chen, Amit Kumar, Azadeh Alavi, Vishal M Patel, and Rama Chellappa. Unconstrained age estimation with deep convolutional neural networks. In proceedings of the ieee international conference on computer vision workshops, pages 109–117, 2015.[80] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. nature, 521(7553):436–444, 2015.[81] Kazuya Ueki, Teruhide Hayashida, and Tetsunori Kobayashi. Subspace-based age- group classification using facial images under various lighting conditions. In 7th International Conference on Automatic Face and Gesture Recognition (FGR06), pages 6–pp. IEEE, 2006.[82] Ying Zheng, Hongxun Yao, Yanhao Zhang, and Pengfei Xu. Age classification based on back-propagation network. In Proceedings of the Fifth International Conference on Internet Multimedia Computing and Service, pages 319–322, 2013.[83] Yun Fu and Thomas S Huang. Human age estimation with regression on discriminative aging manifold. IEEE Transactions on Multimedia, 10(4):578–584, 2008.[84] Khoa Luu, Karl Ricanek, Tien D Bui, and Ching Y Suen. Age estimation using active appearance models and support vector machine regression. In 2009 IEEE 3rd International Conference on Biometrics: Theory, Applications, and Systems pages 1–5. IEEE, 2009.[85] Peter N Belhumeur, Joao P Hespanha, and David J Kriegman. Eigenfaces vs. fisher- faces Recognition using class specific linear projection. In European conference on computer vision, pages 43–58. Springer, 1996.[86] Hu Han and Anil K Jain. Age, gender and race estimation from unconstrained face images. Dept. Comput. Sci. Eng., Michigan State Univ., East Lansing, MI, USA, MSU Tech. Rep.(MSU-CSE-14-5), 87:27, 2014.[87] Sanford Weisberg. Applied linear regression, volume 528. 
John Wiley & Sons, 2005.[88] Shuicheng Yan, Huan Wang, Xiaoou Tang, and Thomas S Huang. Learning auto- structured regressor from uncertain nonnegative labels. In 2007 IEEE 11th international conference on computer vision, pages 1–8. IEEE, 2007.[89] Hironori Takimoto, Yasue Mitsukura, Minoru Fukumi, and Norio Akamatsu. Robust gender and age estimation under varying facial pose. Electronics and Communications in Japan, 91(7), 2008.[90] Xin Geng, Chao Yin, and Zhi-Hua Zhou. Facial age estimation by learning from label distributions. IEEE Trans. Pattern Analysis and Machine Intelligence, 35(10), 2013.[91] Chenjing Yan, Congyan Lang, Tao Wang, Xuetao Du, and Chen Zhang. Age estimation based on convolutional neural network. Advances in Multimedia Information Processing, 8879:211–220, 2014.[92] Furkan Gurpinar, Heysem Kaya, Hamdi Dibeklioglu, and Ali Salah. Kernel elm and cnn based facial age estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 80–86, 2016.[93] Zengwei Huo, Xu Yang, Chao Xing, Ying Zhou, Peng Hou, Jiaqi Lv, and Xin Geng. Deep age distribution learning for apparent age estimation. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pages 17–24, 2016.[94] Bartłomiej Hebda and Tomasz Kryjak. A compact deep convolutional neural network architecture for video-based age and gender estimation. In 2016 Federated Conference on Computer Science and Information Systems (FedCSIS), pages 787–790. IEEE, 2016.[95] Jun-Cheng Chen, Amit Kumar, Rajeev Ranjan, Vishal M Patel, Azadeh Alavi, and Rama Chellappa. A cascaded convolutional neural network for age estimation of unconstrained faces. In 2016 IEEE 8th International Conference on Biometrics Theory, Applications and Systems (BTAS), pages 1–8. IEEE, 2016.[96] T. Joachims. Optimizing search engines using clickthrough data. 
International Conference on Knowledge Discovery and Data Mining, pages 133–142, 2002.[97] Yoav Freund, Raj Iyer, Robert E. Schapire, and Yoram Singer. An efficient boosting algorithm for combining preferences. Journal of Machine Learning Research, pages 933–969, 2003.[98] Chris Burges, Tal Shaked, Erin Renshaw, Ari Lazier, Matt Deeds, Nicole Hamilton, and Greg Hullender. Learning to rank using gradient descent. International Conference on Machine Learning, pages 89–96, 2005.[99] Samah Aloufi, Shiai Zhu, and Abdulmotaleb El Saddik. On the prediction of flickr image popularity by analyzing heterogeneous social sensory data. Sensors, 17(3):631, 2017.[100] Liangjie Hong, Ovidiu Dan, and Brian D Davison. Predicting popular messages in twitter. In Proceedings of the 20th international conference companion on World wide web, pages 57–58. ACM, 2011.[101] Ethem F Can, Hüseyin Oktay, and R Manmatha. Predicting retweet count using visual cues. In Proceedings of the 22nd ACM international conference on Conference on information & knowledge management, pages 1481–1484. ACM, 2013.[102] Roja Bandari, Sitaram Asur, and Bernardo A Huberman. The pulse of news in social media: Forecasting popularity. ICWSM, 12:26–33, 2012.[103] Bo Wu, Tao Mei, Wen-Huang Cheng, Yongdong Zhang, et al. Unfolding temporal dy- namics: Predicting social media popularity using multi-scale temporal decomposition. In AAAI, pages 272–278, 2016.[104] Khaled Almgren, Jeongkyu Lee, et al. Predicting the future popularity of images on social networks. In Proceedings of the The 3rd Multidisciplinary International Social Networks Conference on SocialInformatics 2016, Data Science 2016, page 15. ACM, 2016.[105] Harris Drucker, Christopher JC Burges, Linda Kaufman, Alex J Smola, and Vladimir Vapnik. Support vector regression machines. In Advances in neural information processing systems, pages 155–161, 1997.[106] Andy Liaw, Matthew Wiener, et al. Classification and regression by randomforest. 
R news, 2(3):18–22, 2002.[107] Jerome H Friedman. Greedy function approximation: a gradient boosting machine. Annals of statistics, pages 1189–1232, 2001.[108] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.[109] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1–9, 2015.[110] Fatma S Abousaleh, Tekoing Lim, Wen-Huang Cheng, Neng-Hao Yu, M Anwar Hos- sain, and Mohammed F Alhamid. A novel comparative deep learning framework forfacial age estimation. EURASIP Journal on Image and Video Processing, 2016(1):47, 2016.[111] A. Krizhevsky, I. Sutskever, and G. Hinton. Imagenet classification with deep convolutional neural networks. Conference on Neural Information Processing Systems, 2012.[112] Bo Wu, Wen-Huang Cheng, Yongdong Zhang, Qiushi Huang, Jintao Li, and Tao Mei. Sequential prediction of social media popularity with deep temporal context networks. arXiv preprint arXiv:1712.04443, 2017.[113] Mayank Meghawat, Satyendra Yadav, Debanjan Mahata, Yifang Yin, Rajiv Ratn Shah, and Roger Zimmermann. A multimodal approach to predict social media popularity. In 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), pages 190–195. IEEE, 2018.[114] Jen-Cheng Hou, Syu-Siang Wang, Ying-Hui Lai, Yu Tsao, Hsiu-Wen Chang, and Hsin-Min Wang. Audio-visual speech enhancement using multimodal deep convolutional neural networks. IEEE Transactions on Emerging Topics in Computational Intelligence, 2(2):117–128, 2018.[115] Microsoft Corp. How-Old.net, 2015.[116] Tsung-Hung Tsai, Wei-Cih Jhou, Wen-Huang Cheng, Min-Chun Hu, I-Chao Shen, Tekoing Lim, Kai-Lung Hua, Ahmed Ghoneim, M. Anwar Hossain, and Shintami C. Hidayati. 
Photo sundial: Estimating the time of capture in consumer photos. Neuro- computing, 177:529–542, 2016.[117] Chuang-Wen You, Yi-Ling Chen, and Wen-Huang Cheng. Socialcrc: Enabling socially-consensual rendezvous coordination by mobile phones. Pervasive and Mobile Computing, 25:67–87, 2016.[118] Wen-Huang Cheng, Chia-Wei Wang, and Ja-Ling Wu. Video adaptation for small display based on content recomposition. IEEE Transactions on Circuits and Systems for Video Technology, 17(1):43–58, 2007.[119] Bo Wu, Wen-Huang Cheng, Yongdong Zhang, and Tao Mei. Time matters: Multi-scale temporalization of social media popularity. In Proceedings of the ACM International Conference on Multimedia, 2016.[120] Bo Wu, Tao Mei, Wen-Huang Cheng, and Yongdong Zhang. Unfolding temporal dy- namics: Predicting social media popularity using multi-scale temporal decomposition. In Proceedings of the AAAI Conference on Artificial Intelligence, 2016.[121] Tekoing Lim, Kai-Lung Hua, Hong-Cyuan Wang, Kai-Wen Zhao, Min-Chun Hu, and Wen-Huang Cheng. Vrank: Voting system on ranking model for human age estimation. In Proceedings of the IEEE International Workshop on Multimedia Signal Processing, 2015.[122] R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. IEEE Conference on Computer Vision and Pattern Recognition, 2014.[123] T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber, and T. Poggio. Robust object recognition with cortex-like mechanisms. IEEE Trans. on Pattern Analysis and Machine Intelligence, 29(3):411–426, 2007.[124] Miguel A. Carreira-Perpinan and Weiran Wang. Distributed optimization of deeply nested systems. International Conference on Artificial Intelligence and Statistics, 33:10–19, 2014.[125] Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, and Trevor Darrell. Caffe: Convolutional architecture for fast feature embedding. 
arXiv preprint, arXiv:1408.5093, 2014.[126] S. Chopra, R. Hadsell, and Y. LeCun. Learning a similarity metric discriminatively, with application to face verification. IEEE Conference on Computer Vision and Pattern Recognition, 1:539–546, 2005.[127] J. Nocedal and S. J. Wright. Numerical optimization. Springer Series in Operations Research and Financial Engineering, 2006.[128] Juha Ylioinas, Abdenour Hadid, Xiaopeng Hong, and Matti Pietikäinen. Age estima- tion using local binary pattern kernel density estimate. International Conference on Image Analysis and Processing, 8156:141–150, 2013.[129] K.-Y. Chang, C.-S. Chen, and Y.-P. Hung. Ordinal hyperplanes ranker with cost sensitivities for age estimation. 2011.[130] K. Ricanek and T. Tesafaye. Morph: a longitudinal image database of normal adult age- progression. International Conference on Automatic Face and Gesture Recognition, pages 341–345, 2006.[131] A. C. Gallagher and T. Chen. Using group prior to identify people in consumer images.IEEE Conference on Computer Vision and Pattern Recognition, pages 1–8, 2007.[132] Jordi Sanchez-Riera, Kai-Lung Hua, Yuan-Sheng Hsiao, Tekoing Lim, Shintami C. Hidayati, and Wen-Huang Cheng. A comparative study of data fusion for rgb-d based visual recognition. Pattern Recognition Letters, 73:1–6, 2016.[133] Mu Li, Tong Zhang, Yuqiang Chen, and Alex Smola. Efficient mini-batch training for stochastic optimization. International Conference on Knowledge Discovery and Data Mining, 2014.[134] E. A. Patrick and F. P. Fischer. A generalized k-nearest neighbor rule. Information and Control, 16(2):128–152, 1970.[135] D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Learning representations by backpropagating errors. Nature, 323(9):318–362, 1986.[136] J. R. Quinlan. C4.5: Programs for machine learning. Morgan Kaufmann, 1993.[137] V. Vapnik. Statistical learning theory. John Wiley and Sons, 1998.[138] R. Jang. Anfis: Adaptive network based fuzzy inference system. 
IEEE Transaction on System, Man and Cybernetics, 23(3):665–684, 1993.[139] Fatma S Abousaleh, Wen-Huang Cheng, Neng-Hao Yu, and Yu Tsao. Multimodal deep learning framework for image popularity prediction on social media. IEEE Transactions on Cognitive and Developmental Systems, 2020.[140] Gabor Szabo and Bernardo A Huberman. Predicting the popularity of online content.Communications of the ACM, 53(8):80–88, 2010.[141] Xiang Niu, Lusong Li, Tao Mei, Jialie Shen, and Ke Xu. Predicting image popularity in an incomplete social media community by a weighted bi-partite graph. In 2012 IEEE International Conference on Multimedia and Expo, pages 735–740. IEEE, 2012.[142] Sasa Petrovic, Miles Osborne, and Victor Lavrenko. Rt to win! predicting message propagation in twitter. ICWSM, 11:586–589, 2011.[143] Amandianeze O Nwana, Salman Avestimehr, and Tsuhan Chen. A latent social approach to youtube popularity prediction. In Global Communications Conference (GLOBECOM), 2013 IEEE, pages 3138–3144. IEEE, 2013.[144] Henrique Pinto, Jussara M Almeida, and Marcos A Gonçalves. Using early view patterns to predict the popularity of youtube videos. In Proceedings of the sixth ACM international conference on Web search and data mining, pages 365–374. ACM, 2013.[145] David A Shamma, Jude Yew, Lyndon Kennedy, and Elizabeth F Churchill. Viral actions: Predicting video view counts using synchronous sharing behaviors. In ICWSM, 2011.[146] Niyati Aggrawal, Archit Ahluwalia, Prashi Khurana, and Anuja Arora. Brand analysis framework for online marketing: ranking web pages and analyzing popularity of brands on social media. Social Network Analysis and Mining, 7(1):21, 2017.[147] Marcos André Gonçalves, Jussara M Almeida, Luiz GP dos Santos, Alberto HF Laender, and Virgílio Almeida. On popularity in the blogosphere. IEEE Internet Computing, 14(3):42–49, 2010.[148] Aboul-Ella Hassanien and Ajith Abraham. Computational Intelligence in Multimedia Processing: Recent Advances, volume 96. 
Springer, 2008.[149] Marko Heikkila and Matti Pietikainen. A texture-based method for modeling the background and detecting moving objects. IEEE transactions on pattern analysis and machine intelligence, 28(4):657–662, 2006.[150] Timo Ojala, Matti Pietikainen, and Topi Maenpaa. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on pattern analysis and machine intelligence, 24(7):971–987, 2002.[151] Di Huang, Caifeng Shan, Mohsen Ardabilian, Yunhong Wang, and Liming Chen. Local binary patterns and its application to facial image analysis: a survey. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 41(6):765–781, 2011.[152] Aude Oliva and Antonio Torralba. Modeling the shape of the scene: A holistic representation of the spatial envelope. International journal of computer vision, 42(3):145–175, 2001.[153] Yiwen Luo and Xiaoou Tang. Photo and video quality evaluation: Focusing on the subject. In European Conference on Computer Vision, pages 386–399. Springer, 2008.[154] Yan Ke, Xiaoou Tang, and Feng Jing. The design of high-level features for photo quality assessment. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), volume 1, pages 419–426. IEEE, 2006.[155] Xiaoou Tang, Wei Luo, and Xiaogang Wang. Content-based photo quality assessment.IEEE Transactions on Multimedia, 15(8):1930–1943, 2013.[156] Xiaoqiao Chen, Qingyi Zhang, Manhui Lin, Guangyi Yang, and Chu He. No-reference color image quality assessment: from entropy to perceptual quality. EURASIP Journal on Image and Video Processing, 2019(1):77, 2019.[157] Congcong Li, Alexander C Loui, and Tsuhan Chen. Towards aesthetics: A photo quality assessment and photo selection system. In Proceedings of the 18th ACM international conference on Multimedia, pages 827–830. ACM, 2010.[158] https://keras.io/.[159] Charles Spearman. 
The proof and measurement of association between two things.The American journal of psychology, 15(1):72–101, 1904.[160] Ian T Jolliffe and Jorge Cadima. Principal component analysis: a review and recent developments. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374(2065):20150202, 2016.[161] JVN Lakshmi. Stochastic gradient descent using linear regression with python. Inter- national Journal of Advanced Engineering Research and Applications, 2(8):519–525, 2016.[162] Nello Cristianini, John Shawe-Taylor, et al. An introduction to support vector machines and other kernel-based learning methods. Cambridge university press, 2000.[163] Martin Hofmann. Support vector machines-kernels and the kernel trick. Notes, 26(3):1–16, 2006.[164] Leo Breiman. Classification and regression trees. Routledge, 2017.[165] Souhaib Ben Taieb and Rob J Hyndman. A gradient boosting approach to the kaggle load forecasting competition. International journal of forecasting, 30(2):382–394, 2014.[166] Tianqi Chen and Carlos Guestrin. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining, pages 785–794. ACM, 2016.[167] http://scikit-learn.org/stable/.[168] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. Scikit-learn: Machine learning in python. Journal of machine learning research, 12(Oct):2825–2830, 2011.[169] https://social-media-prediction.github.io/MM17PredictionChallenge/index.html.[170] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization.arXiv preprint arXiv:1412.6980, 2014.[171] Meeyoung Cha, Haewoon Kwak, Pablo Rodriguez, Yong-Yeol Ahn, and Sue Moon. I tube, you tube, everybody tubes: analyzing the world’s largest user generated content video system. 
In Proceedings of the 7th ACM SIGCOMM conference on Internet measurement, pages 1–14, 2007.[172] Jinna Lv, Wu Liu, Meng Zhang, He Gong, Bin Wu, and Huadong Ma. Multi-feature fusion for predicting social media popularity. In Proceedings of the 25th ACM international conference on Multimedia, pages 1883–1888, 2017.[173] Xiaowen Huang, Yuqi Gao, Quan Fang, Jitao Sang, and Changsheng Xu. Towards SMP challenge: stacking of diverse models for social image popularity prediction. In Proceedings of the 25th ACM international conference on Multimedia, pages 1895–1900, 2017.[174] Omkar M Parkhi, Andrea Vedaldi, and Andrew Zisserman. Deep face recognition. 2015.[175] Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan. Show and tell: Lessons learned from the 2015 MSCOCO image captioning challenge. IEEE transactions on pattern analysis and machine intelligence, 39(4):652–663, 2016.[176] Baoyuan Wu, Weidong Chen, Peng Sun, Wei Liu, Bernard Ghanem, and Siwei Lyu. Tagging like humans: Diverse and distinct image annotation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7967–7975, 2018. zh_TW dc.identifier.doi (DOI) 10.6814/NCCU202100261 en_US