Computation and Applications of Trifocal Tensor in Multiple View Geometry (三焦張量在多視角幾何中的計算與應用)

Title	Computation and Applications of Trifocal Tensor in Multiple View Geometry (三焦張量在多視角幾何中的計算與應用)
Author	Li, Shau Wei (李紹暐)
Contributors	Hor, Maw Kae (何瑁鎧); Li, Shau Wei (李紹暐)
Keywords	trifocal tensor; multiple view images; epipolar transfer; LMedS; projection matrix
Date	2010
Upload time	4-Sep-2013 17:09:18 (UTC+8)
Abstract	The accuracy of 3D model reconstruction in computer vision depends on the accuracy of the corresponding points extracted from the images. Most previous studies work with two images: they estimate the fundamental matrix between the images through epipolar transfer and then match points or filter out outliers to obtain accurate correspondences. However, epipolar transfer suffers from degenerate configurations, and transfer errors accumulate across image pairs; both degrade model accuracy. Avoiding the degeneracy and reducing the accumulated error is therefore crucial for reconstructing accurate 3D models from multiple images. In this thesis, we propose a mechanism that uses the trifocal tensor to filter out bad correspondences, improving the overall accuracy of the corresponding points so that more accurate projection matrices can be computed for 3D modeling. Starting from multiple view images, we first use Bundler to obtain corresponding points among the extracted feature points, and then use the trifocal tensor to detect and remove the outliers among them. The least median of squares method (LMedS) is used to improve the accuracy of point selection, and a weighting function together with repeated filtering further adjusts and filters the correspondences. With the resulting high-precision corresponding points, we compute the projection matrices used for 3D reconstruction and other computer vision applications. We tested three data sets: one constructed by ourselves with 3ds Max and two obtained from the internet. We first used three images to verify the geometric properties of the trifocal tensor and the feasibility of using it to filter correspondences, and then extended the method to more images, where it likewise improved the accuracy of the corresponding points and could even identify input images that are less geometrically consistent with the others.
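The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the synthetic cameras, the noise level, and `dof=18` are all assumptions made for illustration. The tensor is built directly from known cameras in the canonical form P1 = [I | 0], whereas a full LMedS procedure would estimate it from random minimal samples of the matches. Only the final rejection step is shown, using the median-based robust scale commonly paired with LMedS.

```python
import random
from statistics import median

def trifocal_from_cameras(P2, P3):
    # T[i][j][k] = a_i^j * b_4^k - a_4^j * b_i^k, valid when P1 = [I | 0]
    # (a_i, b_i are the columns of P2 and P3).
    return [[[P2[j][i] * P3[k][3] - P2[j][3] * P3[k][i]
              for k in range(3)] for j in range(3)] for i in range(3)]

def skew(v):
    # Cross-product matrix [v]_x
    x, y, z = v
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matmul(A, B):
    return [[sum(A[r][m] * B[m][c] for m in range(3)) for c in range(3)]
            for r in range(3)]

def incidence_residual(T, x1, x2, x3):
    # Squared Frobenius norm of [x2]_x (sum_i x1^i T_i) [x3]_x.
    # It vanishes for an exact point-point-point correspondence.
    M = [[sum(x1[i] * T[i][j][k] for i in range(3)) for k in range(3)]
         for j in range(3)]
    R = matmul(matmul(skew(x2), M), skew(x3))
    return sum(v * v for row in R for v in row)

def reject_by_median(sq_residuals, dof):
    # Robust scale from the median squared residual; keep matches whose
    # squared residual is within (2.5 * sigma)^2. `dof` is an assumed
    # parameter count for the fitted model.
    n = len(sq_residuals)
    sigma = 1.4826 * (1.0 + 5.0 / (n - dof)) * median(sq_residuals) ** 0.5
    return [i for i, r in enumerate(sq_residuals) if r <= (2.5 * sigma) ** 2]

def project(P, X, noise, rng):
    # Project homogeneous 3D point X and add Gaussian image noise.
    u, v, w = (sum(P[r][c] * X[c] for c in range(4)) for r in range(3))
    return [u / w + rng.gauss(0, noise), v / w + rng.gauss(0, noise), 1.0]

def run_demo(seed=0, n=50, n_bad=10):
    rng = random.Random(seed)
    # Hypothetical cameras: identity rotation, non-collinear centres.
    P1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
    P2 = [[1, 0, 0, -1], [0, 1, 0, 0], [0, 0, 1, 0]]
    P3 = [[1, 0, 0, 0], [0, 1, 0, -1], [0, 0, 1, 0]]
    T = trifocal_from_cameras(P2, P3)
    residuals = []
    for idx in range(n):
        X = [rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(4, 8), 1.0]
        x1, x2, x3 = (project(P, X, 1e-3, rng) for P in (P1, P2, P3))
        if idx < n_bad:
            x3[0] += 0.3  # simulate a wrong match in the third view
        residuals.append(incidence_residual(T, x1, x2, x3))
    return reject_by_median(residuals, dof=18), list(range(n_bad))

if __name__ == "__main__":
    kept, bad = run_demo()
    print("kept", len(kept), "matches; injected outliers:", bad)
```

Because the threshold is a multiple of the median, the rule only works while fewer than half of the matches are wrong, which is the usual breakdown point of median-based estimators.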
Description	Master's; National Chengchi University (國立政治大學); Department of Computer Science (資訊科學學系); 98753035; 99
Source	http://thesis.lib.nccu.edu.tw/record/#G0098753035
Type	thesis

dc.contributor.advisor	何瑁鎧	zh_TW
dc.contributor.advisor	Hor, Maw Kae	en_US
dc.contributor.author (Authors)	李紹暐	zh_TW
dc.contributor.author (Authors)	Li, Shau Wei	en_US
dc.creator (Author)	李紹暐	zh_TW
dc.creator (Author)	Li, Shau Wei	en_US
dc.date (Date)	2010	en_US
dc.date.accessioned	4-Sep-2013 17:09:18 (UTC+8)	-
dc.date.available	4-Sep-2013 17:09:18 (UTC+8)	-
dc.date.issued (Upload time)	4-Sep-2013 17:09:18 (UTC+8)	-
dc.identifier (Other Identifiers)	G0098753035	en_US
dc.identifier.uri (URI)	http://nccur.lib.nccu.edu.tw/handle/140.119/60256	-
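dc.description (Description)	Master's (碩士)	zh_TW
dc.description (Description)	National Chengchi University (國立政治大學)	zh_TW
dc.description (Description)	Department of Computer Science (資訊科學學系)	zh_TW
dc.description (Description)	98753035	zh_TW
dc.description (Description)	99	zh_TW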
dc.description.tableofcontents
Chapter 1 Introduction
  1.1 Research Motivation and Objectives
  1.2 Problem Description
Chapter 2 Related Work
  2.1 Related Literature
    2.1.1 Two Images
    2.1.2 Multiple View Images
    2.1.3 Trifocal Tensor
    2.1.4 3D Modeling
  2.2 Background
    2.2.1 Epipolar Transfer
    2.2.2 Bundler: Structure from Motion
    2.2.3 Invariant Feature Transform
    2.2.4 Least Median of Squares
    2.2.5 Projective Geometry
Chapter 3 Trifocal Tensor Properties and Workflow
  3.1 Properties of the Trifocal Tensor
    3.1.1 Geometric Properties of the Trifocal Tensor
    3.1.2 Computing the Trifocal Tensor with Least Median of Squares
    3.1.3 Correction of 2D Image Projections
    3.1.4 Analysis of Point Selection for the Trifocal Tensor
  3.2 Trifocal Tensor Workflow
    3.2.1 Verifying the Usability of Bundler
    3.2.2 Trifocal Tensor Experiments with Bundler
    3.2.3 Applying the Trifocal Tensor to Multiple Images
Chapter 4 Experimental Results
  4.1 Evaluation Before and After Trifocal Tensor Filtering
  4.2 Termination Condition for Trifocal Tensor Filtering
  4.3 Image Filtering Mechanism
Chapter 5 Conclusions and Discussion
  5.1 Conclusions
  5.2 Future Directions
References
Experimental Appendix
dc.format.extent	3531831 bytes	-
dc.format.mimetype	application/pdf	-
dc.language.iso	en_US	-
dc.source.uri (Source)	http://thesis.lib.nccu.edu.tw/record/#G0098753035	en_US
dc.subject (Keywords)	三焦張量	zh_TW
dc.subject (Keywords)	多視角影像	zh_TW
dc.subject (Keywords)	極線轉換	zh_TW
dc.subject (Keywords)	最小中值平方法	zh_TW
dc.subject (Keywords)	投影矩陣	zh_TW
dc.subject (Keywords)	trifocal tensor	en_US
dc.subject (Keywords)	multiple view images	en_US
dc.subject (Keywords)	epipolar transfer	en_US
dc.subject (Keywords)	LMedS	en_US
dc.subject (Keywords)	projection matrix	en_US
dc.title (Title)	三焦張量在多視角幾何中的計算與應用	zh_TW
dc.title (Title)	Computation and Applications of Trifocal Tensor in Multiple View Geometry	en_US
dc.type (Type)	thesis	en
