Design and Realization of an AI–Driven Automatic Instrument Tuning System (人工智慧的樂器自動調音系統研究與實現) | Publication

Title	Design and Realization of an AI–Driven Automatic Instrument Tuning System (人工智慧的樂器自動調音系統研究與實現)
Author	Liu, Yu-Chen (劉宇榛)
Contributors	Huang, Kuo-Feng (黃國峯) and Lin, Jih-Hsuan (林日璇), advisors; Liu, Yu-Chen (劉宇榛), author
Keywords	Artificial Intelligence; CNN; Automatic Tuning; Marimba; F0 Estimation
Date	2025
Uploaded	4-Aug-2025 12:59:06 (UTC+8)
Abstract	This study tests the feasibility of AI-assisted automatic tuning of marimba bars. A convolutional neural network (CNN) estimates the fundamental frequency (F0) after time-frequency preprocessing, and its output drives a PID-based controller that emulates material removal, forming a closed-loop "strike, measure, machine" prototype. With ≈2,000 labelled strikes, the CNN reaches an F0 error of ±10 cent offline, but in real time inharmonic overtones and percussive transients widen the error to ±15–25 cent, and the control loop fails to converge. Full automation was therefore not achieved. The experiments nonetheless confirm that homogeneous fiberglass bars reduce model variance, and they point to robust F0 networks, model predictive and non-linear control, and digital-twin simulation as directions for further work. The quantified baseline and reflections reported here serve as a reference for later work on smart manufacturing of percussion instruments.
References
1. Garcia, P., Li, H., & Wang, Y. (2024). Objective timbre metrics for percussive instruments. Journal of the Acoustical Society of America, 156(1), 45–57. (Cited: pp. 50–52, measurement formulas for T60 and spectral centroid.)
2. Helmholtz, H. (1877). On the Sensations of Tone (4th ed.). New York: Dover. (Cited: pp. 123–130, theory of intervals and beat frequencies.)
3. Hochreiter, S., & Schmidhuber, J. (1997). Long short–term memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735 (Cited: pp. 1739–1742, description of the LSTM architecture.)
4. Jones, J. M. (2019). Manual tuning practices in professional marimba craftsmanship. Percussive Arts Journal, 57(2), 34–42. (Cited: p. 36, data on the 4–6 year training period.)
5. Kim, H., Lee, J., & Nam, J. (2019). CREPE: A convolutional representation for pitch estimation. In IEEE ICASSP 2019 (pp. 161–165). IEEE. https://doi.org/10.1109/ICASSP.2019.8683852 (Cited: p. 163, RPA ≈ 92% result.)
6. LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient–based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324. https://doi.org/10.1109/5.726791 (Cited: pp. 2281–2283, basic principles of CNNs.)
7. Ohashi, K., & Kato, S. (2022). Decay characteristics of tropical hardwood xylophone bars. Applied Acoustics, 195, 108752. https://doi.org/10.1016/j.apacoust.2022.108752 (Cited: Table 3, rosewood T60 ≈ 3.5 s.)
8. Rabiner, L., & Schafer, R. (1978). Digital Processing of Speech Signals. Englewood Cliffs, NJ: Prentice–Hall. (Cited: pp. 368–372, cepstral F0 detection.)
9. Ross, D., Chen, M., & Zhou, Q. (2023). Impact of striking position on marimba bar resonance. In Proceedings of ISMIR 2023 (pp. 120–127). (Cited: Fig. 2, comparison of strike positions at 22% vs 50%.)
10. Smith, A., & Chang, L. (2023). Pitch variance tolerances in modern marimba manufacturing. Percussive Arts Journal, 61(3), 12–18. (Cited: p. 14, ±5 cent tolerance.)
11. Taylor, T. D. (2002). Strange Sounds: Music, Technology & Culture. New York: Routledge. (Cited: Chapter 4, instrument materials and cultural influence.)
12. Vincent, E., Gribonval, R., & Févotte, C. (2010). Performance measurement in blind audio source separation. IEEE Transactions on Audio, Speech, and Language Processing, 14(4), 1462–1469. https://doi.org/10.1109/TASL.2005.858005 (Cited: pp. 1464–1466, SDR evaluation metric.)
Description	Master's; National Chengchi University; Executive Master of Business Administration (EMBA) Program; 110932422
Source	http://thesis.lib.nccu.edu.tw/record/#G0110932422
Type	thesis
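
The abstract states its accuracy figures in cents and describes a PID loop that fails to converge once measurement error grows. The sketch below is an illustration only, not the thesis's implementation. `cents` uses the standard definition, 1200·log2(f/f_target). The `PID` class, its gains, the one-step plant model (each controller output shifts the bar's pitch by that many cents), and the uniform noise model are all assumptions chosen for readability.

```python
import math
import random


def cents(f_measured: float, f_target: float) -> float:
    """Pitch deviation in cents: 1200 cents per octave, 100 per semitone."""
    return 1200.0 * math.log2(f_measured / f_target)


class PID:
    """Textbook discrete PID controller (gains are illustrative assumptions)."""

    def __init__(self, kp: float, ki: float, kd: float) -> None:
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error: float) -> float:
        self.integral += error
        # No derivative term on the first sample, since there is no history yet.
        derivative = 0.0 if self.prev_error is None else error - self.prev_error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def residual_error(noise_cents: float, steps: int = 100,
                   start_cents: float = 40.0, seed: int = 0) -> float:
    """Largest |true pitch error| over the last 20 steps of a toy tuning loop.

    Hypothetical plant: each controller output shifts the bar's pitch by that
    many cents. The simulated F0 estimate is the true error plus uniform noise.
    """
    rng = random.Random(seed)
    pid = PID(kp=0.4, ki=0.05, kd=0.05)
    true_error = start_cents
    tail = []
    for i in range(steps):
        measured = true_error + rng.uniform(-noise_cents, noise_cents)
        true_error -= pid.step(measured)
        if i >= steps - 20:
            tail.append(abs(true_error))
    return max(tail)


if __name__ == "__main__":
    # A 442 Hz reading against a 440 Hz target, expressed in cents.
    print(f"442 Hz vs 440 Hz: {cents(442.0, 440.0):+.2f} cent")
    for noise in (0.0, 10.0, 25.0):
        print(f"noise ±{noise:4.1f} cent -> residual {residual_error(noise):.2f} cent")
```

With the noise set to zero this toy loop settles toward the target. Once noise is added, the controller keeps reacting to estimation error and the residual no longer shrinks toward zero. That is only a qualitative analogue of the convergence problem the abstract reports; the numbers it prints say nothing about the actual prototype.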

dc.contributor.advisor	黃國峯; 林日璇	zh_TW
dc.contributor.advisor	Huang, Kuo-Feng; Lin, Jih-Hsuan	en_US
dc.contributor.author (Authors)	劉宇榛	zh_TW
dc.contributor.author (Authors)	Liu, Yu-Chen	en_US
dc.creator (Author)	劉宇榛	zh_TW
dc.creator (Author)	Liu, Yu-Chen	en_US
dc.date (Date)	2025	en_US
dc.date.accessioned	4-Aug-2025 12:59:06 (UTC+8)	-
dc.date.available	4-Aug-2025 12:59:06 (UTC+8)	-
dc.date.issued (Uploaded)	4-Aug-2025 12:59:06 (UTC+8)	-
dc.identifier (Other Identifiers)	G0110932422	en_US
dc.identifier.uri (URI)	https://nccur.lib.nccu.edu.tw/handle/140.119/158309	-
dc.description (Description)	Master's (碩士)	zh_TW
dc.description (Description)	National Chengchi University (國立政治大學)	zh_TW
dc.description (Description)	Executive Master of Business Administration (EMBA) Program (經營管理碩士學程)	zh_TW
dc.description (Description)	110932422	zh_TW
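dc.description.tableofcontents	zh_TW
Chapter 1 Introduction
　Section 1 Research Background and Motivation
　Section 2 Research Objectives
　Section 3 Research Scope and Limitations
Chapter 2 Literature Review
　Section 1 History and Theoretical Foundations of Instrument Tuning
　Section 2 Why AI Tuning Is Needed: Limits of Traditional Methods
　Section 3 How to Tune with AI: Key Technical Approaches
　Section 4 Theoretical Foundations of Intonation, Timbre, and Sustain
Chapter 3 Research Methods
　Section 1 Overall System Architecture Planning
　Section 2 Development Process and Teamwork
　Section 3 Core Technical Implementation Details (Plans and Attempts)
Chapter 4 Research Results
　Section 1 Audio Acquisition and Automatic Striking Device
　Section 2 Signal Preprocessing Module
　Section 3 Pitch Deviation Calculation and Automatic Tuning Control Module (PID)
　Section 4 System Integration and Overall Performance
Chapter 5 Discussion and Conclusion
　Section 1 Contributions of This Study
　Section 2 Limitations of This Study
　Section 3 Tuners' Technical Experience and Strategies for Responding to AI Development
　Section 4 Core Human Abilities That AI Currently Cannot Easily Replace
　Section 5 Assessment of Industry and Business Benefits
　Section 6 Conclusion and Future Outlook
References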
dc.format.extent	1319655 bytes	-
dc.format.mimetype	application/pdf	-
dc.source.uri (Source)	http://thesis.lib.nccu.edu.tw/record/#G0110932422	en_US
dc.subject (Keywords)	人工智慧 (Artificial Intelligence)	zh_TW
dc.subject (Keywords)	卷積神經網路 (Convolutional Neural Network)	zh_TW
dc.subject (Keywords)	自動調音 (Automatic Tuning)	zh_TW
dc.subject (Keywords)	馬林巴木琴 (Marimba)	zh_TW
dc.subject (Keywords)	基頻偵測 (F0 Estimation)	zh_TW
dc.subject (Keywords)	Artificial Intelligence	en_US
dc.subject (Keywords)	CNN	en_US
dc.subject (Keywords)	Automatic Tuning	en_US
dc.subject (Keywords)	Marimba	en_US
dc.subject (Keywords)	F0 Estimation	en_US
dc.title (Title)	人工智慧的樂器自動調音系統研究與實現	zh_TW
dc.title (Title)	Design and Realization of an AI–Driven Automatic Instrument Tuning System	en_US
dc.type (Type)	thesis	en_US
