dc.contributor.advisor | 張家銘 | zh_TW |
dc.contributor.advisor | Chang, Jia-Ming | en_US |
dc.contributor.author (Authors) | 王崇飛 | zh_TW |
dc.contributor.author (Authors) | Wang, Chung-Fei | en_US |
dc.creator (作者) | 王崇飛 | zh_TW |
dc.creator (作者) | Wang, Chung-Fei | en_US |
dc.date (日期) | 2022 | en_US |
dc.date.accessioned | 5-Oct-2022 09:09:06 (UTC+8) | - |
dc.date.available | 5-Oct-2022 09:09:06 (UTC+8) | - |
dc.date.issued (上傳時間) | 5-Oct-2022 09:09:06 (UTC+8) | - |
dc.identifier (Other Identifiers) | G0106971017 | en_US |
dc.identifier.uri (URI) | http://nccur.lib.nccu.edu.tw/handle/140.119/142100 | - |
dc.description (描述) | 碩士 | zh_TW |
dc.description (描述) | 國立政治大學 | zh_TW |
dc.description (描述) | 資訊科學系碩士在職專班 | zh_TW |
dc.description (描述) | 106971017 | zh_TW |
dc.description.abstract (摘要) | 臺灣地理位置特殊,每年皆會面對颱風、洪水等天然災害的威脅,雖然無法避免災害的發生,卻能運用各類科技技術,來減少災害來臨時造成的威脅與損傷。近幾年因科技運算能力的強化,讓大數據、人工智慧與機器學習成為近幾年的熱門關鍵詞,而在災害防救上鮮少有運用數據分析技術針對災情與氣象進行研究,故本文將氣象資料與災情資訊應用羅吉斯迴歸及決策樹建立模型。本研究透過蒐集歷史氣象資料、災情資訊及氣象測站資料,將前述資料內容進行格式欄位統一、去除無關資料等資料清洗,再將其依據彼此關聯性進行測站內資料整合,以作為後續分析統計及建模之基準資料。模型則以測站內的氣象資料作為自變數,災情資料作為依變數,透過不同採樣方式進行訓練及測試資料集拆分,建構該氣象測站的模型,並將測試資料集透過模型產出預測的數值,使用混淆矩陣來比較不同條件下的精準度、精準率、召回率及F1-Score。分析結果得出平均準確率最高為99.7%,平均精準率最高為67.9%,平均召回率最高為81.9%,平均F1-Score最高為48.6%;若單獨以測站來看F1-Score最高為C0M730(嘉義市東區)測站的96.6%,且除C0M730(嘉義市東區)測站外,有60個測站在本文所建立的224個模型的表現(F1-Score>80%),達到預期的結果,其餘未達標的部分將於未來透過其他模型演算法或採樣方式進一步的精進。科技雖然無法改變氣候,卻可以改變面對氣候時的準備與應變,用最好的準備,來面對最壞的打算。 | zh_TW |
dc.description.abstract (摘要) | Because of its geographical location, Taiwan faces the threat of natural disasters such as typhoons and floods every year. Although disasters cannot be prevented, technology can be used to reduce the threats and damage they cause. In recent years, growth in computing power has made big data, artificial intelligence, and machine learning popular keywords, yet data analysis has rarely been applied to disaster and meteorological data in disaster prevention and response. This thesis therefore builds Logistic Regression and Decision Tree models from meteorological data and disaster information. The study collects historical meteorological data, disaster information, and weather station data, cleans them by unifying field formats and removing irrelevant records, and then integrates them per station according to their relationships, producing the baseline data for subsequent analysis, statistics, and modeling. Each model uses a station's meteorological data as the independent variables and the disaster data as the dependent variable. The data are split into training and test sets under different sampling methods, a model is built for each weather station, predictions are generated for the test set, and a confusion matrix is used to compare accuracy, precision, recall, and F1-Score under the different conditions. The results show a highest average accuracy of 99.7%, a highest average precision of 67.9%, a highest average recall of 81.9%, and a highest average F1-Score of 48.6%. Looking at individual stations, the highest F1-Score is 96.6%, at station C0M730 (East District, Chiayi City). Besides C0M730, 60 stations among the 224 models built in this thesis reached the expected performance (F1-Score > 80%). The cases that fell short will be refined in future work with other modeling algorithms or sampling methods. Technology cannot change the climate, but it can change how we prepare for and respond to it: make the best preparations to face the worst case. | en_US |
dc.description.tableofcontents | 致謝 I; 摘要 II; Abstract III; 目錄 IV; 表目錄 VI; 圖目錄 IX; 第一章 緒論 1; 第一節 研究動機 1; 第二節 論文架構 1; 第二章 資料前處理 3; 第一節 資料蒐集 3; 一、 災情資料 5; 二、 氣象資料 7; 三、 測站資料 13; 第二節 資料清洗與整理 14; 一、 資料清洗 14; 二、 資料處理 18; 第三章 資料彙整與統計 20; 第一節 合併氣象資料與災情 20; 第二節 建立整體氣象與災情詳細資料 22; 第三節 建立單一測站單位內詳細資料(氣象與災情) 24; 第四節 建立以災情類別為單位之測站單位內詳細資料(氣象與災情) 24; 第五節 展開災情發生次數 25; 第六節 小結 26; 第四章 分析研究 27; 第一節 分析內容說明 27; 一、 分析流程 28; 二、 演算法說明 29; 三、 採樣方式說明 30; 四、 混淆矩陣說明 31; 第二節 以測站分析 32; 一、 資料筆數 32; 二、 羅吉斯迴歸 Logistic Regression 32; 三、 決策樹 Decision Tree 38; 第三節 以災情類別分析 45; 一、 資料筆數 45; 二、 羅吉斯迴歸 Logistic Regression 48; 三、 決策樹 Decision Tree 72; 第四節 整體分析結果比較 89; 一、 準確率 89; 二、 精準率 89; 三、 召回率比較 90; 四、 F1-Score 91; 五、 以測站為單位比較 92; 第五節 總結 100; 第五章 結論與建議 101; 參考文獻 102 | zh_TW |
dc.format.extent | 5823465 bytes | - |
dc.format.mimetype | application/pdf | - |
dc.source.uri (資料來源) | http://thesis.lib.nccu.edu.tw/record/#G0106971017 | en_US |
dc.subject (關鍵詞) | 災害 | zh_TW |
dc.subject (關鍵詞) | 氣象 | zh_TW |
dc.subject (關鍵詞) | 羅吉斯迴歸 | zh_TW |
dc.subject (關鍵詞) | 決策樹 | zh_TW |
dc.subject (關鍵詞) | 風速 | zh_TW |
dc.subject (關鍵詞) | 雨量 | zh_TW |
dc.subject (關鍵詞) | 颱風 | zh_TW |
dc.subject (關鍵詞) | 水災 | zh_TW |
dc.subject (關鍵詞) | Disaster | en_US |
dc.subject (關鍵詞) | Weather | en_US |
dc.subject (關鍵詞) | Logistic Regression | en_US |
dc.subject (關鍵詞) | Decision Tree | en_US |
dc.subject (關鍵詞) | Rain | en_US |
dc.subject (關鍵詞) | Windspeed | en_US |
dc.subject (關鍵詞) | Typhoon | en_US |
dc.subject (關鍵詞) | Flood | en_US |
dc.title (題名) | 藉由氣象資料應用羅吉斯迴歸及決策樹模型來預測颱風及水災期間成災與否 | zh_TW |
dc.title (題名) | Predict Disaster during Typhoons and Floods with Meteorological data by using Logistic Regression and Decision Tree models | en_US |
dc.type (資料類型) | thesis | en_US |
dc.relation.reference (參考文獻) | [1] 內政部消防署-全民防災E點通-歷年災害專區,取自:https://bear.emic.gov.tw/MY/#/home/disasterInfo/history [2] 國家災害防救科技中心-全球災害事件簿-颱風事件,取自:https://den.ncdr.nat.gov.tw/1132/1188/ [3] 民生公共物聯網-資料服務平台,取自:https://ci.taiwan.gov.tw/dsp/index.aspx [4] 中央氣象局-測站代號及站況資料查詢,取自:https://e-service.cwb.gov.tw/wdps/obs/state.htm [5] 內政部TGOS全國門牌地址定位服務,取自:https://www.tgos.tw/tgos/Web/Address/TGOS_Address.aspx [6] Python-pandas, From: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html [7] scikit-learn-LogisticRegression, From: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html [8] scikit-learn-Decision Tree, From: https://scikit-learn.org/stable/modules/Tree.html [9] imbalanced-learn-SMOTE, From: https://imbalanced-learn.org/stable/references/generated/imblearn.over_sampling.SMOTE.html [10] imbalanced-learn-TomekLinks, From: https://imbalanced-learn.org/stable/references/generated/imblearn.under_sampling.TomekLinks.html [11] imbalanced-learn-Combination of over- and under-sampling methods, From: https://imbalanced-learn.org/stable/references/combine.html [12] matplotlib-3D scatterplot, From: https://matplotlib.org/stable/gallery/mplot3d/scatter3d.html | zh_TW |
dc.identifier.doi (DOI) | 10.6814/NCCU202201508 | en_US |
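
The abstract reports a very high average accuracy (99.7%) alongside a much lower average F1-Score (48.6%). The sketch below illustrates how the four metrics named in the abstract are derived from a binary confusion matrix, and why accuracy can stay high when disaster events are rare. It is not the thesis's code: the function names `confusion_counts` and `metrics` and the label data are hypothetical, chosen only for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels (1 = disaster, 0 = no disaster)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Derive accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical, heavily imbalanced test set: 10 disaster records out of 1000.
y_true = [1] * 10 + [0] * 990
# Hypothetical predictions: 5 hits, 5 misses, 5 false alarms, 985 correct negatives.
y_pred = [1] * 5 + [0] * 5 + [1] * 5 + [0] * 985

scores = metrics(*confusion_counts(y_true, y_pred))
print(scores)
```

With these made-up labels, accuracy is 0.99 while precision, recall and F1 are each 0.5, because the many correct "no disaster" predictions dominate accuracy but do not enter the F1 calculation.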