MNIST database

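The MNIST database (short for "Modified National Institute of Standards and Technology database"^[1]) is a large database of handwritten digits that is commonly used for training various image processing systems^[2]^[3]. It was created by modifying samples from NIST's original datasets and contains 60,000 training images and 10,000 test images of handwritten digits, each 28×28 pixels in size. The database is also widely used for training and testing in the field of machine learning^[4]^[5]. In the paper that introduced MNIST, its creators obtained an error rate of 0.8% with a support-vector machine; later studies have improved on this result considerably using deep learning techniques.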

History

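The MNIST database was created by "re-mixing" samples from NIST's original datasets^[6]. Its creators felt that, because NIST's training dataset was taken from American Census Bureau employees while its testing dataset was taken from American high school students, the two sets came from different populations of writers and the data was not well suited to research experiments^[7]. In addition, the black-and-white NIST images were normalized to fit into a 28×28 pixel bounding box and anti-aliased, which introduced grayscale levels^[7].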
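The effect of anti-aliasing on a bilevel image can be illustrated with a small sketch. It is not the procedure used by the MNIST creators, which the text does not specify; it assumes a simple box filter (block averaging), a hypothetical 56×56 input, and a filled disc standing in for a pen stroke. The function name `box_downsample` is likewise made up for this illustration.

```python
import numpy as np

def box_downsample(binary_img: np.ndarray, factor: int) -> np.ndarray:
    """Shrink a 0/1 image by averaging each factor x factor block (box filter)."""
    h, w = binary_img.shape
    assert h % factor == 0 and w % factor == 0, "size must be divisible by factor"
    blocks = binary_img.reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

# Hypothetical 56x56 bilevel image: a filled disc standing in for a digit stroke.
yy, xx = np.mgrid[0:56, 0:56]
bilevel = ((yy - 27.5) ** 2 + (xx - 27.5) ** 2 <= 20 ** 2).astype(np.float64)

small = box_downsample(bilevel, 2)             # averaged down to 28x28
gray = np.round(small * 255).astype(np.uint8)  # 8-bit gray levels

print(small.shape)                # (28, 28)
print(np.unique(bilevel).size)    # 2 (only black and white before resampling)
print(np.unique(gray).size > 2)   # True (partially covered blocks become gray)
```

Blocks that straddle the edge of the shape are only partly covered, so their average falls strictly between black and white. This is how a purely black-and-white source can yield a grayscale result.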

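The MNIST database contains 60,000 training images and 10,000 test images^[8]. Half of the training set and half of the test set were taken from NIST's training dataset, while the other half of each was taken from NIST's testing dataset^[9]. The original creators of the database keep a list of some of the methods tested on it^[7]. In their original paper, they obtained an error rate of 0.8% using a support-vector machine^[10]. The original MNIST database nevertheless contains at least 4 mislabeled images^[11].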
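The half-and-half composition described above can be sketched as follows. The exact selection procedure is not given in the text, so the random shuffling, the pool sizes, the sample identifiers, and the function name `remix` are all assumptions made for illustration.

```python
import random

def remix(nist_train_pool, nist_test_pool, n_train=60_000, n_test=10_000, seed=0):
    """Build train/test splits that each draw half from each NIST pool, without overlap."""
    rng = random.Random(seed)
    a = list(nist_train_pool)
    b = list(nist_test_pool)
    rng.shuffle(a)
    rng.shuffle(b)
    half_train, half_test = n_train // 2, n_test // 2
    # First slice of each pool feeds the training set, the next slice the test set.
    train = a[:half_train] + b[:half_train]
    test = (a[half_train:half_train + half_test]
            + b[half_train:half_train + half_test])
    return train, test

# Placeholder sample IDs; the real pool sizes are not stated in the text.
pool_a = [("census", i) for i in range(40_000)]     # NIST training dataset
pool_b = [("students", i) for i in range(40_000)]   # NIST testing dataset

train, test = remix(pool_a, pool_b)
print(len(train), len(test))   # 60000 10000
```

Each resulting split then contains 30,000 + 30,000 and 5,000 + 5,000 samples respectively, so both writer populations are represented equally in training and in testing.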

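Extended MNIST (EMNIST) is a newer dataset developed and released by NIST as the (final) successor to MNIST^[12]^[13]. MNIST contains only images of handwritten digits, whereas EMNIST includes all of the images from NIST Special Database 19, a large collection of handwritten uppercase and lowercase letters as well as digits^[14]^[15].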

Performance

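Some researchers have achieved "near-human performance" on the MNIST database using artificial neural networks^[16]. The highest error rate listed on the original website of the database is 12%, obtained with a simple linear classifier and no preprocessing^[10]^[7].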

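In 2004, researchers achieved a best-case error rate of 0.42% on the database using a classifier called LIRA, a three-layer neural classifier based on Rosenblatt's perceptron principles^[17].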

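Some researchers have tested artificial intelligence systems on versions of the MNIST database subjected to random distortions. The systems in these cases are usually artificial neural networks, and the distortions used may be either affine distortions or elastic distortions^[7]. Such systems can be very successful; one of them achieved an error rate of 0.39% on the database^[18].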
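A random affine distortion of a 28×28 image might look like the following minimal sketch. The parameter ranges (rotation, scale, shift), the nearest-neighbour sampling, and the function name `random_affine` are assumptions chosen for illustration, not the settings of any cited study. Elastic distortion is not covered here.

```python
import numpy as np

def random_affine(img: np.ndarray, rng: np.random.Generator,
                  max_rot_deg: float = 10.0, max_shift: float = 2.0,
                  max_scale: float = 0.1) -> np.ndarray:
    """Apply a random rotation, isotropic scale and shift about the image centre."""
    h, w = img.shape
    theta = np.deg2rad(rng.uniform(-max_rot_deg, max_rot_deg))
    scale = 1.0 + rng.uniform(-max_scale, max_scale)
    ty, tx = rng.uniform(-max_shift, max_shift, size=2)

    # Forward map: out = A (in - centre) + centre + t. We sample with its inverse.
    a = scale * np.array([[np.cos(theta), -np.sin(theta)],
                          [np.sin(theta),  np.cos(theta)]])
    a_inv = np.linalg.inv(a)
    centre = np.array([[(h - 1) / 2.0], [(w - 1) / 2.0]])
    shift = np.array([[ty], [tx]])

    yy, xx = np.mgrid[0:h, 0:w]
    out_coords = np.stack([yy.ravel(), xx.ravel()]).astype(np.float64)
    src = a_inv @ (out_coords - centre - shift) + centre

    # Nearest-neighbour lookup; pixels mapped from outside the input stay background (0).
    sy = np.rint(src[0]).astype(int)
    sx = np.rint(src[1]).astype(int)
    inside = (sy >= 0) & (sy < h) & (sx >= 0) & (sx < w)
    out = np.zeros(h * w, dtype=img.dtype)
    out[inside] = img[sy[inside], sx[inside]]
    return out.reshape(h, w)

rng = np.random.default_rng(0)
digit = np.zeros((28, 28), dtype=np.uint8)
digit[6:22, 13:15] = 255                   # crude vertical stroke standing in for a "1"
distorted = random_affine(digit, rng)
print(distorted.shape, distorted.dtype)    # (28, 28) uint8
```

Calling the function repeatedly on each training image with fresh random parameters yields many slightly different variants of the same digit, which is the general idea behind training on distorted data.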

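In 2011, researchers using a similar system of neural networks reported an error rate of 0.27%, improving on the previous best result^[19]. In 2013, an approach based on regularizing neural networks with DropConnect was claimed to achieve an error rate of 0.21%^[20]. In 2016, the best performance of a single convolutional neural network on MNIST was an error rate of 0.25%^[21]. As of August 2018, the best performance of a single convolutional neural network trained on the MNIST training data with no data augmentation remained an error rate of 0.25%^[21]^[22]. In addition, the Parallel Computing Center in Khmelnytskyi, Ukraine, obtained an ensemble of only 5 convolutional neural networks that performs on MNIST at an error rate of 0.21%^[23]^[24].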
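To put these percentages in perspective, the short sketch below converts the error rates quoted in this section into the number of misclassified test images. It assumes each rate was measured on the full 10,000-image test set, which the text does not state explicitly for every result. The dictionary labels are informal descriptions, not official method names.

```python
TEST_SET_SIZE = 10_000  # number of MNIST test images

# Error rates (percent) quoted in this section, keyed by an informal description.
reported = {
    "linear classifier, no preprocessing": 12.0,
    "support-vector machine (original paper)": 0.8,
    "LIRA (2004)": 0.42,
    "system tested with distortions": 0.39,
    "neural network system (2011)": 0.27,
    "single CNN (2016)": 0.25,
    "DropConnect (2013) / ensemble of 5 CNNs": 0.21,
}

def misclassified(error_rate_percent: float, n: int = TEST_SET_SIZE) -> int:
    """Number of wrongly classified test images implied by an error rate."""
    return round(error_rate_percent / 100 * n)

for name, rate in reported.items():
    print(f"{name}: {rate}% -> {misclassified(rate)} of {TEST_SET_SIZE}")
```

Under that assumption, the gap between 0.25% and 0.21% amounts to only 4 test images, which is the same as the minimum number of mislabeled images reported for the dataset.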

See also

List of datasets for machine learning research
Caltech 101
LabelMe
Optical character recognition

References

[1] THE MNIST DATABASE of handwritten digits. Yann LeCun, Courant Institute, NYU; Corinna Cortes, Google Labs, New York; Christopher J.C. Burges, Microsoft Research, Redmond.
[2] Support vector machines speed pattern recognition - Vision Systems Design. Vision Systems Design. [2013-08-17].
[3] Gangaputra, Sachin. Handwritten digit database. [2013-08-17].
[4] Qiao, Yu. THE MNIST DATABASE of handwritten digits. 2007 [2013-08-18]. (Archived from the original on 2018-02-11.)
[5] Platt, John C. Using analytic QP and sparseness to speed training of support vector machines (PDF). Advances in Neural Information Processing Systems. 1999: 557–563 [2013-08-18]. (Archived from the original (PDF) on 2016-03-04.)
[6] Grother, Patrick J. NIST Special Database 19 - Handprinted Forms and Characters Database (PDF). National Institute of Standards and Technology.
[7] LeCun, Yann; Cortes, Corinna; Burges, Christopher J.C. The MNIST Handwritten Digit Database. Yann LeCun's Website yann.lecun.com. [2020-04-30].
[8] Kussul, Ernst; Baidyk, Tatiana. Improved method of handwritten digit recognition tested on MNIST database. Image and Vision Computing. 2004, 22 (12): 971–981. doi:10.1016/j.imavis.2004.03.008.
[9] Zhang, Bin; Srihari, Sargur N. Fast k-Nearest Neighbor Classification Using Cluster-Based Trees (PDF). IEEE Transactions on Pattern Analysis and Machine Intelligence. 2004, 26 (4): 525–528 [2020-04-20]. PMID 15382657. doi:10.1109/TPAMI.2004.1265868. (Archived from the original (PDF) on 2021-07-25.)
[10] LeCun, Yann; Léon Bottou; Yoshua Bengio; Patrick Haffner. Gradient-Based Learning Applied to Document Recognition (PDF). Proceedings of the IEEE. 1998, 86 (11): 2278–2324 [2013-08-18]. doi:10.1109/5.726791.
[11] Muller, Nicolas M.; Markert, Karla. Identifying Mislabeled Instances in Classification Datasets. 2019 International Joint Conference on Neural Networks (IJCNN). IEEE: 1–8. July 2019. ISBN 978-1-7281-1985-4. arXiv:1912.05283. doi:10.1109/IJCNN.2019.8851920.
[12] NIST. The EMNIST Dataset. NIST. 2017-04-04 [2022-04-11].
[13] NIST. NIST Special Database 19. NIST. 2010-08-27 [2022-04-11].
[14] Cohen, G.; Afshar, S.; Tapson, J.; van Schaik, A. EMNIST: an extension of MNIST to handwritten letters. 2017. arXiv:1702.05373 [cs.CV].
[15] Cohen, G.; Afshar, S.; Tapson, J.; van Schaik, A. EMNIST: an extension of MNIST to handwritten letters. 2017. arXiv:1702.05373v1 [cs.CV].
[16] Cireșan, Dan; Ueli Meier; Jürgen Schmidhuber. Multi-column deep neural networks for image classification (PDF). 2012 IEEE Conference on Computer Vision and Pattern Recognition. 2012: 3642–3649. CiteSeerX 10.1.1.300.3283. ISBN 978-1-4673-1228-8. S2CID 2161592. arXiv:1202.2745. doi:10.1109/CVPR.2012.6248110.
[17] Kussul, Ernst; Tatiana Baidyk. Improved method of handwritten digit recognition tested on MNIST database (PDF). Image and Vision Computing. 2004, 22 (12): 971–981 [2013-09-20]. doi:10.1016/j.imavis.2004.03.008. (Archived from the original (PDF) on 2013-09-21.)
[18] Ranzato, Marc'Aurelio; Christopher Poultney; Sumit Chopra; Yann LeCun. Efficient Learning of Sparse Representations with an Energy-Based Model (PDF). Advances in Neural Information Processing Systems. 2006, 19: 1137–1144 [2013-09-20].
[19] Ciresan, Dan Claudiu; Ueli Meier; Luca Maria Gambardella; Jürgen Schmidhuber. Convolutional neural network committees for handwritten character classification (PDF). 2011 International Conference on Document Analysis and Recognition (ICDAR). 2011: 1135–1139 [2013-09-20]. CiteSeerX 10.1.1.465.2138. ISBN 978-1-4577-1350-7. S2CID 10122297. doi:10.1109/ICDAR.2011.229. (Archived from the original (PDF) on 2016-02-22.)
[20] Wan, Li; Matthew Zeiler; Sixin Zhang; Yann LeCun; Rob Fergus. Regularization of Neural Network using DropConnect. International Conference on Machine Learning (ICML). 2013.
[21] SimpleNet. Lets Keep it simple, Using simple architectures to outperform deeper and more complex architectures. 2016 [2020-12-03]. arXiv:1608.06037.
[22] SimpNet. Towards Principled Design of Deep Convolutional Networks: Introducing SimpNet. Github. 2018 [2020-12-03]. arXiv:1802.06205.
[23] Romanuke, Vadim. Parallel Computing Center (Khmelnytskyi, Ukraine) represents an ensemble of 5 convolutional neural networks which performs on MNIST at 0.21 percent error rate. [2016-11-24].
[24] Romanuke, Vadim. Training data expansion and boosting of convolutional neural networks for reducing the MNIST dataset error rate. Research Bulletin of NTUU "Kyiv Polytechnic Institute". 2016, 6 (6): 29–34. doi:10.20535/1810-0546.2016.6.84115.

Further reading

Ciresan, Dan; Meier, Ueli; Schmidhuber, Jürgen. Multi-column deep neural networks for image classification (PDF). 2012 IEEE Conference on Computer Vision and Pattern Recognition. New York, NY: Institute of Electrical and Electronics Engineers. June 2012: 3642–3649 [2013-12-09]. CiteSeerX 10.1.1.300.3283 . ISBN 9781467312264. OCLC 812295155. S2CID 2161592. arXiv:1202.2745 . doi:10.1109/CVPR.2012.6248110.
