National Chengchi University Institutional Repository (NCCUR): Item 140.119/134085


Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/134085

Title:	應用 Auto-encoder 技術於無監督漢字圖像轉譯 Unsupervised Chinese character image translation based on Auto-encoder
Authors:	邱柏森 Chiu, Po-Sen
Contributors:	劉昭麟 Liu, Chao-Lin 邱柏森 Chiu, Po-Sen
Keywords:	Image processing; Image translation
Date:	2021
Issue Date:	2021-03-02 14:32:19 (UTC+8)
Abstract:	Optical character recognition (OCR), which analyzes and recognizes the characters in image files, has become an important and widely used technology. However, the Chinese characters in the source material cannot always be recognized by a given OCR model, for two main reasons. First, the font used in the source material is unknown, so the thickness, length, and shape of each character's strokes may differ from what the model expects; if that font falls outside the range the OCR model can recognize, recognition is very likely to be difficult. Second, the source material may be stained or blurred for various reasons, so the scanned images are of poor quality and cannot be recognized. Faced with these problems, unless another recognition model that can handle such features is found, the only option is to spend a great deal of time labeling additional data for training, which makes it hard to solve the OCR problem quickly. This study therefore applies Auto-encoder techniques to build a Chinese character image translation model that can be trained in an unsupervised manner and used to preprocess the scanned images in a dataset. To evaluate the effect of preprocessing on recognition, the preprocessed images are compared with images that have not been preprocessed, using a fixed OCR model.
Description:	Master's thesis, National Chengchi University, Department of Computer Science, 107753029
Source URI:	http://thesis.lib.nccu.edu.tw/record/#G0107753029
Data Type:	thesis
DOI:	10.6814/NCCU202100219
Appears in Collections:	[Department of Computer Science] Theses

Files in This Item:

File	Description	Size	Format
302901.pdf		10267Kb	Adobe PDF
