Handwriting Recognition

Handwriting recognition is the problem of recognizing a handwritten word; the input may be obtained by scanning (offline recognition) or from a digitizing tablet (online recognition). Whereas recognition of machine-printed text (OCR) is largely solved for clean documents, it remains an open problem for noisy documents, documents in small fonts, and handwritten documents.


Mathlet: Online Handwritten Mathematical Expression Recognition

Project Description

We have developed a system for online handwritten mathematical expression recognition, together with a user interface for writing scientific articles. A neural network is trained to recognize each stroke, and a recursive algorithm parses the expression by combining the neural network output with the structure of the expression. The interface integrates the built-in recognition capabilities of Microsoft's Tablet PC API for textual input and also supports conversion of hand-drawn figures into PNG format, which enables the user to enter text, write mathematics, and draw figures in a single interface. After recognition, all output is combined into a single LaTeX source file and compiled into a PDF file.
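The recursive parsing step can be illustrated with a minimal sketch. This is not the parser used in Mathlet; it only shows the general idea of combining per-symbol classifier labels with spatial structure to produce LaTeX. The Symbol class, the to_latex function, and the 0.2/0.8 vertical thresholds are all hypothetical choices made for illustration, and the labels are assumed to come from an already-trained classifier. Only superscripts and subscripts are handled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Symbol:
    label: str  # assumed classifier output for this symbol
    x: float    # left edge of the bounding box
    y: float    # top edge (y grows downward, as on a tablet)
    w: float    # bounding box width
    h: float    # bounding box height

    @property
    def cy(self) -> float:
        """Vertical center of the bounding box."""
        return self.y + self.h / 2


def to_latex(symbols: List[Symbol]) -> str:
    """Recursively convert spatially arranged symbols into a LaTeX string."""
    if not symbols:
        return ""
    ordered = sorted(symbols, key=lambda s: s.x)
    base, rest = ordered[0], ordered[1:]

    # Illustrative thresholds: a following symbol whose center lies in the
    # top 20% of the base (or above it) is treated as a superscript; one in
    # the bottom 20% (or below it) is treated as a subscript.
    top = base.y + 0.2 * base.h
    bottom = base.y + 0.8 * base.h

    sup: List[Symbol] = []
    sub: List[Symbol] = []
    i = 0
    while i < len(rest):
        s = rest[i]
        if s.cy < top:
            sup.append(s)
        elif s.cy > bottom:
            sub.append(s)
        else:
            break  # back on the baseline: the script region has ended
        i += 1

    out = base.label
    if sub:
        out += "_{" + to_latex(sub) + "}"
    if sup:
        out += "^{" + to_latex(sup) + "}"
    # The remaining symbols are parsed as a new baseline expression.
    return out + to_latex(rest[i:])


# Example layout for "x squared plus y sub i" (coordinates are made up).
example = [
    Symbol("x", 0, 10, 10, 10),
    Symbol("2", 11, 4, 5, 6),
    Symbol("+", 20, 12, 6, 6),
    Symbol("y", 30, 10, 10, 10),
    Symbol("i", 41, 17, 4, 6),
]
print(to_latex(example))  # x^{2}+y_{i}
```

Because each script region is parsed by the same function, nested scripts such as a superscript of a superscript come out without extra cases. A practical parser would also need to handle fractions, radicals, and ambiguous classifier output, none of which this sketch attempts.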



Related papers:

"Türkçe İçin Tablet PC Ortamında Çevrimiçi Yazı Tanıma Sistemi", Esra Vural, Hakan Erdoğan, Kemal Oflazer, Berrin Yanikoglu, IEEE SIU Proceedings, Apr 2004.

"Türkçe için Geniş Dağarcıklı Doküman Tanıma Sistemi", Proceedings of SIU 2003.

"Turkish handwritten text recognition: A Case of Agglutinative Languages", Proceedings of SPIE, Jan 2003.

"Text Detection and Extraction in Outdoor Scenes", Alisher Kholmatov, Aytül Erçil, Berrin Yanikoglu, IEEE SIU Conference, June 12-14, 2002, Pamukkale, Turkey.

"Pitch Estimation and Pitch-Based Segmentation for Dot-Matrix Text Recognition", Berrin Yanikoglu, International Journal on Document Analysis and Recognition, 3(1), 2000.

"Segmentation of Off-line Cursive Handwriting Using Linear Programming", Berrin Yanikoglu and Peter Sandon, Pattern Recognition, 31(12), 1998.

Projects:

TÜBİTAK Project (No: 101E012, Duration: 9/2001-9/2003): "Document Segmentation and Recognition for Turkish".

In this project, we developed image processing and document recognition algorithms suitable for Turkish, since the agglutinative nature of Turkish morphology prevents the use of previously developed OCR systems.