MICR E-13B source and language files for Tesseract 3.01 / TesseractDotNet
This archive contains a compiled language file (in the "tessdata" folder) and source files (in the "Source" folder) for the MICR E-13B font found on the bottom of cheques. To use it, you only need to copy the mcr.traineddata file to your Tesseract "tessdata" folder. The source files are only needed if you wish to improve the training.
Usage: tesseract image.tif outfile -l mcr
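The same command can be driven from a script. The following is a minimal Python sketch, not part of this archive: the helper names build_command and run_micr_ocr are hypothetical, and it assumes the tesseract executable is on your PATH and writes its result to <outfile>.txt.

```python
import subprocess
from pathlib import Path


def build_command(image, out_base, lang="mcr"):
    """Return the argument list matching the usage line above."""
    return ["tesseract", str(image), str(out_base), "-l", lang]


def run_micr_ocr(image, out_base):
    """Run Tesseract on one image and return the recognised text.

    Assumptions: `tesseract` is on PATH, mcr.traineddata is already in
    the tessdata folder, and the output lands in <out_base>.txt.
    """
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(build_command(image, out_base), check=True)
    return Path(str(out_base) + ".txt").read_text()
```

Calling run_micr_ocr("image.tif", "outfile") would run the usage line as written and return the contents of outfile.txt.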
Also included in this archive are the source and compiled files for Tesseract 2 / TessNet 2. Read the readme.htm file in that zip for more information.
Special characters to note (also, strip the spaces from the OCR output prior to parsing):
Character        | Symbol looks a bit like this | Standard Output char |
Transit / Route  | |:                           | [                    |
On-US / Account  | ||#                          | @                    |
Dash             | lll                          | -                    |
Cheque Amount    | .l'                          | $                    |
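As an illustration of handling these characters, here is a minimal Python sketch that strips the spaces and then labels each piece of the OCR output using the table above. The function name tokenize_micr, the label names, and the sample string are hypothetical. It does not validate any particular cheque layout.

```python
import re

# Stand-in output characters from the table above, mapped to a label
# for the E-13B symbol each one represents (label names are made up).
MICR_SYMBOLS = {
    "[": "transit",
    "@": "on_us",
    "-": "dash",
    "$": "amount",
}


def tokenize_micr(raw):
    """Strip whitespace, then split OCR output into (kind, text) tokens."""
    compact = re.sub(r"\s+", "", raw)
    tokens = []
    # A token is either a run of ASCII digits or one other character.
    for match in re.finditer(r"[0-9]+|[^0-9]", compact):
        text = match.group(0)
        if "0" <= text[0] <= "9":
            tokens.append(("digits", text))
        else:
            # Anything not in the table is kept and marked as unknown.
            tokens.append((MICR_SYMBOLS.get(text, "unknown"), text))
    return tokens


# Made-up sample line, not real cheque data.
sample = "[ 123456 [ 789 @ 42"
print(tokenize_micr(sample))
```

Keeping unrecognised characters as "unknown" tokens, rather than dropping them, makes misreads visible to whatever code parses the line afterwards.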
If you have any changes or improvements, let me know and I shall include them here for everyone.
Hunter Beanland
hunter @ beanland.net.au
http://www.beanland.net.au/programming/dotnet/