Today I gave a technical talk at OpenFest 2009 together with my colleague Vesko Kolev. We presented OCR technologies and the open source OCR engine Tesseract.
I demonstrated how to download and compile Tesseract, how it works, when it recognizes text correctly and when it fails, and how to train it for a new language. I trained Tesseract for Bulgarian and English. The full presentation is available for download along with the demonstration scripts:
- Tesseract-OCR-Engine-v1.01.ppt - the presentation
- Tesseract-demonstrations-8-Nov-2009.zip - the training materials (the scripts and executables are Windows-based but can be adapted for other platforms)
Enjoy.

