
Using EasyOCR on PythonAnywhere

EasyOCR requires PyTorch in order to function. Looking at this page: https://www.pythonanywhere.com/batteries_included/ I can see that PyTorch comes pre-installed on PythonAnywhere systems. However, the pre-installed version is too old to work with newer libraries. Can I do anything about it? Upgrading my torch version causes a "disk quota exceeded" error.

If the required version is not pre-installed, the only option is to install it yourself, so I'm afraid you would need to upgrade your account to get more disk quota. We are currently working on a new system image which will include newer versions of the pre-installed packages -- could you tell us which versions you tried to install, or which ones you need?
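For anyone answering that question, here is a minimal sketch for reporting which versions are currently installed in an environment. It only uses the standard library's importlib.metadata; the helper names installed_version and version_report are made up for this example, and the package names are simply the two mentioned in this thread.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_report(packages):
    """Build a one-line-per-package summary that can be pasted into a forum post."""
    lines = []
    for name in packages:
        version = installed_version(name)
        lines.append(f"{name}: {version or 'not installed'}")
    return "\n".join(lines)


# Package names taken from this thread; adjust to whatever you need to report.
print(version_report(["torch", "easyocr"]))
```

Running it in the same console or virtualenv that your code uses matters, since different environments can have different versions installed.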

I was trying to install torch version 2.5.1 (the latest) in order to use EasyOCR. However, that exceeded my disk quota. Considering how much better EasyOCR is than Tesseract, you should consider adding that library as a pre-installed package too.

Thanks! I'm not sure if we'll be able to add EasyOCR to this system image, but I put a note into our backlog on your behalf.

Is there any way to obtain good-quality OCR results from images using PythonAnywhere? Tesseract doesn't work well on images that are not black text on a white background.

That's rather a general question -- but you could try some image preprocessing and/or fine-tuning Tesseract's parameters...
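As one illustration of what "image preprocessing" could mean here, the sketch below binarizes a grayscale image with Otsu's method and flips it when the text is lighter than the background, so the result is black text on white. This is not something the posters in this thread confirmed using; it is a generic technique, written in pure Python so it runs anywhere. The function names are made up for the example, and it assumes (as a heuristic) that the background covers more pixels than the text.

```python
def otsu_threshold(pixels):
    """Return the 0-255 threshold that maximizes between-class variance (Otsu's method)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_low = 0    # number of pixels at or below the candidate threshold
    sum_low = 0  # sum of their intensities
    for t in range(256):
        w_low += hist[t]
        sum_low += t * hist[t]
        if w_low == 0:
            continue
        w_high = total - w_low
        if w_high == 0:
            break
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        var_between = w_low * w_high * (mean_low - mean_high) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize_for_ocr(pixels):
    """Map grayscale pixels to 0 (text) or 255 (background).

    Assumption: the background is the majority class, so if most pixels
    are dark, the image is inverted to give black text on white.
    """
    t = otsu_threshold(pixels)
    light = sum(1 for p in pixels if p > t)
    invert = (len(pixels) - light) > light
    out = []
    for p in pixels:
        is_light = p > t
        if invert:
            is_light = not is_light
        out.append(255 if is_light else 0)
    return out


# Toy example: light text (230) on a dark background (20).
print(binarize_for_ocr([20] * 6 + [230] * 2))
```

With Pillow, the flat pixel list could come from something like list(img.convert("L").getdata()), and the result could be written back with putdata before handing the image to Tesseract. Whether this helps depends on the images; uneven lighting usually needs a local (adaptive) threshold instead.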

I tried those. It somehow gives me different results on my main machine and on PythonAnywhere, even though I am using the same versions.

What is the difference?

The difference is very large. On my machine it produces coherent text, whereas on PythonAnywhere it produces gibberish.

Something in your software stack or configuration must be different.
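One quick thing to compare between the two environments is the version of the Tesseract binary itself, which is separate from the version of any Python wrapper package. The sketch below is one way to do that with the standard library; the function names are made up for this example, and the parsing assumes the usual banner format where the first line looks like "tesseract 4.1.1" or "tesseract v5.x.y".

```python
import re
import shutil
import subprocess


def parse_tesseract_version(output):
    """Extract (major, minor, patch) from the text printed by `tesseract --version`."""
    match = re.search(r"tesseract\s+v?(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def installed_tesseract_version():
    """Return the version tuple of the tesseract binary on PATH, or None if not found."""
    exe = shutil.which("tesseract")
    if exe is None:
        return None
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    # Check both streams in case a build prints the banner to stderr.
    return parse_tesseract_version(result.stdout + result.stderr)


print(installed_tesseract_version())
```

Running this on both machines and comparing the tuples shows whether the OCR engine differs even when the Python packages match.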

Sorry, as it turns out, PythonAnywhere only provides Tesseract v4, while I am running Tesseract v5 on my machine. Any ideas how to upgrade, or otherwise get good-quality OCR results?

I just checked our upcoming system image, but it looks like we'll still be on Tesseract v4.1.1 for that :-(

Perhaps there are APIs out there that you could use? I was thinking that OpenAI might provide one (as they seem to be able to do text recognition in their ChatGPT interface) but I can't find any good way to do that via their API.