Can I install pdf2image?

Hi everyone,

I'm using this awesome service and have a question. Is it possible to install the pdf2image module? I ask because the module has some dependencies (such as poppler), and I'm not sure whether they are already installed or whether I can install them myself. Basically, I need to split PDF files into pages and save each page as a JPG (or PNG) for further OCR. I do this on my home machine with pdf2image. However, if you know another way that will work on PythonAnywhere, I'll be more than glad to hear it.
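For reference, the page-splitting workflow described above might look roughly like the sketch below. It assumes pdf2image is installed and that the Poppler tools are available; page_filename and split_pdf_to_images are hypothetical helper names chosen for illustration, and the 200 dpi setting is only an example value.

```python
from pathlib import Path


def page_filename(stem: str, page_number: int, ext: str = "jpg") -> str:
    """Build a zero-padded file name such as 'report_page_003.jpg'."""
    return f"{stem}_page_{page_number:03d}.{ext}"


def split_pdf_to_images(pdf_path: str, out_dir: str, dpi: int = 200) -> list:
    """Render each page of a PDF to its own JPEG and return the written paths."""
    # Imported here so the helper above is usable without pdf2image installed.
    from pdf2image import convert_from_path

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    stem = Path(pdf_path).stem

    written = []
    # convert_from_path returns one PIL image per page of the PDF.
    for number, image in enumerate(convert_from_path(pdf_path, dpi=dpi), start=1):
        target = out / page_filename(stem, number)
        image.save(target, "JPEG")
        written.append(target)
    return written
```

Calling split_pdf_to_images("scan.pdf", "pages") would write one JPEG per page into the pages directory, ready to be passed to an OCR step.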

Thank you in advance!

If you are familiar with compiling/building from source, then you could go to poppler, download their latest stable release, and compile it into your homedir. Then you could get pdf2image to run off of that compiled binary.

Hi conrad,

Thank you for your reply. Unfortunately I'm not familiar with compiling/building, but we have Google and Stack Overflow, so I'll read up and try to figure it out. I'll probably have to do something along the lines of this: https://askubuntu.com/questions/25961/how-do-i-install-a-tar-gz-or-tar-bz2-file While we're on the subject, do I understand correctly that I have to run the steps described at that link from a PythonAnywhere console?

That link is pretty good, yes. On PythonAnywhere, you'd use a bash console. Use the wget command to download the file (you give it a URL and it downloads the file into the current directory), then use tar xf filename.tar to extract the archive. Because you can't use the sudo command on our systems, you'd need to pass the --prefix option to ./configure to tell the build system that, when you install, you want everything to go into /home/xelandar/.local -- something like this:

./configure --prefix=/home/xelandar/.local

Hopefully with those hints and a bit of Google-fu, you'll get it working :-)
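Once a build has been installed under that prefix, pdf2image can be pointed at it through the poppler_path argument of convert_from_path. The sketch below assumes the install step placed pdftoppm and pdftocairo in the bin directory under the prefix, which is the usual layout but has not been verified here; local_poppler_bin and convert_with_local_poppler are hypothetical helper names.

```python
from pathlib import PurePosixPath


def local_poppler_bin(prefix: str) -> str:
    """Return the bin directory under an install prefix (assumed layout)."""
    return str(PurePosixPath(prefix) / "bin")


def convert_with_local_poppler(pdf_path: str, prefix: str):
    """Convert a PDF using the Poppler tools installed under the given prefix."""
    # Third-party import kept local so the path helper works on its own.
    from pdf2image import convert_from_path

    # poppler_path tells pdf2image which directory holds pdftoppm/pdftocairo.
    return convert_from_path(pdf_path, poppler_path=local_poppler_bin(prefix))
```

For example, convert_with_local_poppler("scan.pdf", "/home/xelandar/.local") would look for the tools in /home/xelandar/.local/bin.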

Run this to see the options you can use:

./configure --help

That might be useful if you get errors when building.

Just got the same problem and - surprisingly - you don't even need to install Poppler!

From the pdf2image docs:

Linux: Most distros ship with pdftoppm and pdftocairo. If they are not installed, refer to your package manager to install poppler-utils.

Indeed, PythonAnywhere already has these tools installed, so just install pdf2image and it works out of the box!
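A quick way to confirm this on any given machine is to check whether the two tools are on the PATH. This is a minimal sketch using only the standard library; find_poppler_tools is a hypothetical helper name, and the result depends entirely on the system where it is run.

```python
import shutil


def find_poppler_tools(names=("pdftoppm", "pdftocairo")) -> dict:
    """Map each tool name to its full path on PATH, or None if it is missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for tool, location in find_poppler_tools().items():
        print(f"{tool}: {location or 'not found'}")
```

If both tools are reported with a path, installing pdf2image itself should be all that is needed.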

Interesting -- what command did you use to install it? We were suggesting building Poppler because that provides the pdftoppm and pdftocairo tools that pdf2image relies on, but if there's an easier way, we'd love to know!