Install and config for exiftool, antiword, and pdftotext

I would really appreciate some help. I’ve installed the Bitnami ResourceSpace stack on an Amazon EC2 server. In doing some testing and preparing it for use, I’ve noticed that it is not extracting any metadata (text) from PDF or Word .doc files. This is a standard feature that is available on the ResourceSpace trial sites (hosted by them, of course).

In researching this issue, I’ve discovered that exiftool is not a standard component of the ResourceSpace stack. The config.default.php file and the Ubuntu installation instructions in the ResourceSpace documentation also describe installing antiword and xpdf (of which pdftotext is used).

Here’s what I’ve done:
First: installed libimage-exiftool-perl, antiword, and xpdf using terminal commands
Second: edited the config.php file to include the following:

$exiftool_path='/usr/share/libimage-exiftool-perl';
$antiword_path='/usr/share/antiword';
$pdftotext_path='/usr/share/xpdf';

Note: these are the paths where the applications installed when I used "sudo apt-get install … "

Installation Check in ResourceSpace now shows a Fail message next to ExifTool: ‘/usr/share/libimage-exiftool-perl/exiftool’ not found

I tried changing the directory name to what I thought ResourceSpace would be looking for (/usr/share/exiftool) and updated config.php to point to that new directory, but then the Fail message was: Unexpected output when executing ‘/usr/share/exiftool’ -ver command. Output was ‘’.

Since making these changes, tests to upload both PDF and DOC files still do not result in anything populating the extracted text field. It isn’t a problem with the files because when I upload them to the ResourceSpace trial site the text is successfully extracted. Antiword and xpdf/pdftotext aren’t included in the installation check, so I’m not sure where the issue could be with those two external apps. My organization is planning to use ResourceSpace primarily for documents and being able to extract and index text is pretty important. Any insights into linking these three external apps are most appreciated!

Hello @mkstephens

I agree with you: those components should be included in the Bitnami ResourceSpace Stack. I opened a task to investigate this issue in depth and include the missing components in future releases. Thanks for reporting it.

You do not need to rename any directories or point config.php at /usr/share. The binaries are installed in /usr/bin if you installed them using:

sudo apt-get update && sudo apt-get install -y libimage-exiftool-perl antiword xpdf

You can check this by running:

which antiword
which xpdf
which exiftool
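
As a minimal sketch of the same check in one pass, the loop below prints the directory that holds each tool. The function name tool_dir is hypothetical, and the loop looks for pdftotext rather than xpdf on the assumption that pdftotext is the binary ResourceSpace actually calls:

```shell
# Print the directory that contains the given tool, or fail if it is not on PATH.
tool_dir() {
    # command -v prints the full path of an executable found on PATH
    tool_path=$(command -v "$1") || return 1
    dirname "$tool_path"
}

for tool in exiftool antiword pdftotext; do
    if dir=$(tool_dir "$tool"); then
        echo "$tool: $dir"
    else
        echo "$tool: not found on PATH"
    fi
done
```

Whatever directory is printed for a tool is the value to try for the matching *_path setting in config.php.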

Once they’re installed, you can simply restart Apache (sudo /opt/bitnami/ctlscript.sh restart apache) and you should be able to handle the metadata of PDFs.

Best Regards,

Juan Ariza

Thank you so much for your response, @jariza!

I checked the installation using the which command as you suggested, and then set about editing the config.php file to make it work correctly. I discovered that config.php does require the following to allow ResourceSpace to find and use the external applications:

$ffmpeg_path='/usr/bin';
$exiftool_path='/usr/bin';
$antiword_path='/usr/bin';
$pdftotext_path='/usr/bin';

This is how it reads in config.default.php for all except exiftool. Anyone else who runs into this problem will be able to copy and paste from config.default.php into config.php, but will also need to change the exiftool_path to point to /usr/bin, not /usr/local/bin. I tried it with these lines commented out to see if ResourceSpace would pick up antiword and pdftotext without a path specified. It did not. ResourceSpace only successfully extracted the text after these paths were set in config.php.
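
For anyone who wants to confirm a path before putting it in config.php, here is a rough sketch. It assumes ResourceSpace appends the binary name to the configured directory, which the first Fail message suggests but which I have not verified in the ResourceSpace source. The function name has_tool is hypothetical:

```shell
# Succeeds only if DIR/TOOL is a regular file with the execute bit set.
# A directory that merely shares the tool's name (for example a data
# directory under /usr/share) fails the -f test.
has_tool() {
    [ -f "$1/$2" ] && [ -x "$1/$2" ]
}

for tool in exiftool antiword pdftotext; do
    if has_tool /usr/bin "$tool"; then
        echo "/usr/bin/$tool: ok"
    else
        echo "/usr/bin/$tool: missing or not executable"
    fi
done
```

Under that assumption, a directory such as /usr/share/antiword would fail this check because it holds the package's data files rather than the executable, while /usr/bin passes.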

Thanks again for pointing me in the right direction!
Myka Kennedy Stephens

Hello @mkstephens

Thanks for sharing it. It will be very useful for other users.

As I mentioned before, we should package those components in the stack so that you don’t have to do this: they would be included under the installation directory, and we would preconfigure the paths during installation. In the meantime, installing the components using apt-get and editing config.php as you suggested seems the best solution.

We will let you know when we can work on including them in the next release. Thanks again.

Best Regards,

Juan Ariza

Hello @mkstephens,

We have included the missing packages for the latest version of ResourceSpace 9.6. We also provide guides on how to install those missing packages in your instances for older versions in our documentation:

https://docs.bitnami.com/aws/apps/resourcespace/configuration/

I will proceed to close this thread as it has now been solved.

Regards,
Francisco de Paz