I would really appreciate some help. I've installed the Bitnami ResourceSpace stack on an Amazon EC2 server. In doing some testing and preparing it for use, I've noticed that it is not extracting any metadata (text) from PDF or Word .doc files. This is a standard feature that is available on the ResourceSpace trial sites (hosted by them, of course).
In researching this issue, I've discovered that exiftool is not a standard component of the ResourceSpace stack. The config.default.php file and the Ubuntu installation instructions in the ResourceSpace documentation also describe installing antiword and xpdf (the package that provides pdftotext, which is the tool ResourceSpace actually uses).
Here's what I've done:
First: I installed libimage-exiftool-perl, antiword, and xpdf from the terminal with "sudo apt-get install ... "
Second: edited the config.php file to include the following:
Note: these are the paths where the applications installed when I used "sudo apt-get install ... "
The Installation Check in ResourceSpace now shows a Fail message next to ExifTool: '/usr/share/libimage-exiftool-perl/exiftool' not found
I tried renaming the directory to what I thought ResourceSpace would be looking for (/usr/share/exiftool) and updated config.php to point to that new directory, but then the Fail message was: Unexpected output when executing '/usr/share/exiftool' -ver command. Output was ''.
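Since both Fail messages are about the path, one thing worth checking is where apt-get actually placed the executables. The following is only a minimal sketch: it assumes the executables are named exiftool, antiword, and pdftotext and that they are on the PATH. check_tool is a hypothetical helper written for this post, not part of ResourceSpace.

```shell
# Hypothetical helper: print the full path of an executable if it is on PATH.
check_tool() {
    tool_path=$(command -v "$1" 2>/dev/null)
    if [ -n "$tool_path" ]; then
        echo "$1: $tool_path"
    else
        echo "$1: not found on PATH"
        return 1
    fi
}

# Assumed executable names for the three external apps.
for tool in exiftool antiword pdftotext; do
    check_tool "$tool" || true
done
```

If exiftool is reported, running "exiftool -ver" from the same terminal should print a version number; that is the same command the Installation Check error message mentions.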
Since making these changes, test uploads of both PDF and DOC files still leave the extracted text field empty. The files themselves aren't the problem, because when I upload them to the ResourceSpace trial site the text is extracted successfully. Antiword and xpdf/pdftotext aren't included in the Installation Check, so I'm not sure where the issue could be with those two external apps. My organization is planning to use ResourceSpace primarily for documents, so being able to extract and index text is pretty important. Any insights into linking these three external apps would be most appreciated!