I am working on a project to allow full-text indexing of common file types such as Word, PowerPoint, Excel and PDF. After searching for a few days, I have not been able to find a single solution that covers all of them. Here is what I have found so far for each format:
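To make the goal concrete, here is a rough sketch of how I imagine the extraction step feeding the indexer: pick a command-line extractor by file extension and capture its plain-text output. This is only an illustration under assumptions, not a tested setup. It assumes the `catdoc`, `xls2csv` and `catppt` programs from the catdoc package are installed and on the PATH, that they are given legacy binary files (.doc, .xls, .ppt), and that `pdftotext` is used for PDF, which is my own assumption and not something I have settled on. The names `extractor_command` and `extract_text` are made up for this example.

```python
import subprocess
from pathlib import Path

# Assumed mapping from file extension to a catdoc-package tool.
# These tools target the legacy binary Office formats.
CATDOC_TOOLS = {
    ".doc": "catdoc",
    ".xls": "xls2csv",
    ".ppt": "catppt",
}


def extractor_command(path):
    """Return the command that should print the file's text to stdout,
    or None if no extractor is known for this extension."""
    ext = Path(path).suffix.lower()
    if ext == ".pdf":
        # Assumption: pdftotext is available; "-" sends output to stdout.
        return ["pdftotext", str(path), "-"]
    tool = CATDOC_TOOLS.get(ext)
    if tool is None:
        return None
    return [tool, str(path)]


def extract_text(path):
    """Run the matching extractor and return its output as a string."""
    cmd = extractor_command(path)
    if cmd is None:
        raise ValueError(f"no extractor known for {path}")
    result = subprocess.run(cmd, capture_output=True, check=True)
    # Replace undecodable bytes rather than failing on odd encodings.
    return result.stdout.decode("utf-8", errors="replace")
```

The text returned by `extract_text` would then be handed to whatever full-text indexer is used; only `extractor_command` can be checked without the external tools installed.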
Word
http://ftp.wagner.pp.ru/~vitus/software/catdoc/
Excel
http://ftp.wagner.pp.ru/~vitus/software/catdoc/
PowerPoint