manon2333/apache-tika-file-diff


# Apache Tika + java-diff-utils + diff2html

Examples to follow.

About

Uses the Apache Tika parser libraries to extract text from a variety of file formats (PDF, Excel, Word, MHTML, images, TXT, CSV, etc.), then uses java-diff-utils to generate a unified diff between two versions of a file. The unified diff can be fed to the diff2html library to show a side-by-side diff in the browser.
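
To make the middle step concrete, here is a minimal, dependency-free sketch of what a unified diff of two extracted texts looks like. It is not this repository's code and does not call Tika or java-diff-utils: it assumes the text has already been extracted into lists of lines, and it swaps in a simple longest-common-subsequence walk that emits a single hunk covering the whole file. The class name `UnifiedDiffSketch`, the method `unifiedDiff`, and the file names `v1.txt` and `v2.txt` are hypothetical. In a real pipeline, java-diff-utils provides this step through `DiffUtils.diff` and `UnifiedDiffUtils.generateUnifiedDiff`.

```java
import java.util.ArrayList;
import java.util.List;

public class UnifiedDiffSketch {

    /**
     * Builds a unified diff with ONE hunk that spans both inputs in full.
     * Simplified stand-in for a diff library; returns an empty list when
     * the inputs are identical.
     */
    static List<String> unifiedDiff(String oldName, String newName,
                                    List<String> oldLines, List<String> newLines) {
        int n = oldLines.size();
        int m = newLines.size();

        // lcs[i][j] = length of the longest common subsequence of
        // oldLines[i..] and newLines[j..]
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                lcs[i][j] = oldLines.get(i).equals(newLines.get(j))
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        List<String> out = new ArrayList<>();
        if (n == m && lcs[0][0] == n) {
            return out; // identical inputs: nothing to report
        }

        out.add("--- " + oldName);
        out.add("+++ " + newName);
        out.add("@@ -" + range(n) + " +" + range(m) + " @@");

        // Walk both inputs: ' ' = unchanged, '-' = removed, '+' = added.
        int i = 0;
        int j = 0;
        while (i < n || j < m) {
            if (i < n && j < m && oldLines.get(i).equals(newLines.get(j))) {
                out.add(" " + oldLines.get(i));
                i++;
                j++;
            } else if (i < n && (j == m || lcs[i + 1][j] >= lcs[i][j + 1])) {
                out.add("-" + oldLines.get(i));
                i++;
            } else {
                out.add("+" + newLines.get(j));
                j++;
            }
        }
        return out;
    }

    // Hunk range "start,count"; an empty side is written as "0,0".
    private static String range(int count) {
        return (count == 0 ? 0 : 1) + "," + count;
    }

    public static void main(String[] args) {
        // Hypothetical extracted text from two versions of a document.
        List<String> v1 = List.of("Invoice 1001", "Total: 40", "Thank you");
        List<String> v2 = List.of("Invoice 1001", "Total: 45", "Thank you");
        System.out.println(String.join("\n", unifiedDiff("v1.txt", "v2.txt", v1, v2)));
    }
}
```

Joining the returned lines with newlines gives a single diff string. That string is the kind of input diff2html is designed to render, for example with its side-by-side output format.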

Languages

  • Java 100.0%