Read office files with Java API
Posted by claudio on February 20, 2008 at 4:23 PM PST
Last year, while working on a project, I had to deal with a lot of documents (requirements, user guides, architecture, etc.) from different sources (email attachments, file shares, backups, old version control). Many had the same document name but a different date and size. So, how do you know which one is the latest, so that the others can be deleted? There were two ways to achieve that:
So, in a short time, I developed a small program that reads a set of files and prints each one's name, date of last modification and complete path. It uses the OpenOffice SDK libraries, so you need to have OpenOffice 2.x installed somewhere. Currently it reads a small set of office file types: sxw, doc, xls, odt, ods, pps, ppt and odp. Feel free to modify it to suit your needs, or even extend its functionality. The Java source code can be downloaded at DocViewer.java
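The "which copy is the latest?" question can be roughed out with the JDK alone. What follows is a minimal sketch and not the DocViewer.java source: it assumes the file system's last-modified time is trustworthy, whereas the program described here asks OpenOffice for the modification data stored inside each document. The class name LatestOfficeFiles and its methods are hypothetical, and the code needs Java 8 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical helper: NOT the author's DocViewer, only an illustration.
public class LatestOfficeFiles {

    // The office extensions listed in the post.
    static final Set<String> EXTENSIONS = new HashSet<>(Arrays.asList(
            "sxw", "doc", "xls", "odt", "ods", "pps", "ppt", "odp"));

    // Lower-cased extension of a file name, or "" when there is none.
    static String extensionOf(Path file) {
        String name = file.getFileName().toString();
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Walks root recursively and keeps, for every file name, the copy with
    // the newest file-system timestamp (assumption: mtime is reliable).
    static Map<String, Path> latestByName(Path root) throws IOException {
        Map<String, Path> latest = new TreeMap<>();
        try (Stream<Path> paths = Files.walk(root)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                if (!Files.isRegularFile(p) || !EXTENSIONS.contains(extensionOf(p))) {
                    continue;
                }
                String name = p.getFileName().toString();
                Path best = latest.get(name);
                if (best == null || Files.getLastModifiedTime(p)
                        .compareTo(Files.getLastModifiedTime(best)) > 0) {
                    latest.put(name, p);
                }
            }
        }
        return latest;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Map.Entry<String, Path> e : latestByName(root).entrySet()) {
            Path p = e.getValue();
            System.out.println("dir = " + p.toAbsolutePath().getParent());
            System.out.println("file = " + e.getKey());
            System.out.println("Modified: " + Files.getLastModifiedTime(p));
        }
    }
}
```

Copies that arrived as email attachments or were restored from backups often have their file-system timestamp reset, which is one reason to prefer the metadata stored inside the document, as the real program does.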
As I am a Linux user, this was written with Linux in mind. Windows users have reported that it works with small modifications to its runtime settings, but I don't know which modifications are needed.

Requirements

Compile time

The following libraries are needed:

$OO_HOME/program/classes/juh.jar
$OO_HOME/program/classes/jurt.jar
$OO_HOME/program/classes/jut.jar
$OO_HOME/program/classes/ridl.jar
$OO_HOME/program/classes/unoil.jar

$OO_HOME points to the OpenOffice installation. For me it is installed at /opt/broffice.org2.3 *

* BrOffice is the official Brazilian version of OpenOffice

At runtime
Compile

It is very easy to compile:

javac -classpath /opt/broffice.org2.3/program/classes/\* src/claudius/DocViewer.java

As you can see, I have used classpath wildcards, which require JDK 6 or later. To compile with JDK 5, list each jar explicitly on the classpath instead.
Run

To run it, the OpenOffice standalone program needs to be running. To avoid a graphical program popping up in a window hundreds of times, it can run in a non-graphical way. To achieve that I used X Virtual Frame Buffer (Xvfb), an X server that keeps its display in memory; this is useful for running graphical libraries on server machines. The OpenOffice SDK connects to the OpenOffice program through sockets, and the standalone program does the real job of reading the office file.
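Because the SDK talks to the running office process over a socket, it can help to check that something is listening before processing hundreds of files. The sketch below uses only the JDK. The host localhost and port 8100 are assumptions (the post does not say which port was used), and the class OfficeSocketCheck is hypothetical. The string built by unoUrl follows the usual UNO URL form that is handed to the SDK's URL resolver.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical pre-flight check; it does not use the OpenOffice SDK itself.
public class OfficeSocketCheck {

    // Builds a UNO connection URL for a socket connection.
    static String unoUrl(String host, int port) {
        return "uno:socket,host=" + host + ",port=" + port
                + ";urp;StarOffice.ComponentContext";
    }

    // Returns true when a TCP connection to host:port can be opened
    // within timeoutMillis, false otherwise.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String host = "localhost"; // assumption: office runs on this machine
        int port = 8100;           // assumption: port chosen when starting the office
        System.out.println("UNO URL: " + unoUrl(host, port));
        if (!isListening(host, port, 2000)) {
            System.err.println("No office process is listening on "
                    + host + ":" + port + "; start it before running the reader.");
        }
    }
}
```

An open port only shows that some process accepted the connection; whether it is really the office program is known only once the SDK resolves the URL.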
Result

The output will look similar to this:

dir = /home/claudio/resources/palestras/2007/10_justjava
file = diagnostico2.odp
Modified by: Claudio Miranda 5/10/2007 17:46:8

If this piece of code is useful or if you made any modification, please share it and write a comment.
Comments
Comments are listed in date ascending order (oldest first)
Submitted by nbw on Wed, 2008-02-20 19:12.
Do these libraries allow you to write M$ Office compatible documents (.doc/.ppt etc.)? If not, can you recommend anything?
Submitted by gadominas on Thu, 2008-02-21 02:08.
There's another API for this particular task: Apache POI (http://poi.apache.org/index.html). Check it out.
Submitted by claudio on Thu, 2008-02-21 05:18.
With the OpenOffice SDK it's possible to create/modify MS Office documents. See its SDK documentation.
Submitted by kszkaresz on Wed, 2009-03-04 06:46.
Thanks, this is very useful info, but how can I iterate over paragraphs and words, and how can I get information about a character's properties (like font name, color, size, etc.)?
Submitted by claudio on Thu, 2009-03-05 11:55.
I recommend taking a look at other information sources, such as http://development.openoffice.org/#COMPONENTS
Also, take a look at http://www.jopendocument.org/