[Duraspace] Filter media

Louw Venter Louw.Venter at nwu.ac.za
Fri Oct 23 09:37:20 SAST 2009


Morning all,
 
I made a bit of a mess.
A while back I uploaded some PDF documents to DSpace and ran Filter media to extract the text. Recently the creators of the PDF files sent me a batch with updated volume numbers etc. to replace the ones already on the server, so I simply removed the items and added new bitstreams. In the back of my mind I remember something about that not really being the right thing to do.
Now when I run the Filter media process again, the text doesn't get extracted. Could this be because the checksums don't match, or because the original was located in one assetstore and the new one in another?
 
Thank you in advance.
 
 
ERROR filtering, skipping bitstream:
 
        Item Handle: 10394/1886
        Bundle Name: ORIGINAL
        File Size: 287223
        Checksum: 6de2597a7cabd6ca3a995c355d9301f1 (MD5)
        Asset Store: 1
java.lang.NullPointerException
java.lang.NullPointerException
        at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:194)
        at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:182)
        at org.pdfbox.pdmodel.PDDocumentCatalog.getAllPages(PDDocumentCatalog.java:226)
        at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:216)
        at org.dspace.app.mediafilter.PDFFilter.getDestinationStream(PDFFilter.java:141)
        at org.dspace.app.mediafilter.MediaFilterManager.processBitstream(MediaFilterManager.java:668)
        at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:570)
        at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:520)
        at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:488)
        at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:427)
        at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:359)
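
One way to test the checksum theory would be to compute the MD5 of the replacement PDF on disk and compare it with the value in the error output above. What follows is only a minimal sketch, not part of DSpace: the helper names `md5_of_file` and `matches_reported_checksum` and the command-line usage are assumptions made for illustration, and the path to the PDF has to be supplied by whoever runs it.

```python
import hashlib
import sys

# Checksum copied from the "ERROR filtering" output above.
REPORTED_MD5 = "6de2597a7cabd6ca3a995c355d9301f1"


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large PDFs need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_reported_checksum(path, reported=REPORTED_MD5):
    """Compare the file's MD5 with a reported checksum, ignoring case."""
    return md5_of_file(path) == reported.strip().lower()


if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical usage: python check_md5.py /path/to/replacement.pdf
    pdf_path = sys.argv[1]
    print("local MD5:   ", md5_of_file(pdf_path))
    print("reported MD5:", REPORTED_MD5)
    print("match:       ", matches_reported_checksum(pdf_path))
```

If the two values match, the bitstream DSpace is reading is the same file as the local copy, which would suggest looking at the PDF itself rather than at the checksum.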
 
 
Louw Venter
Louw.Venter at nwu.ac.za 

Ferdinand Postma Biblioteek | Library
Potchefstroomkampus van die Noordwes-Universiteit / 
Potchefstroom Campus of the North-West University
Privaatsak | Private Bag X05
Noordbrug
2522
(018) 299 2812
 
 
