SobekCM.Builder_Library.Tools Namespace
Tools used by the builder to perform resource manipulation, such as extracting text from PDFs, converting Office files to PDF, and creating the MARC feed.
Classes
| Class | Description |
| --- | --- |
| FDA_Report_Processor | Processor class steps through and processes all the Florida Digital Archive (FDA) reports under a given directory, reading the FDA reports and saving the pertinent information into the SobekCM database |
| HTML_XML_Text_Extractor | Helper class extracts the full text from an HTML or XML file by removing all the tags and stores the resulting text (for indexing) in an output file |
| MarcXML_Load_Creator | Class is used to create the MarcXML feed of all digital resources in this SobekCM library |
| PDF_Tools | Class contains some basic methods for preparing PDF files for loading into a SobekCM library |
| Text_Cleaner | Class is used to clean incoming text files and remove offending additional spaces or stand-alone punctuation |
| Word_Powerpoint_to_PDF_Converter | Class is used to convert Word and PowerPoint files to PDF |