PDF analysis: extraction and repurposing of text, graphics & metadata in PDF files (PDF data mining)
Citation Software




PDF data-mining software lets you extract information from PDF files and repurpose it.

Citation Software Inc.
 Specialists in variable-data publishing since 1986
 
www.CitationSoftware.com     info@CitationSoftware.com


888-260-7316

These are the products we offer for PDF analysis and data mining (extraction and repurposing of text, graphics & metadata).

  • XpdfAnalyze SDK
  • XpdfInfo SDK
  • XpdfText SDK

These products are programmer's libraries/toolkits that make it easy to do dynamic PDF text extraction, PDF metadata extraction, and other kinds of PDF analysis and data mining.
 

XpdfAnalyze SDK

The XpdfAnalyze SDK is an affordable developer's library that reports the object types and colors used on one or more pages of a PDF file. The object types are:
  • images
  • text strings
  • strokes (lines)
  • fills (filled polygons)
Object-type information can be used to categorize PDF files as image-only, text-only, or image-and-text.
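The categorization described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the enum, classify_pdf(), and category_name() are hypothetical names that are not part of the SDK, and the counts are assumed to come from calls such as pdfGetNumImages() and a similar text-string query that this page does not name. Strokes and fills are ignored here.

```c
/* Hypothetical category labels; the SDK itself may not define these. */
typedef enum {
    PDF_EMPTY,          /* neither images nor text strings were found */
    PDF_IMAGE_ONLY,
    PDF_TEXT_ONLY,
    PDF_IMAGE_AND_TEXT
} PdfCategory;

/* Classify a file from object counts gathered over the analyzed pages. */
PdfCategory classify_pdf(int num_images, int num_text_strings)
{
    if (num_images > 0 && num_text_strings > 0)
        return PDF_IMAGE_AND_TEXT;
    if (num_images > 0)
        return PDF_IMAGE_ONLY;
    if (num_text_strings > 0)
        return PDF_TEXT_ONLY;
    return PDF_EMPTY;
}

/* Human-readable label for logging or routing decisions. */
const char *category_name(PdfCategory c)
{
    switch (c) {
    case PDF_IMAGE_ONLY:     return "image-only";
    case PDF_TEXT_ONLY:      return "text-only";
    case PDF_IMAGE_AND_TEXT: return "image-and-text";
    default:                 return "empty";
    }
}
```

A scanned document would typically come back as image-only, which is a common signal that it needs OCR before text extraction is useful.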

Color information includes color spaces (DeviceRGB, DeviceCMYK, Separation, etc.), as well as information on which process colors (CMYK) and/or custom colors (spot colors) are used.

The XpdfAnalyze SDK can be used in an automated workflow to determine which pages contain color and which are black & white.
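A minimal sketch of such a color check follows. The PageColorInfo struct and both functions are hypothetical and not the SDK's own types; the sketch assumes the per-page color usage has already been reduced to process-color flags and a spot-color count, and it treats any spot color as "color", which may not suit every workflow.

```c
#include <stdbool.h>

/* Hypothetical per-page summary of color usage. */
typedef struct {
    bool uses_cyan;
    bool uses_magenta;
    bool uses_yellow;
    bool uses_black;
    int  num_spot_colors;   /* custom (spot) colors used on the page */
} PageColorInfo;

/* A page counts as color if it uses any ink other than black. */
bool page_is_color(const PageColorInfo *p)
{
    return p->uses_cyan || p->uses_magenta || p->uses_yellow
        || p->num_spot_colors > 0;
}

/* Count the color pages in a document, e.g. to estimate printing cost. */
int count_color_pages(const PageColorInfo *pages, int num_pages)
{
    int count = 0;
    for (int i = 0; i < num_pages; i++) {
        if (page_is_color(&pages[i]))
            count++;
    }
    return count;
}
```

In a print workflow, the result could be used to route black & white pages to a cheaper device.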

The XpdfAnalyze SDK is available as a COM component or a DLL for Windows platforms and as a shared library for Linux platforms. Portable C++ source code is also available.
The XpdfAnalyze SDK is easy to use!

PDFHandle pdf;
int n;

pdfLoadFile(&pdf, "MyFile.pdf");

// analyze pages 1-4
pdfAnalyzePages(pdf, 1, 4);

// number of images on pages 1-4
n = pdfGetNumImages(pdf);

A free trial version of the XpdfAnalyze SDK for Windows is available for download. If you're not using Windows, call us at 888-260-7316 to get your free trial version.

The quickest and easiest way to purchase the XpdfAnalyze SDK is to buy it online at our Web store. You'll be able to download it and start using it right away.
We'll be happy to accept your order by phone or fax if you prefer not to purchase on line. To purchase by phone, call us at 888-260-7316. To purchase by fax, download our order form, fill it out, and fax it to 207-433-1160.
We accept credit cards and debit cards (American Express, Discover, MasterCard, Visa, Diners Club, and JCB).

Pricing starts at $235.00 US for a developer's license and $9.00 US per unit for runtime licenses. Minimum purchase is one developer's license and five runtime licenses.

Volume discounts are available. Call us at 888-260-7316 to get a price quote, or visit our Web store to see the volume discounts.

Pricing is subject to change without notice.

XpdfInfo SDK

The XpdfInfo SDK is an affordable developer's library that reads a PDF file and provides access to the following information:
  • page count
  • page size (per page)
  • standard metadata fields: title, subject, keywords, author, creator, producer, creation date, modification date
  • custom metadata fields (depending on the software used to create the PDF file)
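Creation and modification dates are stored in PDF files as strings of the form D:YYYYMMDDHHmmSS followed by an optional time-zone offset, with everything after the year optional. The sketch below shows one way to split such a string into fields. It assumes the date is handed to you as the raw string stored in the file, which this page does not confirm; PdfDate and parse_pdf_date() are hypothetical names, not SDK functions. The time zone is ignored and field ranges are not validated.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical container for the parsed fields (time zone ignored). */
typedef struct {
    int year, month, day, hour, minute, second;
} PdfDate;

/* Read exactly `width` digits at *s. Returns -1 if any is missing. */
static int read_digits(const char **s, int width)
{
    int value = 0;
    for (int i = 0; i < width; i++) {
        if (!isdigit((unsigned char)(*s)[i]))
            return -1;
        value = value * 10 + ((*s)[i] - '0');
    }
    *s += width;
    return value;
}

/* Parse "D:YYYYMMDDHHmmSS..."; returns 1 on success, 0 on failure.
   Omitted fields keep their defaults (month and day 1, time 00:00:00). */
int parse_pdf_date(const char *s, PdfDate *out)
{
    PdfDate d = {0, 1, 1, 0, 0, 0};
    int *fields[] = {&d.month, &d.day, &d.hour, &d.minute, &d.second};

    if (strncmp(s, "D:", 2) == 0)   /* the prefix is sometimes omitted */
        s += 2;

    d.year = read_digits(&s, 4);
    if (d.year < 0)
        return 0;

    for (int i = 0; i < 5; i++) {
        int value = read_digits(&s, 2);
        if (value < 0)
            break;                  /* remaining fields were omitted */
        *fields[i] = value;
    }

    *out = d;
    return 1;
}
```

For example, the string "D:20170315093012-05'00'" would yield 15 March 2017, 09:30:12, with the offset left unparsed.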
The XpdfInfo SDK is available as a COM component or a DLL for Windows platforms and as a shared library for Linux platforms. Portable C++ source code is also available.
The XpdfInfo SDK is easy to use!

PDFHandle pdf;
char *title;
int length;

pdfLoadFile(&pdf, "MyFile.pdf");
title = pdfGetTitle(pdf, &length);
printf("%s\n", title);

A free trial version of the XpdfInfo SDK for Windows is available for download. If you're not using Windows, call us at 888-260-7316 to get your free trial version.

The quickest and easiest way to purchase the XpdfInfo SDK is to buy it online at our Web store. You'll be able to download it and start using it right away.
We'll be happy to accept your order by phone or fax if you prefer not to purchase on line. To purchase by phone, call us at 888-260-7316. To purchase by fax, download our order form, fill it out, and fax it to 207-433-1160.
We accept credit cards and debit cards (American Express, Discover, MasterCard, Visa, Diners Club, and JCB).

Pricing starts at $235.00 US for a developer's license and $9.00 US per unit for runtime licenses. Minimum purchase is one developer's license and five runtime licenses.

Volume discounts are available. Call us at 888-260-7316 to get a price quote, or visit our Web store to see the volume discounts.

Pricing is subject to change without notice.

XpdfText SDK

The XpdfText SDK is an affordable developer's library that extracts plain text from a PDF file. The PDF file can be on disk or in memory. Likewise, the text can be extracted to memory or directly to disk.

The XpdfText SDK can be used in different ways:

  • Convert entire PDF files or individual pages to plain text, maintaining layout or converting to "reading order."
  • Extract text from a specified rectangle on a page (useful for extracting text from forms).
  • Convert pages into word lists: for each word, you can retrieve the font name and font size, text color, word position on the page, and character offset (for highlight files).
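To make the rectangle and word-list ideas concrete, here is a small sketch of how a word list with page positions can be filtered by a rectangle, for example to pull one field out of a form. The SDK performs rectangle extraction itself; the Word and Rect types and both functions below are hypothetical and only illustrate the principle. Coordinates are assumed to share one page coordinate system.

```c
#include <stddef.h>

/* Hypothetical word record: text plus its bounding box on the page. */
typedef struct {
    const char *text;
    double x_min, y_min, x_max, y_max;
} Word;

/* Hypothetical rectangle in the same coordinate system. */
typedef struct {
    double x_min, y_min, x_max, y_max;
} Rect;

/* True if the word's bounding box lies entirely inside the rectangle. */
static int word_in_rect(const Word *w, const Rect *r)
{
    return w->x_min >= r->x_min && w->x_max <= r->x_max
        && w->y_min >= r->y_min && w->y_max <= r->y_max;
}

/* Collect pointers to the words inside `r`, in their original order.
   Writes at most `out_cap` entries and returns the number written. */
size_t words_in_rect(const Word *words, size_t n, const Rect *r,
                     const Word **out, size_t out_cap)
{
    size_t found = 0;
    for (size_t i = 0; i < n && found < out_cap; i++) {
        if (word_in_rect(&words[i], r))
            out[found++] = &words[i];
    }
    return found;
}
```

Requiring full containment is a design choice. A form with tight field boundaries might instead accept words that merely overlap the rectangle.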
The extracted text can be converted to a wide choice of standard encodings:

  • UTF-8 Unicode
  • Latin1 (8-bit ISO-8859-1)
  • 7-bit ASCII
  • ISO-2022-CN (simplified Chinese)
  • EUC-CN (simplified Chinese)
  • Big5 (traditional Chinese)
  • KOI8-R (Cyrillic)
  • ISO-8859-7 (Greek)
  • ISO-2022-JP (Japanese)
  • EUC-JP (Japanese)
  • Shift-JIS (Japanese)
  • KSX1001 (Korean)
  • TIS-620 (Thai)
  • ISO-8859-9 (Turkish)
Other encodings can be supported upon request.
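The choice of encoding determines how each extracted character is written out as bytes. The sketch below shows the difference for three of the encodings listed above, starting from a Unicode code point. The function names are hypothetical and not part of the SDK, and substituting '?' for characters that an encoding cannot represent is an assumption, not documented SDK behavior.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one code point as UTF-8. Returns the byte count (1-4),
   or 0 if the value is not a valid Unicode scalar value. */
size_t encode_utf8(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;               /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Latin1 covers U+0000..U+00FF directly; anything else becomes '?'. */
unsigned char encode_latin1(uint32_t cp)
{
    return cp <= 0xFF ? (unsigned char)cp : (unsigned char)'?';
}

/* 7-bit ASCII covers U+0000..U+007F; anything else becomes '?'. */
unsigned char encode_ascii(uint32_t cp)
{
    return cp < 0x80 ? (unsigned char)cp : (unsigned char)'?';
}
```

An accented letter such as U+00E9 takes two bytes in UTF-8, one byte in Latin1, and cannot be represented in 7-bit ASCII at all, which is why the encoding should match what the downstream system expects.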

In addition to the features described above, the XpdfText SDK includes all the functionality of the XpdfInfo SDK.

The XpdfText SDK is available as a COM component or a DLL for Windows platforms and as a shared library for Linux platforms. Portable C++ source code is also available.
The XpdfText SDK is easy to use!

PDFHandle pdf;
char *buf;
int length;

pdfLoadFile(&pdf, "MyFile.pdf");

// convert to a text file on disk...
pdfConvertToTextFile(pdf, 1, 5, "MyFile.txt");

// ... or convert in memory
buf = pdfConvertToTextString(pdf, 1, 5, &length);

A free trial version of the XpdfText SDK for Windows is available for download. If you're not using Windows, call us at 888-260-7316 to get your free trial version.

The quickest and easiest way to purchase the XpdfText SDK is to buy it online at our Web store. You'll be able to download it and start using it right away.
We'll be happy to accept your order by phone or fax if you prefer not to purchase on line. To purchase by phone, call us at 888-260-7316. To purchase by fax, download our order form, fill it out, and fax it to 207-433-1160.
We accept credit cards and debit cards (American Express, Discover, MasterCard, Visa, Diners Club, and JCB).

Pricing starts at $475.00 US for a developer's license and $18.00 US per unit for runtime licenses. Minimum purchase is one developer's license and five runtime licenses.

Volume discounts are available. Call us at 888-260-7316 to get a price quote, or visit our Web store to see the volume discounts.

Pricing is subject to change without notice.
 
Can't find exactly what you need? Not sure exactly what you need? Contact us by phone at 888-260-7316, or send email to info@CitationSoftware.com. We can help you find appropriate software for your requirements.
 
 




    

Copyright © 2001-2017 Citation Software Inc.