Class DataExtractionModule

Static interface to the PDFTron SDK's data extraction functionality.

Inheritance
object
DataExtractionModule
Inherited Members
object.ToString()
object.Equals(object)
object.Equals(object, object)
object.ReferenceEquals(object, object)
object.GetHashCode()
object.GetType()
Namespace: pdftron.PDF
Assembly: PDFNet.dll
Syntax
public sealed class DataExtractionModule

Constructors

DataExtractionModule()

Declaration
public DataExtractionModule()

Methods

DetectAndAddFormFieldsToPDF(PDFDoc)

Perform automatic form field detection, then insert the fields into the PDF.

Declaration
public static void DetectAndAddFormFieldsToPDF(PDFDoc doc)
Parameters
Type Name Description
PDFDoc doc

The PDF document in which fields are detected and into which they are inserted
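A minimal sketch of how this overload might be called. It assumes the SDK has already been initialized with a valid license key and that the data extraction add-on is installed where the SDK can find it. The class name, method name, and both path parameters are hypothetical; the `PDFDoc(string)` constructor and `PDFDoc.Save(string, SDFDoc.SaveOptions)` come from the wider PDFNet API rather than from this page.

```csharp
using pdftron.PDF;
using pdftron.SDF;

static class FormFieldDetectionExample
{
    // Hypothetical helper: detects form fields in the input PDF and
    // saves a copy that contains the inserted fields.
    // Assumes PDFNet has been initialized before this is called.
    public static void AddDetectedFields(string inputPdfPath, string outputPdfPath)
    {
        using (PDFDoc doc = new PDFDoc(inputPdfPath))
        {
            // Detection and insertion both operate on the same document.
            DataExtractionModule.DetectAndAddFormFieldsToPDF(doc);

            // The method returns void, so the modified document must be
            // saved explicitly to persist the new fields.
            doc.Save(outputPdfPath, SDFDoc.SaveOptions.e_linearized);
        }
    }
}
```

Because the method modifies the document in place, saving to a separate output path keeps the original file untouched.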

DetectAndAddFormFieldsToPDF(PDFDoc, DataExtractionOptions)

Perform automatic form field detection, then insert the fields into the PDF. Note: The FormKeyValue engine is experimental and subject to change.

Declaration
public static void DetectAndAddFormFieldsToPDF(PDFDoc doc, DataExtractionOptions options)
Parameters
Type Name Description
PDFDoc doc

The PDF document in which fields are detected and into which they are inserted

DataExtractionOptions options

Data extraction options

ExtractData(string, string, DataExtractionEngine)

Perform data extraction on a PDF file using the specified engine. Note: The FormKeyValue engine is experimental and subject to change.

Declaration
public static void ExtractData(string input_pdf_file, string output_json_file, DataExtractionModule.DataExtractionEngine engine)
Parameters
Type Name Description
string input_pdf_file

The source document filename

string output_json_file

The resulting JSON filename

DataExtractionModule.DataExtractionEngine engine

The extraction engine

ExtractData(string, string, DataExtractionEngine, DataExtractionOptions)

Perform data extraction on a PDF file using the specified engine. Note: The FormKeyValue engine is experimental and subject to change.

Declaration
public static void ExtractData(string input_pdf_file, string output_json_file, DataExtractionModule.DataExtractionEngine engine, DataExtractionOptions options)
Parameters
Type Name Description
string input_pdf_file

The source document filename

string output_json_file

The resulting JSON filename

DataExtractionModule.DataExtractionEngine engine

The extraction engine

DataExtractionOptions options

Data extraction options

ExtractData(string, DataExtractionEngine)

Perform data extraction on a PDF file using the specified engine and return the resulting JSON string. Note: The FormKeyValue engine is experimental and subject to change.

Declaration
public static string ExtractData(string input_pdf_file, DataExtractionModule.DataExtractionEngine engine)
Parameters
Type Name Description
string input_pdf_file

The source document filename

DataExtractionModule.DataExtractionEngine engine

The extraction engine

Returns
Type Description
string

JSON string representing the extracted results
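The following sketch shows one way to combine this overload with `IsModuleAvailable`, so that a missing or unlicensed engine is reported before extraction is attempted. It assumes the SDK is already initialized. The class and method names are hypothetical, and the engine is passed in by the caller, since this page does not list the members of `DataExtractionEngine`.

```csharp
using System;
using pdftron.PDF;

static class ExtractDataExample
{
    // Hypothetical helper: returns the extraction result as a JSON string,
    // or throws if the requested engine cannot be used.
    public static string ExtractJson(
        string inputPdfPath,
        DataExtractionModule.DataExtractionEngine engine)
    {
        // Check availability (and licensing) first.
        if (!DataExtractionModule.IsModuleAvailable(engine))
        {
            throw new InvalidOperationException(
                "The requested data extraction engine is not available.");
        }

        // Uses the overload that returns JSON instead of writing a file.
        return DataExtractionModule.ExtractData(inputPdfPath, engine);
    }
}
```

The string-returning overload is convenient when the JSON is parsed in memory; the overloads that take `output_json_file` write the same kind of result to disk instead.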

ExtractData(string, DataExtractionEngine, DataExtractionOptions)

Perform data extraction on a PDF file using the specified engine and return the resulting JSON string. Note: The FormKeyValue engine is experimental and subject to change.

Declaration
public static string ExtractData(string input_pdf_file, DataExtractionModule.DataExtractionEngine engine, DataExtractionOptions options)
Parameters
Type Name Description
string input_pdf_file

The source document filename

DataExtractionModule.DataExtractionEngine engine

The extraction engine

DataExtractionOptions options

Data extraction options

Returns
Type Description
string

JSON string representing the extracted results

ExtractToXLSX(string, string)

Perform data extraction on a PDF and write the results to a file in XLSX format.

Declaration
public static void ExtractToXLSX(string input_pdf_file, string output_xlsx_file)
Parameters
Type Name Description
string input_pdf_file

The source document filename

string output_xlsx_file

The resulting XLSX filename
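As a rough illustration, a call to this overload could be wrapped as shown here. The sketch assumes the SDK is initialized and the required extraction module is installed. The class name, method name, and the input file check are illustrative additions, not part of the API.

```csharp
using System.IO;
using pdftron.PDF;

static class ExtractToXlsxExample
{
    // Hypothetical helper: converts the content of a PDF into an
    // XLSX workbook at the given output path.
    public static void ConvertToXlsx(string inputPdfPath, string outputXlsxPath)
    {
        // Fail early with a clear message if the source file is missing.
        if (!File.Exists(inputPdfPath))
        {
            throw new FileNotFoundException("Input PDF not found.", inputPdfPath);
        }

        DataExtractionModule.ExtractToXLSX(inputPdfPath, outputXlsxPath);
    }
}
```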

ExtractToXLSX(string, string, DataExtractionOptions)

Perform data extraction on a PDF and write the results to a file in XLSX format, using the given options.

Declaration
public static void ExtractToXLSX(string input_pdf_file, string output_xlsx_file, DataExtractionOptions options)
Parameters
Type Name Description
string input_pdf_file

The source document filename

string output_xlsx_file

The resulting XLSX filename

DataExtractionOptions options

Data extraction options

ExtractToXLSX(string, Filter)

Perform data extraction on a PDF and write the results to a filter in XLSX format.

Declaration
public static void ExtractToXLSX(string input_pdf_file, Filter output_xlsx_stream)
Parameters
Type Name Description
string input_pdf_file

The source document filename

Filter output_xlsx_stream

The resulting XLSX filter

ExtractToXLSX(string, Filter, DataExtractionOptions)

Perform data extraction on a PDF and write the results to a filter in XLSX format, using the given options.

Declaration
public static void ExtractToXLSX(string input_pdf_file, Filter output_xlsx_stream, DataExtractionOptions options)
Parameters
Type Name Description
string input_pdf_file

The source document filename

Filter output_xlsx_stream

The resulting XLSX filter

DataExtractionOptions options

Data extraction options

IsModuleAvailable(DataExtractionEngine)

Find out whether the specified data extraction module is available (and licensed).

Declaration
public static bool IsModuleAvailable(DataExtractionModule.DataExtractionEngine engine)
Parameters
Type Name Description
DataExtractionModule.DataExtractionEngine engine

The extraction engine

Returns
Type Description
bool

True if data extraction operations can be performed with the specified engine
