# AdapterForPdfPid

## Introduction

**AdapterForPdfPid** extracts data from PDF files into a local `UPVP` database.
It scans the PDFs and creates a semi-intelligent P&ID database file for later use in Builder and IntelliPID.

The program recognises **text labels** and sorts them into user-defined categories such as equipment, pipe runs, instruments, and off-page connectors.
 
The user-defined categories for label recognition must be stored in an Excel workbook named `config.xlsx`.

An example of `config.xlsx` can be found in:

> `\Program Files\CAXperts\PdfPidAdapter\Templates`

The names of sheet 1 and sheet 2 can be changed; only the order of the two sheets matters. In the example, they are named as follows:

Sheet 1 = `Patterns`

<img src="./media/AdapterForPdfPid_image6.png" width="600" height="300">

Sheet 2 = `Settings`

<img src="./media/AdapterForPdfPid_image7.png" width="600" height="300">

## Graphical User Interface

After starting **AdapterForPdfPid**, the following window appears:

<img src="./media/AdapterForPdfPid_image1.png" width="463" height="226">

Select the input directory and define the output file. The `config.xlsx` file must be in the same folder as the PDFs to be imported.

Start the process by clicking the **Capture** button.
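The folder requirement can be checked before starting a capture. The following is a minimal sketch, not part of AdapterForPdfPid: the helper name `check_input_folder` is hypothetical, and it only mirrors the rule that `config.xlsx` must sit next to the PDFs.

```python
from pathlib import Path


def check_input_folder(folder):
    """Return the PDFs in `folder`, or raise if config.xlsx is missing.

    Hypothetical helper: it only mirrors the documented folder layout
    and does not call AdapterForPdfPid.
    """
    folder = Path(folder)
    config = folder / "config.xlsx"
    if not config.is_file():
        raise FileNotFoundError(f"config.xlsx not found in {folder}")
    # Match the extension case-insensitively (.pdf, .PDF).
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
```

Calling `check_input_folder` on the input directory returns the list of PDFs that would be picked up, or fails early if the configuration file is missing.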

### DocType

AdapterForPdfPid can also be used to convert other PDF drawings, such as equipment arrangement plans or isometric drawings, in order to create semi-intelligent drawings and map them to the 3D model. The document type is defined in sheet 2 of `config.xlsx`, which is called `Settings` in the example shown here.

Any name can be chosen for `DocType`. Note, however, that `IntelliPID` is a reserved name: the IntelliPID sketching module is available only for this document type. If no `DocType` is set in the `Settings` sheet, or if `config.xlsx` contains no `Settings` sheet at all, the documents in the folder are automatically treated as `IntelliPID`.

A PDF file with a specific drawing type should be a **single-page** document. Otherwise, it is treated as `DocType` `IntelliPID`.
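The fallback rules above can be summarised as a small decision function. This is a sketch under stated assumptions, not the tool's actual code: the function name `resolve_doc_type`, the dictionary representation of the `Settings` sheet, and the treatment of a blank value as "not set" are all hypothetical.

```python
DEFAULT_DOC_TYPE = "IntelliPID"


def resolve_doc_type(settings, page_count):
    """Apply the documented DocType fallback rules.

    `settings` is a dict of values read from the Settings sheet, or None
    when config.xlsx has no such sheet (hypothetical representation).
    """
    if settings is None:
        # No Settings sheet: documents are treated as IntelliPID.
        return DEFAULT_DOC_TYPE
    # Assumption: a blank or whitespace-only cell counts as "not set".
    doc_type = (settings.get("DocType") or "").strip()
    if not doc_type:
        return DEFAULT_DOC_TYPE
    if page_count != 1:
        # Multi-page PDFs fall back to IntelliPID.
        return DEFAULT_DOC_TYPE
    return doc_type
```

For example, `resolve_doc_type({"DocType": "Isometric"}, 1)` keeps the custom type, while the same settings with a three-page PDF fall back to `IntelliPID`.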

### IgnoreZone

By defining an ignore zone, you can exclude an area of the drawing from text recognition: AdapterForPdfPid ignores all text labels within the defined zone. This is useful, for example, for keeping the title block of a drawing out of the results.

<img src="./media/AdapterForPdfPid_image5.png" width="600" height="350">
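Conceptually, an ignore zone acts as a rectangular filter on label positions. The sketch below illustrates that idea only. The `Zone` class, its coordinate fields, and the `(text, x, y)` label tuples are assumptions made for illustration; the actual zone definition is entered in `config.xlsx` as shown in the screenshot above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """Axis-aligned rectangle in drawing coordinates (hypothetical format)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x, y):
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def drop_ignored(labels, zone):
    """Keep only labels whose position lies outside `zone`.

    `labels` is an iterable of (text, x, y) tuples.
    """
    return [(text, x, y) for (text, x, y) in labels if not zone.contains(x, y)]
```

With a zone covering the title block, for instance `Zone(700, 0, 1000, 150)`, a revision note at `(850, 40)` is dropped while an equipment tag at `(200, 400)` is kept.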

### Example PDF P&ID

The following images show the process that **AdapterForPdfPid** follows to generate semi-intelligent P&IDs.

Example input PDF with tank, pumps, valves, lines, and additional information:

<img src="./media/AdapterForPdfPid_image2.png" width="600" height="450">

Excerpt from the corresponding `config.xlsx` file. This Excel file specifies the relevant data to be collected from the input PDFs.
The entries are written as regular expressions (`regex`).

<img src="./media/AdapterForPdfPid_image3_new.png" width="700" height="430">
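To illustrate how regular expressions can sort text labels into categories, here is a minimal sketch. The patterns, the category names used as keys, and the `classify` helper are invented for this example and are not taken from the shipped template. Whether AdapterForPdfPid matches whole labels or uses the first matching pattern is likewise an assumption.

```python
import re

# Hypothetical patterns; real ones are defined in the Patterns sheet.
PATTERNS = {
    "Equipment": re.compile(r"[A-Z]-\d{3}"),        # e.g. P-101
    "PipeRun": re.compile(r'\d+"-[A-Z]+-\d{4}'),    # e.g. 4"-PW-1001
    "Instrument": re.compile(r"[A-Z]{2,4}-\d{3}"),  # e.g. FIC-201
}


def classify(label):
    """Return the first category whose pattern matches the whole label.

    Returns None when no pattern matches, so the label stays plain text.
    """
    for category, pattern in PATTERNS.items():
        if pattern.fullmatch(label):
            return category
    return None
```

In this sketch, `classify("P-101")` yields `"Equipment"`, `classify("FIC-201")` yields `"Instrument"`, and free text such as `"NOTE 3"` yields `None`.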

### Builder

When viewed in [**IntelliPID**](https://help.caxperts.com/UDiTH%20App/IntelliPID%20Module), the result shows the full graphical representation, with the main text labels recognised as intelligent objects.
In the following example, the semi-intelligent information is highlighted in yellow.

<img src="./media/AdapterForPdfPid_image4_new.png" width="700" height="430">

For detailed information about 3D/P&ID object linking by using mapping files, see the [Builder documentation](https://help.caxperts.com/UDiTH%20Capture%20Station/Builder#pid-3d-mapping). Additional attributes can also be loaded by using the Builder plugin [ExcelImportPlugin](https://help.caxperts.com/UDiTH%20Capture%20Station/ExcelImportPlugin).
