PDF - Extract Text from Regions – Encodian Customer Help

Power Automate Connector: Encodian – PDF

Overview

The 'PDF - Extract Text from Regions' flow action enables text to be extracted from specified regions of a PDF document and returns an array of the extracted text.

Whilst this action is limited to extracting text regions from PDF documents, it will auto-convert 70+ file formats to PDF before executing the text extraction. Please refer to the "Credit Count" section below, as the conversion will consume additional credits.

The action will auto-OCR the PDF document if it is not searchable. Please refer to the "Credit Count" section below, as OCR is a resource-intensive operation and will consume additional credits.

Please take a look at the "Supported Document Types" articles for a complete list of the different file formats and document types supported for PDF conversion.

Example Flow

Please refer to the following articles:

Credit Count

NOTE: The total extractions factor will change from 10 per credit to 50 per credit from 1st July 2025.

The final credit count is determined using the following calculation:

Extract Regions Operation (1) + Converted to PDF? (0 or 1) + OCR'd Pages (n) + (No. of Extractions / 10, rounded down) = Total Credits used

For example:

9 regions extracted from a PDF document, which did not require OCR:

1 + 0 + 0 + (9/10 = 0) = 1 credit used

11 regions extracted from a PDF document, which did not require OCR:

1 + 0 + 0 + (11/10 = 1) = 2 credits used

11 regions extracted from a PowerPoint presentation:

1 + 1 + 0 + (11/10 = 1) = 3 credits used

11 regions extracted from a 5-page PDF document, which requires OCR:

1 + 0 + 5 + (11/10 = 1) = 7 credits used

OCR is a resource-intensive operation; therefore, an Encodian Flowr credit is used for every page OCR'd.
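
The calculation above can be sketched in a few lines of Python to estimate usage before running a flow. This is an illustrative sketch only, not Encodian code: the function and parameter names are hypothetical, the whole-number division is inferred from the worked examples above, and the `extractions_per_credit` argument is an assumption added so the estimate can be adjusted for the noted change from 10 to 50.

```python
def total_credits(extractions: int,
                  converted_to_pdf: bool = False,
                  ocr_pages: int = 0,
                  extractions_per_credit: int = 10) -> int:
    """Estimate the credits used by one 'Extract Text from Regions' call.

    Hypothetical helper: mirrors the documented formula, using whole-number
    division for the extractions term as the worked examples do.
    """
    if extractions < 0 or ocr_pages < 0:
        raise ValueError("extractions and ocr_pages must not be negative")
    if extractions_per_credit <= 0:
        raise ValueError("extractions_per_credit must be positive")

    operation = 1                               # the extract operation itself
    conversion = 1 if converted_to_pdf else 0   # non-PDF input converted first
    return operation + conversion + ocr_pages + extractions // extractions_per_credit


# The four worked examples from this section:
print(total_credits(9))                          # PDF, no OCR
print(total_credits(11))                         # PDF, no OCR
print(total_credits(11, converted_to_pdf=True))  # PowerPoint presentation
print(total_credits(11, ocr_pages=5))            # 5-page PDF requiring OCR
```

Passing `extractions_per_credit=50` gives the equivalent estimate under the factor described in the note above.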

Parameters

The default 'PDF - Extract Text from Regions' flow action parameters are detailed below:

Filename: The filename (including the file extension).
File Content: A Base64 encoded representation of the file to be processed.
Text Regions: An array of Text Regions (see below for further details).

Please refer to the Obtaining the 'File Contents' Parameter article for guidance on how to obtain the 'File Content' parameter ready to provide to an Encodian flow action.
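
Within Power Automate, the file content is normally supplied by a preceding action. For anyone preparing the value outside of a flow, the sketch below shows what "a Base64 encoded representation of the file" means in practice. It uses only the Python standard library; the helper names are hypothetical and the sample bytes are placeholders rather than a real PDF.

```python
import base64
from pathlib import Path


def encode_file_content(data: bytes) -> str:
    """Return the Base64 text representation of raw file bytes."""
    return base64.b64encode(data).decode("ascii")


def encode_file(path: str) -> str:
    """Read a file from disk and return its Base64 representation."""
    return encode_file_content(Path(path).read_bytes())


# Placeholder bytes standing in for a document read from disk.
sample = b"%PDF-1.7 placeholder bytes"
encoded = encode_file_content(sample)
print(encoded)
```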

Text Region Generator Tool

Please use the 'Text Region Generator' tool to automatically determine the required coordinates.

Text Region Detail

A text region is specified as a rectangle, comprising four coordinates that represent the lower left and upper right corners of the rectangle on both the X and Y axes.

The origin (0,0) of the coordinate system is in the bottom left-hand corner of the page. Coordinates are specified in points; a typical A4 page is 595 x 842 points.

Text Region - Multiple text regions can be selected in one operation. To create more than one region, click the "Add new item" button:
- Text Regions Name: Provide a name with which to reference the extracted region
- Text Regions Lower Left X Coordinate: Number of points across from the left-hand edge of the page to the lower left corner of the rectangle
- Text Regions Lower Left Y Coordinate: Number of points up from the bottom edge of the page to the lower left corner of the rectangle
- Text Regions Upper Right X Coordinate: Number of points across from the left-hand edge of the page to the upper right corner of the rectangle
- Text Regions Upper Right Y Coordinate: Number of points up from the bottom edge of the page to the upper right corner of the rectangle.
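
Many image and screenshot tools measure from the top left-hand corner of the page, so coordinates often need flipping before they are entered here. The sketch below is one way to do this, assuming the box is already measured in points; the `TextRegion` class, its field names and the `from_top_left` helper are hypothetical and are not part of the Encodian connector.

```python
from dataclasses import dataclass

A4_HEIGHT_PT = 842  # height of a typical A4 page in points


@dataclass(frozen=True)
class TextRegion:
    """A rectangle with its origin at the bottom left of the page (points)."""
    name: str
    lower_left_x: float
    lower_left_y: float
    upper_right_x: float
    upper_right_y: float

    def __post_init__(self) -> None:
        # The upper right corner must sit above and to the right of the lower left.
        if not (self.lower_left_x < self.upper_right_x
                and self.lower_left_y < self.upper_right_y):
            raise ValueError("upper right corner must exceed lower left corner")


def from_top_left(name: str, left: float, top: float, width: float,
                  height: float, page_height: float = A4_HEIGHT_PT) -> TextRegion:
    """Convert a box measured from the top left corner to a bottom-left-origin region."""
    return TextRegion(
        name=name,
        lower_left_x=left,
        lower_left_y=page_height - (top + height),  # bottom edge, measured upwards
        upper_right_x=left + width,
        upper_right_y=page_height - top,            # top edge, measured upwards
    )


# A 300 x 60 point box placed 50 points in and 40 points down from the top left.
header = from_top_left("Header", left=50, top=40, width=300, height=60)
print(header)
```

The 'Text Region Generator' tool remains the simplest route; a conversion like this is only useful when coordinates come from another source.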

Advanced Parameters

The advanced 'PDF - Extract Text from Regions' flow action parameters are detailed below:

Operation ID: (Advanced) The ID of a parent operation. Please refer to Flow Action Return Options: File Content vs. Operation ID.
Return File: (Advanced) Sets whether the action should return a file or, alternatively, an operation ID. Please refer to Flow Action Return Options: File Content vs. Operation ID.

Return Parameters

The 'PDF - Extract Text from Regions' flow action returns the following data.

Action Specific Values

Text Region Results Simple - An array of results for each text region in simplified format (key / value pair).

A partial example response payload (JSON) is detailed below:

 "TextRegionResultsSimple":
 {
    "Region1": "Region1 Value",
    "Region2": "Region2 Value",
    "Region3": "Region3 Value"
 }

Text Region Results - An array of results for each text region specified

A partial example response payload (JSON) is detailed below:

 "TextRegionResults": [
     {
         "Name": "Extracted Region Name",
         "Text": "This is text extracted from the demo region",
         "PageNumber": 1
     }
 ]

Standard Return Values

Filename - The filename of the document.
FileContent - The processed document content.
OperationId - The unique ID assigned to this operation.
HttpStatusCode - The HTTP Status code for the response.
HttpStatusMessage - The HTTP Status message for the response.
Errors - An array of error messages should an error occur.
Operation Status - Indicates whether the operation has been completed, has been queued or has failed.

A complete example return payload (JSON) is detailed below:

{
    "TextRegionResults": [
        {
            "Name": "Extracted Region Name",
            "Text": "This is text extracted from the demo region",
            "PageNumber": 4
        }
    ],
    "HttpStatusCode": 200,
    "HttpStatusMessage": "",
    "OperationId": "**********-****-****-****-************",
    "Errors": [],
    "Operation Status": "Complete",
    "Filename": "textRegionsDemo.pdf",
    "FileContent": null
}
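
If the return payload is handled outside of Power Automate, the extracted text can be looked up by region name with a small amount of code. The sketch below parses the example payload above using the Python standard library; the `regions_by_name` helper is hypothetical and the error handling is an assumption about how a caller might choose to treat a non-empty "Errors" array.

```python
import json

# The example return payload from this article.
payload = """
{
    "TextRegionResults": [
        {
            "Name": "Extracted Region Name",
            "Text": "This is text extracted from the demo region",
            "PageNumber": 4
        }
    ],
    "HttpStatusCode": 200,
    "HttpStatusMessage": "",
    "OperationId": "**********-****-****-****-************",
    "Errors": [],
    "Operation Status": "Complete",
    "Filename": "textRegionsDemo.pdf",
    "FileContent": null
}
"""


def regions_by_name(response: dict) -> dict:
    """Map each region name to its extracted text, failing if errors were returned."""
    if response.get("Errors"):
        raise RuntimeError(f"Extraction reported errors: {response['Errors']}")
    return {item["Name"]: item["Text"]
            for item in response.get("TextRegionResults", [])}


response = json.loads(payload)
extracted = regions_by_name(response)
print(extracted["Extracted Region Name"])
```

Note that a dictionary keyed on name keeps only one entry per name, so regions should be given distinct names if this approach is used.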
