Advanced recognition settings

By default, elDoc IDP is configured to handle input files (images) with average or good scanning quality, using default preprocessing settings optimized for such images.

However, elDoc IDP can also process poor-quality images, such as those with noisy backgrounds, color distortions, or skewed or rotated documents. These scenarios may require custom preprocessing settings, so elDoc IDP allows you to define custom parameters for each RecoForm. When documents are processed with a given RecoForm, they are first preprocessed using these custom settings before Intelligent Document Processing (IDP) is applied.

This page provides an overview of the available advanced settings for customizing the preprocessing of images.

Recognition Form Advanced Settings

Advanced settings are set via the Advanced settings configuration area of the RecoForm.

Advanced settings description

| Advanced setting param name | Data type | Default value | Description | Triggers custom pre-processing* |
|---|---|---|---|---|
| idp_recoform_table_ancundtblusebottom | Boolean | false | Uses the bottom-left point of the anchor to calculate the location of the table end | |
| idp_recoform_use_source | Boolean | false | Switches the RecoForm to use the source image instead of the pre-processed image for RecoForm processing | |
| idp_recoform_custom_id | String | | ID of the custom RecoForm that is loaded as a plugin | |
| idp_recoform_morphology | String | | Enables artifact removal and sets the size of the artifacts to be removed during image pre-processing. Min value: "0x0" (disabled). Max value: "99x99" | (warning) |
| idp_adaptive_threshold | Boolean | false | Enables adaptive threshold | (warning) |
| idp_iterative_threshold | Boolean | false | Enables iterative threshold | (warning) |
| idp_threshold_value | Integer | | Defines a custom thresholding value. When set to 0, this property is ignored. Min value: 0 (disabled). Max value: 255 | (warning) |
| idp_resolution_threshold | String | | If the source image resolution is below the defined threshold in at least one dimension, the document is sent to validation. Min value: "10x10". Max value: "99999x99999" | |
| idp_ocr_mode_sparse_text | Boolean | false | Sets the OCR recognition mode to Sparse Text (find as much text as possible in no particular order) | (warning) |
| idp_recoform_strict_layout | Boolean | false | Enforces a strict template layout for the RecoForm. When this value is set to true, the anchors' positions are used to calculate the page scale | |
| idp_ignore_text_layer | Boolean | false | Instructs the system to ignore the text layer (if one is available; relevant to PDF files) and enforces document OCR | (warning) |
| idp_recoform_keywords_classification | Boolean | false | Enables document classification based on the RecoForm keywords | |
| idp_recoform_table_disable_row_merge | Boolean | false | Disables merging of table rows | |
| idp_recoform_table_enable_columns_detection | Boolean | false | Enables detection of table column widths based on the vertical border lines of columns (applicable to scanned images only) | |
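The value ranges listed in the table can be checked before a RecoForm is saved. The following is a minimal sketch in Python, not part of elDoc: the function names (parse_size, validate_settings, below_resolution_threshold) and the idea of holding the settings in a dict are assumptions made for illustration. The source does not state the unit or the order of the two numbers in a "WxH" value, so the sketch simply compares them position by position.

```python
import re

# Hypothetical helpers for illustration; none of these names are part of elDoc.
_SIZE_RE = re.compile(r"^(\d+)x(\d+)$")


def parse_size(value):
    """Parse a "WxH" string such as "99x99" into a pair of integers."""
    match = _SIZE_RE.match(str(value))
    if match is None:
        raise ValueError(f"expected a value like '10x10', got {value!r}")
    return int(match.group(1)), int(match.group(2))


def _check_size(name, value, low, high, problems):
    """Record a problem if either part of a "WxH" value is outside low..high."""
    try:
        first, second = parse_size(value)
    except ValueError as exc:
        problems.append(f"{name}: {exc}")
        return
    if not (low <= first <= high and low <= second <= high):
        problems.append(f"{name}: each part must be between {low} and {high}")


def validate_settings(settings):
    """Return a list of range problems found in a dict of advanced settings."""
    problems = []
    if "idp_recoform_morphology" in settings:
        # Table range: "0x0" (disabled) to "99x99".
        _check_size("idp_recoform_morphology",
                    settings["idp_recoform_morphology"], 0, 99, problems)
    if "idp_resolution_threshold" in settings:
        # Table range: "10x10" to "99999x99999".
        _check_size("idp_resolution_threshold",
                    settings["idp_resolution_threshold"], 10, 99999, problems)
    if "idp_threshold_value" in settings:
        # Table range: 0 (disabled) to 255. bool is rejected explicitly
        # because Python treats True/False as integers.
        value = settings["idp_threshold_value"]
        if isinstance(value, bool) or not isinstance(value, int) or not 0 <= value <= 255:
            problems.append("idp_threshold_value: must be an integer between 0 and 255")
    return problems


def below_resolution_threshold(image_size, threshold):
    """True if the image is below the threshold in at least one dimension."""
    min_first, min_second = parse_size(threshold)
    return image_size[0] < min_first or image_size[1] < min_second
```

For example, validate_settings({"idp_threshold_value": 300}) reports one problem, and below_resolution_threshold((800, 1200), "1000x1000") is true because the first dimension is under the threshold, which mirrors the "at least one dimension" rule described for idp_resolution_threshold.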


  • (warning) - "Triggers custom pre-processing" explanation:

    The system performs document preprocessing on upload using default parameters suited for most common document types. However, due to the specific nature of some documents, these generic parameters might not be optimal. For such cases, advanced preprocessing settings should be defined within the RecoForms tailored for those documents.

    When a RecoForm uses an advanced setting that is marked as triggering custom pre-processing, the system reprocesses the document from scratch. This custom preprocessing can increase the overall time required to process documents in the recognition queue.
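    To estimate whether a RecoForm will cause documents to be reprocessed, the marked settings can be listed programmatically. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that a setting left at its disabled value (false, 0, or "0x0") does not trigger reprocessing, which the source does not state explicitly.

```python
# Settings marked with (warning) in the table above.
TRIGGERING_SETTINGS = {
    "idp_recoform_morphology",
    "idp_adaptive_threshold",
    "idp_iterative_threshold",
    "idp_threshold_value",
    "idp_ocr_mode_sparse_text",
    "idp_ignore_text_layer",
}


def is_active(value):
    """Assumption: disabled or empty values do not trigger reprocessing."""
    return value not in (None, False, 0, "", "0x0")


def triggers_custom_preprocessing(settings):
    """Return the sorted names of active settings that trigger reprocessing."""
    return sorted(
        name
        for name, value in settings.items()
        if name in TRIGGERING_SETTINGS and is_active(value)
    )
```

    A non-empty result suggests that documents matched to this RecoForm will be preprocessed a second time, so longer queue times are to be expected.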

To apply advanced preprocessing settings during document upload, use the "Force files preprocessing by elDoc" switch on the AI document processing page. With the switch enabled, specific advanced settings, such as idp_ocr_mode_sparse_text and idp_ignore_text_layer, are applied to the uploaded files.

Last modified: August 14, 2024