Recognition results post-processing

In many scenarios, captured recognition data requires further post-processing to prepare or clean it up, or to add validation hooks for documents such as invoices, where the sum of the line items must equal the total invoice amount. For such scenarios, the elDoc IDP&BPM system provides post-processing functionality.

Post-processing is based on a JavaScript script engine, so post-processing code must be written in standard JavaScript syntax.

Post-processing can be defined for individual fields and for the whole RecoForm.
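The invoice check mentioned above (line items must add up to the invoice total) can be sketched in plain JavaScript. This is only an illustration: the helper names parseAmount and lineItemsMatchTotal, and the assumption that amounts arrive as strings with "," as thousands separator and "." as decimal separator, are hypothetical and not part of the elDoc API.

```javascript
// Hypothetical helper: converts an amount string such as "1,250.00" to integer cents.
// Assumes "," is a thousands separator and "." is the decimal separator.
function parseAmount(text) {
  const cleaned = String(text).replace(/[^0-9.\-]/g, "");
  return Math.round(Number(cleaned) * 100);
}

// Hypothetical validation hook: returns true when the line items add up to the total.
// Working in integer cents avoids floating-point rounding surprises.
function lineItemsMatchTotal(lineItemAmounts, totalAmount) {
  const sum = lineItemAmounts.reduce((acc, item) => acc + parseAmount(item), 0);
  return sum === parseAmount(totalAmount);
}
```

How the line item amounts and the total are read from the recognized fields depends on the RecoForm setup, so that part is intentionally left out of the sketch.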

Code repository

The post-processing code repository stores code (functions) that is shared across field-level and RecoForm-level post-processing snippets. The code repository is accessible from the Recognition rules management page by pressing the Code repository button.

Important:

  • on RecoForms export, the whole content of the code repository is exported and saved within the RSF-file;
  • on RecoForms import, the imported code is appended to the existing code repository, after the following divider:
    //
    // Imported by RFS-import on: 17.06.2021 08:26
    //

Bitap support in post-processing

Due to the nature of IDP, when OCR is used for data recognition some words may be recognized with obvious mistakes. To handle such scenarios automatically, elDoc provides embedded support for basic Bitap (approximate string matching) operations in post-processing.

The following functions are supported in post-processing:

/**
 * Replaces the approximate match of searchValue in originalValue with searchValue itself, in case it was found within the defined (by mistakesCount) maximum of allowed mistakes
 *
 * originalValue - string value containing one or several words
 * searchValue - string value for search
 * mistakesCount - integer value indicating the maximum number of allowed mistakes
 **/
Bitap.replace(originalValue, searchValue, mistakesCount);

/**
 * Replaces searchValue in originalValue with newValue in case it was found with the defined (by mistakesCount) maximum of allowed mistakes
 *
 * originalValue - string value containing one or several words
 * searchValue - string value for search
 * mistakesCount - integer value indicating the maximum number of allowed mistakes
 * newValue - string value to be used for replacing found value
 **/
Bitap.replace(originalValue, searchValue, mistakesCount, newValue);

/**
 * Returns index of first occurrence of searchValue in originalValue in case it was found with the defined (by mistakesCount) maximum of allowed mistakes
 *
 * originalValue - string value containing one or several words
 * searchValue - string value for search
 * mistakesCount - integer value indicating the maximum number of allowed mistakes
 **/
Bitap.indexOf(originalValue, searchValue, mistakesCount);

/**
 * Returns boolean value true|false indicating whether originalValue contains searchValue with the defined (by mistakesCount) maximum of allowed mistakes
 *
 * originalValue - string value containing one or several words
 * searchValue - string value for search
 * mistakesCount - integer value indicating the maximum number of allowed mistakes
 **/
Bitap.contains(originalValue, searchValue, mistakesCount);
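As an illustration of how these functions might be combined, the sketch below normalizes OCR-damaged labels to their canonical spelling. The function normalizeLabels, the label list, and the bitapApi parameter are hypothetical. The sketch also assumes that Bitap.replace returns the resulting string, which the descriptions above do not state explicitly.

```javascript
// Canonical labels expected on the document (hypothetical examples).
const KNOWN_LABELS = ["Transport", "Charges", "Invoice"];

// bitapApi is passed in as a parameter so that the function can be exercised
// outside elDoc; inside a post-processing script the built-in Bitap object
// would be passed here.
function normalizeLabels(value, bitapApi, mistakesCount) {
  let result = value;
  for (const label of KNOWN_LABELS) {
    // contains() is documented to return true|false.
    if (bitapApi.contains(result, label, mistakesCount)) {
      // Assumption: replace() returns the corrected string.
      result = bitapApi.replace(result, label, mistakesCount);
    }
  }
  return result;
}
```

Inside a post-processing script the call would look like normalizeLabels(fieldValue, Bitap, 1), where fieldValue stands for the recognized text to be cleaned.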

Notes relevant to all Bitap functions:

  • The original string value must be at least 5 characters long;
  • The minimal number of allowed mistakes is 1. The maximum number is calculated from the word length; if a higher number is provided, it is replaced with the maximum allowed. The following rules apply:
    • Word length >= 5 - max. number of allowed mistakes is 1;
    • Word length <= 7 - max. number of allowed mistakes is 2;
    • Word length <= 10 - max. number of allowed mistakes is 3;
    • Word length <= 15 - max. number of allowed mistakes is 5;
  • All functions process the original string value by splitting it on spaces into words and joining words that do not meet the length criteria (a single word cannot exceed 15 characters). For example, the original string value "Pick Up / Transport Charges" is split into the following set of words:
    • Pick Up
    • Up / Transport
    • / Transport
    • Transport
    • Charges
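
The notes above can be approximated with the following sketch. It is a reconstruction inferred from the example and the rule list, not elDoc's actual implementation. The function names are hypothetical, and the mistake limits are read as length bands (up to 5 characters: 1, up to 7: 2, up to 10: 3, up to 15: 5), which is only one possible interpretation of the rules.

```javascript
const MIN_LENGTH = 5;   // shortest value the Bitap functions work with
const MAX_LENGTH = 15;  // longest single word (or joined group of words)

// Splits the value on spaces; a word shorter than MIN_LENGTH is joined with
// the following words until the minimum length is reached.
function splitIntoCandidates(originalValue) {
  const words = originalValue.split(" ").filter(word => word.length > 0);
  const candidates = [];
  for (let i = 0; i < words.length; i++) {
    let candidate = words[i];
    let next = i + 1;
    while (candidate.length < MIN_LENGTH && next < words.length) {
      candidate += " " + words[next];
      next++;
    }
    if (candidate.length >= MIN_LENGTH && candidate.length <= MAX_LENGTH) {
      candidates.push(candidate);
    }
  }
  return candidates;
}

// Clamps the requested number of mistakes to the range allowed for the word length
// (assumed interpretation of the rule list as length bands).
function clampMistakes(wordLength, requestedMistakes) {
  let maxAllowed;
  if (wordLength <= 5) maxAllowed = 1;
  else if (wordLength <= 7) maxAllowed = 2;
  else if (wordLength <= 10) maxAllowed = 3;
  else maxAllowed = 5;
  return Math.min(Math.max(requestedMistakes, 1), maxAllowed);
}

// splitIntoCandidates("Pick Up / Transport Charges")
// returns ["Pick Up", "Up / Transport", "/ Transport", "Transport", "Charges"]
```

With this reading, a request for 4 mistakes on the 9-character word "Transport" would be reduced to 3.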

Useful code samples

/**
 * Returns field index by tag or -1 in case no field found by the given tag
 */
function getFieldIndexByTag(fieldTagName, fieldsArray){
  return fieldsArray.findIndex(field => field.tags.includes(fieldTagName));
}
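
A possible way to use this helper is shown below. The function is repeated so that the snippet runs on its own. The sample field objects (with name, tags and value properties) and the wrapper getFieldValueByTag are hypothetical and only illustrate the idea; the actual shape of the field objects in a RecoForm may differ.

```javascript
function getFieldIndexByTag(fieldTagName, fieldsArray){
  return fieldsArray.findIndex(field => field.tags.includes(fieldTagName));
}

// Hypothetical field objects; only the tags property is implied by the helper above.
const sampleFields = [
  { name: "Invoice number", tags: ["invoiceNumber"], value: "INV-001" },
  { name: "Total", tags: ["total", "amount"], value: "1,250.00" }
];

// Hypothetical wrapper: returns the value of the first field carrying the tag,
// or null when no field has it.
function getFieldValueByTag(fieldTagName, fieldsArray) {
  const index = getFieldIndexByTag(fieldTagName, fieldsArray);
  return index === -1 ? null : fieldsArray[index].value;
}
```

Checking the returned index against -1 before accessing the array avoids reading a property of undefined when the tag is missing.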

Last modified: May 15, 2023