Smart Text Engine Documentation

Smart Text Engine is a multi-platform standalone SDK for recognizing unstructured text fragments on documents and arbitrary images.

Workflow

The TextEngine workflow consists of the following stages:

Creating a recognition engine.
Creating a recognition session.
Setting session options.
Creating an image object.
Recognition process.
Processing the recognition result.
Memory deallocation.

Creating a Recognition Engine

From System Description:

Recognition Engine is the object where all recognition tools are stored and initialized. It is created using the appropriate configuration bundle containing all possible settings: a list of documents supported by a specific SDK, their fields, a list of possible authentication checks, and so on.

INFO

In special cases, the bundle is not supplied separately, but as part of (inside) the library.

Initialize the engine in a single instance. One instance allows you to create multiple recognition sessions. However, creating multiple instances of the engine is also possible.

The initialization process is a resource-intensive operation (as is the image analysis itself), so perform it off the main UI thread.
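One way to keep such an expensive initialization off the UI thread is sketched below. It does not call the SDK: `SlowEngine` and `CreateSlowEngine` are hypothetical stand-ins for an engine type and its factory call, and `std::async` is only one of several ways to run the work in the background.

```cpp
#include <chrono>
#include <future>
#include <memory>
#include <string>
#include <thread>

// Hypothetical stand-in for an engine whose construction is expensive.
struct SlowEngine {
  std::string bundle_path;
};

// Hypothetical factory: sleeps briefly to imitate a costly initialization.
std::unique_ptr<SlowEngine> CreateSlowEngine(const std::string& bundle_path) {
  std::this_thread::sleep_for(std::chrono::milliseconds(50));
  return std::unique_ptr<SlowEngine>(new SlowEngine{bundle_path});
}

// Starts the initialization on a separate thread and returns immediately.
std::future<std::unique_ptr<SlowEngine>> CreateEngineAsync(
    const std::string& bundle_path) {
  return std::async(std::launch::async, CreateSlowEngine, bundle_path);
}
```

The caller stays responsive while the work runs and calls `get()` on the returned future once the engine is actually needed.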

Create a TextEngine instance:

C++

    std::unique_ptr<se::text::TextEngine> engine(
        se::text::TextEngine::Create(configuration_bundle_path));

Java

    TextEngine engine = TextEngine.Create(configuration_bundle_path);

Parameters:

configuration_bundle_path: the path to the configuration bundle file containing the TextEngine settings (a .se file);
an optional boolean flag that enables or disables lazy configuration (true by default).

Each delivery includes one or more bundles — archives containing the settings required for creating objects and configuring Smart Text Engine.

Attention!

TextEngine::Create() is a factory method and returns an allocated pointer. The caller is responsible for deleting it.

Creating a Recognition Session

From System Description:

Session settings is an object storing:

The list of the supported documents for recognition, grouped by internal engines. It is set in the configuration bundle with which the engine was created (read-only);
Advanced information about documents, including links to PRADO (read-only, used in Smart ID Engine only);
The list of documents submitted for recognition (* by default);
The list of expected fields for recognition (all by default);
The list of document sets (the mode parameter; the default value is default);
The special session options: the number of recognition threads, the expansion of the field list, the session timeout, and so on. You can find the full list of the options in our documentation.

Create a TextSessionSettings object using the configured TextEngine instance:

C++

    std::unique_ptr<se::text::TextSessionSettings> settings(
        engine->CreateSessionSettings());

Java

    TextSessionSettings settings = engine.CreateSessionSettings();

Attention!

TextEngine::CreateSessionSettings() is a factory method and returns an allocated pointer. The caller is responsible for deleting it.

Setting Session Options

Specifying Languages for TextSession

The Russian (rus) and English (eng) languages are supported.

C++

    settings->AddEnabledLanguages("rus"); // The Russian language

Java

    // Java
    settings.AddEnabledLanguages("rus"); // The Russian language

Supported languages

A language is a string code that identifies the real-world language you want to recognize (usually an ISO 639-3 code). There are also two service languages: digits and punct. To list the languages that your delivered Smart Text Engine SDK can recognize, iterate over the supported languages:

C++

    // Iterating through supported languages
    for (auto it = settings->SupportedLanguagesBegin();
         it != settings->SupportedLanguagesEnd();
         ++it) {
      // Getting the language code
      std::string lang = it.GetValue();
      // Getting the string of characters of the language alphabet
      std::string char_str = settings->GetLanguageAlphabet(lang);
    }

Java

    // Iterating through supported languages
    for (StringSetIterator it = settings.SupportedLanguagesBegin();
         !it.Equals(settings.SupportedLanguagesEnd());
         it.Advance()) {
      // Getting the language code
      String lang = it.GetValue();
      // Getting the string of characters of the language alphabet
      String char_str = settings.GetLanguageAlphabet(lang);
    }

To enable several languages, combine their codes with a colon, e.g. eng:digits.
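Building the colon-separated string from a list of codes can be sketched as follows. `JoinLanguageCodes` is a hypothetical helper, not part of the SDK, and passing its result to AddEnabledLanguages is an assumption based on the example above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: joins language codes with ':'
// (for example {"eng", "digits"} becomes "eng:digits").
std::string JoinLanguageCodes(const std::vector<std::string>& codes) {
  std::string joined;
  for (const std::string& code : codes) {
    if (!joined.empty()) {
      joined += ':';  // separator goes only between codes
    }
    joined += code;
  }
  return joined;
}
```

Keeping the codes in a container until the last moment makes it easy to filter them against the supported languages before enabling them.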

Spawning a Session

From System Description:

A personal signature is provided to the customer with the product. It is contained in the README.html file in the /doc directory.

Each time an instance of the recognition session is created, the signature must be passed as one of the arguments. This confirms the caller's right to use the library and unlocks it.

The signature is verified offline. The library does not access any external resources.

Spawn a session (the TextSession object):

C++

    // Your personal signature used to start a Smart Text Engine session
    const char* signature = "... YOUR SIGNATURE HERE ...";
    std::unique_ptr<se::text::TextSession> session(
        engine->SpawnSession(*settings, signature));

Java

    String signature = "... YOUR SIGNATURE HERE ..."; // Your personal signature you use to start Smart Text Engine session
    TextSession session = engine.SpawnSession(settings, signature);

Creating an Image Object

From System Description:

Pass an image as an object of the special se.common.Image class to the system for recognition. You can create it from the following image formats:

jpeg, png;
tiff (TIFF_LZW, TIFF_PACKBITS, TIFF_CCITT);
base64 (the formats mentioned above);
file buffer with a preliminary indication of the color scheme, width, height, and number of channels.

The maximum allowed image size by default is 15000x15000px. You can change the maximum image size.

HEIC Support

HEIC files in the mobile SDK are handled similarly to other image formats: they are read using system tools. In the server SDK, open HEIC files with external tools and either convert them to one of the supported formats or pass the raw pixels directly as an RGB buffer (recommended).

PDF Support

PDF files are supported through preliminary conversion to raster formats. Upon customer request, Smart Engines provides such a converter for the selected architecture.

Create an Image object for further processing:

C++

    std::unique_ptr<se::common::Image> image(
        se::common::Image::FromFile(image_path)); // Loading from file

Java

Image image = Image.FromFile(image_path); // Loading from file

Information

To pass any images to the engine, an object of the se.common.Image class is required.

Available methods:

Image.FromFile(imagePath): takes the local path to the input file;
Image.FromFileBuffer(data): takes a file read into a buffer;
Image.FromBase64Buffer(data): takes a base64-encoded file read into a buffer;
Image.FromBufferExtended(raw_data, width, height, stride, pixel_format, bytes_per_channel): creates an Image object from raw pixels;
Image.FromYUV(planeY, planeU, planeV, yuvDimensions): creates an Image object from YUV_420_888 data.
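When an image is created from raw pixels, the buffer geometry has to be consistent. The sketch below checks it before the call. It assumes that stride means the number of bytes per image row (the text does not define it) and uses the default 15000x15000 limit stated earlier. `RawImageLayout`, `MinStride`, and `IsValidRawLayout` are hypothetical names, not SDK API.

```cpp
#include <cstddef>

// Hypothetical description of a raw pixel buffer.
struct RawImageLayout {
  std::size_t width;              // pixels per row
  std::size_t height;             // number of rows
  std::size_t channels;           // e.g. 3 for RGB
  std::size_t bytes_per_channel;  // e.g. 1 for 8-bit samples
  std::size_t stride;             // bytes per row including padding (assumed)
};

// Default maximum side length taken from the documentation text.
const std::size_t kMaxImageSide = 15000;

// Smallest stride that can hold one row without padding.
std::size_t MinStride(const RawImageLayout& layout) {
  return layout.width * layout.channels * layout.bytes_per_channel;
}

// Checks that the layout is non-empty, within the size limit, and that
// buffer_size bytes cover height full rows of stride bytes each
// (a conservative requirement).
bool IsValidRawLayout(const RawImageLayout& layout, std::size_t buffer_size) {
  if (layout.width == 0 || layout.height == 0) return false;
  if (layout.width > kMaxImageSide || layout.height > kMaxImageSide) return false;
  if (layout.channels == 0 || layout.bytes_per_channel == 0) return false;
  if (layout.stride < MinStride(layout)) return false;
  return buffer_size >= layout.stride * layout.height;
}
```

Rejecting an inconsistent layout up front is cheaper than diagnosing a failed or distorted recognition later.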

Attention!

Image::FromFile() is a factory method and returns an allocated pointer. The caller is responsible for deleting it.

Recognition Process

Call the ProcessImage(...) method to process the image:

C++

    session->ProcessImage(*image);

Java

    session.ProcessImage(image);

Processing the Recognition Result

Get the current result from the session:

C++

    const se::text::TextResult& result = session->GetCurrentResult();

Java

    TextResult result = session.GetCurrentResult();

To get the TextScene object containing the recognition results, call the GetCurrentScene() method:

C++

    const se::text::TextScene& scene = result.GetCurrentScene();

Java

    TextScene scene = result.GetCurrentScene();

To iterate over the recognized text fragments, which are described by a collection of TextChunk instances, get a TextIterator object from the TextScene object:

C++

    std::unique_ptr<se::text::TextIterator> chunk_iterator;
    chunk_iterator.reset(scene.CreateIterator("default"));
    for (; !chunk_iterator->Finished(); chunk_iterator->Advance()) {
      //Getting text chunk value (UTF-8 string representation)
      std::string chunk_str = chunk_iterator->GetTextChunk().GetOcrString().GetFirstString().GetCStr();
    }

Java

    TextIterator chunk_iterator = scene.CreateIterator("default");
    for (; !chunk_iterator.Finished(); chunk_iterator.Advance()) {
      //Getting text chunk value (UTF-8 string representation)
      String chunk_str = chunk_iterator.GetTextChunk().GetOcrString().GetFirstString().GetCStr();
    }

Memory Deallocation

Some Smart Text Engine SDK classes have factory methods that return pointers to heap-allocated objects. The caller is responsible for deleting such objects.

TIP

In C++, to simplify memory management and avoid memory leaks, use smart pointers such as std::unique_ptr<T> or std::shared_ptr<T>.
In the Java API, call the .delete() method on such objects to release their memory.
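The effect of wrapping a factory result in a smart pointer can be sketched as follows. `Resource` is a hypothetical class with a live-instance counter, used here only to make the deletion observable; it is not an SDK type.

```cpp
#include <memory>

// Hypothetical resource that counts how many instances are alive.
class Resource {
 public:
  // Factory method returning a heap-allocated object owned by the caller.
  static Resource* Create() { return new Resource(); }
  ~Resource() { --live_count; }
  static int live_count;

 private:
  Resource() { ++live_count; }
};

int Resource::live_count = 0;

// Wraps the raw pointer immediately so it is deleted on every exit path.
int CountWhileOwned() {
  std::unique_ptr<Resource> owned(Resource::Create());
  return Resource::live_count;  // the object is still alive here
}
```

Once `CountWhileOwned` returns, the `std::unique_ptr` destructor has already deleted the object, so no explicit `delete` is needed even if an exception is thrown in between.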

Library Interface

Common Classes

Common classes, such as Point, OcrString, and Image, are located in the se::common namespace. Their C++ headers are in the secommon directory:

Header	Description
<secommon/se_export_defs.h>	Contains export-related definitions of Smart Engines libraries
<secommon/se_exceptions_defs.h>	Contains the definition of exceptions used in Smart Engines libraries
<secommon/se_geometry.h>	Contains the geometric classes and procedures (Point, Rectangle, etc.)
<secommon/se_image.h>	Contains the definition of the Image class
<secommon/se_string.h>	Contains the string-related classes (MutableString, OcrString, etc.)
<secommon/se_string_iterator.h>	Contains the definition of string-targeted iterators
<secommon/se_serialization.h>	Contains the auxiliary classes related to object serialization
<secommon/se_common.h>	An auxiliary header which simply includes all of the above

In the Java API, the same common classes are located in the com.smartengines.common module:

Java

// Java
import com.smartengines.common.*; // Import all se::common classes

Main Classes

The main Smart Text Engine classes are located in the se::text namespace in the textengine directory:

Header	Description
<textengine/text_chunk_info.h>	Contains the `TextChunk` class definition
<textengine/text_engine.h>	Contains the `TextEngine` class definition
<textengine/text_session_settings.h>	Contains the `TextSessionSettings` class definition
<textengine/text_session.h>	Contains the `TextSession` class definition
<textengine/text_result.h>	Contains the `TextResult` class definition
<textengine/text_feedback.h>	Contains the TextFeedback interface and associated container
<textengine/text_forward_declarations.h>	A service header containing forward declarations
<textengine/text_iterator.h>	Contains the `TextIterator` class definition
<textengine/text_scene.h>	Contains the `TextScene` class definition

In the Java API, the same classes are contained in com.smartengines.doc:

Java

import com.smartengines.doc.*; // Import se::text

Exceptions

The C++ API may throw exceptions derived from se::common::BaseException when invalid input is passed, when calls are made incorrectly, or when something else goes wrong.

Exception	Description
FileSystemException	Thrown if an attempt is made to read data from a non-existent file
InternalException	Thrown in case of an unknown error in an internal system component
InvalidArgumentException	Thrown if a method is called with invalid input parameters
InvalidKeyException	Thrown when a container is accessed with an invalid or non-existent key, or a list is accessed with an invalid or out-of-range index
InvalidStateException	Thrown if a system error occurs due to an invalid state of the system objects
MemoryException	Thrown if an allocation is attempted with insufficient RAM
NotSupportedException	Thrown when trying to access a method which is not supported in the current library version or is not supported at all
UninitializedObjectException	Thrown in case of an attempt to access a non-existent or non-initialized object

When an exception is thrown, its what() method returns a user-friendly message describing the error.

Attention!

se::common::BaseException is not a subclass of std::exception. The Smart Text Engine interface does not have any dependency on the STL.
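A practical consequence is that a `catch (const std::exception&)` handler alone does not intercept such exceptions; a dedicated handler is needed. The sketch below illustrates this with `DemoBaseException`, a hypothetical stand-in that exposes what() but, like the class described above, does not derive from std::exception. `RunGuarded` is likewise a hypothetical helper.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an exception type that exposes what()
// but does not derive from std::exception.
class DemoBaseException {
 public:
  explicit DemoBaseException(const char* message) : message_(message) {}
  const char* what() const { return message_.c_str(); }

 private:
  std::string message_;
};

// Runs an action and reports which handler caught the exception, if any.
template <typename Action>
std::string RunGuarded(Action action) {
  try {
    action();
    return "ok";
  } catch (const DemoBaseException& e) {
    // Dedicated handler: std::exception handlers would not match this type.
    return std::string("sdk: ") + e.what();
  } catch (const std::exception& e) {
    return std::string("std: ") + e.what();
  }
}
```

In real code the first handler would presumably name se::common::BaseException, with the std::exception handler kept for errors raised by the application itself.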

In the Java API, exceptions are wrapped in the general java.lang.Exception class. The exception type is included in the corresponding message text.

PDF Recognition

For server recognition, PDF files are supported as an input format. PDF support is implemented via preliminary conversion of PDF documents into raster images (e.g. PNG), which are then passed to the recognition pipeline.

PDF-to-raster conversion can be performed using the open-source PDFium library via the pdfium_cli command-line utility.

To keep the SDK distribution small, the conversion utility is not included by default. Upon customer request, a ready-to-use PDF-to-raster conversion utility can be provided for a specific target architecture.

The utility accepts the following options:

Option	Description	Default
-i, --input	Path to the input PDF file	—
-o, --output	Output directory	`result`
-p, --prefix	Filename prefix for output files	`page_`
-d, --dpi	Rendering resolution in DPI	`300`
-r, --pages	Page range to render, e.g. "1-3,5,7" or "all"	`all`
-g, --grayscale	Render pages in grayscale (smaller PNG file size)	—
-h, --help	Print this help message and exit	—

Examples:

shell

./pdfium_cli -i file.pdf

shell

./pdfium_cli -i file.pdf -o out -d 150 -r 1-5
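When the converter is launched from application code, the command line can be assembled from a small options structure, as in the sketch below. `PdfRenderOptions` and `BuildPdfiumCommand` are hypothetical names; the flags and defaults come from the table above, and treating -g as a switch without a value is an assumption. Arguments are not shell-quoted here, so paths with spaces would need extra care.

```cpp
#include <string>

// Hypothetical set of options mirroring the flags in the table above.
struct PdfRenderOptions {
  std::string input;              // -i, required
  std::string output = "result";  // -o
  int dpi = 300;                  // -d
  std::string pages = "all";      // -r
  bool grayscale = false;         // -g (assumed to take no value)
};

// Builds the command line; arguments are not shell-quoted in this sketch.
std::string BuildPdfiumCommand(const PdfRenderOptions& options) {
  std::string command = "./pdfium_cli -i " + options.input;
  command += " -o " + options.output;
  command += " -d " + std::to_string(options.dpi);
  command += " -r " + options.pages;
  if (options.grayscale) {
    command += " -g";
  }
  return command;
}
```

The resulting string can then be passed to whatever process-launching facility the application already uses; each rendered page image is afterwards loaded like any other raster file.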
