Smart Text Engine Documentation
Smart Text Engine is a multi-platform standalone SDK for recognizing unstructured text fragments on documents and arbitrary images.
Workflow
The TextEngine workflow consists of the following stages:
- Creating a recognition engine.
- Creating a recognition session.
- Setting session options.
- Creating an image object.
- Recognition process.
- Processing the recognition result.
- Memory deallocation.
Creating a Recognition Engine
From System Description:
Recognition Engine is the object where all recognition tools are stored and initialized. It is created using the appropriate configuration bundle containing all possible settings: a list of documents supported by a specific SDK, their fields, a list of possible authentication checks, and so on.
INFO
In special cases, the bundle is not supplied separately, but as part of (inside) the library.
Initialize the engine in a single instance. One instance allows you to create multiple recognition sessions. However, creating multiple instances of the engine is also possible.
The initialization process is a resource-intensive operation (as is the image analysis itself), so perform it off the main UI thread.
Create a TextEngine instance:
// C++
std::unique_ptr<se::text::TextEngine> engine(
    se::text::TextEngine::Create(configuration_bundle_path));
// Java
TextEngine engine = TextEngine.Create(configuration_bundle_path);
Parameters:
- configuration_bundle_path: the path to the configuration bundle file containing TextEngine settings (a .se file);
- a boolean enabling or disabling lazy configuration (true by default). Optional.
Each delivery includes one or more bundles — archives containing the settings required for creating objects and configuring Smart Text Engine.
Attention!
TextEngine::Create() is a factory method and returns an allocated pointer. The caller is responsible for deleting it.
Creating a Recognition Session
From System Description:
Session settings is an object storing:
- The list of the supported documents for recognition, grouped by internal engines. It is set in the configuration bundle with which the engine was created (read-only);
- Advanced information about documents, including links to PRADO (read-only, used in Smart ID Engine only);
- The list of documents submitted for recognition (* by default);
- The list of expected fields for recognition (all by default);
- The list of document sets (the mode parameter, default by default);
- The special session options: the number of recognition threads, the expansion of the field list, the session timeout, and so on. You can find the full list of the options in our documentation.
Create a TextSessionSettings object using the configured TextEngine instance:
// C++
std::unique_ptr<se::text::TextSessionSettings> settings(engine->CreateSessionSettings());
// Java
TextSessionSettings settings = engine.CreateSessionSettings();
Attention!
TextEngine::CreateSessionSettings() is a factory method and returns an allocated pointer. The caller is responsible for deleting it.
Setting Session Options
Specifying Languages for TextSession
The Russian (rus) and English (eng) languages are supported.
// C++
settings->AddEnabledLanguages("rus"); // The Russian language
// Java
settings.AddEnabledLanguages("rus"); // The Russian language
Supported languages
A language is a string that encodes the real-world language you want to recognize, usually as an ISO 639-3 code. There are also two service languages: digits and punct. To list the languages that your delivered Smart Text Engine SDK can recognize, use the following procedure:
// C++
// Iterating through supported languages
for (auto it = settings->SupportedLanguagesBegin();
     it != settings->SupportedLanguagesEnd();
     ++it) {
  // Getting language code
  std::string lang = it.GetValue();
  // Getting the string of characters
  std::string char_str = settings->GetLanguageAlphabet(lang);
}
// Java
// Iterating through supported languages
for (StringSetIterator it = settings.SupportedLanguagesBegin();
     !it.Equals(settings.SupportedLanguagesEnd());
     it.Advance()) {
  // Getting language code
  String lang = it.GetValue();
  // Getting the string of characters
  String char_str = settings.GetLanguageAlphabet(lang);
}
To enable several languages, combine their codes using a colon, e.g. eng:digits.
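The colon-separated form can be assembled from a list of codes. The sketch below is only an illustration: JoinLanguages is a hypothetical helper that is not part of the SDK, and it assumes the combined string is then passed to AddEnabledLanguages as a plain C string.

```cpp
#include <string>
#include <vector>

// Joins language codes with a colon, e.g. {"eng", "digits"} -> "eng:digits".
// JoinLanguages is a hypothetical helper, not an SDK function.
std::string JoinLanguages(const std::vector<std::string>& codes) {
  std::string joined;
  for (const std::string& code : codes) {
    if (code.empty()) continue;          // Skip empty entries to avoid "::"
    if (!joined.empty()) joined += ':';  // Separator between codes
    joined += code;
  }
  return joined;
}
```

Under that assumption, a call could look like settings->AddEnabledLanguages(JoinLanguages({"eng", "digits"}).c_str()).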
Spawning a Session
From System Description:
A personal signature is provided to the customer with the product. It is contained in the README.html file in the /doc directory.
Each time an instance of the recognition session is created, the signature must be passed as one of the arguments. This confirms the caller's right to use the library and unlocks it.
The signature is verified offline. The library does not access any external resources.
Spawn a session (the TextSession object):
// C++
const char* signature = "... YOUR SIGNATURE HERE ..."; // Your personal signature you use to start Smart Text Engine session
std::unique_ptr<se::text::TextSession> session(engine->SpawnSession(*settings, signature));
// Java
String signature = "... YOUR SIGNATURE HERE ..."; // Your personal signature you use to start Smart Text Engine session
TextSession session = engine.SpawnSession(settings, signature);
Creating an Image Object
From System Description:
Pass an image as an object of the special se.common.Image class to the system for recognition. You can create it from the following image formats:
- jpeg, png;
- tiff (✔️TIFF_LZW, ✔️TIFF_PACKBITS, ✔️TIFF_CCITT);
- base64 (the formats mentioned above);
- file buffer with a preliminary indication of the color scheme, width, height, and number of channels.
The maximum allowed image size by default is 15000x15000px. You can change the maximum image size.
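Before handing a raw buffer to the SDK, it can help to check the dimensions against the default limit and to compute the buffer layout. This is a minimal sketch with two assumptions: rows are tightly packed, and each channel takes one byte. FitsImageLimit, PackedLayout and RawLayout are hypothetical names, not SDK symbols.

```cpp
#include <cstdint>

// Default maximum image side, taken from the limit quoted above.
constexpr std::int64_t kDefaultMaxSide = 15000;

// Layout of a tightly packed pixel buffer (assumption: 1 byte per channel).
struct RawLayout {
  std::int64_t stride_bytes;  // Bytes per image row
  std::int64_t total_bytes;   // Bytes in the whole buffer
};

// Returns true when both sides are positive and within the limit.
bool FitsImageLimit(std::int64_t width, std::int64_t height,
                    std::int64_t max_side = kDefaultMaxSide) {
  return width > 0 && height > 0 && width <= max_side && height <= max_side;
}

// Computes the row stride and total size for a packed buffer.
RawLayout PackedLayout(std::int64_t width, std::int64_t height,
                       std::int64_t channels) {
  const std::int64_t stride = width * channels;
  return RawLayout{stride, stride * height};
}
```

For example, an RGB image of the maximum default size occupies 15000 * 15000 * 3 = 675,000,000 bytes under these assumptions.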
HEIC Support
HEIC files in the mobile SDK are handled similarly to other image formats: they are read using system tools. In the server SDK, open the HEIC file using external tools and either convert it to one of the supported formats or pass the raw pixels directly as an RGB buffer (recommended).
PDF Support
PDF files are supported with preliminary conversion to raster formats. Upon customer request, Smart Engines provides such a converter for the selected architecture.
Create an Image object for further processing:
// C++
std::unique_ptr<se::common::Image> image(
    se::common::Image::FromFile(image_path)); // Loading from file
// Java
Image image = Image.FromFile(image_path); // Loading from file
Information
To pass any images to the engine, an object of the se.common.Image class is required.
Currently available methods:
- Image.FromFile(imagePath): takes the local path to the file as input;
- Image.FromFileBuffer(data): takes a file read into a buffer;
- Image.FromBase64Buffer(data): takes a file encoded in base64 and read into a buffer;
- Image.FromBufferExtended(raw_data, width, height, stride, pixel_format, bytes_per_channel): creates an Image object from raw pixels;
- Image.FromYUV(planeY, planeU, planeV, yuvDimensions): creates an Image object from YUV_420_888.
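For the base64 variant, the file bytes have to be encoded first. The sketch below shows standard base64 encoding with '=' padding. Base64Encode is a hypothetical helper, and it is an assumption that the SDK expects this padded form without line breaks.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Encodes raw bytes as standard base64 with '=' padding.
// Base64Encode is a hypothetical helper, not an SDK function.
std::string Base64Encode(const std::vector<std::uint8_t>& bytes) {
  static const char kAlphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  out.reserve((bytes.size() + 2) / 3 * 4);
  std::size_t i = 0;
  // Each full 3-byte group becomes 4 characters.
  for (; i + 2 < bytes.size(); i += 3) {
    const std::uint32_t n = (static_cast<std::uint32_t>(bytes[i]) << 16) |
                            (static_cast<std::uint32_t>(bytes[i + 1]) << 8) |
                            static_cast<std::uint32_t>(bytes[i + 2]);
    out += kAlphabet[(n >> 18) & 63];
    out += kAlphabet[(n >> 12) & 63];
    out += kAlphabet[(n >> 6) & 63];
    out += kAlphabet[n & 63];
  }
  // The remaining 1 or 2 bytes are padded with '='.
  const std::size_t rest = bytes.size() - i;
  if (rest == 1) {
    const std::uint32_t n = static_cast<std::uint32_t>(bytes[i]) << 16;
    out += kAlphabet[(n >> 18) & 63];
    out += kAlphabet[(n >> 12) & 63];
    out += "==";
  } else if (rest == 2) {
    const std::uint32_t n = (static_cast<std::uint32_t>(bytes[i]) << 16) |
                            (static_cast<std::uint32_t>(bytes[i + 1]) << 8);
    out += kAlphabet[(n >> 18) & 63];
    out += kAlphabet[(n >> 12) & 63];
    out += kAlphabet[(n >> 6) & 63];
    out += '=';
  }
  return out;
}
```

The three ASCII bytes of "Man" encode to "TWFu", which is a convenient sanity check for any base64 routine.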
Attention!
Image::FromFile() is a factory method and returns an allocated pointer. The caller is responsible for deleting it.
Recognition Process
Call the ProcessImage(...) method to process the image:
// C++
session->ProcessImage(*image);
// Java
session.ProcessImage(image);
Processing the Recognition Result
Get the current result from the session:
// C++
const se::text::TextResult& result = session->GetCurrentResult();
// Java
TextResult result = session.GetCurrentResult();
To get the TextScene object containing the recognition results, call the GetCurrentScene() method:
// C++
const se::text::TextScene& scene = result.GetCurrentScene();
// Java
TextScene scene = result.GetCurrentScene();
To iterate over the recognized text fragments, which are described by a collection of TextChunk instances, get a TextIterator object from the TextScene object:
// C++
std::unique_ptr<se::text::TextIterator> chunk_iterator;
chunk_iterator.reset(scene.CreateIterator("default"));
for (; !chunk_iterator->Finished(); chunk_iterator->Advance()) {
  // Getting text chunk value (UTF-8 string representation)
  std::string chunk_str = chunk_iterator->GetTextChunk().GetOcrString().GetFirstString().GetCStr();
}
// Java
TextIterator chunk_iterator = scene.CreateIterator("default");
for (; !chunk_iterator.Finished(); chunk_iterator.Advance()) {
  // Getting text chunk value (UTF-8 string representation)
  String chunk_str = chunk_iterator.GetTextChunk().GetOcrString().GetFirstString().GetCStr();
}
Memory Deallocation
Some Smart Text Engine SDK classes have factory methods that return pointers to heap-allocated objects. The caller is responsible for deleting such objects.
TIP
- In C++, use smart pointers such as std::unique_ptr<T> or std::shared_ptr<T> to simplify memory management and avoid memory leaks.
- In the Java API, use the .delete() method to free objects that are no longer needed.
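The C++ ownership rule can be illustrated with a stand-in class. In this sketch FakeEngine is a hypothetical type that only mimics the factory-method pattern, and its instance counter exists purely to show that the object is released. It is not the real TextEngine.

```cpp
#include <memory>

// Stand-in for an SDK class with a factory method; NOT the real TextEngine.
class FakeEngine {
 public:
  static int live_count;  // Number of live heap instances (demo only)

  // Factory method: returns a heap-allocated object owned by the caller.
  static FakeEngine* Create() { return new FakeEngine(); }

  ~FakeEngine() { --live_count; }

 private:
  FakeEngine() { ++live_count; }
};

int FakeEngine::live_count = 0;

// Returns the number of live objects while the smart pointer owns one.
int LiveWhileOwned() {
  std::unique_ptr<FakeEngine> engine(FakeEngine::Create());
  return FakeEngine::live_count;
}  // engine goes out of scope here and deletes the object
```

Wrapping the returned pointer immediately, as in LiveWhileOwned, means no explicit delete is needed even if an exception is thrown later in the scope.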
Library Interface
Common Classes
Common classes, such as Point, OcrString, Image, etc., are located in the se::common namespace in the secommon directory.
The C++ headers are as follows:
| Header | Description |
|---|---|
| <secommon/se_export_defs.h> | Contains export-related definitions of Smart Engines libraries |
| <secommon/se_exceptions_defs.h> | Contains the definition of exceptions used in Smart Engines libraries |
| <secommon/se_geometry.h> | Contains the geometric classes and procedures (Point, Rectangle, etc.) |
| <secommon/se_image.h> | Contains the definition of the Image class |
| <secommon/se_string.h> | Contains the string-related classes (MutableString, OcrString, etc.) |
| <secommon/se_string_iterator.h> | Contains the definition of string-targeted iterators |
| <secommon/se_serialization.h> | Contains the auxiliary classes related to object serialization |
| <secommon/se_common.h> | An auxiliary header which simply includes all of the above |
In the Java API, the same common classes are located in the com.smartengines.common module:
// Java
import com.smartengines.common.*; // Import all se::common classes
Main Classes
The main Smart Text Engine classes are located in the se::text namespace in the textengine directory:
| Header | Description |
|---|---|
| <textengine/text_chunk_info.h> | Contains the TextChunk class definition |
| <textengine/text_engine.h> | Contains the TextEngine class definition |
| <textengine/text_session_settings.h> | Contains the TextSessionSettings class definition |
| <textengine/text_session.h> | Contains the TextSession class definition |
| <textengine/text_result.h> | Contains the TextResult class definition |
| <textengine/text_feedback.h> | Contains the TextFeedback interface and associated container |
| <textengine/text_forward_declarations.h> | A service header containing forward declarations |
| <textengine/text_iterator.h> | Contains the TextIterator class definition |
| <textengine/text_scene.h> | Contains the TextScene class definition |
In the Java API, the same classes are contained in com.smartengines.doc:
// Java
import com.smartengines.doc.*; // Import se::text
Exceptions
The C++ API may throw exceptions derived from se::common::BaseException when invalid input data is passed, incorrect calls are made, or something else goes wrong.
| Exception | Description |
|---|---|
| FileSystemException | Thrown if an attempt is made to read data from a non-existent file |
| InternalException | Thrown in case of an unknown error of an internal system component |
| InvalidArgumentException | Thrown if a method with invalid input parameters is called |
| InvalidKeyException | Thrown on access to a container using an invalid or non-existent key, or on access to a list using an invalid or out-of-range index |
| InvalidStateException | Thrown if a system error occurs due to an invalid state of the system objects |
| MemoryException | Thrown if an allocation is attempted with insufficient RAM |
| NotSupportedException | Thrown when trying to access a method which is not supported in the current library version or is not supported at all |
| UninitializedObjectException | Thrown on an attempt to access a non-existent or uninitialized object |
When an exception is thrown, a user-friendly message is available through the e.what() method.
Attention!
se::common::BaseException is not a subclass of std::exception. The Smart Text Engine interface does not have any dependency on the STL.
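A practical consequence is that a handler for std::exception will not catch SDK exceptions. The sketch below demonstrates this with a stand-in type: FakeBaseException and FailingCall are hypothetical and only mimic a class that exposes what() without deriving from std::exception.

```cpp
#include <exception>
#include <string>

// Stand-in for se::common::BaseException: it exposes what() but does not
// derive from std::exception. The real class lives in the SDK headers.
class FakeBaseException {
 public:
  explicit FakeBaseException(const char* msg) : msg_(msg) {}
  const char* what() const { return msg_; }

 private:
  const char* msg_;
};

// Hypothetical SDK-like call that always fails.
void FailingCall() { throw FakeBaseException("invalid argument"); }

// Returns a label showing which handler caught the exception.
std::string Classify() {
  try {
    FailingCall();
  } catch (const std::exception& e) {
    // Not reached: the stand-in type is unrelated to std::exception.
    return std::string("std: ") + e.what();
  } catch (const FakeBaseException& e) {
    return std::string("sdk: ") + e.what();
  }
  return "none";
}
```

In real code, this suggests adding a dedicated catch clause for se::common::BaseException next to any std::exception handler.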
The Java API exceptions are wrapped in the general java.lang.Exception class. The exception type is included in the corresponding message text.
PDF Recognition
For server recognition, PDF files are supported as an input format. PDF support is implemented via preliminary conversion of PDF documents into raster images (e.g. PNG), which are then passed to the recognition pipeline.
PDF-to-raster conversion can be performed using the open-source PDFium library via the pdfium_cli command-line utility.
To reduce the size of the SDK distribution, the PDF conversion utility is not included by default. A ready-to-use build for a specific target architecture is provided upon customer request.
The pdfium_cli utility accepts the following options:
| Option | Description | Default |
|---|---|---|
| -i, --input | Path to the input PDF file | — |
| -o, --output | Output directory | result |
| -p, --prefix | Filename prefix for output files | page_ |
| -d, --dpi | Rendering resolution in DPI | 300 |
| -r, --pages | Page range to render, e.g. "1-3,5,7" or "all" | all |
| -g, --grayscale | Render pages in grayscale (smaller PNG file size) | — |
| -h, --help | Print this help message and exit | — |
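To make the page range format concrete, the sketch below expands a value such as "1-3,5,7" into individual page numbers. It is only an illustration of the format and not the parser used by pdfium_cli. ExpandPageRange is a hypothetical helper, and it assumes 1-based page numbers clamped to the document length.

```cpp
#include <algorithm>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Expands a range expression like "1-3,5,7" or "all" into page numbers.
// Assumption: pages are 1-based and out-of-range pages are dropped.
std::vector<int> ExpandPageRange(const std::string& spec, int page_count) {
  std::vector<int> pages;
  if (spec == "all") {
    for (int p = 1; p <= page_count; ++p) pages.push_back(p);
    return pages;
  }
  std::istringstream parts(spec);
  std::string part;
  while (std::getline(parts, part, ',')) {
    if (part.empty()) continue;
    const std::string::size_type dash = part.find('-');
    // A single page "5" is treated as the range 5-5.
    int first = std::atoi(part.substr(0, dash).c_str());
    int last = (dash == std::string::npos)
                   ? first
                   : std::atoi(part.substr(dash + 1).c_str());
    first = std::max(first, 1);
    last = std::min(last, page_count);
    for (int p = first; p <= last; ++p) pages.push_back(p);
  }
  return pages;
}
```

With a 10-page document, "1-3,5,7" expands to pages 1, 2, 3, 5 and 7.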
Examples:
./pdfium_cli -i file.pdf
./pdfium_cli -i file.pdf -o out -d 150 -r 1-5
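When the converter is launched from a C++ service, the command line can be assembled from the documented options. This is a minimal sketch: BuildPdfiumCommand is a hypothetical helper, the ./pdfium_cli path is taken from the examples above, and arguments are not shell-quoted, so paths with spaces would need extra handling.

```cpp
#include <string>

// Builds a pdfium_cli command line from the options in the table above.
// Options left at their documented defaults are omitted.
// BuildPdfiumCommand is a hypothetical helper; no shell quoting is done.
std::string BuildPdfiumCommand(const std::string& input_pdf,
                               const std::string& output_dir = "result",
                               int dpi = 300,
                               const std::string& pages = "all",
                               bool grayscale = false) {
  std::string cmd = "./pdfium_cli -i " + input_pdf;
  if (output_dir != "result") cmd += " -o " + output_dir;
  if (dpi != 300) cmd += " -d " + std::to_string(dpi);
  if (pages != "all") cmd += " -r " + pages;
  if (grayscale) cmd += " -g";
  return cmd;
}
```

Calling BuildPdfiumCommand("file.pdf", "out", 150, "1-5") reproduces the second example command above.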