Welcome to the NeoML documentation!

The neoml module provides a Python interface for the C++ NeoML library.

NeoML is an end-to-end machine learning framework that allows you to build, train, and deploy ML models. This framework is used by ABBYY engineers for computer vision and natural language processing tasks, including image preprocessing, classification, document layout analysis, OCR, and data extraction from structured and unstructured documents.

Basic principles

Platform independence

The user interface is completely separated from the low-level calculations, which are implemented by a math engine. All you have to do is specify, at the start, the type of math engine to use for calculations. You can also have the math engine selected automatically, based on the detected device configuration.

The rest of your machine learning code will be the same regardless of the math engine you choose.
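As a minimal sketch of that one-time choice, the helper below wraps the constructors from the neoml.MathEngine submodule. The helper name create_math_engine and its "auto"/"cpu"/"gpu" labels are assumptions made for this illustration, and constructor arguments may differ between NeoML versions, so check the API reference for the version you have installed.

```python
def create_math_engine(device="auto", gpu_index=0):
    """Return a NeoML math engine for the requested device.

    `device` is one of "auto", "cpu" or "gpu"; these labels are
    hypothetical and exist only in this sketch.
    """
    # Imported lazily so the helper can be defined without neoml installed.
    import neoml

    if device == "cpu":
        return neoml.MathEngine.CpuMathEngine()
    if device == "gpu":
        # gpu_index selects one of the GPUs visible to the library.
        return neoml.MathEngine.GpuMathEngine(gpu_index)
    if device == "auto":
        # Let the library choose based on the detected device configuration.
        return neoml.MathEngine.default_math_engine()
    raise ValueError(f"unknown device: {device!r}")
```

Only this call changes when you switch devices; every network and layer built afterwards receives the returned engine and is otherwise written the same way.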

Math engine independence

Each network works with one math engine instance, and all its layers must be created with that same math engine. If you choose a GPU math engine, it performs all calculations. This means you cannot use a CPU for “light” calculations like adding vectors and a GPU for “heavy” calculations like multiplying matrices. We introduced this restriction to avoid unnecessary synchronization and data exchange between devices.

Multi-threading support

The math engine interface is thread-safe; the same instance may be used in different networks and different threads. Note that this may entail some synchronization overhead.

However, the neural network implementation is not thread-safe; the network may run only in one thread.
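One way to respect both rules is to share a single math engine while giving every thread its own network. The sketch below uses only the standard library; PerThreadNetwork and the build_network factory are hypothetical names introduced for this illustration, not part of the NeoML API.

```python
import threading


class PerThreadNetwork:
    """Hand each thread its own network built on one shared math engine."""

    def __init__(self, math_engine, build_network):
        # The math engine is shared because its interface is thread-safe.
        self._math_engine = math_engine
        # Hypothetical factory: takes the math engine, returns a new network.
        self._build_network = build_network
        # Storage whose attributes are private to each thread.
        self._local = threading.local()

    def get(self):
        # Build the network lazily, once per calling thread.
        if not hasattr(self._local, "network"):
            self._local.network = self._build_network(self._math_engine)
        return self._local.network
```

Each worker thread calls get() and runs only the network it receives, so no network object is ever touched by two threads. Sharing the engine may still incur the synchronization overhead mentioned above.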

Serialization format

Trained models can be serialized in the internal format, which is compatible with the C++ version of the library, or in the standard pickle format for Python.
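For the pickle route, a minimal sketch with the standard pickle module could look like this. The helper names save_model and load_model are hypothetical, and the sketch assumes the trained model object you pass in supports pickling as described above. The internal C++-compatible format is not shown here.

```python
import pickle


def save_model(model, path):
    """Write a picklable model object to `path`."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Read back an object previously written by save_model."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Unpickling can execute arbitrary code, so load only files that come from a source you trust.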

GPU support

Processing on a GPU often significantly improves the performance of mathematical operations. The NeoML library can use a GPU both for training and for running models. This is an optional setting and depends on the hardware and software capabilities of your system.

To work on a GPU, the library requires an NVIDIA GPU with CUDA 11.2 update 1 support.

Submodules

The Python API contains several submodules:

Tutorials

Here are several guides that walk you through using NeoML for simple practical tasks, illustrating the specifics of building NeoML networks, working with the blob data format, and evaluating the performance.

Installation

Install a stable version of the NeoML library from PyPI:

pip3 install neoml

Install the library from a local copy downloaded from our GitHub repo:

cd <path to neoml>/NeoML/Python
python3 setup.py install

Supported Python versions: 3.8 to 3.11
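To confirm that an environment matches the supported range, a short check such as the following may help. The function names are hypothetical, and the version bounds are simply copied from the line above.

```python
import importlib.util
import sys

# Bounds copied from the supported range stated in this section.
SUPPORTED_MIN = (3, 8)
SUPPORTED_MAX = (3, 11)


def is_supported_python(version=None):
    """Return True if the (major, minor) version is in the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


def neoml_installed():
    """Return True if the neoml package can be found, without importing it."""
    return importlib.util.find_spec("neoml") is not None
```

Calling is_supported_python() with no argument checks the running interpreter; passing a tuple such as (3, 12) checks a specific version.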

If you’re going to use a GPU for processing, also install CUDA 11.2 update 1.