Computer Vision

⏳ Performance work in progress

This system model is under ongoing development, and not all functions fully utilize the JIT compiler yet. As a result, some operations currently take a long time to complete.

🚀 Interactive documentation

This page uses Studio code blocks so you can run the examples directly in the browser. You only need to sign up for SA Studio (it's free). Once you have done that you can execute the code blocks on this page.

Module specification
SA Engine version: 5.2.0
Supported platforms: Windows, Linux (x86), Raspberry Pi

SA Engine contains the cv system model, a computer vision library with a wide range of functions and algorithms for image processing and analysis. The data representation is fully compatible with Python NumPy, and the library relies on PIL/Pillow for image I/O and image annotation. Functions that use Python are prefixed with py:.

Info

The images downloaded in the "Download example images" section are used by the examples in all other sections. Before trying the examples in a specific section, make sure that you have run the code in the "Download example images" section.

Load module

System models are loaded with the system_models:load() function. For this documentation we need the image_io and cv system models.

system_models:load('image_io');
system_models:load('cv');

Download example images

Let's first download some images that we will use in the examples.

http:download_file("https://assets.streamanalyze.com/public/images/baboon.png",{},
                   sa_home() + "baboon.png");
http:download_file("https://assets.streamanalyze.com/public/images/cameraman.png",{},
                   sa_home() + "cameraman.png");
http:download_file("https://assets.streamanalyze.com/public/images/einstein.png",{},
                   sa_home() + "einstein.png");
http:download_file("https://assets.streamanalyze.com/public/images/meanshift.png",{},
                   sa_home() + "meanshift.png");

Image I/O

There are currently two ways of loading and saving images in SA Engine. For PNG files it is recommended to use the png:read() and png:write() functions in the image_io system model. For all other formats we have py:imread() and py:imwrite(), which are wrapper functions for reading and writing images with Python using NumPy and PIL/Pillow.

//plot: Bitmap
set :baboon = png:read(sa_home() + "baboon.png");

If we check the data format and the shape of the image array we see that it is an 8-bit unsigned integer image of size [512,512,3], which means it has 512 rows (1st index), 512 columns (2nd index), and 3 color channels (3rd index).

format(:baboon);
// "U8"
shape(:baboon);
// [512,512,3]
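
To make the index order concrete, here is a minimal Python sketch (plain nested lists, not SA Engine code) of an array with the same [rows, columns, channels] layout. The tiny 2x3 image and the helper name shape_of are made up for illustration.

```python
# Hypothetical 2x3 color image stored as rows -> columns -> channels,
# the same index order as the [512,512,3] array above.
image = [
    [[255, 0, 0], [0, 255, 0], [0, 0, 255]],      # row 0
    [[10, 20, 30], [40, 50, 60], [70, 80, 90]],   # row 1
]

def shape_of(array):
    """Return the shape of a regular nested list as a list of sizes."""
    shape = []
    while isinstance(array, list):
        shape.append(len(array))
        array = array[0]
    return shape

print(shape_of(image))   # [2, 3, 3]
print(image[1][2])       # pixel at row 1, column 2 -> [70, 80, 90]
print(image[1][2][0])    # first channel of that pixel -> 70
```

Note that the Python sketch counts indexes from 0.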

To save images to disk we use the png:write() function.

png:write(sa_home() + "my_image.png", :baboon, {});

We can verify that the image was saved by loading it again.

//plot: Bitmap
png:read(sa_home() + "my_image.png");

Resizing

The cv library supports resizing images with the functions cv:resize() and cv:scale().

Let's first load an image that we can rescale.

//plot: Bitmap
set :im = png:read(sa_home() + "baboon.png");

Now we can use cv:resize() to resize the image by providing a new height and width.

//plot: Bitmap
cv:resize(:im, 200, 300);

Or we can use cv:scale(), which takes a scale fraction instead of a height and width.

//plot: Bitmap
cv:scale(:im, 0.4);
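
The relationship between the two calls can be sketched in plain Python. This is only an illustration under stated assumptions: it uses nearest-neighbour sampling and truncates the scaled size to an integer, and this page does not say which interpolation or rounding rule cv:resize() and cv:scale() actually use. The function names resize_nearest and scale_nearest are hypothetical.

```python
def resize_nearest(image, new_height, new_width):
    """Resize a 2-D list of pixels using nearest-neighbour sampling."""
    old_height, old_width = len(image), len(image[0])
    return [
        [image[y * old_height // new_height][x * old_width // new_width]
         for x in range(new_width)]
        for y in range(new_height)
    ]

def scale_nearest(image, fraction):
    """Scale both dimensions by the same fraction (sizes truncated to int)."""
    new_height = int(len(image) * fraction)
    new_width = int(len(image[0]) * fraction)
    return resize_nearest(image, new_height, new_width)

# 4x4 test image whose pixel value encodes its position (10 * row + column).
image = [[10 * y + x for x in range(4)] for y in range(4)]
print(scale_nearest(image, 0.5))   # [[0, 2], [20, 22]]
print(int(512 * 0.4))              # 204
```

Truncating 512 * 0.4 = 204.8 gives 204, which is consistent with the [204,204,3] shape reported in the Conversion section, but that does not prove which rounding rule the library applies.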

Conversion

We can convert images between RGB and grayscale with the functions cv:rgb2gray() and cv:gray2rgb().

We start by loading a color image.

//plot: Bitmap
set :im_rgb = png:read(sa_home() + "baboon.png");
set :im_rgb = cv:scale(:im_rgb, 0.4);

We can now convert the image to grayscale with cv:rgb2gray().

//plot: Bitmap
set :im_gray = cv:rgb2gray(:im_rgb);

If we look at the shape of the output image we see that it has only one color channel, compared to the color image, which has three.

shape(:im_rgb);
// [204,204,3]
shape(:im_gray);
// [204,204]
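
As a rough picture of what such a conversion does, the following Python sketch collapses each [r, g, b] pixel into one weighted value. The BT.601 luma weights (0.299, 0.587, 0.114) are a common choice and an assumption here; this page does not state which weights cv:rgb2gray() uses. The name rgb_to_gray is hypothetical.

```python
def rgb_to_gray(image):
    """Convert rows of [r, g, b] pixels to rows of single gray values."""
    # BT.601 luma weights: an assumption, not necessarily what cv:rgb2gray() uses.
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
        for row in image
    ]

image = [[[255, 255, 255], [255, 0, 0]],
         [[0, 255, 0], [0, 0, 0]]]
gray = rgb_to_gray(image)

print(len(image), len(image[0]), len(image[0][0]))  # 2 2 3
print(len(gray), len(gray[0]))                      # 2 2
print(gray)                                         # [[255, 76], [150, 0]]
```

The channel dimension disappears in the same way as in the shapes above.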

We can convert the grayscale image back to a color image with cv:gray2rgb().

//plot: Bitmap
set :im_rgb = cv:gray2rgb(:im_gray);

The output image is still gray since we have not added any color information, but it now has three color channels instead of one.

shape(:im_rgb);
// [204,204,3]

We can also use a color map when converting grayscale images to color by providing a color map to cv:gray2rgb().

//plot: Bitmap
set :im_rgb = cv:gray2rgb(:im_gray, cv:const:COLORMAP_HOT());

Cropping and masking

The cv library supports cropping and masking of images through the cv:crop() and cv:mask() functions. They work on both graylevel and color images.

Let's first load an image that we can crop and mask.

//plot: Bitmap
set :im = png:read(sa_home() + "baboon.png");

Now we can crop the image with cv:crop(), which takes the top left corner (y,x) followed by a height and a width.

//plot: Bitmap
cv:crop(:im, 50, 100, 200, 200);
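
In array terms, cropping is just slicing. The sketch below is plain Python rather than SA Engine code, and it assumes that (y,x) names the corner with the smallest row and column index; the function name crop is a stand-in for illustration.

```python
def crop(image, y, x, height, width):
    """Return the height-by-width window whose first pixel is image[y][x]."""
    return [row[x:x + width] for row in image[y:y + height]]

# 5x6 test image whose pixel value encodes its position (10 * row + column).
image = [[10 * r + c for c in range(6)] for r in range(5)]
window = crop(image, 1, 2, 2, 3)
print(window)  # [[12, 13, 14], [22, 23, 24]]
```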

We can also mask the image with cv:mask(), which takes a mask as input. Only pixels where the mask is >0 will be present in the output image. All other pixels will be set to 0.

First we define a function that creates a small circular mask.

create function my_mask() -> Array
  as select Array[y..512, x..512] of U8 e
      where e = case when sqrt((y-65)^2 + (x-160)^2) > 50 then 0
                     else 255 end;

Now we can mask the image.

//plot: Bitmap
cv:mask(:im, my_mask());
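
The same idea can be written out in plain Python for a small grayscale image. This is a sketch of the masking rule described above (keep pixels where the mask is >0, zero the rest), not the library's implementation; the helper names circular_mask and apply_mask are hypothetical and the indexes are 0-based.

```python
import math

def circular_mask(height, width, center_y, center_x, radius):
    """255 inside the circle (distance <= radius), 0 outside."""
    return [
        [255 if math.hypot(y - center_y, x - center_x) <= radius else 0
         for x in range(width)]
        for y in range(height)
    ]

def apply_mask(image, mask):
    """Keep pixels where the mask is > 0 and set all others to 0."""
    return [
        [pixel if m > 0 else 0 for pixel, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]

image = [[9] * 5 for _ in range(5)]
mask = circular_mask(5, 5, 2, 2, 1)
masked = apply_mask(image, mask)
for row in masked:
    print(row)
# [0, 0, 0, 0, 0]
# [0, 0, 9, 0, 0]
# [0, 9, 9, 9, 0]
# [0, 0, 9, 0, 0]
# [0, 0, 0, 0, 0]
```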

Filtering

The cv library supports filtering of graylevel images with the cv:filter_2d() function.

We start by loading a graylevel image.

//plot: Bitmap
set :im = png:read(sa_home() + "cameraman.png");
set :im = cv:rgb2gray(:im);
set :im = cv:resize(:im,150,150);

We can get a Gaussian kernel with the cv:get_gaussian_kernel() function.

set :kernel = cv:get_gaussian_kernel(5,1.0);
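
For readers who want to see what such a kernel contains, here is a small Python sketch that builds a normalized 2-D Gaussian kernel. It assumes that the two arguments above are the kernel size and the standard deviation, and that the kernel is two-dimensional; neither is confirmed on this page. The name gaussian_kernel is hypothetical.

```python
import math

def gaussian_kernel(size, sigma):
    """Return a size-by-size Gaussian kernel whose weights sum to 1."""
    half = (size - 1) / 2
    weights = [
        [math.exp(-((y - half) ** 2 + (x - half) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]

kernel = gaussian_kernel(5, 1.0)
print(round(sum(sum(row) for row in kernel), 6))          # 1.0
print(kernel[2][2] == max(max(row) for row in kernel))    # True: peak at the center
```

Because the weights sum to 1, filtering with such a kernel blurs the image without changing its overall brightness.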

We can now filter the image with the Gaussian kernel.

//plot: Bitmap
cv:filter_2d(:im, :kernel);

It is also easy to create new kernels. For example, here we create a kernel that approximates the first-order derivative in the horizontal direction.

set :kernel = array("F64", [[1, 0, -1],
                            [2, 0, -2],
                            [1, 0, -1]]);

Now we can use the kernel to find vertical edges in the image.

//plot: Bitmap
cv:filter_2d(:im, :kernel);
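
The following Python sketch shows why this kernel responds to vertical edges. It slides the kernel over the image and sums the element-wise products (correlation), leaving the border pixels at 0. Both choices are assumptions made for the sketch; this page does not say how cv:filter_2d() handles borders or whether it flips the kernel. The function name filter_2d below is a plain-Python stand-in.

```python
def filter_2d(image, kernel):
    """Correlate a 2-D image with a kernel; border pixels are left at 0."""
    half_y, half_x = len(kernel) // 2, len(kernel[0]) // 2
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(half_y, height - half_y):
        for x in range(half_x, width - half_x):
            out[y][x] = sum(
                kernel[j][i] * image[y + j - half_y][x + i - half_x]
                for j in range(len(kernel))
                for i in range(len(kernel[0]))
            )
    return out

kernel = [[1, 0, -1],
          [2, 0, -2],
          [1, 0, -1]]

# Dark left half, bright right half: one vertical edge between columns 2 and 3.
image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
filtered = filter_2d(image, kernel)
print(filtered[2])  # [0, 0, -400, -400, 0, 0]
```

The response is non-zero only in the two columns next to the edge and zero in the flat regions. The sign of the response depends on the direction of the intensity change and on the correlation convention used here.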

Drawing

The cv library supports a limited set of functions for drawing boxes and putting text in images.

We start by loading an image.

//plot: Bitmap
set :im = png:read(sa_home() + "baboon.png");

Now we can draw a simple rectangle in the image with the cv:draw_rect() function.

//plot: Bitmap
cv:draw_rect(:im, 40, 90, 130, 220, array("U8", [255,0,0]));
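
As an illustration of what drawing a rectangle means at the array level, here is a minimal Python sketch. It assumes the four numbers are given in the order top, bottom, left, right (inclusive) and that the outline is one pixel wide; both are assumptions based on the example calls on this page, and the function draw_rect below is a plain-Python stand-in, not the library function.

```python
def draw_rect(image, top, bottom, left, right, color):
    """Return a copy of image with a one-pixel rectangle outline in color."""
    out = [[list(pixel) for pixel in row] for row in image]
    for x in range(left, right + 1):      # horizontal edges
        out[top][x] = list(color)
        out[bottom][x] = list(color)
    for y in range(top, bottom + 1):      # vertical edges
        out[y][left] = list(color)
        out[y][right] = list(color)
    return out

black = [[[0, 0, 0] for _ in range(6)] for _ in range(5)]
boxed = draw_rect(black, 1, 3, 1, 4, [255, 0, 0])
for row in boxed:
    print("".join("#" if pixel == [255, 0, 0] else "." for pixel in row))
# ......
# .####.
# .#..#.
# .####.
# ......
```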

Finding extrema

The cv library supports finding maxima in floating point images with the cv:findmax() function.

First let's create a floating point image.

create function my_image() -> Array of F64
  as select Array grad[y..256, x..256] of F64 val
       from Real d
      where d = minkowski([x,y], [180,50]+0.2, 2)
        and val = max((1/d)*(128 - d)/128, 0);

If we look at the image we see that it has a maximum at [50,180].

//plot: Bitmap
my_image();

We use cv:findmax() to find the pixel coordinates with the maximum value.

cv:findmax(my_image());
// [50,180]
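
Conceptually this is an argmax over the two image dimensions, returned in [row, column] order. A minimal Python sketch, with the hypothetical name find_max and 0-based indexes:

```python
def find_max(image):
    """Return [row, column] of the largest value in a 2-D list."""
    best_y, best_x = 0, 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value > image[best_y][best_x]:
                best_y, best_x = y, x
    return [best_y, best_x]

image = [[0.1, 0.2, 0.1],
         [0.3, 0.9, 0.4],
         [0.2, 0.5, 0.3]]
print(find_max(image))  # [1, 1]
```

If several pixels share the maximum value, this sketch returns the first one in row-major order; how cv:findmax() breaks ties is not stated on this page.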

Measuring image similarity

The cv library supports measuring the similarity between two images with the cv:ssim() function that implements the structural similarity index measure (SSIM).

We load two images that we can use to measure the similarity. First the original image.

//plot: Bitmap
set :im1 = png:read(sa_home() + "einstein.png");

Then we load a copy of the first image that has been altered slightly.

//plot: Bitmap
set :im2 = png:read(sa_home() + "meanshift.png");

Now we can use cv:ssim() to measure the similarity between the images.

cv:ssim(array("F64", :im1), array("F64", :im2), 11, 1.5, 0.01, 0.03);
// 0.99470285061723
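
To give a feeling for what the index measures, here is a simplified Python sketch that evaluates the SSIM formula once over the whole image instead of averaging it over sliding Gaussian windows. The values 11, 1.5, 0.01 and 0.03 in the call above match the window size, window standard deviation, K1 and K2 commonly used in the SSIM literature, but that mapping onto the arguments of cv:ssim() is an assumption, as is the dynamic range of 255. The name global_ssim is hypothetical.

```python
def global_ssim(image_a, image_b, k1=0.01, k2=0.03, dynamic_range=255.0):
    """SSIM computed once over the whole image (no sliding window)."""
    a = [float(v) for row in image_a for v in row]
    b = [float(v) for row in image_b for v in row]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((v - mean_a) ** 2 for v in a) / n
    var_b = sum((v - mean_b) ** 2 for v in b) / n
    cov = sum((va - mean_a) * (vb - mean_b) for va, vb in zip(a, b)) / n
    c1 = (k1 * dynamic_range) ** 2   # stabilizes the luminance term
    c2 = (k2 * dynamic_range) ** 2   # stabilizes the contrast/structure term
    return ((2 * mean_a * mean_b + c1) * (2 * cov + c2)) / (
        (mean_a ** 2 + mean_b ** 2 + c1) * (var_a + var_b + c2))

original = [[10, 20], [30, 40]]
altered = [[10, 20], [30, 80]]
print(round(global_ssim(original, original), 6))   # 1.0
print(global_ssim(original, altered) < 1.0)        # True
```

Identical images score 1, and any difference in mean, contrast or structure pulls the score below 1.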

Template matching

The cv library supports template matching through the cv:match_template() function.

First we load an image we can use.

//plot: Bitmap
set :cameraman = png:read(sa_home() + "cameraman.png");
set :cameraman = cv:rgb2gray(:cameraman);

Then we crop a section of the image that we can use as a template.

//plot: Bitmap
set :template = cv:crop(:cameraman, 100, 240, 80, 80);

Now we can find the template in the image with cv:match_template().

//plot: Bitmap
set :template_map = cv:match_template(array("F32", :cameraman), array("F32", :template));

We can find the position with the best match by using cv:findmax() on the template map.

set (:y0,:x0) = cv:findmax(:template_map);
// [100,240]
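
The two steps (build a score map, then take its maximum) can be sketched in plain Python. The score used here is the negative sum of squared differences, so an exact match scores 0 and everything else is lower. This is a swapped-in, simple similarity measure chosen for the sketch; this page does not state which measure cv:match_template() computes. Both function names are stand-ins.

```python
def match_template(image, template):
    """Score every template position; higher is better, 0 is an exact match."""
    t_height, t_width = len(template), len(template[0])
    rows = len(image) - t_height + 1
    cols = len(image[0]) - t_width + 1
    return [
        [-sum((image[y + j][x + i] - template[j][i]) ** 2
              for j in range(t_height) for i in range(t_width))
         for x in range(cols)]
        for y in range(rows)
    ]

def find_max(score_map):
    """Return (row, column) of the largest score."""
    positions = ((y, x) for y in range(len(score_map))
                 for x in range(len(score_map[0])))
    return max(positions, key=lambda pos: score_map[pos[0]][pos[1]])

# 6x8 test image whose pixel value encodes its position (10 * row + column).
image = [[10 * y + x for x in range(8)] for y in range(6)]
template = [row[4:7] for row in image[2:4]]   # 2x3 patch cut from (2, 4)
y0, x0 = find_max(match_template(image, template))
print(y0, x0)  # 2 4
```

As in the example above, the best position is the top left corner of the place where the template was cut out.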

Let's draw a box at the position with the best match. We do this by first converting the image to a color image.

//plot: Bitmap
set :im = cv:gray2rgb(:cameraman);

And then drawing the box in our color image.

//plot: Bitmap
cv:draw_rect(:im, :y0, :y0 + 80, :x0, :x0 + 80, array("U8", [255,0,0]));

Thresholding

The cv library supports thresholding of graylevel images through the cv:threshold() function.

First we load an image we can threshold.

//plot: Bitmap
set :cameraman = png:read(sa_home() + "cameraman.png");
set :cameraman = cv:rgb2gray(:cameraman);

Then we apply the thresholding function to the image with a threshold value, a maximum value, and a thresholding type. The same threshold value is applied to every pixel. With cv:const:THRESH_BINARY(), a pixel whose value is smaller than the threshold is set to 0; otherwise it is set to the maximum value.

//plot: Bitmap
cv:threshold(:cameraman, 128, 255, cv:const:THRESH_BINARY());

Available thresholding types:

cv:const:THRESH_BINARY()
cv:const:THRESH_BINARY_INV()
cv:const:THRESH_TRUNK()
cv:const:THRESH_TOZERO()
cv:const:THRESH_TOZERO_INV()

We refer to OpenCV's documentation on ThresholdTypes for a description of the different thresholding types.
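
For quick reference, the five rules as OpenCV defines them can be sketched in plain Python. The string keys are hypothetical stand-ins for the constants listed above. The strict comparison (value > threshold) follows OpenCV; whether cv:threshold() treats a pixel exactly equal to the threshold the same way is not stated on this page.

```python
def threshold(image, thresh, max_value, kind):
    """Apply one of the five OpenCV-style threshold types to a 2-D list."""
    rules = {
        "BINARY":     lambda v: max_value if v > thresh else 0,
        "BINARY_INV": lambda v: 0 if v > thresh else max_value,
        "TRUNC":      lambda v: thresh if v > thresh else v,
        "TOZERO":     lambda v: v if v > thresh else 0,
        "TOZERO_INV": lambda v: 0 if v > thresh else v,
    }
    rule = rules[kind]
    return [[rule(v) for v in row] for row in image]

pixels = [[50, 128, 200]]
for kind in ("BINARY", "BINARY_INV", "TRUNC", "TOZERO", "TOZERO_INV"):
    print(kind, threshold(pixels, 128, 255, kind)[0])
# BINARY [0, 0, 255]
# BINARY_INV [255, 255, 0]
# TRUNC [50, 128, 128]
# TOZERO [0, 0, 200]
# TOZERO_INV [50, 128, 0]
```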

API

All computer vision functions are listed in the OSQL API.