Computer Vision
Some operations in this system model can be a bit slower than expected. This is because those operations do not yet fully utilize the JIT compiler. We are continuously working to resolve this and you can expect the performance to improve over the upcoming releases.
Module specification | |
---|---|
SA Engine version: | 5.2.0 |
Supported platforms: | Windows, Linux(x86), Raspberry Pi |
SA Engine contains the cv system model, a computer vision library with a wide range of functions and algorithms for image processing and analysis. The data representation is fully compatible with Python NumPy.
Download example images
To run the examples in this guide we first need to download the example images to our sa_home() folder in the browser. The images are downloaded by running the following code.
// Download example images
http:download_file("https://assets.streamanalyze.com/public/images/baboon.png",{},
sa_home() + "baboon.png");
http:download_file("https://assets.streamanalyze.com/public/images/baboon_small.png",{},
sa_home() + "baboon_small.png");
http:download_file("https://assets.streamanalyze.com/public/images/cameraman.png",{},
sa_home() + "cameraman.png");
http:download_file("https://assets.streamanalyze.com/public/images/einstein.png",{},
sa_home() + "einstein.png");
http:download_file("https://assets.streamanalyze.com/public/images/meanshift.png",{},
sa_home() + "meanshift.png");
Load module
System models are loaded with the system_models:load() function. For this documentation we need the image_io and cv system models.
// Load required system models
system_models:load('image_io');
system_models:load('cv');
Examples
Image I/O
There are currently two ways of loading and saving images in SA Engine. For PNG files it is recommended to use the png:read() and png:write() functions in the image_io system model. For all other formats we have py:imread() and py:imwrite(), which are wrapper functions for reading and writing images with Python using NumPy and PIL/Pillow.
// Load PNG image
//plot: Bitmap
set :baboon = png:read(sa_home() + "baboon.png");
If we check the data format and the shape of the image array we see that it is an 8-bit unsigned integer image of size [512,512,3], which means it has 512 rows (1st index), 512 columns (2nd index), and 3 color channels (3rd index).
// Check data format and image shape
format(:baboon);
shape(:baboon);
To save images to disk we use the png:write() function.
// Write PNG to disk
png:write(sa_home() + "my_image.png", :baboon, {});
We can verify that the image was saved by loading it again.
// Verify that PNG image was written
//plot: Bitmap
png:read(sa_home() + "my_image.png");
Resize
The cv library supports resizing images with the functions cv:resize() and cv:scale().
We can use cv:resize() to resize the image by providing a new height and width.
// Resize image
//plot: Bitmap
set :im = png:read(sa_home() + "baboon.png");
set :im = cv:resize(:im, 200, 300);
Alternatively, we can use cv:scale(), which takes a scale factor instead of a height and width.
// Scale image
//plot: Bitmap
set :im = png:read(sa_home() + "baboon.png");
set :im = cv:scale(:im, 0.4);
Convert color format
We can convert images between RGB and grayscale with the functions cv:rgb2gray() and cv:gray2rgb().
To convert a color image to grayscale we use cv:rgb2gray().
// Load image and convert to grayscale
//plot: Bitmap
set :im = png:read(sa_home() + "baboon_small.png");
set :im_gray = cv:rgb2gray(:im);
If we look at the shape of the output image we see that it has only one color channel, compared to the color image, which has three.
// Check number of color channels
shape(:im_gray);
shape(:im);
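To make the channel reduction concrete, here is a minimal Python sketch of an RGB-to-grayscale conversion on a small nested-list image. It is an illustration only: the helper name rgb_to_gray and the BT.601 luma weights (0.299, 0.587, 0.114) are assumptions, since the weights used by cv:rgb2gray() are not stated here.

```python
# Illustrative sketch only; not SA Engine's implementation.
# Assumption: BT.601 luma weights, a common choice for RGB -> grayscale.
WEIGHTS = (0.299, 0.587, 0.114)

def rgb_to_gray(image):
    """Collapse an H x W x 3 nested list into an H x W list of 8-bit values."""
    return [
        [min(255, round(sum(w * c for w, c in zip(WEIGHTS, pixel))))
         for pixel in row]
        for row in image
    ]

# 2 x 2 color image: red, green, blue, and white pixels.
rgb = [[[255, 0, 0], [0, 255, 0]],
       [[0, 0, 255], [255, 255, 255]]]
gray = rgb_to_gray(rgb)

print(len(rgb), len(rgb[0]), len(rgb[0][0]))  # rows, columns, channels
print(len(gray), len(gray[0]))                # rows, columns
```

The color image carries three values per pixel while the grayscale result carries one, which is the same difference the two shape() calls report.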
We can convert the grayscale image back to a color image with cv:gray2rgb().
// Load image and convert to color
//plot: Bitmap
set :im = png:read(sa_home() + "baboon_small.png");
set :im_gray = cv:rgb2gray(:im);
set :im_rgb = cv:gray2rgb(:im_gray);
Now it has three color channels instead of one but the output image is still gray since all channels have the same value for each pixel.
// Check number of color channels
shape(:im_rgb);
We can also apply a color map when converting grayscale images to color by passing it as a second argument to cv:gray2rgb().
// Load image and convert to color with color map
//plot: Bitmap
set :im = png:read(sa_home() + "baboon_small.png");
set :im_gray = cv:rgb2gray(:im);
set :im_rgb = cv:gray2rgb(:im_gray, cv:const:COLORMAP_JET());
See the computer vision API reference for a list of available color map functions.
Crop and mask
The cv library supports cropping and masking of images through the cv:crop() and cv:mask() functions. They work on both graylevel and color images.
Let's first load an image that we can crop and mask.
// Load image
//plot: Bitmap
set :im = png:read(sa_home() + "baboon.png");
Now we can crop a region of the image with cv:crop().
// Crop a 200x200 region of the image
//plot: Bitmap
cv:crop(:im, 50, 100, 200, 200);
We can also mask the image with cv:mask(), which takes a mask as input. Only pixels where the mask is >0 will be present in the output image. All other pixels will be set to 0.
First we define a function that creates a small circular mask.
create function my_mask() -> Array
as select Array[y..512, x..512] of U8 e
where e = case when sqrt((y-65)^2 + (x-160)^2) > 50 then 0
else 255 end;
Now we can mask the image.
//plot: Bitmap
cv:mask(:im, my_mask());
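The masking rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how cv:mask() is implemented: the helper names circular_mask and apply_mask are hypothetical, the image is a small grayscale nested list, and the circle test mirrors the condition in my_mask() (0 where the distance exceeds the radius, 255 otherwise).

```python
import math

def circular_mask(height, width, center_y, center_x, radius):
    """Return a height x width mask: 255 inside the circle, 0 outside."""
    return [
        [255 if math.hypot(y - center_y, x - center_x) <= radius else 0
         for x in range(width)]
        for y in range(height)
    ]

def apply_mask(image, mask):
    """Keep pixels where the mask is > 0 and set all other pixels to 0."""
    return [
        [pixel if m > 0 else 0 for pixel, m in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]

# 5 x 5 grayscale test image with a distinct value in every pixel.
image = [[10 * (y + 1) + x for x in range(5)] for y in range(5)]
mask = circular_mask(5, 5, center_y=2, center_x=2, radius=1)
masked = apply_mask(image, mask)

for row in masked:
    print(row)
```

Only the center pixel and its four direct neighbors lie within the radius, so those are the only values that survive in the output.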
Filter
The cv library supports filtering of graylevel images with the cv:filter_2d() function.
We start by loading a graylevel image.
//plot: Bitmap
set :im = png:read(sa_home() + "cameraman.png");
set :im = cv:rgb2gray(:im);
set :im = cv:resize(:im,150,150);
We can get a Gaussian kernel with the cv:get_gaussian_kernel() function.
set :kernel = cv:get_gaussian_kernel(5,1.0);
We can now filter the image with the Gaussian kernel.
//plot: Bitmap
cv:filter_2d(:im, :kernel);
It is also easy to create new kernels. For example, here we create a kernel that approximates the first-order derivative in the horizontal direction (a Sobel kernel).
set :kernel = array("F64", [[1, 0, -1],
[2, 0, -2],
[1, 0, -1]]);
Now we can use the kernel to find vertical edges in the image.
//plot: Bitmap
cv:filter_2d(:im, :kernel);
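To show why this kernel responds to vertical edges, here is a minimal Python sketch of 2D filtering on a tiny test image. It is a sketch under assumptions rather than a description of cv:filter_2d(): the helper name filter_2d_valid is hypothetical, the kernel is applied as a correlation (no flipping), and only positions where the kernel fits entirely inside the image are computed, because the border handling of cv:filter_2d() is not stated here.

```python
def filter_2d_valid(image, kernel):
    """Correlate image with kernel, keeping only positions where the kernel fits."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [sum(kernel[i][j] * image[y + i][x + j]
             for i in range(kh) for j in range(kw))
         for x in range(out_w)]
        for y in range(out_h)
    ]

# Same kernel as in the example above.
sobel_x = [[1, 0, -1],
           [2, 0, -2],
           [1, 0, -1]]

# 4 x 6 test image: dark left half, bright right half (one vertical edge).
image = [[0, 0, 0, 100, 100, 100] for _ in range(4)]
response = filter_2d_valid(image, sobel_x)

for row in response:
    print(row)
```

The response is zero in the flat regions and nonzero only where the window straddles the dark-to-bright transition. The sign depends on the direction of the transition.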
Draw
The cv library supports a limited set of functions for drawing boxes and putting text in images.
We start by loading an image.
// Load image
//plot: Bitmap
set :im = png:read(sa_home() + "baboon.png");
Now we can draw a simple rectangle in the image with the cv:draw_rect() function.
// Draw a red rectangle (example coordinates; the argument order follows the cv:draw_rect() call in the Template match section)
//plot: Bitmap
cv:draw_rect(:im, 100, 200, 150, 300, array("U8", [255,0,0]));
Find extrema
The cv library supports finding maxima in floating point images with the cv:findmax() function.
First let's create a floating point image.
// Create a floating point image with a single peak
create function my_image() -> Array of F64
as select Array grad[y..256, x..256] of F64 val
from Real d
where d = minkowski([x,y], [180,50]+0.2, 2)
and val = max((1/d)*(128 - d)/128, 0);
If we look at the image we see that it has a maximum at [50,180].
//plot: Bitmap
my_image();
cv:findmax(my_image());
// [50,180]
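The same experiment can be reproduced in plain Python to see where the peak comes from. This is a sketch under assumptions: make_image and find_max are hypothetical helper names, indices are 0-based as in Python, and the distance is the Euclidean distance to the point (x = 180.2, y = 50.2), which is how the minkowski() call above with order 2 is read here.

```python
import math

def make_image(height=256, width=256, peak_y=50.2, peak_x=180.2):
    """Build a float image whose values fall off with distance from the peak."""
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            d = math.hypot(x - peak_x, y - peak_y)  # never 0: peak is off-grid
            row.append(max((1 / d) * (128 - d) / 128, 0.0))
        image.append(row)
    return image

def find_max(image):
    """Return (row, column) of the first occurrence of the largest value."""
    best_y, best_x = 0, 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value > image[best_y][best_x]:
                best_y, best_x = y, x
    return best_y, best_x

image = make_image()
peak = find_max(image)
print(peak)
```

Because the value decreases monotonically with distance, the largest value sits at the grid point closest to the off-grid peak, which is row 50, column 180.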
Measure image similarity
The cv library supports measuring the similarity between two images with the cv:ssim() function, which implements the structural similarity index measure (SSIM).
We load two images that we can use to measure the similarity. First the original image.
//plot: Bitmap
set :im1 = png:read(sa_home() + "einstein.png");
Then we load a copy of the first image that has been altered slightly.
//plot: Bitmap
set :im2 = png:read(sa_home() + "meanshift.png");
Now we can use cv:ssim() to measure the similarity between the images.
cv:ssim(array("F64", :im1), array("F64", :im2), 11, 1.5, 0.01, 0.03);
// 0.99470285061723
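For intuition about what the score measures, here is a simplified Python sketch of the SSIM formula computed once over the whole image instead of over sliding windows. It is not the cv:ssim() implementation. The function name global_ssim is hypothetical, and reading the last two arguments of the call above (0.01 and 0.03) as the customary SSIM stabilizing constants K1 and K2, and 11 and 1.5 as a Gaussian window size and sigma, is an assumption.

```python
def global_ssim(x, y, k1=0.01, k2=0.03, dynamic_range=255.0):
    """Single-window SSIM over two equally sized, flattened grayscale images."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) * (a - mean_x) for a in x) / n
    var_y = sum((b - mean_y) * (b - mean_y) for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    c1 = (k1 * dynamic_range) * (k1 * dynamic_range)  # stabilizes the mean term
    c2 = (k2 * dynamic_range) * (k2 * dynamic_range)  # stabilizes the variance term
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x * mean_x + mean_y * mean_y + c1) * (var_x + var_y + c2)
    return numerator / denominator

original = [float(v) for v in (10, 50, 90, 130, 170, 210)]
shifted = [v + 20.0 for v in original]  # same structure, brighter mean

same_score = global_ssim(original, original)
shifted_score = global_ssim(original, shifted)
print(same_score, shifted_score)
```

Identical images score 1, and a uniform brightness shift lowers the score slightly through the mean term even though the structure is unchanged.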
Template match
The cv library supports template matching through the cv:match_template() function.
First we load an image we can use.
//plot: Bitmap
set :cameraman = png:read(sa_home() + "cameraman.png");
set :cameraman = cv:rgb2gray(:cameraman);
Then we crop a section of the image that we can use as a template.
//plot: Bitmap
set :template = cv:crop(:cameraman, 100, 240, 80, 80);
Now we can find the template in the image with cv:match_template().
//plot: Bitmap
set :template_map = cv:match_template(array("F32", :cameraman), array("F32", :template));
We can find the position with the best match by using cv:findmax() on the template map.
set (:y0,:x0) = cv:findmax(:template_map);
// [100,240]
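The crop, match, and find-maximum steps can be sketched end to end in plain Python. This is an illustration under assumptions: the helper names are hypothetical, and the score used here is the negated sum of squared differences, chosen only so that the best match is the maximum of the map. The similarity measure that cv:match_template() computes is not stated here.

```python
def crop(image, top, left, height, width):
    """Return the height x width region whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]

def match_template(image, template):
    """Score every position where the template fits; higher means more similar."""
    th, tw = len(template), len(template[0])
    out_h = len(image) - th + 1
    out_w = len(image[0]) - tw + 1
    score_map = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            ssd = sum((image[y + i][x + j] - template[i][j]) ** 2
                      for i in range(th) for j in range(tw))
            row.append(-ssd)  # negate so that a perfect match (0) is the maximum
        score_map.append(row)
    return score_map

def find_max(score_map):
    """Return (row, column) of the first occurrence of the largest score."""
    best_y, best_x = 0, 0
    for y, row in enumerate(score_map):
        for x, value in enumerate(row):
            if value > score_map[best_y][best_x]:
                best_y, best_x = y, x
    return best_y, best_x

# 6 x 6 test image with a distinct value in every pixel.
image = [[10 * y + x for x in range(6)] for y in range(6)]
template = crop(image, 2, 3, 2, 2)
score_map = match_template(image, template)
best = find_max(score_map)
print(best)
```

The best score lands at the position the template was cropped from, just as the findmax result above matches the crop coordinates.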
Let's draw a box at the position with the best match. We do this by first converting the image to a color image.
//plot: Bitmap
set :im = cv:gray2rgb(:cameraman);
And then drawing the box in our color image.
//plot: Bitmap
cv:draw_rect(:im, :y0, :y0 + 80, :x0, :x0 + 80, array("U8", [255,0,0]));
Threshold
The cv library supports thresholding of graylevel images through the cv:threshold() function.
First we load an image we can threshold.
//plot: Bitmap
set :cameraman = png:read(sa_home() + "cameraman.png");
set :cameraman = cv:rgb2gray(:cameraman);
Then we apply the thresholding function on the image with a threshold value, max value and the thresholding type. For every pixel, the same threshold value is applied. If the pixel value is smaller than the threshold, it is set to 0, otherwise it is set to the maximum value.
//plot: Bitmap
cv:threshold(:cameraman, 128, 255, cv:const:THRESH_BINARY());
Available thresholding types:
cv:const:THRESH_BINARY()
cv:const:THRESH_BINARY_INV()
cv:const:THRESH_TRUNK()
cv:const:THRESH_TOZERO()
cv:const:THRESH_TOZERO_INV()
We refer to OpenCV's documentation on ThresholdTypes for a description of the different thresholding types.
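As a quick reference, the five types as OpenCV defines them can be written out in a short Python sketch. It assumes that cv:threshold() follows OpenCV's definitions, which the pointer to OpenCV's documentation suggests but does not guarantee. The function name and the string labels are hypothetical.

```python
def threshold_pixel(value, thresh, max_value, kind):
    """Apply one OpenCV-style threshold type to a single pixel value."""
    above = value > thresh  # OpenCV uses a strict comparison
    if kind == "binary":
        return max_value if above else 0
    if kind == "binary_inv":
        return 0 if above else max_value
    if kind == "trunc":
        return thresh if above else value
    if kind == "tozero":
        return value if above else 0
    if kind == "tozero_inv":
        return 0 if above else value
    raise ValueError(f"unknown threshold type: {kind}")

def threshold_row(row, thresh, max_value, kind):
    """Apply the chosen threshold type to every pixel in a row."""
    return [threshold_pixel(v, thresh, max_value, kind) for v in row]

row = [50, 128, 200]
results = {
    kind: threshold_row(row, 128, 255, kind)
    for kind in ("binary", "binary_inv", "trunc", "tozero", "tozero_inv")
}
for kind, values in results.items():
    print(kind, values)
```

Under OpenCV's definition a pixel exactly equal to the threshold counts as not above it. Whether cv:threshold() treats that boundary the same way is not stated here.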
API
All computer vision functions are listed in the OSQL API.