Introduction
OSQL extends SQL seamlessly with functions that compute results for given arguments. For example, there is a large library of mathematical functions built into the system and new computational functions can easily be defined. Each SA Engine system has an internal main-memory database where the functions' definitions are stored.
The simplest kind of query is to call some built-in function.
Example:
sin(3.14/2)*cos(3.14)
Run the function by clicking the play button play_circle
Documentation of functions and other SA Engine artifacts are stored in
the database as well. You can query the database about the
documentation of a function fn
by calling doc('fn')
. It returns
the documentation of a function as a string.
Example:
doc('sin')
You can edit the examples in the tutorial. Try changing sin
to tan
.
You can enter several OSQL statements by separating them with a ;
.
Examples:
doc('atan2');
doc('cos')
If you don't know exactly the name of a function you can also pass a
string pattern to doc()
to return all functions matching the
string pattern in their names. In a string pattern a %
matches any
sequence of characters while a _
matches a single character.
Example:
doc('_tan%')
Functions may return sets of values. In the example it returned two different documentation strings.
The function sourcecode('fn')
where fn
can be a function name or
a string pattern returns the definitions of the matching functions.
Example:
sourcecode('tan')
The first line of a source code string contains the signature of
the function. It is a string denoting the function's arguments and
results. You can get just the signature of function or string pattern
fn
by calling the system function signature(fn)
.
Example:
signature('atan%')
Functions may have different definitions depending on the types of their arguments. Such functions are said to be overloaded on their types. Overloaded function have several signatures.
Example:
signature('range')
Some functions have corresponding infix operators. For example,
the function plus
has the infix operator +
so the queries 1+2
and plus(1,2)
both return the value 3.
You can call signature(op)
to get the signature of infix operators too.
Example:
signature('^')
The following function call returns the version of SA Engine you are running:
system_version()
Visit Documentation functions for a list of functions used for documenting the system.
Basic math functions are listed in Math.
Database tables​
Numbers and other objects used in analysis models can be saved in database tables. Database tables are maintained and queried by SQL statements. OSQL is a seamless extension of SQL with functions, mathematical objects, and data streams.
Tables are created by the create table
statement.
Example:
create table Constants(name Charstring,
value Real,
unique (name))
The above statement creates a table called Constants
having two
columns, where values in column name
are strings and column
value
stores floating point numbers. The keyword unique
specifies
that the values of column name
are different in all table rows.
The create
table
syntax of
SQL is more elaborate. For example, strings need not have maximum
lengths in OSQL.
Tables are populated using SQL's insert
into
statement.
Example:
insert into Constants values('pi', 3.14),
('e', 2.71),
('freeze', 0.0),
('boil', 100.0)
A populated table can be queried by SQL's select
statement
.
Examples:
select * from Constants
select value from Constants where name = 'pi'
select sin(value/2)*cos(value) from Constants where name = 'pi'
Tables can be updated using SQL's update
statement
.
Examples:
update Constants set value=asin(1)*2 where name = 'pi'
update Constants set value=exp(1) where name = 'e'
select * from Constants
The SQL Tutorial explains the basic SQL functionality in SA Engine. You can skip that tutorial if you are already familiar with SQL.
Defining functions​
Analysis models are defined in terms of a number of OSQL
functions. You can define your own OSQL functions by the create
function
statement. OSQL functions are stored in the local database
inside SA Engine. There are functions of different kinds, depending on
how they are used.
Example:
The following function converts Celsius degrees to Fahrenheit:
create function ctof(Real c) -> Real
/* Convert a Celsius degree `c` to Fahrenheit */
as c * 9 / 5 + 32
Let's test ctof
:
ctof(10)
You can use your OSQL functions in queries, e.g.
select ctof(value) from Constants where name = 'boil'
You can also create functions defined by queries.
Example:
create function constant(Charstring c) -> Number
as select value from Constants where name = c
Test it:
ctof(constant('boil'))
In the example the variables value
and name
refer to columns in
the table Constants
, while the variable c
is the function's
argument.
Temporal functions​
The built-in function local_time()
returns the current wall time as
an ISO time stamp.
While local_time
returns the current wall time as a string, the
function now()
returns the wall time as an object of type Timeval
,
which is the internal representation of time points.
Examples:
local_time()
now()
The function ts(Object o)->Timeval
returns a time stamped object,
which represents a timestamped value o
.
Examples:
ts(1)
value(ts(1))
local_time(ts(1))
For more about functions over time visit Time in the reference documentation.
Vectors​
So far the function calls were made over single values. The data type
Vector
represents finite sequences of objects with the usual
[...]
notation.
In difference to vectors, a stream is a possibly infinite and growing sequence of objects.
Examples of vectors:
A vector of four integers:
[1,2,3,4]
A vector of two vectors:
[[1.1,2.0],[3,4]]
A vector of a string and a number:
['Tore',55]
Vector operators and functions​
As for numbers there is a built-in library of vector operators and
functions. The mathematical operators +,-,*,/,^
can be used for
whole vectors or mixes of vectors and scalar values. A dot (.
) in
front of an operator indicates element-wise application of it on each
element of the vectors.
Examples:
Vector addition:
[1,4,3,2]+[1,5,6,7]
Change visualization of the vectors to Bar plot
by adding //plot:
Bar plot
before the query.
Element wise vector multiplication:
[1, 2, 3, 4] .* [5, 6, 7, 8]
Combining vectors and scalars.
[1,4,3,2]-[0,2,3,4]+1
The sum of vector elements:
sum([1, 2, 3, 4])
The average of vector elements:
avg([1, 2, 3, 4])
The standard deviation of elements:
stdev([1, 2, 3, 4])
The median of the elements:
median([1, 4, 2, 3])
The largest or smallest elements:
max([1, 4, 2, 3]);
min([1, 4, 2, 3]);
The number of elements in a vector;
dim([1, 3, 2])
The general Minkowski
distance function
minkowski(v,w,r)
is also built-in. The following examples shows
different kinds of distances expressed in terms of minkowski(v,w,r)
.
Examples:
Manhattan distance ():
minkowski([1,2],[3,4], 1)
Euclidean distance ():
minkowski([1,2],[3,4], 2)
Chebyshev distance ():
minkowski([1,2],[3,4], +inf)
Go to Vectors in the reference documentation for documentation of vectors.
User defined vector functions​
Often vectors are used in function definitions.
Example:
The
following function computes the Euclidean distance between vector v
and w
:
create function dist(Vector v, Vector w) -> Real
/* Compute the Euclidean distance between `v` and `w` */
as sqrt(sum((v - w) .^ 2))
Usage:
dist([1,2],[3,4])
The built-in function euclid(v,w)
does the same.
Arrays​
An array is a multi-dimensional cube of numbers. It is
represented by type Array
. Arrays can be created with the function
array(Vector data)->Array
, where the (nested) vector data
contains
its numerical elements.
The number of dimensions of an array is called its rank and
returned by the function rank(Array a)->Integer
. Arrays with rank
one (1D arrays) are called numerical vectors, arrays with rank two
are called matrices, and arrays with higher dimension than two are
called tensors.
Examples:
A numerical vector:
array([1,2,3])
rank(array([1,2,3]))
A matrix:
array([[1,2],[3,4]])
rank(array([[1,2],[3,4]]))
The representation of numerical data using arrays is much more
efficient than using objects of type Vector
.
Array format​
The elements of an array have a format, which specifies their
binary representations in memory. For example, elements of type
Integer
are represented as 64-bits integers having the format I64
,
and type Real
has the format F64
. The first argument of the
function array(Charstring fmt,Vector data)->Array
specifies the
elements' format.
Example:
This call creates a 1-dimensional array of 16-bit integers:
array('I16',[1,2,3,4])
When called with one argument, array(Vector data)->Array
the
element format will be the most general type in data
, where
real numbers are considered more general than integers,
Examples:
array([1,2,3,4])
array([1.0,2,3,4])
Array shape​
An array also has a shape being a vector of the number of elements
in each of its dimensions. The shape is computed by the function
shape(Array a)->Vector of Integer
.
Examples:
shape(array([1,2,3]))
shape(array([[1,2,3],[4,5,6]]))
Array operators and functions​
As for vectors there is a built in library of array functions and operators.
Examples:
Numerical vector addition:
array([1, 2, 3, 4]) + array([5, 6, 7, 8])
The total sum of array elements:
sum(array([1, 2, 3, 4]))
sum(array([[1, 2], [3, 4]]))
Array addition:
array([[1,2],[3,4]]) + array([[2,3],[6,7]])
Element-wise multiplication:
array([[1,2],[3,4]]) .* array([[2,3],[6,7]])
The function dot(Array u, Array v)->Number
computes the scalar
product of the elements in arrays u
and v
of any shape.
Examples:
dot(array([1,2,3]),array([4,5,6]))
dot(array([[1,2],[3,4]]),array([[4,5],[6,7]]))
The function matmul(Array a, Array b)->Array
computes the array
multiplication of matrices a
and b
.
Example:
matmul(array([[1,2,3]]),array([[1,2],
[3,4],
[5,6]]))
Visit Arrays and tensors for a detailed description of arrays.
The Fast Fourier Transform​
The function rfft(a)
computes the real number Fast Fourier
Transform
(FFT) over a one-dimensional array of real numbers.
Example:
rfft(array([1.0, 2, 3, 4]))
You can also apply rfft()
on vectors of numbers, even though this is
much slower as it will convert the vector to an array. For example:
rfft([1.0, 2, 3, 4])
The inverse real FFT is computed with irfft(a)
.
Example:
irfft(rfft(array([1.0, 2, 3, 4])))
The later tutorial section Querying the microphone shows how user-defined functions over vectors and FFT can be used for filtering audio streams in real-time.
Session variables​
During sessions with SA Engine it is often practical to temporarily
save values of expressions in session variables prefixed with :
by using the set
statement, e.g.:
set :pi = 3.14
Usage:
sin(:pi/2)*cos(:pi)
The session variable :pi
is first assigned the value of 3.14 and
then used twice in a computation.
Session variables are not stored in the local database and disappear when you exit SA Engine. Use tables to save data in the local database.
Sets and bags​
Functions can also produce sets of objects where the order is not
guaranteed as with vectors. For example, the function range(n)
returns the set of numbers from 1 up to n
.
Example:
range(10)
Here range(10)
produces the set of all integers from 1 to 10.
Example:
sin(range(10))
The function call range(200)
returns a set of numbers. A function
applied on a set is called for each element of the set as in the
example, where the sine function is applied one each element in the set
of all integers from 1 to 200.
Applying other functions, such as sin(x)
on a set means that the
function is applied on each element of the set.
Try visualizing the result as a line plot by inserting the row
//plot: Line plot
before the query!
In general sets in OSQL may contain duplicates.
Example:
floor(range(200)/2)
The term bag is used rather than set to indicate sets with possible duplicates.
Apply sin(x)
on the result of floor
and visualize by line plot!
Explicit bags can be create as using the notation, for example:
Example:
set :mybag = bag(1,2,2,3)
Use variable:
:mybag
As for vectors and arrays, aggregate functions over bags are applied on entire bags, rather than on each element of the bag.
Example:
Sum the elements of a bag of mumbers:
sum(range(200)/2)
Aggregate functions over bags can be used in SQL queries, e.g.
select sum(value) from Constants
See Aggregate functions in the reference documentation for more on aggregate functions.
Boolean and null values​
The constant null
is used to represent unknown or missing
values. For example, the last element of the vector [1,2,null]
is
unknown.
A Boolean null
value is regarded as false. Thus, the query 2<1
returns null
, while 2>1
return true
. The only object of type
Boolean is thus true
. The constant false
is defined as null
casted to a Boolean.
null
values in bags are regarded as missing values that are ignored
by aggregate functions.
Passing null
as argument in a function call will return null
without calling the function. For example:
sin(sqrt(-1))
An OSQL query variable bound to the value null
is regarded as
unbound and therefore null
cannot be regraded as a regular
value. Instead, to test that an expression x
is null
, use the
notany(x)
function that returns true
if x
is null
.
Often it is practical to use a default value instead of null. For this
the expression can be used. It returns
the first not being null
and dlft
if all are
null
.
For example:
coalesce(sqrt(-1),ln(-1),1)+1
To replace all null
s in a vector use
whennull(Vector v, Object dflt)->Vector
.
For example:
whennull([1,2,null,3],0)+2
The next part of the tutorial will show how to make general queries that select collections of objects from the database.