Skip to main content

Basic concepts

Types

Type names are not case sensitive and the type names are always internally upper-cased. For clarity all type names used in examples in this manual have the first letter capitalized. Type names must be unique in the database.

There are three main kinds of representations of objects: literals, collections, and surrogates.

Literal types

Literals are self-described and system maintained objects that are implicitly created or deleted by the system when needed. Literals types below type Literal in the type hierarchy, such as numbers (type Number), strings (type Charstring), Booleans (type Boolean) and time stamps (Timeval) are . The type Number can represent both integers (type Integer) and floating point numbers (type Real).

Collection types

Collections are types under type Collection in the type hierarchy representing collections of other objects as elements. Like literals, collections are automatically created and deleted by the system when needed. The following kinds of collections are supported:

  • Collections of type Bag contain unordered sets of objects where duplicates are allowed. A set query returns a bag as result.

  • Type Json represent JSON collections, more specifically:

    • Collections of type Vector contain ordered sequences of elements of any type. A select vector query returns an collection of type Vector as result.

    • Collections of type Record represents key/value pairs.

  • Collections of type Array represent tensors and contain numerical values organized as a cube of arbitrary dimensionality. A select array query returns an object of type Array as result.

  • Type Stream represents a possibly infinite and growing sequence of objects, e.g. to represent continuous readings of sensor measurements. A select stream query returns a stream as result.

Surrogate types

Each instance of a surrogate type has an associated object identifier (OID). Such surrogate objects are explicitly created or deleted by the user or the system. Examples of surrogate objects are objects representing real-world entities such as sensors, system objects such as functions, or even objects representing other SA Engine peers.

The built-in types Type and Function are surrogate types.

  • The system type named Type represents the types themselves, such as their names, their super-types, etc.

  • The system type Function represents functions and their properties.

Type signatures

All objects in the system belong to a type where the types themselves are represented as objects. The type signature is a string representing a type. You can inspect the type signature of an object with the function typesig(Object).

Examples:

typesig(123)
typesig(1.23)
typesig([1,2,3])
typesig(array([1,2,3]))

Type extents

All objects are instances of some types. The set of all instances of a type is called the extent of the type. If an object is an instance of a type, then it is also an instance of all the super-types of that type; conversely, the extent of a type is a subset of all extents of the super-types of that type.

User-defined types

It is possible to create new user defined types as surrogate types under type Userobject.

Constants

Constants can be integers, reals, strings, time stamps, booleans, or null. The value of a constant is the constant itself.

Examples: of constant numbers:

123
1.2
2.3E2
-2.4e-21

Example of a constant time stamp, which is an internal representation of the number of micro seconds since new year 1970 :

|2025-11-18T12:27:43.985Z| 

Time stamps are instances of the type Timeval.

Example:

typesig(|2025-11-18T12:27:43.985Z|)

The function real(ts) converts a time stamp ts into the number of seconds since new year 1970.

Example:

real(|2025-11-18T12:27:43.985736Z|) 

Boolean constants represent logical values. The constant false is equivalent to null casted to type Boolean. The only legal boolean value that can be stored in the database is true.

There are two boolean constants:

true
false

A boolean value is regarded as false if it is null or not in the database (so called close world assumption).

null

Identifiers

Identifiers represent names of OSQL variables, functions, arrays, etc. For exampe, MySalary, sin, Integer, x, x1234, or x1234_b. Notice that OSQL identifiers are not case sensitive; i.e. they are always internally capitalized.

Comments

A comment enclosed with /* and */can be placed anywhere in an OSQL statement outside identifiers, constants, strings, or variables. You can also use SQL's single line comments:

-- This is a comment

Variables

Variables are of two kinds: local variables and session variables.

Local variables are identifiers for data values inside queries and functions. Variables are not case sensitive and internally capitalized.

Example: the following set query uses a local variable internally named I:

select i
from Number i
where I = 3

Local variables must be declared in from clauses of set queries or in function signatures.

Session variables are prefixed with :, e.g.:x, :my_stream, etc. They hold only temporary results during interactive sessions. Session variables cannot be referenced in function bodies and they are not stored in the database. Their lifespan is the current session only. Their purpose is to hold temporary values in scripts and database interactions.

Session variables can be assigned by the variable assignment statement set.

Examples: these statements assign values to the session variables :x, :i, and :v:

set :x = sqrt(2)        -- set session variable :x
set :i = round(3.5)     -- set session variable :i
set :v = [1,2,3]        -- set session variavle :v

These expresson gets the values of the assignments:

:x                     -- get value of :x
:i                     -- get value of :i
:v                     -- get value of :v

Inspect the session variables' type signatures:

typesig(:x)
typesig(:i)
typesig(:v)

Expressions

Expressions are formulas expressed with the OSQL syntax that can be evaluated by the REPL to produce a value. Complex expressions can be built up in terms of other expression. Expressions are basic building blocks in all kinds of OSQL statements.

Examples:

1+2+3
asin(1)
[1,2,3]+1
1+array([[1,2],[3,4]])

Entering simple expressions is the simplest form of OSQL queries.

The type signature of the value of an expression e is computed by the function typesig(e).

Examples:

typesig(1+2+3)
typesig([1,2,3]+1)
typesig(1+array([[1,2],[3,4]]))

Strings

The datatype Charstring represents sequences of UTF-8 characters. Strings are enclosed either by the character ' or ".

Examples:

"A string"
'A string'
'A string with "'
"A string with \" and '"

The enclosing string separators (' or ") for a string constant must be the same. If the string separator is " then \ is the escape character inside the string, replacing the succeeding character. For example the string 'ab"\' can also be written as "ab\"\\", and the string a'"b must be written as "a'\"b".

String concatenation

Strings can be concatenated using SQL's infix operator ||.

Examples:

'ab' || 'cd' || 'ef';
'ab' || 12 || 'de';
'ab' || 1 || 2

String matching

A string str is matched against a regular expression string pattern pat using the infix operator str like pat. In a pattern % (or *) matches a sequence of characters, while _ matches a single character. The [chars] pattern matches any character in chars.

Examples:

'abc' like '__c'
'abc' like '%c'
'abc' like 'a[bd][dc]'
'abc' like 'a[bd][de]'

The library of system string functions is documented in String

Tuples

Tuples couple several comma separated values eie_i into a single expression using the notation (e1,e2,...)(e_1, e_2,...).

Examples:

('a',1)
('a',1.1,2)
('a',1,sqrt(2))

A tuple with a single element, a singelton, is the same as the element itself:

('a')

The type of a tuple is a tuple of the types of its elements.

Examples:

typesig(('a',1))
typesig(('a',1.1,2))
typesig(('a',1,sqrt(2)))
typesig(('a'))

Collections

Collections represent sets of objects. OSQL supports five kinds of collections: bags, vectors, arrays, records, and streams:

  • A bag is a set where duplicates are allowed.
  • A table is a named bag of tuples.
  • A vector is an ordered sequence of objects of any kind.
  • A record is an associative array of key-value pairs.
  • An array is a multi-dimensional collection of numbers.
  • A stream is a possibly infinite sequence of objects that is continuously extended with new elements at the end.

Collections are constructed by collection constructor expressions.

Examples:

Construct a bag of numbers:

values 1,2,3

Construct a vector of numbers:

[1,2,3]

Construct a vector of vectors of numbers:

[[1,2],[3,4]]

Construct a bag of vectors:

values ['a',1],['b',2],['c',3]

Construct an array of integers:

array([1,2,3])

Construct a record (JSON object):

{"id":1,"name":"Kalle","age":32}

Construct a finite stream of three numbers:

stream 1,2,3

Representation

Collections are implemented in two different ways:

  • They can be implemented as generators, which are functions dynamically generating their elements one-by-one, without storing the collection in memory. This enables the representation of very large and even infinite collections with a small and limited foot-print.

  • They can be materialized in memory, that is explicitly represented in memory by some data structures. This is suitable for collections of limited size.

The query processor decides how collections are represented for best performance. In general, bags are represented either as generators or materialized, streams are always generated, while the other collections are always materialized.

Bags

A common kind of collection is bags, which are unordered sets of objects with duplicates allowed. They are represented as objects of type Bag. When a query to an SA Engine REPL returns a bag as result the elements of the bag are printed on separate lines.

Bags can be explicitly created using the values v1,v2,.., the bag constructor notation.

Example:

   values 1,2,2,3

Tuples in values expressions construct bags of tuples.

Example:

 values ('a',1),('b',2),('c',3)

Nameless tables are represented as a bag of tuples having the same lengths and types. The bag above represents a nameless table with two columns typed `Charstring and Integer.

A bag of singeltons represents a nameless table with one column.

Examples:

values ('a'),('b'),('c')

You can assign variables to bag-valued expressions.

Example:

set :sb = values 1,2,3

Bags are normally generated to enable representation of very large bags with limited memory.

Example: The bag :lb contains a million numbers.

set :sb = range(1000000)

Bag specified with values, such as :sb, are always materialized. The query optimizer may decide to materialize (parts of) bags when needed.

Example: The aggregate function count(Bag b)->Integer counts the elements in :b by generating its elements.

count(:lb)

The type signature of a bag b can be inspected with typesig(b).

Examples:

typesig(:sb)
typesig(:lb)
typesig(values ('a'),('b'),('c'))
typesig(values ('a',1),('b',2),('c',3))

The (values v1,v2....) notation can be used to represent bags inside other expressions.

Example:

 select i+1 
from Integer i
where i in (values 3,4,2)

Vectors

Objects of type Vector represent ordered sequences of objects of any kind. Contrast this to bags where the order of the elements in bags is not specified.

Example: The vector :v represents a number followed by a string and a vector of numbers.

set :v = [1, 'Hello world', [2, 3]]

The elements of a vector are enumerated from 1 and up. You can access element number i of a vector v (i.e. viv_i) with the notation v[i].

Example:

:v[3]

The type signature of a vector is Vector of t where t is the type covering all elements in the vector.

Examples:

typesig(:v[3])
typesig([1,2.1,3])

If there is no common type among the elements, the type is just Vector.

Example:

typesig(:v)

Vectors are always materialized.

Section Vectors documents vector functions and queries.

Arrays

Multi-dimensional arrays represent collections of numbers. 1D arrays represent numerical vectors, 2D arrays represent numerical matrices, and higher dimensionality arrays represent tensors.

Arrays are represented as objects of type Array. The difference between objects of type Array and type Vector is that vectors represent 1D sequences of objects of any kind, while 1D arrays very efficiently represent numerical vectors.

Arrays create created with the array constructor function array():

Examples:

A numerical vector of integers:

array([1,2,3])

A numerical vector of reals:

array([1.1,2.2,3.3])

A 3x2 matrix of integers with 3 columns and 2 rows:

array([[1,2,3],[3,4,5]])

A 3x2 matrix of reals:

array([[1.1,2.2,3.3],[3,4,5]])

Arrays are always materialized.

Section Arrays and tensors documents array functions and queries.

Records

Records represent dynamic associations between keys and values, i.e. associative arrays. Other commonly used terms for associative arrays are JSON objects, property lists, key-value pairs, dictionaries, and hash links. OSQL uses JSON notation to construct records.

Example: The following expression assigns the session variable :r to a record.

   set :r= {'Greeting':'Hello, I am Tore','Email':'Tore.Andersson@it.uu.se'}

In the record :r the field Greeting has as value the string 'Hello, I am Tore' and the field Email has the string 'Tore.Andersson@it.uu.se' as value.

A field f of a record bound to a variable r can be accessed with the notation r[f].

Example:

:r['Greeting' ]

Records have type Record. The type Json represents both records and vectors, and therefore type Json is above types Record and Vector in the system type hierarchy.

You can inspect the type signature of record r with typesig(r).

Examples:

typesig(:r)
typesig(:r['Greeting'])

Records are always materialized.

Functions creating and accessing records are described in JSON functions.

Streams

Streams represent possibly infinite and growing sequences of elements, such as measurements produced by a sensor. Streams can also be generated (simulated) by functions. Streams are objects of type Stream.

Example: The function heartbeat(p) returns an infinite stream of number elements produced every p seconds:

heartbeat(0.5) 

The syntax stream e1,e2,... constructs a stream having specific elements eie_i.

Example:

stream -1,0,2

Streams are objects too, so they can be assigned to variables.

Examples:

set :s1 = heartbeat(0.5) 
set :s2= (stream -1,0,2)

The type signature of a stream s can be inspected with typesig(s).

Examples:

typesig(:s1) 
typesig(:s2)

The elements of a stream s can be converted into a bag by running the stream with extract(s).

Examples:

extract(:s1) 
extract(:s2)

The REPL automatically calls extract(s) when it receives a stream s.

Examples:

:s1 
:s2

Streams are always generated.

Streams are documented in section Streams.