Basic concepts
Types
Type names are not case sensitive and the type names are always internally upper-cased. For clarity all type names used in examples in this manual have the first letter capitalized. Type names must be unique in the database.
There are three main kinds of representations of objects: literals, collections, and surrogates.
Literal types
Literals are self-described and system maintained objects that are
implicitly created or deleted by the system when needed. Literals
types below type Literal in the type
hierarchy, such as numbers (type
Number), strings (type Charstring), Booleans (type Boolean) and
time stamps (Timeval) are . The type Number can represent both
integers (type Integer) and floating point numbers (type Real).
Collection types
Collections are types under type Collection in the type
hierarchy
representing collections of other objects as elements. Like
literals, collections are automatically created and deleted by the
system when needed. The following kinds of collections are supported:
Collections of type
Bagcontain unordered sets of objects where duplicates are allowed. A set query returns a bag as result.Type
Jsonrepresent JSON collections, more specifically:Collections of type
Vectorcontain ordered sequences of elements of any type. A select vector query returns an collection of typeVectoras result.Collections of type
Recordrepresents key/value pairs.
Collections of type
Arrayrepresent tensors and contain numerical values organized as a cube of arbitrary dimensionality. A select array query returns an object of typeArrayas result.Type
Streamrepresents a possibly infinite and growing sequence of objects, e.g. to represent continuous readings of sensor measurements. A select stream query returns a stream as result.
Surrogate types
Each instance of a surrogate type has an associated object identifier (OID). Such surrogate objects are explicitly created or deleted by the user or the system. Examples of surrogate objects are objects representing real-world entities such as sensors, system objects such as functions, or even objects representing other SA Engine peers.
The built-in types Type and Function are surrogate types.
The system type named
Typerepresents the types themselves, such as their names, their super-types, etc.The system type
Functionrepresents functions and their properties.
Type signatures
All objects in the system belong to a type where the types themselves
are represented as objects. The type signature is a string
representing a type. You can inspect the type signature of an object
with the function typesig(Object).
Examples:
typesig(123)
typesig(1.23)
typesig([1,2,3])
typesig(array([1,2,3]))
Type extents
All objects are instances of some types. The set of all instances of a type is called the extent of the type. If an object is an instance of a type, then it is also an instance of all the super-types of that type; conversely, the extent of a type is a subset of all extents of the super-types of that type.
User-defined types
It is possible to create new user defined types
as surrogate types under type
Userobject.
Constants
Constants can be integers, reals, strings, time stamps, booleans, or null. The value of a constant is the constant itself.
Examples: of constant numbers:
123
1.2
2.3E2
-2.4e-21
Example of a constant time stamp, which is an internal representation of the number of micro seconds since new year 1970 :
|2025-11-18T12:27:43.985Z|
Time stamps are instances of the
type Timeval.
Example:
typesig(|2025-11-18T12:27:43.985Z|)
The function real(ts) converts a time stamp ts into the number of
seconds since new year 1970.
Example:
real(|2025-11-18T12:27:43.985736Z|)
Boolean constants represent logical values. The constant false
is equivalent to null casted to type Boolean. The only legal
boolean value that can be stored in the database is true.
There are two boolean constants:
true
false
A boolean value is regarded as false if it is null or not in the
database (so called close world assumption).
null
Identifiers
Identifiers represent names of OSQL variables, functions, arrays,
etc. For exampe, MySalary, sin, Integer, x, x1234, or
x1234_b. Notice that OSQL identifiers are not case sensitive;
i.e. they are always internally capitalized.
Comments
A comment enclosed with /* and */can be placed anywhere in an
OSQL statement outside identifiers, constants, strings, or
variables. You can also use SQL's single line comments:
-- This is a comment
Variables
Variables are of two kinds: local variables and session variables.
Local variables are identifiers for data values inside queries and functions. Variables are not case sensitive and internally capitalized.
Example: the following set query
uses a local variable internally named I:
select i
from Number i
where I = 3
Local variables must be declared in from clauses of set
queries or in function
signatures.
Session variables are prefixed with :, e.g.:x, :my_stream, etc.
They hold only temporary results during interactive
sessions. Session variables cannot be referenced in function bodies
and they are not stored in the database. Their lifespan is the
current session only. Their purpose is to hold temporary values in
scripts and database interactions.
Session variables can be assigned by the variable assignment statement
set.
Examples: these statements assign values to the session variables
:x, :i, and :v:
set :x = sqrt(2) -- set session variable :x
set :i = round(3.5) -- set session variable :i
set :v = [1,2,3] -- set session variavle :v
These expresson gets the values of the assignments:
:x -- get value of :x
:i -- get value of :i
:v -- get value of :v
Inspect the session variables' type signatures:
typesig(:x)
typesig(:i)
typesig(:v)
Expressions
Expressions are formulas expressed with the OSQL syntax that can be evaluated by the REPL to produce a value. Complex expressions can be built up in terms of other expression. Expressions are basic building blocks in all kinds of OSQL statements.
Examples:
1+2+3
asin(1)
[1,2,3]+1
1+array([[1,2],[3,4]])
Entering simple expressions is the simplest form of OSQL queries.
The type signature of the value of an expression
e is computed by the function typesig(e).
Examples:
typesig(1+2+3)
typesig([1,2,3]+1)
typesig(1+array([[1,2],[3,4]]))
Strings
The datatype Charstring represents sequences of
UTF-8 characters. Strings are
enclosed either by the character ' or ".
Examples:
"A string"
'A string'
'A string with "'
"A string with \" and '"
The enclosing string separators (' or ") for a string constant
must be the same. If the string separator is " then \ is the
escape character inside the string, replacing the succeeding
character. For example the string 'ab"\' can also be written as
"ab\"\\", and the string a'"b must be written as "a'\"b".
String concatenation
Strings can be concatenated using SQL's infix operator ||.
Examples:
'ab' || 'cd' || 'ef';
'ab' || 12 || 'de';
'ab' || 1 || 2
String matching
A string str is matched against a regular expression string
pattern pat using the infix operator str like pat. In a pattern
% (or *) matches a sequence of characters, while _ matches a
single character. The [chars] pattern matches any character in
chars.
Examples:
'abc' like '__c'
'abc' like '%c'
'abc' like 'a[bd][dc]'
'abc' like 'a[bd][de]'
The library of system string functions is documented in String
Tuples
Tuples couple several comma separated values into a single expression using the notation .
Examples:
('a',1)
('a',1.1,2)
('a',1,sqrt(2))
A tuple with a single element, a singelton, is the same as the element itself:
('a')
The type of a tuple is a tuple of the types of its elements.
Examples:
typesig(('a',1))
typesig(('a',1.1,2))
typesig(('a',1,sqrt(2)))
typesig(('a'))
Collections
Collections represent sets of objects. OSQL supports five kinds of collections: bags, vectors, arrays, records, and streams:
- A bag is a set where duplicates are allowed.
- A table is a named bag of tuples.
- A vector is an ordered sequence of objects of any kind.
- A record is an associative array of key-value pairs.
- An array is a multi-dimensional collection of numbers.
- A stream is a possibly infinite sequence of objects that is continuously extended with new elements at the end.
Collections are constructed by collection constructor expressions.
Examples:
Construct a bag of numbers:
values 1,2,3
Construct a vector of numbers:
[1,2,3]
Construct a vector of vectors of numbers:
[[1,2],[3,4]]
Construct a bag of vectors:
values ['a',1],['b',2],['c',3]
Construct an array of integers:
array([1,2,3])
Construct a record (JSON object):
{"id":1,"name":"Kalle","age":32}
Construct a finite stream of three numbers:
stream 1,2,3
Representation
Collections are implemented in two different ways:
They can be implemented as generators, which are functions dynamically generating their elements one-by-one, without storing the collection in memory. This enables the representation of very large and even infinite collections with a small and limited foot-print.
They can be materialized in memory, that is explicitly represented in memory by some data structures. This is suitable for collections of limited size.
The query processor decides how collections are represented for best performance. In general, bags are represented either as generators or materialized, streams are always generated, while the other collections are always materialized.
Bags
A common kind of collection is bags, which are unordered sets of
objects with duplicates allowed. They are represented as objects of
type Bag. When a query to an SA Engine
REPL returns a bag as result
the elements of the bag are printed on separate lines.
Bags can be explicitly created using the values v1,v2,.., the bag
constructor notation.
Example:
values 1,2,2,3
Tuples in values expressions construct bags of tuples.
Example:
values ('a',1),('b',2),('c',3)
Nameless tables are represented as a bag of tuples having
the same lengths and types. The bag above represents a nameless table
with two columns typed `Charstring and Integer.
A bag of singeltons represents a nameless table with one column.
Examples:
values ('a'),('b'),('c')
You can assign variables to bag-valued expressions.
Example:
set :sb = values 1,2,3
Bags are normally generated to enable representation of very large bags with limited memory.
Example: The bag :lb contains a million numbers.
set :sb = range(1000000)
Bag specified with values, such
as :sb, are always materialized. The query optimizer may decide to
materialize (parts of) bags when needed.
Example: The aggregate
function count(Bag
b)->Integer counts the elements in :b by generating its elements.
count(:lb)
The type signature of a bag b can be inspected
with typesig(b).
Examples:
typesig(:sb)
typesig(:lb)
typesig(values ('a'),('b'),('c'))
typesig(values ('a',1),('b',2),('c',3))
The (values v1,v2....) notation can be used to represent bags inside
other expressions.
Example:
select i+1
from Integer i
where i in (values 3,4,2)
Vectors
Objects of type Vector represent
ordered sequences of objects of any kind. Contrast this to
bags where the order of the elements in bags is not
specified.
Example: The vector :v represents a number followed by a string and
a vector of numbers.
set :v = [1, 'Hello world', [2, 3]]
The elements of a vector are enumerated from 1 and up. You can
access element number i of a vector v (i.e. ) with the notation
v[i].
Example:
:v[3]
The type signature of a vector is Vector of t
where t is the type covering all elements in the vector.
Examples:
typesig(:v[3])
typesig([1,2.1,3])
If there is no common type among the elements, the type is just
Vector.
Example:
typesig(:v)
Vectors are always materialized.
Section Vectors documents vector functions and queries.
Arrays
Multi-dimensional arrays represent collections of numbers. 1D arrays represent numerical vectors, 2D arrays represent numerical matrices, and higher dimensionality arrays represent tensors.
Arrays are represented as objects of type
Array. The difference between
objects of type Array and type Vector is that vectors
represent 1D sequences of objects of any kind, while 1D arrays very
efficiently represent numerical vectors.
Arrays create created with the array constructor function
array():
Examples:
A numerical vector of integers:
array([1,2,3])
A numerical vector of reals:
array([1.1,2.2,3.3])
A 3x2 matrix of integers with 3 columns and 2 rows:
array([[1,2,3],[3,4,5]])
A 3x2 matrix of reals:
array([[1.1,2.2,3.3],[3,4,5]])
Arrays are always materialized.
Section Arrays and tensors documents array functions and queries.
Records
Records represent dynamic associations between keys and values, i.e. associative arrays. Other commonly used terms for associative arrays are JSON objects, property lists, key-value pairs, dictionaries, and hash links. OSQL uses JSON notation to construct records.
Example: The following expression assigns the session
variable :r to a record.
set :r= {'Greeting':'Hello, I am Tore','Email':'Tore.Andersson@it.uu.se'}
In the record :r the field Greeting has as value the
string 'Hello, I am Tore' and the field Email has the string
'Tore.Andersson@it.uu.se' as value.
A field f of a record bound to a variable r can be accessed with the
notation r[f].
Example:
:r['Greeting' ]
Records have type Record. The type
Json represents both records and
vectors, and therefore type Json is above types Record and
Vector in the system type hierarchy.
You can inspect the type signature of record r
with typesig(r).
Examples:
typesig(:r)
typesig(:r['Greeting'])
Records are always materialized.
Functions creating and accessing records are described in JSON functions.
Streams
Streams represent possibly infinite and growing sequences of
elements, such as measurements produced by a sensor. Streams can also
be generated (simulated) by functions. Streams are objects of type
Stream.
Example: The function heartbeat(p) returns an infinite stream of
number elements produced every p seconds:
heartbeat(0.5)
The syntax stream e1,e2,... constructs a stream having specific
elements .
Example:
stream -1,0,2
Streams are objects too, so they can be assigned to variables.
Examples:
set :s1 = heartbeat(0.5)
set :s2= (stream -1,0,2)
The type signature of a stream s can be
inspected with typesig(s).
Examples:
typesig(:s1)
typesig(:s2)
The elements of a stream s can be converted into a
bag by running the
stream with extract(s).
Examples:
extract(:s1)
extract(:s2)
The REPL automatically calls
extract(s) when it receives a stream s.
Examples:
:s1
:s2
Streams are always generated.
Streams are documented in section Streams.
