Stateful and procedural functions
Stateful functions
A stateful function is a function whose internal state may change
when it is called. For example, the wall clock function like now()
and random number generators like rand(Integer high)->Integer are
stateful.
Stateful functions can be called in queries. The query optimizer is aware of that they may return different values for given arguments every time they are called. Therefore the optimizer is careful not to change the order in which stateful functions are called in a query.
The stream function extract(s) is stateful since it runs stream
s to extract the elements. The state of stream s may thereby
change. For example, a stream reading sensor values will continuously
change its state when new sensor readings arrive.
The function stateful(Charstring pat)->Bag of Charstring returns the
signatures of the stateful resolvents for functions whose names
match pat.
Example:
stateful('%rand%')
Procedural functions
A procedural function is a stateful function defined as a sequence
of OSQL statements that may have side effects. The typical use of
procedural functions is for consistent updates of the database. The
return statement inside a procedural function returns a value.
Example: Let's create a table to store equipment properties.
drop table Equipment; -- Drop old table
create table Equipment(name Charstring,
weight Integer,
unique(name))
The following procedural Boolean
function
new_equipment(nm,e) inserts a new item into the equipment table and
returns one if the insert succeeded:
create function new_equipment(Charstring n, Integer w) -> Integer
as {
insert into Equipment values (n, w);
return rowcount();
}
You can have several OSQL statements in the body of a procedural
function by enclosing them in curly brackets {...} and separating
them with semicolons. This is called a block.
The curly brackets in a block can be omitted if the block contains only a single statement.
The stateful function rowcount() returns the number of affected rows
in the preceding update of a table.
Test it:
new_equipment('Sensor',10)
select * from Equipment
The function new_equipment is stateful:
stateful('new_equipment')
If the return statement is not called in the procedural function,
the result is null. If a procedural function is used for its side
effects only, not returning any value, the result type Boolean can
be specified.
Iterating over queries
You can make procedures that execute a statement or a block for each result tuple in the result of a query using the for statement:
for <variables> in <query> <statement>
Example: The following procedural function populates the table
Equipment with n items named E1, E2,.. along with random
simulated weights between 1 and 100.
create function populate_equipment1(Integer n) -> Boolean
as for i in range(n)
{
insert into Equipment values ('E' || i, rand(1,100));
commit;
}
populate_equipment1(10)
select * from Equipment
The type of the variable i is implicitly determined by the system
from the type of the expression after in and is not explicitly
declared.
The commit after each insert is very important for saving memory
during the population. If it had not been present all updates would
have been logged and the log would use memory
proportional to n.
Try calling populate_equipment1 again:
populate_equipment1(10)
The call to populate_equipment1 fails when it is called a second
time, since the column name is unique.
When experimenting with different table sizes one would like to remove
all rows in the table Equipment before populating it. This can be
done by executing the SQL statement truncate table Equipment before
inserting the rows;
create function populate_equipment2(Integer n) -> Integer
as {
truncate table Equipment;
for i in range(n)
{
insert into Equipment values ('E' || i, rand(1,100));
commit; -- To clear log
};
return n;
}
Try it:
populate_equipment2(10)
select * from Equipment
The SQL statement truncate table Equipment removes all rows from the
table without logging. If we had excuted
delete from Equipment the row deletions
would have been logged, which would have used a lot of memory.
Local declarations
You can declare local variables Vi with types Ti in a
block with a declare statement placed
first in the block:
declare T1 V1, T2 V2,...
Example: The following procedural function populates Equipment
while also computing the avererage weights.
create function populate_equipment3(Integer n) -> Real
as {
declare Integer sum;
set sum = 0;
truncate table Equipment;
for i in range(n)
{
declare Integer w;
set w = rand(1,100);
insert into Equipment values ('E' || i, w);
commit; -- To clear log
set sum = sum + w;
};
return sum/n;
}
Run it:
populate_equipment3(10)
Check that the average weight was computed correctly:
select avg(weight) from Equipment
Returning bags
For procedural functions returning bags the
statement yield e is used for iteratively adding a new element
to the result.
Example: The following procedural function populates Equipment and
returns a bag of tuples containing the new
item names and their weights.
create function populate_equipment4(Integer n) -> Bag of (Charstring, Integer)
as {
truncate table Equipment;
for i in range(n)
{
declare Charstring nm, Integer w;
set nm = 'E' || i;
set w = rand(1,100);
insert into Equipment values (nm, w);
commit; -- To clear log
yield (nm, w); -- Add result tuple
};
}
Run it:
populate_equipment4(10)
The table was also populated:
select * from Equipment
Stream generators
Procedures that transform other streams can be defined by using for
over incoming streams. This is often a practical method for producing
streams of running statistics over other streams.
Example: The following procedure returns a stream of elements for an incoming stream along with their accumulated sum, min, and max values.
create function elem_sum(Stream of Real s) -> Stream of (Real, Real, Real, Real)
as { declare Real rsum, Real rmin, Real rmax;
set rsum = 0;
set rmin = +inf;
set rmax = -inf;
for e in extract(s)
{
set rsum = rsum + e;
set rmin = min(e,rmin);
set rmax = max(e,rmax);
yield (e, rsum, rmin, rmax);
}
}
elem_sum(timeout(heartbeat(0.25),2))
Analogous to bag results, procedural functions can return steams by
declaring the result Stream of ....
Terminating streams
The return x statement yields stream element e and then the stream
is terminated. It can be used to terminate infinite streams by
immediately returning from a for statement iterating over a stream.
Example: The following function returns the elements of a stream of
numbers s until their accumulated sum exceeds a threshold th.
create function sumstream(Stream of Real s, Real th) -> Stream of Real
as { declare Real sum;
set sum = 0;
for e in extract(s)
{
set sum = sum + e;
case when sum >= th then return null
else yield e
end
}
}
Iterative updates
Use SQL's update statement in procedural
functions for iterative updates of table rows fulfilling some
conditions.
Example:
create function increase_weights(Integer inc,Integer thres) -> Integer
as {
update Equipment
set weight = weight + inc
where weight>thres;
return rowcount();
}
Let's test it:
select sum(weight) from Equipment
increase_weights(10,50)
select sum(weight) from Equipment