Validation
Query based testing
SA Engine provides primitives for testing the expected outcomes of executing OSQL statements. In general, the testing is based on validating that queries and statements return the expected values. The combination of queries and validation primitives in SA Engine enables very powerful testing of the correct behavior of functions and queries. Such validation is particularly important when developing complex analytical models, where small changes may cause unexpected behavior. It is therefore strongly recommended to always develop validation rules along with the models themselves.
Example: The following statement validates the positive square root
function sqrt.
validate "square root"
check sqrt(4) => 2
check sqrt(0) => 0
check sqrt(-1) => null
The validate statement accepts a header string for a test sequence
(unit test) followed by check clauses, check LHS => RHS, where
LHS is an OSQL statement to be tested and RHS is an OSQL query that
should return the expected value of executing LHS. The test passes
if all LHS statements are equivalent to the values of the corresponding
RHS queries.
Change the expected value of sqrt(0) to be -1 and see what happens.
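For example, modifying the second check so that sqrt(0) is expected
to return -1 produces a failing test (how the failure is reported
depends on your SA Engine installation):
validate "square root"
check sqrt(4) => 2
check sqrt(0) => -1
check sqrt(-1) => null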
Testing bag valued expressions
For bag valued expressions, use a values
expression as the RHS.
Example:
validate "bagged result"
check range(3)
=> values 1,2,3
When checking for bag equality, two bags are considered equal if they contain equivalent elements. The order of the elements is not considered in the equality test.
Example:
validate "insignificant bag order"
check range(3)
=> values 3,1,2
A bag with a single element is equivalent to the element.
Example:
validate "single element"
check 1+2 => 3
check 1+2 => values 3
Use tuple notation to test tuple valued functions.
Example:
validate "tuple valued result"
check divide(7,5) => (1,2)
Rounding numerical values
Be careful when checking the correctness of floating point numbers, as floating point values are often not exactly equal because of rounding errors.
Example: The following test fails.
validate "square root"
check sqrt(2)
=> 1.414213562373095
The test fails even though LHS and RHS seem to be equal. The reason is
that the internal binary 64-bit representation of sqrt(2) is not
exactly the same as the one produced by reading the numeric string
1.414213562373095.
To circumvent the round-off problem you can round the LHS by calling
the system function roundto(Real r, Integer d)->Real that rounds r to
d decimals.
Example:
validate "square root"
check roundto(sqrt(2),3)
=> 1.414
When the function roundto is applied to a collection, all numeric
elements in the collection are rounded.
Example:
validate "rounding collection elements"
check roundto([sqrt(2),'hello',12],2)
=> [1.41,'hello',12]
check roundto([sqrt(2),12,[sqrt(3),'hello']],2)
=> [1.41,12,[1.73,'hello']]
check roundto((values 12,'hello',sqrt(2)),2)
=> values 12,'hello',1.41
Validating streams
Special care is necessary when testing the correctness of expressions
returning streams. To be testable, the streams have to be converted
to either bags or vectors using extract(Stream s)->Bag or
vectorof(Stream s)->Vector.
Example:
validate 'finite streams'
check extract(diota(0.1,1,3))
=> values 3,1,2 -- insignificant order
check vectorof(diota(0.1,1,3))
=> [1,2,3] -- order important
Infinite streams must be made finite before being converted to finite collections.
Example:
validate 'infinite streams'
check extract(first_n(heartbeat(0.1), 4))
=> values 0,0.1,0.2,0.3
Special care has to be taken to handle imprecision in timings.
Example: The following test sometimes fails:
validate "timing variances"
check extract(timeout(heartbeat(0.1),0.1))
=> values 0,0.1
This succeeds:
validate "timing variances"
check extract(timeout(heartbeat(0.1),0.12))
=> values 0,0.1
Stream elements can be rounded.
Example:
validate "rounding stream elements"
check extract(roundto(first_n(sqrt(heartbeat(0.1)),3),1))
=> values 0,0.3,0.4
Testing known bugs
Known bugs can be identified by validate along with their expected
outcomes by calling the function validate:unresolved(Charstring
issue) immediately before a validate statement. The validate
statement should identify the expected values of expressions
identifying issue.
Example:
validate:unresolved("Issue #4144");
validate "indexing tuple function results"
check divide(23,5)[2]
=> 3
The statements identify an unresolved bug tracked as Issue
#4144, where the expected value of divide(23,5)[2] should be 3. The
negative validation will succeed until the bug is fixed, after which
the validate:unresolved call must be removed for a successful
validation.
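After the fix, the test thus reduces to a normal validation:
validate "indexing tuple function results"
check divide(23,5)[2]
=> 3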
Measuring performance
The system function
time_query(Charstring msg,Charstring q,Integer n)->Real
measures the time of executing a query q. After optimizing
the query, its execution plan is run n times. The average execution
time in seconds is returned. The time to optimize the query is not
included in the measurement.
The performance is logged on standard output, identified with the
message msg. The total time to run the optimized query is logged,
while the average time for each run is returned.
You may need to run the test using the command line interface to see the log.
Example:
time_query('Compute median of bag with 1000 elements',
'median(range(1000))', 1000)
The result of the query is the average time in seconds it takes to compute the median of a bag with 1000 integers. The optimized query is run 1000 times to measure the average execution time. The logging will look like this:
[Compute median of bag with 1000 elements 1000 times .. 0.047 s, base line]
External factors influence the execution time, so it must be
interpreted with care. In particular, the measurement is unreliable if
the logged total execution time is not substantially larger than the
clock resolution of your computer. You may need to adjust n in
time_query accordingly.
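For example, if the measured query is fast and the logged total time
is too small to be reliable, the same measurement can be repeated with
a larger n (a sketch reusing the median query above):
time_query('Compute median of bag with 1000 elements',
'median(range(1000))', 10000)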
To compare two measurements you can call the function
compare_time_query(Real base,Charstring msg,Charstring q,Integer
n)->Real.
It compares the average execution time of query q with
the base line execution time base returned by time_query. The
output is logged on standard output. The query returns the average
execution time like time_query. The log includes how many times faster the
execution was compared to the base line.
Example: The following measurements compare the execution of
mean(range(1000)) with the base line execution of median(range(1000)).
The base line:
set :base = time_query('Compute median of bag with 1000 elements',
'median(range(1000))', 1000)
The comparison:
compare_time_query(:base, 'Computing the mean of a bag with 1000 integers',
'avg(range(1000))' , 10000)
The logging will look like this:
[Computing the mean of a bag with 1000 integers 10000 times .. 0.312 s, improvement factor 1.5]
For reliable measurements you should adjust n so that the logged
total execution time is of the same order of magnitude as the base line
execution time.
Analogous to time_query and compare_time_query, the functions
time_function(Charstring msg,Charstring fn,Vector args,Integer
n)->Real
and
compare_time_function(Real base,Charstring msg,Charstring fn,Vector args,Integer n)
measure and compare, respectively, the performance of different
implementations of an OSQL function fn. They run fn(args) n
times.
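A minimal sketch of how these functions might be called, here timing
calls to sqrt (the packing of the argument into the Vector args is an
assumption; in practice the comparison would be between two alternative
implementations of the measured function):
set :base = time_function('Square root of 2', 'sqrt', [2], 100000)
compare_time_function(:base, 'Square root of 2, second run',
'sqrt', [2], 100000)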
