
Validation

Query based testing

SA Engine provides primitives for testing the expected outcomes of executing OSQL statements. In general, the testing is based on validating that queries and statements return the expected values. Combining queries with these validation primitives enables powerful testing of the correct behavior of functions and queries. Such validation is particularly important when developing complex analytical models, where small changes may cause unexpected behavior. It is therefore strongly recommended to always develop validation rules along with the models themselves.

Example: The following statement validates the positive square root function sqrt.

validate "square root"
check sqrt(4) => 2
check sqrt(0) => 0
check sqrt(-1) => null

The validate statement accepts a header string naming a test sequence (unit test), followed by check clauses of the form check LHS => RHS, where LHS is an OSQL statement to be tested and RHS is an OSQL query returning the expected value of executing LHS. The test passes if all LHS statements evaluate to values equivalent to those of the corresponding RHS queries.
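For instance, validation rules can be developed together with a user-defined model function. The following sketch assumes a hypothetical function cube defined with a create function statement:

-- cube is a hypothetical user-defined function used for illustration
create function cube(Number x) -> Number
  as x * x * x;

validate "cube"
check cube(2) => 8
check cube(-2) => -8
check cube(0) => 0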

Exercise

Change the expected value of sqrt(0) to be -1 and see what happens.

Testing bag valued expressions

For bag valued expressions, use a values expression as the RHS.

Example:

validate "bagged result"
check range(3)
=> values 1,2,3
note

When checking for bag equality, two bags are considered equal if they contain equivalent elements. The order of the elements is not considered in the equality test.

Example:

validate "insignificant bag order"
check range(3)
=> values 3,1,2

A bag with a single element is equivalent to the element.

Example:

validate "single element"
check 1+2 => 3
check 1+2 => values 3

Use tuple notation to test tuple valued functions.

Example:

validate "tuple valued result"
check divide(7,5) => (1,2)

Rounding numerical values

Be careful when checking the correctness of floating point numbers, since floating point numbers are often not exactly equal because of rounding errors.

Example: The following test fails.

validate "square root"
check sqrt(2)
=> 1.414213562373095

The test fails even though LHS and RHS seem to be equal. The reason is that the internal binary 64-bit representation of sqrt(2) is not exactly the same as the one produced by reading the numeric string 1.414213562373095.

To circumvent the round-off problem you can round the LHS by calling the system function roundto(Real r, Integer d)->Real, which rounds r to d decimals.

Example:

validate "square root"
check roundto(sqrt(2),3)
=> 1.414

When the function roundto is applied to a collection, all numeric elements in the collection are rounded.

Example:

validate "rounding collection elements"
check roundto([sqrt(2),'hello',12],2)
=> [1.41,'hello',12]
check roundto([sqrt(2),12,[sqrt(3),'hello']],2)
=> [1.41,12,[1.73,'hello']]
check roundto((values 12,'hello',sqrt(2)),2)
=> values 12,'hello',1.41

Validating streams

Special care is necessary when testing the correctness of expressions returning streams. To be testable, the streams have to be converted to either bags or vectors using extract(Stream s)->Bag or vectorof(Stream s)->Vector.

Example:

validate 'finite streams'
check extract(diota(0.1,1,3))
=> values 3,1,2 -- insignificant order
check vectorof(diota(0.1,1,3))
=> [1,2,3] -- order important

Infinite streams must be made finite before being converted to finite collections.

Example:

validate 'infinite streams'
check extract(first_n(heartbeat(0.1), 4))
=> values 0,0.1,0.2,0.3

Special care also has to be taken to handle imprecision in timings.

Example: The following sometimes fails.

validate "timing variances"
check extract(timeout(heartbeat(0.1),0.1))
=> values 0,0.1

This succeeds:

validate "timing variances"
check extract(timeout(heartbeat(0.1),0.12))
=> values 0,0.1

Stream elements can be rounded.

Example:

validate "rounding stream elements"
check extract(roundto(first_n(sqrt(heartbeat(0.1)),3),1))
=> values 0,0.3,0.4

Testing known bugs

Known bugs and their expected outcomes can be registered with validate by calling the function validate:unresolved(Charstring issue) immediately before a validate statement. The validate statement should specify the expected values of the expressions affected by the issue.

Example:

validate:unresolved("Issue #4144");
validate "indexing tuple function results"
check divide(23,5)[2]
=> 3

The statements identify an unresolved bug, Issue #4144, where the expected value of divide(23,5)[2] should be 3. The negative validation will succeed until the bug is fixed, after which the validate:unresolved call must be removed for a successful validation.
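Once Issue #4144 is fixed, only the plain validate statement would remain, for example:

-- after the fix the validate:unresolved call is removed
validate "indexing tuple function results"
check divide(23,5)[2]
=> 3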

Measuring performance

The system function

time_query(Charstring msg,Charstring q,Integer n)->Real

measures the time of executing a query q. After optimizing the query, its execution plan is run n times. The average execution time in seconds is returned. The time to optimize the query is not included in the measurement.

The performance is logged on standard output identified with the message msg. The total time to run the optimized query is logged while the average time for each run is returned.

note

You may need to run the test using the command line interface to see the log.

Example:

time_query('Compute median of bag with 1000 elements',
'median(range(1000))', 1000)

The result of the query is the average time in seconds it takes to compute the median of a bag with 1000 integers. The optimized query is run 1000 times to measure the average execution time. The logging will look like this:

[Compute median of bag with 1000 elements 1000 times .. 0.047 s, base line]
note

External factors influence the execution time, so it must be interpreted with care. In particular, the logged total execution time should be substantially larger than the clock resolution of your computer; you may need to adjust n in time_query to achieve this.

To compare two measurements you can call the function

compare_time_query(Real base,Charstring msg,Charstring q,Integer n)->Real

It compares the average execution time of query q with the base line execution time base returned by time_query. The output is logged on standard output. Like time_query, it returns the average execution time. The log includes how many times faster the execution was compared to the base line.

Example: The following measurements compare the execution of avg(range(1000)), which computes the mean, with the base line execution of median(range(1000)).

The base line:

set :base = time_query('Compute median of bag with 1000 elements',
'median(range(1000))', 1000)

The comparison:

compare_time_query(:base, 'Computing the mean of a bag with 1000 integers',
'avg(range(1000))' , 10000)

The logging will look like this:

[Computing the mean of a bag with 1000 integers 10000 times .. 0.312 s, improvement factor 1.5]
note

For reliable measurements you should adjust n so that the logged total execution time is of the same order of magnitude as the base line execution time.

Analogous to time_query and compare_time_query, the functions

time_function(Charstring msg,Charstring fn,Vector args,Integer n)->Real

and

compare_time_function(Real base,Charstring msg,Charstring fn,Vector args,Integer n)

measure and compare, respectively, the performance of different implementations of an OSQL function fn. They run fn(args) n times.
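
Example: The following hedged sketch illustrates the calling pattern. The function names fib_slow and fib_fast are hypothetical stand-ins for two implementations of the same computation, and the vector args is assumed to hold the argument list passed to the function:

-- fib_slow and fib_fast are hypothetical implementations of the same function
set :base = time_function('fib_slow(25)', 'fib_slow', [25], 100)

compare_time_function(:base, 'fib_fast(25)', 'fib_fast', [25], 100)

As with compare_time_query, the base line value returned by the first call is passed as the first argument of the comparison call.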