Skip to main content

OSQL exercises

Guide specification
Guide type:Studio code
Requirements:None
Recommended reading:System tutorial, OSQL Testing

Introduction

This is a guide that will help you train your OSQL skills. It contains exercises on various aspects of OSQL and SA Engine. Each exercise contains a problem description and a code block where you can write your solution. There is a short snippet of code in the code block that should help you to get started. At the end of each code block there is a validation function that tests your solution. To check if your solution is correct simply run the code block. Happy solving!

Input files

Some exercises read data from files. The files have to be downloaded before solving those exercises.

http:download_file("https://assets.streamanalyze.com/docs/guides/osql-exercises/input.csv",
{}, temp_folder() + "input.csv");
http:download_file("https://assets.streamanalyze.com/docs/guides/osql-exercises/input2.csv",
{}, temp_folder() + "input2.csv");
http:download_file("https://assets.streamanalyze.com/docs/guides/osql-exercises/input3.csv",
{}, temp_folder() + "input3.csv");
Not connected

To run this code block you must be logged in and your studio instance must be started.

Part 1 - Queries/basic problems

Problem 1-1 - Remove divisible by three

Create a function problem1_1() that takes a vector of integers and returns the vector with all integers divisible by three removed.

Example:

> problem1_1([1,2,3,4,5,6]);
[1,2,4,5]

Solution:

create function problem1_1(vector of integer v) -> vector of integer
as /* Write your solution here */


validate "problem 1-1"
check problem1_1([1,2,3,4,5,6]) => [1,2,4,5]
check problem1_1([10,20,30,40]) => [10,20,40];

Not connected

To run this code block you must be logged in and your studio instance must be started.

Problem 1-2 - Categorize grade

Create a function problem1_2() that takes a grade (integer) between 0 and 10 and returns a string. The string should say "bad" if the grade is in the interval 0-5, "ok" if the grade is 6-7, "good" if the grade is 8-9, and "great" if the grade is 10.

Example:

> problem1_2(9);
"good"

Solution:

create function problem1_2(integer x) -> Charstring
as /* Write your solution here */


validate "problem 1-2"
check problem1_2(4) => "bad"
check problem1_2(5) => "bad"
check problem1_2(6) => "ok"
check problem1_2(7) => "ok"
check problem1_2(8) => "good"
check problem1_2(9) => "good"
check problem1_2(10) => "great";
Not connected

To run this code block you must be logged in and your studio instance must be started.

Problem 1-3 - Check if outside interval

Create a function problem1_3() that takes an integer and checks if it is outside the interval [5,10]. That is, it should return the integer if it is less than 5 or greater than 10, otherwise it should not return anything.

Example:

> problem1_3(4);
4
> problem1_3(7);
null

Solution:

create function problem1_3(integer i) -> integer
as /* Write your solution here */


validate "problem 1-3"
check problem1_3(3) => 3
check problem1_3(4) => 4
check problem1_3(5) => null
check problem1_3(6) => null
check problem1_3(9) => null
check problem1_3(10) => null
check problem1_3(11) => 11
check problem1_3(12) => 12;
Not connected

To run this code block you must be logged in and your studio instance must be started.

Problem 1-4 - Argmax

Create a function problem1_4() that takes a vector of integers and finds the index that holds the highest value.

Example:

> problem1_4([19,240,113]);
2
> problem1_4([1,3,7,2,7]);
3
5

Solution:

create function problem1_4(vector of integer v) -> integer
as /* Write your solution here */


validate "problem 1-4"
check problem1_4([1,2,3,4,5,6,100,8,9,10]) => 7
check problem1_4([1,2,3,100,5,6,100,8,9,10]) => bag(4,7);
Not connected

To run this code block you must be logged in and your studio instance must be started.

Part 2 - Stream filters

Problem 2-1

Create a function 'problem2_1()' that takes a a stream of numbers and only emits the numbers that are less than 10.

Example:

> problem2_1(vstream([12,8,7,19,5]));
[1,2,4,5]

Solution:

create function problem2_1(stream of number s) -> stream of number
as /* Write your solution here */


validate "problem 2-1"
check problem2_1(vstream([1,2,3,4,5,6,7,8,9,10,11,12,13,14]))
=> bag(1,2,3,4,5,6,7,8,9);
Not connected

To run this code block you must be logged in and your studio instance must be started.

Part 3 - Synthetic/recorded data and csv files

Problem 3-1

Create a function problem3_1() that outputs the following stream of vectors:

[1,2,3]
[4,5,6]
[7,8,9]

Solution:

 create function problem3_1() -> Stream of Vector
as /* Write your solution here */


validate "problem 3-1"
check extract(problem3_1()) => bag([1,2,3], [4,5,6], [7,8,9]);
Not connected

To run this code block you must be logged in and your studio instance must be started.

Problem 3-2

Create a function problem3_2() that returns the following tuples:

(1, "job")
(4, "well")
(7, "done")

Solution:

 create function problem3_2() -> Bag of (Integer, Charstring)
as /* Write your solution here */


validate "problem 3-2"
check problem3_2() => bag((1, "job"), (4,"well"), (7,"done"));
Not connected

To run this code block you must be logged in and your studio instance must be started.

Problem 3-3

Create a function problem3_3() that takes a CSV file as input and outputs a stream of vectors where each vector is one row in the CSV file but with 1 added to each element.

For example, a CSV with the following contents:

1,2,3
4,5,6
7,8,9

would generate a stream of the following vectors:

[2,3,4]
[5,6,7]
[8,9,10]

Solution:

create function problem3_3(Charstring csv_file) -> Stream of Vector
as /* Write your solution here */


validate "problem 3-3"
check extract(problem3_3(temp_folder() + "input.csv"))
=> bag([2,3,4], [5,6,7], [8,9,10]);
Not connected

To run this code block you must be logged in and your studio instance must be started.

Problem 3-4

Create a function problem3_4() that takes a CSV file as input and parses each row as a Timeval with a utc timestamp and a value. The function should output a stream of timevals where each timeval corresponds to a row in the CSV file.

For example, a CSV with the following contents:

"2023-01-17T09:49:45.449Z",1
"2023-01-17T09:49:45.564Z",2
"2023-01-17T09:49:45.660Z",3

would generate a stream of the following vectors:

ts(|2023-01-17T09:49:45.449Z|, [1])
ts(|2023-01-17T09:49:45.564Z|, [2]),
ts(|2023-01-17T09:49:45.660Z|, [3]),

Solution:

create function problem3_4(Charstring csv_file) -> Stream of Timeval
as /* Write your solution here */


validate "problem 3-4"
check(extract(problem3_4(temp_folder() + "input2.csv"))) =>
bag(ts(|2023-01-17T09:49:45.449Z|, [1]),
ts(|2023-01-17T09:49:45.564Z|, [2]),
ts(|2023-01-17T09:49:45.660Z|, [3]),
ts(|2023-01-17T09:49:45.751Z|, [4]),
ts(|2023-01-17T09:49:45.860Z|, [5]),
ts(|2023-01-17T09:49:45.954Z|, [6]),
ts(|2023-01-17T09:49:46.049Z|, [7]),
ts(|2023-01-17T09:49:46.158Z|, [8]),
ts(|2023-01-17T09:49:46.250Z|, [9]),
ts(|2023-01-17T09:49:46.358Z|, [10])
);
Not connected

To run this code block you must be logged in and your studio instance must be started.

Part 4 - Stream windows

Problem 4-1

Create a function problem4_1() that takes a stream of number and outputs a stream of every second number from the stream.

Example:

> problem4_1(vstream[1,2,3,4,5,6]);
[1,3,5]

Solution:

create function problem4_1(stream of number s) -> stream of number
as /* Write your solution here */


validate "problem 4-1"
check extract(problem4_1(vstream([1,2,3,4,5,6]))) => bag(1,3,5)
check extract(problem4_1(vstream([10,20,99,40]))) => bag(10,99);
Not connected

To run this code block you must be logged in and your studio instance must be started.

Problem 4-2

Create a function problem4_2() that takes a stream of numbers, computes a moving average over sequences of five with a distance of one, and outputs these as a stream of numbers

Example:

> problem4_2(vstream[1,1,1,1,1,2,2,2,2,2]);
[1,1.2,1.4,1.6,1.8,2]

Solution:

create function problem4_2(stream of number s) -> stream of number
as /* Write your solution here */


validate "problem 4-2"
check extract(problem4_2(vstream([1,1,1,1,1,2,2,2,2,2]))) => bag(1,1.2,1.4,1.6,1.8,2);
Not connected

To run this code block you must be logged in and your studio instance must be started.

Problem 4-3

Create a function problem4_3() that takes a stream of timevals and samples the first 100 milliseconds from every second.

Example: The file input3.csv contains 500 timevals created during 5 seconds. Use ts(diota(0.01,1,500)) to create a similar stream, except with your current time. The resulting stream should look like this (but with different timestamps):

ts(|2023-01-17T14:22:25.349Z|, [[1],[2],[3],[4],[5],[6],[7],[8],[9]])
ts(|2023-01-17T14:22:26.349Z|, [[100],[101],[102],[103],[104],[105],[106],[107],[108],[109]])
ts(|2023-01-17T14:22:27.349Z|, [[201],[202],[203],[204],[205],[206],[207],[208],[209],[210]])
ts(|2023-01-17T14:22:28.349Z|, [[301],[302],[303],[304],[305],[306],[307],[308],[309],[310]])
ts(|2023-01-17T14:22:29.349Z|, [[401],[402],[403],[404],[405],[406],[407],[408],[409],[410]])

Solution:

create function problem4_3(Stream of Timeval tv) -> Stream of Timeval of vector
as /* Write your solution here */


create function read_test_data(Charstring csv_file) -> Stream of Timeval
as select Stream of t
from Timeval t, Vector v, Number value, charstring timestring
where v in csv:file_stream(csv_file)
and [timestring, value] = v
and t = ts(parse_iso_timestamp(timestring), value);

validate "problem 4-3"
check(extract(problem4_3(read_test_data(temp_folder() + "input3.csv")))) =>
bag(ts(|2023-01-17T14:22:25.349Z|, [[1],[2],[3],[4],[5],[6],[7],[8],[9]]),
ts(|2023-01-17T14:22:26.349Z|, [[100],[101],[102],[103],[104],[105],[106],[107],[108],[109]]),
ts(|2023-01-17T14:22:27.349Z|, [[201],[202],[203],[204],[205],[206],[207],[208],[209],[210]]),
ts(|2023-01-17T14:22:28.349Z|, [[301],[302],[303],[304],[305],[306],[307],[308],[309],[310]]),
ts(|2023-01-17T14:22:29.349Z|, [[401],[402],[403],[404],[405],[406],[407],[408],[409],[410]]));
Not connected

To run this code block you must be logged in and your studio instance must be started.

Problem 4-4

Create a function problem4_4() that takes a stream of timevals and, whenever there is a value divisible with 100, gets a timewindow of 100 ms.

Example: from the diota(0.1,1,500) stream in the previous exercise there should be five windows, as in the bag seen below.


Solution:

create function hundred_p(Timeval of Number tn) -> Boolean
as select mod(value(tn),100) = 0;

create function problem4_4(Stream of Timeval tv) -> Stream of Timeval of vector
as /* Write your solution here */


validate "problem 4-4"
check(extract(problem4_4(read_test_data(temp_folder() + "input3.csv")))) =>
bag(ts(|2023-01-17T14:22:26.255Z|, [92,93,94,95,96,97,98,99,100]),
ts(|2023-01-17T14:22:27.245Z|, [190,191,192,193,194,195,196,197,198,199,200]),
ts(|2023-01-17T14:22:28.238Z|, [289,290,291,292,293,294,295,296,297,298,299,300]),
ts(|2023-01-17T14:22:29.239Z|, [390,391,392,393,394,395,396,397,398,399,400]),
ts(|2023-01-17T14:22:30.247Z|, [492,493,494,495,496,497,498,499,500])
);
Not connected

To run this code block you must be logged in and your studio instance must be started.