Working with recorded data

Guide specification
Guide type: Studio code
Requirements: None
Recommended reading: None

Introduction

During model development it is often inconvenient to work directly on data streams from devices in active production. You need to be able to replay scenarios, and you need data that represents both the general cases and any edge cases you want your model to handle. Being able to work with both synthetic and recorded data is therefore often a requirement for a smooth and rigorous development process and for reliable model performance.

Note

Working with recorded data on CAN buses offers additional possibilities, which are described in the CAN bus module documentation.

Basic recording and replaying of data

The simplest way to record data is to use csv:write_file(). You provide it with a stream or bag of vectors and it saves them to a file as CSV, one vector per row.

To write the values 1-10 to a CSV file, we take the numbers emitted by diota() and wrap each value in a vector.

csv:write_file(sa_home() + "example_1.csv",
               (select [n]
                  from number n
                 where n in diota(0.1,1,10)));

To emit the result to the output window while saving it, pass 1 as the second argument (feedback) to csv:write_file(). Note that this query overwrites the file example_1.csv saved by the previous query.

csv:write_file(sa_home() + "example_1.csv",
               1,
               (select [n]
                  from number n
                 where n in diota(0.1,1,10)));

Replaying the data is as simple as running the csv:file_stream() function.

csv:file_stream(sa_home() + "example_1.csv");
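
To make the "one vector per row" layout concrete, here is a minimal sketch of the same record-and-replay round trip outside the platform. It is written in Python rather than OSQL, it is not how csv:write_file() or csv:file_stream() are implemented, and it assumes a plain comma-separated file without a header row. The helper names and the file name example_1_sketch.csv are hypothetical.

```python
import csv
import os
import tempfile


def write_vectors(path, vectors):
    """Write each vector as one CSV row (assumed layout: no header)."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(vectors)


def read_vectors(path):
    """Yield each CSV row back as a list of floats."""
    with open(path, newline="") as f:
        for row in csv.reader(f):
            yield [float(x) for x in row]


# Hypothetical file name; the values 1-10 are each wrapped in a vector.
path = os.path.join(tempfile.gettempdir(), "example_1_sketch.csv")
write_vectors(path, [[n] for n in range(1, 11)])
replayed = list(read_vectors(path))
```

Each value ends up on its own row because each one was wrapped in a one-element vector before writing.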

Since csv:write_file() takes a stream of vectors, we can save vectors of any dimension. This query uses pivot() to combine three different streams into three-dimensional vectors (with -1 as the default value for each element) and saves the first 30 vectors.

csv:write_file(sa_home() + "example_2.csv",
               (select v
                  from Vector v, Stream of Number s1,
                       Stream of Number s2, Stream of Number s3
                 where s1 = diota(0.1,1,10)
                   and s2 = heartbeat(0.1)
                   and s3 = simstream(0.1)
                   and v in pivot([s1,s2,s3], [-1,-1,-1])
                 limit 30));

Verify the results by replaying the saved file.

//plot: Line Plot
csv:file_stream(sa_home() + "example_2.csv");
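
The role of the default values in the pivot() query can be illustrated with a small Python sketch. It assumes one plausible reading of pivot(): each incoming value updates its own position in the vector, and positions whose stream has not yet emitted keep the default. The exact semantics of pivot() are not specified in this guide, so treat the function pivot_latest and the event list below as hypothetical.

```python
def pivot_latest(events, defaults):
    """Merge (stream_index, value) events into latest-value vectors.

    Positions that have not received a value yet keep their default.
    """
    latest = list(defaults)
    for index, value in events:
        latest[index] = value
        yield list(latest)  # copy so later updates do not alter earlier rows


# Made-up arrival order for three streams (indices 0, 1 and 2).
events = [(0, 1), (1, 0.0), (0, 2), (2, 0.5)]
rows = list(pivot_latest(events, [-1, -1, -1]))
```

Under this assumption the first rows contain -1 in every position whose stream has not produced a value yet.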

Data with timestamps

We may want time information in the saved data. This is easily done by saving a timestamp together with each vector. An OSQL Timeval can be converted to a string in UTC format with utc_time(). To save a vector with the timestamp as its first value, we create a Timeval each time we get a vector from the stream and add it to the result.

csv:write_file(sa_home() + "example_3.csv",
               (select [t, v1, v2]
                  from Vector v, Number v1, Number v2, Charstring t
                 where v in pivot([diota(0.1,1,10),diota(0.1,1,10)], [-1,-1])
                   and [v1,v2] = v
                   and t = utc_time(timestamp(ts(v)))));

Verify the results by replaying the saved file.

csv:file_stream(sa_home() + "example_3.csv");

You can use parse_iso_timestamp() to parse the timestamps when reading the recorded data. To illustrate this we create a function that reads the recorded data in example_3.csv and outputs a stream of timestamped vectors.

create function replay_recorded_ts_stream()
                -> Stream of Timeval of Vector
  as select stream of t
       from Timeval t, Vector v,
            Number v1, Number v2, Charstring tim
      where v in csv:file_stream(sa_home() +
                                 "example_3.csv")
        and [tim,v1,v2] = v
        and t = ts(parse_iso_timestamp(tim),[v1,v2]);

Run the function by executing the following query:

replay_recorded_ts_stream();

As you can see the function outputs a stream of the timestamped vectors recorded in example_3.csv.
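
The same split of a recorded row into a timestamp and its values can be sketched in Python, for instance to inspect example_3.csv outside the platform. This assumes that the first column holds an ISO 8601 UTC string with a trailing "Z"; the exact string produced by utc_time() is not shown in this guide, and the sample row below is made up.

```python
from datetime import datetime, timezone


def parse_row(row):
    """Split [timestamp, v1, v2, ...] into (datetime, [v1, v2, ...])."""
    stamp, *values = row
    # Replace a trailing "Z" so older Python versions accept the string.
    when = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
    return when, [float(x) for x in values]


# Made-up sample row in the assumed layout.
when, values = parse_row(["2024-01-01T12:00:00.500Z", "1", "1"])
```

Keeping the timestamp as a timezone-aware value avoids ambiguity when recordings from devices in different time zones are compared.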

GPS example

As another application example, we look at how to replay GPS data from a recorded file. We have provided a file gps.csv with GPS data recorded during a drive through Uppsala, Sweden. The following query downloads it to sa_home().

http:download_file(
  "https://assets.streamanalyze.com/docs/guides/data/gps.csv",
  {}, sa_home() + "gps.csv");

First we create a function that replays the data at a specified pace.

create function replay_gps_stream(Number pace)
                -> Stream of Vector
  as select Stream of v
       from Vector v
      where v in csv:file_stream(sa_home() + "gps.csv",
                                 "read",
                                 pace);
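
The idea of replaying at a pace can be sketched in Python as a generator that pauses between rows. This is only an illustration: it assumes that pace means seconds between consecutive rows, which this guide does not state explicitly for csv:file_stream(), and it does not cover the meaning of the "read" argument. The function replay and the sample positions are hypothetical.

```python
import time


def replay(rows, pace, sleep=time.sleep):
    """Yield rows one at a time, pausing `pace` seconds between rows."""
    for i, row in enumerate(rows):
        if i > 0:
            sleep(pace)  # no pause before the first row
        yield row


# Made-up sample positions. A recording stub replaces time.sleep so the
# demo finishes instantly while still showing which pauses were requested.
pauses = []
rows = [[59.85, 17.63], [59.86, 17.64], [59.87, 17.65]]
replayed = list(replay(rows, 0.5, sleep=pauses.append))
```

Passing the sleep function as a parameter keeps the pacing logic testable without waiting in real time.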

We then wrap the GPS values in GeoJSON records so that the drive can be rendered on a map.

create function geojson_stream(Number pace, Charstring name)
                -> Stream of Record
  as select Stream of geojson:point(p,
                                    {"persistent": true,
                                     "id": name,
                                     "style": {"label": name}})
       from Vector p
      where p in replay_gps_stream(pace);
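
For readers unfamiliar with GeoJSON, the following Python sketch builds a standard GeoJSON point feature carrying the same properties as in the function above. It is not necessarily the record that geojson:point() returns, and the column order of gps.csv is not stated in this guide. In standard GeoJSON the coordinate order is longitude first, then latitude; the coordinates below are an approximate, made-up position in Uppsala.

```python
import json


def point_feature(lon, lat, properties):
    """Build a standard GeoJSON Feature with a Point geometry."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


# Approximate, made-up position; property names mirror the OSQL record.
feature = point_feature(
    17.6389,
    59.8586,
    {"persistent": True, "id": "car-01", "style": {"label": "car-01"}},
)
encoded = json.dumps(feature)
```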

Finally, we start the stream with GeoJSON visualization activated to see the car drive around Uppsala.

//plot: Geo JSON
geojson_stream(0.5, "car-01");

The GPS position is a bit jittery at first due to low initial accuracy, but it improves as the car starts driving towards the city center.

Conclusion

This guide has shown how to record data streams and how to replay and work with the recorded streams. As a next step, we recommend reading the Advanced recording examples guide, where we try these concepts on a real edge device.