Working with recorded data

Introduction

During model development it is often inconvenient to work directly on data streams from devices in active production settings. You need to be able to replay scenarios, and you need data that represents both the general cases and the edge cases you want your model to handle. Being able to work with both synthetic and recorded data is therefore often required for a smooth and rigorous model development process and for reliable model performance.

Note

Working with recorded data on CAN buses offers extra possibilities, which are described in the CAN bus module documentation.

Basic recording and replaying of data

The simplest way to record data is to use csv:write_file(Charstring file,Stream of Vector s)->Boolean. You provide it with a stream of vectors and it will save the vectors as CSV to a file (one vector per row).

To write the values 1 to 10 to a CSV file, we can wrap each number emitted by the synthetic stream generator diota(pace,l,u) in a one-element vector.

Example:

// Write query output to CSV
csv:write_file(sa_home() + "example_1.csv",
(select Stream of [n]
from Integer n
where n in diota(0.1,1,10)))

If you want to send the elements to the output window while saving them, you can pass the number 1 as the feedback argument to csv:write_file(Charstring file,Number feedback,Stream of Vector s)->Stream of Vector.

This query will overwrite the file example_1.csv saved in the previous query:

// Verbose write to CSV
csv:write_file(sa_home() + "example_1.csv",
1,
(select Stream of [n]
from Integer n
where n in diota(0.1,1,10)))

Replay the data by running the csv:file_stream(Charstring file)->Stream of Vector function:

// Replay data from CSV
csv:file_stream(sa_home() + "example_1.csv")
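
If you want to inspect a recorded file outside SA Engine, it helps to picture the layout described above: one vector per row, with the elements separated by commas. The Python sketch below mimics that round trip with the standard csv module. The helper names write_vectors and read_vectors are hypothetical, and the exact number formatting that csv:write_file produces is an assumption here.

```python
import csv
import os
import tempfile


def parse_number(text):
    """Parse a CSV field as int when possible, otherwise as float."""
    try:
        return int(text)
    except ValueError:
        return float(text)


def write_vectors(path, vectors):
    """Write each vector as one CSV row (assumed layout of a recorded file)."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(vectors)


def read_vectors(path):
    """Yield one vector (a list of numbers) per CSV row."""
    with open(path, newline="") as f:
        for row in csv.reader(f):
            yield [parse_number(field) for field in row]


if __name__ == "__main__":
    # Hypothetical file name in the system temp directory, not sa_home().
    path = os.path.join(tempfile.gettempdir(), "example_1_sketch.csv")
    write_vectors(path, ([n] for n in range(1, 11)))
    for vector in read_vectors(path):
        print(vector)
```

Reading the file lazily, one row at a time, mirrors how a replayed stream delivers one vector per element.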

Since csv:write_file() takes a stream of vectors as argument, we can save vectors of any dimension. This query saves three-dimensional vectors built from three different streams with pivot(), using -1 as the default value for each element.

// Write 3D vectors to CSV
csv:write_file(sa_home() + "example_2.csv",
(select Stream of v
from Vector v, Stream of Integer s1,
Stream of Real s2, Stream of Real s3
where s1 = diota(0.1,1,10)
and s2 = heartbeat(0.1)
and s3 = simstream(0.1)
and v in pivot([s1,s2,s3], [-1,-1,-1])
limit 10))

Verify the results by replaying the saved file.

//plot: Line Plot
csv:file_stream(sa_home() + "example_2.csv")
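
To make the role of the default values concrete, here is a small Python sketch of one plausible reading of the pivot step. It assumes that a vector holding the latest value of every stream is emitted each time any input stream produces an element, and that a stream that has not produced anything yet is represented by its default. This behaviour is an assumption for illustration, not a description of how pivot() is implemented; pivot_latest is a hypothetical name.

```python
import heapq


def pivot_latest(timed_streams, defaults):
    """Merge timed streams into vectors of the latest value per stream.

    timed_streams: list of lists of (time, value) pairs, each sorted by time.
    defaults: value reported for a stream before its first element arrives.
    """
    # Tag every element with the index of the stream it came from.
    tagged = (
        [(t, i, v) for t, v in stream]
        for i, stream in enumerate(timed_streams)
    )
    latest = list(defaults)
    # heapq.merge interleaves the already sorted streams in time order.
    for _, i, value in heapq.merge(*tagged):
        latest[i] = value
        yield list(latest)


if __name__ == "__main__":
    s1 = [(0.0, 1), (0.2, 2)]
    s2 = [(0.1, 10.0)]
    for vector in pivot_latest([s1, s2], [-1, -1]):
        print(vector)
```

Under these assumptions the first emitted vector still contains -1 in the second position, because the second stream has not produced a value yet.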

Data with timestamps

We may want to include time information in the saved data. This is easily done by saving the current timestamp string together with each result element. The expression utc_time() returns the current wall time as a UTC timestamp string. To save a vector with the timestamp as its first value, we call utc_time() each time we return a new element from the result.

csv:write_file(sa_home() + "example_3.csv",
(select Stream of [utc_time(), v1, v2]
from Vector v, Integer v1, Integer v2
where v in pivot([diota(0.1,1,10),diota(0.1,1,10)], [-1,-1])
and v = [v1,v2]))

Verify the results by replaying the saved file.

csv:file_stream(sa_home() + "example_3.csv")

You can use parse_iso_timestamp() to convert timestamp strings into time points when reading the recorded data. To illustrate this, we create a function that reads the recorded data in example_3.csv and outputs a stream of timestamped vectors.

create function replay_recorded_ts_stream()
-> Stream of Timeval of Vector
as select Stream of t
from Timeval t, Vector v,
Number v1, Number v2, Charstring tim
where v in csv:file_stream(sa_home() + "example_3.csv")
and [tim,v1,v2] = v
and t = ts(parse_iso_timestamp(tim),[v1,v2])

Run the function by executing the following query:

replay_recorded_ts_stream()

As you can see, the function outputs a stream of the timestamped vectors recorded in example_3.csv.
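
The same split of a recorded row into a time point and a value vector can be sketched in Python for offline inspection of the file. The sketch assumes that the first field is an ISO 8601 timestamp, possibly ending in "Z", and that the remaining fields are numbers. The exact string format produced by utc_time() is not shown in this guide, so treat the sample value as hypothetical; parse_row is a hypothetical helper name.

```python
from datetime import datetime, timezone


def parse_row(row):
    """Split a recorded row into (UTC datetime, list of values)."""
    stamp, *values = row
    # Older Python versions reject a trailing "Z", so normalise it first.
    t = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
    if t.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        t = t.replace(tzinfo=timezone.utc)
    return t, [float(x) for x in values]


if __name__ == "__main__":
    # Hypothetical sample row; real rows come from the recorded CSV file.
    t, values = parse_row(["2024-01-02T03:04:05.678Z", "1", "1"])
    print(t.isoformat(), values)
```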

GPS example

As another application example, we'll look at how to replay GPS data from a recorded file. We have provided the file gps.csv, which contains GPS data recorded during a drive through Uppsala, Sweden. Download it to your SA Engine home directory by running:

http:download_file(
"https://assets.streamanalyze.com/docs/guides/data/gps.csv",
{}, sa_home() + "gps.csv")

First we create a function that replays the data at a specified pace.

create function replay_gps_stream(Real pace)
-> Stream of Vector
as select Stream of v
from Vector v
where v in csv:file_stream(sa_home() + "gps.csv", "read", pace);
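
The idea of replaying rows at a fixed pace can be sketched outside SA Engine as a Python generator. This is a minimal illustration that assumes the pace is a delay in seconds applied before each row; whether csv:file_stream waits before or between rows is not stated in this guide. The function name replay and its sleep parameter are hypothetical.

```python
import time


def replay(rows, pace, sleep=time.sleep):
    """Yield rows one at a time, waiting `pace` seconds before each one.

    The `sleep` parameter exists so that the delay can be replaced
    in tests; by default the real time.sleep is used.
    """
    for row in rows:
        sleep(pace)
        yield row


if __name__ == "__main__":
    # Hypothetical sample rows standing in for the content of a CSV file.
    for row in replay([[1.0, 2.0], [3.0, 4.0]], pace=0.1):
        print(row)
```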

We then wrap the GPS values in GeoJSON records to be able to render the drive on a map.

create function geojson_stream(Number pace, Charstring name)
-> Stream of Record
as select Stream of geojson:point(p,
{"persistent": true,
"id": name,
"style": {"label": name}})
from Vector p
where p in replay_gps_stream(pace);

Finally, we start the stream with GeoJSON visualization activated to see the car drive around Uppsala.

//plot: Geo JSON
geojson_stream(0.5, "car-01")

The GPS positioning is a bit jittery at first due to low initial accuracy but improves as the car starts driving towards the city center.
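
For readers unfamiliar with GeoJSON, the following Python sketch builds a standard GeoJSON Feature for a single position. It is not the record that geojson:point returns, which this guide does not show; the property layout and the helper name point_feature are assumptions. Note that the GeoJSON standard stores coordinates in longitude, latitude order, and that the column order in gps.csv is not stated here.

```python
import json


def point_feature(lon, lat, name):
    """Build a GeoJSON Feature with a Point geometry (longitude first)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        # Assumed property layout, for illustration only.
        "properties": {"id": name, "label": name},
    }


if __name__ == "__main__":
    # Approximate coordinates of central Uppsala, used as sample input.
    print(json.dumps(point_feature(17.64, 59.86, "car-01"), indent=2))
```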

Conclusion

This guide has shown how to record data streams and how to replay and work with recorded streams. As a next step, we recommend reading the Advanced recording examples guide, where we try these concepts on a real edge device.