raw vs. combined vs. aggregated vs. live vs. historical measurements, WUT?
If you are that person, this post might help you.
Overview
Measurements are ingested from assets like batteries, chargers and heat pumps continuously - depending on the vendor, asset type and firmware, this can happen at any pace.
Measurements are then uploaded to the cloud in - currently - two-second intervals. For every asset, the latest available measurement since the last upload is used. This results in higher-resolution measurements being discarded and lower-resolution measurements being uploaded at whatever frequency the driver can read them from the asset.
That results in a lot of measurements being available. For the vast majority of use cases, this level of granularity is not required.
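To picture the "latest measurement per upload interval" behaviour described above, here is a minimal, purely illustrative sketch. The real gridBox implementation is not public, so this only models the effect on a single asset's stream of (timestamp, value) pairs:

```python
from datetime import timedelta

UPLOAD_INTERVAL = timedelta(seconds=2)  # current upload cadence, may change

def downsample_latest(measurements, start, end):
    """Keep only the latest measurement per upload interval.

    `measurements` is a list of (timestamp, value) tuples produced by an asset
    at its own pace; everything else within an interval is discarded, and an
    asset that reports less often than every 2s simply yields fewer points.
    """
    kept = []
    window_start = start
    while window_start < end:
        window_end = window_start + UPLOAD_INTERVAL
        in_window = [m for m in measurements if window_start <= m[0] < window_end]
        if in_window:
            kept.append(max(in_window, key=lambda m: m[0]))  # latest wins
        window_start = window_end
    return kept
```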
Each measurement is specific to an asset type, e.g. a meter measurement looks different from a heat pump measurement. This is what we refer to as raw measurements. Raw measurements are collected and retrieved on an asset level, so to get all raw measurements of a system, you’d need to query them for each asset attached to a certain gridBox individually.
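To make the difference concrete, here is a minimal sketch of what retrieving raw vs. live measurements looks like from a client’s perspective. The base URL, paths and payloads below are placeholders for illustration only, not our actual API routes - please refer to the API docs for those:

```python
import requests

API_BASE = "https://api.example.com"  # placeholder, not the real base URL
HEADERS = {"Authorization": "Bearer <token>"}

def raw_measurements_for_system(system_id, asset_ids):
    """Raw measurements are per asset, so one request per asset is needed."""
    results = {}
    for asset_id in asset_ids:
        # Placeholder path - check the API reference for the actual raw measurements route.
        resp = requests.get(
            f"{API_BASE}/systems/{system_id}/assets/{asset_id}/measurements/raw",
            headers=HEADERS,
            timeout=10,
        )
        resp.raise_for_status()
        results[asset_id] = resp.json()
    return results

def live_measurements_for_system(system_id):
    """Live measurements cover the whole system in a single request."""
    resp = requests.get(
        f"{API_BASE}/systems/{system_id}/measurements/live",  # placeholder path
        headers=HEADERS,
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()
```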
Measurements from assets attached to a gridBox are put together within a small time window. These are called live measurements. For details and caveats, please refer to the use case section.
Both live and historical measurements are aggregated measurements. Live measurements are aggregated with a window of at least two seconds, whereas historical measurement aggregations are available for window sizes of 15 minutes or one hour.
And combined measurements? The combined measurements endpoint returns both raw and energy management measurements combined into a single object. You can forget about them: they exist purely for convenience in specific use cases and do not offer additional information compared to the other endpoints.
So, what should I use for my use case?
The most current, high-resolution data we make available comes from the system live measurements endpoint. We use live measurements to show, e.g., the live system view in XENON. Be aware, however, that live measurements do not provide a (near) real-time view of the system’s state: they are the last known set of measurements over a window of five minutes. So even if an asset has not sent any measurement for up to five minutes, its last known measurement will still be included in the response. As an example, only five minutes after an asset stops sending measurements will the response reflect that; until then, the last measurement received will be included.
This implies you should not infer the on-/offline state of assets based on live measurements.
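One practical consequence: always look at the timestamps carried by the live measurements instead of assuming they describe "right now". A small, illustrative sketch (how the timestamp is obtained from the response is up to you; the schema here is an assumption):

```python
from datetime import datetime, timezone

def measurement_age_seconds(measurement_time: datetime) -> float:
    """How old a measurement from the live response actually is.

    `measurement_time` must be timezone-aware. Anything up to ~5 minutes is
    still "normal" for live measurements, so this value must not be used to
    decide whether an asset is online or offline.
    """
    return (datetime.now(timezone.utc) - measurement_time).total_seconds()
```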
You should not preemptively pull live measurements and store them for later consumption. Live measurements are tailored towards providing the most current view of energy usage to end users on demand, not for analytics and recommendation algorithms.
If you want to analyze or visualize longer periods of time, you should use historical measurements. Take care to request only the period you need and to specify a reasonable resolution given the period’s length - e.g. pulling one year of data at a 15-minute resolution would not be accepted.
XENON uses these data to provide the historical view.
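A minimal sketch of such a rule of thumb. The available window sizes (15 minutes and one hour) come from this post; the threshold and the resolution strings are our own illustrative assumptions, not API rules - please check the API docs for the actual request parameters:

```python
from datetime import timedelta

def pick_resolution(period: timedelta) -> str:
    """Pick a sensible historical aggregation resolution for a requested period."""
    if period <= timedelta(days=7):
        return "15m"  # short ranges: fine-grained view
    return "1h"       # longer ranges: coarser resolution keeps responses small

# e.g. a full year should be requested at 1h, not 15m
assert pick_resolution(timedelta(days=365)) == "1h"
```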
We now discourage using raw measurements via the API and might eventually turn off external access, if possible. There are several reasons for this:
- The format of every measurement depends on the appliance type, meaning your app would need to handle different formats when working with them. We have learned that lesson.
- When processing raw measurements, you need to account for measurements arriving late due to edge connectivity issues. The live and historical aggregations account for that.
- The cost of using them is high (both for gridX and your company) as a lot of data has to be retrieved, transferred and stored again. While this is especially true for raw measurements (due to retrieving them per asset instead of per system), retrieving live measurements at a high frequency also incurs high cost and needs to be considered carefully.
How not to use measurements
As you can see from the description above, the measurements we offer through the API are aggregated views; neither the delivery of every single measurement taken nor the delivery latency is guaranteed. Take this into account when designing solutions based on granular measurements.
In particular, if you are planning to implement use cases similar to the ones listed below, please get in touch first so we can find an idiomatic solution together.
Calculating your own aggregations
You want to create custom aggregations based on measurements, e.g. specifically filtered energy consumption or production over a given time period.
Why is it problematic?
As live and historical measurements are already pre-processed on the edge and during ingestion, and may arrive late, special care needs to be taken to prevent calculation errors from creeping in. Getting this right requires deeper insight into the measurement data pipeline. As we continue to optimize, build new features and scale, internals may change and thus invalidate prior assumptions.
Additionally, waiting for measurements to arrive and become available for querying, and then downloading them, takes time. Typically this happens rather quickly, but for live views of the system’s state it might still not be up to date enough, depending on your use case.
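As a purely illustrative example of the late-arrival pitfall (all numbers made up): an aggregation computed too early simply misses measurements that are still in transit, and recomputing it later yields a different result.

```python
# Energy readings for one 15-minute window (illustrative values in Wh).
# The last point was produced inside the window but only arrived ~2h later
# because the gridBox was temporarily offline.
arrived_on_time = [120.0, 118.5, 121.2]
arrived_late = [119.8]

naive_total = sum(arrived_on_time)                    # computed immediately: 359.7
correct_total = sum(arrived_on_time + arrived_late)   # computed after backfill: 479.5

# The two results differ, and nothing in the naive computation tells you that
# data was still missing - which is why we recommend relying on our aggregations.
```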
What to do instead?
Talk to us about the aggregation you need. We might just be able to provide the aggregations you require. What’s more, as we can skip transferring the data to your processing nodes, we can provide aggregations with significantly lower latencies. You also don’t need to concern yourselves with gridX internal updates that might influence the computation of aggregations.
Downloading all measurements continuously
You want to keep a copy of all measurements over time in your own data warehousing solution, probably for future analysis.
Why is it problematic?
Besides the issues with de-duplication, late-arriving measurements and pre-processing mentioned above, downloading all measurements continuously will cause significant data transfer and storage cost. Assuming you have 20k systems in the field and download all their measurements every 2s, you’ll end up loading and ingesting hundreds of millions of data points per day. Setting up infrastructure and architecture that can handle this is non-trivial and expensive, especially when the data is transferred between different cloud computing providers.
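The back-of-the-envelope calculation behind that number (the 20k systems are just the assumed fleet size from the example above):

```python
systems = 20_000
upload_interval_s = 2
uploads_per_day = 24 * 60 * 60 // upload_interval_s  # 43,200 uploads per system per day

data_points_per_day = systems * uploads_per_day
print(data_points_per_day)  # 864,000,000 - and that counts one data point per upload;
                            # with several assets per system it is a multiple of this
```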
Verifying completeness of data can also become an issue - if a gap occurs in the polling job on your end, e.g. due to updates, special care needs to be taken to reconcile what was already downloaded with the missing data.
What to do instead?
Talk to us about the analysis you want to run. There are ongoing efforts to provide deeper insights into data on gridX side, and we’d love to learn about your use cases and consider them when designing analytics features.
If obtaining a copy of measurements still is a hard requirement for you, we may provide a solution to retrieve measurement data in large batches (incurring data transfer and storage cost).
Taking asset controlling decisions directly
You want to control energy assets directly based on measurement data, e.g. charge a battery when PV production is high or feed energy back into the grid.
Why is it problematic?
Controlling energy resources needs to be approached in a holistic fashion, as you interact with a complex system that is steered by various optimization algorithms, both in the cloud and on the edge. As mentioned, the measurements retrieved through the API may not be sufficiently current to base control decisions on. Direct interaction with energy resources needs to be considered carefully so as not to run into ill side effects.
What to do instead?
This is why gridX offers higher-level APIs that integrate with internal optimization strategies and don’t require direct interaction with energy resources. Consider, e.g., our flex Module.
In any case, if you consider this too limiting, please reach out to discuss your requirements.
Details
Measurement Data Flow
When talking about measurements, we feel it’s helpful to have a rough mental model of their origin and dataflow. Assets (like wallboxes, batteries, heat pumps and PV systems) send measurements to the gridBox they are connected to. They do so at their own pace, and it’s vendor and asset type specific. Some assets send multiple measurements per second, others only every few seconds. Some of these measurements are taken into consideration locally (i.e. on the edge/the gridBox) to control assets directly.
In a certain interval, currently every two seconds, the gridBox collects the latest measurement from all assets and uploads it to the gridX cloud systems. If no uplink is available (due to the household experiencing network issues), measurements are cached and uploaded once a connection is re-established.
```mermaid
sequenceDiagram
    box Edge
        participant A1 as Heat Pump
        participant A2 as PV System
        participant A3 as Battery
        participant A4 as ...
        participant GB as gridBox
    end
    box Cloud
        participant I as Ingestion
        participant S as Storage
        participant A as API
        participant BE as App Backends
    end
    box Edge
        participant APP as Apps
        actor U as User
    end
    loop assets continuously send measurements
        A1 ->> GB: send measurement
        A4 ->> GB: send measurement
        A2 ->> GB: send measurement
        A1 ->> GB: send measurement
        A3 ->> GB: send measurement
        GB -->+ GB: cache measurement
    end
    loop measurements are uploaded in regular intervals
        GB ->>- I: Upload measurements
        I --> I: Preprocess<br>measurements
        I ->> S: Store measurement<br>timeseries
        S --> S: Aggregate measurements
    end
    activate U
    U ->>+ APP: View, e.g. statistics
    activate A
    alt Native App
        APP ->> A: Request aggregation
    else Web App
        APP ->> BE: Request aggregation
        BE ->> A: Request aggregation
    else Server side, non-interactive apps
        BE ->> A: Request aggregation
    end
    A ->> S: Load aggregation
    note over BE: ... potentially post-process, return all the way back
    S --> U: 
    deactivate A
    deactivate U
```
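A minimal, purely illustrative sketch of the edge-side part of this flow (the actual gridBox firmware is not public, so the names and structure below are assumptions):

```python
import time

UPLOAD_INTERVAL_S = 2  # current upload cadence mentioned above

latest_per_asset = {}  # asset_id -> most recent measurement received from that asset
upload_cache = []      # batches waiting for an uplink

def on_measurement(asset_id, measurement):
    """Assets push at their own pace; only the latest value per asset is kept."""
    latest_per_asset[asset_id] = measurement

def upload_tick(uplink_available: bool):
    """Called every 2 seconds: snapshot the latest measurements and try to upload.

    If the uplink is down, the batch stays in the cache and is sent once the
    connection is re-established.
    """
    upload_cache.append({"taken_at": time.time(), "measurements": dict(latest_per_asset)})
    if uplink_available:
        while upload_cache:
            send_to_cloud(upload_cache.pop(0))  # hypothetical upload call

def send_to_cloud(batch):
    ...  # placeholder for the actual upload to the ingestion service
```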
Timing and window sizes
This table summarizes the current state of measurement sending/ingestion timing and batching. It is meant to aid your understanding, but please bear in mind that it reflects the current state of internal implementations, is not guaranteed and might change without prior notice. We’ll keep this post updated, but there may be a certain lag. Give us a heads-up if you notice something is off.
| Action/Aggregation | Interval |
|---|---|
| Asset sends measurement to gridBox | Continuously, depending on the asset type and vendor. |
| gridBox sends measurements to the cloud | Every 2s, if there’s a network connection to the cloud. In case of a network partition, the measurements to be uploaded are cached on the gridBox and uploaded once the partition is healed. If the gridBox is about to run out of disk space, the cache is compacted by deleting every other measurement, starting with the oldest ones. This means that, should this happen, older measurements become sparser the longer the network partition persists while the gridBox remains low on disk space. |
| Live measurements | Measurements in >= 2s resolution are collected over a 5min window, returning the latest available measurements within that window. |
| Historical measurements | The period and resolution of historical measurement aggregations can - within certain boundaries - be defined when requesting them. Please refer to the API docs for details. Measurements arriving late due to edge connectivity issues will be included in the historical measurements with a delay of a bit more than two hours. |
| Raw measurements | Raw measurements that arrive at the ts-api with a minimal delay (time between measurement timestamp and time.Now) should be available to query within a few seconds. Raw measurements that arrive with a delay of more than 45 minutes bypass the hot storage, and it can take up to 45 minutes for them to become available. This is an exception, though, not the norm. |
Raw measurements | Raw measurements that arrive at the ts-api with a minimal delay (time between measurement timestamp and time.Now) should be available to query again withing a few seconds. Raw measurements that arrive with a delay of more than 45 minutes bypass the hot storage and it can take up to 45 minutes for them to be available. This is an exception, though, not the norm. |