More data collection on Raspberry Pi

In a previous post I started collecting my own time series data. Besides logging gas usage, I’ve also started logging my water consumption. For this I use a reflection sensor hooked up to the same Grove Hat as the reed switch. The Grove Hat, which sits as a “hat” on top of the Raspberry Pi, sees both sensors as digital pulses. The reflection sensor combines an LED and a photodiode and generates a high pulse every time the reflective aluminium part of the water meter passes under the sensor, which gives me a resolution of 1 liter per pulse. There’s a little potentiometer on the sensor: turn it down until the sensor stops responding even to the aluminium part of the water meter, then dial it up a little. This keeps the chance of false positive pulses low.
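To illustrate the pulse counting, here is a minimal Python sketch that counts low-to-high transitions in a list of digital readings. It is not the script I run on the Pi: the function name `count_pulses` and the sample readings are made up for the example, and it assumes the input starts low and has already been read from the Grove Hat as 0/1 values.

```python
LITERS_PER_PULSE = 1  # one pass of the reflective part equals one liter


def count_pulses(samples):
    """Count low-to-high transitions in a sequence of 0/1 readings."""
    pulses = 0
    previous = 0  # assumption: the signal is low before the first sample
    for level in samples:
        if level == 1 and previous == 0:
            pulses += 1
        previous = level
    return pulses


# Hypothetical readings: three separate high pulses of varying length.
readings = [0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1]
liters = count_pulses(readings) * LITERS_PER_PULSE
print(liters)
```

Counting edges rather than high samples matters because the sensor stays high for as long as the reflective part is under it, which can be many consecutive readings.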

Even though my script only sends data points when something changes (e.g. when water or gas is consumed), the data structure needs to reserve space for every time slot. The beauty of “Carbon” as a time series back-end is that the database never grows beyond storage limits you decide on in advance. Otherwise, depending on the resolution and how long you collect data, we would end up with quite a large database. The trade-off between storage size, level of detail and retention period needs to be figured out in advance, or you may start losing data after as little as a day.
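As a rough sketch of the "only send on change" idea, the snippet below formats a data point for Carbon's plaintext protocol (one `path value timestamp` line per point) and skips the network call when the value has not changed. The helper names, the `localhost` host and port 2003 (Carbon's usual plaintext port) are assumptions for the example, not a copy of my actual script.

```python
import socket
import time


def carbon_line(path, value, timestamp=None):
    """Format one data point as a Carbon plaintext-protocol line."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"


def send_if_changed(path, value, last_value, host="localhost", port=2003):
    """Send the value only when it differs from the last one sent.

    Returns the value that should be remembered as 'last sent'.
    Host and port are assumptions; adjust them to your Carbon setup.
    """
    if value == last_value:
        return last_value  # nothing changed, so nothing is sent
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(carbon_line(path, value).encode("ascii"))
    return value
```

A call such as `last = send_if_changed("meters.waterLiterCount", liters, last)` in the main loop would then only touch the network when the counter has moved.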

This happens in the file /etc/carbon/storage-schemas.conf:

[carbon]
pattern = ^carbon\.
retentions = 60:90d

[meters]
pattern = ^meters.*
retentions = 60s:1d,5m:7d,15m:31d,1h:5y

# Always have the catchall LAST
# otherwise it will match first and override everything else

[default_1min_for_1day]
pattern = .*
retentions = 60s:1d,5m:7d,1h:5y
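To get a feeling for the storage trade-off, the following Python sketch estimates how many points each archive holds and how large one Whisper file becomes for a given retentions string. The byte sizes (a 16-byte file header, 12 bytes per archive header, 12 bytes per data point) are my understanding of the Whisper file format, so treat the result as an estimate; the function names are made up for this example and the "mon" unit is not handled.

```python
# Seconds per unit; a year counts as 365 days.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400,
                "w": 604800, "y": 86400 * 365}


def to_seconds(token):
    """Convert '60', '60s', '5m', '1d' or '5y' into seconds."""
    if token.isdigit():
        return int(token)  # a bare number is taken as seconds
    return int(token[:-1]) * UNIT_SECONDS[token[-1]]


def points_per_archive(retentions):
    """Return the number of points in each archive of a retentions string."""
    counts = []
    for archive in retentions.split(","):
        precision, duration = archive.strip().split(":")
        seconds_per_point = to_seconds(precision)
        if duration.isdigit():
            counts.append(int(duration))  # bare number: a point count
        else:
            counts.append(to_seconds(duration) // seconds_per_point)
    return counts


def whisper_file_bytes(retentions):
    """Estimate the on-disk size of one Whisper file (assumed layout)."""
    points = points_per_archive(retentions)
    return 16 + 12 * len(points) + 12 * sum(points)


meters = "60s:1d,5m:7d,15m:31d,1h:5y"
print(points_per_archive(meters))
print(whisper_file_bytes(meters))
```

Under these assumptions the [meters] schema costs roughly 600 KB per metric, and that size is allocated up front and stays fixed no matter how long the Pi keeps logging.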

Also, since older measurements get aggregated into lower-resolution data, we need to specify how they should be aggregated. The default aggregation method is averaging, which is fine for many metrics. E.g. I’d like to know average temperatures at the hourly level, but for meters that report an absolute, cumulative “usage” value (instead of a rate of usage) it’s better to keep the last value of each interval.

This is configured as follows in /etc/carbon/storage-aggregation.conf:

[gasusage]
pattern = meters.gasM3
aggregationMethod = last

[waterusage]
pattern = meters.waterLiterCount
aggregationMethod = last

[default_average]
pattern = .*
xFilesFactor = 0
aggregationMethod = average
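A small Python sketch shows why "last" suits a cumulative meter better than "average". It mimics the two aggregation methods on one bucket of points; it is a simplification that ignores xFilesFactor, and the readings are invented for the example.

```python
def aggregate(points, method):
    """Collapse one bucket of points into a single value.

    None marks a time slot for which no data point was received.
    """
    known = [p for p in points if p is not None]
    if not known:
        return None
    if method == "last":
        return known[-1]
    if method == "average":
        return sum(known) / len(known)
    raise ValueError(f"unsupported method: {method}")


# Hypothetical water meter readings (liters) within one 5-minute bucket.
bucket = [1200, 1201, 1203, 1206, 1210]

print(aggregate(bucket, "last"))     # the meter reading at the end of the bucket
print(aggregate(bucket, "average"))  # a value the meter never actually showed
```

With "last", subtracting two consecutive aggregated points still gives the true consumption over that interval, whereas an averaged counter lags behind the real reading.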

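Carbon only applies these settings when it creates a new Whisper file, so existing files keep their old retention. Right after changing the configuration you should therefore either remove (a) or migrate (b) your existing measurements so that they pick up the new settings. You can do (a) as follows, which moves the old files out of the way rather than deleting them: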

cd /var/lib/graphite/whisper && mkdir /tmp/whisper-old && mv * /tmp/whisper-old

See this page of the Carbon documentation.
