openhab2: check if alive

Posted by ads' corner on Tuesday, 2020-04-28
Posted in [Ansible][Openhab][Software]

As mentioned in my previous openHAB blog post: it seems to be too much to ask to have a reliable and working display in the kitchen.

Another problem I found is that the weather data occasionally does not update. Everything seems to work, no errors in the log, just no updated data. Of course this results in outdated weather info in the kitchen, and the threat that “someone” will buy a regular weather station and render my 7" Raspberry display useless.

After fighting with logs and debugging and not finding any real clue, I decided to tackle this problem from another angle, and regularly check if the weather data is up to date.

One piece of information I get from the data source (Open Weather Map) is the timestamp of the last update:

DateTime		homeCurrentTS			"Timestamp Current [%1$tY-%1$tm-%1$tdT%1$tH:%1$tM:%1$tS]"	<time>	{ channel="openweathermap:weather-and-forecast:home:current#time-stamp" }

Now “all” I have to do is read that timestamp and compare it with the current time. The first implementation I had running worked for a while, until it no longer did. There is a longer blog post here.

To read the current timestamp from openHAB, I’m using the command-line client in /usr/share/openhab2/runtime/bin/client. This requires providing the proper environment so the client can log in to Karaf and read the data:

#!/bin/bash

# load environment variables for openhab2
. /etc/profile.d/openhab2.sh

# get current timestamp
currentts=$(echo "smarthome:status homeCurrentTS" | /usr/share/openhab2/runtime/bin/client -l 0 -b)

After this step, I may or may not have the current timestamp. Verification is required:

if [ -z "$currentts" ];
then
    echo "Can't extract timestamp!"
    echo "openhab2 weather error: can't extract timestamp" | systemd-cat --identifier="openhab-check-alive" --priority="err"
    exit 1
fi

if [ "$currentts" = "NULL" ];
then
    echo "No timestamp in database!"
    echo "openhab2 weather error: no timestamp in database" | systemd-cat --identifier="openhab-check-alive" --priority="err"
    exit 1
fi

It’s quite possible that the client cannot read any data; that is caught by the first check. And if openHAB is currently starting or restarting, the client returns NULL (the string). In both cases I log an error and exit. I should probably investigate how often that happens, but so far it does not seem to be a problem.
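The two checks above could be folded into one function, together with a third check that date can actually parse the value. This is a minimal sketch, assuming GNU date; the name validate_ts and its return codes are made up for illustration and are not part of the original script:

```shell
#!/bin/bash

# Hypothetical helper: succeed only if $1 looks like a usable timestamp.
# Return codes are an assumption of this sketch:
#   1 = empty string, 2 = the literal string NULL, 3 = not parseable by date
validate_ts() {
    local ts="$1"
    [ -n "$ts" ] || return 1
    [ "$ts" != "NULL" ] || return 2
    # GNU date exits non-zero if it cannot parse the string
    date --date="$ts" '+%s' >/dev/null 2>&1 || return 3
    return 0
}
```

In the script this might be called as validate_ts "$currentts" || exit 1, with the systemd-cat logging kept next to the call.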

After passing both checks, I should have a valid timestamp in $currentts. That is converted into the Unix time (seconds since 1970) to make it easier to calculate differences. And the current time (in seconds) is required as well:

# transform into UNIX timestamp
unixts=$(date --date="$currentts" '+%s')

# current time
time_now=$(date '+%s')

# difference between timestamps
time_diff=$(expr "$time_now" - "$unixts")

$time_diff now holds the difference, in seconds, between the timestamp from the database and the current time. There is one small edge case: if the weather timestamp lies in the future, or the system time lags behind, the difference is negative.
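To make the future-timestamp case explicit, the calculation could be wrapped in a small function that clamps negative differences to zero and fails if the timestamp cannot be parsed. This is only a sketch under those assumptions; ts_age and its optional second argument (the current time in seconds) are invented for illustration, and GNU date is assumed:

```shell
#!/bin/bash

# Hypothetical helper: print the age of timestamp $1 in seconds.
# $2 is an optional "now" in Unix seconds; it defaults to the system time.
ts_age() {
    local ts="$1"
    local now="${2:-$(date '+%s')}"
    local unixts

    # fail instead of calculating with an empty value
    unixts=$(date --date="$ts" '+%s' 2>/dev/null) || return 1

    local diff=$(( now - unixts ))

    # a timestamp in the future counts as "just updated" (assumption of this sketch)
    if [ "$diff" -lt 0 ]; then
        diff=0
    fi

    echo "$diff"
}
```

A call like time_diff=$(ts_age "$currentts") || exit 1 would then replace the three separate steps.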

Finally let’s check if the time is too old, and trigger a service restart:

if [ "$time_diff" -ge "4000" ];
then
    echo "Timestamp is too old!"
    echo "Restarting openhab2"
    echo "openhab2 weather timestamp is too old ($time_diff seconds)" | systemd-cat --identifier="openhab-check-alive" --priority="warning"
    systemctl restart openhab2
fi
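If the limit should be adjustable without editing the script, the comparison could live in its own function. A rough sketch: is_stale and the MAX_AGE environment variable are hypothetical names, and the default of 4000 seconds is simply taken from the check above:

```shell
#!/bin/bash

# Hypothetical helper: succeed if the age in $1 (seconds) is at or above the limit.
# MAX_AGE is an assumed environment variable; without it the limit is 4000 seconds.
is_stale() {
    local age="$1"
    local max_age="${MAX_AGE:-4000}"
    [ "$age" -ge "$max_age" ]
}
```

The restart block would then start with if is_stale "$time_diff"; and a different limit could be tried from the shell with MAX_AGE=7200 in front of the script call.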

Last but not least, something must install this script on the openHAB Raspberry, and run this check once in a while. A cron job, installed by Ansible:

- name: Install check-alive script for openhab2
  copy:
    src: files/openhab-check-alive.sh
    dest: /root/openhab-check-alive.sh
    owner: root
    group: root
    mode: 0700
  register: openhab_check_alive

- name: openhab2 check-alive
  cron:
    name: "openhab2 check-alive"
    minute: "0,15,30,45"
    hour: "*"
    day: "*"
    weekday: "*"
    month: "*"
    state: present
    disabled: no
    user: "root"
    job: "/bin/bash /root/openhab-check-alive.sh"
