Member-only story

KNMI derived temperature with Inverse Distance Weighting in Oracle SQL

Johan Louwers
6 min readApr 11, 2024

--

heatmap example

In this article we will build upon an earlier post where we outlined how you can calculate the distance between geographical coordinates using the Haversine formula using pure Oracle SQL. In this example we used a dataset which contains the synoptic meteorological observations provided by the Royal Netherlands Meteorological Institute (KNMI) on a 10 minute basis via a combination of MQTT events and REST APIs.

In this dataset which we perstisted in an Oracle Autonomous database in Oracle Cloud we have a value named ‘ta’ which holds Ambient Temperature 1.5m 10 Min Average. This means that we have a Ambient Temperature datapoint on a per 10 minute basis for each individual weather station which was able to report this value back the Royal Netherlands Meteorological Institute (KNMI)

What we attempt to do in this post is to derive a ambient temperature for any given location (within the Netherlands) on any given moment (which falls within the persisted dataset). For this we will use a combination of the haversine formula and inverse distance weighting, a multivariate interpolation method for value estimation.

Select timeboxed KNMI data with SQL

We need to make sure we get a base dataset which is based upon a given time in an EPOCH timestamp format. As outlined in an earlier post we have a dataset in the Oracle database table named KNMI10. The below query will give all records from the table that are within a range of 30 minutes before and 30 minutes after the given EPOCH timestamp.

SELECT
lat,
lon,
stationname,
ta,
TO_DATE(time, 'YYYY-MM-DD HH24:MI:SS') AS observation_time
FROM
KNMI10
WHERE
(TO_DATE(time, 'YYYY-MM-DD HH24:MI:SS') >=
(TO_TIMESTAMP('1970-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') + (1712312248 / 86400) - (30 / (24*60)))
AND
TO_DATE(time, 'YYYY-MM-DD HH24:MI:SS') <=
(TO_TIMESTAMP('1970-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') + (1712312248 / 86400) + (30 / (24*60))))
AND TA IS NOT NULL
ORDER BY
observation_time;

This is required to make sure we do not have datapoints that are to far out of the requested time as they will become statistically irrelevant.

--

--

Johan Louwers
Johan Louwers

Written by Johan Louwers

Johan Louwers is a technology enthousiasts with a long background in supporting enterprises and startups alike as CTO, Chief Enterprise Architect and developer.

No responses yet