Data Modeling with Redis - German Postcodes Geoposition

Following the previous, more general blog post about data modeling with Redis (data types visualization), this post covers a specific example: Storing and retrieving German postcodes ("Postleitzahlen", short PLZ) and especially their geopositions.

Update: There is also a follow-up blog post now available: Data Modeling with Redis - German Postcodes Advanced

Use Case

The main use case for this is to have a select drop-down on a website where you can enter a German postcode (e.g. 45130) or a so-called "Leitregion", lead region, (e.g. 45) and then receive an option list of all cities in this lead region. Upon selecting one option you can then query for:

  • a list (array) of all postcodes and city names within the given lead region
  • the geoposition of this postcode (array)
  • the (json) object with further information on this postcode

P-G-A_Sample_ScreenShot
An example client implementation can be found here.

Data Modeling Overview

Data Types:

The are three main Redis data types/structures for this specific use case:

  • one sorted set for all city names (with the same score for all set members)
  • one sorted set for all postcodes (with the member's geohash as score)
  • one Redis hash for each postcode (with all information on the postcode object)

Naming / Prefixes:

All keys are prefixed with de:postcodes. Postcode hashes are on top prefixed with object. The resulting keys then are:

de:postcodes:names
de:postcodes:positions
de:postcodes:object:45130
de:postcodes:object:45131
de:postcodes:object:<...>

Stack for the API Server:

A dockerized NodeJS API based on ExpressJS and Swagger-Node along with the node-redis module connecting to the Redis database.

P-G-A_API_Docs_SreenShot
The API documentation based on Swagger/openAPI can be found here:
https://p-g-a.daten-und-bass.io/docs/
The raw format is available here:
https://p-g-a.daten-und-bass.io/api/v1

Data Model Discussion

One might argue that there is indeed data redundancy here: The object already contains all information for a given postcode. However, the first two data types are used for two main purposes:

  • sorted set of city names: performance (out-of-the-box lexicographical ordering by postcode plus city name)
  • sorted set of postcodes: being able to use Redis built-in geoposition capabilities (such as GEOPOS)

So, with regard to the primary use case (a drop-down list with ordered postcode options as input and the geoposition for the selection as output) these two sorted sets have been added. In fact, you might also put it the other way round: For the "narrow" use case (just get me a geopostion, and only the geoposition of this postcode/PLZ) the two sorted sets would have sufficed. The postcode object "only" serves for being able to get further information beyond the geoposition (e.g. bounding-box or OpenStreetMap related info).

Besides the use case argument, data redundancy here in this case does not harm that much as the data, once imported to the db, remains fairly stable. User access is read-only and postcodes and their geopositions only change rarely.

Main caveat nevertheless: Modeled like this, user input has to be the postcode itself (in order to have the above mentioned performance advantage). The city or suburb name instead would require a scan along with a pattern in Redis which probably, even with Redis, is without good performance from the user perspective and resource consuming from the database perspective.

Data Model Details

The following section covers some details on the chosen model and provides sample redis-cli commands for storing and retrieving the data along with quotes from the official Redis command documentation. All these commands are also available in NodeJS in the npm module redis.

de:postcodes:names

One sorted set of postcodes with city names (all with the same score of 0 to have lexicographical ordering by postcode):

ZADD de:postcodes:names 0 "45130 Essen Rüttenscheid"

"Adds all the specified members with the specified scores to the sorted set stored at key".

ZRANGEBYLEX de:postcodes:names [45130 [45130\xff
ZRANGEBYLEX de:postcodes:names [45 [45\xff
# for all cities within the so-called lead region

"When all the elements in a sorted set are inserted with the same score, in order to force lexicographical ordering, this command returns all the elements in the sorted set at key with a value between min and max."

de:postcodes:positions

One sorted set of postcodes only (score calculated from latitude and longitude as geohash):

GEOADD de:postcodes:positions 51.43758188556734012 7.00937658548355103 "45130"

"Adds the specified geospatial items (latitude, longitude, name) to the specified key."

GEOPOS de:postcodes:positions "45130"

"Returns the positions (longitude,latitude) of all the specified members of the geospatial index represented by the sorted set at key."

de:postcodes:object:<plz>

One Redis hash for each postcode with further information on this postcode (e.g. bounding-box or OpenStreetMap related info):

HMSET de:postcodes:object:45130 osm_id "2073330" type "postcode" "boundingbox" "51.4325648,51.443045,6.9960612,7.0230201" <...>

"Sets the specified fields to their respective values in the hash stored at key."

HGETALL de:postcodes:object:45130

"Returns all fields and values of the hash stored at key."

Summary

This is one example for how you can use Redis data structures and go beyond "simple" key-value pairs.

For sure there may be other ways to achieve the use case requirements here but it proved pretty reliable, maintainable and with good performance - at least during developing, testing and now staging as an early beta version . Source code and live demos can be found in the next section.

Update: There is also a follow-up blog post now available: Data Modeling with Redis - German Postcodes Advanced

Further Information:

Redis:
https://redis.io/topics/data-types
https://redis.io/commands
https://www.npmjs.com/package/redis
https://daten-und-bass.io/blog/data-modeling-with-redis-data-types-visualization/

VueJS Component: vue-multiselect:
https://github.com/shentao/vue-multiselect

Postcode Geopos API: Live demos
https://daten-und-bass.remotedocker/stage/p-g-a/
https://p-g-a.daten-und-bass.io/docs/

Postcode Geopos API: Source code
https://github.com/daten-und-bass/p-g-a_sample
https://github.com/daten-und-bass/postcode-geopos-api