Using Solr with the Postgres database

YellowOSM's vision is to make Austrian businesses and companies searchable and to show them on a map together with detailed information such as address and opening hours. What can be explained in one sentence nevertheless requires a lot of technical effort behind the scenes. In simplified terms, the system architecture can be summarized as follows:

OpenStreetMap => database => search server => website

The arrow => indicates a data flow from left to right. The basis for YellowOSM is OpenStreetMap. The data from OpenStreetMap is freely available in a variety of formats - to begin with, we limit ourselves to the data for Austria and import it into our database. Thankfully, Geofabrik in Karlsruhe provides a suitable data extract for this purpose. We import this data into our local PostgreSQL database using osm2pgsql, a tool from the OpenStreetMap project. The compressed Geofabrik dump is about half a gigabyte in size - imported into PostgreSQL, it grows to several gigabytes, and the import takes less than an hour.
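To make the import step more concrete, here is a minimal sketch of how the osm2pgsql call could be scripted. The file name austria-latest.osm.pbf, the database name gis and the choice of flags are assumptions for illustration, not necessarily our exact setup.

```python
import subprocess


def build_import_command(pbf_path, database):
    """Assemble an osm2pgsql call as an argument list (nothing is executed here)."""
    return [
        "osm2pgsql",
        "--create",    # start from scratch instead of appending to existing tables
        "--slim",      # keep temporary data in the database rather than in RAM
        "--hstore",    # store tags without a dedicated column in an hstore column
        "--database", database,
        pbf_path,
    ]


def run_import(pbf_path, database):
    """Run the import; raises CalledProcessError if osm2pgsql exits with an error."""
    subprocess.run(build_import_command(pbf_path, database), check=True)


# Assumed names: adjust the extract path and database to your own environment.
command = build_import_command("austria-latest.osm.pbf", "gis")
print(" ".join(command))
```

Building the command as a list rather than as one string avoids shell quoting problems if the path contains spaces.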

Now that the data is in our system, we can make it accessible to the website. In principle, the website could access the PostgreSQL database directly (or with a little program code in between) - but access is much more efficient with a dedicated search server. There are two major open source search servers, Solr and Elasticsearch, both of which are widely used. Both are built on Apache Lucene, a library for information retrieval (searching and ranking search results) that has been established for almost 20 years. Both search servers offer efficient search for geodata within a geographic area (a so-called bounding box), for example within the currently displayed map section. Elasticsearch offers slightly better performance and is used more often than Solr - good reasons to choose it.
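A bounding box search of this kind can be expressed with Elasticsearch's geo_bounding_box query. The following is a sketch of such a request body; the field name location (which would have to be mapped as geo_point) and the coordinates roughly around Graz are assumptions for illustration.

```python
import json


def bounding_box_query(west, south, east, north, field="location"):
    """Build a request body that matches documents inside the given map section.

    `field` is a hypothetical geo_point field; the corners follow the
    top_left / bottom_right convention of the geo_bounding_box query.
    """
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_bounding_box": {
                        field: {
                            "top_left": {"lat": north, "lon": west},
                            "bottom_right": {"lat": south, "lon": east},
                        }
                    }
                }
            }
        }
    }


# Illustrative map section roughly covering Graz (coordinates are approximate).
body = bounding_box_query(west=15.35, south=47.01, east=15.53, north=47.13)
print(json.dumps(body, indent=2))
```

The box sits in a filter clause because it only includes or excludes documents and does not need to contribute to the ranking.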

The task now is to feed Elasticsearch with data - this is done with a handful of database queries that extract the relevant data from the database and convert it into JSON for import into Elasticsearch. We mainly export the "nodes" from the OpenStreetMap data, which represent points (for example a point that stands for a restaurant). In addition, polygons are used - for example in the case of the "Apotheke zu Maria Trost" pharmacy in Graz, which is not mapped as a point in OpenStreetMap but as a building outline (a polygon).
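As a rough sketch of this export step, the following code turns database rows into the newline-delimited JSON that the Elasticsearch bulk API expects. The index name yellowosm, the document fields and the idea of reducing a polygon to its centroid are assumptions for illustration. The query relies on the default osm2pgsql table planet_osm_polygon and on PostGIS functions.

```python
import json

# Hypothetical export query: reduces each polygon to its centroid and converts
# it to latitude/longitude (EPSG:4326). Shown for illustration, not executed here.
POLYGON_QUERY = """
SELECT osm_id, name, amenity,
       ST_Y(ST_Transform(ST_Centroid(way), 4326)) AS lat,
       ST_X(ST_Transform(ST_Centroid(way), 4326)) AS lon
FROM planet_osm_polygon
WHERE amenity IS NOT NULL AND name IS NOT NULL
"""


def to_bulk_ndjson(rows, index="yellowosm", id_prefix="node"):
    """Convert rows (dicts with osm_id, name, amenity, lat, lon) to bulk format.

    Each document becomes two lines: an action line naming the target index
    and document id, followed by the document itself.
    """
    lines = []
    for row in rows:
        action = {"index": {"_index": index, "_id": f"{id_prefix}-{row['osm_id']}"}}
        document = {
            "name": row["name"],
            "amenity": row["amenity"],
            "location": {"lat": row["lat"], "lon": row["lon"]},
        }
        lines.append(json.dumps(action, ensure_ascii=False))
        lines.append(json.dumps(document, ensure_ascii=False))
    # The bulk API requires every line, including the last, to end with a newline.
    return "".join(line + "\n" for line in lines)
```

The id prefix keeps points and polygons apart, since both come from separate tables and are numbered independently in OpenStreetMap.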

The last piece of the puzzle is the front end - here, as is currently common almost everywhere, JavaScript is used. There are currently three major frameworks for JavaScript: React, Vue and Angular. All three offer a solid basis for modern front-end development, which is why we chose the one we have the most experience with: Angular, which we set up with Angular CLI.

We use OpenLayers as the mapping library, which, in contrast to Leaflet, offers more options for customization, even if the documentation - ahem - leaves room for improvement.

So not much stands in the way of the first prototype - but more about that in the next blog post!