Michal Zimmermann Pieces of knowledge from the world of GIS.

Articles tagged with postgis tag

Routing with pgRouting: Catchment Area Calculation

Written on Apr 14, 2017 and marked as postgis, pgrouting | SQL

For a long time I’ve wanted to play with pgRouting and that time has finally come. Among many other routing functions there is one that caught my eye, called pgr_drivingdistance. As the documentation says, it returns the driving distance from a start node using Dijkstra algorithm. The aforementioned distance doesn’t need to be defined in Euclidean space (the real distance between two points), it might be calculated in units of time, slopeness etc. How to get it going?

Data

OSM will do as it always does. There is a tool called osm2pgrouting to help you load the data, the pure GDAL seems to be a better way to me though. Importing the downloaded data is trivial.

ogr2ogr -f "PostgreSQL" PG:"dbname=pgrouting active_schema=cze" \
    -s_srs EPSG:4326 \
    -t_srs EPSG:5514 \
    roads.shp \
    -nln roads \
    -lco GEOMETRY_NAME=the_geom \
    -lco FID=id \
    -gt 65000 \
    -nlt PROMOTE_TO_MULTI \
    -clipsrc 16.538 49.147 16.699 49.240

To route the network, it has to be properly noded. Although pgRouting comes with built-in pgr_nodenetwork, it didn’t seem to work very well. To node the network, use PostGIS ST_Node. Note this doesn’t consider bridges and tunnels.

CREATE TABLE cze.roads_noded AS
SELECT
    (ST_Dump(geom)).geom the_geom
FROM (
    SELECT
        ST_Node(geom) geom
    FROM (
        SELECT ST_Union(the_geom) geom
        FROM cze.roads
    ) a
) b;

After noding the network, all the information about speed limits and oneways is lost. If needed, it can be brought back with following:

CREATE INDEX ON cze.roads_noded USING gist(the_geom);
ALTER TABLE cze.roads_noded ADD COLUMN id SERIAL PRIMARY KEY;
ALTER TABLE cze.roads_noded ADD COLUMN maxspeed integer;

UPDATE cze.roads_noded
SET maxspeed = a.maxspeed
FROM (
    SELECT DISTINCT ON (rn.id)
        rn.id,
        r.maxspeed
    FROM cze.roads_noded rn
    JOIN cze.roads r ON (ST_Intersects(rn.the_geom, r.the_geom))
    ORDER BY rn.id, ST_Length(ST_Intersection(rn.the_geom, r.the_geom)) DESC
) a
WHERE cze.roads_noded.id = a.id;

With everything set, the topology can be built.

ALTER TABLE cze.roads_noded ADD COLUMN source integer;
ALTER TABLE cze.roads_noded ADD COLUMN target integer;
SELECT pgr_createTopology('cze.roads_noded', 1);

This function creates the cze.roads_noded_vertices_pgr that contains all the extracted nodes from the network.

As already mentioned, measures other than length can be used as a distance, I chose the time to get to a given node on foot.

ALTER TABLE cze.roads_noded ADD COLUMN cost_minutes integer;
UPDATE cze.roads_noded
SET cost_minutes = (ST_Length(the_geom) / 83.0)::integer; -- it takes average person one minute to walk 83 meters

UPDATE cze.roads_noded
SET cost_minutes = 1
WHERE cost_minutes = 0;

Routing

Now the interesting part. All the routing functions are built on what’s called inner queries that are expected to return a certain data structure with no geometry included. As I want to see the results in QGIS immediately, I had to use a simple anonymous PL/pgSQL block that writes polygonal catchment areas to a table (consider it a proof of concept, not the final solution).

DROP TABLE IF EXISTS cze.temp;
CREATE TABLE cze.temp AS
SELECT *
FROM cze.roads_noded_vertices_pgr ver
JOIN (
    SELECT *
    FROM pgr_drivingDistance(
        'SELECT id, source, target, cost_minutes as cost, cost_minutes as reverse_cost FROM cze.roads_noded',
        6686,
        10,
        true
    )
)dist ON ver.id = dist.node;

DO $$
DECLARE
    c integer;
BEGIN
    DROP TABLE IF EXISTS tmp;
    CREATE TABLE tmp (
        agg_cost integer,
        geom geometry(MULTIPOLYGON, 5514)
    );

    -- order by the biggest area so the polygons are not hidden beneath the bigger ones
    FOR c IN SELECT agg_cost FROM cze.temp GROUP BY agg_cost HAVING COUNT(1) > 3 ORDER BY 1 DESC LOOP
        RAISE INFO '%', c;
        INSERT INTO tmp (agg_cost, geom)
        SELECT
            c,
            ST_Multi(ST_SetSRID(pgr_pointsAsPolygon(
                'SELECT
                        temp.id::integer,
                        ST_X(temp.the_geom)::float AS x,
                        ST_Y(temp.the_geom)::float AS y
                FROM cze.temp
                WHERE agg_cost = ' || c
            ), 5514));
    END LOOP;
END$$;

Using pgr_pointsAsPolygon renders resulting nodes accessible in 10-minute walk in polygons, but weird looking ones. Not bad, could be better though.

How about seeing only nodes instead of polygons?

SELECT
    agg_cost,
    ST_PointN(geom, i)
FROM (
    SELECT
        agg_cost,
        ST_ExteriorRing((ST_Dump(geom)).geom) geom,
        generate_series(0,ST_NumPoints(ST_ExteriorRing((ST_Dump(geom)).geom))) i
    FROM tmp
) a;

Looks good, could be better though.

How about creating concave hulls from the extracted nodes?

SELECT
    agg_cost,
    ST_ConcaveHull(ST_Union(geom)) geom
FROM (
    SELECT
        agg_cost,
        ST_PointN(geom, i) geom
    FROM (
        SELECT
            agg_cost,<div class="text-center"><img src="{filename}/assets/routing-with-pgrouting-catchment-area-calculation/nodes1.png" width="70%" /></div>
            ST_ExteriorRing((ST_Dump(geom)).geom) geom,
            generate_series(0,ST_NumPoints(ST_ExteriorRing((ST_Dump(geom)).geom))) i
        FROM tmp
    ) a
) b
GROUP BY agg_cost
ORDER BY agg_cost DESC;

This one looks the best I guess.

Remarks

  • The documentation doesn’t help much.
  • I’d expect existing functions to return different data structures to be easy-to-use, actually.
  • LATERAL might be really handy with those inner queries, have to give it a shot in the future.
  • Pedestrians usually don’t follow the road network.
  • Bridges and tunnels might be an issue.

Subdivide and Conquer: Effective Spatial Indexes in PostGIS

Written on Jan 10, 2017 and marked as sql, postgresql, postgis | SQL

Spatial indexes are absolutely crucial part of any spatial database and - as I tend to say quite often - only a fool would try to query spatial data without building spatial indexes beforehand.

Spatial indexes are based on bounding box comparisons, which are generally very fast. Yet, there are situations when spatial indexes don’t help much (or they don’t help as much as they could, if you wish).

Bounding box comparisons are effective with lots of small bounding boxes rather then few large ones. Why? See the picture above. The curved line (imagine it’s a pipeline for example) clearly demonstrates when the spatial index/bounding box comparison might fall short of what you’d expect.

Once the bounding box gets really big, it intersects so many other geometries’ bounding boxes that the whole comparison starts to slow down.

Luckily, PostGIS 2.2 introduced a ST_Subdivide function that can lend a helping hand in here.

Until today, we delivered the parcel geometries into our real estate acquisition process system with the following query, that takes all the geometries from the req_geom table (pipelines, remember?) and intersects them with cadastral parcels. The second part of the query adds those parcels that haven’t been digitalized and were created manually by one of my workmates.

INSERT INTO requested_parcels (uid, par_id)
SELECT
    reqs.uid,
    b.id par_id
 FROM
    running_requests reqs
 JOIN
    req_geom a ON (reqs.uid = a.uid)
 JOIN
    pargeo b ON (ST_Intersects(a.geom, b.geom))
 UNION
 SELECT
    reqs.uid,
    a.idpar::numeric
 FROM
    running_requests reqs
 JOIN
     req_man a ON (reqs.uid = a.uid);

It’s a perfectly standard query that intersects several request geometries with ~20M parcels, nothing really fancy. Except that it takes 25 minutes to finish. Why? Pipelines, remember?

Yet, the query below takes only 30 seconds to finish (that’s a huge time saver considering that the whole process used to take ~40 minutes)! Why? Because the ST_Subdivide effectively shrinks the req_geom geometries until they have 50 vertices each at most. Such small geometries are perfect input for the bounding box comparison. Remember to call DISTINCT when using ST_Subdivide, you’d probably get duplicate parcel ids otherwise.

I also replaced the UNION with the WHERE NOT EXISTS expression, as it’s reasonable to assume that numeric ids comparison will be faster.

INSERT INTO requested_parcels (uid, par_id)
SELECT DISTINCT
    reqs.uid,
    b.id par_id
 FROM
    running_requests reqs
 JOIN
    (
        SELECT
            uid,
            ST_Subdivide(geom, 50) geom
        FROM
            req_geom
     ) a ON (reqs.uid = a.uid)
 JOIN
     pargeo b ON (ST_Intersects(a.geom, b.geom));

 INSERT INTO requested_parcels (uid, par_id)
 SELECT
     reqs.uid,
     a.idpar::numeric
 FROM
     running_requests reqs
 JOIN
     req_man a ON (reqs.uid = a.uid)
 WHERE NOT EXISTS (
     SELECT 1
     FROM pozadovane_parcely pp
     WHERE pp.par_id = a.idpar
  );

PostGIS Custom Function to Create Wind Rose

Written on Sep 1, 2016 and marked as postgis, postgresql, sql | SQL

I’ve come across the beautiful GIS StackExchange question recently, asking how to draw a wind rose within PostGIS.

It’s pretty easy to accomplish this with a custom PLPGSQL procedure below, that takes line geometry, number of sections and radius of the inner circle as parameters.

CREATE OR REPLACE FUNCTION ST_WindRose(
    line geometry,
    directions int,
    radius numeric
)
RETURNS TABLE (
    id integer,
    geom geometry(LINESTRING)
)
AS $ST_WindRose$
BEGIN
    IF directions % 2 <> 0 THEN
        RAISE EXCEPTION 'Odd number of directions found, please provide even number of directions instead.';
    END IF;

IF radius > ST_Length(line) THEN
    RAISE EXCEPTION 'Inner circle radius is bigger than the wind rose diameter, please make it smaller.';
END IF;

RETURN QUERY
WITH rose AS (
    SELECT
        ST_Rotate(_line, radians(360) / directions * dirs.id, ST_Centroid(_line)) _line
    FROM (
        SELECT line _line
    ) a
    CROSS JOIN (
        SELECT generate_series(1, directions / 2) id
    ) dirs
)
SELECT
    row_number() OVER ()::integer id,
    _line geom
FROM (
    SELECT _line FROM rose
    UNION ALL
    SELECT ST_ExteriorRing(ST_Buffer(ST_Centroid(line), radius, 30)) -- inner circle
    UNION ALL
    SELECT ST_ExteriorRing(ST_Buffer(ST_Centroid(line), ST_Length(line)/2, 30)) -- outer circle
) a;
END
$ST_WindRose$
LANGUAGE PLPGSQL;

Wind rose created with this function might look like the one below.

Run it as follows. The line parameter should be a simple straight line made of just two vertices.

SELECT * FROM ST_WindRose(ST_MakeLine(ST_MakePoint(0,0), ST_MakePoint(0,1)), 12, 0.01);

PostGIS Custom Function to Create Polygon from Centroid

Written on Aug 28, 2016 and marked as postgis, postgresql, sql | SQL

Needed to create a polygon from a point defining its size in both axes, here’s a little syntax sugar to make life easier.

CREATE OR REPLACE FUNCTION ST_PolygonFromCentroid(centroid geometry, xsize numeric, ysize numeric)
RETURNS geometry
AS $ST_PolygonFromCentroid$
SELECT ST_MakeEnvelope(
    ST_X(ST_Translate($1, -$2, -$3)),
    ST_Y(ST_Translate($1, -$2, -$3)),
    ST_X(ST_Translate($1, $2, $3)),
    ST_Y(ST_Translate($1, $2, $3))
);
$ST_PolygonFromCentroid$
LANGUAGE SQL;

Run it as:

SELECT ST_PolygonFromCentroid(ST_SetSRID(ST_MakePoint(13.912,50.633),4326), 1, 1);

Finding Polygons Lying across Other Polygons with PostGIS

Written on Aug 5, 2016 and marked as postgis, postgresql, sql | SQL

Doing overlays (ST_Intersection()) in PostGIS based on spatial relationships (ST_Intersects(), ST_Contains(), …) is so easy it is not something you get particularly excited about.

Today I faced a bit more interesting task: given two polygon layers, get me all the polygons from layer A such that they lie across the polygons from layer B and… a picture worth a thousand words, right?

I hope you got the idea, it is fairly simple:

  1. Intersect A (red, blue) with B (green)
  2. Subtract the result of previous from layer A
  3. Combine results from steps 1 and 2
  4. Keep polygon only if its id occurs more than twice (that means it went straight through the layer B)
  5. Profit!
WITH overlays AS (
/* nothing fancy here */
    SELECT
        A.ogc_fid a_id,
        B.ogc_fid b_id,
        ST_Intersection(A.geom, B.geom) geom,
        ST_Area(ST_Intersection(A.geom, B.geom) area_shared
    FROM A
    JOIN B ON (ST_Intersects(A.geom, B.geom)
),
diffs AS (
/* note this is a 1:1 relationship in ST_Difference */
/* a little hack is needed to prevent PostGIS from returning its usual difference mess */
    SELECT
        o.a_id,
        o.b_id,
        (ST_Dump(ST_Difference(ST_Buffer(A.geom, -0.0001), o.geom))).geom, -- ugly hack
        o.area_shared
    FROM overlays o
    JOIN A ON (o.a_id = A.id)
),

merged AS (
/* put those two result sets together */
    SELECT * FROM overlays
    UNION ALL
    SELECT * FROM diffs
),

merged_reduced AS (
/* get only those A polygons that consist of three parts at least for each intersection with B polygon */
  SELECT
    m.*
  FROM merged m
  JOIN (
    SELECT
      a_id,
      b_id
    FROM merged
    GROUP BY a_id, b_id
    HAVING COUNT(1) > 2
  ) a ON (a.a_id = m.a_id AND a.b_id = m.b_id)
)
/* do as you wish with the result */
SELECT *
FROM merged_reduced;

In my case, centerlines of layer B were also included and their length inside each intersection was used to divide the area of the smallest part with. It was fun, actually.

Dead Simple Random Points in Polygons with PostGIS

Written on Aug 3, 2016 and marked as postgis, postgresql, sql | SQL

Since PostgreSQL 9.3 there has been a handy little keyword called LATERAL, which - combined with JOIN - might rock your GIS world in a second. To keep it simple, a LATERAL JOIN enables a subquery in the FROM part of a query to reference columns from preceding expressions in the FROM list. What the heck?

Imagine that not so unusual request to generate random points in polygons (something I needed to do today). Do it automatically without your favorite piece of desktop GIS software.

It is pretty easy using LATERAL JOIN:

SELECT
    a.id,
    b.*
FROM (
    VALUES(
        1,
        ST_SetSRID(
            ST_GeomFromText(
                'POLYGON((0 0, -1 0, -1 -1, 0 -1, 0 0))'
            ),
        4326)
    )
    UNION ALL
    VALUES(
        2,
        ST_SetSRID(
            ST_GeomFromText(
                'POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))'
            ),
        4326)
    )
) a(id, geom)
CROSS JOIN LATERAL (
    SELECT ST_SetSRID(ST_MakePoint(tmp.x, tmp.y), 4326) geom
    FROM (
        SELECT
            random() * (ST_XMax(a.geom) - ST_XMin(a.geom)) + ST_XMin(a.geom) x,
            random() * (ST_YMax(a.geom) - ST_YMin(a.geom)) + ST_YMin(a.geom) y
        FROM generate_series(0,200)
    ) tmp
) b;

What actually happened over there? If you want to put points inside polygons, you need… polygons. We will do just fine with two of them created inside this query:

VALUES(
    1,
    ST_SetSRID(
        ST_GeomFromText(
            'POLYGON((0 0, -1 0, -1 -1, 0 -1, 0 0))'
        ),
    4326)
)
UNION ALL
VALUES(
    2,
    ST_SetSRID(
        ST_GeomFromText(
            'POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))'
        ),
    4326)
)

All the magic happens inside the LATERAL JOIN part of the query:

CROSS JOIN LATERAL (
    SELECT ST_SetSRID(ST_MakePoint(tmp.x, tmp.y), 4326) geom
    FROM (
        SELECT
            random() * (ST_XMax(a.geom) - ST_XMin(a.geom)) + ST_XMin(a.geom) x,
            random() * (ST_YMax(a.geom) - ST_YMin(a.geom)) + ST_YMin(a.geom) y
        FROM generate_series(0,200)
    ) tmp
) b;

The inner SELECT calculates random points based on the extent of the polygon. Note it directly calls a.geom, a value that comes from the previous SELECT! The LATERAL JOIN part is thus run for every row of the previous SELECT statement inside FROM part of the query. This means it will return 201 points for each of the two polygons (run the query inside QGIS to see the result).

Note all the points fall inside the polygons by accident, because they are square. Otherwise a ST_Contains or ST_Within should be used inside the outermost WHERE query to filter outliers. This part could use some tweaking.

Liftago Open Dataset Infographics

Written on Dec 21, 2015 and marked as python, postgis, svg, visualization | data

Liftago (the Czech analogy of Uber) has recently released a sample of its data covering four weeks of driver/pasenger interactions.

Have a look at my infographics created with PostGIS, Inkscape, Python and pygal.

PostGIS Case Study: Vozejkmap Open Data (Part III)

Written on Nov 14, 2015 and marked as postgresql, postgis, leaflet, javascript | web maps

After a while I got back to my PostGIS open data case study. Last time I left it with clustering implemented, looking forward to incorporate Turf.js in the future. And the future is now. The code is still available on GitHub.

Subgroup clustering

Vozejkmap data is categorized based on the place type (banks, parking lots, pubs, …). One of the core features of map showing such data should be the easy way to turn these categories on and off.

As far as I know, it’s not trivial to do this with the standard Leaflet library. Extending L.control.layers and implement its addOverlay, removeOverlay methods on your own might be the way to add needed behavior. Fortunately, there’s an easier option thanks to Leaflet.FeatureGroup.SubGroup that can handle such use case and is really straightforward. See the code below.

cluster = L.markerClusterGroup({
    chunkedLoading: true,
    chunkInterval: 500
});

cluster.addTo(map);

...

for (var category in categories) {
    // just use L.featureGroup.subGroup instead of L.layerGroup or L.featureGroup
    overlays[my.Style.set(category).type] = L.featureGroup.subGroup(cluster, categories[category]);
}

mapkey = L.control.layers(null, overlays).addTo(map);

With this piece of code you get a map key with checkboxes for all the categories, yet they’re still kept in the single cluster on the map. Brilliant!

Using Turf.js for analysis

Turf is one of those libraries I get amazed easily with, spending a week trying to find a use case, finally putting it aside with “I’ll get back to it later”. I usually don’t. This time it’s different.

I use Turf to get the nearest neighbor for any marker on click. My first try ended up with the same marker being the result as it was a member of a feature collection passed to turf.nearest() method. After snooping around the docs I found turf.remove() method that can filter GeoJSON based on key-value pair.

Another handy function is turf.distance() that gives you distance between two points. The code below adds an information about the nearest point and its distance into the popup.

// data is a geojson feature collection
json = L.geoJson(data, {
    onEachFeature: function(feature, layer) {
        layer.on("click", function(e) {
            var nearest = turf.nearest(layer.toGeoJSON(), turf.remove(data, "title", feature.properties.title)),
                distance = turf.distance(layer.toGeoJSON(), nearest, "kilometers").toPrecision(2),
                popup = L.popup({offset: [0, -35]}).setLatLng(e.latlng),
                content = L.Util.template(
                    "<h1>{title}</h1><p>{description}</p> \
                    <p>Nejbližší bod: {nearest} je {distance} km daleko.</p>", {
                    title: feature.properties.title,
                    description: feature.properties.description,
                    nearest: nearest.properties.title,
                    distance: distance
                });

            popup.setContent(content);
            popup.openOn(map);

            ...

From what I’ve tried so far, Turf seems to be incredibly fast and easy to use. I’ll try to find the nearest point for any of the categories, that could take Turf some time.

Update

Turf is blazing fast! I’ve implemented nearest point for each of the categories and it gets done in a blink of an eye. Some screenshots below. Geolocation implemented as well.

You can locate the point easily.

You can hide the infobox.

You can jump to any of the nearest places.

Installing PostGIS 2.2 with SFCGAL on Ubuntu-based OS

Written on Oct 29, 2015 and marked as postgresql, postgis, linux | development

I’ve seen a bunch of questions on GIS StackExchange recently related to SFCGAL extension for PostGIS 2.2. Great news are it can be installed with one simple query CREATE EXTENSION postgis_sfcgal. Not so great news are you have to compile it from source for Ubuntu-based OS (14.04) as recent versions of required packages are not available in the repositories.

I tested my solution on elementary OS 0.3.1 based on Ubuntu 14.04. And it works! It installs PostgreSQL 9.4 from repositories together with GDAL and GEOS and some other libs PostGIS depends on. PostGIS itself, CGAL, Boost, MPFR and GMP are built from source.

Here comes the code (commented where needed).

sudo -i
echo "deb http://apt.postgresql.org/pub/repos/apt/ trusty-pgdg main" | tee -a /etc/apt/sources.list
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
apt-get update
apt-get install -y postgresql-9.4 \
    postgresql-client-9.4 \
    postgresql-contrib-9.4 \
    libpq-dev \
    postgresql-server-dev-9.4 \
    build-essential \
    libgeos-c1 \
    libgdal-dev \
    libproj-dev \
    libjson0-dev \
    libxml2-dev \
    libxml2-utils \
    xsltproc \
    docbook-xsl \
    docbook-mathml \
    cmake \
    gcc \
    m4 \
    icu-devtools

exit # leave root otherwise postgis will choke

cd /tmp
touch download.txt
cat <<EOT >> download.txt
https://gmplib.org/download/gmp/gmp-6.0.0a.tar.bz2
https://github.com/Oslandia/SFCGAL/archive/v1.2.0.tar.gz
http://www.mpfr.org/mpfr-current/mpfr-3.1.3.tar.gz
http://downloads.sourceforge.net/project/boost/boost/1.59.0/boost_1_59_0.tar.gz
https://github.com/CGAL/cgal/archive/releases/CGAL-4.6.3.tar.gz
http://download.osgeo.org/postgis/source/postgis-2.2.0.tar.gz

EOT

cat download.txt | xargs -n 1 -P 8 wget # make wget a little bit faster

tar xjf gmp-6.0.0a.tar.bz2
tar xzf mpfr-3.1.3.tar.gz
tar xzf v1.2.0.tar.gz
tar xzf boost_1_59_0.tar.gz
tar xzf CGAL-4.6.3.tar.gz
tar xzf postgis-2.2.0.tar.gz

CORES=$(nproc)

if [[ $CORES > 1 ]]; then
    CORES=$(expr $CORES - 1) # be nice to your PC
fi

cd gmp-6.0.0
./configure && make -j $CORES && sudo make -j $CORES install

cd ..
cd mpfr-3.1.3
./configure && make -j $CORES && sudo make -j $CORES install

cd ..
cd boost_1_59_0
./bootstrap.sh --prefix=/usr/local --with-libraries=all && sudo ./b2 install # there might be some warnings along the way, don't panic
echo "/usr/local/lib" | sudo tee /etc/ld.so.conf.d/boost.conf
sudo ldconfig

cd ..
cd cgal-releases-CGAL-4.6.3
cmake . && make -j $CORES && sudo make -j $CORES install

cd ..
cd SFCGAL-1.2.0/
cmake . && make -j $CORES && sudo make -j $CORES install

cd ..
cd postgis-2.2.0
./configure \
    --with-geosconfig=/usr/bin/geos-config \
    --with-xml2config=/usr/bin/xml2-config \
    --with-projdir=/usr/share/proj \
    --with-libiconv=/usr/bin \
    --with-jsondir=/usr/include/json \
    --with-gdalconfig=/usr/bin/gdal-config \
    --with-raster \
    --with-topology \
    --with-sfcgal=/usr/local/bin/sfcgal-config && \
make && make cheatsheets && sudo make install # deliberately one CPU only

sudo -u postgres psql
sudo -u postgres createdb spatial_template
sudo -u postgres psql -d spatial_template -c "CREATE EXTENSION postgis;"
sudo -u postgres psql -d spatial_template -c "CREATE EXTENSION postgis_topology;"
sudo -u postgres psql -d spatial_template -c "CREATE EXTENSION postgis_sfcgal;"
sudo -u postgres psql -d spatial_template -c "SELECT postgis_full_version();"

Automated Map Creation With QGIS, PostGIS, Python, SVG and ImageMagick

Written on Aug 9, 2015 and marked as qgis, postgis, python, svg, linux | automation

As mentioned in QGIS Tips For Collaborative Mapping we’re in the middle of crop evaluation project at CleverMaps.

With the QGIS workflow up and running, I’ve been focused on different QGIS related task: automatic map generation for land blocks that meet certain conditions. The logic behind identifying such land blocks is as follows:

  • if the original area and the measured one differ more than 0.5 % or
  • number of declared crops differs from number of crops identified or
  • at least one parcel in the land block was given a certain error code

Let’s assume that with several lines of SQL code we can store these mentioned above in a table called land_blocks with geometries being the result of calling ST_Union() over parcels for each land block.

Map composition

Every map should feature following layers:

  • land blocks (remember the land_blocks table?) - labeled with ID, yellowish borders, no fill
  • land parcels - that’s my source layer - labeled with letters, blue borders, no fill
  • other layers
  • HR, VHR, NIR imagery, orthophoto - served via WMS

Labels should be visible only for the featured land block (both for the land parcels and the land block itself. The whole map scales dynamically, showing small land blocks zoomed in and the large ones zoomed out.

Every map features additional items:

  • dynamic list of subsidies farmer asks for - showing both measured and declared area
  • dynamic list of land parcels with their areas and error codes
  • scalebar
  • map key
  • logos

Atlas creation

Now with requirements defined, let’s create some maps. It’s incredibly easy to generate a series of maps with QGIS atlas options.

Atlas generation settings

Coverage layer is presumably the only thing you really need - as the name suggests, it covers your area of interest. One map will be created for each feature in this layer, unless you decide to use some filtering - which I did.

Output filenames can be tweaked to your needs, here’s what such a function might look like. If there is a slash in the land block ID (XXXXXXX/Y), the filename is set to USER-ID_XXXXXXX-00Y_M_00, USER-ID_XXXXXXX-000_M_00 otherwise.

CASE WHEN strpos(attribute($atlasfeature, 'kod_pb'), '/') > -1
    THEN
        ji || '_' ||
        substr(
            attribute($atlasfeature, 'kod_pb'), 0,
            strpos(attribute($atlasfeature, 'kod_pb'), '/')+1 -- slash position
        ) || '-' ||
        lpad(
            substr(
                attribute($atlasfeature, 'kod_pb'),
                strpos(attribute($atlasfeature, 'kod_pb'), '/') + 2,
                length(attribute($atlasfeature, 'kod_pb'))
            ),
        3, '0') || '_M_00'
    ELSE
        ji || '_' || attribute($atlasfeature, 'kod_pb') || '-000_M_00'
END

Map scale & variable substitutions

Different land blocks are of different sizes, thus needing different scales to fit in the map. Again, QGIS handles this might-become-a-nightmare-pretty-easily issue with a single click. You can define the scale as:

  • margin around feature: percentage of the space displayed around
  • predefined scale (best fit): my choice, sometimes it doesn’t display the entire land block though
  • fixed scale: sets the scale the same for all the maps

With these settings, I get a map similar to the one below. Notice two interesting things:

  • Scale uses decimal places, which I find a huge failure. Has anyone ever seen a map with such scale? The worst is there is no easy way to hide these, or at least I didn’t find one.
  • You can see a bunch of [something in the brackets] notations. These will be substituted with actual values during the atlas generation. Showing land block ID with a preceeding label is as easy as [%concat('PB: ', "kod_pb")%] (mind the percentage sign).

Styling the map using atlas features

Atlas features are a great help for map customization. As mentioned earlier, in my case, only one land block label per map should be visible. That can be achieved with a simple label dialog expression:

CASE
    WHEN $id = $atlasfeatureid
    THEN "kod_pb"
END

QGIS keeps reference to each coverage layer feature ID during atlas generation, so you can use it for comparison. The best part is you can use IDs with any layer you need:

CASE
    WHEN attribute($atlasfeature, 'kod_pb') = "kod_pb"
    THEN "kod_zp"
END

With this simple expression, I get labels only for those land parcels that are part of the mapped land block. Even the layer style can be controlled with atlas feature. Land parcels inside the land block have blue borders, the rest is yellowish, remember? It’s a piece of cake with rule-based styling.

Atlas generation

When you’re set, atlas can be created with a single button. I used SVG as an output format to easily manipulate the map content afterwards. The resulting maps look like the one in the first picture without the text in the red rectangle. A Python script appends this to each map afterwards.

Roughly speaking, generating 300 maps takes an hour or so, I guess that depends on the map complexity and hardware though.

Adding textual content

SVG output is just plain old XML that you can edit by hand or by script. A simple Python script, part of map post-processing, loads SVG from the database and adds it to the left pane of each map.

SELECT
      ji,
      kod_pb,
      concat(
            '<g fill="none" stroke="#000000" stroke-opacity="1" stroke-width="1"
                  stroke-linecap="square" stroke-linejoin="bevel" transform="matrix(1.18081,0,0,1.18081,270.0,550.0)"
                  font-family="Droid Sans" font-size="35" font-style="normal">',
            kultura, vymery, vymery_hodnoty,
            vcs_titul, vcs_brk, vcs_brs, vcs_chmel,
            vcs_zvv, vcs_zv, vcs_ovv, vcs_ov, vcs_cur, vcs_bip,
            lfa, lfa_h1, lfa_h2, lfa_h3,
            lfa_h4, lfa_h5, lfa_oa, lfa_ob, lfa_s,
            natura, aeo_eafrd_text, dv_aeo_eafrd_a1,
            dv_aeo_eafrd_a2o, dv_aeo_eafrd_a2v, dv_aeo_eafrd_b1,
            dv_aeo_eafrd_b2, dv_aeo_eafrd_b3, dv_aeo_eafrd_b4,
            dv_aeo_eafrd_b5, dv_aeo_eafrd_b6, dv_aeo_eafrd_b7,
            dv_aeo_eafrd_b8, dv_aeo_eafrd_b9, dv_aeo_eafrd_c1,
            dv_aeo_eafrd_c3, aeko_text, dv_aeko_a, dv_aeko_b, dv_aeko_c,
            dv_aeko_d1, dv_aeko_d2, dv_aeko_d3, dv_aeko_d4, dv_aeko_d5,
            dv_aeko_d6, dv_aeko_d7, dv_aeko_d8, dv_aeko_d9, dv_aeko_d10,
            dv_aeko_e, dv_aeko_f, ez, dzes_text, rep, obi, seop, meop, pbz, dzes7,
            '</g>'
      ) popis
FROM (...);

Each column from the previous query is a result of SELECT similar to the one below.

SELECT concat('<text fill="#000000" fill-opacity="1" stroke="none">BrK: ', dv_brk, ' ha / ', "MV_BRK", ' ha;', kod_dpz, '</text>') vcs_brk

The transform="matrix(1.18081,0,0,1.18081,270.0,550.0) part puts the text on the right spot. Great finding about SVG is that it places each <text> element on the new line, so you don’t need to deal with calculating the position in your script.

Scale adjustment is done with a dirty lambda function.

content = re.sub(r">(\d{1,3}\.\d{3,5})</text>", lambda m :">    " + str(int(round(float(m.group(1))))) + "</text>", old_map.read())

SVG to JPEG conversion

We deliver maps as JPEG files with 150 DPI on A4 paper format. ImageMagick converts the formats with a simple shell command:

convert -density 150 -resize 1753x1240 input.svg output.jpg

Conclusion

I created pretty efficient semi-automated workflow using several open source technologies that saves me a lot of work. Although QGIS has some minor pet peeves (scale with decimal places, not showing the entire feature, not substituting variables at times), it definitely makes boring map creation quite amusing. The more I work with big data / on big tasks, the more I find Linux a must-have.

The whole process was done with QGIS 2.10, PostGIS 2.1, PostgreSQL 9.3, Python 2.7, ImageMagick 6.7.