Articles in the GIS category

Fighting Raster GeoPackage with GDAL

As I’m still running Ubuntu 16.04 based Linux Mint, I have no access to GDAL 2.x repositories (except for ubuntugis, that I really don’t like to use). Provided with a GeoPackage raster file recently, I had to find a way to load it into QGIS, somehow. The solution is simple: Docker with gdal_translate.

Preparing the Docker container

I like using Docker for experiments that might leave the OS in an unexpected state (which is exactly what happens to me with ubuntugis repository whenever I use it. That’s why I don’t anymore.). A very simple Dockerfile keeps the troubles away from you.

FROM ubuntu:17.04
RUN apt update
RUN apt install -y gdal-bin

cd into the folder and build the image with docker build -t gdal .. Once ready, summon the daemon, run the container, mount the GeoPackage file to the container directory and you’re ready to rock.

docker run -v /path/to/geopackage:/home/ -it gdal

Raster GeoPackage to GeoTiff translation

With the container running, the raster GeoPackage to GeoTiff translation can be done easily with gdal_translate. Note I chose to cut the source file into tiles, because the gdal_translate was choking about the resulting size.

#!/bin/bash
SIZE=10000
ULX=-630000
ULY=-1135450
LRX=-560000
LRY=-1172479
COUNTER_X=0
COUNTER_Y=0

while [[ $ULX -lt $LRX ]]
do
    while [[ $ULY -gt $LRY ]]
    do
        echo $ULX, $(($ULX+$SIZE)), $ULY, $(($ULY-$SIZE))

        gdal_translate \
            -co TILED=YES \
            -co COMPRESS=DEFLATE \
            -co TFW=YES \
            -co NUM_THREADS=ALL_CPUS \
            -a_nodata 0 \
            -of GTiff \
            -projwin $ULX, $ULY, $(($ULX+$SIZE)), $(($ULY-$SIZE)) \
            -projwin_srs EPSG:5514 \
            data/detected.gpkg data/detected_${COUNTER_X}_${COUNTER_Y}.tiff

        ULY=$(($ULY-$SIZE))
        COUNTER_Y=$((COUNTER_Y+1))
    done
    ULX=$(($ULX+$SIZE))
    ULY=-1135450
    COUNTER_X=$((COUNTER_X+1))
done

Final Touch: Raster to Vector

After the GeoTiff is written to hard drive, inotifywait can be used to generate overviews. And with ease of calling gdal_polygonize.py on each of GeoTiffs…vector layer, at you service.

Mapping North America with QGIS: Tips and Tricks

Recently I’ve bought a book called Maps by Aleksandra Mizielinska and Daniel Mizielinski to my nephew. The book’s absolutely wonderful and made me want to try crafting a map with similar looks. I don’t do maps much at CleverMaps, so this was a great opportunity to find out what new features became available during the last months of QGIS development.

Result

A map of North America in scale of 1:22,000,000 featuring the biggest lakes, rivers, mountain ranges and basic administrative units for the North American countries. I aimed for visually appealing overview map rather than perfectly correct topographic one.

Data

I used my beloved Natural Earth dataset for both cultural (boundaries, cities) and physical (rivers, lakes) map features. Different scales came to play for different map layers as they seemed a bit too/few simplified for the given scale.

Fonts

I usually use built-in system fonts (Ubuntu Condensed or such), but this kind of map needed a more handwritten looking, sort of childish font. After searching dafont.com I chose PreCursive by RaseOne Full Time Artists and KG Primary Penmanship by Kimberly Geswein.

Symbols

The mountain point symbol was one of the two custom symbols used on the map. It comes from BSGStudio. The ocean wave symbol was made by myself.

QGIS effects

I’ve used several techniques I find interesting enough to be listed here.

Coastlines

For a long time I’ve considered coastlines a field for cartographic invention. They can be emphasized by shading or 3D effects. I chose the set of four parallel coastlines subtly disappearing into the sea, hopefully invoking the feeling of waves coming to the shore.

It’s done by dissolving all the features and buffering them again and again.

Buffered labels

Buffered labels are usually hard to get right, because they fill so much space if the buffer color’s not corresponding to its surroundings. But choosing the proper color can be a real struggle at times.

On this map, almost all the labels are buffered with the color of its surroundings, which makes them more legible, yet not too expressive. This is possible thanks to QGIS expression based properties that let you define unique styling to different map features.

Where it isn’t possible (e.g. Bahamas or Honduras) to choose just one buffer color, the label is not buffered at all (or the semi-transparent white buffer is used).

Note the Rocky Mountains label is split on the borders of the U.S.A. and Canada and its both parts match the background color.

Tapered rivers

Rivers are tapered based on the Natural Earth’s width attribute value for each river segment.

Labels in separate layers

I’m used to put labels into separate layers in more complicated map compositions, especially when you need to draw label along path for areal features (such as countries or states).

It becomes a bit harder to keep the features in sync with the labels though. I’d like to use only one layer for all the map layers in the future, as I feel that’s the way to go for the best labeling.

Labels wrapped on character

Some labels just can’t fit the feature they belong to and QGIS lets you deal with this by wrapping labels on a special character, \ in my case.

Layer blending mode

The mechanics behind layer blending modes are still a mystery to me, but they can add that little extra to a map very easily. Thanks to the Overlay blending mode, the Rocky Mountains may remain very subtle on different kinds of background.

Wifileaks Wi-Fi Networks Dataviz

Wifileaks is a project by Jakub Čížek aimed to map the Czech wi-fi networks with Android/iOS app. The data gathered by people using the app is available to download and features ~ 90,000,000 records, each representing the position of the cellphone when connecting to the network. Just about perfect to craft some maps!

Using PostgreSQL cstore_fdw

I ran out of disk space immediately after loading the dataset into the PostgreSQL database. After fiddling around I remembered that columnar store should be a bit space-friendlier than the old fashioned relational database. Thus, I installed the cstore_fdw by Citus Data in just few steps.

sudo apt install libprotobuf-c-dev libprotobuf-c1 protobuf-c-compiler postgresql-server-dev-9.6
git clone [email protected]:citusdata/cstore_fdw.git
PATH=/usr/bin/:$PATH make
PATH=/usr/bin/:$PATH make install

# when the cstore_fdw installation finishes, add the following line to your postgresql.conf and restart the database cluster
shared_preload_libraries = 'cstore_fdw'

This makes another FDW available to you inside the PostgreSQL. The actual foreign server has to be created before loading the data into a foreign table.

cat &lt;&lt;END | psql -qAt --no-psqlrc
    CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;
    CREATE SCHEMA data_cstore;
    CREATE FOREIGN TABLE data_cstore.wifi (
        id integer,
        mac text,
        ssid text,
        signal_strength numeric,
        security integer,
        lat numeric,
        lon numeric,
        alt numeric,
        unixtime bigint,
        filename text
    )
    SERVER cstore_server
    OPTIONS (compression 'pglz');
END

The foreign table is 3× smaller than it’s standard counterpart. However, this comes with some costs:

neither UPDATE nor DELETE can be used
no CREATE INDEX
no SERIAL

To overcome these shortcomings I used COPY statement to spit out the slightly modified table and immediately loaded it back in.

cat &lt;&lt;END | psql -qAt --no-psqlrc
COPY (
    SELECT
        row_number() OVER (),
        mac,
        ssid,
        signal_strength,
        security,
        split_part(filename, '_', 2)::integer,
        to_timestamp(unixtime),
        ST_Transform(ST_SetSRID(ST_MakePoint(lon, lat, alt), 4326), 32633)
    FROM data_cstore.wifi
    WHERE lon BETWEEN 0 AND 20
        AND lat BETWEEN 18 AND 84
) TO '/tmp/wifileaks.db' WITH CSV DELIMITER ';'
    DROP SCHEMA IF EXISTS data_cstore CASCADE;

DROP SCHEMA data_cstore;
CREATE SCHEMA data_cstore;
CREATE FOREIGN TABLE data_cstore.wifi (
    id integer,
    mac text,
    ssid text,
    signal_strength numeric,
    security integer,
    userid integer,
    unixtime timestamp without time zone,
    geom geometry(POINTZ, 32633)
)
SERVER cstore_server
OPTIONS (compression 'pglz');
END

Putting the networks on the map

As mentioned, each row of data represents the cellphone’s location when connecting to a wi-fi network. To get real wi-fi transmitter position, I calculated the average of location of each cellphone ever connected (although the signal strength should be taken into account here as well).

CREATE UNLOGGED TABLE data_cstore.wifi_avg_loc AS
SELECT
    row_number() OVER () id,
    mac,
    ST_SetSRID(ST_MakePoint(x, y), 32633) geom
FROM (
    SELECT
        mac,
        AVG(ST_X(geom)) x,
        AVG(ST_Y(geom)) y
    FROM data_cstore.wifi_loc
    GROUP BY 1
) a;

Routing with GRASS GIS: Catchment Area Calculation

I got my hands on pgRouting in the last post and I’m about to do the same with GRASS GIS in this one.

GRASS GIS stores the topology for the native vector format by default, which makes it easy to use for the network analysis. All the commands associated with the network analysis can be found in the v.net family. The ones I’m going to discuss in this post are v.net itself, v.net.path, .v.net.alloc and v.net.iso, respectively.

Data

I’m going to use the roads data from the previous post together with some random points used as catchment areas centers.

# create the new GRASS GIS location
grass -text -c ./osm/czech

# import the roads
v.in.ogr input="PG:host=localhost dbname=pgrouting" layer=cze.roads output=roads -eo  --overwrite

# import the random points
v.in.ogr input="PG:host=localhost dbname=pgrouting" layer=temp.points output=points -eo --overwrite

I got six different points and the pretty dense road network. Note none of the points is connected to the existing network.

You have to have routable network to do the actual routing (the worst sentence ever written). To do so, let’s:

connect the random points to the network
add nodes to ends and intersections of the roads

Note I’m using the 500m as the max distance in which to connect the points to the network.

v.net input=roads points=points operation=connect threshold=500 output=network
v.net input=network output=network_noded operation=nodes

Finding the shortest path between two points

Once the network is routable, it is easy to find the shortest path between points number 1 and 4 and store it in the new map.

echo "1 1 4" | v.net.path input=network_noded output=path_1_4

The algorithm doesn’t take bridges, tunnels and oneways into account, it’s capable of doing so though.

Distributing the subnets for nearest centers

v.net.alloc input=network_noded output=network_alloc center_cats=1-6 node_layer=2

v.net.alloc module takes the given centers and distributes the network so each of its parts belongs to exactly one center - the nearest one (speaking the distance, time units, …).

Creating catchment areas

v.net.iso input=network_noded output=network_iso center_cats=1-6 costs=1000,3000,5000

v.net.iso splits net by cost isolines. Again, the costs might be specified as lengths, time units, ….

Two different ways lead to the actual catchment area creation. First, you extract nodes from the roads with their values, turn them into the raster grid and either extract contours or polygonize the raster. I find the last step suboptimal and would love to find another way of polygonizing the results.

Note when extracting contours the interval has to be set to the reasonable number depending on the nodes values.

Remarks

Once you grasp the basics, GRASS GIS is real fun. Grasping the basics is pretty tough though.
Pedestrians usually don’t follow the road network.
Bridges and tunnels might be an issue.
Personally, I find GRASS GIS easier to use for the network analysis compared to pgRouting.