actinia tutorial at OpenGeoHub Summer School 2019

Author: Markus Neteler, mundialis GmbH & Co. KG, Bonn

URL of this document: https://neteler.gitlab.io/actinia-introduction/

Last update: 1 Dec 2019

Abstract

Actinia (https://actinia.mundialis.de/) is an open source REST API for scalable, distributed, high performance processing of geographical data that mainly uses GRASS GIS for computational tasks. Core functionality includes the processing of raster and vector data as well as of single satellite images and image time series. With the existing big geodata pools (e.g. Landsat and the Copernicus Sentinels) growing day by day, actinia is designed to follow the paradigm of bringing the algorithms to the geodata stored in the cloud. Actinia has been an OSGeo Community Project since 2019.

In this course we give a short introduction to REST APIs and cloud processing concepts, followed by an introduction to processing with actinia and hands-on exercises to become more familiar with the topic.

Required software for this tutorial

  • REST client, e.g. a browser extension such as RESTman (see Fig. 3 below) or the curl command line tool
  • For the "ace - actinia command execution" section:
    • GRASS GIS 7.8+
    • three Python packages:
      • Linux: pip3 install click requests simplejson
      • Windows users (OSGeo4W, Advanced installation, search window):
        • python3-click, python3-requests, python3-simplejson

Note: We will use the demo actinia server at https://actinia.mundialis.de/ - hence Internet connection is required.

OpenGeoHub Summer School 2019 tutorial

Content

  • Warming up
  • Introduction
    • Why cloud computing ?
    • Overview actinia
  • REST API and geoprocessing basics
    • What is REST: intro
  • First hands-on: working with REST API requests
    • Step by step...
  • Exploring the API: finding available actinia endpoints
    • REST actinia examples with curl
  • Controlling actinia from a running GRASS GIS session
    • Further command line exercise suggestions
  • Own exercises in actinia
  • Conclusions and future
  • See also: openEO resources
  • References
  • About the trainer

Planned tutorial time: 2:30 h = 150 min

Warming up

A graphical intro to actinia - GRASS GIS in the cloud: actinia geoprocessing (note: requires Chrome/ium browser)

Introduction

For this tutorial we assume working knowledge concerning geospatial analysis and Earth observation. The tutorial includes, however, a brief introduction to REST (Representational State Transfer) API and cloud processing related basics.

Why cloud computing ?

With the tremendous increase of available geospatial and Earth observation data, lately driven by the Copernicus programme (esp. the Sentinel satellites), and the growing availability of open data, the need for computational resources is increasing in a non-linear way.

Cloud technology offers a series of advantages:

  • scalable, distributed, and high performance processing
  • large quantities of Earth Observation (EO) and geodata provided in dedicated cloud infrastructures
  • addressing the paradigm of computing next to the data
  • no need to bother yourself with the low-level management of petabytes of data

Still, some critical issues have to be addressed:

  • lack of Analysis-Ready-data (ARD) available for consumption in the cloud
  • lack of compatibility between different data systems
  • lack of cloud abstraction, for easier move between vendors and providers

Overview actinia

Actinia (https://actinia.mundialis.de/) is an open source REST API for scalable, distributed, high performance processing of geospatial and Earth observation data that mainly uses GRASS GIS for computational tasks. Core functionality includes the processing of raster and vector data as well as of single satellite images and image time series. With the existing big geodata pools (e.g. Landsat and the Copernicus Sentinels) growing day by day, actinia is designed to follow the paradigm of bringing the algorithms to the geodata stored in the cloud. Actinia has been an OSGeo Community Project since 2019. The source code is available on GitHub at https://github.com/mundialis/actinia_core. It is written in Python and uses Flask, Redis, and other components.

Functionality beyond GRASS GIS

While actinia is at this time mainly a REST interface to GRASS GIS, it can be extended with other software (ESA SNAP, GDAL, ...) through wrapping. Extensions are added by writing a GRASS GIS Addon Python script which includes the respective function calls of the software to be integrated.

Persistent and ephemeral databases

By persistent storage we mean a data storage which keeps data also in case of a shutdown, and without a scheduled deletion time. In the Geo/EO context, persistent storage is used to provide, e.g., the base cartography: elevation models, street networks, building footprints, etc.

Ephemeral storage is used for results computed on demand, including user generated data and temporary data as they occur in processing chains. In ephemeral storage, data are only kept for a limited period of time (in actinia, for 24 hours by default).

In the cloud computing context this is relevant because storing data incurs costs.

Accordingly, actinia offers two modes of operation: persistent and ephemeral processing. In particular, the actinia server is typically deployed on a server with access to a persistent GRASS GIS database (PDB) and optionally to one or more GRASS GIS user databases (UDB).

The actinia server has access to compute nodes (actinia nodes; separate physically distinct machines) where the actual computations are performed. The actinia server acts as a load balancer, distributing jobs to actinia nodes. Results are either stored in GRASS UDBs in GRASS native format or directly exported to a different data format (see Fig. 1).


Fig. 1: Architecture of an actinia deployment (source: mundialis FTTH talk 2019 )

Deployment

In a nutshell, deployment means launching software, usually in an automated way, on a compute node. A series of technologies exist for this; virtualization in particular plays an important role, as it provides a higher level of abstraction instead of a strong dependency on hardware and software specifics.

One aim is to operate Infrastructure as Code (IaC), i.e. to have a set of scripts which order the needed computational resources in the cloud, set up the network and storage topology, connect to the nodes, and install the needed software (usually Docker based, i.e. so-called containers are launched from prepared images) and processing chains. Basically, the entire software part of a cloud computing infrastructure is launched "simply" through scripts, with the advantage that it can easily be restarted as needed, maintained, and migrated to other hardware.

CI/CD systems (continuous integration/continuous deployment) allow defining dependencies, prevent broken software from being launched, and allow versioning of the entire software stack.

In terms of actinia, various ways of deployment are offered: local installation, Docker, docker-compose, docker-swarm, OpenShift, and Kubernetes.

Architecture of actinia

Several components play a role in a cloud deployment of actinia (for an example, see Fig. 2):

  • analytics: these are the workers, running GRASS GIS or other wrapped software,
  • external data sources: import providers for various external data sources,
  • interface layer:
  • metadata management: interface to GNOS, managed through actinia-GDI
  • database system:
    • job management in a Redis database
    • the GRASS GIS database (here are the geo/EO data!)
  • connection to OGC Web services for output
  • Geoserver integration


Fig. 2: Architecture of an actinia deployment (source: Carmen Tawalika)

REST API and geoprocessing basics

What is REST: intro

An API (Application Programming Interface) defines a way of communicating between different software applications. A RESTful API (Representational State Transfer - REST, for details see https://en.wikipedia.org/wiki/Representational_state_transfer) is a web API for creating web services that communicate with web resources.

In detail, a REST API uses URL arguments to specify what information shall be returned through the API. This is not much different from requesting a Web page in a browser but through the REST API we can execute commands remotely and retrieve the results.

Each URL is called a request, while the data sent back to the user after some processing was performed is called a response.

A request consists of four parts (see also [1]):

  • the endpoint
  • the header
  • the data (or body)
  • the method

Endpoint:

An endpoint is the URL you request. It follows this structure: https://api.some.server. The final part of an endpoint are the query parameters. Using query parameters you can modify your request with key-value pairs, beginning with a question mark (?). The parameter pairs are separated with an ampersand (&), e.g.:

?query1=value1&query2=value2

As an example, we check the repositories of a GitHub user, in sorted form:

https://api.github.com/users/neteler/repos?sort=pushed
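In Python, such a query string can be assembled with the standard library. A minimal sketch, using the GitHub example above:

```python
from urllib.parse import urlencode

# Build the query string for the GitHub example above
base = "https://api.github.com/users/neteler/repos"
params = {"sort": "pushed"}

url = base + "?" + urlencode(params)
print(url)  # https://api.github.com/users/neteler/repos?sort=pushed

# Several key-value pairs are joined with "&" automatically:
print(urlencode({"query1": "value1", "query2": "value2"}))
# query1=value1&query2=value2
```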

Header & Body:

  • Both requests and responses have two parts: a header and, optionally, a body.
  • Headers contain information about the request or response (e.g., content type).
  • In both requests and responses, the body contains the actual data being transmitted (e.g., population data).

Methods and Response Codes

(source: [2])

Request methods:

  • In REST APIs, every request has an HTTP method type associated with it.
  • The most common HTTP methods (or verbs) include:
    • GET - a GET request asks to receive a copy of a resource
    • POST - a POST request sends data to a server in order to change or update resources
    • PUT - a PUT request sends data to a server in order to replace existing or create new resources
    • DELETE - a DELETE request is sent to remove or destroy a resource

Response codes:

  • HTTP responses don't have methods, but they do have status codes. Status codes are included in the header of every response in a REST API and contain information about the result of the original request.
  • Selected status codes (see also https://httpstatuses.com):
    • 200 - OK | All fine
    • 401 - Unauthorized | The request was rejected, as the sender is not (or wrongly) authorized
    • 404 - Not Found | The requested resource was not found
    • 500 - Internal Server Error | Something went wrong while the server was processing your request
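In a client script, the status code is usually checked before the body is used. A small sketch (the helper function is hypothetical, not part of any library; with the requests package you would inspect response.status_code or call response.raise_for_status()):

```python
def classify_status(code):
    """Coarse interpretation of an HTTP status code (hypothetical helper)."""
    if 200 <= code < 300:
        return "success"
    if code == 401:
        return "unauthorized"
    if code == 404:
        return "not found"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        return "server error"
    return "other"

print(classify_status(200))  # success
print(classify_status(500))  # server error
```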

JSON format

JSON (short for JavaScript Object Notation) is a structured, machine readable format which, in contrast to XML, is also human readable, at least for many people.

# this command line...
GRASS 7.8.dev (nc_spm_08):~ > v.buffer input=roadlines output=roadbuf10 distance=10 --json

looks like the following in JSON:

{
  "module": "v.buffer",
  "id": "v.buffer_1804289383",
  "inputs":[
     {"param": "input", "value": "roadlines"},
     {"param": "layer", "value": "-1"},
     {"param": "type", "value": "point,line,area"},
     {"param": "distance", "value": "10"},
     {"param": "angle", "value": "0"},
     {"param": "scale", "value": "1.0"}
   ],
  "outputs":[
     {"param": "output", "value": "roadbuf10"}
   ]
}

Hint: When writing JSON files, some linting (validation) might come handy, e.g. using https://jsonlint.com/.
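Instead of an online linter, a JSON snippet can also be validated locally: json.loads() raises an error on malformed input. A small sketch, using a shortened version of the v.buffer example above:

```python
import json

# A shortened version of the v.buffer process definition shown above
chain_entry = {
    "module": "v.buffer",
    "id": "v.buffer_1804289383",
    "inputs": [
        {"param": "input", "value": "roadlines"},
        {"param": "distance", "value": "10"},
    ],
    "outputs": [
        {"param": "output", "value": "roadbuf10"},
    ],
}

text = json.dumps(chain_entry, indent=2)   # serialize with indentation
parsed = json.loads(text)                  # raises an error if malformed
assert parsed == chain_entry               # the round trip preserves the data
print(text)
```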

First hands-on: working with REST API requests

Step by step...

Step 1:

  • get your credentials (for authentication) from the trainer (or use the "demouser" with "gu3st!pa55w0rd")

Step 2:



Fig. 3: Using RESTman

For a curl example, see below ("REST actinia examples with curl").

Step 3:

Step 4:

  • Submit a compute job and check its status (in case of asynchronous jobs by polling).
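The polling loop for an asynchronous job can be sketched independently of the HTTP layer. The status values ("accepted", "running", "finished", ...) follow the actinia job responses; the function and its parameters are illustrative:

```python
import time

def poll_until_done(fetch_status, interval_s=1.0, max_polls=120):
    """Poll an asynchronous job until it leaves the pending states.

    fetch_status: a callable that returns the current job status string
    (e.g. obtained by a GET request on the job's resource URL).
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status not in ("accepted", "running"):
            return status            # e.g. "finished", "error", "terminated"
        time.sleep(interval_s)
    raise TimeoutError("job did not finish within the polling budget")

# Demonstration with a stubbed status sequence instead of real requests:
statuses = iter(["accepted", "running", "running", "finished"])
print(poll_until_done(lambda: next(statuses), interval_s=0))  # finished
```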

Exploring the API: finding available actinia endpoints

The actinia REST API documentation at https://redocly.github.io/redoc/?url=https://actinia.mundialis.de/api/v1/swagger.json comes with a series of examples.

Check out the various sections:

  • Authentication Management
  • API Log
  • Cache Management
  • Satellite Image Algorithms
  • Location Management
  • Mapset Management
  • Processing
  • Raster Management
  • Raster Statistics
  • STRDS Management
  • STRDS Sampling
  • STRDS Statistics
  • Vector Management
  • Resource Management

To see a simple list of endpoints (and more), see the "paths" section in the API JSON. To get the available endpoints on the command line, run

# sudo npm install -g json
curl -X GET https://actinia.mundialis.de/api/v1/swagger.json | json paths | json -ka
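The same endpoint listing can be done in Python; the extraction step is shown here on a tiny stand-in dictionary (in practice you would fetch swagger.json with the requests package and call .json() on the response):

```python
def list_endpoints(swagger):
    """Return the sorted endpoint paths of a parsed swagger/OpenAPI document."""
    return sorted(swagger.get("paths", {}))

# Stand-in for the parsed swagger.json (the real document is much larger)
sample = {
    "paths": {
        "/locations": {},
        "/locations/{location_name}/mapsets": {},
    }
}
print(list_endpoints(sample))
# ['/locations', '/locations/{location_name}/mapsets']
```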

REST actinia examples with curl

Here we use the command line and the curl software:

Preparation:

# set credentials and REST server URL
export actinia="https://actinia.mundialis.de"
export AUTH='-u demouser:gu3st!pa55w0rd'

List locations:

# show available locations (locations are like projects)
curl ${AUTH} -X GET ${actinia}/api/v1/locations

Show capabilities of user:

# NOTE: endpoint not available to the demouser but only to the admin user
# show accessible_datasets, accessible_modules, raster cell_limit, process_num_limit, process_time_limit
curl ${AUTH} -X GET "${actinia}/api/v1/users/demouser"

List mapsets in locations:

# show available mapsets of a specific location
curl ${AUTH} -X GET "${actinia}/api/v1/locations/nc_spm_08/mapsets"

List map layers and their metadata:

# show available vector maps in a specific location/mapset
curl ${AUTH} -X GET "${actinia}/api/v1/locations/nc_spm_08/mapsets/PERMANENT/vector_layers"

# show metadata of a specific vector map
curl ${AUTH} -X GET "${actinia}/api/v1/locations/nc_spm_08/mapsets/PERMANENT/vector_layers/geology"

# show available raster maps in a specific location/mapset
curl ${AUTH} -X GET "${actinia}/api/v1/locations/nc_spm_08/mapsets/PERMANENT/raster_layers"
curl ${AUTH} -X GET "${actinia}/api/v1/locations/nc_spm_08/mapsets/landsat/raster_layers"
curl ${AUTH} -X GET "${actinia}/api/v1/locations/nc_spm_08/mapsets/modis_lst/raster_layers"

# show metadata of a specific raster map
curl ${AUTH} -X GET "${actinia}/api/v1/locations/nc_spm_08/mapsets/landsat/raster_layers/lsat7_2000_40"

# show available STRDS in a specific location/mapset
# STRDS = space time raster data set
curl ${AUTH} -X GET "${actinia}/api/v1/locations/nc_spm_08/mapsets/modis_lst/strds"

# show specific STRDS in a specific location/mapset
curl ${AUTH} -X GET "${actinia}/api/v1/locations/latlong_wgs84/mapsets/modis_ndvi_global/strds/ndvi_16_5600m"

# Get a list of raster layers from a STRDS
curl ${AUTH} -X GET "${actinia}/api/v1/locations/ECAD/mapsets/PERMANENT/strds/precipitation_1950_2013_yearly_mm/raster_layers"

# Get a list of raster layers from a STRDS, with date filter
curl ${AUTH} -X GET "${actinia}/api/v1/locations/ECAD/mapsets/PERMANENT/strds/precipitation_1950_2013_yearly_mm/raster_layers?where=start_time>2012-01-01"
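The ">" character in the where clause is not URL-safe; when building such a request programmatically it should be percent-encoded (with the requests package, passing params={"where": ...} does this automatically):

```python
from urllib.parse import urlencode

# Percent-encode the date filter used in the curl example above
where = "start_time>2012-01-01"
query = urlencode({"where": where})
print(query)  # where=start_time%3E2012-01-01
```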

Map layer queries:

We query in North Carolina, at 78W, 36N:

# query point value in a STRDS, sending the JSON code directly in request
# (North Carolina LST time series)
curl ${AUTH} -X POST -H "content-type: application/json" "${actinia}/api/v1/locations/nc_spm_08/mapsets/modis_lst/strds/LST_Day_monthly/sampling_sync_geojson" -d '{"type":"FeatureCollection","crs":{"type":"name","properties":{"name":"urn:ogc:def:crs:EPSG::4326"}},"features":[{"type":"Feature","properties":{"cat":1},"geometry":{"type":"Point","coordinates":[-78,36]}}]}'

Sending JSON payload as a file:

It is often much more convenient to store the JSON payload in a file and send it to the server:

# store query in a JSON file "pc_query_point_.json" (or use a text editor for this)
echo '{"type":"FeatureCollection","crs":{"type":"name","properties":{"name":"urn:ogc:def:crs:EPSG::4326"}},"features":[{"type":"Feature","properties":{"cat":1},"geometry":{"type":"Point","coordinates":[-78,36]}}]}' > pc_query_point_.json

# send JSON file as payload to query the STRDS
curl ${AUTH} -X POST -H "content-type: application/json" "${actinia}/api/v1/locations/nc_spm_08/mapsets/modis_lst/strds/LST_Day_monthly/sampling_sync_geojson" -d @pc_query_point_.json
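The GeoJSON payload file can also be written from Python, which avoids quoting mistakes in the shell (the file name follows the example above):

```python
import json

# The same query point (78W, 36N) as in the echo command above
payload = {
    "type": "FeatureCollection",
    "crs": {
        "type": "name",
        "properties": {"name": "urn:ogc:def:crs:EPSG::4326"},
    },
    "features": [{
        "type": "Feature",
        "properties": {"cat": 1},
        "geometry": {"type": "Point", "coordinates": [-78, 36]},
    }],
}

with open("pc_query_point_.json", "w") as fh:
    json.dump(payload, fh)
```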

Validation of a process chain:

actinia can also be used to validate a process chain. Download the process chain process_chain_long.json and validate it:

# validation of a process chain (using sync call)
curl ${AUTH} -H "Content-Type: application/json" -X POST "${actinia}/api/v1/locations/nc_spm_08/process_chain_validation_sync" -d @process_chain_long.json

Converting a process chain back into commands:

To turn a process chain back into a command style notation, the validator can be used as well, extracting the relevant code from the resulting JSON response. Download the process chain process_chain_long.json and extract the underlying commands by parsing the response with json:

# command extraction from a process chain (using sync call) by parsing the 'process_results' response:
curl ${AUTH} -H "Content-Type: application/json" -X POST "${actinia}/api/v1/locations/nc_spm_08/process_chain_validation_sync" -d @process_chain_long.json | json process_results
[
  "grass g.region ['raster=elevation@PERMANENT', 'res=4', '-p']",
  "grass r.slope.aspect ['elevation=elevation@PERMANENT', 'format=degrees', 'precision=FCELL', 'zscale=1.0', 'min_slope=0.0', 'slope=my_slope', 'aspect=my_aspect', '-a']",
  "grass r.watershed ['elevation=elevation@PERMANENT', 'convergence=5', 'memory=300', 'accumulation=my_accumulation']",
  "grass r.info ['map=my_aspect', '-gr']"
]
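If you want plain, copy-and-paste-ready GRASS commands, the process_results entries can be converted with a few lines of Python (a sketch that assumes the exact response shape shown above):

```python
import ast

def to_command(entry):
    """Convert one 'process_results' entry into a plain GRASS command line."""
    # entry looks like: "grass g.region ['raster=elevation@PERMANENT', 'res=4', '-p']"
    head, _, params = entry.partition(" [")
    args = ast.literal_eval("[" + params)   # parse the Python-style list
    module = head.split()[-1]               # drop the leading "grass"
    return " ".join([module] + args)

print(to_command("grass g.region ['raster=elevation@PERMANENT', 'res=4', '-p']"))
# g.region raster=elevation@PERMANENT res=4 -p
```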

Dealing with workflows (processing chains)

Data exchange: import and export

Actinia can import from external Web resources, use data in the actinia server (persistent and ephemeral storage), and make results available for download as Web resources. Note that the download of Web resources provided by actinia requires authentication.

Controlling actinia from a running GRASS GIS session

Controlling actinia from a running GRASS GIS session is a convenient way of writing process chains. It requires some basic GRASS GIS knowledge (for an intro course, see here).

The "ace" (actinia command execution) tool, run from a GRASS GIS terminal, is a wrapper written in Python which notably simplifies the writing of processing chains.

To try it out, start GRASS GIS with the nc_spm_08 North Carolina sample location. You can download it easily through the Download button in the graphical startup (recommended; see Fig. 4) or from grass.osgeo.org/download/sample-data/.


Fig. 4: Download and extraction of nc_spm_08 North Carolina sample location ("Complete NC location")

Before starting GRASS GIS with the downloaded location, create a new mapset "ace" in nc_spm_08.

Note: Since we want to do cloud computing, the full location would not be needed, but it is useful to have for an initial exercise in order to compare local and remote computations.

Needed Python libraries

In case not yet present on the system, the following Python libraries are needed:

  • Linux: pip3 install click requests simplejson
  • Windows users (OSGeo4W, Advanced installation, search window):
    • three Python packages: python3-click, python3-requests, python3-simplejson

Installation of ace tools

You need to be in a running GRASS GIS session:

# importer installation
g.extension url=https://github.com/mundialis/importer extension=importer

# exporter installation
g.extension url=https://github.com/mundialis/exporter extension=exporter

# ace tool (installation via g.extension forthcoming)
mkdir -p $HOME/bin/
cd $HOME/bin/
wget https://raw.githubusercontent.com/mundialis/actinia_core/master/scripts/ace
chmod a+x ace

To explore the ace tool, follow the usage examples at:

https://github.com/mundialis/actinia_core/blob/master/scripts/README.md

Further command line exercise suggestions

For this you can either use "ace" or write the JSON process chains with an editor and send them to actinia.

Computations using data in the nc_spm_08 location:

  • compute NDVI from a Landsat scene (using i.vi)
  • slope and aspect from a DEM (there are several; using r.slope.aspect)
  • flow accumulation with r.watershed from a DEM
  • buffer around hospitals (using v.buffer)
  • advanced: network allocation with hospitals and streets_wake (using v.net.alloc)
  • generalizing vector polygons with GRASS GIS' topology engine (using v.generalize)
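As a starting point for the slope/aspect task, a minimal process chain could be written as follows. This is a hedged sketch: the map and output names are illustrative, and the structure ("list" of process definitions plus a "version" field) should be compared with the downloadable process_chain_long.json:

```python
import json

# Illustrative process chain for the slope/aspect exercise;
# output names ("my_slope", "my_aspect") are freely chosen.
chain = {
    "list": [
        {
            "id": "set_region",
            "module": "g.region",
            "inputs": [{"param": "raster", "value": "elevation"}],
            "flags": "p",
        },
        {
            "id": "slope_aspect",
            "module": "r.slope.aspect",
            "inputs": [{"param": "elevation", "value": "elevation"}],
            "outputs": [
                {"param": "slope", "value": "my_slope"},
                {"param": "aspect", "value": "my_aspect"},
            ],
        },
    ],
    "version": "1",
}

with open("slope_aspect_chain.json", "w") as fh:
    json.dump(chain, fh, indent=2)
```

The resulting file can then be sent to the validation endpoint as shown in the curl examples above (-d @slope_aspect_chain.json).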

Further examples incl. Spatio-Temporal sampling:

See: https://github.com/mundialis/actinia_core/blob/master/scripts/curl_commands.sh

Own exercises in actinia

EXERCISE: "Population at risk near coastal areas"

  • needed geodata:
    • SRTM 30m (already available in actinia - find out the location yourself)
    • Global Population 2015 (already available in actinia - find out the location yourself)
    • vector shorelines (get from naturalearthdata)
  • fetch metadata with actinia interface
  • before doing any computations: what's important about projections?
  • proposed workflow:
    • set computational region to a small subregion and constrain the pixel number through defined user settings
    • buffer SRTM land areas by 5000 m inwards
    • zonal statistics with population map

EXERCISE: "Property risks from trees"

(draft idea only, submit your suggestion to trainer how to solve this task)

  • define your region of interest
  • needed geodata:
    • building footprints
      • download from OSM (via http://overpass-turbo.eu/ | Wizard > building > ok > Export > Geojson)
      • these data are now on your machine and not on the actinia server
      • use "ace importer" or cURL to upload
    • select Sentinel-2 scene
  • proposed workflow:
    • actinia "ace" importer for building footprint upload
    • v.buffer of 10m and 30m around footprints
    • select S2 scene, compute NDVI with i.vi
    • filter NDVI threshold > 0.6 (map algebra) to get the tree pixels - more exciting would be a ML approach (with previously prepared training data ;-)) (r.learn.ml offers RF and SVM)
    • on the binary tree map (which corresponds to risk exposure), count the number of tree pixels in a 5x5 moving window (r.neighbors with method "count")
    • compute property risk statistics using buffers and tree count map and upload to buffered building map (v.rast.stats, method=maximum)
    • export of results through REST resources

Conclusions and future

See also: openEO resources

References

[1] Zell Liew, 2018: Understanding And Using REST APIs, https://www.smashingmagazine.com/2018/01/understanding-using-rest-api/

[2] Planet 2019: Developer resource center, https://developers.planet.com/planetschool/rest-apis/

[3] actinia API reference documentation

[4] actinia paper: DOI

About the trainer

Markus Neteler is founder of mundialis GmbH & Co. KG, Bonn, Germany. From 2001 to 2015 he worked as a researcher in Italy. Markus is co-founder of OSGeo and, since 1998, coordinator of GRASS GIS development (for details, see his private homepage).


  • Repository of this material on gitlab
