Combine overlapping rasters in gdal_calc.py, taking the maximum value from each pixel?


I want to combine 3 overlapping rasters to create 1 raster. Each has the same dimensions, resolution, and CRS. I want to populate the new raster with the maximum value from each of the layers. However, when I use gdal_calc.py the output only reflects two of the rasters.

Any ideas?

gdal_calc.py -A ./EXTENT_30_OS_clipped_nodata.tif -B ./EXTENT_100_OS_clipped_nodata.tif -C ./EXTENT_1000_OS_clipped_nodata.tif --calc="maximum(A,B,C)" --outfile=./EXTENT_1000_100_30_OS.tif --type=Byte --NoDataValue=-9999 --overwrite

gdal_calc.py, as stated in the documentation:

http://www.gdal.org/gdal_calc.html

uses numpy array functions, in this case numpy.maximum, which is specified here:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.maximum.html

The documentation says numpy.maximum accepts only two input arrays. So your calculation expression for three grids must look like this:

maximum(maximum(A,B),C)
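Since numpy.maximum is a binary function, nesting the calls composes a pixel-wise maximum across all three grids. A minimal numpy sketch (the small arrays below stand in for the three rasters, which gdal_calc.py exposes as A, B, and C):

```python
import numpy as np

# Toy stand-ins for the three rasters; gdal_calc.py requires matching shapes
A = np.array([[1, 5], [3, 0]])
B = np.array([[4, 2], [3, 7]])
C = np.array([[0, 6], [9, 1]])

# numpy.maximum compares only two arrays at a time, so nest the calls
result = np.maximum(np.maximum(A, B), C)
print(result)  # [[4 6]
               #  [9 7]]
```

A flatter spelling is numpy.maximum.reduce([A, B, C]), which should also work inside the --calc expression, since gdal_calc.py evaluates it with numpy functions in scope.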


This chapter provides conceptual and usage information about loading, storing, accessing, and working with spatial data in a Big Data environment.


    Oracle Big Data Spatial and Graph features enable spatial data to be stored, accessed, and analyzed quickly and efficiently for location-based decision making.
    Oracle Big Data Spatial and Graph supports the storage and processing of both vector and raster spatial data.
    Oracle Spatial Hadoop Image Processing Framework allows the creation of new combined images resulting from a series of processing phases in parallel.
    The first step to process images using the Oracle Spatial and Graph Hadoop Image Processing Framework is to actually have the images in HDFS, followed by having the images separated into smart tiles.
    Once the images are loaded into HDFS, they can be processed in parallel using Oracle Spatial Hadoop Image Processing Framework.
    The framework provides a raster processing API that lets you load and process rasters without creating XML but instead using a Java application. The application can be executed inside the cluster or on a remote node.
    When you create custom processing classes, you can use the Oracle Spatial Hadoop Raster Simulator Framework to test and debug them by "pretending" to plug them into the Oracle Raster Processing Framework.
    Oracle Big Data Spatial Raster Processing for Apache Spark is a spatial raster processing API for Java.
    Oracle Big Data Spatial Vector Analysis is a Spatial Vector Analysis API, which runs as a Hadoop job and provides MapReduce components for spatial processing of data stored in HDFS.
    Oracle Big Data Spatial Vector Analysis for Apache Spark is a spatial vector analysis API for Java and Scala that provides spatially-enabled RDDs (Resilient Distributed Datasets) that support spatial transformations and actions, spatial partitioning, and indexing.
    Oracle Big Data Spatial Vector Hive Analysis provides spatial functions to analyze the data using Hive.
    You can use the Oracle Big Data SpatialViewer Web Application (SpatialViewer) to perform a variety of tasks.

2.1 About Big Data Spatial and Graph Support for Spatial Data

Oracle Big Data Spatial and Graph features enable spatial data to be stored, accessed, and analyzed quickly and efficiently for location-based decision making.

Spatial data represents the location characteristics of real or conceptual objects in relation to the real or conceptual space on a Geographic Information System (GIS) or other location-based application.

The spatial features are used to geotag, enrich, visualize, transform, load, and process location-specific two- and three-dimensional geographical images, and to manipulate geometrical shapes for GIS functions.

2.1.1 What is Big Data Spatial and Graph on Apache Hadoop?

Oracle Big Data Spatial and Graph on Apache Hadoop is a framework that uses the MapReduce programs and analytic capabilities in a Hadoop cluster to store, access, and analyze spatial data. The spatial features provide a schema and functions that facilitate the storage, retrieval, update, and query of collections of spatial data. Big Data Spatial and Graph on Hadoop supports storing and processing spatial images, which can be geometric shapes, raster, or vector images, stored in any of the several hundred supported formats.

See Oracle Spatial and Graph Developer's Guide for an introduction to spatial concepts, data, and operations.

2.1.2 Advantages of Oracle Big Data Spatial and Graph

The advantages of using Oracle Big Data Spatial and Graph include the following:

Unlike some of the GIS-centric spatial processing systems and engines, Oracle Big Data Spatial and Graph is capable of processing both structured and unstructured spatial information.

Customers are not forced or restricted to store only one particular form of data in their environment. They can store their data as spatial or nonspatial business data and still use Oracle Big Data for their spatial processing.

This is a framework, and therefore customers can use the available APIs to custom-build their applications or operations.

Oracle Big Data Spatial can process both vector and raster types of information and images.

2.1.3 Oracle Big Data Spatial Features and Functions

The spatial data is loaded for query and analysis by the Spatial Server and the images are stored and processed by an Image Processing Framework. You can use the Oracle Big Data Spatial and Graph server on Hadoop for:

Cataloguing the geospatial information, such as geographical map-based footprints, availability of resources in a geography, and so on.

Topological processing to calculate distance operations, such as nearest neighbor in a map location.

Categorization to build hierarchical maps of geographies and enrich the map by creating demographic associations within the map elements.

The following functions are built into Oracle Big Data Spatial and Graph:

Indexing function for faster retrieval of the spatial data.

Map function to display map-based footprints.

Zoom function to zoom in and out on specific geographical regions.

Mosaic and group functions to group a set of image files for processing, to create a mosaic or perform subset operations.

Cartesian and geodetic coordinate functions to represent the spatial data in one of these coordinate systems.

Hierarchical function that builds and relates geometric hierarchy, such as country, state, city, postal code, and so on. This function can process the input data in the form of documents or latitude/longitude coordinates.

2.1.4 Oracle Big Data Spatial Files, Formats, and Software Requirements

The stored spatial data or images can be in one of these supported formats:

Both Geodetic and Cartesian data

Other GDAL supported formats

You must have the following software to store and process the spatial data:

GCC Compiler - Only when the GDAL-supported formats are used

2.2 Oracle Big Data Vector and Raster Data Processing

Oracle Big Data Spatial and Graph supports the storage and processing of both vector and raster spatial data.

2.2.1 Oracle Big Data Spatial Raster Data Processing

For processing the raster data, the GDAL loader loads the raster spatial data or images into an HDFS environment. The following basic operations can be performed on raster spatial data:

Mosaic: Combine multiple raster images to create a single mosaic image.

Subset: Perform subset operations on individual images.

Raster algebra operations: Perform algebra operations on every pixel in the rasters (for example, add, divide, multiply, log, pow, sine, sinh, and acos).

User-specified processing: Raster processing is based on the classes that the user sets to be executed in the mapping and reducing phases.

This feature supports a MapReduce framework for raster analysis operations. Users have the ability to custom-build their own raster operations, such as performing an algebraic function on raster data. For example, calculate the slope at each base of a digital elevation model or a 3D representation of a spatial surface, such as a terrain. For details, see Oracle Big Data Spatial Hadoop Image Processing Framework for Raster Data Processing.

2.2.2 Oracle Big Data Spatial Vector Data Processing

This feature supports the processing of spatial vector data:

Loaded and stored on to a Hadoop HDFS environment

Stored either as Cartesian or geodetic data

The stored spatial vector data can be used for performing the following query operations and more:

Several data service operations are supported for the spatial vector data:

In addition, there is a limited Map Visualization API support for only the HTML5 format. You can access these APIs to create custom operations. For details, see "Oracle Big Data Spatial Vector Analysis."

2.3 Oracle Big Data Spatial Hadoop Image Processing Framework for Raster Data Processing

Oracle Spatial Hadoop Image Processing Framework allows the creation of new combined images resulting from a series of processing phases in parallel.

It includes the following features:

HDFS Images storage, where every block size split is stored as a separate tile, ready for future independent processing

Subset, user-defined, and map algebra operations processed in parallel using the MapReduce framework

Ability to add custom processing classes to be executed in the mapping or reducing phases in parallel in a transparent way

Fast processing of georeferenced images

Support for GDAL formats, multiple bands images, DEMs (digital elevation models), multiple pixel depths, and SRIDs

Java API providing access to framework operations useful for web services or standalone Java applications

Framework for testing and debugging user processing classes in the local environment

The Oracle Spatial Hadoop Image Processing Framework consists of two modules, a Loader and Processor, each one represented by a Hadoop job running on different stages in a Hadoop cluster, as represented in the following diagram. Also, you can load and process the images using the Image Server web application, and you can use the Java API to expose the framework’s capabilities.

For installation and configuration information, see:

2.3.1 Image Loader

The Image Loader is a Hadoop job that loads a specific image or a group of images into HDFS.

While importing, the image is tiled and stored as an HDFS block.

GDAL is used to tile the image.

Each tile is loaded by a different mapper, so reading is parallel and faster.

Each tile includes a certain number of overlapping bytes (user input), so that each tile covers some area of the adjacent tiles.

A MapReduce job uses a mapper to load the information for each tile. There are n mappers, depending on the number of tiles, image resolution, and block size.

A single reduce phase per image puts together all the information loaded by the mappers and stores the images in a special .ohif format, which contains the resolution, bands, offsets, and image data. This way, the file offset of each tile and its node location are known.

Each tile contains information for every band. This is helpful when only a few tiles need to be processed; then only the corresponding blocks are loaded.
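The tiling-with-overlap behaviour described above can be sketched in a few lines of Python. The tile size and two-pixel overlap below are illustrative values, not the framework's actual block-driven ones:

```python
import numpy as np

def split_into_tiles(image, tile_size, overlap):
    """Split a 2D array into square tiles, each extended with `overlap`
    pixels borrowed from its neighbours (edge tiles get what is available)."""
    tiles = []
    h, w = image.shape
    for row in range(0, h, tile_size):
        for col in range(0, w, tile_size):
            r0 = max(row - overlap, 0)
            c0 = max(col - overlap, 0)
            r1 = min(row + tile_size + overlap, h)
            c1 = min(col + tile_size + overlap, w)
            tiles.append(image[r0:r1, c0:c1])
    return tiles

image = np.arange(64).reshape(8, 8)
tiles = split_into_tiles(image, tile_size=4, overlap=2)
print(len(tiles))  # 4 tiles for an 8x8 image split into 4-pixel tiles
```

Because of the overlap, adjacent tiles share border pixels, which is what lets each tile be processed independently without seams.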

The following diagram represents an Image Loader process:

2.3.2 Image Processor

The Image Processor is a Hadoop job that filters tiles to be processed based on the user input and performs processing in parallel to create a new image.

Processes specific tiles of the image identified by the user. You can identify one, zero, or multiple processing classes. These classes are executed in the mapping or reducing phase, depending on your configuration. For the mapping phase, after the execution of the processing classes, a mosaic operation is performed to adapt the pixels to the final output format requested by the user. If no mosaic operation was requested, the input raster is sent to the reduce phase as is. For the reduce phase, all the tiles are put together into a GDAL data set that is the input for the user's reduce processing class, where the final output may be changed or analyzed according to user needs.

A mapper loads the data corresponding to one tile, conserving data locality.

Once the data is loaded, the mapper filters the bands requested by the user.

Filtered information is processed and sent to each mapper in the reduce phase, where bytes are put together and a final processed image is stored into HDFS or regular File System depending on the user request.

The following diagram represents an Image Processor job:



2.4 Loading an Image to Hadoop Using the Image Loader

The first step to process images using the Oracle Spatial and Graph Hadoop Image Processing Framework is to actually have the images in HDFS, followed by having the images separated into smart tiles.

This allows the processing job to work separately on each tile independently. The Image Loader lets you import a single image or a collection of them into HDFS in parallel, which decreases the load time.

The Image Loader imports images from a file system into HDFS, where each block contains data for all the bands of the image, so that if further processing is required on specific positions, the information can be processed on a single node.

2.4.1 Image Loading Job

The image loading job has its custom input format that splits the image into related image splits. The splits are calculated based on an algorithm that reads square blocks of the image covering a defined area, which is determined by

area = ((blockSize - metadata bytes) / number of bands) / bytes per pixel.

For those pieces that do not use the complete block size, the remaining bytes are refilled with zeros.
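The split-area formula above can be written directly as a function. The block size, metadata size, band count, and pixel depth below are assumed example values, not the framework's defaults:

```python
def tile_area(block_size, metadata_bytes, num_bands, bytes_per_pixel):
    """Pixels per band that fit in one HDFS block, per the split formula:
    area = ((blockSize - metadata bytes) / number of bands) / bytes per pixel"""
    return (block_size - metadata_bytes) // num_bands // bytes_per_pixel

# Assumed example: 128 MB block, 1 KB of metadata, 3 bands, 1 byte per pixel
print(tile_area(128 * 1024 * 1024, 1024, 3, 1))
```

Any remainder that does not fill the final block is the portion that gets zero-padded, as described above.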

Splits are assigned to different mappers where every assigned tile is read using GDAL based on the ImageSplit information. As a result an ImageDataWritable instance is created and saved in the context.

The metadata set in the ImageDataWritable instance is used by the processing classes to set up the tiled image in order to manipulate and process it. Since the source images are read from multiple mappers, the load is performed in parallel and faster.

After the mappers finish reading, the reducer picks up the tiles from the context and puts them together to save the file into HDFS. A special reading process is required to read the image back.

2.4.2 Input Parameters

The following input parameters are supplied to the Hadoop command:

  • SOURCE_IMGS_PATH is a path to the source image(s) or folder(s). For multiple inputs use a comma separator. This path must be accessible via NFS to all nodes in the cluster.
  • HDFS_OUTPUT_FOLDER is the HDFS output folder where the loaded images are stored.
  • OVERLAPPING_PIXELS is an optional number of overlapping pixels on the borders of each tile; if this parameter is not specified, a default of two overlapping pixels is used.
  • GDAL_LIB_PATH is the path where GDAL libraries are located.
  • GDAL_DATA_PATH is the path where GDAL data folder is located. This path must be accessible through NFS to all nodes in the cluster.
  • THUMBNAIL_PATH is an optional path to store a thumbnail of the loaded image(s). This path must be accessible through NFS to all nodes in the cluster and must have write access permission for yarn users.
  • -expand controls whether the HDFS path of the loaded raster expands the source path, including all directories. If you set this to false, the .ohif file is stored directly in the output directory (specified using the -o option) without including that directory's path in the raster.
  • -extractLogs controls whether the logs of the executed application should be extracted to the system temporary directory. By default, it is not enabled. The extraction does not include logs that are not part of Oracle Framework classes.
  • -logFilter <LINES_TO_INCLUDE_IN_LOG> is a comma-separated String that lists all the patterns to include in the extracted logs, for example, to include custom processing classes packages.
  • -pyramid <OUTPUT_DIRECTORY, LEVEL, [RESAMPLING]> allows the creation of pyramids during the initial raster load. An OUTPUT_DIRECTORY must be provided to store the local pyramids before they are uploaded to HDFS; pyramids are loaded into the same HDFS directory requested for the load. A pyramid LEVEL must be provided to indicate how many pyramid levels are required for each raster. A RESAMPLING algorithm may optionally specify the method used to execute the resampling; if none is set, BILINEAR is used.

For example, the following command loads all the georeferenced images under the images folder and adds an overlapping of 10 pixels on every border possible. The HDFS output folder is ohiftest, and thumbnails of the loaded images are stored in the processtest folder.

By default, the Mappers and Reducers are configured to get 2 GB of JVM, but users can override these settings or any other job configuration properties by adding an imagejob.prop properties file in the same folder location from which the command is executed. This properties file may list all the configuration properties that you want to override. For example,
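As a sketch, an imagejob.prop override might look like the following. The property names echo those mentioned in the surrounding text; the exact keys and values are assumptions rather than documented defaults:

```properties
# Hypothetical imagejob.prop: raise the memory assigned to mappers and reducers
mapreduce.map.memory=4096
mapreduce.reduce.memory=4096
```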

Java heap memory (java.opts properties) must be equal to or less than the total memory assigned to mappers and reducers (mapreduce.map.memory and mapreduce.reduce.memory). Thus, if you increase Java heap memory, you might also need to increase the memory for mappers and reducers.

For GDAL to work properly, the libraries must be available using $LD_LIBRARY_PATH. Make sure that the shared libraries path is set properly in your shell window before executing a job. For example:

2.4.3 Output Parameters

The reducer generates two output files per input image. The first is the .ohif file that concentrates all the tiles for the source image; each tile may be processed as a separate instance by a processing mapper. Internally, each tile is stored as an HDFS block; blocks are located on several nodes, and one node may contain one or more blocks of a specific .ohif file. The .ohif file is stored in the user-specified folder set with the -out flag, under /user/<USER_EXECUTING_JOB>/OUT_FOLDER/<PARENT_DIRECTORIES_OF_SOURCE_RASTER>, if the -expand flag was not used. Otherwise, the .ohif file is located at /user/<USER_EXECUTING_JOB>/OUT_FOLDER/, and the file can be identified as original_filename.ohif.

The second output is a related metadata file that lists all the pieces of the image and the coordinates that each one covers. The file is located in HDFS under the metadata location, and its name is a hash generated from the name of the .ohif file. This file is for Oracle internal use only, and it lists important metadata of the source raster. Some example lines from a metadata file:

If the -thumbnail flag was specified, a thumbnail of the source image is stored in the related folder. This is a way to visualize a translation of the .ohif file. Job execution logs can be accessed using the command yarn logs -applicationId <applicationId> .

2.5 Processing an Image Using the Oracle Spatial Hadoop Image Processor

Once the images are loaded into HDFS, they can be processed in parallel using Oracle Spatial Hadoop Image Processing Framework.

You specify an output, and the framework filters the tiles to fit into that output, processes them, and puts them all together to store them into a single file. Map algebra operations are also available and, if set, will be the first part of the processing phase. You can specify additional processing classes to be executed before the final output is created by the framework.

The image processor loads specific blocks of data, based on the input (mosaic description or a single raster), and selects only the bands and pixels that fit into the final output. All the specified processing classes are executed and the final output is stored into HDFS or the file system depending on the user request.

2.5.1 Image Processing Job

The image processing job has different flows depending on the type of processing requested by the user.

Default Image Processing Job Flow: executed for processing that includes a mosaic operation, single raster operation, or basic multiple raster operation.

Multiple Raster Image Processing Job Flow: executed for processing that includes complex multiple raster algebra operations.

2.5.1.1 Default Image Processing Job Flow

The default image processing job flow is executed when any of the following processing is requested:

Basic multiple raster algebra operation

The flow has its own custom FilterInputFormat, which determines the tiles to be processed, based on the SRID and coordinates. Only images with the same data type (pixel depth) as the mosaic input data type (pixel depth) are considered. Only the tiles that intersect with the coordinates specified by the user for the mosaic output are included. For processing of a single raster or a basic multiple raster algebra operation (excluding mosaic), the filter includes all the tiles of the input rasters, because the processing is executed on the complete images. Once the tiles are selected, a custom ImageProcessSplit is created for each image.

When a mapper receives the ImageProcessSplit , it reads the information based on what the ImageSplit specifies, performs a filter to select only the bands indicated by the user, and executes the list of map operations and of processing classes defined in the request, if any.

Each mapper process runs on the node where the data is located. After the map algebra operations and processing classes are executed, a validation verifies whether the user requested a mosaic operation or whether the analysis includes the complete image; if a mosaic operation was requested, the final process executes the operation. The mosaic operation selects from every tile only the pixels that fit into the output and makes the necessary resolution changes to add them to the mosaic output. The single process operation just copies the previous raster tile bytes as they are. The resulting bytes are stored in NFS to be recovered by the reducer.

A single reducer picks the tiles and puts them together. If you specified any basic multiple raster algebra operation, then it is executed at the same time the tiles are merged into the final output. This operation affects only the intersecting pixels in the mosaic output, or in every pixel if no mosaic operation was requested. If you specified a reducer processing class, the GDAL data set with the output raster is sent to this class for analysis and processing. If you selected HDFS output, the ImageLoader is called to store the result into HDFS. Otherwise, by default the image is prepared using GDAL and is stored in the file system (NFS).

2.5.1.2 Multiple Raster Image Processing Job Flow

The multiple raster image processing job flow is executed when a complex multiple raster algebra operation is requested. It applies to rasters that have the same MBR, pixel type, pixel size, and SRID, since these operations are applied pixel by pixel in the corresponding cell, where every pixel represents the same coordinates.

The flow has its own custom MultipleRasterInputFormat, which determines the tiles to be processed, based on the SRID and coordinates. Only images with the same MBR, pixel type, pixel size, and SRID are considered. Only the rasters that match the coordinates specified by the first raster in the catalog are included. All the tiles of the input rasters are considered, because the processing is executed on the complete images.

Once the tiles are selected, a custom MultipleRasterSplit is created. This split contains a small area of every original tile, depending on the block size, because now all the rasters must be included in a split, even if it is only a small area. Each of these is called an IndividualRasterSplit , and they are contained in a parent MultipleRasterSplit .

When a mapper receives the MultipleRasterSplit, it reads the information of all the rasters' tiles that are included in the parent split, performs a filter to select only the bands indicated by the user and only the small corresponding area to process in this specific mapper, and then executes the complex multiple raster algebra operation.

Data locality may be lost in this part of the process, because multiple rasters are included for a single mapper that may not be in the same node. The resulting bytes for every pixel are put in the context to be recovered by the reducer.

A single reducer picks pixel values and puts them together. If you specified a reducer processing class, the GDAL data set with the output raster is sent to this class for analysis and processing. The list of tiles that this class receives is null for this scenario, and the class can only work with the output data set. If you selected HDFS output, the ImageLoader is called to store the result into HDFS. Otherwise, by default the image is prepared using GDAL and is stored in the file system (NFS).

2.5.2 Input Parameters

The following input parameters can be supplied to the Hadoop command:

  • MOSAIC_CONFIG_PATH is the path to the mosaic configuration XML, which defines the features of the output.
  • GDAL_LIBRARIES_PATH is the path where GDAL libraries are located.
  • GDAL_DATA_PATH is the path where the GDAL data folder is located. This path must be accessible via NFS to all nodes in the cluster.
  • IMAGE_CATALOG_PATH is the path to the catalog XML that lists the HDFS image(s) to be processed. This is optional, because you can also specify a single raster to process using the -file flag.
  • USER_PROCESS_JAR_PATH is an optional user-defined jar file or comma-separated list of jar files, each of which contains additional processing classes to be applied to the source images.
  • THUMBNAIL_PATH is an optional flag to activate the thumbnail creation of the loaded image(s). This path must be accessible via NFS to all nodes in the cluster and is valid only for an HDFS output.
  • USER_NATIVE_LIBRARIES_PATH is an optional comma-separated list of additional native libraries to use in the analysis. It can also be a directory containing all the native libraries to load in the application.
  • USER_PARAMETERS is an optional key/value list used to define input data for user processing classes. Use a semicolon to separate parameters. For example: azimuth=315;altitude=45
  • SINGLE_RASTER_PATH is an optional path to the .ohif file that will be processed by the job. If this is set, you do not need to set a catalog.
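The semicolon-separated USER_PARAMETERS string can be split into a key/value map with a few lines of Python. This is an illustrative sketch, not the framework's actual parser:

```python
def parse_user_parameters(params):
    """Parse a semicolon-separated key=value string such as
    'azimuth=315;altitude=45' into a dict."""
    result = {}
    for pair in params.split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        result[key.strip()] = value.strip()
    return result

print(parse_user_parameters("azimuth=315;altitude=45"))
# {'azimuth': '315', 'altitude': '45'}
```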

For example, the following command processes all the files listed in the catalog file input.xml, using the mosaic output definition set in the testFS.xml file.

By default, the Mappers and Reducers are configured to get 2 GB of JVM, but users can override these settings or any other job configuration properties by adding an imagejob.prop properties file in the same folder location from which the command is executed.

For GDAL to work properly, the libraries must be available using $LD_LIBRARY_PATH. Make sure that the shared libraries path is set properly in your shell window before executing a job. For example:

2.5.2.1 Catalog XML Structure

The following is an example of input catalog XML used to list every source image considered for mosaic operation generated by the image processing job.

A <catalog> element contains the list of <image> elements to process.

Each <image> element defines a source image or a source folder within the <raster> element. All the images within the folder are processed.

The <bands> element specifies the number of bands of the image. The datatype attribute has the raster data type, and the config attribute specifies which band should appear in the mosaic output band order. For example: 3,1,2 specifies that mosaic output band number 1 will have band number 3 of this raster, mosaic band number 2 will have source band 1, and mosaic band number 3 will have source band 2. This order may change from raster to raster.
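The config band mapping can be illustrated with numpy on a toy 3-band array in band-first layout (the raster values here are made up):

```python
import numpy as np

# Toy 3-band raster, shape (bands, rows, cols); band b is filled with value b
raster = np.stack([np.full((2, 2), b) for b in (1, 2, 3)])

# config="3,1,2": mosaic band 1 <- source band 3, band 2 <- source band 1,
# band 3 <- source band 2
config = "3,1,2"
order = [int(b) - 1 for b in config.split(",")]  # convert to zero-based indices
mosaic = raster[order]
print(mosaic[:, 0, 0])  # [3 1 2]
```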


2.5.2.2 Mosaic Definition XML Structure

The following is an example of a mosaic configuration XML used to define the features of the output generated by the image processing job.

The <mosaic> element defines the specifications of the processing output. The exec attribute specifies whether the processing will include a mosaic operation. If set to "false", a mosaic operation is not executed and a single raster is processed; if set to "true" or not set, a mosaic operation is performed. Some of the following elements are required only for mosaic operations and are ignored for single raster processing.

The <output> element defines the features, such as <SRID>, considered for the output. All the images with a different SRID are converted to the mosaic SRID in order to decide whether any of their tiles fit into the mosaic. This element is not required for single raster processing, because the output raster has the same SRID as the input.

The <directory> element defines where the output is located. It can be in an HDFS or in regular FileSystem (FS), which is specified in the tag type.

The <tempFsFolder> element sets the path to store the mosaic output temporarily. The attribute delete=”false” can be specified to keep the output of the process even if the loader was executed to store it in HDFS.

The <filename> and <format> elements specify the output filename. <filename> is not required for single raster process and if it is not specified, the name of the input file (determined by the -file attribute during the job call) is used for the output file. <format> is not required for single raster processing, because the output raster has the same format as the input.

The <width> and <height> elements set the mosaic output resolution. They are not required for single raster processing, because the output raster has the same resolution as the input.

The <algorithm> element sets the ordering algorithm for the images. An order of 1 means by source last-modified date, and an order of 2 means by image size. The order tag represents ascending or descending modes. (These properties are for mosaic operations where multiple rasters may overlap.)

The <bands> element specifies the number of bands in the output mosaic. Images with fewer bands than this number are discarded. The config attribute can be used for single raster processing to set the band configuration for output, because there is no catalog.

The <nodata> element specifies the color in the first three bands for all the pixels in the mosaic output that have no value.

The <pixelType> element sets the pixel type of the mosaic output. Source images that do not have the same pixel size are discarded for processing. This element is not required for single raster processing: if not specified, the pixel type will be the same as for the input.

The <crop> element defines the coordinates included in the mosaic output in the following order: startcoordinateX , pixelXWidth , RotationX , startcoordinateY , RotationY , and pixelheightY . This element is not required for single raster processing: if not specified, the complete image is considered for analysis.
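The six <crop> values appear in the same order as a GDAL-style geotransform, so applying them to a pixel position can be sketched as follows (the origin and pixel-size values here are illustrative assumptions):

```python
def pixel_to_world(gt, col, row):
    """Apply a GDAL-style geotransform (originX, pixelWidth, rotX,
    originY, rotY, pixelHeight) to a pixel position (col, row)."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y

# Illustrative north-up transform: origin (100, 500), 10 m pixels, no rotation
gt = (100.0, 10.0, 0.0, 500.0, 0.0, -10.0)
print(pixel_to_world(gt, 3, 2))  # (130.0, 480.0)
```

With no rotation terms, startcoordinateX/startcoordinateY give the upper-left corner and pixelXWidth/pixelheightY scale the column and row offsets (pixelheightY is typically negative for north-up rasters).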

The <process> element lists all the classes to execute before the mosaic operation.

The <classMapper> element is used for classes that will be executed during mapping phase, and the <classReducer> element is used for classes that will be executed during reduce phase. Both elements have the params attribute, where you can send input parameters to processing classes according to your needs.

The <operations> element lists all the map algebra operations that will be processed for this request. This element can also include a request for pyramid operations for example:


Abstract

The coastal waters of the Maltese Islands, central Mediterranean Sea, sustain a diversity of marine habitats and support a wide range of human activities. The islands’ shallow waters are characterised by a paucity of hydrographic and marine geo-environmental data, which is problematic in view of the requirements of the Maltese Islands to assess the state of their coastal waters by 2012 as part of the EU Marine Strategy Directive. Multibeam echosounder (MBES) systems are today recognised as one of the most effective tools to map the seafloor, although the quantitative characterisation of MBES data for seafloor and habitat mapping is still an underdeveloped field. The purpose of this study is to outline a semi-automated, Geographic Information System-based methodology to map the distribution of habitats in shallow coastal waters using high-resolution MBES data. What distinguishes our methodology from those proposed in previous studies is the combination of a suite of geomorphometric and textural analytical techniques to map specific types of seafloor morphologies and compositions; the selection of the techniques is based on identifying which geophysical parameter would be influenced by the seabed type under consideration.

We tested our approach in a 28 km² area of Maltese coastal waters. Three data sets were collected from this study area: (i) MBES bathymetry and backscatter data; (ii) Remotely Operated Vehicle imagery; and (iii) photographs and sediment samples from dive surveys. The seabed was classified into five elementary morphological zones and features – flat and sloping zones, crests, depressions and breaks of slope – using morphometric derivatives, the Bathymetric Position Index and geomorphometric mapping. Segmentation of the study area into seagrass-covered and unvegetated seafloor was based on roughness estimation. Further subdivision of these classes into the four predominant types of composition – medium sand, maërl associated with sand and gravel, seagrass settled on sand and gravel, and seagrass settled on bedrock – was carried out through supervised classifications of morphometric derivatives of the bathymetry and textural indices of backscatter, based on information from training stations. The resulting morphologic and seabed composition maps were combined to plot the distribution of the predominant habitats in the coastal waters offshore NE Malta, some of which are of high conservation value. Ground-truthing of the habitat map using ROV imagery and dive observations confirms that our approach produces a simplified and accurate representation of seafloor habitats while using all the information available within the MBES data sets.

Highlights

► We outline a semi-automated method to map habitats in shallow coastal waters. ► The method is based on high-resolution multibeam bathymetry and backscatter data. ► It combines morphometric and textural analyses to map morphology and composition. ► We use the method to plot the distribution of the predominant habitats offshore NE Malta. ► Ground-truthing with ROV and dive observations confirms the validity of our approach.
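The Bathymetric Position Index used in the segmentation step compares each cell's depth with the mean depth of its surroundings, so local highs score positive and depressions score negative. A minimal sketch of the concept (not the authors' workflow; real BPI tools use an annular rather than square neighbourhood):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def bpi(bathy, radius=5):
    """Bathymetric Position Index: cell value minus the mean of a
    square neighbourhood (an annulus in dedicated BPI tools)."""
    neighbourhood_mean = uniform_filter(bathy.astype(float), size=2 * radius + 1)
    return bathy - neighbourhood_mean

# A single bump on a flat surface yields a positive BPI at its peak.
depth = np.zeros((21, 21))
depth[10, 10] = 1.0
print(bpi(depth)[10, 10] > 0)  # True: the local high stands above its surroundings
```

Thresholding the BPI grid then separates crests, depressions, and flat zones, which is the role it plays in the classification described above.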


Segment Images

Segmentation is the process of partitioning an image into objects by grouping neighboring pixels with common values. The objects in the image ideally correspond to real-world features. Effective segmentation ensures that classification results are more accurate.

  1. Enable the Preview option in the Object Creation panel. A Preview Window appears with segments outlined in green.
  2. Under Segment Settings, select an Algorithm from the drop-down list provided. The following options are available:
    • Edge: Best for detecting edges of features where objects of interest have sharp edges. Set an appropriate Scale Level and Merge Level (see steps below) to effectively delineate features.
    • Intensity: Best for segmenting images with subtle gradients such as digital elevation models (DEMs) or images of electromagnetic fields. When selecting this method, do not perform any merging; set the Merge Level to 0. Merging is used primarily to combine segments with similar spectral information. Elevation and other related attributes are not appropriate for merging.

See Watershed Algorithm Background for more detailed descriptions of each option.

Tip: For best segmentation results, select a combination of bands that have similar spectral ranges such as R, G, B, and NIR bands. You should not perform segmentation with a combination of custom bands (normalized difference or HSI color space) and visible/NIR bands. You can perform segmentation on the normalized difference or color space bands by themselves, but not in combination with visible and NIR bands.

  • Full Lambda Schedule: (default). Merges small segments within larger, textured areas such as trees or clouds, where over-segmentation may be a problem.
  • Fast Lambda: Merges adjacent segments with similar colors and border sizes.

See Merge Algorithms Background for more detailed descriptions of each option.
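Independent of any particular package, segmentation amounts to grouping neighbouring pixels with common values into labelled objects. As a rough sketch of that idea (not the ENVI watershed implementation), using a simple threshold-and-label approach:

```python
import numpy as np
from scipy import ndimage

def segment_by_intensity(img, threshold):
    """Group neighbouring above-threshold pixels into labelled objects."""
    mask = img > threshold
    labels, n_objects = ndimage.label(mask)  # 4-connectivity by default
    return labels, n_objects

img = np.array([
    [9, 9, 0, 0, 0],
    [9, 9, 0, 0, 7],
    [0, 0, 0, 7, 7],
])
labels, n = segment_by_intensity(img, 5)
print(n)  # 2: two separate bright objects
```

Watershed methods refine this idea by flooding an edge or intensity surface instead of applying a single global threshold, which is why the Scale Level and Merge Level settings matter in practice.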



This chapter describes advanced image processing capabilities, including GCP georeferencing, reprojection, rectification, orthorectification, warping, image scaling, stretching, filtering, masking, segmentation, NDVI computation, Tasseled Cap Transformation, image appending, bands merging, and large-scale advanced image mosaicking.

This chapter also describes the concept and application of virtual mosaic within the context of a large-scale image database and on-the-fly spatial queries over it.

The operations in this chapter are most commonly used to process geospatial images, particularly raw satellite imagery and airborne photographs. However, those operations, just like the GeoRaster raster algebra, apply to all raster data types.

This chapter contains the following major sections.


    In addition to spatial referencing capability, advanced georeferencing capabilities are available.
    Image reprojection is the process of transforming an image from one SRS (spatial reference system, or coordinate system) to another.
    Most raster data originating from remote sensors above the ground is usually subject to distortion caused by the terrain, the view angles of the instrument, and the irregular shape of the Earth. Image rectification as explained in this section is the process of transforming the images to reduce some of that distortion.
    Orthorectification is a rectification transformation process where information about the elevation, the terrain, and the shape of the Earth is used to improve the quality of the output rectified image. Oracle GeoRaster supports single image orthorectification with average height value or DEM.
    Image warping transforms an input GeoRaster object to an output GeoRaster object using the spatial reference information from a specified SDO_GEOR_SRS object.
    Affine transformation is the process of using geometric transformations of translation, scaling, rotation, shearing, and reflection on an image to produce another image.
    The color and contrast of images can be enhanced to improve their visual quality. The SDO_GEOR_IP package (“IP” for image processing) provides a set of subprograms for image enhancement, including performing image stretching, image normalization, image equalization, histogram matching, and image dodging.
    Image filtering is the process of applying a convolution filter on an image to achieve a specific purpose. For example, applying a low-pass filter on an image can smooth and reduce noise in an image, while applying a high-pass filter on an image can enhance the details of the image or even detect the edges inside the image.
    Segmentation is a simple type of classification algorithm, and can be useful in classifying certain types of images into larger ground feature categories, such as land, cloud, water, or snow.
    Image pyramiding is one of the most commonly used processes in building large-scale image databases.
    Bitmap pyramiding can produce high-quality pyramids in certain cases where traditional pyramiding is not adequate.
    In remote sensing, the Normalized Difference Vegetation Index (NDVI) is a widely used vegetation index, enabling users to quickly identify vegetated areas and monitor the growth and "condition" of plants.
    Tasseled Cap Transformation (TCT) is a useful tool for analyzing physical ground features using remotely sensed imagery.
    To perform image masking, an application can query the GeoRaster database for bitmap masks, retrieve the desired bitmap mask or masks, and apply the masking operation on the target GeoRaster object for the purpose of displaying the object or performing some other processing.
    For image classification, time series analysis, and raster GIS modeling, multiple bands or layers of different GeoRaster objects may need to be merged into a single GeoRaster object.
    You can append one image to another image when the two images have the same number of bands.
    A large geospatial area typically consists of many smaller aerial photographs or satellite images. Large-scale image mosaicking can stitch these small geospatial images into one large image to get a better view of the whole spatial area.
    A virtual mosaic treats a set of GeoRaster images as one large virtually mosaicked image.
    Serving of image and raster data to clients or applications is supported through many features of the GeoRaster PL/SQL and Java APIs.

6.1 Advanced Georeferencing

In addition to spatial referencing capability, advanced georeferencing capabilities are available.

In GeoRaster, the spatial referencing capability is called SRS (spatial reference system) or georeferencing, which may or may not be related to geography or a geospatial scheme. Georeferencing is a key feature of GeoRaster and is the foundation of spatial query and operations over geospatial image and gridded raster data. See Georeferencing for a detailed description of the SRS models.

GeoRaster supports non-geospatial images, fine art photos, and multi-dimensional arrays, which might not be associated with any coordinate system. For those images and rasters, there is generally no need for georeferencing, but most of the GeoRaster operations still work on them, such as pyramiding, scaling, subsetting, band merging, stretching, and algebraic operations. In these cases, you address the pixels (cells) using the raster's cell space coordinates (that is, row, column, and band).

You can also create a user-defined coordinate system (a new SRID) that is not related to geography, and you can use that SRID as the model coordinate system for the rasters. Then, you can spatially reference these rasters to that SRID; that is, an SRS metadata component will be created for each of those rasters. Doing this causes those rasters to be spatially referenced, and thus co-located in that user-defined model coordinate system. After this is done for all related rasters, GeoRaster operations will work on those rasters as if they were georeferenced to a geographic coordinate system. For example, assume that an artist has painted a large mural on a wall, and that you want to take many high-resolution photographs of different tiles of this wall and then stitch them together. You can spatially reference the tile images and then use the GeoRaster mosaicking capability to do the stitching.

If you do not define a new coordinate system, you can still co-locate the images in the cell space. That is, you can set up different ULT coordinates for the images by calling the SDO_GEOR.setULTCoordinate procedure, so that the images are aligned in the same coordinate system and then can be mosaicked.

Most geospatial image and raster files that you have are probably already georeferenced by other software tools, and thus they may come with georeferencing information. In those cases, the georeferencing information can be loaded directly with the rasters, or afterward by using SDO_GEOR.importFrom, SDO_GEOR.setSRS, the GeoRaster loader tool, GDAL, or other third-party ETL tools. For more information, see GeoRaster Tools: Viewer, Loader, Exporter and Georeferencing GeoRaster Objects.

If a geospatial image does not have spatial reference information, you can use the GeoRaster Ground Control Point (GCP) support to georeference the image. GCPs are collected either automatically by the remote sensing system or manually afterward. For an image without GCP information, you can use a GeoRaster visualization tool to collect GCPs for the GeoRaster object. GCPs are described in Ground Control Point (GCP) Georeferencing Model.

After you have the GCPs and want to store them in the GeoRaster metadata, you can get and set the GCP-based georeferencing mode by using the SDO_GEOR.getGCPGeorefModel function and the SDO_GEOR.setGCPGeorefModel procedure. To get, set, and edit only GCPs, use the SDO_GEOR.getControlPoint function and the SDO_GEOR.setControlPoint and SDO_GEOR.deleteControlPoint procedures. The GCPs can also be stored in the GeoRaster metadata when you call SDO_GEOR.georeference.

To get and set only the geometric model, use the SDO_GEOR.getGCPGeorefMethod function and the SDO_GEOR.setGCPGeorefMethod procedure. GeoRaster also allows you to store check points (pointType = 2), which are treated and manipulated in the same way as control points (pointType = 1), except that check points are not used to compute the SRS coefficients when SDO_GEOR.georeference is called with the GCPs.

If you have ground control points (GCPs), whether or not they are stored in the GeoRaster object, and you want to calculate the functional fitting georeferencing model, you can call the SDO_GEOR.georeference procedure to find the solution. The functional fitting georeferencing model stores all coefficients in the GeoRaster SRS and enables coordinate transformations between cell space and model space. To generate the functional fitting georeferencing model using GCPs, you must specify an appropriate geometric model. The specific geometric models supported by SDO_GEOR.georeference are Affine Transformation, Quadratic Polynomial, Cubic Polynomial, DLT, Quadratic Rational, and RPC. These models are described in Functional Fitting Georeferencing Model.

Example 6-1 Setting Up the GCP Georeferencing Model

For example, if you have a Landsat image in a plain area and want to georeference it, you might choose the Quadratic Polynomial geometric model. For that purpose, assuming you have collected 9 GCPs (at least 6 GCPs in this case) and 3 check points, you can set up the GCPs and store them in the GeoRaster's metadata using the code in Example 6-1.

Example 6-2 Generating the Functional Fitting Model Using GCPs

After using the code in Example 6-1, you can generate the functional fitting model coefficients by using the code in Example 6-2.

The steps in Example 6-1 and Example 6-2 can be combined without the need to pre-set the GCPs into the GeoRaster object's metadata (see the example for SDO_GEOR.georeference in SDO_GEOR Package Reference). The returned value array of SDO_GEOR.georeference in Example 6-2 contains RMS values and residuals for each GCP. Using these, you can examine the solution accuracy and identify erratic GCPs. If the accuracy is not satisfactory, recheck all GCPs to make sure they are accurate and add more GCPs as necessary, and then run the script or scripts again.
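To illustrate the functional-fitting idea itself (this is not the SDO_GEOR.georeference API), the simplest geometric model, an affine transformation, can be fitted to GCPs by least squares, with residuals and RMS values playing the accuracy-checking role described above. The helper name and the synthetic GCPs below are invented for the sketch:

```python
import numpy as np

def fit_affine(cell_xy, model_xy):
    """Least-squares affine fit from cell (col, row) to model (x, y).
    Needs >= 3 GCPs; extra points yield residuals for accuracy checks."""
    cell_xy = np.asarray(cell_xy, float)
    model_xy = np.asarray(model_xy, float)
    G = np.column_stack([cell_xy, np.ones(len(cell_xy))])  # [col, row, 1]
    coeffs, *_ = np.linalg.lstsq(G, model_xy, rcond=None)
    residuals = model_xy - G @ coeffs
    rms = np.sqrt((residuals ** 2).mean(axis=0))  # per-axis RMS error
    return coeffs, residuals, rms

# Synthetic GCPs on an exact affine surface: x = 10*col + 100, y = -10*row + 500
cells = [(0, 0), (100, 0), (0, 100), (100, 100)]
models = [(100, 500), (1100, 500), (100, -500), (1100, -500)]
coeffs, residuals, rms = fit_affine(cells, models)
print(rms)  # ~[0, 0]: the fit recovers the exact mapping
```

Large residuals at a single GCP are the signature of the "erratic GCPs" mentioned above; higher-order models (quadratic, cubic, DLT, RPC) follow the same fit-and-check pattern with more coefficients.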

The GCP support in GeoRaster also enables you to spatially reference non-geospatial images and rasters.

After geospatial images are georeferenced, you can process them further, such as by applying rectification, reprojection, and mosaicking, and you can spatially query and subset the rasters using geometry polygons in different coordinate systems.


4 Modeling the uncertainty of occurrence data

Random noise was injected into the synthetically created occurrence locations by randomly altering the original occurrence locations by a standard deviation of 1% (10 units). We assumed that the study area is a small region that is 1 km in each direction and contains a microhabitat similar to the one modeled in the previous section. In this case, a standard deviation of 1% would be equivalent to 10 m, which is typical for a handheld GPS (USDOD, 2008; Wing et al., 2005). The model was run 100 times and resulted in a mean AIC of 26,423 with a standard deviation of 14. The mean AUC value was 0.78 with a standard deviation of less than 0.001. HEMI 2 also produces maps of the standard deviation (Figure 9) and response curves showing the mean, minimum, maximum, and confidence intervals (Figure 10). The maps can be used to evaluate the spatial distribution of high or low confidence within our models, while the response curves characterize the variance within our models for each environmental variable.
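The noise-injection step amounts to redrawing each occurrence location from a normal distribution centred on the original point, once per model run. A minimal sketch of that idea (not the HEMI 2 implementation; only the 10-unit standard deviation comes from the text):

```python
import numpy as np

rng = np.random.default_rng(42)

def jitter(points, sd=10.0):
    """Perturb each (x, y) occurrence by Gaussian noise with the given
    standard deviation (10 units ~ typical handheld GPS error)."""
    points = np.asarray(points, float)
    return points + rng.normal(scale=sd, size=points.shape)

pts = np.array([[500.0, 500.0], [250.0, 750.0]])
runs = np.stack([jitter(pts) for _ in range(100)])  # one perturbed set per run
print(runs.std(axis=0).round(1))  # per-coordinate spread near the 10-unit SD
```

Fitting the model to each perturbed set and aggregating the predictions is what yields the mean and standard-deviation maps described above.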

HEMI 2 produced histograms of the AIC and AUC values and graphs of the cumulative mean value for AIC and AUC (Figure 11) for all model runs. The histograms should approach a normal distribution when sufficient runs have been completed, and the cumulative AIC and AUC curves should show that these values stabilize over time. The histograms and performance metrics for each run are available on the HEMI 2 Web site (http://gsp.humboldt.edu/HEMI2).


4 DISCUSSION

Although SEEMs increase model accuracy over continental scales (Fink et al., 2013, 2010), our study found their performance differed by species and predictor resolution even in a state with variable climate and diverse ecoregions. Two species were often better represented by SEEMs, suggesting their distributional processes may vary regionally. There were few obvious commonalities among these species that would lead to SEEMs being more accurate for them. One species is nonpasserine (Northern Bobwhite), and the other is a common grassland passerine (Western Meadowlark). Two species were always better with statewide models (Brown-headed Cowbird and Dickcissel). The cowbird is strongly dependent on habitat structure (Benson et al., 2013; Bernath-Plaisted, Nenninger, & Koper, 2017), but these variables are not what is measured by the predictor layers that we used. Dickcissel is known for its semi-nomadic movement patterns (Temple, 2017); as such, neither species may be as dependent on local climatic variation mapped by the BioClim predictor inputs. The inconsistencies in the remainder of the species suggest that a larger sample of species and predictor resolutions is needed to compare why models are appropriate for given situations. In our original models, the predictors are consistently finer-scaled (30 m) than some, but not all, response location data (ranging from exact point count locations to aggregate sightings along a 4.3 km transect). However, Fink et al. (2010) used transects almost twice as long as ours (up to 8.1 vs. 4.3 km) with 30 m resolution predictor data, so that should not account for differences between our results.

A potential mechanism for variation between species is whether species’ distributions depend more upon bioclimatic versus ecological variables, as bioclimatic variables should change more smoothly over a larger area (potentially reducing the need for adaptive local models). It could also be that species-specific processes determine whether SEEMs are required. However, one benefit of random forest models and other machine learning methods is the minimal tuning and expert opinion required to generate an accurate map (Fink et al., 2010). Requiring researchers to choose spatial scale based on expert opinion of variable importance negates this benefit. However, the fact that most species showed different model performance based on whether we used fine or coarse predictor resolution suggests that model performance depends at least partially on dataset resolutions. Researchers who suspect that a SEE model is appropriate for their dataset and system can compare a small number of base models for different regions or times and see if relationships vary among the test models.

An alternative approach for modelers seeking increased accuracy is the use of nonspatially explicit ensemble models, where different base models (predicting for the whole study area) are combined to produce a single prediction map (Araújo & New, 2007; Oppel et al., 2012). We recommend this approach as more efficient for regional managers. Multiple maps will still be generated for the whole study area (n = number of base models used), but typically fewer than the number of support sets created in a SEEM or STEM. These types of ensembles are known to increase accuracy relative to a single base model (Araújo & New, 2007; Oppel et al., 2012). Although large-scale solutions to conserve grasslands are needed (Samson et al., 2004), local and regional conservation and management efforts also have critical impacts (Brennan et al., 2005). We expected that SEEMs would be most accurate and therefore relevant to wildlife management in this state with diverse ecotypes that occur at scales larger than predictors but smaller than our study region. However, based on our study, we recommend that when using a single base model type, all distribution model types should be run (statewide and at least one or more scales of SEEM) if computing capacity is available.


A Review of Satellite Remote Sensing Techniques of River Delta Morphology Change

River deltas are important coastal depositional systems that are home to almost half a billion people worldwide. Understanding morphology changes in deltas is important in identifying vulnerabilities to natural disasters and improving sustainable planning and management. In this paper, we critically review literature on satellite remote sensing techniques that have been used to study delta morphology changes. We identify and categorize these techniques into 3 major classes: (1) one-step change detection, (2) two-step change detection, and (3) ensemble classifications. In total, we offer a review of 18 techniques with example studies, and the strengths and caveats of each. Synthesis of the literature reveals that sub-pixel-based algorithms perform better than pixel-based ones. Machine learning techniques rank second to sub-pixel techniques, although an ensemble of techniques can be used just as effectively to achieve high feature detection accuracies. We also evaluate the performance of the 7 most commonly used techniques in the literature on a sample of global deltas. Findings show that unsupervised classification significantly outperforms the others, and it is recommended as a first-order delta morphological feature extraction technique in previously unknown or data-sparse deltaic territories. We propose four pathways for the future advancement of delta morphological remote sensing: (1) utilizing high-resolution imagery and developing more efficient data mining techniques, (2) moving toward universal applicability of algorithms and their transferability across satellite platforms, (3) using ancillary data in image processing algorithms, and (4) developing a global-scale repository of deltaic data for the sharing of scientific knowledge across disciplines.


Landslide susceptibility mapping in and around Mussoorie Township using fuzzy set procedure, MamLand and improved fuzzy expert system-A comparative study

A landslide susceptibility map (LSM) is an imperative element in the planning of sustainable development practices and geo-environmental conservation in mountainous terrains. In recent times, approaches that couple soft computing techniques and Geographic Information Systems (GIS) have emerged as better-suited models that can diminish the flaws and limitations of heuristic, probabilistic and distribution approaches to landslide susceptibility mapping. This paper presents an improved fuzzy expert system (FES) model, a fusion of the Mamdani fuzzy inference system (Mamdani-FIS) and the frequency ratio method, for GIS-based landslide susceptibility mapping. The improved FES model has been applied for mesoscale (1:15,000) landslide susceptibility mapping of Mussoorie Township, Uttarakhand, India, along with a conventional fuzzy set procedure (FSP) and an existing FES model, MamLand. The LSMs generated through the different procedures have been validated and compared by means of the spatial distribution of susceptibility zones and statistical analysis with the help of a landslide inventory. The validation and comparative analysis indicate significantly better performance of the improved FES model over FSP and MamLand.



2.2 Oracle Big Data Vector and Raster Data Processing

Oracle Big Data Spatial and Graph supports the storage and processing of both vector and raster spatial data.

2.2.1 Oracle Big Data Spatial Raster Data Processing

For processing raster data, the GDAL loader loads the raster spatial data or images into an HDFS environment. The following basic operations can be performed on raster spatial data:

Mosaic: Combine multiple raster images to create a single mosaic image.

Subset: Perform subset operations on individual images.

Raster algebra operations: Perform algebra operations on every pixel in the rasters (for example, add, divide, multiply, log, pow, sine, sinh, and acos).

User-specified processing: Raster processing is based on the classes that the user sets to be executed in the mapping and reducing phases.

This feature supports a MapReduce framework for raster analysis operations. Users can custom-build their own raster operations, such as performing an algebraic function on raster data: for example, calculating the slope at each cell of a digital elevation model or of a 3D representation of a spatial surface, such as a terrain. For details, see Oracle Big Data Spatial Hadoop Image Processing Framework for Raster Data Processing.
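Per-pixel raster algebra of this kind is the same operation that gdal_calc.py exposes from the command line, as in the maximum() question discussed earlier in this document. In NumPy terms, combining overlapping rasters by the per-pixel maximum is a one-liner; note that numpy.maximum is a binary operation, so three rasters require chaining (or a ufunc reduce). The arrays below are synthetic:

```python
import numpy as np

# Three synthetic single-band rasters with identical shape and resolution.
a = np.array([[1, 5], [3, 0]])
b = np.array([[4, 2], [3, 9]])
c = np.array([[0, 7], [8, 1]])

# numpy.maximum takes exactly two arrays, so chain it for three inputs --
# the same fix as --calc="maximum(maximum(A,B),C)" in gdal_calc.py.
combined = np.maximum(np.maximum(a, b), c)
print(combined)  # elementwise maxima: [[4 7], [8 9]]

# Equivalent form that scales to any number of rasters:
combined2 = np.maximum.reduce([a, b, c])
```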

2.2.2 Oracle Big Data Spatial Vector Data Processing

This feature supports the processing of spatial vector data:

Loaded and stored onto a Hadoop HDFS environment

Stored either as Cartesian or geodetic data

The stored spatial vector data can be used for performing the following query operations and more:

Several data service operations are supported for the spatial vector data:

In addition, there is a limited Map Visualization API support for only the HTML5 format. You can access these APIs to create custom operations. For details, see "Oracle Big Data Spatial Vector Analysis."


Acknowledgements

The authors of this study thank the geology post-graduate program of Universidade Federal de Minas Gerais, Brazil, and ICMBIO/CECAV (Chico Mendes Institute for Biodiversity Conservation/Brazilian National Center for Research and Conservation of Caves) for financial support, and M. Antonieta Mourão for constructive criticism, comments, and suggestions.

Funding

Funding was provided by GERDAU/ICG/GEO/PAN CAVERNAS DO SÃO FRANCISCO (Grant No. 22317).