GDAL: A Look at the GDAL Data Model


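GDAL is an excellent library for working with GIS data. I have recently been introducing its basic usage to our interns, so I am writing these notes along the way.

This post covers raster data. The code environment is C#.

In GDAL, one Dataset roughly corresponds to one raster data file (for example a .tif/GeoTIFF file), and the information about that raster is exposed as properties of the Dataset object.

GDAL's raster data model is band-based. Reading a single-band file gives a Dataset that contains exactly one Band object, and its RasterCount property is 1. Multi-band data works the same way. In other words, a Dataset in GDAL is a concept similar to a raster dataset in ArcGIS.

The concepts and descriptions below are taken from the official documentation, which should be treated as authoritative.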

Dataset

A dataset (represented by the GDALDataset class) is an assembly of related raster bands and some information common to them all. In particular the dataset has a concept of the raster size (in pixels and lines) that applies to all the bands. The dataset is also responsible for the georeferencing transform and coordinate system definition of all bands. The dataset itself can also have associated metadata, a list of name/value pairs in string form.

Note that the GDAL dataset, and raster band data model is loosely based on the OpenGIS Grid Coverages specification.

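The parts of this passage to focus on in actual raster processing are the raster size (image width and height), the georeferencing transform and coordinate system definition, and the metadata. These items are defined on the Dataset object and apply to every Band under that Dataset.

The rest of this post first goes through what a Dataset contains.

The most important items are the projection and the georeferencing.

【Coordinate System】

GDAL represents a coordinate system definition as an OpenGIS Well Known Text (WKT) string, so Dataset.GetProjection() returns a string rather than some kind of Projection object. This means that in most cases the projection information of a dataset can be read and interpreted correctly by GDAL. The coordinate system definition can contain the following: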

 

  • An overall coordinate system name.
  • A geographic coordinate system name.
  • A datum identifier.
  • An ellipsoid name, semi-major axis, and inverse flattening.
  • A prime meridian name and offset from Greenwich.
  • A projection method type (i.e. Transverse Mercator).
  • A list of projection parameters (i.e. central_meridian).
  • A units name, and conversion factor to meters or radians.
  • Names and ordering for the axes.
  • Codes for most of the above in terms of predefined coordinate systems from authorities such as EPSG.

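One thing worth mentioning: some of GDAL's projection conversion methods take a parameter named "WKT". Anyone familiar with OpenGIS will recognize this as the OpenGIS WKT Coordinate System Definition, which is the projection string described here.

A more useful tip came from a colleague recently. To decide whether two datasets use the same coordinate system, I used to call Dataset.GetProjection() on both and compare the resulting strings with Equals. That is not rigorous. The reason is that some software, after reading the data, rewrites its WKT coordinate system definition into another standard form (I am not certain about this part), so the same coordinate system can end up with different strings. It is better to use the SpatialReference object from GDAL's OGR library for the comparison.

A code example followed here (originally posted as a screenshot, since I had not yet learned how to insert code blocks).

Because GDAL is a C++ library, its bindings tend to keep a C/C++ style in many respects. For example, condition checks generally return 0 or 1 rather than a boolean, which takes some getting used to.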
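To make the pitfall above concrete, here is a minimal sketch in plain C with no GDAL dependency. The two WKT fragments are hand-written for illustration and are assumptions of mine, not output from any real file. They describe the same geographic coordinate system but differ in whitespace and number formatting. The helper name `wkt_strings_identical` is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Two illustrative WKT fragments for the same coordinate system.
   WKT_B differs only in spacing and in how the numbers are written. */
static const char *WKT_A =
    "GEOGCS[\"WGS 84\",DATUM[\"WGS_1984\","
    "SPHEROID[\"WGS 84\",6378137,298.257223563]],"
    "PRIMEM[\"Greenwich\",0],UNIT[\"degree\",0.0174532925199433]]";

static const char *WKT_B =
    "GEOGCS[\"WGS 84\", DATUM[\"WGS_1984\", "
    "SPHEROID[\"WGS 84\", 6378137.0, 298.257223563]], "
    "PRIMEM[\"Greenwich\", 0.0], UNIT[\"degree\", 0.0174532925199433]]";

/* Naive check: 1 when the strings are byte-for-byte identical, else 0.
   The 0/1 result follows the C-style convention mentioned above. */
int wkt_strings_identical(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0 ? 1 : 0;
}
```

Calling `wkt_strings_identical(WKT_A, WKT_B)` gives 0 even though both strings describe the same system, which is why a text comparison is not a safe test. In real code you would build a SpatialReference from each string and compare the objects instead (its IsSame method, as far as I know, returns 1 or 0; check the exact signature in your GDAL version).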

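【GeoTransform】

This is often called the "six parameters", because it is a double[6] array. It defines where the data sits geographically. The related methods are Dataset.GetGeoTransform(double[] args), which fills a caller-allocated array of six doubles, and Dataset.SetGeoTransform(double[] args).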

GDAL datasets have two ways of describing the relationship between raster positions (in pixel/line coordinates) and georeferenced coordinates. The first, and most commonly used is the affine transform (the other is GCPs).

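A detailed explanation of the six parameters will come in a separate post.

The conversion between image column/row and real-world georeferenced coordinates is: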

    Xgeo = GT(0) + Xpixel*GT(1) + Yline*GT(2)
    Ygeo = GT(3) + Xpixel*GT(4) + Yline*GT(5)

 

【GCPs】

I do not know much about GCPs, so for now here is the official explanation.

A dataset can have a set of control points relating one or more positions on the raster to georeferenced coordinates. All GCPs share a georeferencing coordinate system (returned by GDALDataset::GetGCPProjection()). Each GCP (represented as the GDAL_GCP class) contains the following:

typedef struct
{
    char        *pszId; 
    char        *pszInfo;
    double      dfGCPPixel;
    double      dfGCPLine;
    double      dfGCPX;
    double      dfGCPY;
    double      dfGCPZ;
} GDAL_GCP;

 

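【Metadata】

For this part, please see my previous post on GDAL metadata.

 

【Raster Band】

The raster band is an important object in GDAL. One Band object represents one band/channel/layer, so RGB data in GDAL's model is a Dataset containing three bands, which correspond to red, green, and blue respectively.

Bands will also be covered in detail in a separate post.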

 

【Color Table】

I have almost never used this, so this section is taken directly from the documentation.

First, the structure definition:

A color table consists of zero or more color entries described in C by the following structure:

typedef struct
{
    /* gray, red, cyan or hue */
    short      c1;
    /* green, magenta, or lightness */
    short      c2;
    /* blue, yellow, or saturation */
    short      c3;
    /* alpha or black band */
    short      c4;
} GDALColorEntry;

The color table also has a palette interpretation value (GDALPaletteInterp) which is one of the following values, and indicates how the c1/c2/c3/c4 values of a color entry should be interpreted.

  • GPI_Gray: Use c1 as gray scale value.
  • GPI_RGB: Use c1 as red, c2 as green, c3 as blue and c4 as alpha.
  • GPI_CMYK: Use c1 as cyan, c2 as magenta, c3 as yellow and c4 as black.
  • GPI_HLS: Use c1 as hue, c2 as lightness, and c3 as saturation.

To associate a color with a raster pixel, the pixel value is used as a subscript into the color table. That means that the colors are always applied starting at zero and ascending. There is no provision for indicating a pre-scaling mechanism before looking up in the color table.
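The lookup described above can be sketched in a few lines of C. The struct `ColorEntry` below is a local copy of the GDALColorEntry layout and `lookup_color` is a hypothetical helper of mine, not a GDAL function. It uses the pixel value directly as the index, with a bounds check for values that have no entry.

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the GDALColorEntry layout shown above. */
typedef struct {
    short c1;  /* gray, red, cyan or hue */
    short c2;  /* green, magenta, or lightness */
    short c3;  /* blue, yellow, or saturation */
    short c4;  /* alpha or black band */
} ColorEntry;

/* Use the pixel value as a subscript into the color table.
   Returns a pointer to the entry, or NULL when the pixel value
   falls outside the table (no pre-scaling is applied). */
const ColorEntry *lookup_color(const ColorEntry *table, int count,
                               int pixel_value)
{
    assert(table != NULL && count >= 0);
    if (pixel_value < 0 || pixel_value >= count)
        return NULL;
    return &table[pixel_value];
}
```

With a three-entry table interpreted as GPI_RGB, for instance red, green, and blue at indices 0, 1, and 2, a pixel value of 1 resolves to the green entry, and a pixel value of 3 has no color.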

 

【Overviews】

According to the official documentation, overviews are reduced-resolution versions of a band, similar to thumbnails.

A band may have zero or more overviews. Each overview is represented as a "free standing" GDALRasterBand. The size (in pixels and lines) of the overview will be different than the underlying raster, but the geographic region covered by overviews is the same as the full resolution band.

The overviews are used to display reduced resolution overviews more quickly than could be done by reading all the full resolution data and downsampling.

Bands also have a HasArbitraryOverviews property which is TRUE if the raster can be read at any resolution efficiently but with no distinct overview levels. This applies to some FFT encoded images, or images pulled through gateways (like OGDI) where downsampling can be done efficiently at the remote point.
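Because an overview covers the same geographic region with fewer pixels, its pixel size grows as its dimensions shrink. The sketch below shows that arithmetic. The function names are hypothetical, and the ceiling rounding is an assumption: I believe it matches how GDAL's own overview building sizes each level, but individual drivers may differ.

```c
#include <assert.h>

/* Overview dimension for a given decimation factor,
   assuming ceiling division (an assumption, see above). */
int overview_size(int full_size, int factor)
{
    assert(full_size > 0 && factor > 0);
    return (full_size + factor - 1) / factor;
}

/* Pixel size of the overview: the same ground extent is
   spread over fewer pixels, so each pixel covers more ground. */
double overview_pixel_size(double full_pixel_size, int full_size, int ov_size)
{
    assert(full_size > 0 && ov_size > 0);
    return full_pixel_size * (double)full_size / (double)ov_size;
}
```

For a band 1000 pixels wide with 10-unit pixels, a factor-4 overview is 250 pixels wide with 40-unit pixels. A 1001-pixel band at the same factor rounds up to 251 pixels under this assumption.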

 

I will come back and add more on these last two objects after studying them further.

 

Thanks for reading.
