In Geographic Information Systems (GIS), understanding the different types of spatial data is fundamental for effectively representing, analyzing, and interpreting the world around us. Spatial data refers to information about the location and shape of geographic features and the relationships between them. These data types are broadly categorized into two main forms: vector data and raster data.
Vector data represents geographic features using points, lines, and polygons, which makes it ideal for discrete objects like roads, buildings, and administrative boundaries.
Raster data is made up of grids or pixels, commonly used to represent continuous phenomena like elevation, temperature, or satellite imagery.
You will learn how to distinguish between these data types and understand when and how to use each in different GIS tasks.
You will also explore how vector and raster data are stored, processed, visualized, and analyzed in GIS platforms like QGIS.
By the end of this lesson, you'll be able to choose the right data type depending on your project needs and spatial analysis goals.
In the GIS4School courses [1], GIS data models (or geospatial data models) are defined as a set of constructs and abstractions for describing and representing geographic entities in a digital system. Basically, GIS data models reshape these entities into discrete geographic objects (vector models) or continuous surfaces (raster models) and store their numerical and/or textual attributes, together with coordinates, in computer files. The structure of these models is independent of specific data items and, in most cases, of the particular GIS application that is used to manipulate them.
GIS data models are often interchangeable, so the same geographic entity or phenomenon may be represented by different models. As an example, the topographic relief of mountains may be portrayed as a continuous surface or as a series of lines (discrete objects) representing contours of equal elevation. Conversions between models entail some cost, both computationally and in data accuracy, but GIS software provides functions to perform such conversions automatically.
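To make the accuracy cost of a conversion concrete, here is a minimal sketch in plain Python. It assumes a hypothetical grid (the origin, the 30-unit cell size and the sample point are all invented) and shows what happens when a vector point is stored as a raster cell and then converted back: the point returns as the cell centre, not as the original coordinate.

```python
import math

# Hypothetical grid definition (assumed values, not from any real dataset).
ORIGIN_X, ORIGIN_Y = 500000.0, 4200000.0  # top-left corner, in map units
CELL_SIZE = 30.0                          # cell width and height, in map units


def point_to_cell(x, y):
    """Vector -> raster: return the (row, col) of the cell containing (x, y)."""
    col = math.floor((x - ORIGIN_X) / CELL_SIZE)
    row = math.floor((ORIGIN_Y - y) / CELL_SIZE)  # rows count down from the top
    return row, col


def cell_to_point(row, col):
    """Raster -> vector: return the centre coordinate of a cell."""
    x = ORIGIN_X + (col + 0.5) * CELL_SIZE
    y = ORIGIN_Y - (row + 0.5) * CELL_SIZE
    return x, y


original = (500047.0, 4199980.0)          # invented sample point
row, col = point_to_cell(*original)
restored = cell_to_point(row, col)
shift = math.dist(original, restored)     # positional error of the round trip

print("cell:", (row, col))
print("restored point:", restored)
print("shift in map units:", round(shift, 2))
```

In this sketch the round trip can move a point by up to half the cell diagonal, so a smaller cell size reduces the error at the price of more cells to store and process.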
The right data model to use strictly depends on the specific application. What is important to keep in mind is that there is no single data model that is best for all circumstances. Nowadays, GIS software is able to incorporate multiple data models and so can be applied to a wide range of different applications.
As mentioned, there are two types of GIS data, vector data and raster data, and each type has its own file formats. Check the following videos to get an idea!
Vector data structures represent specific features on the Earth's surface and assign attributes to those features. Vectors are composed of discrete geometric locations (x, y values) known as vertices that define the shape of the spatial object. The organization of the vertices determines whether the vector we are working with is a point, a line or a polygon (see the image below [2]).
Image source: National Ecological Observatory Network (NEON)
All vector data can be represented as:
Vector datasets are used in many industries besides geospatial fields. For example, computer graphics are largely vector-based, although the data structures in use tend to join points using arcs and complex curves rather than straight lines. Computer-Aided Design (CAD) software for engineers is also vector-based. The difference is that geospatial datasets are accompanied by information tying their features to real-world locations (coordinates).
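The idea of vertices plus attributes can be sketched with plain Python dictionaries laid out in a GeoJSON-like style. This is only an illustration: the feature names and coordinates are invented, and the coordinates are assumed to be planar (projected) so that simple geometry formulas apply.

```python
import math

# Each feature pairs a geometry (its vertices) with attributes ("properties").
# All names and coordinates are made up for illustration.
school = {
    "geometry": {"type": "Point", "coordinates": (3.0, 4.0)},
    "properties": {"name": "School"},
}
road = {
    "geometry": {
        "type": "LineString",
        "coordinates": [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)],
    },
    "properties": {"name": "Main road"},
}
park = {
    "geometry": {
        "type": "Polygon",
        # One closed ring: the first and last vertex are the same.
        "coordinates": [[(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]],
    },
    "properties": {"name": "Park"},
}


def line_length(coords):
    """Sum the straight-line distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def ring_area(ring):
    """Area of a closed ring using the shoelace formula."""
    total = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(total) / 2.0


road_length = line_length(road["geometry"]["coordinates"])
park_area = ring_area(park["geometry"]["coordinates"][0])

print(school["properties"]["name"], school["geometry"]["coordinates"])
print(road["properties"]["name"], "length:", road_length)
print(park["properties"]["name"], "area:", park_area)
```

The same vertices drawn in a graphics program would just be shapes; what makes them geospatial is that the coordinates are interpreted in a coordinate reference system.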
Vector data advantages and disadvantages
Key advantages of vector data include:
Key disadvantages of vector data include:
Images of vector data on a map (Image source: Geography Realm)
Points: Image 1, Lines and Points: Image 2 and Lines and Polygons: Image 3
Raster data refers to any grid-based or pixelated dataset in which each pixel corresponds to a specific geographic location. These pixels can hold continuous values (such as elevation or temperature) or categorical values (such as land cover types or land use classes).
Raster data shares the same structure as the digital images we use in our daily life (e.g. on TV or mobile screens, in photos, etc.). The key difference between a standard image (e.g. .jpeg, .png or other formats) and a geospatial raster is the presence of spatial reference information. This metadata connects the image to a real-world location and includes details such as the raster's extent, cell size, number of rows and columns, and the Coordinate Reference System (CRS). These components allow GIS software to accurately place and analyze raster data within a geographic context [1,2].
Important attributes of raster data include:
NoData value (NoDataValue): Cell values can be positive or negative, integer or floating point. Cells can also have a NoData value to represent the absence of data. Sometimes there are homogeneous areas in a raster dataset that you do not want to display. These can include borders, backgrounds, or other data considered not to have valid values. Sometimes these are expressed as NoData values, although other times they may have real values. See the grey areas of the image below.
The image above depicts how a raster file looks when we load it in a GIS platform or, more precisely, what information GIS platforms use to load a raster image. To do that, all GIS platforms use the extent of the area/image together with the number of rows and columns, the cell size (in meters or degrees, depending on the Coordinate Reference System, CRS), the CRS itself, and the X,Y origin (either the bottom-left or the top-left corner). When we load a raster file, NoData values are not displayed and our data are masked, as shown in the image below.
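The role of the origin, cell size, grid shape and NoData value can be sketched in a few lines of plain Python. Everything here is an assumption for illustration: the raster is a hand-written dictionary, the origin is taken to be the top-left corner, and -9999 is used as the NoData sentinel (real files declare their own value).

```python
# Hypothetical single-band raster; every number here is invented.
NODATA = -9999  # assumed sentinel value

raster = {
    "origin": (100.0, 200.0),  # X,Y of the top-left corner, in CRS units
    "cell_size": 10.0,         # cell width and height, in CRS units
    "nodata": NODATA,
    "values": [                # 2 rows x 3 columns
        [NODATA, 12, 15],
        [11, 14, NODATA],
    ],
}


def extent(r):
    """Return (x_min, y_min, x_max, y_max) from origin, cell size and grid shape."""
    rows, cols = len(r["values"]), len(r["values"][0])
    x_min, y_max = r["origin"]
    x_max = x_min + cols * r["cell_size"]
    y_min = y_max - rows * r["cell_size"]
    return (x_min, y_min, x_max, y_max)


def valid_values(r):
    """Return all cell values except NoData cells."""
    return [v for row in r["values"] for v in row if v != r["nodata"]]


all_cells = [v for row in raster["values"] for v in row]
naive_mean = sum(all_cells) / len(all_cells)  # wrongly counts -9999 as data
masked = valid_values(raster)
masked_mean = sum(masked) / len(masked)       # uses real measurements only

print("extent:", extent(raster))
print("mean with NoData included:", naive_mean)
print("mean with NoData masked:", masked_mean)
```

Masking matters for analysis as well as for display: if the sentinel is treated as a real measurement, statistics such as the mean become meaningless.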
The image below provides a clear and intuitive explanation of raster data, using the example of an aerial or satellite image such as those shown in Google Earth:
Image Source: National Ecological Observatory Network (NEON)
Raster data consists of a grid of equally sized cells (or pixels), where each cell holds a numerical value representing an attribute of the landscape, such as elevation, vegetation density, or temperature. In the image above, a satellite photograph is zoomed into a small section, revealing a 5×5 matrix of values ranging from 1 to 9 to highlight how images are structured in pixels. These values could correspond to environmental variables, such as a vegetation index (e.g., NDVI), with each number indicating the condition or intensity of the measured variable at that location. The numeric values are visually translated into a color-coded raster map using a defined color ramp, allowing users to interpret spatial patterns at a glance. Lighter shades typically represent lower values, while darker shades indicate higher values. This grid-based representation is essential for analyzing continuous spatial phenomena and supports a wide range of geospatial applications, from environmental monitoring and land-use classification to urban planning and resource management.
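How a color ramp turns cell values into shades can be sketched as follows. This is a minimal illustration, not how any particular GIS renders rasters: the 5×5 grid is invented (it is not the grid from the figure), and the ramp is a simple linear grayscale in which the lowest value maps to white (255) and the highest to black (0).

```python
# Invented 5x5 grid with values from 1 to 9.
grid = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [3, 4, 5, 6, 7],
    [4, 5, 6, 7, 8],
    [5, 6, 7, 8, 9],
]


def gray_level(value, vmin, vmax):
    """Linear grayscale ramp: vmin -> 255 (lightest), vmax -> 0 (darkest)."""
    fraction = (value - vmin) / (vmax - vmin)
    return round(255 * (1 - fraction))


vmin = min(min(row) for row in grid)
vmax = max(max(row) for row in grid)

# One gray level (0-255) per cell, in the same row/column layout as the grid.
shaded = [[gray_level(v, vmin, vmax) for v in row] for row in grid]

for row in shaded:
    print(row)
```

Swapping the ramp (for example, reversing it or using a multi-color palette) changes only the display; the underlying cell values stay the same.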
Image source: Homer, C.G., et al., 2015, Completion of the 2011 National Land Cover Database for the conterminous United States-Representing a decade of land cover change information. Photogrammetric Engineering and Remote Sensing, v. 81, no. 5, p. 345-354
The impact of resolution (cell size)
The resolution of a raster is the area on the ground that each pixel of the raster covers. The image below illustrates the effect of changes in resolution.
Image Source: National Ecological Observatory Network (NEON)
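The effect of a coarser resolution can be imitated with a small sketch: the code below merges each 2×2 block of a fine grid into a single cell holding the block mean. The grid values and the 10-unit cell size are invented, and averaging is only one of several possible aggregation rules; the sketch also assumes that the grid dimensions are exact multiples of the factor.

```python
# Invented 4x4 grid with an assumed cell size of 10 map units.
FINE_CELL_SIZE = 10.0
fine = [
    [1, 3, 5, 7],
    [1, 3, 5, 7],
    [2, 2, 8, 8],
    [2, 2, 8, 8],
]


def aggregate(grid, factor):
    """Merge factor x factor blocks into one cell holding the block mean."""
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [
                grid[i][j]
                for i in range(r, r + factor)
                for j in range(c, c + factor)
            ]
            out_row.append(sum(block) / len(block))
        coarse.append(out_row)
    return coarse


coarse = aggregate(fine, 2)
coarse_cell_size = FINE_CELL_SIZE * 2

print("coarse grid:", coarse)
print("ground area per fine cell:", FINE_CELL_SIZE ** 2)
print("ground area per coarse cell:", coarse_cell_size ** 2)
```

In the top row, the values 1 and 3 collapse into 2.0 and the values 5 and 7 into 6.0: each coarse cell covers four times the ground area, and the detail inside it is lost.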
Multi-band Raster Data
A raster can contain one or more bands. One type of multi-band raster dataset that is familiar to many of us is a color image. A basic color image consists of three bands: red, green, and blue. Each band can have an intensity value ranging from 0 to 255, with 0 representing complete darkness and 255 representing maximum brightness (256 × 256 × 256 = 16,777,216 possible combinations!). Each band represents light reflected from the red, green or blue portion of the electromagnetic spectrum (it’s all about physics). The pixel brightness values of the bands, when composited, create the colors that we see in an image. Hence, multi-band raster data in satellite images refers to a dataset where each individual raster band represents a different wavelength range (or spectral band) of the electromagnetic spectrum [3,4].
These bands, when combined, provide a richer understanding of the Earth’s surface (see the image below) and can be used for various applications like vegetation analysis, land use mapping, and environmental studies.
Image Source: National Ecological Observatory Network (NEON)
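Band compositing can be sketched with three tiny grids, one per band. The 2×2 band values are invented, and the hex string is just a convenient way to show the resulting color; the arithmetic at the top counts the possible colors when each band has 256 levels (0 to 255 inclusive).

```python
# 256 intensity levels per band (0..255 inclusive) across three bands.
LEVELS = 256
combinations = LEVELS ** 3

# Invented 2x2 bands; each grid holds the brightness of one band per pixel.
red = [[255, 0], [0, 255]]
green = [[0, 255], [0, 255]]
blue = [[0, 0], [255, 255]]


def composite(r, g, b):
    """Stack three single-band grids into one grid of (R, G, B) pixels."""
    return [
        [(r[i][j], g[i][j], b[i][j]) for j in range(len(r[0]))]
        for i in range(len(r))
    ]


def to_hex(pixel):
    """Format an (R, G, B) pixel as a web-style hex color."""
    return "#{:02x}{:02x}{:02x}".format(*pixel)


pixels = composite(red, green, blue)
hex_colors = [[to_hex(p) for p in row] for row in pixels]

print("possible colors:", combinations)
for row in hex_colors:
    print(row)
```

Here the top-left pixel is pure red, the top-right pure green, the bottom-left pure blue, and the bottom-right white, because all three bands are at full brightness there.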
Beyond Visible Light: Multispectral and Hyperspectral Data
While RGB images use the visible spectrum, many remote sensing applications require data beyond what the human eye can perceive [5]:
These advanced datasets are essential for applications like agriculture monitoring, mineral exploration, and environmental assessment.
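One common use of a band outside the visible range is the Normalized Difference Vegetation Index (NDVI) mentioned earlier, computed per pixel as (NIR - Red) / (NIR + Red), where NIR is the near-infrared band. The sketch below applies that formula to two invented 2×2 bands; the band values are hypothetical, and returning None for pixels where both bands are zero is a choice made for this example.

```python
# Invented 2x2 reflectance bands (hypothetical values).
red = [[50, 40], [200, 0]]
nir = [[150, 160], [200, 0]]


def ndvi(nir_value, red_value):
    """NDVI = (NIR - Red) / (NIR + Red); None where the sum is zero."""
    total = nir_value + red_value
    if total == 0:
        return None  # index is undefined for this pixel
    return (nir_value - red_value) / total


ndvi_grid = [
    [ndvi(nir[i][j], red[i][j]) for j in range(len(red[0]))]
    for i in range(len(red))
]

for row in ndvi_grid:
    print(row)
```

NDVI ranges from -1 to 1; higher values generally indicate denser, healthier vegetation, which is why the index is widely used in agricultural monitoring.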
Raster data advantages and disadvantages
Key advantages of raster data include:
Key disadvantages of raster data include:
GIS data and analysis are an integral part of numerous industries and sectors, which use both vector and raster data to analyze spatial information, make informed decisions, and optimize operations. Vector data, comprising points, lines, and polygons, effectively represents discrete features such as human infrastructure, boundaries, and networks. Raster data, consisting of pixelated grids, is ideal for continuous data like satellite imagery, elevation models, and weather maps. Industries such as urban planning, agriculture, environmental management, and public health leverage these data types to enhance their services and strategies. For instance, urban planners use vector data to design infrastructure layouts, while environmental agencies use raster data to monitor land cover changes. Combining vector and raster data in GIS enables a comprehensive understanding of spatial phenomena, facilitating efficient resource management and strategic planning across various sectors. Professionals in almost any industry can benefit from GIS technology. Here are some of the most popular examples of industries that use GIS [6].
In the following lessons, we will delve into specific industries and we will solve real-world problems that harness the power of vector and raster GIS data, exploring how these tools contribute to their success and innovation!