introduction.tex

\chapter{Introduction} \label{chap:introduction}
An incomprehensible number of galaxies, stars, and other celestial bodies exist in the universe.
There is one specific planet, though---better known as Earth---that is of particular interest.
All life that we are aware of, including every human to have ever existed, has called this planet their home.
The Earth is also home to many complex physical and chemical processes that, along with the activities of its inhabitants, cause it to change continually.
Given its significance and complexity, we need to understand the Earth to the best extent possible so that we can make informed decisions regarding it, and make sure the effects of our activities are for the better.


One of the ways humanity works toward a better understanding of the Earth is by gathering data about it.
Remote sensing satellites and aircraft, along with various smart technologies, have created a significant increase in the amount of geospatial data available and continually being collected.
These developments have led to the challenge of \textit{geospatial big data}, meaning that the volume and complexity of data exceed the capacity of current computing systems~\cite{lee2015geospatial}.
With applications in urban planning, agriculture, education, natural disaster prediction and management, and many other fields, the development of technologies for managing geospatial data is more crucial than ever.
These tools are needed to efficiently integrate, process, analyze, and visualize geospatial data in order to use it to make informed decisions.
Such a need has led to the vision of a Digital Earth system, where all data regarding the Earth is available in a common reference used by the public to inform their decisions.
The use of such a system would range from world governments informing policy by analyzing climate trends to homeowners referencing local utility line locations when building a fence.
In reality, such an expansive and holistic system is yet to be realized; however, many approaches for smaller-scope Digital Earths exist.


Geographic information systems (GIS) are one of the conventional approaches used for creating a Digital Earth.
With a GIS, different datasets are represented and stored as individual map layers in a shared coordinate system, which acts as a proxy for the Earth.
These individual layers are overlayed to create a complete model of the Earth (or region of the Earth) and combined with different operations to perform various analyses.
However, being created and gathered by diverse organizations, geospatial data exists in many different formats, coordinate systems, and resolutions.
The conventional GIS pipeline---which requires experts to clean, remap, process, integrate, and distribute data---is unsustainable in the face of geospatial big data.
Furthermore, even after integrating data with a GIS, overlay operations such as intersections and data extraction are computationally expensive when many data layers are present~\cite{wang2015improving}.
Coordinate-based representations of geospatial data also do not provide proper facilities to represent the uncertainty or error of the data they are used to represent~\cite{goodchild2018reimagining}.


Another challenge with GIS is the frequent use of planar maps as the underlying representation.
Flattening the Earth produces inevitable map edges and distortion that impact the analysis and visualization of geospatial data.
Map edges can cause serious misunderstanding of the continuity between different sides of the map---especially with young children~\cite{hennerdal2015beyond}---and also affect the estimation of distance even when the continuity is correctly understood~\cite{hruby2016journey}.
Distortion causes inaccuracies in calculations such as geodesics and buffering if using planar geometry~\cite{flaterbuffering}, which is especially problematic when working with large distances and areas.
These distortions also affect the visual analysis of data by misrepresenting the size and shape of different regions.


Compared to a planar map, a globe provides a more accurate reference model for the Earth and avoids many of the above issues present with maps.
Despite this, \textit{physical} globes have many practical challenges, the most notable being expensive manufacturing, difficult storage, and the inability to accommodate different scales conveniently.
Many of these drawbacks of physical globes, however, are lessened or negated in a digital setting.
Computer graphics algorithms and hardware allow rendering of globes in real-time, storage is no different than for a digital map, and interactive systems allow zooming and panning to show any part of the Earth at any scale.
Digital globes still cannot display the entire Earth at once; however, this is partially addressed with multi-view focus plus context rendering techniques~\cite{sherlock2017visualizations}.


Because of the advantages of globes, there has been a recent push toward globe-based Digital Earth systems, one of the most well-known examples of which being Google Earth\footnote{\url{https://www.google.com/earth/}}.
In these systems, data is visualized on a 3D model of the Earth---approximated as a sphere or ellipsoid---as opposed to a traditional planar map.
While globe-based Digital Earths are becoming more popular, many of these systems still use a flat map as the underlying representation for geospatial data.
Thus, these systems still suffer errors introduced by distortion resulting from any planar operations done with the underlying coordinate system.


A common approach to avoid coordinate-based representations of geospatial data is to use an area- or cell-based representation.
A partition of the Earth into a set of non-overlapping cells allows information to be associated with the cells(s) that correspond with the appropriate region(s) of the Earth.
Such partitionings are termed discrete global grids (DGG).
To accommodate different resolutions of data, a hierarchy of these DGG's at successively finer resolutions, referred to as a discrete global grid system (DGGS), is used.
Not only does this cell hierarchy allow multiple resolutions of data to be supported, but it also provides an inbuilt mechanism for representing data uncertainty by using a cell (or set of cells) to represents the range of possible locations for a datum.
Additionally, a hierarchical data structure facilitates efficient algorithms by first calculating results with a coarse representation, and only refining further if necessary.
Nevertheless, in order for DGGS cells to serve as a suitable representation for geospatial data, their cells should be compact and near-uniform in size.
Compact cells ensure that data from geographically distant (relative to the size of the cell) locations are not represented in the same cell.
Uniformity of size ensures that each resolution of the DGGS represents the entire Earth at a particular spatial resolution and that cells of the same size do not appear at different resolutions of the grid system.


While DGGS's are an effective tool for integrating and managing geospatial data on the surface of the Earth, they have no inbuilt mechanism for supporting 3D data.
Many of today's geospatial sensors, along with other technologies such as numerical weather prediction, generate data with an associated altitude in addition to other dimensions.
With a DGGS, 3D data is flattened with altitude stored only as an attribute.
Since attributes do not index data, such flattening is problematic in applications where data must be accessed or otherwise filtered by altitude.
This shortcoming has motivated the development of volumetric (3D) discrete global grids and grid systems to allow native support for 3D data.
Going forward, we use the term 3D DGGS to refer to these technologies collectively.


Most human activity and interest, and correspondingly geospatial data, is located in a relatively small region above and below the surface of the Earth ($\pm$10--500 km).
Likewise, most existing approaches for 3D DGGS's have focused on this region.
However, there are select processes such as seismic wave propagation and the magnetosphere that span much beyond this region.
Furthermore, the range of altitudes satellites orbit at is extensive, with satellites in high Earth orbit exceeding 36,000 km.
Therefore, to support the full range of human activity and interest, 3D DGGS's that extend to the centre of the Earth and far beyond the atmosphere are needed.


\section{Problem Statement} \label{chap:1:problem}
Despite the necessity for 3D DGGS's, the area of research is still relatively unexplored compared to traditional DGGS and GIS.
Additionally, of the proposed methods, many are only appropriate for data with small altitude ranges, or data near the surface of the Earth.
Thus, for data with more extensive altitude ranges, there are even fewer methods that are appropriate. 
From this, we arrive at the primary goal of this thesis: to develop 3D DGGS's that have adequate support for unbounded ranges of altitude. 


A simple approach for such a 3D DGGS would be to embed the Earth in a 3D Euclidean voxel partitioning or similar space partitioning data structure, for which many efficient and well-tested algorithms exist.
While such a system is simple in its construction, it ignores the spherical nature of the Earth.
Cells are not aligned with the surface of the Earth, which means there is no consistent orientation of up (increasing altitude) and down (decreasing altitude) with respect to a cell.
This type of grid also provides a poor approximation for the surface of the Earth, especially at low resolutions.
Likewise, traversal along the surface of the Earth is complex and requires ``zig-zagging'' up and down.
While these issues are negligible for small-scale regions of the Earth, for a 3D DGGS with global coverage, the underlying grid should be sphere-based and Earth-centric.
Cells should be aligned with the surface of the Earth and have a consistent orientation for increasing and decreasing altitude.


The main challenges associated with Earth-centric grids are caused by cells converging toward the singularity at the centre of the Earth.
That is, cells closer to the centre of the grid become increasingly small and skinny and eventually degenerate to pyramids
(see \cref{fig:3dllg}).
The problem with this is two-fold, as it reduces both the compactness and volume of cells; furthermore, the resulting difference in the volumes of cells is unbounded.
For grids with a small radial extent, these effects are negligible; however, for the range of altitudes we wish to support in this thesis, they cannot be ignored.
It is worth noting that this same issue presents itself in traditional latitude-longitude grids at the poles, also demonstrated in \cref{fig:3dllg}.
Addressing these issues is one of the main challenges of this thesis.


\begin{figure}[htb!]
	\centering
	\includegraphics[width=\textwidth]{3dllg.pdf}
	\caption[Different views of a 3D latitude-longitude grid]{
		One octant of a 3D latitude-longitude grid viewed (a) at an angle, (b) from the front, and (c) from the side.
		Note how cells near the pole and equator degenerate in size and compactness and are much smaller than those near the surface and equator.
	}
	\label{fig:3dllg}
\end{figure}


\section{Goals and Scope} \label{chap:1:goals}
In addition to the initial goal of supporting an unbounded range of altitudes, there are several other properties for a 3D DGGS we aim to explore in this thesis.
While we discuss these in the context of a 3D DGGS, they are equally applicable to their conventional 2D counterparts.


As discussed previously, the cells of a 3D DGGS need to be compact while also maintaining near-uniform volume.
Ideally, all cells of the grid would be exactly congruent; however, this is impossible with an Earth Centric grid.
Therefore, for our 3D DGGS's, we aim to have cells with similar shapes and sizes without sacrificing compactness.
In addition to near-uniform volume, cells being \textit{exactly} equal volume is beneficial in specific applications.
Calculating the volume of a cell can be an expensive operation, significantly impacting the speed of analysis if this property is frequently needed.
If all cells have the same size, this calculation is replaced with a constant resulting in a significant performance increase in these applications.
Thus, we also aim to have perfectly equal volume cells in our 3D DGGS's when it is possible and desired to do so.


Once the cells of a 3D DGGS are defined, methods for mapping geospatial data to the appropriate set of cells are needed.
Likewise, the inverse process of mapping data associated with a set of cells back to the corresponding regions of the Earth is needed for visualizing data on the globe.
These two operations are referred to as grid encoding and grid decoding (or just encoding and decoding), respectively, and allow integration of geospatial data with the grid system.
With data integrated, analyses such as region growing and data aggregation require spatial neighbourhood and hierarchy traversal queries.
Ensuring all these operations are efficient allows quick integration, processing, analysis, and visualization of data with the 3D DGGS.


Finally, our 3D DGGS should leverage the accomplishments already made with conventional DGGS to facilitate the efficient and straightforward transference of data between 2D and 3D DGGS's. This interoperability allows smooth migration from 2D to 3D systems while also providing backwards compatibility from 3D to 2D systems.


Given these aims, we refine the primary goal of this thesis: develop methods for Earth-centric 3D DGGS's to support unbounded ranges of altitude, equal (or near-equal) volume cells, efficient operations, and interoperability with 2D DGGS's.
To limit the scope, we mostly focus on the techniques used for creating the 3D DGGS's as opposed to sophisticated implementations of full systems.
Doing so allows a broader exploration of different methods that we aggregate into more general findings.


\section{Methodology} \label{chap:1:method}
To achieve the goals of this thesis, we apply a specific class of refinement techniques---which we refer to as semiregular degenerate---for creating 3D DGGS's.
The main idea of this type of refinement is to refine degenerate cells differently than non-degenerate ones in order to combat the reduction of cell size and compactness in these regions.
Refer to \cref{chap:3:semiregDegen} for further details.
Using semiregular degenerate refinement, we explore two main approaches for creating 3D DGGS's with the desired properties.


The first approach modifies the refinement rules of an existing 3D DGGS, the Spheroid Degenerated-Octree Grid (SDOG)~\cite{yu2009sdog}, to improve its volume-preserving properties.
With SDOG, a sphere with twice the radius of the Earth is initially divided into eight equal octants via the equatorial plane and two perpendicular meridian planes.
Cells (including the initial octants) are then refined using splitting surfaces located at the midpoint of the three spherical coordinates: latitude, longitude, and radius.
The extent of these splitting surfaces (and the number of resulting children cells) depends on the shape of the cell, which prevents excessive cell degeneration near the poles and centre of the grid.
We perform the modifications by analyzing the placement of these splitting surfaces and finding ideal new locations that result in cells of the most uniform volume possible given certain constraints.


The second approach is a general method for creating a 3D DGGS that uses a conventional 2D DGGS as a starting point and specification.
First, we create the initial 3D discretization by extruding the faces of the initial discretization of the input DGGS into prismatoid cells.
Bases of prismatoid cells are then refined using the same refinement scheme as the input DGGS and combined with a radial refinement.
Such an approach for creating a 3D DGGS allows the research, development, and data integration achieved with conventional DGGS to readily be leveraged with the 3D one.
We take care to perform the extension in such a way that it achieves the desired properties of the 3D DGGS regardless of the 2D one used.
The extension also allows for an additional property, a target aspect ratio for cells, to be attained.


For both of these approaches, we define mappings between physical points and their corresponding locations in grid space, which we use to implement efficient encoding and decoding operations.
For the SDOG-based approach, we provide constant time encoding and decoding algorithms that outperform the standard versions of these algorithms at high levels of refinement.
These new algorithms can be used both with conventional SDOG and---by using the provided mappings---the volume-preserving ones.
We also provide similar algorithms for the 3D DGGS's that result from the extension method.


We evaluate the effectiveness of the first approach using two metrics of volume preservation: one for the maximum difference in volumes and one for the distribution.
We also use a third metric to evaluate the impact these modifications have on the compactness of cells.
For our proposed encoding and decoding algorithms, we compare the runtime performance with the conventional algorithms at different levels of refinement both for conventional SDOG and our modified grids.


We evaluate the second approach through a series of use cases, each employing a different combination of input DGGS and 3D parameters to create a resulting 3D DGGS most appropriate for the application.
These use cases illustrate both the versatility and sophistication of the method, being able to quickly create many different 3D DGGS's while achieving the desirable properties outlined above.


As a part of this thesis, we have implemented several components programmatically.
For both SDOG and the grid extension method, we created basic visualization frameworks for displaying the resulting grid geometries of different refinement rules.
These tools allowed quick implementation and visual analysis of different modifications to refinement, and the outputs served as the basis for many of the figures in this thesis.
For SDOG, we also implemented a tool for calculating volume and compactness statistics for different refinement methods along with both the conventional and improved encoding and decoding algorithms.
These were used for the analysis and runtime comparison, respectively.
Finally, the entire grid extension methodology was integrated with an internal research toolset used for designing and testing different conventional DGGS's.
This expanded toolset created the different 3D DGGS's employed in the use case examples.


\section{Contribution} \label{chap:1:contribution}
This main contributions of this thesis fall into three groups:

\begin{enumerate}
	\item Method for extending 2D DGGS to 3D.
	This method is fully general and supports any valid DGGS, including any refinement factor and non-congruent refinement.
	Furthermore, clearly defined parameters allow for a target aspect ratio for cells and a balanced tradeoff between volume preservation and cell compactness.
	We use a set of sample use cases to evaluate the method.

	\item Modifications of SDOG refinement.
	These modifications are able to achieve perfect volume preservation between all non-degenerate cells.
	Additionally, we provide blending functions that allow a balanced tradeoff between volume preservation and cell compactness.
	An analysis of volume preservation and compactness for the different modifications is also provided.

	\item Mapping functions (for modified SDOG \textit{and} extended grids).
	These mappings enable the volume preservation of the above methods while still allowing efficient encoding and decoding.
	For SDOG, we also propose new constant time encoding and decoding algorithms and benchmark their performance against the standard versions.
\end{enumerate}

The method for modified SDOG refinement, along with its mappings and analysis, has been published in the journal \textit{Geoinformatica}~\cite{ulmer2020toward}.
The efficient encoding and decoding algorithms are an unpublished extension of this work.
The grid extension method, along with its mappings and use case results, has been published in the journal \textit{ISPRS International Journal of Geo-Information} in a special issue on Global Grid Systems~\cite{ulmer2020general}.


\section{Thesis Overview} \label{chap:1:overview}
The outline for the remaining chapters of this thesis is as follows:
\Cref{chap:background} covers background information on map projections, Digital Earth systems, and global grids.
\Cref{chap:3ddggs} discusses the benefits and drawbacks of different approaches for creating a 3D DGGS and also provides a detailed explanation of the semiregular degenerate refinement strategy employed throughout the thesis.
\Cref{chap:sdog} introduces SDOG and provides an analysis of the number of cells in SDOG at a given resolution.
We then provided the modified refinement rules that improve volume preservation and the analysis of said rules.
\Cref{chap:extension} covers the grid extension method for creating a 3D DGGS from a 2D one.
We start with a basic version of the method for 1-to-4 refinement factors, and then introduce extensions for other factors and targeting a specific cell aspect ratio.
\Cref{chap:mapping} derives the mapping functions used for modified SDOG and the extended grids.
\Cref{chap:coding} then details how these mappings are used for efficient encoding and decoding operations.
This chapter is also where we introduce and compare the improved encoding and decoding algorithms for SDOG.
\Cref{chap:usecases} showcases three sample use cases for the grid extensions method, as well as the 3D DGGS's used for said use cases.
\Cref{chap:discussion} compares the different 3D DGGS's presented throughout the thesis and briefly discusses in which situations each would be most useful.
Finally, \cref{chap:conclusion} concludes with a summary, along with limitations and directions for future work.