GIS Basics - What is a GIS?

The definition of what constitutes a geographic information system is an active academic pursuit in itself. The name “geographic information system” provides some answers. Geographic relates to attributes and spatial relationships of positions in space. Information is the useful data that we attach to these positions. The word ‘system’ suggests numerous components, which in the case of GIS can be thought of as users, hardware, software, and data. So, in one sense, a GIS is simply a software package; in another it is a computer database containing information about the earth; and in yet another it is an integrated system of software and geo-referenced data for planning. One widely accepted definition, quoted in numerous texts on GIS in archaeology, comes from Star and Estes (1990), who state that a GIS “is an information system designed to work with data referenced by spatial or geographic co-ordinates. In other words a GIS is both a database system with specific capabilities for spatially referenced data as well as a set of operations for (analysis) with the data” (quoted in Wheatley and Gillings 2002: 9). This definition was a refinement of Burrough’s 1986 definition, which characterized GIS as “a powerful set of tools for collecting, storing, retrieving at will, transforming, and displaying spatial data from the real world for a particular set of purposes.” These definitions demonstrate that GIS is not a monolithic set of programs, or even one program. Instead, it is a concept that must be defined before it can be understood.

The components which constitute a GIS include the computer system, data, data management procedures, and the people who use it (Delaney 1999: 6). The computer system is perhaps the simplest to define: it consists of hardware and software. The hardware is the computer itself, typically one running Microsoft Windows with a 2 GHz or faster Pentium processor and 512 MB of RAM. The software, in the case of this thesis, is ESRI’s ArcGIS suite of programs and extensions.

Data within a GIS exists in thematic layers, which rest upon one another in any order the user defines (Figure 4-1). These layers can be thought of as transparencies with drawings on them which rest on top of one another.


Examples of layers used in a GIS (from Schuurman 2004)
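The transparency analogy can be made concrete with a minimal sketch. It assumes a deliberately simplified model in which each layer is a lookup of grid cells to values; the `Layer` class and `visible_value` function are hypothetical names used only for illustration and do not correspond to anything in ArcGIS.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One thematic layer: a name plus the cells it covers (hypothetical structure)."""
    name: str
    cells: dict = field(default_factory=dict)  # (row, col) -> value


def visible_value(stack, cell):
    """Return (layer name, value) for the topmost layer with data at `cell`.

    `stack` is ordered bottom to top, like transparencies laid on a table.
    Returns None when no layer has data at that cell.
    """
    for layer in reversed(stack):
        if cell in layer.cells:
            return layer.name, layer.cells[cell]
    return None


land_use = Layer("land use", {(0, 0): "farm", (0, 1): "farm"})
roads = Layer("roads", {(0, 1): "highway"})

stack = [land_use, roads]            # roads rest on top of land use
print(visible_value(stack, (0, 1)))  # the road hides the land use beneath it

stack.reverse()                      # the user reorders the layers
print(visible_value(stack, (0, 1)))  # now the land use layer is on top
```

Reordering the list changes which layer is seen at a given location, while the data in each layer stays untouched.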

These thematic layers are of two fundamental data types: vector and raster. Vector data consists of points, lines, and polygons, while raster data represents continuous data as a grid of cells. The vast majority of space in a vector data file is empty, with no value, while in raster datasets every point has some value attached to it (Fisher 1999: 5). Figure 4-3 demonstrates this: in the vector representation a large amount of the map has no information, whereas the raster representation assigns a value to every point in space, so that ‘empty’ space has a value of Farm. Raster and vector data can be used to represent the same features, and each has its strengths and weaknesses.


Features represented using vector and raster data (from Schuurman 2004)
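The difference between the two data types can be illustrated with a short sketch. It assumes a toy coordinate scheme of (column, row) grid positions, and the feature labels and the `rasterize` function are hypothetical; it is not how ArcGIS converts data, only a picture of why a raster has a value everywhere while a vector file stores only the features.

```python
# Vector: only the features themselves are stored, as coordinates.
vector_features = [
    {"type": "point", "label": "Well", "coords": (1, 2)},           # (col, row)
    {"type": "line", "label": "Road", "coords": [(0, 0), (3, 0)]},  # horizontal segment
]


def rasterize(features, n_rows, n_cols, background="Farm"):
    """Build a grid in which every cell carries a value.

    Only points and axis-aligned lines are handled, to keep the sketch short.
    """
    # Every cell starts with the background value, so no cell is ever empty.
    grid = [[background] * n_cols for _ in range(n_rows)]
    for feature in features:
        if feature["type"] == "point":
            col, row = feature["coords"]
            grid[row][col] = feature["label"]
        elif feature["type"] == "line":
            (c0, r0), (c1, r1) = feature["coords"]
            for row in range(min(r0, r1), max(r0, r1) + 1):
                for col in range(min(c0, c1), max(c0, c1) + 1):
                    grid[row][col] = feature["label"]
    return grid


grid = rasterize(vector_features, n_rows=3, n_cols=4)
for row in grid:
    print(row)
```

The vector list holds two features and nothing else, whereas the raster grid holds twelve values, most of them the background value Farm.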

Vector data is the preferred data structure when tight spatial control is desired, such as for the outlines of houses or roads. Raster data is most suitable for data that includes values for every part of space, such as elevation. Figure 4-4 demonstrates how features from an aerial photograph have been digitized (traced) into a series of vector files. Digitizing in this way is a common use of maps and plans.


Industrial Features Digitized from Aerial Photograph

ArcGIS organizes vector data into shapefiles. A single shapefile used in ArcGIS is, in fact, between three and twelve system files stored on the hard drive. ArcGIS stores raster data in a number of formats, including commonly used image formats such as JPEG and TIFF. It is important to note that a complex project, with numerous datasets and companion images, can eventually require dozens of shapefiles, resulting in hundreds or even thousands of system files. Therefore, projects that include a GIS component should include some discussion of how the data were organized, especially if the information is to be used by future researchers.
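Because one shapefile is spread across several system files, a simple inventory can help document how a project's data are organized. The sketch below assumes that the files of one shapefile share a base name and that the base name contains no dots; the function names and example filenames are hypothetical. The three extensions treated as required (.shp, .shx, .dbf) are the mandatory parts of the shapefile format.

```python
from collections import defaultdict
from pathlib import PurePath

REQUIRED = {".shp", ".shx", ".dbf"}  # the three mandatory parts of a shapefile


def group_shapefiles(filenames):
    """Map each base name to the set of file extensions found for it.

    Assumes base names contain no dots, so "roads.shp.xml" groups under "roads".
    """
    groups = defaultdict(set)
    for name in filenames:
        base, _, ext = PurePath(name).name.partition(".")
        groups[base].add("." + ext.lower())
    return dict(groups)


def missing_parts(groups):
    """For each group that has a .shp file, list any mandatory parts that are absent."""
    report = {}
    for base, exts in groups.items():
        if ".shp" in exts and REQUIRED - exts:
            report[base] = sorted(REQUIRED - exts)
    return report


# Hypothetical project folder listing.
files = [
    "roads.shp", "roads.shx", "roads.dbf", "roads.prj",
    "houses.shp", "houses.dbf",
    "photo.tif",
]
groups = group_shapefiles(files)
missing = missing_parts(groups)
print(missing)
```

Here the houses shapefile is flagged because its index file is absent, while the raster image is ignored because it has no .shp component.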

The next component of a GIS is the data management procedures. In some ways, this is similar to the ideas common to any information management system, such as a database or a spreadsheet. The basic abilities of a GIS are to store, manipulate, and retrieve data. An important consideration too often neglected is that information systems should provide access to multiple users and allow efficient updating (Wheatley and Gillings 2002: 13). The importance of this cannot be stressed enough, especially in an environment like graduate archaeology programs where one student project will sometimes inform numerous projects to follow.

The final component of a GIS is people. In fact, this is arguably the most important component of any GIS created and maintained. No GIS exists in a vacuum, and all are created in order to be used by someone. Individuals use these systems to plan and implement projects, basing critical decisions on the information contained in the GIS. Providing access to other users and implementing an efficient protocol for updating a GIS should inform any project that makes use of these systems; otherwise, creating them serves little purpose beyond an immediate, limited use.