Monday, April 18, 2016

GIS 5935 Lab 15: Dasymetric Mapping

Dasymetric mapping breaks down data aggregated over larger areas into smaller areas or units. It is often used for population density estimates. For example, it could mean taking aggregated statewide population data and breaking it down to the county level. This week's assignment was along those lines. For this analysis, I used the imperviousness values of a raster to estimate prospective student populations for eight high schools. This was done by joining the raster's zonal statistics table to census vector data, then running an Ordinary Least Squares analysis on the joined data to estimate population. From there, I clipped the OLS result to each school boundary and determined its area and new estimated population. The results of my estimated population vs. a reference, "true" population can be seen below:

School           Reference Population   Estimated Population   Error   Abs(Error)
Hagerty                          4706                   4214     492          492
Lake Brantley                    6313                   6094     219          219
Seminole                        11776                  10881     895          895
Winter Springs                   5693                   3863    1830         1830
Lyman                            7853                   8477    -624          624
Oviedo                           4780                   4750      30           30
Lake Howell                      8585                   6561    2024         2024
Lake Mary                        5014                   4885     129          129
Total:                          54720                  49725    4995         6243
Accuracy (%): 11.41 (total absolute error as a share of the total reference population)
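
The error columns and the final percentage can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the table, not the workflow used in ArcGIS; the variable names are my own, and it assumes the percentage is the total absolute error divided by the total reference population, which matches the numbers shown.

```python
# (reference population, estimated population) per school, from the table above
schools = {
    "Hagerty": (4706, 4214),
    "Lake Brantley": (6313, 6094),
    "Seminole": (11776, 10881),
    "Winter Springs": (5693, 3863),
    "Lyman": (7853, 8477),
    "Oviedo": (4780, 4750),
    "Lake Howell": (8585, 6561),
    "Lake Mary": (5014, 4885),
}

# Error is reference minus estimate; a negative value means an overestimate.
errors = {name: ref - est for name, (ref, est) in schools.items()}

total_reference = sum(ref for ref, _ in schools.values())
total_estimated = sum(est for _, est in schools.values())
total_error = sum(errors.values())
total_abs_error = sum(abs(e) for e in errors.values())

# Assumed definition: share of the reference population that was misallocated.
pct_error = 100 * total_abs_error / total_reference

print(total_reference, total_estimated, total_error, total_abs_error)
print(round(pct_error, 2))
```

Using the absolute error keeps the overestimate for Lyman from cancelling out the underestimates at the other schools.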

Tuesday, April 12, 2016

GIS5935 Lab 14: Modifiable Areal Unit Problem

A prime example of the modifiable areal unit problem is the delineation of political districts. Gerrymandering occurs when the boundaries of political districts are manipulated to gain a particular advantage. The purpose of this analysis was to measure how gerrymandering affected the boundaries of a set of districts. Districts are affected in two ways: compactness, where the geometric boundaries take on unusual shapes, and community, where counties or other communities are divided into multiple districts.

To measure compactness and find the districts with the oddest geometric properties, I computed the ratio of each boundary's perimeter to its area and identified the ten worst districts. To measure community, I determined which districts broke up the boundaries of the most counties, excluding any districts that fell completely within a single county, to account for higher population densities. The first example below shows compactness, and the second shows community:





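The perimeter-to-area ratio used for compactness can be sketched in plain Python. This is a minimal illustration, not the ArcGIS field calculation from the lab: the polygons are made-up coordinate lists, and the function names are my own. The Polsby-Popper score at the end is a different, scale-independent measure that was not part of this analysis; it is included only for comparison.

```python
import math

def polygon_area(points):
    """Area of a simple polygon (shoelace formula); points are (x, y) tuples."""
    total = 0.0
    for i in range(len(points)):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def polygon_perimeter(points):
    """Sum of edge lengths, closing the ring back to the first point."""
    total = 0.0
    for i in range(len(points)):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % len(points)]
        total += math.hypot(x2 - x1, y2 - y1)
    return total

def perimeter_area_ratio(points):
    """Higher values indicate a less compact shape (depends on the units used)."""
    return polygon_perimeter(points) / polygon_area(points)

def polsby_popper(points):
    """Alternative measure, NOT used in the lab: 1.0 for a circle, near 0 for odd shapes."""
    return 4 * math.pi * polygon_area(points) / polygon_perimeter(points) ** 2

# Hypothetical districts with the same area but very different shapes
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
sliver = [(0, 0), (10, 0), (10, 0.1), (0, 0.1)]

print(perimeter_area_ratio(square), perimeter_area_ratio(sliver))
```

Both shapes cover the same area, but the long sliver has a much higher ratio, which is the pattern that flags a district as a compactness outlier.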

Monday, April 4, 2016

GIS5935 Lab 13: Effects of Scale

This analysis consisted of comparing two DEMs, one from LIDAR and one from SRTM. Both were set to the same coordinate system and the same cell size of 90 meters, then compared in a number of ways. To compare the two DEMs, I first looked at the differences in elevation and slope. I checked the minimum and maximum elevation for each one, as well as the minimum, maximum and average slope. I also compared the aspect of each to find any major differences in the direction that each cell faced.

The slope and elevation differences can be seen here:

                           LIDAR      SRTM
Maximum Elevation (m)    1063.67      1053
Minimum Elevation (m)       4.31        12
Maximum Slope (deg)        50.46     45.77
Minimum Slope (deg)         0.89      1.53
Average Slope (deg)        29.65     27.49
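
To make the size of each gap explicit, the differences can be computed directly from the table. This is just the subtraction written out as a short sketch; the dictionary layout and names are my own.

```python
# (LIDAR, SRTM) values from the table above; elevation in meters, slope in degrees
stats = {
    "Maximum Elevation": (1063.67, 1053.0),
    "Minimum Elevation": (4.31, 12.0),
    "Maximum Slope": (50.46, 45.77),
    "Minimum Slope": (0.89, 1.53),
    "Average Slope": (29.65, 27.49),
}

# Positive values mean the LIDAR figure is higher than the SRTM figure.
differences = {name: round(lidar - srtm, 2) for name, (lidar, srtm) in stats.items()}

for name, diff in differences.items():
    print(f"{name}: {diff:+.2f}")
```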

The differences in aspect can be seen in this comparison:

The comparison of the two DEMs showed a slight but noticeable difference in each measure. For example, the maximum elevation for the LIDAR data was about 1064 meters, while the SRTM maximum was roughly ten meters lower at 1053. The minimum elevations were 4.3 and 12 meters, respectively. The image above also shows the differences in aspect that can be seen throughout each DEM. These can possibly be explained by how each DEM was developed. Since the LIDAR data was collected by instruments much closer to the ground than the SRTM data (which was collected from a space shuttle), subtle differences in aspect are bound to arise.
There was also a slight difference in slope between the two, with LIDAR having an average slope of 29.65 degrees and SRTM an average of 27.49 degrees. This again could probably be attributed to how each set of data was developed. Because LIDAR detects the ground at a much finer level of detail than SRTM, which was an earth-encompassing project, there is likely to be a little more generalization throughout the SRTM data. The higher average slope of the LIDAR data suggests slightly less generalization and therefore slightly more accuracy.


Wednesday, March 30, 2016

GIS5935 Lab 12: Spatial Regression in ArcGIS

Spatial regression using tools in ArcGIS was the topic for this week. The two tools used were Ordinary Least Squares (OLS) and Geographically Weighted Regression (GWR). The second part of the lab consisted of carrying out a regression analysis using both tools and comparing the results. Two shapefiles were used as the data for this, one consisting of the locations of all crimes reported for a specific county in one year, and the other consisting of census tracts for the county with different demographic variables. From there, one crime was selected on which to perform the analysis, and three variables were selected as well. Then the crime rate for each census tract was calculated to be used as the dependent variable.

First, the OLS tool was run on the selected variables. Then the GWR tool was run on the same variables. For the GWR, I tried both adaptive and fixed kernel types, then ran the Global Moran's I tool on both sets of results to see which one produced the better performing model. The adaptive kernel type seemed to work better. From there I read through the statistics of all of the results to compare them and see how the model improved.
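
For readers curious about what the Global Moran's I check measures, here is a rough sketch of the statistic itself. It is not the ArcGIS implementation: it assumes a simple binary neighbor matrix (1 if two areas are neighbors, 0 otherwise), the input values are made up, and the function names are my own.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted sum of cross-products between every pair of locations
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    squared = sum(d * d for d in dev)
    return (n / total_weight) * (cross / squared)

def chain_weights(n):
    """Hypothetical layout: n areas in a row, each neighboring the next one."""
    w = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = 1
        w[i + 1][i] = 1
    return w

weights = chain_weights(4)
clustered = morans_i([1.0, 2.0, 3.0, 4.0], weights)      # similar values sit together
alternating = morans_i([1.0, -1.0, 1.0, -1.0], weights)  # neighbors are opposites

print(clustered, alternating)
```

Positive values indicate that similar residuals cluster together, and negative values indicate that neighbors tend to differ. The ArcGIS tool also reports a z-score and p-value for the index, which this sketch does not compute.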

The GWR improved on the OLS results. The adjusted R-squared rose from about 35% with OLS to about 40% with GWR, and the AIC dropped from 1916 to 1909.

Tuesday, March 22, 2016

GIS 5935 Lab 11: Regression in ArcGIS

This week's assignment consisted of using tools in ArcGIS to perform a regression analysis. Previously, similar analyses had been carried out in Excel, but ArcGIS takes it a couple of steps further. In Excel you can determine the necessary elements of a regression analysis, such as correlation, adjusted R-squared and p-values for all of your variables. ArcGIS also helps to determine how well the analysis performs and which variables should be included or excluded.

The performance of the model is determined using the Ordinary Least Squares tool, which generates a regression analysis using dependent and independent variables from a feature class. The results of this tool help to determine which variables work better, and which ones could be biased or redundant and should possibly be excluded. One thing this tool does very well is analyze the residuals to check for spatial autocorrelation and for missing explanatory variables. If there is an issue, it advises running the Spatial Autocorrelation tool on the residuals, which tells whether the residuals are randomly distributed or clustered. This is an especially useful tool for improving models because it is a simple, straightforward way to check the distribution of the residuals and can help pinpoint potential problems with the regression analysis.

Monday, March 14, 2016

GIS 5935 Lab 10: Bivariate Regression

For this assignment, I used a regression analysis to estimate the missing rainfall data for Station A between 1931 and 1949. To accomplish this, I first had to determine the slope and the intercept coefficient for the relationship between the two stations, using the years for which both had data. Then I multiplied the slope by the value of Station B for each year and added the intercept coefficient to estimate the rainfall that was missing for each year for Station A. While something like rainfall is impossible to predict precisely, the statistics used here could be very useful in similar scenarios. It's not very different from recent previous assignments, like surface interpolation. It may not be precise, but it gives you a good idea of what the reality probably is or was.

The results can be seen below:


Slope: 0.846171
Intercept: 162.3421

Year   Station B   Station A
1931     1005.84     1013.45
1932     1148.08     1133.81
1933      691.39      747.37
1934     1328.25     1286.27
1935     1042.42     1044.40
1936     1502.41     1433.64
1937     1027.18     1031.51
1938      995.93     1005.07
1939     1323.59     1282.33
1940      946.19      962.98
1941      989.58      999.70
1942     1124.60     1113.94
1943      955.04      970.47
1944     1215.64     1190.98
1945     1418.22     1362.40
1946     1323.34     1282.11
1947     1391.75     1340.00
1948     1338.97     1295.34
1949     1204.47     1181.53
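
The calculation behind the table can be sketched in Python. The slope and intercept are the rounded values reported above, so results may differ from the table in the last decimal place. The fit_line function shows one standard way to get a least-squares slope and intercept; it is an illustration only, since the lab work was done in Excel and the years used to fit the line are not listed here. All names are my own.

```python
SLOPE = 0.846171      # reported slope (rounded)
INTERCEPT = 162.3421  # reported intercept (rounded)

def estimate_station_a(station_b):
    """Estimate Station A rainfall from Station B: y = slope * x + intercept."""
    return SLOPE * station_b + INTERCEPT

def fit_line(xs, ys):
    """Least-squares slope and intercept for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# A few Station B values from the table above
station_b = {1931: 1005.84, 1932: 1148.08, 1936: 1502.41}
estimates = {year: round(estimate_station_a(b), 2) for year, b in station_b.items()}
print(estimates)
```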

Monday, March 7, 2016

GIS5935 Lab 8: Surface Interpolation

In the first part of this lab, I compared the results of using Spline and IDW interpolation methods to create a Digital Elevation Model. This was done by using elevation data points as the input for each technique, then running each tool with the same parameters and comparing the results.
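
As a rough idea of how IDW works, each unknown location gets a weighted average of the known elevation points, with closer points counting more. The sketch below is a bare-bones version and not the ArcGIS IDW tool: it uses every sample point rather than a search radius, the power of 2 is an assumed default, and the sample points are made up.

```python
import math

def idw(x, y, samples, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples is a list of (x, y, elevation) tuples; power controls how quickly
    a point's influence falls off with distance.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, elevation in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            # Exactly on a sample point: return its value directly.
            return elevation
        weight = 1.0 / distance ** power
        weighted_sum += weight * elevation
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical elevation points (x, y, elevation in feet)
points = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]

print(idw(1.0, 0.0, points))  # halfway between the two points
print(idw(0.5, 0.0, points))  # closer to the 10 ft point
```

Because the result is always a weighted average, IDW never produces a value above the highest or below the lowest input point, while Spline can overshoot them. That is one reason the two surfaces diverge most where there are few data points.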

Overall, the difference between the results of the two interpolation methods was not large, but there were some notable differences. Throughout a majority of the data, the difference in elevation between the two datasets is anywhere from 2 to 12 feet, but there are also several places where the difference is 30 to 40 feet. The areas with these larger differences, however, are mostly the areas without elevation data points, which shows how each interpolation process fills in gaps in the data slightly differently.

Below is a map layout that shows the areas of difference between the two methods: