The goal of this exercise is to explore the basics of downloading data, geocoding with ArcGIS, and analyzing the results for error. The exercise consisted of the following objectives:
- Download list of mines from the WisconsinWatch website.
- Geocode the mines using the ESRI address locator.
- Geocode the mines with PLSS if need be.
- Compare results with classmates.
The first step in this exercise was to obtain locational data for mines in Wisconsin. Professor Hupy directed us to WisconsinWatch, an investigative journalism website that provides an Excel file with address records for over 130 sand mines in Wisconsin. The data was split among the students in the class so that each student had 14 mines to geocode, with overlap among students so that we could check for potential errors. Figure 1 below displays the data that I was required to geocode.
| Figure 1. Raw location data for 14 sand mines in Wisconsin. Notice how inconsistent the Facility Address column is. |
I used Google Earth and the ESRI address locator, together with the address descriptions that were given, to manually locate each mine and update its address. I first located the mine by analyzing aerial images in Google Earth, then used ESRI to select the address nearest to the mine's location. I recorded the updated information in a new Excel spreadsheet, to be geocoded once all the addresses were normalized. Google Earth was necessary because it has much more recent aerial images, which revealed mines that weren't even present in ESRI's aerial images.
Some of the mines had only PLSS information for the Facility Address (see the last row in Figure 1). The PLSS (Public Land Survey System) divides the state into a grid of squares, organized by Township (34N in the above example), Range (11W in the above example), and Section (28 in the above example). In that example, the mine is located in the southwest quarter of the southwest quarter of Section 28 of Township 34 North, Range 11 West. Given this information, and using a PLSS grid supplied from the department's online database, it was easy to find the mine's location on the aerial image. After this process, I normalized the data in the new Excel spreadsheet by adding City, Address, and Postal columns and filling in the new addresses as I located them with ESRI. Figure 2 shows the completed Excel table.
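To make the quarter-of-a-quarter description more concrete, here is a minimal sketch of how such a description narrows down a location inside a section. It assumes a nominal section that is exactly 1 mile on a side (real sections vary), and the function name `quarter_bounds` and the list format for the quarters are my own illustration, not anything from the spreadsheet or from ArcGIS.

```python
# Fractional offset (west-to-east, south-to-north) of each quarter
# within its parent square, measured from the parent's SW corner.
QUARTER_OFFSETS = {
    "NW": (0.0, 0.5),
    "NE": (0.5, 0.5),
    "SW": (0.0, 0.0),
    "SE": (0.5, 0.0),
}

SECTION_SIDE_MILES = 1.0  # assumption: a nominal, perfectly square section


def quarter_bounds(quarters):
    """Return (x_min, y_min, x_max, y_max) in miles from the section's SW corner.

    `quarters` is written the way the description is read, smallest unit
    first: ["SW", "SW"] means the SW quarter of the SW quarter.
    """
    x, y, size = 0.0, 0.0, SECTION_SIDE_MILES
    # The largest subdivision is named last, so work backwards through the list.
    for quarter in reversed(quarters):
        dx, dy = QUARTER_OFFSETS[quarter.upper()]
        x += dx * size
        y += dy * size
        size /= 2
    return (x, y, x + size, y + size)


# The mine in the example: SW quarter of the SW quarter of Section 28.
x_min, y_min, x_max, y_max = quarter_bounds(["SW", "SW"])
acres = (x_max - x_min) * (y_max - y_min) * 640  # 640 acres per square mile
print(x_min, y_min, x_max, y_max, acres)
```

Under these assumptions the description pins the mine to a quarter-mile square (40 acres) in the southwest corner of the section, which is a small enough area to search by eye on an aerial image.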
| Figure 2. Normalized address data for the same 14 mines. The updated addresses are much cleaner and easier to locate than the location information originally supplied. |
| Figure 3. The results of the Point Distance tool. There were 224 rows (each of my 14 mines paired with each of 16 classmates' mines). The table contains far more rows than are needed. |
| Figure 4. Table of my mines (INPUT_FID) paired with the nearest of my classmates' mines (NEAR_FID), with the distance between the two in meters and kilometers. |
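The step from the 224-row table in Figure 3 to the short table in Figure 4 can be sketched in plain Python. This is only an illustration of the idea, not the ArcGIS tool itself: the coordinates are made-up values in meters, the function names are hypothetical, and the `DISTANCE` column name is my assumption (only `INPUT_FID` and `NEAR_FID` appear in Figure 4).

```python
import math


def point_distance(inputs, nears):
    """Build the all-pairs table: one row per (input point, near point) pair."""
    rows = []
    for i, (x1, y1) in enumerate(inputs):
        for j, (x2, y2) in enumerate(nears):
            rows.append({
                "INPUT_FID": i,
                "NEAR_FID": j,
                "DISTANCE": math.hypot(x2 - x1, y2 - y1),  # straight-line, in meters
            })
    return rows


def nearest_only(rows):
    """Keep, for each INPUT_FID, only the row with the smallest DISTANCE."""
    best = {}
    for row in rows:
        fid = row["INPUT_FID"]
        if fid not in best or row["DISTANCE"] < best[fid]["DISTANCE"]:
            best[fid] = row
    return [best[fid] for fid in sorted(best)]


# Hypothetical projected coordinates in meters (not the real mine locations).
my_mines = [(0.0, 0.0), (1000.0, 0.0)]
class_mines = [(3.0, 4.0), (1000.0, 600.0)]

all_pairs = point_distance(my_mines, class_mines)
nearest = nearest_only(all_pairs)
for row in nearest:
    row["DISTANCE_KM"] = row["DISTANCE"] / 1000  # meters to kilometers
    print(row)
```

With 14 input points and 16 near points, `point_distance` would produce the same 14 x 16 = 224 rows, and `nearest_only` would reduce them to 14. The distances are only meaningful if both sets of points are in a projected coordinate system measured in meters.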
Overall, my classmates and I were fairly consistent with the mines that we mapped. Figure 5 shows the completed map of my mines compared to those of my classmates. This map and Figure 4 above both support this claim. The closest pair of mines was only 8 meters apart, while the farthest pair was 19 km apart. That outlier is easy to see on the map below. It could have resulted either from my identifying an incorrect mine or from my classmates failing to identify the correct one.
| Figure 5. A map of our assigned sand mines. |
After all this work, it is easy to imagine the sources of error between my mines and those of my classmates. Each of us was fairly precise in locating our mines, since there appear to be clear pairs for each mine. However, the accuracy for these mines varies depending on the operator. Any error in this activity is most likely an operational error, that is, an error created by inevitable variation or mistakes on the part of the person operating the software or hardware. I would say that our errors are due to inevitable variations in feature classification, image analysis, and attribute data input. There will always be errors in image analysis; what one person identifies as a small mining operation, another may identify as a large farm. In aerial imagery it isn't always clear what an object is, and that ambiguity sometimes causes errors. During the geocoding process, errors due to variation in feature classification and attribute data input are also unavoidable. Sometimes there are several addresses to choose from, and it is up to the analyst to make a decision; not everyone will make the same one. Errors like this cannot be prevented entirely; it is up to the analyst to be aware of their presence and either fix them or take them into account when using the data.
Conclusion
It was no easy hike getting from that first sloppy address table to a complete map of geocoded mines. That said, I feel that my classmates and I did a good job with the mines that we were assigned. The map shows that each mine has a clear partner (they aren't just scattered about), and aside from the 19 km outlier, each mine is within 6 km of the other mine in its pair. I was surprised to see such precise results given the unavoidable errors.
