Monthly Archives: March 2017

Final Project

  • Presentation Due: Tue May 16, 2017 (in class)
  • Final Project Due: Fri May 26, 2017 (by 12 noon)

For the final project you will identify some publicly available datasets, build a workflow, and create a PostGIS database to answer an interesting spatial question of your choosing. You will also write a brief paper and create a map that describes your database.

See this document for full details. Our class on Tue Apr 4 will be devoted to discussing the project, and your homework assignment for that class will be dedicated to doing some background research for it.

Assignment 8 – Spatial Relationships

Due Date: Apr 4, 2017

Introduction

  • In this exercise you’ll be working with the gep664 database in the nyc schema. It’s assumed that you have also loaded the ZCTA shapefile into this schema (which we did in our previous class). You’ll also need to use QGIS.
  • This exercise is worth 6 points. Each question is worth 1 point each.
  • Be sure to follow the proper conventions for SQL statements and the guidelines for submitting assignments, as posted on the course website. Failure to do so will result in point deductions.
  • For this assignment you will need to submit a SQL file and a map as an image file. When you submit your electronic assignment to the Box, attach the SQL file and the image as attachments in the same message. At the beginning of class hand in a paper copy of your SQL file; you don’t have to print out and hand in the map.

Questions

This assignment will give you additional practice with spatial relationships and joins, and you’ll pull together several concepts that you have learned thus far in the course.

You’ll be working with data in the nyc schema; this includes the ZCTA feature that we loaded from a shapefile in our previous class. If you don’t have the ZCTA file download it from the course website (under Class 7 data files) and load in into the gep664 database using the shapefile loader under PgAdmin – Plugins menu or the database manager in QGIS.

For the following, provide: the SQL statement, the row count, and any specific deliverable that the questions asks for.

Part I -Spatial Relationships

There are 339 census tracts and 25 ZCTAs in the Bronx. ZIP Codes / ZCTAs often do not align well with other census or legal geographies. In this exercise you’ll test and see how well they do or do not align with census tracts. Pay careful attention to the relationship you’re being asked to test.

1. Select all census tracts (the tractid and tractnum) that are within ZCTAs in the Bronx.

2. Select all census tracts (the tractid and tractnum) that overlap ZCTAs in the Bronx. Use DISTINCT to remove tracts that overlap more than one ZCTA.

3. Select all census tracts (the tractid and tractnum) that have their geographic centroid within ZCTAs in the Bronx.

Part II – Spatial Join

4. Write a SELECT statement where you select the subway stations (stop id, name, and trains) for the entire city and assign the stations the ZCTA code that they are within. For this statement, use the JOIN clause and make it a left outer join so that you keep all subway records on the left whether they have a matching ZCTA or not. Sort the result in descending order by ZCTA number.

Part III – Spatial Joins and Summaries

To refresh your memory when answering these questions, refer back to the slides and the Boundless tutorial. When writing these longer statements, construct them bit by bit, getting one piece to work before moving on to add the next piece.

In the nyc schema in the course database:

  • census_tracts is a spatial table with census tract geometry; it also contains a column that identifies the Neighborhood Tabulation Area (NTA) that each tract is part of.
  • census2010_tracts contains data from the 2010 Census for each census tract.
  • census2010_lookup contains the list of census variables and the codes used to identify them as column headings in the census2010_tracts table.

5. Identify the codes for the variables that represent the total population, and the total population who are 16 years old and over. Then write a SELECT query where you relate these variables to the census tract geographies and create a summary for NTAs, so each row in the result is for an NTA and the columns represent: total population, total population 16 and over, and the percent of the NTA’s population that is 16 and over for each NTA. To avoid division by zero, select only tracts that have an over 16 population greater than zero. In order to get decimals for your percent total, remember to use the CAST operator around your integers:

(CAST(value1 as numeric))/(CAST(value 2 as numeric))

6. Take the query you wrote in question 5 and save it as spatial view called ntas_workforce (remember to do this, you must add the geometry to the select clause and union it – refer back to the slides). Or if you prefer, you can create a new table and insert the data into it (instead of creating a view). Provide the full query as your answer. Then, using QGIS connect to the course database and add this view to the map. Symbolize the NTAs to display a graduated map showing the percent of the population who are 16 and over. Classify the data using equal intervals with 5 categories. It’s normal if you have some “holes” in your map, as these represent unpopulated areas. You don’t have to create a fancy, finished map (with a layout etc) – just take a screenshot of the window and save the result as an image file (under Project menu – Save as Image). When naming the file, use the same convention as the sql template (i.e. use your name and the assignment number).

Assignment 7 – Spatial Reference Systems

Due Date: Mar 28, 2017

Introduction

  • In this exercise you’ll be working with the gep664 database in the copper schema you created from assignment 5. You’ll also need to use QGIS.
  • This exercise is worth 6 points. Each question is worth 1 point each.
  • Be sure to follow the proper conventions for SQL statements and the guidelines for submitting assignments, as posted on the course website. Failure to do so will result in point deductions.
  • For this assignment you will need to submit a SQL file and a map as an image file. When you submit your electronic assignment to the Box, attach the SQL file and the image as attachments in the same message. At the beginning of class hand in a paper copy of your SQL file; you do not have to print out and hand in the map.

Questions

This assignment will give you additional practice with adding geometry columns, loading spatial data, and transforming coordinate systems.

You’ll be working with data in the copper schema; if you don’t have the files the entire schema is available as a backup file on the course website. Download the course data for class 7 from the Data and Software page, which contains the copper schema backup and a country shapefile that you’ll need to load.

For the following, provide: the SQL statements (questions 1, 3-5) and any specific deliverable that each question asks for.

Part I – Data Loading and Transformation

1. Add a geometry column to the copper smelters table, and build geometry for the table using the longitude and latitude fields. These coordinates are currently in WGS 84 (SRID 4326). When building the geometry, transform them to Pseudo Mercator (SRID 3857).

2. In the course data files there is a shapefile of countries from Natural Earth. Import this shapefile into the copper schema using the shp2pgsql-gui, located under the plugins menu. Name it countries_temp and assign it the appropriate SRID (it is in WGS 84). Note – if you receive an error about UTF-8 encoding while trying to import, hit the options button and change the encoding from UTF-8 to LATIN1. Since there is no SQL to return, copy the output that’s printed in the Log Window of the shp2pgsql-gui after you’ve loaded the shapefile, and paste that into your text file as your answer.

3. Create a new table called countries_bndy that will hold a cleaned-up version of countries_temp. Use the new column names (1st column in the list below) and look at the existing columns (2nd column in the list below) in the countries_temp table to assign appropriate data types. Make uid the primary key, and make sure to include a geometry column in the new table that can hold multipolygons and assign it the SRID for Pseudo Mercator.

uid - adm0_a3
iso - iso_a3
name - name
name_long - name_long
ctype - type
continent - continent
subregion - subregion

4. Insert the appropriate data from countries_temp into the countries_bndy file, and as part of the insert operation transform the geometry from WGS 84 to Pseudo Mercator.

5. Create a view called smelter_sum where you join the smelters table to the countries_bndy file using the iso country codes in each table. In the view, group the data by country, count the number of smelters, and sum the smelter capacity. Make sure to include the geometry column (in SELECT and GROUP BY). In your answer, return the SQL statement AND the number of rows that are in the view.

Part II – Make a Map

6. Using QGIS, connect to the course database and add the smelter_sum view to the map. Symbolize the countries to display a graduated map showing total smelter capacity. Classify the data using equal intervals with 5 categories. You don’t have to create a fancy, finished map (with a layout etc) – just take a screenshot of the window and save the result as an image file (under Project menu – Save as Image). When naming the file, use the same convention as the sql template (i.e. use your name and the assignment number).

Need help using QGIS?

Midterm Quiz

Takes Place: Mar 28, 2017

The midterm quiz will take place at the beginning of class on Tue Mar 28th. This is the same date that it was originally scheduled for (i.e. it’s not being moved because of the snow). Make sure to arrive on time for class.

The quiz is worth 10 points (each question is worth 1 point) and will consist of two parts:

Part I – Definitions

Of these 7 terms, 5 will appear on the quiz. You will choose 3 to define in 4-6 complete sentences. Your answers must address the primary meaning of the terms, with some supporting details. You may not bring any notes (test is closed-book). Look at these example definitions to see what a full, partial, and no credit answer would look like.

  • Alias
  • Data type
  • Entity Relationship Model
  • Geometry type
  • Primary key
  • Schema
  • View

Part II – SQL

You will be given print outs of 2 sample tables. There will be 7 questions where you are asked to write a SQL statement based on these tables. The material in this part covers just the fundamentals from classes 2 & 3, except there will be 1 question about adding geometry columns. Your statements must follow the standard SQL style guidelines.

You will be given a SQL reference sheet (this sheet here) that you can refer to throughout the test, but some questions may include material that is not on the sheet.

Assignment 6 – Spatial Data Basics

Due Date: Mar 14, 2017

Introduction

  • In this exercise you’ll be working with the gep664 database and the sample data from Chapter 1 in PostGIS in Action.
  • This exercise is worth 6 points. Each question is worth 1 point each.
  • Be sure to follow the proper conventions for SQL statements and the guidelines for submitting assignments, as posted on the course website. Failure to do so will result in point deductions.

Questions

The first part of this assignment echoes what we did in class in the lecture and covers material from PostGIS in Action Chapter 2 and the Boundless PostGIS tutorial (parts 9 and 10).

The second part of this assignment is derived from Chapter 1 in PostGIS in Action. It assumes that you have read the chapter and did all of the exercises using the sample data provided with the book. A link for downloading this data is available on the course website on the Data & Software page; it’s a ZIP file that contains subfolders for every chapter in the book.

For the following, provide: the SQL statement (questions 1-6) and the record count (questions 4-6) that is automatically returned by pgAdmin or psql.

Part I – Geometry Basics

For this exercise, you’ll be working in the course database gep664 in the nyc schema.

For the questions in this part, all of the geometry should be defined using SRID 2263, which is the local state plane zone for NYC (NAD 83 NY State Plane Long Island (ft-us)).

1. Create a point feature to represent the city’s primary train stations. Create an empty table called nyc.train_stations with 3 columns: tid integer, name text, and geom geometry(point,2263). Write an INSERT statement where you insert the VALUES for id, name, and point geometry into the table. Use the manual method for constructing geometry as outlined in section 9.1 in the Boundless tutorial. In your answer, just provide the final statement (the INSERT).

1,Penn Station,986029,212733
2,Grand Central Station,990607,213615

2. Create a line feature to represent the Times Square Shuttle subway. Create an empty table called nyc.subway_shuttle with 3 columns: sid varchar(1), name text, and geom geometry(linestring,2263). Look in the subway stations table and find the stops for Grand Central (stop_id 901) and Times Square (stop_id 902). Return the geometry from these stations as text so you can see the coordinates (they’re in NY State Plane). Using these coordinates, write an INSERT statement where you insert: an id (S), name (Times Square Shuttle) and linestring geometry into the nyc.subway_shuttle table. Use the manual method for constructing geometry as outlined in section 9.1 in the Boundless tutorial. In your answer, just provide the final statement (the INSERT).

3. Create point features for the NJ PATH stations in Manhattan. Create an empty table called nyc.path_stations with 5 columns: sid integer, name text, xcoord numeric(6), ycoord numeric(6), and geom geometry(point,2263). Write an INSERT statement to insert the data into the table. For this question, do NOT use the manual geometry methods you used above; create the geometry using the ST_Point function. In your answer provide the INSERT statement and the statement used for creating the geometry.

1,33rd Street,987455,211986
2,23rd Street,986185,209902
3,14th Street,985064,207918
4,9th Street,984618,206718
5,Christopher Street,982237,206324
9,World Trade Center,981276,198593

Part II – PostGIS in Action Chapter 1

For this part, you’ll work with the sample data from Chapter 1 of PostGIS in Action. This part assumes that you read and completed the chapter’s exercises. Your answer must include the SQL statement AND the row count.

4. Count all of the restaurants that are within 1/2 mile of a highway within the states of NY, NJ, and CT. Group by franchise, so the result shows the total count of each restaurant within all three states (i.e. do not group by state).

5. Return the coordinates in plain text of all Pizza Hut restaurants in the restaurants table.

6. Calculate and sum the lengths of all the principal highways in miles within the state of NJ by highway name. Since many of the highways exist as multiple features be sure to group them. Sort the results from longest to shortest distance.