Due Date: Mar 7, 2017
- In this exercise you’ll be working with the copper data tables we processed in class and the gep664 database.
- This exercise is worth 6 points. Questions 1-5 are worth 1 point each, questions 6-7 are worth 1/2 point each.
- Be sure to follow the proper conventions for SQL statements and the guidelines for submitting assignments, as posted on the course website. Failure to do so will result in point deductions.
For the following, provide the SQL statement AND the record count that is automatically returned by pgAdmin or psql (for the tables you import, verify the record count after import).
This assignment assumes that you have already processed and cleaned the copper data tables as we did in class. The original data files are available for download on the course website on the Data & Software page. You will be importing data into the gep664 database.
When importing data into the database you may use the \copy command or the pgAdmin GUI (create a blank table to hold the data, right-click it, and choose Import).
Remember: (1) be careful when assigning number types to columns, to ensure that you don't truncate data, and (2) for the \copy command or the GUI Import to work, the order and number of columns in the database table must match the order and number of columns in the import file.
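For reference, the create-then-import workflow looks like the sketch below. The table, column names, and file name here are placeholders for illustration, not the actual schema for any of the assignment tables:

```sql
-- Hypothetical example only; your column list must mirror the CSV exactly
CREATE TABLE copper.example_table (
    rec_id   integer,
    country  varchar(100),
    ore_tons numeric      -- numeric avoids truncating large or decimal values
);

-- In psql, run from the directory containing the file:
\copy copper.example_table FROM 'example.csv' WITH (FORMAT csv, HEADER true)

-- Verify the record count after import:
SELECT COUNT(*) FROM copper.example_table;
```

If the column order or count in the CSV differs from the table, \copy will fail or load values into the wrong columns, which is why the column match rule above matters.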
Part I – Table Creation and Import
For this exercise, create a schema called copper for the user postgres (CREATE SCHEMA copper AUTHORIZATION postgres;) in the gep664 database, and import the copper data tables that we cleaned in class.
1. Create a new database table called smelter and import the copper smelter data.
2. Create a new database table called trade and import the copper ore trade data (you don’t have to do anything with the pivot table we created in class).
3. Create a new database table called mines_porcu and import the main porphyry copper mining data (from main.csv). As a reminder – in class we deleted several columns; these are the only ones that should be in the final table: rec_id, depname, country, ccode, stprov, latitude, longitude, oreton, cugrd, mogrd, augrd, aggrd, deptype, comments.
Part II – Data Cleaning and Import
For this part, you’ll work with the main sedimentary copper mine data (main.csv in the sedcu folder) that we did not use in class, but that is stored with the other sample data. It is in a CSV format and is similar to the porphyry table in structure.
4. Using the ISO country code table that we used in class, write an Excel VLOOKUP formula to assign the proper 3-letter ISO country code to each country. For codes not found, fill them in manually. Instead of SQL code, submit the VLOOKUP formula as your answer and comment it out in the template. After the formula write one sentence that indicates which columns the formula is referring to for the lookup value, range, and returned value.
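As an illustration only, a VLOOKUP of this kind has the general shape below; the cell references are made up, and your actual columns will differ. Here B2 would hold the country name (the lookup value), codes!$A$2:$B$250 the country/code lookup range, 2 the column of that range returning the ISO code, and FALSE forces an exact match:

```
=VLOOKUP(B2, codes!$A$2:$B$250, 2, FALSE)
```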
5. Delete unnecessary columns and save the table. Create a new database table in the copper schema called mines_sedcu and import the main sedimentary copper mining data. The columns that should appear in the final table are: rec_id, depname, country, ccode, stprov, latitude, longitude, oreton, cugrd, cogrd, aggrd, deptype, comments. Most of the columns are similar to the porphyry table, but you will need to re-order them so they're consistent with that table.
Part III – Select and Summarize
For this part, you will create summary views in the copper schema for the data you just imported.
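A summary view of this kind generally follows the pattern sketched below; every table, column, and view name here is a placeholder, not the answer to any question in this part:

```sql
-- Generic shape only: aggregate by country, sort, and limit
CREATE VIEW copper.example_summary AS
SELECT country,
       COUNT(*)      AS num_records,
       SUM(ore_tons) AS total_ore
FROM copper.example_table
GROUP BY country
ORDER BY num_records DESC
LIMIT 20;
```

You will need to adapt the aggregates, grouping, sorting, and any WHERE filtering to what each question actually asks for.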
6. Create a view called exports_5yr that sums the total copper ore exports (just exports, not re-exports) by country for the past 5 years, and display just the top 20 countries.
7. Create two separate views, one called pmine_total for porphyry mines and one called smine_total for sedimentary mines, that counts the number of copper mines (a count of records) and sums the total ore in tons available by country. Sort each view by number of mines in descending order.
BONUS Question. Optional, and worth 1 point. Count the mines and sum the ore totals for porphyry and sedimentary sites in one query to create a single view called copper.mines_total. Counts from both tables must appear, even if they don’t have a corresponding match in the other table. You may not change the underlying structure of the tables or create different tables, but (hint) you may import the ISO country code table into the database as is and use it in the query.