6A-Importing Data into Access
1. Open a new blank access database (programs, office, access) and save it with a name and in a folder that you will remember (e.g., on your jump drive and named 6-ExampleData):
2. Open the first excel file that you will be importing (in this case 2ndQ08_hourly.xls). Assume that this is NOT your original file, which you have kept in a different folder, but a file that you have copied from the original file and can manipulate or even lose, and have the original file to go back to. Open the sheet Raw_Data. (This is a protected sheet, so that you can copy but not delete cells. The pw is itep, but don’t delete anything—that is just for your info in case you want to do the same thing. Try to unprotect it to see how it works; click on Review in the toolbar. Note that this is poor protection because you can delete the entire sheet, so this method is not foolproof.)
It is easiest if the first row contains column names, and the data begins right away in the second row. In this case you can see that the units populate the 2nd row. This is useful in order to verify the units, so we do not want to delete that row completely. Instead, copy the entire sheet into another sheet and name it To_Import. This second sheet is unprotected, so cut the 2nd row from To_Import and copy it to a 3rd sheet that you will rename Units. Now you have one data table for import (sheet called To_Import) with column headers in the first row and contiguous data beginning in the second row. Save the file.
Now, check each column in the To_Import sheet for weird values one final time using Autofilter. You can see that the column PM25 Avg contains repeated values of 985, which are obviously errors (and based on conversations with our site operator we know that this column is a calculated column that we do not need). So, make a note that we will not import this column based on the site operator’s knowledge of what the column means (we will just use the column PM25, and you can see that column contains blanks for those rows that contain 985 for the PM25 Avg, indicating that those rows were invalidated for PM25). Now that you have reviewed your data to import, you can begin to import all the data into Access.
3. When starting an Access database, you can either create a table in design view, or import some data first and then make sure the table design is what you want (e.g., formats of Number and not Text for your concentrations, etc.). In this case we are not sure how many columns and what types of data they are so we will first import one quarter of data and then set up that table the way we want to.
- Click External Data, Excel, and browse to find the file of your data. MAKE A NOTE of which table you are importing, especially if you have a lot of excel tables to import. In this case, import 2ndQ08_hourly (that is the table with the earliest data). Import the source table in a new table. Click ok. Make sure you are importing the correct sheet (in this case the sheet name is called To_Import (but you have also the sheet RawData, which you will not manipulate but only copy from if necessary), and that you have clicked First Row Contains Column Headings.
- Check the Field Name and Data Type Options: especially date and time, and each parameter’s value (Double for the values, or Text for flags). Use the Data-Time field as the key field, so you need to check Indexed Yes, No Duplicates for that field. If you have a lot of fields to import and you want, as we do in this case, the Data Type Selected as Double, you can just keep clicking on the next column and type D in the Data Type panel to select Double in the drop-down box and it will go faster. Be careful in this step, because for some reason Access wants to import several of these numeric fields as Text. In order to keep the decimals, you want Double as the data type.
4. Do not import the column PM25 Avg:
5. Choose date-time as the key field:
6. Name the table tbl_All:
7. Verify that the table has the same number of records (minus one for the header row) as you had in your excel file. (If you imported 2ndQ08_hourly.xls that will be 2411 rows.) Look at the tbl_All in design view also, by right-clicking the table in the navigation panel on the left, to make sure that the Data Types are as you intend them:
And one row down:
8. Now look at the table in datasheet view by clicking the View icon in the upper left in the menu bar. Verify that the data include the same values as are in the excel file.
9. Close the table, because you will now be importing the rest of the data. For simplicity, all the rest of the example data is in one file: 2ndImportIntoAccess.xls which contains 21753 rows of data (the rest of the valid data through 12/31/2010). Because we will be APPENDING to an existing table, Access will assume that all the columns are to be imported. So, first go into the file 2ndImportIntoAccess and delete the entire column PM25 Avg. Make sure you delete the entire column and not just all its data. (Delete, Delete Sheet Columns.) Now save the file. (If this were your data, you would save it with a new name, but we have the original data saved for you in case you make an error with this file.) Because this file is so huge it takes a few seconds to save.
10. In your Access database (with the tbl_All) click on Data, External, Excel, and browse to find the file, and append to the tbl_All. Click ok.
In the next step, Access will import the columns into the already-named fields using the field types you specified in the first import (It might take a few seconds). Specify the import to be from the sheet To_Import. Note that Access does not ask you about column headings or data types because these are already specified in tbl_All.
11. Verify that now tbl_All contains 24164 rows, which is the count of the data rows in the two excel files that you imported. (2411 rows in the first imported excel file, plus 21753 rows in the rest of the data). Check that the first date-time is 3/13/08 0:00 and the last date is 12/31/10 23:00 by sorting (Home, Sort & Filter, Z-to-A).
12. Exit this database. In windows explorer, navigate to the database, right-click it, select Copy, then Paste, then rename the database copy something like: “CopyAfterImport.
5 (9/7/12)