SESSION 2 - DATA RETRIEVAL AND MANAGEMENT

2.1 Introduction

Very often, when writing essays or projects, you will want to refer to empirical data. These data can be collected in various ways. One way is to go to the statistics section in the Library and to consult government publications such as Economic Trends, Social Trends, etc. There are also an increasing number of sources on the World Wide Web but these sources are frequently modified and up-dated so be aware that some of the details below may be out of date and you may need to dig around to find exactly what you want.

The basic principles for doing this kind of research on 'secondary' data are the same as those followed in most scientific work. First there is a theory from which hypotheses are derived. Evidence is then explored to establish whether there is any provisional support for the hypothesis, and conclusions are drawn. The data exploration part of this process is what concerns us here. There are really two parts to it: one is for purposes of description and the other is for making inferences about the validity of hypotheses. We will consider both aspects in more detail later on in the course. The objective at this stage is to ensure familiarity with the process of collecting data from web-based sources and importing the data into applications where they can be explored. The normal sequence, once you are clear about the hypothesis or research question you are asking, will be as follows:

  • locate and view the data available
  • select the bits of data you want (the variable name or definition and the time period you are interested in);
  • download data to a file;
  • import the file into the application in which you will analyse the data, or into Excel if you're not sure what you're going to do with them.

2.2 Data Sources

Click here for a selected list of sites offering economic data.

Click here for the Department’s links to data and other resources.

2.2.1 UK official statistical data

For data on the UK economy the best option is the site of the government's statistical office, at http://www.statistics.gov.uk/ It offers a wide range of data (free) and it is easy to use.

2.2.2 The Biz/ed site

An alternative to the government statistics site is the business education site, Biz/ed, accessible by clicking here(bized.ac.uk) It includes links to:

  • A listing with links to companies belonging to the FTSE 100 Index
  • Country by country data is available at Penn World Data
  • A useful addition to the Biz/ed website is the TIMEWEB package (Time Series Data on the web). You may wish to look at some of the online learning material available to help develop your data analytical skills
  • A very good Economics Glossary.

In addition the Bized site has some data provided by the Office for National Statistics, and so is an alternative route for getting some of the data referred to in 2.2.1 above. This is a slightly cumbersome process but illustrates well how many sites with data work. It downloads the data in a format which Excel can read. It works by saving the data to a file in a format that Excel can recognise and import.

2.2.3 European data

A useful site for European data is the Eurostat site. It covers a lot of material including the following list of themes on which key indicators are available.

2.2.4. World data

Other data sources likely to be of value include from which you can download lots of useful things including some summary data for countries around the world. The Penn World Data site referred to in section 2.2.2 above is also useful for international data. In addition Chinese economic data can be obtained from the following website: China.

2.3 Data Capture and Management

Once you have identified the relevant data (usually after viewing in HTML format) you can download them to a file so as to be able to analyse them. Most web-sites will offer you the chance to save either directly to an Excel file or to a CSV or TXT file that Excel can open and read. Follow the site's instructions. You can often save a file quickly by right-clicking the appropriate link and selecting "Save target as" (if working in Internet Explorer). Remember to specify the directory in which you want the data to be saved (e.g. Drive A: or Drive H:), making sure to name the file as something intelligible. Then open the file in Excel and make sure you have what you want. Even if you import data in CSV format, save your subsequent working in an Excel file, otherwise you will lose any charts or formatting which you have added.

Once you have downloaded data from web-sites to Excel or other applications it is important to learn to organise data in a way that is appropriate to the research question being addressed. Particularly when exploring relationships between variables drawn from different sources you will need to ensure that dates, definitions and so on are mutually consistent. You will need also to ensure that the data are placed in the worksheet in a way that is consistent with producing the tables and graphs you want and with doing any analysis. The general rule is to keep the data as compact as possible (no empty rows or columns). Excel will always assume that the leftmost variable is the X-axis variable.

2.3.1 Using Excel to Produce basic Trendlines

You can use Excel to produce your own variables, such as constants and dummy variables (used extensively for Econometrics in your second year) and time trends. Trends can be used with scatterplots to help explain the relationship and the most basic method is to fit a line or trend to the data. To produce a very basic trend line in Excel, type a ‘1’ into the cell A1, then move to cell A2 and type: ‘=A1+1’. Click on the lower right corner of the cell and drag downwards to produce a linear time trend. Fortunately Excel can be used to add both linear and non-linear time trends to a scatterplot.

2.4 References

For some basic information on Internet sites useful for economists see Chapter 6 of Judge. The links given in these notes should get you started, but this is a rapidly developing area so always be prepared to hunt around a bit and to follow hints or suggestions as to new sites.

2.5 Demonstrations

Demo 2.1

From the Bized home page select the option Data (from the horizontal toolbar) and then ONS from the sub-categories. Then click on Sample Data Available. From the listing of datasets select the one you want and click on the CSV link. For demonstration purposes choose Economic Output (Annual Data) and then UK GDP in constant prices. When Internet Explorer displays the data on screen go through the File, Save As routine, but save the file as an Excel file. When saving the data choose a name that will help you identify the file and a location on Drive A: or H: making sure to keep a note of where you have sent them. Now close down Internet Explorer as if you were ending your session.

Now open Excel and load the data file you saved from the web. You should get something like this

You can now carry out any of the usual procedures for analysing data. Sometimes the data are a bit disorganised, and you have to rearrange them into neat rows.

A good way to start is to plot a line graph of your variables, using your time counter as the data for the x-axis. Click on the worksheet Chart 1 to see an example.

Demo 2.2

In this example we want to explore the conjecture that the decisions to travel abroad on holiday depends on the level of uncertainty in the economy. In order to look at this hypothesis in the context of the UK go to the Biz/ed site, sample ONS data section. Locate annual data on UK visits abroad and save. Then find unemployment levels (claimant count measure) and save. To explore the hypothesis we want a scatter-plot showing the variables in relation to each other, Excel also allows us to add in a trend line. (Go to Chart and add trendline). You will then get a choice including linear and non-linear versions, probably the linear option gives the best fit in this occasion. (we will see in a later session that we can represent the relationship numerically using regression analysis techniques). Having created an Excel file for each variable, we need to copy one set of data into the other data file. Once that is done then the relevant series can be matched up alongside each other. Once the data are matched up in one file, the graph can be drawn. Clickhere to see the file output and chart.
Demo 2.3

Go to the site http://www.statistics.gov.uk/Then follow these steps:

  1. Select the Themes option from the left hand panel.
  2. Select the theme you want from the list; for the purposes of this demonstration choose Labour Market.
  3. From the initial list of topics choose Travel to Work. This will generate a sub-category of options: choose Journey to Work.
  4. Scroll down the listing of information available and highlight the one you want: for this demonstration you are looking for Main Method of Travel to Work, Autumn 2002, Regional Trends 38. (If this fails type in: ‘Main method of travel to work’ in the site’s search engine. Then select 2002)
  5. The next option allows you to view the file as an Excel spreadsheet or as a CSV file. Choose “Download CSV Version” (the Excel version is unreliable).
  6. You should now have a file, looking likethis.Save the data to your own Drive H:, making sure you save it as an Excel file so that you can save your charts and formatting. To do some basic data analysis you first of all should re-organise the data so that it is ready for analysis. Try constructing a stacked bar graph to compare the different types of transport used in the different regions of England. Remember to move rows like the country totals which you do not need for this graph. You may also need to change blank cells to zeros if you have trouble with the appearance of the graph.

Exercises for Class 2 → have a look hereto get started with Excel