Assignment:  Excel Time Series Graph and Linear Projection

You should produce a time series graph plotting the trends in a variable measured at least five times in the General Social Survey 1972-2006 data file.  This may be the same dependent variable you used for your Multiple Bivariate Assignment, but you may choose a different variable if you prefer.   This very large data set is described in an article from NY Times.  To access the data we can use the Survey Data and Analysis program at the University of California, Berkeley.  Just click on SDA

  You may use any variable that was asked in at least five years.  These include (VARIABLE NAMES IN CAPITALS):
 premarital sex  -  PREMARSX            spanking children - SPANKING             homosexuality wrong - HOMOSEX
trust in others -  TRUST                    happiness - HAPPY                                      x-rated movies - XMOVIE
frequency of prayer - PRAY             gun permits - GUNLAW                        political party affiliation - PARTYID (just plot one party)
marijuana legalization - GRASS        happiness of marriage - HAPMAR          would vote for black president - RACPRES
exciting life  - LIFE                          newspaper readership - NEWS               would vote for woman president - FEPRES
fear at night  -  FEAR                       homosexual teachers - COLHOMO     subjective class identification -  CLASS
the afterlife  - POSTLIFE ,            homosexual speakers - SPKHOMO
confidence in institutions (just select one institution) - CONBIZ, CONCONG, CONGOVT, CONCHURCH, CONSCHLS,  CONBUS, CONCLERG, CONEDUC, CONLAB

We are only going to use one value for the variable to make a chart, e.g., the percent agreeing with the variable.  This means that your variable may need to be recoded.  It is often best to just have two values.  To recode, put the cursor over "create variables" then click on "recode variables". click on RECODING RULES (See explanation and examples) for an explanation of how to recode your variable.  The recoded variable will be stored with a new variable name which you can use to do the assignment.

The next step is to crosstabulate your variable with Year.  To do this, put the cursor over Analysis and click on "Frequencies or Crosstabulation.  Use Year as the row variable and the name of your variable (SEE SOME VARIABLE NAMES ABOVE IN CAPITAL LETTERS) as the column variable.  Click on the box for row percents.  These percents can then be typed into Excel to compute the regression line and make the graph.

Here is a summary of the steps involve in computing the regression line in EXCEL.  It is much easier to learn this by watching me do it in class than by just following instructions, unless you are already good at figuring out Excel features.

1.  Put the years in the first column.
2.  Put the data in the second column
3.  Open "tools/data analysis/regression"  (if you don't have "data analysis" go to "odd ins" on the "tools" menu and add in the "analysis tool pack"
4.  Use the regression tool to computer the regression equation with year as the X variable and your selected crime variable as Y.  This is tricky without having someone show it to you.  Have the regression data printed somewhere on your sheet
5.  Use the "intercept" and X Variable coefficients to put a regression equation in column c, row 2.  To see how to do this, look at the example in the propertycrme.xls file.  The equation must begin with a + sign.
6.  Drag the equation all the way down to the box equivalent to the year 2015.
7.  Use the graph wizard (at insert/chart) to create an XY Scatter graphbser plotting both the observed data and the predicted data projected to the year 2010.

For an example of a similar assignment, using a different data set, check the Property Crimes Excel file http://crab.rutgers.edu/~goertzel/propertycrime.xls

Here are some pointers on using Excel, based on problems that have come up in class.   For a general overview on making graphs with Excel go to:  http://qrc.depaul.edu/StudyGuide/MakingGraphsWithExcel.htm

1.  The "Data Analysis Tool Pak" must be installed on the computer you are using.  If you have Excel at home or at work, it may not have been installed.  You have to install it from the CD-ROM using the tools/add-ins command.  If you can't find it on the computers at Rutgers, first check the hint in item 2, then see if you can install it.  If that doesn't work, get help from computer center staff.

2.  If you still can't find the "Data Analysis" line on the "Tools" menu, it may be because you have the cursor on one of the graphs you have created.  The cursor has to be on the spreadsheet for the Data Analysis line to come up.

3.  When using the "regression" data analysis box:  first put the cursor in the y box, then highlight the cells for the dependent variable, then move the curson to the x box, and only then highlight the x data (the years), then move to the box for "output range" and then highlight the cell you wish the output to begin in.  It doesn't work if you highlight the data first, then click on the box you wish to use.

4.  When entering a formula to do your projections, make sure you click on the "=" sign before the box that shows the contents of the cell.  A formula must begin with an "=".

5.  When you do the regression make sure "constant is zero" box  has not been checked.

6.  If the spreadsheet has numbers instead of letters for the column labels, go to tools/options/ general and disable the "R1C1 reference style."

Here are some pointers if you need help in preparing the regression equation:

The best thing is to open the sample file I posted at:   http://crab.rutgers.edu/~goertzel/propertycrime.xls
and use it as a model.

Put the cursor on cell C3 where you see the number 563.3
Then look in the box just above, to the right of the =, which shows what is really in the box.  You will see the formula:

=-13.3815* A3 + 26977.99

A formula has to begin with an = sign, which tells Excel it is a formula.

In this example the "*" means "multiply" and the "+" means add,

  -13.3815 is the regression coefficient, which you find in cell B57

                          A3 is the address of the cell where the first date is, in this example 1973

                           26977.99 is the Intercept, also found under the word "Coefficients" in cell B56

    If you did the computations correctly, the regression line will fit the observed points.  Your finished excel file will look like the one in http://crab.rutgers.edu/~goertzel/propertycrime.xls

The next step is to copy the graph into a Word File.  First give the World File a title.  Then paste in the file.  Below the file write a short paragraph describing the trends.  Cite a few values of your dependent variable, including the first value and the projected value for 2010.  A sample of the finished Word File is here

The final step is to upload both the Excel file and the Word file to EXCEL as file attachments.