Lab 0: Getting to Know SPSS and The SPSS Help Manual

Click the links below for SPSS Help

I. Basic Operations
  Starting SPSS
  SPSS Layout
  Open existing file
  Open new file
  Save data
  Save results

II. Data Operations
  Insert a variable
  Enter data
  Insert a record
  Sort data
  Create calculated variable
  Create charts
  Edit charts

III. Statistical Tools
  Descriptive statistics
  Random Sampling
  Non-parametric tests
  Parametric tests
  Correlation
  Regression

If you have detailed questions about any operation or test, check the HELP function in SPSS. The Help Manual assumes that you know how to work in a Windows environment. If you need assistance, please ask your TA.

Go to Lab 0: SPSS Practice Worksheet
 



 

I. BASIC OPERATIONS

Starting SPSS

  1. Go to the Start button (lower left corner of screen) to open the Start menu.
  2. Move your mouse up to Programs to expand the list of software.
  3. Navigate to the SPSS icon and click to start.
     

SPSS Layout

SPSS has three main features:
  1. Data Editor: displays your dataset. You can add, sort, remove variables or records, and compute new variables. The Data Editor has two tabs within the main window (at the bottom of the screen):
      Data View shows your dataset,
      Variable View shows your variable names and definitions.

    In a dataset, each column is a variable; each row is a record (or observation or case).

                Variable 1   Variable 2   . . .   Variable X
    Record 1    data         data         data    data
    Record 2    data         data         data    data
    . . .       . . .        . . .        . . .   . . .
    Record X    data         data         data    data

     
  2. Output: displays the results of tests or charts. You can add/delete text or numbers and edit charts. The Output window has a navigation bar at the side (similar to Windows Explorer) so you can move around within your results. You can minimize, expand or delete test results.
     
  3. Main Menu: The SPSS functions are accessed from the main menu at the top of the SPSS window. Click on the menu category and scroll down to the item using your mouse.
     

Open an existing data file

  1. Under File, select Open Data.
  2. Navigate through the directories to the data file.
  3. Click OK.
     

Open a new file

  1. Under File, select New Data; a blank Data Editor window will open.
     
    Note: You can only have one dataset open at a time. If you open another dataset or a new data window, SPSS will ask you to save your old data before closing the file.
     

Saving data

  1. From the Data Editor, click Save and navigate to the appropriate location (e.g. your disk).
  2. To work on the data in another application, click Save As and specify dBase IV (.dbf) as the file type. You can import .dbf files into MS Excel.
     

Saving results

  1. From the Output window, click Save and navigate to the appropriate location. Your file will be saved as SPSS output (.spo) which requires SPSS to run.
  2. You can also copy and paste your results and charts into MS Word for later editing.
     

 

II. DATA OPERATIONS

Insert a variable

  1. In the Data Editor, under Data, select Insert Variable. A generic variable name (VAR00001) will appear at the top of a column.
  2. Put your cursor in the first cell of this column.
  3. Click the Variable View tab to define the variable.
  4. Specify:
    • Name: use 8 characters or fewer
    • Type: numeric, date, string (text)
    • Width: maximum number of characters in this variable (default is usually 8)
    • Decimal places: maximum decimal places required (the default is 2 decimal places).

Enter data

  1. Use Enter to move down the column.
  2. Use Tab to move along the row.
     

Insert a record

  1. Put your cursor on the row below where you want the new case.
  2. Under Data, select Insert Case.
     

Sort data

  1. Under Data, select Sort.
  2. Choose the variable and sort order:
      Ascending: A to Z or low to high
      Descending: Z to A or high to low
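
The two sort orders can be illustrated outside SPSS with a short Python sketch. This is not SPSS syntax; the records and the variable names (name, score) are hypothetical and only show what ascending and descending order do to a dataset.

```python
# Hypothetical dataset: each dict is one record (row), each key is a variable (column).
records = [
    {"name": "Cho", "score": 72},
    {"name": "Ali", "score": 91},
    {"name": "Bea", "score": 85},
]

# Ascending: low to high on the chosen variable.
ascending = sorted(records, key=lambda r: r["score"])

# Descending: high to low on the same variable.
descending = sorted(records, key=lambda r: r["score"], reverse=True)

# Text variables sort A to Z in ascending order.
names_a_to_z = sorted(r["name"] for r in records)

print([r["score"] for r in ascending])
print([r["score"] for r in descending])
print(names_a_to_z)
```

Note that whole records move together when sorted, so each score stays attached to its name.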
     

Create a calculated variable

  1. Insert a new variable
  2. Specify the appropriate name and format for the new variable.
  3. Under Transform, select Compute.
  4. Type the name of the new variable in the top left box.
  5. In the large box, create your formula. You can move variable names from the list at the lower left and use function buttons or function list under the formula box.
  6. Click OK.
  7. SPSS will ask whether to overwrite the existing variable (the one you just inserted); click OK.
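
As a rough illustration of what Compute does, the Python sketch below applies one formula to every record to fill a new variable. It is not SPSS syntax; the variable names (population, area_km2, density) and the values are hypothetical.

```python
# Hypothetical dataset: two existing variables per record.
records = [
    {"population": 1200, "area_km2": 4.0},
    {"population": 300, "area_km2": 1.5},
]

# The formula is evaluated once per record, and the result is stored
# in the new variable for that same record.
for record in records:
    record["density"] = record["population"] / record["area_km2"]

for record in records:
    print(record)
```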
     

Create charts

SPSS has a range of graph and chart options. Your chart will be displayed in the Output window.

Bar charts

  1. Choose the appropriate bar chart type (simple, clustered or stacked).
  2. Specify that data in chart are Summaries for groups of cases.
  3. Specify what the bars represent (number of cases, cumulative cases, other functions, etc...).
  4. Under Category Axis, select the variable you want to graph.

Pie charts

  1. Specify that data in chart are Summaries for groups of cases and click Define.
  2. Specify what the slices represent (number of cases, cumulative cases, other functions, etc...).
  3. Under Define Slices by, select the variable you want to graph.

Scatter plots

  1. Choose Simple scatterplot.
  2. Move your variables to the X and Y axis.
  3. Under Label Cases, select the variable you want to label cases by on the graph. See the steps under Edit charts for more instructions.

Histogram

  1. Select your variable.
  2. Click check box (lower left) to display the normal curve over the histogram.
     
    Note: you cannot specify the number of intervals in the histogram.

 

Edit charts

From the Output window, double click on your chart to enter the Chart Edit mode. Some important functions are listed below. Some of the options vary depending on the chart type.

Format axis - for changing the value range or increments on each axis

  1. Under Chart, select Axis.
  2. Select the appropriate axis (X or category, Y or scale).
  3. Specify the data range to be shown if necessary.
  4. Specify the division markers and increments.

Add data labels - for labelling points on scatterplots

  1. Under Chart, select Options.
  2. For scatter plots, click Case Labels ON to label each point by the variable specified for scatter plots.

Add reference lines - for adding reference lines on scatterplots, residual plots or bar charts

  1. Under Chart, select Reference Lines.
  2. Select the appropriate axis (X or Y).
  3. Type -2 in the upper box and click ADD to move the number down into the second box.
  4. Type 0 and click ADD again.
  5. Repeat as needed.

To return to the Output window, click the close button in the top right corner of the Chart Editor.

 

III. USING THE STATISTICS TOOLS

SPSS can perform many different statistical procedures. The test results are displayed in the Output window.

Descriptive Statistics

This function calculates the standard set of descriptive statistics: mean (x̄), standard deviation (s), min, max and n.

  1. Under Analyze, select Descriptive Statistics, then select Descriptives or Frequencies. Click OK.
  2. In Descriptives, you can specify other descriptive statistics using the Options button (lower right corner of dialogue box).
  3. In Frequencies, you can specify statistics using the Statistics button.
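
To check the SPSS output by hand, the same statistics can be computed with a short Python sketch using only the standard library. The data values are hypothetical. Note that s here is the sample standard deviation (dividing by n - 1).

```python
import statistics

# Hypothetical values for one variable.
values = [4, 8, 6, 5, 7]

n = len(values)
mean = statistics.mean(values)
s = statistics.stdev(values)  # sample standard deviation (n - 1 in the denominator)
minimum = min(values)
maximum = max(values)

print("n =", n)
print("mean =", mean)
print("s =", round(s, 3))
print("min =", minimum, "max =", maximum)
```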
     

Conduct random sampling

This function allows you to randomly select observations from your dataset. In Lab 3, Question 2d asks you to select 3 different random samples of life expectancy data (where n = 5, 20 and 100). Follow steps 1 to 6 for each sample. Once the three samples are chosen, go to step 7.

For each sample:

  1. Under Data, select Select Cases.
  2. Choose Random Sample of Cases and click the Sample button.
  3. Specify the sample size as follows: "Exactly {your desired sample size} cases from the first {the total number of observations} cases."
  4. Click the check box below to ensure that unselected cases are Filtered (not Deleted) and click OK.
  5. A new variable (Filter) will be added to the Data Editor, where 1 = case is selected in sample, 0 = case is not selected.
  6. Highlight the filter column and copy/paste it into a new column. Rename this column (e.g. Fil_5 for "filter to select 5 observations").
     
    Note: If you do not copy and paste the filter into a new column, SPSS will overwrite the information when you choose your next sample.

Create the next two samples using the same procedures (steps 1 to 6). You should have three filter variables labelled Fil_5, Fil_20 and Fil_100.
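
The logic of the three filter variables can be sketched in Python as follows. This is only an illustration of the idea, not what SPSS does internally: the total number of cases (150) and the random seed are assumptions, and make_filter is a hypothetical helper.

```python
import random


def make_filter(total_cases, sample_size, rng):
    """Return a 0/1 filter: 1 = case is selected in the sample, 0 = not selected."""
    chosen = set(rng.sample(range(total_cases), sample_size))
    return [1 if case in chosen else 0 for case in range(total_cases)]


rng = random.Random(0)   # fixed seed so the sketch is repeatable (assumption)
total_cases = 150        # hypothetical number of observations in the dataset

# One filter per sample size, each kept under its own name so that
# drawing the next sample does not overwrite the previous one.
filters = {
    "Fil_%d" % size: make_filter(total_cases, size, rng)
    for size in (5, 20, 100)
}

for name, flags in filters.items():
    print(name, "selects", sum(flags), "of", len(flags), "cases")
```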
 

Non-parametric Tests (Goodness of Fit)

The non-parametric sub-menu lists several tests including:

  1. one sample chi-square
  2. one sample KS
  3. two sample chi-square

a. One sample Chi-square

  1. Under Statistics > Non-Parametric, select Chi-square.
  2. Move your variable into the Test Variable box.
  3. Under Expected Range, keep the default selection Get from Data.
  4. Under Expected Values, click All categories equal.
  5. Click OK.
  6. In the Output window, you will see two tables. The first table contains the observed and expected frequencies used in the Chi-square test. The Residual column is the simple difference between observed and expected (Obs - Exp), not (Obs - Exp)^2/Exp. The second table gives you the calculated chi-square (χ²*) and degrees of freedom (df).
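
The relationship between the two output tables can be sketched in Python. The observed frequencies below are hypothetical; the expected frequencies follow the All categories equal option, and the residuals are the plain differences that SPSS reports.

```python
# Hypothetical observed frequencies for a variable with four categories.
observed = [18, 22, 20, 40]

k = len(observed)
total = sum(observed)

# "All categories equal": every category gets the same expected frequency.
expected = [total / k] * k

# Residual as shown by SPSS: observed minus expected.
residuals = [o - e for o, e in zip(observed, expected)]

# Calculated chi-square: sum of (Obs - Exp)^2 / Exp over the categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = k - 1

print("expected:", expected)
print("residuals:", residuals)
print("chi-square =", round(chi_square, 2), "df =", df)
```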

b. One sample KS test

  1. Under Statistics > Non-Parametric, select 1 sample KS.
  2. Move your variable(s) into the Test Variable box. (You can move multiple variables into the Test Variable box. SPSS will run a separate test for each variable.)
  3. Under Test Distribution, click on the appropriate expected frequency distribution (normal, uniform, Poisson or exponential).
  4. Click OK.
  5. In the Output window, the results of the KS test are shown in one table. In the hypothesis test, use the absolute difference under Most Extreme Differences for D*, not the KS Z value. The p-value is listed as Asymp Sig. for each variable. p-value is
     

c. Two or more sample Chi-square

  1. Under Statistics > Descriptive Statistics, select Crosstabs.
  2. Move your variables of interest (govt, laws, manage and title) into the Row box.
  3. Move the variable that distinguishes your samples (sex) into the Column box.
  4. Click on the Statistics button; select Chi-square (located in the top left corner) and click Continue.
  5. Click on the Cells button; under Counts, select Expected and Observed and click Continue.
  6. Click OK to run the test.
  7. In the Output window, you will see two tables for each variable you tested.
    • First table: This is a summary of the observed and expected frequencies calculated by SPSS. Notice that one variable may have 4 categories of answers, whereas another may have 3 categories of answers. Use the appropriate number of categories when calculating the number of degrees of freedom.
    • Second table: This table contains the calculated χ²*. It is called Pearson's chi-square and is listed in the column Value. The p-value is listed in the column Asymp. Sig.
    • If you forgot to specify 'chi-square' under the Statistics button, you will have no test results.
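
The expected frequencies and Pearson's chi-square in the crosstab output can be reproduced with the Python sketch below. The counts are hypothetical (three answer categories in the rows, two samples in the columns); the degrees of freedom depend on the number of categories, as noted above.

```python
# Hypothetical crosstab: rows = answer categories, columns = the two samples.
observed = [
    [10, 20],
    [30, 20],
    [20, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell = row total x column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson's chi-square: sum of (Obs - Exp)^2 / Exp over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom = (rows - 1) x (columns - 1).
df = (len(row_totals) - 1) * (len(col_totals) - 1)

print("expected:", expected)
print("chi-square =", round(chi_square, 3), "df =", df)
```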

     

Parametric Tests (Difference of Means)

The compare means sub-menu lists several tests including:

  1. one sample t-test
  2. independent sample t-test
  3. paired sample t-test

a. One sample t-test

  1. Under Statistics > Compare Means, select One sample t-test.
  2. Select variable for test in right hand dialogue box.
  3. Move to Test Variable box using arrow.
  4. Enter the Test Value (known or hypothesized population value).
  5. Click OK
  6. In the Output window, you will see two tables. The first table contains descriptive statistics for the variable. The second table contains the test statistic in the column 't'. The degrees of freedom for the test are listed under the column 'df'.
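
The 't' and 'df' columns can be checked by hand. The Python sketch below uses hypothetical data and a hypothetical Test Value of 5, and computes t = (mean - test value) / (s / sqrt(n)) with n - 1 degrees of freedom.

```python
import math
import statistics

# Hypothetical sample and hypothesized population value (the Test Value).
values = [4, 8, 6, 5, 7]
test_value = 5

n = len(values)
mean = statistics.mean(values)
s = statistics.stdev(values)          # sample standard deviation
standard_error = s / math.sqrt(n)

t_statistic = (mean - test_value) / standard_error
df = n - 1

print("t =", round(t_statistic, 3), "df =", df)
```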

b. Independent sample t-test

  1. Under Statistics > Compare Means, select Independent Sample t-test.
  2. Move the variable of interest into the Test Variable box.
  3. To separate the two samples within the test variable, move the variable with the grouping criteria into the Grouping box (see example below)
  4. Click Define Groups and fill in the text that differentiates your samples.
     
    Example: For the Squid dataset:

    Province   Damage
    CHA        220
    CHA        353
    CHA        415
    RAY        279
    RAY        337
    RAY        380

  5. To conduct the two-sample test on the variable Damage:

    • move Damage into the Test Variable box
    • move Province into the Grouping box
    • click Define Groups and type CHA in Group 1 and RAY in Group 2 to differentiate your groups
    • SPSS will separate the two groups and treat them as separate samples
       
  6. In the Output window, you will see two tables. The first table contains the descriptive statistics for each group. The second table contains the two t* statistics in column 't'.
    • DO NOT USE Levene's F*. Use the variance ratio test outlined in the lab manual.
    • If the variance ratio (F) test indicates that the variances are equal, use the top t*.
    • If the variances are not equal, use the lower t* for the decision rule.
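
The Python sketch below works through the Squid example values by hand. It assumes the variance ratio is computed as the larger sample variance divided by the smaller one; check the lab manual for the exact form it uses. The two t values correspond to the pooled-variance (equal variances) and separate-variance (unequal variances) versions of the test.

```python
import math
import statistics

# Damage values from the Squid example, split by Province.
cha = [220, 353, 415]
ray = [279, 337, 380]

n1, n2 = len(cha), len(ray)
mean1, mean2 = statistics.mean(cha), statistics.mean(ray)
var1, var2 = statistics.variance(cha), statistics.variance(ray)  # sample variances

# Variance ratio (assumed form: larger variance over smaller variance).
variance_ratio = max(var1, var2) / min(var1, var2)

# Equal variances assumed: pooled variance, df = n1 + n2 - 2.
pooled_variance = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
t_equal = (mean1 - mean2) / math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
df_equal = n1 + n2 - 2

# Equal variances not assumed: each group keeps its own variance.
t_unequal = (mean1 - mean2) / math.sqrt(var1 / n1 + var2 / n2)

print("variance ratio =", round(variance_ratio, 3))
print("t (equal variances) =", round(t_equal, 4), "df =", df_equal)
print("t (unequal variances) =", round(t_unequal, 4))
```

Because both groups have the same number of observations here, the two t values coincide; they differ when the group sizes differ.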

c. Paired sample t-test

  1. Under Statistics > Compare Means, select Paired t-test.
  2. Click on the first variable of your pair; it appears in the Current Selections box as Variable 1.
  3. Click on the second variable; it appears in the Current Selections box as Variable 2.
  4. Click the arrow to move these variables into the Paired Variable box.
  5. Click OK.
  6. In the Output window, you will see three tables. The first table contains the descriptive statistics for each variable. The second table contains correlations. The third table contains the mean and standard deviation of the differences and the t* statistic in column 't'. The degrees of freedom are listed under column 'df'.
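
The third table can be checked by hand: the paired test is a one-sample t-test on the differences within each pair. The Python sketch below uses hypothetical paired values.

```python
import math
import statistics

# Hypothetical paired measurements (same cases measured twice).
first = [10, 12, 9, 14]
second = [12, 15, 9, 16]

# Difference within each pair.
differences = [b - a for a, b in zip(first, second)]

n = len(differences)
mean_difference = statistics.mean(differences)
sd_difference = statistics.stdev(differences)   # sample standard deviation

t_statistic = mean_difference / (sd_difference / math.sqrt(n))
df = n - 1

print("mean difference =", mean_difference)
print("sd of differences =", round(sd_difference, 3))
print("t =", round(t_statistic, 3), "df =", df)
```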
     

Correlation

The Correlate sub-menu has two correlation functions:

  1. bivariate correlation
  2. partial correlation

a. Bivariate correlation (Pearson's r and Spearman's rank correlation)

  1. Under Statistics > Correlate, select Bivariate.
  2. Move the variables of interest into the Variables box (you can move multiple variables).
  3. Select the test type: Pearson's r or Spearman's rank correlation.
  4. Under Test of Significance, select the direction of the test (one tailed or two tailed). SPSS will calculate the appropriate p-values.
  5. Select Flag significant correlations
  6. Click OK
  7. In the Output window, you will see one table (matrix) containing the correlation coefficients, number of observations and p-values for the selected variables.

                               Variable A   Variable B   Variable C
    Variable A   Correlation   1.000        .854**       -.562**
                 p-value       --           .000         .016
                 n             80           80           80
    Variable B   Correlation   .854**       1.000        .254
                 p-value       .000         --           .302
                 n             80           80           80
    Variable C   Correlation   -.562**      .254         1.000
                 p-value       .016         .302         --
                 n             80           80           80
    Notes:
    1. SPSS will show a perfect r (1.000) for the correlation between the same variable (A and A, B and B, etc)
    2. SPSS flags (**) correlations that are 'significant' based on its p-value calculation. If the p-values are very small (p=0.000...), SPSS will indicate that the variables are significant at the 0.01 level (the flagged level changes from 0.05 to 0.01).
    3. The correlation matrix has a mirror image. The correlations in the top right corner are the same as the correlations in the bottom left.
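
The two coefficients can be sketched in Python as follows. The data are hypothetical, and the ranks helper is a simplification that assumes there are no tied values; Spearman's coefficient is then Pearson's r computed on the ranks.

```python
import math


def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank from 1 (lowest) to n (highest). Assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman(x, y):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical data for two variables.
x = [1, 2, 3, 4, 5]
y = [2, 3, 7, 5, 11]

print("Pearson's r =", round(pearson_r(x, y), 3))
print("Spearman's rank correlation =", round(spearman(x, y), 3))
print("r of a variable with itself =", pearson_r(x, x))
```

The last line mirrors Note 1: a variable correlated with itself gives a perfect r.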

b. Partial correlation

  1. Under Statistics > Correlate, select Partial.
  2. Move the variables of interest into the Variables box.
  3. Move the controlled variable into the Controlling for box.
  4. Under Test of Significance, select the direction of the test (one tailed or two tailed). SPSS will calculate the appropriate p-values.
  5. Click OK
  6. In the Output window, you will see a small correlation matrix. This matrix shows the coefficient and p-value for the correlation between the two variables of interest, once the influence of the third variable (your controlled variable) has been removed.
     
    Note: the degrees of freedom are n-3 for the partial correlation.
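
For one controlled variable, the partial correlation can be computed from the three bivariate correlations. The Python sketch below uses the standard first-order formula with hypothetical r values and a hypothetical n; partial_r is an illustrative helper, not an SPSS function.

```python
import math


def partial_r(r_ab, r_ac, r_bc):
    """Correlation between A and B with the influence of C removed."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


# Hypothetical bivariate correlations and sample size.
r_ab = 0.6   # variables of interest
r_ac = 0.5   # A with the controlled variable
r_bc = 0.4   # B with the controlled variable
n = 80

r_ab_given_c = partial_r(r_ab, r_ac, r_bc)
df = n - 3   # degrees of freedom for the partial correlation

print("partial r =", round(r_ab_given_c, 3), "df =", df)
```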
     

Linear Regression

  1. Under Statistics > Regression, select Linear.
  2. Before starting the analysis, you must identify your dependent and independent variables.
  3. Move the appropriate variables into the Dependent and Independent boxes.
  4. Click on the Statistics button. On the right, click Model Fit and Descriptives. Click Continue.
  5. Click on the Plots button. Select *ZRESID for Y window and DEPENDNT for X window. Click Continue.
  6. Click on the Save button. Under Residuals, click Standardized and click Continue.
  7. Click OK.
  8. In the Output window, you will see several tables. The important tables are listed below.
    1. Descriptive Statistics: Presents the mean, standard deviation and n for each variable.
       
    2. Correlations: Presents the Pearson's r correlations (and p-values) between variables.
       
    3. Model Summary: Presents r, r square and the standard error of the estimate (this number is used for confidence intervals).

      Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
      1       0.909   0.828      0.800               152.94

    4. ANOVA (Analysis of Variance): Presents the regression, residual and total sums of squares, the degrees of freedom and the calculated F value. The Sig. column shows the p-value for F*.

      Model          Sum of Squares   df   Mean Square    F        Sig.
      1 Regression   1459528.290      1    1459528.290    62.396   .000
        Residual     304089.443       13   23391.496
        Total        1763617.733      14

    5. Coefficients: Presents the regression coefficients: the intercept (the B value in the (Constant) row) and the slope (the B value in the row for the independent variable).

      Model          Unstandardized Coefficients   Standardized Coefficients   t        Sig.
                     B           Std. Error        Beta
      1 (Constant)   -1415.231   273.742                                       -5.170   .000
        RAINFALL     1.264       .160              .910                        7.899    .000
       
  9. In the Data window, you will see an extra column called ZRE_1. This column contains the standardized residuals (the difference between each observation and the line of best fit, expressed in standard deviation units).

  10. To prepare a residual plot if you haven't requested a plot of *ZRESID against DEPENDNT:

    • Create a scatterplot with the dependent variable along the X axis and the residuals along the Y axis (if you forgot to choose 'standardized residuals' under Save when you initiated the regression function, you will not have a variable called ZRE_1).
       
    • Add reference lines at 2, 0 and -2 to the residual plot. See Edit charts for information about adding reference lines to residual plots.
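
The numbers in the example Model Summary, ANOVA and Coefficients tables are linked by simple arithmetic, which the Python sketch below checks. The sums of squares, degrees of freedom and RAINFALL coefficients are copied from the example output above; the variable names are only for illustration. Small differences from the printed values come from rounding in the tables.

```python
import math

# Values copied from the example ANOVA table.
ss_regression = 1459528.290
ss_residual = 304089.443
ss_total = 1763617.733
df_regression = 1
df_residual = 13

# Mean square = sum of squares / degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F = regression mean square / residual mean square.
f_value = ms_regression / ms_residual

# R Square = regression sum of squares / total sum of squares.
r_square = ss_regression / ss_total

# Std. Error of the Estimate = square root of the residual mean square.
std_error_estimate = math.sqrt(ms_residual)

# t for the slope = B / Std. Error (RAINFALL row of the Coefficients table).
t_slope = 1.264 / 0.160

print("Mean square (residual) =", round(ms_residual, 3))
print("F =", round(f_value, 3))
print("R Square =", round(r_square, 3))
print("Std. Error of the Estimate =", round(std_error_estimate, 2))
print("t for the slope =", round(t_slope, 3))
```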