Averaging and Adding Variables with Missing Data in SPSS

by Karen Grace-Martin

SPSS has a nice little feature for adding and averaging variables with missing data that many people don’t know about.

It allows you to add or average variables, while specifying how many are allowed to be missing.

For example, a very common situation is a researcher needs to average the values of the 5 variables on a scale, each of which is measured on the same Likert scale.

There are two ways to do this in SPSS syntax.

COMPUTE Newvar = (X1 + X2 + X3 + X4 + X5)/5.

or

COMPUTE Newvar = MEAN(X1, X2, X3, X4, X5).

In the first method, if any of the variables is missing, Newvar will also be missing, because the sum cannot be computed when any of its terms is missing.

In the second method, if any of the variables is missing, it will still calculate the mean from the values that are observed. While this seems great at first, the researcher may wish to require a minimum number of the 5 variables to be observed before the mean is calculated. If only one or two variables are present, the mean may not be a reasonable estimate of the mean of all 5 variables.

SPSS has an option for dealing with this situation. Running it the following way will only calculate the mean if at least 4 of the 5 variables are observed. If fewer than 4 of the variables are observed, Newvar will be system missing.

COMPUTE Newvar = MEAN.4(X1, X2, X3, X4, X5).

You can specify any number of variables that need to be observed. The number after the period is the minimum count of observed values, not a proportion.
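For readers who want to check the rule outside SPSS, here is a minimal Python sketch of the logic described above. It is not SPSS syntax: the function name `mean_n`, the use of `None` to stand in for system-missing, and the example responses are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def mean_n(values: Sequence[Optional[float]], min_valid: int = 1) -> Optional[float]:
    """Average the observed values, or return None (standing in for
    system-missing) when fewer than min_valid values are observed."""
    observed = [v for v in values if v is not None]
    # Require at least one observed value so we never divide by zero.
    if len(observed) < max(min_valid, 1):
        return None
    return sum(observed) / len(observed)


# Hypothetical answers to five Likert items (None = missing).
four_observed = [4, 5, None, 3, 4]
three_observed = [4, None, None, 3, 4]

print(mean_n(four_observed, min_valid=4))   # mean of the four observed items
print(mean_n(three_observed, min_valid=4))  # too few observed, so None
```

With `min_valid=4` the first respondent gets a scale score and the second does not, which mirrors the behaviour described for MEAN.4.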

(The same distinction holds for the SUM function in SPSS, but the range of the sum changes depending on how many variables are observed. A better approach is to calculate the mean, then multiply by 5.)
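The scale problem with sums can be seen in a small Python sketch. This is an illustration only, not SPSS syntax; the helper names `sum_n` and `rescaled_mean_n` and the example responses are hypothetical.

```python
def observed(values):
    """Return only the non-missing values (None marks a missing value)."""
    return [v for v in values if v is not None]


def sum_n(values, min_valid):
    """Sum of observed values, or None if fewer than min_valid are observed."""
    obs = observed(values)
    return sum(obs) if len(obs) >= min_valid else None


def rescaled_mean_n(values, min_valid):
    """Mean of observed values multiplied by the number of items,
    so the score stays on the scale of a complete sum."""
    obs = observed(values)
    if not obs or len(obs) < min_valid:
        return None
    return sum(obs) / len(obs) * len(values)


complete = [4, 4, 4, 4, 4]
one_missing = [4, 4, 4, 4, None]

print(sum_n(complete, 4), sum_n(one_missing, 4))  # the sums differ
print(rescaled_mean_n(one_missing, 4))            # back on the 5-item scale
```

Both respondents answered 4 on every item they saw, yet the plain sum scores them differently; the rescaled mean does not.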

This works the same way in the syntax or in the Transform–>Compute menu dialog.

First Published  12/1/2016; Updated  7/20/21 to give more detail.

Reader Interactions

February 23, 2023 at 10:30 am

dear Karen it seems the MEAN.2 (or whatever number you add) command is no longer working in the syntax file of SPSS v.28?

October 9, 2018 at 4:52 pm

Good day, Miss Karen,

I had a question. I understand this method is available in SPSS, and it is very useful indeed. As someone above has already asked, I wanted to determine the criteria to use as my cutoff mark for missing data. I am looking deeper into how I can determine whether my data is Missing Completely At Random (MCAR), Missing At Random (MAR), etc. In addition I am working on the factor analysis to determine how many items can be loaded together, etc.

Schafer (1999) asserted that a missing rate of 5% or less is inconsequential. Bennett (2001) maintained that statistical analysis is likely to be biased when more than 10% of data are missing. Therefore, I was thinking of using 10% as my cutoff. But is there an evidence-based/peer-reviewed method to determine the cutoff? What is this SPSS-permitted method even called? I am so tempted to use it, but afraid that without adequate backing, I will get slated when I submit for publication (I am aiming for a top journal). Thank you so much in advance!

September 20, 2016 at 4:46 am

Hi, I have a data set based on a Likert scale (1-5) and I have some missing data. I want to impute the missing data; however, I get averaged numbers (e.g., 2.76). The imputed values should be on the Likert scale, so what should I do? Thank you in advance. Farahnaz

December 23, 2016 at 1:33 am

hi farahnaz

You could set the missing values to 3, the middle of the Likert scale. That means the person doesn’t have any preference.

July 8, 2016 at 3:02 pm

Thank you SO much. I was afraid I would have to transform all of missing data into something else.

March 25, 2016 at 3:57 pm

Hi Karen, I was wondering if you could explain further the method you listed for allowing a maximum number of missing responses (i.e. Newvar=MEAN.4(X1,X2, X3, X4, X5)). I’m a little confused about how the .4 came about. If, for example, I had a measure with 10 items (with response options 1-5), how would I calculate this? Thanks for your help!

July 15, 2015 at 4:00 am

Old thread, but quick question to see if you know the answer. Is there any way for SPSS to only multiply variables if a given number of them are non-missing? There doesn’t seem to be a PRODUCT function (not PRODUCT.n).

This is crucial because I have (as I see others online have) recommended creating an interaction term for a regression by simply multiplying two variables. Only after, I noticed that SPSS will return a zero for the product of SYSMIS*0, which is awful! For the product of SYSMIS*1, it returns SYSMIS….

July 15, 2015 at 7:17 am

I don’t know of one–only sum.n and mean.n. And I think they’re solving the opposite problem.

But that’s a really good catch–that’s really a problem. The only thing I can think of is to add an IF statement after creating the interaction term, so that if either component X is SYSMIS, so is the interaction.

March 6, 2018 at 6:09 pm

You might consider the exp and log functions to do a sum where you need to have the product, eg.

a * b * c = exp( ln(a) + ln(b) + ln(c) ), so Product.1(a,b,c) can be done as EXP(SUM.1(LN(a), LN(b), LN(c))), which will give you the desired function. Of course .1 can be replaced by .2 or .3 to get, e.g., Product.3(a,b,c,d,e,f). (In SPSS the natural log function is LN, and this only works when all the values are positive.)

Hopefully this will help you further.

January 27, 2015 at 8:03 pm

I am trying to print a frequency report in SPSS for a group. I recoded the Score variable as recoded_score with 1=critical thinking group, 2=thinking group , 3=dumb group…etc. What I want to printout is the recoded groups so that I can tell how many people score in each group. The range of the Score variable is from 16 to 112. However, each time I print the frequency report from SPSS, I get the correct number of people in each group BUT the means, Std. Dev., Error Means are all wrong. How do I correct this to report the right means without using the Score variable? Any help is appreciated…

November 6, 2014 at 11:31 am

Hi, I’m trying to add up 9 variables to create a scale. So I compute new variable name = (V1+V2…V9). The sample size is 209, but at the end the new variable only has 21. Am I doing something wrong? Thanks, Sylvia

September 24, 2014 at 2:42 pm

I’m trying to compute the mean for a scale in which there are 28 questions but in two instances people are asked to respond to one question or another question but not both. For example, if their weight has increased they respond to question 13, but if their weight has decreased they would respond to question 14 and the same goes for two questions about appetite. So, the items should really be summed and divided by 26, but when I use the code above that allows for missing items, it automatically divides by 28. Is there a simple way to create the mean score, allow for missing items, and divide by 26 instead?

July 11, 2014 at 5:17 am

Hi, could you please help me with this? For my data set I replaced the missing data using the mean, but when I checked again nothing had changed in the data set! Is this right? I do not have much missing data, but I want to report the percentage to confirm that the missing data is less than 5%. So could you please explain the right way to calculate this percentage?

many thanks, g

May 28, 2014 at 6:37 am

Dear Karen,

Thanks a lot! This little explanation saved an enormous amount work for one of my PhD students.

Much obliged! Henry

November 13, 2013 at 7:21 am

I have used the above method (mean x1,x2) for creating a new variable based on six others all ranging from 1-7.

My question is when I have created the new variable the range of this should also be from 1-7. If not there’s a mistake in recoding somewhere.

But in my example I get a new variable ranging from 1.11-6.89 – is this simply the observed minimum and maximum value of the variable or is it a recoding mistake? I haven’t been able to identify any mistake and would like to know if the range of the variable SPSS shows is theoretical or empirical.

November 25, 2013 at 3:14 pm

Whenever you take an average, it’s unlikely to vary as much as the variables being averaged.

For example, the only way to get an average=1 is if someone answered 1 on all six original variables. There just may not be any of those in the data.

September 11, 2013 at 4:10 pm

So, I did a factor analysis and wanted to know how to proceed. For example, I had six variables that loaded on Factor 1. Now, I thought I just had to sum up the six variables to get values (basically ranging from 0-6) for Factor 1.

What if I had missing values for, say, 3 out of the 6 variables? What do I do? Please advise.

September 25, 2013 at 11:00 am

You have a number of options, and it’s hard to explain them all here. It might make sense to just use the three observed. It depends on how similar your loadings are.

June 24, 2013 at 5:41 am

You have saved me days of work and I love you a little bit for it!

July 1, 2013 at 1:15 pm

Aw, shucks! 🙂

June 5, 2013 at 8:42 am

Fantastic. Thank you very, very much for posting this. Extremely helpful.

April 1, 2013 at 8:38 am

This post revealed a great time saver for me. Previously, to compute an averaged index, I would have SPSS count the number of non-missing items, recode the observations that were below a certain cut off (e.g., missing on 3 or more out of 5), sum the items and divide by the count. The method you showed is much more efficient!

April 2, 2013 at 5:40 pm

March 2, 2013 at 7:42 pm

Regarding my previous question, I’d like to refer to your book in my dissertation.

March 4, 2013 at 11:06 am

March 2, 2013 at 7:40 pm

Hi Karen, so can we conclude that the averaged measures are easier to explain?

Not sure what you’re asking here…..

December 9, 2011 at 1:39 pm

Thanks. Actually, no. Just about the worst thing you can do for missing data is replace the missing values with the computed mean. I explain why in this post: https://www.theanalysisfactor.com/mean-imputation/ .

You only want to do what I explain above if the point is to calculate a mean for those items.

December 9, 2011 at 4:30 am

Your post is great. After the new variable is computed, could you please show me how to replace the missing values with the computed mean variable in SPSS?

Thanks a lot,

September 8, 2011 at 6:48 am

It would be great if somebody can help me with this. I need to replace missing values for escs and I want to replace the mean value for each school where students are grouped. How can I do it? Thanks a lot.

December 9, 2011 at 1:42 pm

To do that you would definitely want to use the EM algorithm to get the means. If you’re doing in SPSS you have to have the missing values analysis module.

The EM means are unbiased if you calculate them using a number of different variables.

However, you want to be careful here. This is only useful in this situation where you’re grouping. You may be better off with multiple imputation, depending on the percentage of missing information.

July 22, 2010 at 1:55 am

Thank you so much!! I never realised there were two ways of computing the mean in SPSS and that one doesn’t calculate values if there are any missing values. I keep using the mean (X1,X2) formula, so I keep getting values for people with missing values and have been fixing them up afterwards manually. I was searching for a way to fix it up using syntax and I saw this and it is really really helpful. Thank you.

You’re welcome. Glad it was helpful.

June 6, 2010 at 4:35 am

This was great! Just what I was looking for! Thank you. I was also wondering the criteria for deciding the number of variables that need to be observed

June 6, 2010 at 11:45 pm

It would depend on a number of things.

– The percentage of missing data (the higher the percentage, the more it affects results, so you have to be careful) – How similar the items are (if you run your five items on a factor analysis, it’s more reasonable to average three or four of the five if they all have similar loadings. If the loadings are wildly different, the five items don’t contribute equally to the scale). – The missing data mechanism (by averaging around a missing value, you’re assuming it’s missing completely at random, and that the other values on the scale are good estimates for it).

SPSS Tutorial #6: How to Code, Define, Analyse, and Deal with Missing Values in SPSS

It is rare to have a dataset that is complete hence it is important to know how to code, define and deal with missing data. This is a comprehensive post about missing values in SPSS.

Contents:

  • Reasons for missing data
  • Types of missing values in SPSS
  • How to code and define missing values in SPSS
  • Patterns of missing data
  • Missing completely at random (MCAR)
  • Missing at random (MAR)
  • Missing not at random (MNAR) / not missing at random (NMAR)
  • Missing value analysis in SPSS
  • Understanding the degree of missing values using descriptives
  • Univariate statistics output
  • Separate-variance t-tests output
  • Cross-tabulations for categorical variables
  • Understanding the patterns of missing values in SPSS
  • Confirming MCAR using Little’s MCAR test
  • How to deal with missing values in SPSS
  • Deleting missing values
  • List wise deletion
  • Pair wise deletion
  • Imputing missing values
  • Mean substitution
  • Imputation using regression
  • Imputation using the expectation-maximization (EM) technique

There are many reasons for having missing values/data in a dataset. These include:

Non-applicability of some questions: some questions may not be applicable to some respondents so they leave them blank.

Not knowing the response to some questions: some respondents may lack knowledge about some questions so they will leave them blank.

Refusing to answer: some respondents may know responses to some questions but refuse to answer. This can happen if the question is sensitive or makes the respondents uncomfortable.

Skip logics: in questionnaires with skip logics, there will be missing values if the skip logics require respondents to skip some questions.

SPSS has two types of missing data:

System missing data: these are generated automatically by SPSS. Blank cells in numeric variables are shown as a period (full stop).

User missing data: these are generated by the user/researcher. The researcher will be guided by the codebook used when developing the questionnaire. Common user-generated missing values include 98, 99, 999 etc.

When choosing values to denote missing values, it is important to select codes that are not possible in your dataset. This will vary from one variable to another.

As explained earlier, when choosing codes for missing values, one should select values that cannot occur as real responses in the dataset.

To demonstrate, see the examples below of two variables, gender and income level:

As shown in the table above, the options “prefer not to say” and “not applicable” have been coded 96.

The next step is to define the missing values in SPSS. If this is not done, SPSS will include the 96 values in all the analyses leading to biased results.

To define the missing values in SPSS:

  • Open SPSS and create your codebook.
  • Go to variable view and select gender.
  • Go to the column titled “missing” and click it.
  • From the open dialogue box, select the option “discrete missing values” and put in the code “96.” If you have three missing value codes, you can put them in the three boxes. If more than three, you can use the option “range plus one optional missing value.”

The code “96” will appear in the missing column for that variable.

Repeat the process for all the variables with missing value codes.
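To see why undefined missing-value codes bias results, consider this small Python sketch. It is an illustration of the idea rather than anything SPSS runs; the variable `income_level`, its values, and the set `MISSING_CODES` are hypothetical.

```python
# Hypothetical codebook: 96 means "prefer not to say" / "not applicable".
MISSING_CODES = {96}


def valid_values(raw, missing_codes=MISSING_CODES):
    """Drop blanks (None) and any value declared as a missing-value code."""
    return [v for v in raw if v is not None and v not in missing_codes]


# Hypothetical income-level responses coded 1-3, with two 96 codes.
income_level = [1, 2, 96, 3, 2, 96, 1]

naive_mean = sum(income_level) / len(income_level)  # treats 96 as a real answer
valid = valid_values(income_level)
clean_mean = sum(valid) / len(valid)                # excludes the 96 codes

print(round(naive_mean, 2), round(clean_mean, 2))
```

The naive mean lands far outside the 1 to 3 response range, which is the kind of distortion that defining the code as missing is meant to prevent.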

This is demonstrated using the gender variable as shown in the images below.

Once you have a dataset that you want to work with, it is important to analyse the missing values to understand the extent of the missing values and the patterns of the missing values.

There are three main patterns of missing data: missing completely at random (MCAR), missing at random (MAR) and missing not at random or not missing at random (MNAR/NMAR).

Data are said to be missing completely at random if all the observations have an equal likelihood of being missing.

The missingness is therefore independent of both the observed and unobserved data.

Example: if you are collecting data on students’ performance, but some of the students’ marks got lost, the missing data will be considered as MCAR.

MCAR is a strong assumption and is often unrealistic in practice.

The second pattern of missing data is missing at random (MAR).

Data are said to be missing at random if the likelihood of being missing differs systematically across groups that are defined by observed variables in the dataset.

The missingness of data in MAR is dependent on the observed data but not the unobserved data.

Example: if you are collecting data on income, and more males than females refuse to provide their income levels, then the missingness of data on income is dependent on gender (observed data), and this is considered to be MAR.

Missing not at random or not missing at random occurs when the missing data are dependent on unobserved data that the researcher cannot measure.

Hence the reasons for missing in this case are beyond the knowledge of the researcher.

Example: if you are collecting data on drug abuse in schools, the students who abuse drugs more are unlikely to respond to the survey. Hence the missingness in this case is related to the extent of drug abuse, which the researcher cannot directly observe.

Understanding the patterns of the missing data in a dataset is important because it will guide the researcher on the most appropriate strategies of dealing with the missing data.

Failing to deal with missing data appropriately can lead to biased results.

SPSS has a useful function called the Missing Value Analysis (MVA) that enables a user to analyse the extent and pattern of missing data in a dataset.

To use the MVA feature of SPSS, follow the procedure below:

  • Click on Analyze menu > Missing Value Analysis
  • In the dialogue box that opens, specify the quantitative variables and the categorical variables.
  • The univariate statistics will show the extent of the missing data for each of the listed variables.
  • The indicator variable refers to data present and data missing for each of the variables.
  • Under the indicator variable statistics, select the options “t-tests with groups formed by indicator variables” and “cross tabulations of categorical and indicator variables.”
  • Click Continue > OK.

The above procedure is demonstrated in the images below:

The procedure will result in three different outputs: univariate statistics, separate-variance t-tests, and cross-tabulations for categorical variables. These are shown below:

The output for univariate statistics looks like the image below:

The column N shows the total number of present values for each listed variable.

The column Missing has two parts: count and percent. The count column shows the number of missing values, while the percent column shows the percentage of the missing values for each variable listed.

A common rule of thumb is that 5 percent or less missing data is acceptable. In the table above, variables r10_m, r10_y and r13 exceed the 5% threshold and need to be analysed further using the separate-variance t-tests.
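The N, count and percent columns described above can be reproduced with a few lines of Python. This is a sketch of the arithmetic, not the MVA procedure itself; the function `missing_summary`, the variable names `age` and `income`, and their values are hypothetical.

```python
def missing_summary(columns, threshold_pct=5.0):
    """For each variable, report valid N, missing count, percent missing,
    and whether the percent exceeds the threshold (5% by default)."""
    summary = {}
    for name, values in columns.items():
        n_missing = sum(v is None for v in values)
        pct = 100.0 * n_missing / len(values)
        summary[name] = {
            "n": len(values) - n_missing,
            "missing": n_missing,
            "percent": pct,
            "flag": pct > threshold_pct,
        }
    return summary


# Hypothetical data: 20 cases per variable, None marks a missing value.
data = {
    "age": [30] * 19 + [None],          # 1 of 20 missing
    "income": [500] * 17 + [None] * 3,  # 3 of 20 missing
}

summary = missing_summary(data)
for name, stats in summary.items():
    print(name, stats)
```

Only variables whose percent missing is strictly above the threshold are flagged for further analysis in this sketch; whether exactly 5% should count is a judgment call.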

The second output gives the results of the t test conducted for all the quantitative variables using indicator variables with more than 5% missing values.

The output is shown in the image below:

The first column shows the indicator variables for all variables (present vs. missing) with more than 5% missing values (r10_m, r10_y and r13).

In addition to the actual number of missing vs. present values, the means are also displayed.

The first row displays all the quantitative variables.

The table shows how the missingness of one variable affects the values of other variables.

Example: when r10_m is present, the mean value for r08 is 47,125; but when r10_m is missing, the mean value for r08 is 166,317. This is a very big difference and shows that the missingness of r10_m is affecting the mean of r08. This is also the case for all the other quantitative variables.

On the other hand, the missingness of r10_y seems to affect the means of r08, r10_m, and r12.

Lastly, the missingness of r13 seems to affect the means of r08, r10_y, and r12.
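The comparison behind a separate-variance t-test can be sketched in Python as follows. The variable names r10_m and r08 are borrowed from the example above, but the values are made up for illustration, and the function `t_by_missingness` is a hypothetical helper, not part of SPSS. It computes the unequal-variance (Welch) t statistic only, without degrees of freedom or a p-value.

```python
from math import sqrt
from statistics import mean, variance


def t_by_missingness(indicator_var, outcome_var):
    """Split the outcome by whether the indicator variable is present or
    missing, then return both group means and the separate-variance t."""
    pairs = list(zip(indicator_var, outcome_var))
    present = [y for x, y in pairs if x is not None and y is not None]
    missing = [y for x, y in pairs if x is None and y is not None]
    mean_present, mean_missing = mean(present), mean(missing)
    # Standard error uses each group's own variance (no pooling).
    se = sqrt(variance(present) / len(present) + variance(missing) / len(missing))
    return mean_present, mean_missing, (mean_present - mean_missing) / se


# Hypothetical values; None marks a missing r10_m.
r10_m = [1, 2, None, 3, None, 2, None, 1]
r08 = [40, 50, 160, 45, 170, 55, 165, 50]

mean_present, mean_missing, t_stat = t_by_missingness(r10_m, r08)
print(mean_present, mean_missing, round(t_stat, 2))
```

A large absolute t value here signals that cases missing r10_m differ on r08 from cases where r10_m is present.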

The last output is the cross-tabulations for all the categorical variables that were listed.

The cross-tabulations are similar to what is displayed in the separate-variance t-tests tables, except that in this case the categorical variables are displayed instead of the quantitative variables.

The tables also display the frequencies for each indicator in each category in the categorical variables.

Example of cross-tabulation for variable r07 (main reason of obtaining loan) is shown below:

The tables help one to determine whether there are significant differences in missing values across the various categories of the categorical variable.

Example: missing values for r13 (collateral for credit) are high for the categories “subsistence needs” and “purchase of agricultural machinery” but low for all the other categories.

If the separate-variance t-tests and cross-tabulations show that the missingness of one variable is related to the values of other variables, this suggests that the data are missing at random rather than missing completely at random. If there are no such relationships, the data are consistent with being missing completely at random.

Once you have analysed the extent of the missing values and how they affect other variables, it is important to further analyse the patterns of the missing values.

To analyse the patterns of the missing values in SPSS, follow the procedure below:

  • Click Patterns
  • In the Patterns dialogue box, you can select various patterns tables. Select “tabulated cases, grouped by missing value patterns.” This will also select “sort variables by missing value pattern.”
  • If you select the option “cases with missing values, sorted by missing value patterns”, you will get a table that lists each case with a missing value and the pattern(s) of the missing values. If your dataset is large, you will get a very long table.

The procedure is demonstrated in the image below:

If you found some variables that seem to influence the data, you can include them in the “additional information for” box to get additional information about the variables.

The results from the above procedure are shown below:

The tabulated patterns table shows that:

  • 5765 cases have complete values for the variables listed.
  • 2156 cases have missing values for variable r10_y.
  • 906 cases have missing values for variable r10_m.
  • 360 cases have missing values for variables r13 and r10_y combined.
  • 277 cases have missing values for all variables except r01 and r02.

You can confirm whether the data are missing completely at random using a test called Little’s MCAR test.

This test is also found in the MVA function.

To conduct the test:

  • Click Analyze > Missing Value Analysis
  • Click on “EM” under the estimation section > click EM again from the list below
  • From the dialogue box that opens, select “normal”

The results will be displayed, as shown below:

The null hypothesis of the Little’s MCAR test is that the data are missing completely at random.

This is determined by comparing the significance value with an alpha value of 0.05. If the significance value is less than the alpha value, then the null hypothesis is rejected, and vice versa.

In the example above, the significance value is less than the alpha value of 0.05 so the null hypothesis of MCAR is rejected. The conclusion is that the data are not missing completely at random.

After identifying the extent of the missing values and their patterns, the next step is to decide how to deal with them so that the results of the data analysis are not biased.

There are two strategies for dealing with missing values: deletion and imputation.

One can choose to delete missing values if the affected cases are few in number and therefore the sample size will not be significantly reduced.

Additionally, if one or two variables are significantly affected by missing values, you can delete the variables from the dataset.

There are two types of deletion of missing values:

List wise deletion: this strategy deletes every case that has a missing value on any variable.

A case will be deleted even when it has only one missing value.

The analysis will therefore be based on a complete dataset.

This strategy is not recommended as it can significantly reduce the sample size.

Pair wise deletion: it is also known as “exclude cases analysis by analysis.”

It deletes cases only if the data missing are required for the analysis being conducted.

This means that the data used will depend on the analysis being conducted and will therefore vary from one analysis to another.
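The difference between list wise and pair wise deletion comes down to which variables a case must be complete on. Here is a small Python sketch of that idea; the data, the variable names x, y and z, and the helper `complete_cases` are hypothetical.

```python
def complete_cases(rows, variables):
    """Keep only the cases with no missing value (None) on the given variables."""
    return [r for r in rows if all(r[v] is not None for v in variables)]


# Hypothetical dataset with five cases and three variables.
rows = [
    {"x": 1, "y": 2, "z": 3},
    {"x": 2, "y": None, "z": 4},
    {"x": None, "y": 5, "z": 6},
    {"x": 4, "y": 6, "z": None},
    {"x": 5, "y": 7, "z": 8},
]

# List wise: a case must be complete on every variable in the dataset.
listwise = complete_cases(rows, ["x", "y", "z"])

# Pair wise: a case only needs the variables used in the current analysis.
pairwise_x = complete_cases(rows, ["x"])        # e.g. a one-sample test on x
pairwise_xy = complete_cases(rows, ["x", "y"])  # e.g. a correlation of x and y

print(len(listwise), len(pairwise_x), len(pairwise_xy))
```

The sample size under pair wise deletion changes from one analysis to the next, while list wise deletion always uses the same, smallest set of cases.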

Both deletion methods can be found in SPSS by clicking “options” when running any statistical analysis.

Example: a one sample t-test is conducted as shown below. The “exclude cases analysis by analysis” is selected as the deletion method.

The second strategy of dealing with missing values is imputation of the missing values.

There are three ways of imputing missing values:

Mean substitution: in this strategy, the mean of the variable is calculated and all the missing values in that variable are replaced by the mean.

To do this in SPSS:

Click Transform > replace missing values

Select the variable of interest. A new variable will be created with a new name.

The method to be used is “series mean.”

All the missing values will be replaced with the mean.
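What mean substitution does to a variable can be sketched in a few lines of Python. This illustrates the idea only; the function `series_mean_impute` and the income values are hypothetical.

```python
from statistics import mean, variance


def series_mean_impute(values):
    """Replace every missing value (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical variable with two missing values.
income = [10, 20, None, 30, None, 40]
observed_income = [v for v in income if v is not None]

imputed = series_mean_impute(income)
print(imputed)
print(variance(observed_income), variance(imputed))
```

The mean is unchanged, but the variance of the imputed variable is smaller than the variance of the observed values, because the filled-in cases all sit exactly at the mean.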

This procedure is demonstrated in the images below:

Imputation using regression: in this method, the missing values are replaced by the predicted value generated from multiple regression.

Specify the quantitative variables and select regression under the estimation options.

In the variables option, you can either specify the predicted and predictor variables, or you can use all the quantitative variables.

In the regression dialogue box, select residuals as the estimation adjustment.

Select the option “create a new dataset” and specify the name of the new dataset. This will ensure that the original dataset remains unchanged for future use.

This is demonstrated in the images below:

A new dataset will be created with complete data.
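The core of regression imputation can be sketched in Python with a single predictor. This is a simplified illustration under stated assumptions: it uses one predictor rather than multiple regression, it adds no residual adjustment to the predictions, and the names `fit_line`, `regression_impute`, x and y are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


def regression_impute(x, y):
    """Fit y on x using complete pairs, then fill each missing y
    (None) with its predicted value. x must be observed for those cases."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    intercept, slope = fit_line([a for a, _ in pairs], [b for _, b in pairs])
    return [
        intercept + slope * a if b is None and a is not None else b
        for a, b in zip(x, y)
    ]


# Hypothetical data: y is missing for the third case.
x = [1, 2, 3, 4, 5]
y = [2, 4, None, 8, 10]

imputed_y = regression_impute(x, y)
print(imputed_y)
```

Because every imputed value falls exactly on the fitted line, plain predictions understate the spread of the data, which is one reason an adjustment such as the residuals option mentioned above is offered.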

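Imputation using expectation-maximization (EM): in this method, the missing data are first imputed using regression analysis.

The parameters are then re-estimated by maximum likelihood from the completed data, and the two steps are repeated until the estimates converge.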

  • Click on “EM” under the estimation section > click EM again from the list below
  • Select the “save complete data” to create a new dataset and specify the name of the new dataset.

Of the 5 methods of dealing with missing data, the imputation using EM is the most recommended.

In conclusion, this post provides comprehensive information using illustrative images on how to define, analyse and deal with missing values in SPSS. The Missing Value Analysis feature in SPSS has also been discussed and demonstrated with practical examples.

Grace Njeri-Otieno

Grace Njeri-Otieno is a Kenyan, a wife, a mom, and currently a PhD student, among many other balls she juggles. She holds a Bachelors' and Masters' degrees in Economics and has more than 7 years' experience with an INGO. She was inspired to start this site so as to share the lessons learned throughout her PhD journey with other PhD students. Her vision for this site is "to become a go-to resource center for PhD students in all their spheres of learning."

Institute for Digital Research and Education

How can I see the number of missing values and patterns of missing values in my data file? | SPSS FAQ

Many data sets have missing values. However, having lots of missing values can be problematic, as most statistical procedures (e.g., regression) will do a casewise deletion of cases with missing values. This means that the procedure runs on only the cases with complete data, and that may be a fraction of the cases in the data set. Hence, finding out the number of missing values each variable has can be important. Let’s look at the following data set.

LANDVAL  IMPROVAL  TOTVAL  SALEPRIC  SALTOAPR
  30000     64831   94831    118500      1.25
  30000     50765   80765     93900         .
  46651     18573   65224         .      1.16
  45990     91402       .    184000      1.34
  42394         .   40575    168000      1.43
      .      3351   51102    169000      1.12
  63596      2182   65778         .      1.26
  56658     53806   10464    255000      1.21
  51428     72451       .         .      1.18
  93200         .    4321    422000      1.04
  76125     78172   54297    290000      1.14
      .     61934   16294    237000      1.10
  65376     34458       .    286500      1.43
  42400         .   57446         .         .
  40800     92606   33406    168000      1.26

1. Number of missing values versus number of non-missing values

The first thing to do is find out how many missing values each variable has.  We can use the frequencies command with the format=notable subcommand.

FREQUENCIES
  VARIABLES=landval improval totval salepric saltoapr
  /FORMAT=NOTABLE
  /ORDER= ANALYSIS .

Now we know the number of missing values in each variable: the variable salepric has four and saltoapr has two missing values. This will help us to identify variables that may have a large number of missing values, and perhaps we may want to exclude those from analysis.
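The per-variable counts can be cross-checked outside SPSS with a short Python sketch. The data are the 15 observations listed above, with a period marking a missing value; the names `RAW`, `NAMES`, `rows` and `missing_per_var` are chosen here for illustration.

```python
# The 15 observations from the data set above ("." marks a missing value).
RAW = """\
30000 64831 94831 118500 1.25
30000 50765 80765 93900 .
46651 18573 65224 . 1.16
45990 91402 . 184000 1.34
42394 . 40575 168000 1.43
. 3351 51102 169000 1.12
63596 2182 65778 . 1.26
56658 53806 10464 255000 1.21
51428 72451 . . 1.18
93200 . 4321 422000 1.04
76125 78172 54297 290000 1.14
. 61934 16294 237000 1.10
65376 34458 . 286500 1.43
42400 . 57446 . .
40800 92606 33406 168000 1.26
"""
NAMES = ["landval", "improval", "totval", "salepric", "saltoapr"]

# Parse each line into a row, turning "." into None.
rows = [
    [None if tok == "." else float(tok) for tok in line.split()]
    for line in RAW.strip().splitlines()
]

# Count missing values in each variable (column).
missing_per_var = {
    name: sum(row[i] is None for row in rows) for i, name in enumerate(NAMES)
}
print(missing_per_var)
```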

2. Number of missing values in each observation

We can also look at the distribution of missing values across observations. For example, we use the count command to create a new variable called cmiss, which counts the number of missing values in each observation. Looking at its frequency table, we know that there are four observations with no missing values, nine observations with one missing value, one observation with two missing values and one observation with three missing values.

COUNT cmiss = landval improval totval salepric saltoapr (MISSING).
FREQUENCIES
  VARIABLES=cmiss
  /ORDER= ANALYSIS .
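The same per-observation count can be sketched in Python as a cross-check. The data are the 15 observations from the data set above; the names `RAW`, `rows`, `cmiss` and `cmiss_freq` are chosen here for illustration, with `cmiss` mirroring the variable created by the count command.

```python
from collections import Counter

# The 15 observations from the data set above ("." marks a missing value).
RAW = """\
30000 64831 94831 118500 1.25
30000 50765 80765 93900 .
46651 18573 65224 . 1.16
45990 91402 . 184000 1.34
42394 . 40575 168000 1.43
. 3351 51102 169000 1.12
63596 2182 65778 . 1.26
56658 53806 10464 255000 1.21
51428 72451 . . 1.18
93200 . 4321 422000 1.04
76125 78172 54297 290000 1.14
. 61934 16294 237000 1.10
65376 34458 . 286500 1.43
42400 . 57446 . .
40800 92606 33406 168000 1.26
"""

rows = [line.split() for line in RAW.strip().splitlines()]

# Number of missing values in each observation, then its frequency table.
cmiss = [sum(tok == "." for tok in row) for row in rows]
cmiss_freq = Counter(cmiss)

for n_missing in sorted(cmiss_freq):
    print(n_missing, cmiss_freq[n_missing])
```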

3. Patterns of missing values

We can also look at the patterns of  missing values.  We can recode each variable into a dummy variable such that 1 is missing and 0 is non-missing.  Then we use the aggregate command to compute the frequency for each pattern of missing data. 

RECODE landval improval totval salepric saltoapr
  (MISSING=1) (ELSE=0)
  INTO land1 impr1 totv1 sale1 salt1 .
AGGREGATE
  /OUTFILE='AGGR.SAV'
  /BREAK=land1 impr1 totv1 sale1 salt1
  /N_BREAK=N.

File AGGR.SAV has the following variables and observations.

LAND1  IMPR1  TOTV1  SALE1  SALT1  N_BREAK
  .00    .00    .00    .00    .00        4
  .00    .00    .00    .00   1.00        1
  .00    .00    .00   1.00    .00        2
  .00    .00   1.00    .00    .00        2
  .00    .00   1.00   1.00    .00        1
  .00   1.00    .00    .00    .00        2
  .00   1.00    .00   1.00   1.00        1
 1.00    .00    .00    .00    .00        2

Now we see that there are four observations with no missing values, one observation with one missing value in variable saltoapr , two observations with missing value in variable salepric and one observation with  missing values in both variable totval and salepric , etc.
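The recode-then-aggregate idea can also be sketched in Python: turn each observation into a tuple of 0/1 missingness flags and count how often each tuple occurs. The data are the 15 observations from the data set above; the names `RAW`, `rows` and `patterns` are chosen here for illustration.

```python
from collections import Counter

# The 15 observations from the data set above ("." marks a missing value).
# Column order: landval, improval, totval, salepric, saltoapr.
RAW = """\
30000 64831 94831 118500 1.25
30000 50765 80765 93900 .
46651 18573 65224 . 1.16
45990 91402 . 184000 1.34
42394 . 40575 168000 1.43
. 3351 51102 169000 1.12
63596 2182 65778 . 1.26
56658 53806 10464 255000 1.21
51428 72451 . . 1.18
93200 . 4321 422000 1.04
76125 78172 54297 290000 1.14
. 61934 16294 237000 1.10
65376 34458 . 286500 1.43
42400 . 57446 . .
40800 92606 33406 168000 1.26
"""

rows = [line.split() for line in RAW.strip().splitlines()]

# 1 = missing, 0 = non-missing, one flag per variable (the recode step).
# Counting identical flag tuples plays the role of the aggregate step.
patterns = Counter(tuple(int(tok == ".") for tok in row) for row in rows)

for pattern in sorted(patterns):
    print(pattern, patterns[pattern])
```

Sorting the tuples lists the patterns in the same order as the AGGR.SAV listing.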

  • © 2021 UC REGENTS


Overview (MISSING VALUES command)

MISSING VALUES declares values user-missing. These values can then receive special treatment in data transformations, statistical calculations, and case selection. By default, user-missing values are treated the same as the system-missing values. System-missing values are automatically assigned by the program when no legal value can be produced, such as when an alphabetical character is encountered in the data for a numeric variable, or when an illegal calculation, such as division by 0, is requested in a data transformation.

Basic Specification

The basic specification is a single variable followed by the user-missing value or values in parentheses. Each specified value for the variable is treated as user-missing for any analysis.
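As a minimal sketch of the basic specification, assume a hypothetical numeric variable named opinion in which the code 9 stands for "no answer" (both the variable name and the code are illustrative, not taken from the reference):

```spss
* Declare 9 as user-missing for the hypothetical variable opinion.
MISSING VALUES opinion (9).
```

After this command runs, procedures treat 9 on opinion as missing rather than as a valid response.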

Syntax Rules

  • Each variable can have a maximum of three individual user-missing values. A space or comma must separate each value. For numeric variables, you can also specify a range of missing values. See the topic Specifying Ranges of Missing Values (MISSING VALUES command) for more information.
  • The missing-value specification must correspond to the variable type (numeric or string).
  • The same values can be declared missing for more than one variable by specifying a variable list followed by the values in parentheses. Variable lists must have either all numeric or all string variables.
  • Different values can be declared missing for different variables by specifying separate values for each variable. An optional slash can be used to separate specifications.
  • Missing values for string variables must be enclosed in single or double quotes. The value specifications must include any leading blanks. See the topic String Values in Command Specifications for more information.
  • For date format variables (for example, DATE , ADATE ), missing values expressed in date formats must be enclosed in single or double quotes, and values must be expressed in the same date format as the defined date format for the variable.
  • A variable list followed by an empty set of parentheses ( ) deletes any user-missing specifications for those variables.
  • The keyword ALL can be used to refer to all user-defined variables in the active dataset, provided the variables are either all numeric or all string. ALL can refer to both numeric and string variables if it is followed by an empty set of parentheses. This will delete all user-missing specifications in the active dataset.
  • More than one MISSING VALUES command can be specified per session.
  • Unlike most transformations, MISSING VALUES takes effect as soon as it is encountered. Special attention should be paid to its position among commands. See the topic Command Order for more information.
  • Missing-value specifications can be changed between procedures. New specifications replace previous ones. If a variable is mentioned more than once on one or more MISSING VALUES commands before a procedure, only the last specification is used.
  • Missing-value specifications are saved in IBM® SPSS® Statistics data files (see SAVE ) and portable files (see EXPORT ).
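The rules above can be sketched with a few short commands. All variable names (q1, q2, q3, income, age, region) and codes here are hypothetical and only illustrate the forms the rules describe:

```spss
* Up to three individual values, separated by spaces or commas.
MISSING VALUES q1 (97, 98, 99).
* A range of values plus one individual value (numeric variables only).
MISSING VALUES income (LO THRU 0, 999999).
* One value for a list of variables and another for a different variable.
MISSING VALUES q2 q3 (99) / age (0).
* String values are enclosed in quotes.
MISSING VALUES region ('NA').
* Empty parentheses delete the user-missing specification for q1.
MISSING VALUES q1 ().
* Delete all user-missing specifications in the active dataset.
MISSING VALUES ALL ().
```

Because each new specification replaces the previous one for a variable, the fifth command simply undoes the first.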

Limitations

Missing values for string variables cannot exceed 8 bytes. (There is no limit on the defined width of the string variable, but defined missing values cannot exceed 8 bytes.)


Defining Variables


Sample Data Files

Our tutorials reference a dataset called "sample" in many examples. If you'd like to download the sample dataset to work through the examples, choose one of the files below:

  • Data definitions (*.pdf)
  • Data - Comma delimited (*.csv)
  • Data - Tab delimited (*.txt)
  • Data - Excel format (*.xlsx)
  • Data - SAS format (*.sas7bdat)
  • Data - SPSS format (*.sav)
  • SPSS Syntax (*.sps) Syntax to add variable labels, value labels, set variable types, and compute several recoded variables used in later tutorials.
  • SAS Syntax (*.sas) Syntax to read the CSV-format sample data and set variable labels and formats/value labels.

Defining a variable includes giving it a name, specifying its type , the values the variable can take (e.g., 1, 2, 3), etc. Without this information, your data will be much harder to understand and use. Whenever you are working with data, it is important to make sure the variables in the data are defined so that you (and anyone else who works with the data) can tell exactly what was measured, and how.

There are three ways of defining information about variables:

  • The Variable View column attributes.
  • Syntax.
  • The Define Variable Properties window.

We explain the different attributes that variables in SPSS have and how to define them in the sections below. We conclude with an example that demonstrates why it is important to define your variables—and especially why it will make working with your data and performing analyses much more straightforward.

Defining Variables in the Variable View

You can define information about your variables by accessing the Variable View tab (at the bottom of the Data Editor window). The Variable View tab displays information about the variables in your data. You can get to the Variable View window in two ways:

  • In the Data Editor window, click the Variable View tab at the bottom.
  • In the Data Editor window, in the Data View tab, double-click a variable name at the top of the column. This method has the advantage of taking you to the specific variable you clicked.

The Variable View tab displays the following information, in columns, about each variable in your data:

Name: The name of the variable, which is used to refer to that variable in syntax. Variable names cannot contain spaces. Note that renaming a variable does not change the data; all values associated with the variable stay the same. For example, we may want to rename a variable called Sex to Gender.

To change a variable's name, double-click on the name of the variable that you wish to re-name. Type your new variable name.

Type: The type of variable (e.g., numeric, string, etc.). (See the Variable Types tutorial for descriptions of the variable types in SPSS.)


To change a variable's type, click inside the cell corresponding to the “Type” column for that variable. A square "..." button will appear; click on it to open the Variable Type window. Click the option that best matches the type of variable. Click OK .

Width: The number of digits displayed for numerical values or the length of a string variable.

To set a variable's width, click inside the cell corresponding to the “Width” column for that variable. Then click the "up" or "down" arrow icons to increase or decrease the width.

Decimals: The number of digits to display after the decimal point for values of that variable. Does not apply to string variables. Note that this changes how the numbers are displayed, but does not change the values in the dataset.

To specify the number of decimal places for a numeric variable, click inside the cell corresponding to the “Decimals” column for that variable. Then click the “up” or “down” arrow icons to increase or decrease the number of decimal places.

Example: If you specify that values should have two decimal places, they will display as 1.00, 2.00, 3.00, and so on.

Label: A brief but descriptive definition or display name for the variable. When defined, a variable's label will appear in the output in place of its name.

Example: The variable expgradate might be described by the label “Expected date of college graduation”.

Values: For coded categorical variables, the value label(s) that should be associated with each category code. Value labels are useful primarily for categorical (i.e., nominal or ordinal) variables, especially if they have been recorded as codes (e.g., 1, 2, 3). It is strongly suggested that you give each value a label so that you (and anyone looking at your data or results) understand what each value represents.

When value labels are defined, the labels will display in the output instead of the original codes. Note that defining value labels only affects the labels associated with each value, and does not change the recorded values themselves.

Example: In the sample dataset, the variable Rank represents the student's class rank. The values 1, 2, 3, 4 represent the categories Freshman, Sophomore, Junior, and Senior, respectively. Let's define the category labels for the Rank variable in the sample data.

Under the column “Values,” click the cell that corresponds to the variable whose values you wish to label. If the values are currently undefined, the cell will say “None.” Click the square “…” button. The Value Labels window appears.


Type the first possible value (1) for your variable in the Value  field. In the Label field type the label exactly as you want it to display (e.g., "Freshman"). Click Add when you are finished defining the value and label. Your variable value and label will appear in the center box. Repeat these steps for each possible value for your variable. When all of the labels have been defined, the Value Labels window should look like this:

The main box in the Value Labels window should have the statements 1 = 'Freshman', 2 = 'Sophomore', 3 = 'Junior', 4 = 'Senior'.

Click OK at the bottom of the window.

If you wish to change or remove a value and label that you have added to the center dialog box, do the following:

  • To change a specific value or label, highlight the value/label in the center text box in the Value Labels window. Now the selected value/label will be highlighted yellow. Make changes to the selected value or label as needed. Click Change . The changes will be applied to the value/label you highlighted.
  • To remove a specific value/label, highlight the value/label in the center text box. Click Remove . The selected value/label will be removed from the center text box.

Missing: The user-defined data values (or ranges of values) that should be treated as missing. Note that this property does not alter or eliminate SPSS's default missing value code for numeric variables ("."). This column merely allows the user to specify up to three unique missing value codes for the given variable, or to specify a range of numbers to treat as missing plus one additional unique missing value code.

To set user-defined missing value codes, click inside the cell corresponding to the “Missing” column for that variable. A square button will appear; click on it.


The Missing Values window appears.


Click the option that best matches how you wish to define missing data and enter any associated values, then click OK at the bottom of the window.

Note that you may enter numbers or letters as discrete missing value codes in the "discrete missing values" boxes.

Caution: If you have a dataset with string variables, blank cells are not automatically recognized as missing values. In order for blanks to be recognized as missing values, you can either:

  • add a space character ( Spacebar key) as a discrete missing value code (either in the Variable View or using syntax), or
  • use the Automatic Recode procedure to recode the string variable into a labeled, numeric categorical variable with blanks recoded into a special missing value code.

The latter option works well if there are a limited number of unique string values, but is a poor option if there are many unique variations in the strings (e.g. capitalization, spelling, spacing).
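Both options can also be written in syntax. The sketch below assumes a hypothetical string variable named city and a hypothetical new numeric variable named city_num; neither name comes from the sample data:

```spss
* Option 1: declare a single space as a user-missing code for the string variable.
MISSING VALUES city (' ').
* Option 2: recode into a labeled numeric variable, treating blank strings as user-missing.
AUTORECODE VARIABLES=city /INTO city_num /BLANK=MISSING.
```

Option 1 leaves the variable as a string; option 2 creates a new numeric variable whose value labels are the original strings.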

Columns: The width of each column in the Data View spreadsheet. Note that this is not the same as the number of digits displayed for each value. This simply refers to the width of the actual column in the spreadsheet.

To set a variable's column width, click inside the cell corresponding to the “Columns” column for that variable. Then click the “up” or “down” arrow icons to increase or decrease the column width.

Align: The alignment of content in the cells of the SPSS Data View spreadsheet. Options include left-justified, right-justified, or center-justified.

To set the alignment for a variable, click inside the cell corresponding to the "Align" column for that variable. Then use the drop-down menu to select your preferred alignment: Left, Right, or Center.

Measure: The level of measurement for the variable (e.g., nominal, ordinal, or scale).

Some procedures in SPSS treat categorical and scale variables differently. By default, variables with numeric responses are automatically detected as “Scale” variables. If the numeric responses actually represent categories, you must change the specified measurement level to the appropriate setting.

To define a variable's measurement level, click inside the cell corresponding to the “Measure” column for that variable. Then click the drop-down arrow to select the level of measurement for that variable: Scale, Ordinal, or Nominal.


It is vital that you correctly define each variable's measurement level. This setting affects everything from graphs to internal algorithms for statistical analysis. Incorrectly specifying measurement level can have unintended and potentially disastrous effects on your results.

Role: The role that a variable will play in your analyses (i.e., independent variable, dependent variable, both independent and dependent). Some options in SPSS allow you to pre-select variables for particular analyses based on their defined roles. Any variable that meets the role requirements will be available for use in such analyses. You can choose from the following roles for each variable:

  • Input: The variable will be used as a predictor (independent variable). This is the default assignment for variables.
  • Target: The variable will be used as an outcome (dependent variable).
  • Both: The variable will be used as both a predictor and an outcome (independent and dependent variable).
  • None: The variable has no role assignment.
  • Partition: The variable will partition the data into separate samples.
  • Split: Used with the IBM® SPSS® Modeler (not IBM® SPSS® Statistics).

To define a variable's role in your analysis, click inside the cell corresponding to the “Role” column for that variable. Then use the drop-down menu to select the role that variable will take: Input, Target, Both, None, Partition, or Split.


Defining Variables using Syntax

Variable names

Change an existing variable's name.

Rename one variable:

Rename more than one variable:
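A minimal sketch of both cases, using the Sex to Gender example from earlier in this tutorial plus hypothetical variables Q1, Q2, Item1 and Item2:

```spss
* Rename one variable.
RENAME VARIABLES (Sex = Gender).
* Rename more than one variable; old and new names are matched by position.
RENAME VARIABLES (Q1 Q2 = Item1 Item2).
```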

Variable width

Set the width for one variable:

Set the same width for multiple variables:

Set different widths for multiple variables:
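A possible sketch of the three cases, assuming that "width" here means the display format width described under the “Width” column and that Height and Weight are numeric variables (hypothetical names). The FORMATS command sets width and decimals together, so F5.1 means five characters wide with one decimal place:

```spss
* Set the width for one variable.
FORMATS Height (F5.1).
* Set the same width for multiple variables.
FORMATS Height Weight (F5.1).
* Set different widths for multiple variables.
FORMATS Height (F5.1) Weight (F8.2).
```

The width of the column itself in Data View (the “Columns” attribute) is a separate setting, controlled by the VARIABLE WIDTH command, for example VARIABLE WIDTH Height (10).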

Variable measurement level

Set the measurement level (nominal, ordinal, or scale) for one or more variables at a time:

Set more than one variable's measurement level at a time:
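A minimal sketch, assuming the variables Gender, Rank, Height and Weight exist in the active dataset (Height and Weight are hypothetical names):

```spss
* Set the measurement level for one variable.
VARIABLE LEVEL Gender (NOMINAL).
* Set levels for several variables in one command, separated by slashes.
VARIABLE LEVEL Height Weight (SCALE) /Rank (ORDINAL).
```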

Variable labels

Set label for one variable:

Set labels for several variables:
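A minimal sketch using variable names mentioned elsewhere in this tutorial; the label text for Rank is an assumption:

```spss
* Set the label for one variable.
VARIABLE LABELS expgradate "Expected date of college graduation".
* Set labels for several variables in one command.
VARIABLE LABELS StudentID "Student ID #" /Rank "Class rank".
```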

Value labels

Define labels for one numeric variable's values:

Define labels for one string variable's values:

Define the same set of labels for more than one numeric variable (e.g. you have several 5-point Likert items that all use the same coding scheme):

Define more than one set of labels at a time:
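A minimal sketch of the four cases. Rank and Gender follow the coding described in this tutorial; State, Item1 to Item3 and Athlete, along with their codes and labels, are hypothetical:

```spss
* Labels for one numeric variable.
VALUE LABELS Rank 1 "Freshman" 2 "Sophomore" 3 "Junior" 4 "Senior".
* Labels for one string variable; the values are quoted.
VALUE LABELS State 'OH' "Ohio" 'PA' "Pennsylvania".
* The same set of labels for several numeric variables.
VALUE LABELS Item1 Item2 Item3 1 "Strongly disagree" 2 "Disagree" 3 "Neutral" 4 "Agree" 5 "Strongly agree".
* More than one set of labels in a single command, separated by slashes.
VALUE LABELS Gender 0 "Male" 1 "Female" /Athlete 0 "No" 1 "Yes".
```

Note that VALUE LABELS replaces any labels already defined for the named variables; ADD VALUE LABELS adds to the existing set instead.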

Missing value codes

Define one special missing value for a single numeric variable:

Define more than one special missing value for a single numeric variable:

Define a set of missing value codes to be applied to several numeric variables:

Define one special missing value for a single string variable:

Define a blank character as a special missing value (only applies to string variables):

Define more than one special missing value for a single string variable:

Define different sets of special missing values for different variables:

Reset missing value codes for all variables:
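A minimal sketch of each case listed above. The variables Height, Weight and State and the codes -99, -77, 'NA' and 'XX' are hypothetical:

```spss
* One special missing value for a single numeric variable.
MISSING VALUES Height (-99).
* More than one special missing value for a single numeric variable.
MISSING VALUES Height (-99, -77).
* The same set of codes applied to several numeric variables.
MISSING VALUES Height Weight (-99, -77).
* One special missing value for a single string variable.
MISSING VALUES State ('NA').
* A blank (single space) as a special missing value for a string variable.
MISSING VALUES State (' ').
* More than one special missing value for a single string variable.
MISSING VALUES State ('NA', 'XX').
* Different sets of codes for different variables.
MISSING VALUES Height (-99) /State ('NA').
* Reset (remove) the user-missing codes for all variables.
MISSING VALUES ALL ().
```

Each new MISSING VALUES specification replaces the previous one for that variable, so only the last specification for Height and State is in effect before the final reset.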

Defining Variables with Define Variable Properties

The Define Variable Properties window is an efficient way of defining many variables at once, or defining many variables that share the same formatting. Click  Data > Define Variable Properties .


The Define Variable Properties window will open.


The left column displays all of the variables in your dataset. Select the variables you wish to define and move them to the right column using the arrow button. Note that you can specify the number of cases to scan, as well as the number of values that will display in the next step. Click  Continue  when you have finished selecting variables.

A second window will appear; this one allows you to define various properties for each variable you selected.


A Scanned Variable List:  The “Scanned Variable List” column includes the variables selected in the previous step. Variables that do not have assigned value labels will have an X in the “Unlabeled” column. For example, if the variable Gender has potential values of “1” and “2” but these values are not labeled (e.g., “male” and “female”, respectively), the Unlabeled values check box will be selected for this variable. The current Measurement Level and Role for each variable is also displayed.

B   Cases scanned:  This section displays the number of cases that were scanned for each selected variable, as well as the number of values that are listed in the Value Label grid (G).

C   Current Variable: Displays the variable that is currently selected from the Scanned Variable List (A).

D Measurement Level:  Displays the level of measurement for the selected variable. You can change the level of measurement by clicking the menu arrow and choosing the desired measurement level from the listed options: Scale, Ordinal, Nominal. You can also see the suggested level of measurement for your selected variable. To do this, click Suggest; this will open a new window that will display the currently selected variable, the current measurement level, and SPSS’s suggested level of measurement. SPSS also provides an explanation for the suggestion, and a description of each possible type of measurement level (nominal, ordinal, scale) to help you make a decision.

E   Role:  Displays the role for the selected variable. Some options in SPSS allow you to pre-select variables for particular analyses based on their defined roles. Any variable that meets the role requirements will be available for use in such analyses. You can change the role by clicking the menu arrow and choosing the desired role from the listed options: Input, Target, Both, None, Partition, Split.

F Unlabeled Values: Specifies how many values do not have corresponding value labels.

G Value Label grid: Displays current information about the selected variable and updates the information based on any changes you make.

Label: Displays value labels that have already been specified for the variable. You can change value labels by clicking on cells beneath the “Label” column and typing labels for each value specified in the “Value” column. If there are values you wish to label that are not currently displayed, you can enter the values in the “Value” column below the last value listed.

Value: The values for the selected variable. Note: The values are based on the specified number of scanned cases (B).

Count: The number of times a value occurs. Note: The count is based on the specified number of scanned cases (B).

Missing: Defines values as missing data. To mark certain values as missing data, simply check the box under “Missing” for the associated value under the “Value” column. Note: If a variable already has defined missing values (e.g., -99), you cannot change the missing values using the Define Variable Properties window. Instead, you will need to go to Variable View and specify any changes in the “Missing” column.

Changed: If you change the value label of a variable, the row associated with the changed value label will automatically be check-marked under the “Changed” column.

H Label: Allows you to add a label for the selected variable that describes more about what the variable is. This label is for the variable rather than for the values of the variable. For example, we might select the variable StudentID and give it the label “Student ID #”.

I Type: Allows you to specify a particular kind of variable that helps SPSS know how to work with the variable during analyses. The types include numeric, comma, dot, scientific, date, dollar, currency, percent, string, and restricted numeric. Depending on the type you select for your variable, you may be asked to supply additional information. For example, if you select “Date” as the type, you will then be able to select the format of the date from a drop-down menu to the right. You can also set the width and may be asked to set the decimals for your variable. Notice that when you select a particular type for the variable, examples of how the variable would display in your data appear in the Value Label grid area under “Value.”

J Attributes: Allows you to define custom attributes for variables. These attributes are supplementary information not otherwise specified by the variable's label, measurement labels, and missing values.

K Copy Properties: Allows you to copy properties from one variable to another variable. You can copy the properties from another variable to the currently selected variable, or copy the properties of the currently selected variable to one or more other variables. (For example, you may have several variables representing survey items, all of which use the value labels 0 = "No" and 1 = "Yes". After defining the value labels for the first item, you can use "Copy Properties" to quickly set the labels for the remaining survey item variables.)

L Unlabeled Values:  Allows you to automatically label unlabeled values by clicking Automatic Labels.

When you are finished defining your variables, click OK at the bottom of the window to apply the changes to your data.

Example: Adding value labels

As we mentioned at the beginning of this tutorial, it is important to define the variables in your data so that you (and anyone else working with your data) can easily understand what was measured, and how. In this section, we provide an example of the confusion that can result when value labels are not defined, and how to correct it.

In the sample data, the variable Gender has two possible values: 0 and 1. The sample data file is not formatted with any value labels. Let's make a Frequency table of the Gender variable to see what the distribution of gender is in our sample. Click  Analyze > Descriptive Statistics > Frequencies . Select the variable Gender , then click OK . (The Frequencies command will produce a frequency table.) The Output Viewer displays the following results:

Frequency table output when no value labels are defined. The first row is labeled 0 with n=204, 46.9%. The second row is labeled 1 with n=222, 51%.

This output shows frequencies for the variable Gender, which can take on values of “0” or “1.” We see that value “0” has 204 cases and value “1” has 222 cases. But what do these values mean? Which values represent females, and which values represent males? There is no commonly accepted coding scheme for gender, so readers not familiar with the data cannot be certain what is represented in this table.

In the sample data, 0 represents a Male, and 1 represents a Female. After defining the value labels (using the methods described above) and re-running the Frequencies command, the output is much easier for the reader to understand:

Frequency table output when value labels are defined. All values in the table are the same as before, except the first row is labeled Male and the second row is labeled Female.

It may also be useful to rewrite the labels so that the numeric code is included with the label. In this situation, we could alter the label for "male" to "Male (0)", and alter the label for "female" to "Female (1)".
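In syntax, the labeling and the re-run of the frequency table might look like the following sketch, which assumes the sample data file is the active dataset:

```spss
* Label the codes of Gender, then re-run the frequency table.
VALUE LABELS Gender 0 "Male" 1 "Female".
FREQUENCIES VARIABLES=Gender.
* Variant that includes the numeric code in each label.
VALUE LABELS Gender 0 "Male (0)" 1 "Female (1)".
```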


As you can see from this example, including value labels for each variable makes working with data and interpreting output much more straightforward. And remember: value labels are only one of many attributes that we can define for each variable. The more information you define about each variable, the easier it will be to navigate your data and interpret the output of analyses.

Example: Defining special codes for missing values

Suppose you have conducted a survey that has a time limit, and want to be able to distinguish respondents who refused to answer a question from respondents who ran out of time.

Respondents who refused to answer a survey item are coded as -99. Respondents who did not complete the survey item in the allotted time are coded as -77. All other missing responses were left blank.

To have SPSS recognize these special missing value codes, you'll need to define these numbers as indicators of missing values under the Variable View tab. Click on the cell corresponding to the "Missing" column for the variable of interest to open the Missing Values window. Click Discrete missing values, then enter the two missing value codes.

The code -99 is entered in the first box; the code -77 is entered in the second box. The order the codes are entered does not matter.

  • You can specify up to three different missing value codes.
  • You can apply value labels to missing value codes just like you would to valid categories. This is actually a good practice, because the names of missing value codes appear in the output.
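The same setup can be sketched in syntax. The variable name Q1 and the label text are hypothetical; the codes -99 and -77 are the ones described above:

```spss
* Declare both codes as user-missing for the survey item.
MISSING VALUES Q1 (-99, -77).
* Label the missing value codes without replacing any existing value labels.
ADD VALUE LABELS Q1 -99 "Refused to answer" -77 "Ran out of time".
* The frequency table now lists both codes, with their labels, under Missing.
FREQUENCIES VARIABLES=Q1.
```

ADD VALUE LABELS is used here rather than VALUE LABELS so that labels already defined for the valid categories are kept.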

Without Value Labels

Frequency table of variable with no value labels defined

With Value Labels

Frequency table of variable with value labels assigned to all missing and nonmissing value codes

  • Last Updated: Dec 18, 2023 12:59 PM
  • URL: https://libguides.library.kent.edu/SPSS


SPSS Missing Values Tutorial


What are “Missing Values” in SPSS?

In SPSS, “missing values” may refer to 2 things:

  • System missing values are values that are completely absent from the data. They are shown as periods in data view.
  • User missing values are values that are excluded while analyzing or editing data. The SPSS user specifies which values, if any, must be excluded.

This tutorial walks you through both. We'll use bank.sav -partly shown below- throughout. You'll get the most out of this tutorial if you try the examples for yourself after downloading and opening this file.

SPSS Bank Sav Data View

SPSS System Missing Values

System missing values are values that are completely absent from the data. System missing values are shown as dots in data view as shown below.

SPSS System Missing Values In Data View Example Bank

System missing values are only found in numeric variables. String variables don't have system missing values. Data may contain system missing values for several reasons:

  • some respondents weren't asked some questions due to the questionnaire routing;
  • a respondent skipped some questions;
  • something went wrong while converting or editing the data;
  • some values weren't recorded due to equipment failure.

In some cases system missing values make perfect sense. For example, say I ask “do you own a car?” and somebody answers “no”. Well, then my survey software should skip the next question: “what color is your car?” In the data, we'll probably see system missing values on color for everyone who does not own a car. These missing values make perfect sense.

In other cases, however, it may not be clear why there are system missing values in your data. Something may or may not have gone wrong. Therefore, you should try to find out why some values are system missing, especially if there are many of them.

So how to detect and handle missing values in your data? We'll get to that after taking a look at the second type of missing values.

SPSS User Missing Values

User missing values are values that are excluded when analyzing or editing data. “User” in user missing refers to the SPSS user. Hey, that's you! So it's you who may need to set some values as user missing. So which values, if any, must be excluded? Briefly,

  • for categorical variables, answers such as “don't know” or “no answer” are typically excluded from analysis.
  • For metric variables, unlikely values -a reaction time of 50ms or a monthly salary of € 9,999,999- are usually set as user missing.

For bank.sav, no user missing values have been set yet, as can be seen in variable view.

Let's now see if any values should be set as user missing and how to do so.

User Missing Values for Categorical Variables

A quick way for inspecting categorical variables is running frequency distributions and corresponding bar charts. Make sure the output tables show both values and value labels. The easiest way for doing so is running the syntax below.
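The syntax itself did not survive in this copy of the tutorial. A minimal sketch of what such a run might look like, assuming the categorical variables are named q1 to q9 and are adjacent in the data file:

```spss
* Show both values and value labels in output tables.
set tnumbers both.

* Frequency tables and bar charts for q1 to q9 (assumed variable names).
frequencies q1 to q9
/barchart.
```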

No User Missing Values Set Categorical Variable

First note that q1 is an ordinal variable: higher values indicate higher levels of agreement. However, this does not go for 11: “No answer” does not indicate more agreement than 10 - “Totally agree”. Therefore, only values 1 through 10 make up an ordinal variable and 11 should be excluded. The syntax below shows the right way to do so.
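A minimal sketch of setting 11 as user missing, assuming the variables are named q1 to q9 and are adjacent in the data file (the original syntax is not preserved here):

```spss
* Set 11 (No answer) as user missing for q1 to q9.
missing values q1 to q9 (11).

* Rerun the frequency tables to check that 11 is now listed as missing.
frequencies q1 to q9.
```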

SPSS User Missings Set In Frequency Table

Note that 11 is shown among the missing values now. It occurs 6 times in q1 and there are also 14 system missing values. In variable view, we also see that 11 is set as a user missing value for q1 through q9.

SPSS User Missing Values In Variable View

User Missing Values for Metric Variables

The right way to inspect metric variables is running histograms over them. The syntax below shows the easiest way to do so.
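As a rough sketch, a histogram can be requested through FREQUENCIES while suppressing the (usually very long) frequency table. The variable name whours is an assumption for weekly working hours, not a name confirmed by this text:

```spss
* Histogram without frequency table for whours (hypothetical variable name).
frequencies whours
/format notable
/histogram.
```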

SPSS Histogram With Outliers

Some respondents report working over 150 hours per week. Perhaps these are their monthly -rather than weekly- hours. In any case, such values are not credible. We'll therefore set all values of 50 hours per week or more as user missing. After doing so, the distribution of the remaining values looks plausible.
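One possible way to set such a range as user missing is sketched below. The variable name whours is hypothetical, and the cutoff of 50 follows the text above:

```spss
* Set 50 or more hours per week as user missing for whours (hypothetical name).
missing values whours (50 thru hi).

* Rerun the histogram to inspect the remaining values.
frequencies whours
/format notable
/histogram.
```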

A super fast way to inspect (system and user) missing values per variable is running a basic DESCRIPTIVES table. Before doing so, make sure you don't have any WEIGHT or FILTER switched on. You can check this by running SHOW WEIGHT FILTER N. Also note that there are 464 cases in these data. So let's now inspect the descriptive statistics.
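These two steps could be sketched as follows, assuming the variables of interest are q1 to q9:

```spss
* Check whether a weight or filter is active and how many cases there are.
show weight filter n.

* Basic descriptives table with N per variable and Valid N (listwise).
descriptives q1 to q9.
```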

SPSS Inspect Missing Values Per Variable

The N column shows the number of non missing values per variable. Since we have 464 cases in total, (464 - N) is the number of missing values per variable. If any variables have high percentages of missingness, you may want to exclude them from -especially- multivariate analyses. Importantly, note that Valid N (listwise) = 309. These are the cases without missing values on any of the variables in this table. Some procedures will use only those 309 cases, which is known as listwise exclusion of missing values in SPSS. Conclusion: none of our variables -columns of cells in data view- have huge percentages of missingness. Let's now see if any cases -rows of cells in data view- have many missing values.

Inspecting Missing Values per Case

For inspecting if any cases have many missing values, we'll create a new variable. This variable holds the number of missing values over a set of variables that we'd like to analyze together. In the example below, that'll be q1 to q9. We'll use a short and simple variable name: mis_1 is fine. Just make sure you add a description of what's in it -the number of missing...- as a variable label.
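A minimal sketch of this step using COUNT, assuming q1 to q9 are adjacent in the data file. The label text is only an example:

```spss
* Count system and user missing values over q1 to q9 for each case.
count mis_1 = q1 to q9 (missing).

* Describe the new variable with a variable label.
variable labels mis_1 'Number of missing values over q1 to q9'.

* Inspect how many cases have how many missing values.
frequencies mis_1.
```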

SPSS Inspect Missing Values Per Case

In this table, 0 means zero missing values over q1 to q9. This holds for 309 cases. This is the Valid N (listwise) we saw in the descriptives table earlier on. Also note that 1 case has 8 missing values out of 9 variables. We may doubt if this respondent filled out the questionnaire seriously. Perhaps we'd better exclude it from the analyses over q1 to q9. The right way to do so is using a FILTER.
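A sketch of such a filter, assuming mis_1 holds the number of missing values over q1 to q9. The variable name filt_1 and the cutoff of fewer than 8 missing values are illustrative choices, not the author's:

```spss
* Filter variable: 1 for cases to keep, 0 for cases to exclude.
compute filt_1 = (mis_1 < 8).
variable labels filt_1 'Filter: fewer than 8 missing values over q1 to q9'.

* Switch the filter on before analyzing q1 to q9.
filter by filt_1.

* Switch the filter off again when done.
filter off.
```

Filtered cases stay in the data file, so the exclusion is easy to undo.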

So how does SPSS analyze data if they contain missing values? Well, in most situations, SPSS runs each analysis on all cases it can use for it. Right, now our data contain 464 cases. However, most analyses can't use all 464 because some may drop out due to missing values. Which cases drop out depends on which analysis we run on which variables. Therefore, an important best practice is to always inspect how many cases are actually used for each analysis you run. This is not always what you might expect. Let's first take a look at pairwise exclusion of missing values.

Pairwise Exclusion of Missing Values

Pairwise exclusion means that each correlation between a pair of variables uses all cases having valid values on these 2 variables. Let's inspect all (Pearson) correlations among q1 to q9. The simplest way for doing so is just running correlations q1 to q9. If we do so, we get the table shown below.

SPSS Pairwise Exclusion Of Missing Values Correlation Matrix

Note that each correlation is based on a different number of cases. Precisely, each correlation between a pair of variables uses all cases having valid values on these 2 variables. This is known as pairwise exclusion of missing values. Note that most correlations are based on some 410 up to 440 cases.

Listwise Exclusion of Missing Values

Let's now rerun the same correlations after adding a line to our minimal syntax: correlations q1 to q9 /missing listwise. After running it, we get a smaller correlation matrix as shown below. It no longer includes the number of cases per correlation.

SPSS Listwise Exclusion Of Missing Values Correlation Matrix

Each correlation is based on the same 309 cases, the listwise N. These are the cases without missing values on any of the variables in the table: q1 to q9. This is known as listwise exclusion of missing values. Obviously, listwise exclusion often uses far fewer cases than pairwise exclusion. This is why we often recommend the latter: we want to use as many cases as possible. However, if many missing values are present, pairwise exclusion may cause computational issues. In any case, make sure you know if your analysis uses listwise or pairwise exclusion of missing values. By default, regression and factor analysis use listwise exclusion and in most cases, that's not what you want.
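As an illustration only, pairwise exclusion can be requested explicitly in REGRESSION through its MISSING subcommand. Using q1 as the dependent variable and q2 to q9 as predictors is an arbitrary assumption made for this sketch:

```spss
* Regression with pairwise instead of the default listwise exclusion.
regression
/missing pairwise
/dependent q1
/method enter q2 to q9.
```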

SPSS Listwise Exclusion Of Missing Values

Exclude Missing Values Analysis by Analysis

Analyzing if 2 variables are associated is known as bivariate analysis. When doing so, SPSS can only use cases having valid values on both variables. Makes sense, right? Now, if you run several bivariate analyses in one go, you can exclude cases analysis by analysis: each separate analysis uses all cases it can. Different analyses may use different subsets of cases. If you don't want that, you can often choose listwise exclusion instead: each analysis uses only cases without missing values on any of the variables in any of the analyses. The figure below illustrates this for ANOVA.

SPSS ANOVA Exclude Missing Values By Analysis

We usually want to use as many cases as possible for each analysis. So we prefer to exclude cases analysis by analysis. But whichever you choose, make sure you know how many cases are used for each analysis. So check your output carefully. The Kolmogorov-Smirnov test is especially tricky in this respect: by default, one option excludes cases analysis by analysis and the other uses listwise exclusion.
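For the ANOVA case, the two options might look as follows in syntax. The grouping variable name group is hypothetical:

```spss
* Each ANOVA uses all cases with valid values on its own variables.
oneway q1 to q9 by group
/missing analysis.

* Each ANOVA uses only cases with valid values on all variables involved.
oneway q1 to q9 by group
/missing listwise.
```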

Editing Data with Missing Values

Editing data with missing values can be tricky. Different commands and functions act differently in this case. Even something as basic as computing means in SPSS can go very wrong if you're unaware of this. The syntax below shows 3 ways we sometimes encounter. With missing values, however, 2 of those yield incorrect results.
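The original syntax is not preserved in this copy. The sketch below is a guess at three such approaches, not necessarily the ones the author showed, and assumes q1 to q9 are adjacent in the data file:

```spss
* MEAN uses all valid values and is missing only if all 9 values are missing.
compute mean_1 = mean(q1 to q9).

* SUM skips missing values but still divides by 9, so results are pulled toward zero.
compute mean_2 = sum(q1 to q9) / 9.

* The plus operator returns system missing as soon as any value is missing.
compute mean_3 = (q1 + q2 + q3 + q4 + q5 + q6 + q7 + q8 + q9) / 9.
execute.
```

If a mean over very few valid values is undesirable, MEAN.7(q1 to q9) computes the mean only for cases having at least 7 valid values.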

SPSS Compute Means Correct Way

Final Notes

In real world data, missing values are common. They don't usually cause a lot of trouble when analyzing or editing data but in some cases they do. A little extra care often suffices if missingness is limited. Double check your results and know what you're doing.

Thanks for reading.


This tutorial has 37 comments:


By Ruben Geert van den Berg on June 1st, 2016

First off, do your system missings indicate zeroes? If so, then RECODE them to zeroes and try again.

If system missing values do not indicate zeroes, use SUM instead of "+".

I'll add a few tiny examples below. Let me know whether that solves your problem, ok?

data list free/v1 v2 v3.
begin data
5 2 7
6 '' 2
'' '' 5
'' '' ''
end data.

*Plus operator returns sysmis when missing in arguments.
compute plustotal = v1 + v2 + v3.
execute.

*Sum operator returns sysmis only if all arguments are missing.
compute sumtotal = sum(v1,v2,v3).
execute.

*Recode and sum.
recode v1 to v3 (sysmis = 0).
compute recodetotal = v1 + v2 + v3.
execute.


By Kay on July 28th, 2016

If you have missing data that is greater than 5%, would it be more realistic to delete the data versus using the mean in each missing data box?

By Ruben Geert van den Berg on July 28th, 2016

Unfortunately, it's not that simple. The first question you should ask yourself is why data are missing in the first place. Second, what are you going to do with the data? Missing values tend to be more problematic as more variables are involved in an analysis because they tend to reduce the number of complete cases in -for instance- factor analysis.

I wouldn't propose any simple rule such as > 5% or > 10% for all different scenarios. Also, replacing missing values with a variable's mean may alleviate some trouble but it obviously biases results as well so try and use it sparingly, ok?


By Boushra on October 7th, 2016

Hello, a few variables in my data have quite a lot of system missing values because the survey was designed so that all participants answered some questions and only 20% of participants answered another set of questions. The question is: how can I deal with these system missing values? I think I cannot use expectation maximization (EM) because my data are categorical. Thanks in advance for your help.

By Ruben Geert van den Berg on October 8th, 2016

Hi Boushra!

When analyzing variables separately or perhaps in pairs (bivariate analyses), this doesn't usually pose too much of a problem. That's obviously different for analyses involving many variables at once (such as factor analysis or regression).

There's no ideal solution.

You can perhaps treat your sample as two separate samples, with and without the system missings and analyze them separately with different sets of variables.

In some cases, you can RECODE the system missings into a valid category and treat the recoded variables as nominal variables or perhaps use dummy coding for them.

If the overall percentage of (n*k) missing data points for n cases and k variables is low, perhaps around 5%-10%, you could consider (multiple) imputation of the missing values.

These are some basic ways to handle the situation but -again- none of those are ideal. You'll have to make some sacrifices for carrying on with your analyses I'm afraid.


Mastering SPSS: Essential Topics and How to Solve Assignments Effectively

Essential Topics to Master Before Starting an SPSS Assignment

Oliver Henderson

Understanding the Basics of SPSS

Understanding the basics of SPSS is crucial for any data analysis project. SPSS (Statistical Package for the Social Sciences) is a powerful software widely used in various fields to perform statistical analyses and interpret data. It provides an intuitive interface, making it accessible to both beginners and experienced researchers. By learning the fundamentals of data entry, importing, and cleaning, users can ensure accurate and reliable analyses. Moreover, mastering descriptive statistics, hypothesis testing, and data visualization will enable researchers to draw meaningful insights from their data. This foundational knowledge sets the stage for more advanced statistical analyses and a successful SPSS journey.


The following topics are essential to know:

Data Entry and Data Import

Data entry and data import are critical steps in the SPSS workflow. Properly organizing and entering data is essential for accurate analysis and valid results. SPSS offers various methods to input data, including manual entry or importing from external sources like Excel or CSV files. Understanding how to handle missing data and outliers during this process is crucial to ensure data integrity. Additionally, knowing how to label variables and assign value labels improves data clarity and interpretation. By mastering data entry and import, researchers can avoid data errors, save time, and lay a solid foundation for a successful SPSS assignment.

Some of the assignments you can expect on data entry and data import include:

  • Data Entry Accuracy Assessment: To solve a data entry accuracy assessment assignment, carefully enter the provided dataset into SPSS while minimizing errors. Double-check the data for accuracy and correct any mistakes. Use validation techniques such as cross-referencing with the original data source. Analyze any discrepancies and document your approach to ensure transparency. This exercise helps improve data entry skills and emphasizes the importance of accurate data handling for reliable statistical analysis.
  • Data Import and Cleaning: To solve a data import and cleaning assignment, start by importing the dataset into SPSS from various file formats (Excel, CSV). Address missing values, duplicates, and outliers. Check data consistency and validity. Employ functions for data cleaning, like recoding variables or imputing missing values. Document your steps clearly. Lastly, validate the cleaned dataset for accuracy and usability before proceeding with any further analysis.
  • Merging Datasets: To solve an assignment on merging datasets in SPSS, follow these steps. First, ensure the datasets have a common identifier (e.g., ID). Use Data > Merge Files and choose Add Variables to combine different variables for the same cases, or Add Cases to stack cases that share the same variables, and identify the key variable. Check for duplicate records and resolve inconsistencies. Validate the merged dataset by comparing case counts and a few records with the original files. If separate analyses per group are needed afterwards, the "Split File" option can provide them. A successful assignment requires understanding data relationships and using SPSS tools accurately for a comprehensive analysis.
  • Longitudinal Data Handling: To solve an assignment on longitudinal data handling, first, understand the dataset's structure and time points. Organize the data in SPSS, ensuring it's in the appropriate format (wide or long). Use Data > Restructure to convert between wide and long formats as the chosen analysis requires. Apply statistical techniques such as repeated measures ANOVA or growth curve modeling to examine trends and changes over time. Finally, interpret and present the findings, showcasing a clear understanding of the data's longitudinal nature and demonstrating analytical skills.
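The merging and restructuring steps above can be sketched in syntax. All file and variable names here (scores.sav, demographics.sav, id, score1 to score3) are hypothetical, and the sketch assumes each id occurs once per file:

```spss
* MATCH FILES with BY requires both files to be sorted on the key variable.
get file = 'demographics.sav'.
sort cases by id.
save outfile = 'demographics_sorted.sav'.

get file = 'scores.sav'.
sort cases by id.

* Add the variables from the sorted demographics file to the active dataset.
match files
/file = *
/file = 'demographics_sorted.sav'
/by id.
execute.

* Restructure from wide (score1 to score3) to long (one row per time point).
varstocases
/make score from score1 score2 score3
/index = time.
```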

Descriptive Statistics

Descriptive statistics play a fundamental role in data analysis by providing a concise summary of the main features within a dataset. These statistics, including measures like mean, median, mode, standard deviation, and variance, offer valuable insights into the central tendency, spread, and distribution of the data. Understanding descriptive statistics in SPSS allows researchers to gain a clear understanding of their data before moving on to more complex analyses. Additionally, visual representations, such as histograms and box plots, help researchers identify patterns and outliers, making it easier to make informed decisions and draw meaningful conclusions from the data at hand.

Here are the types of assignments you will get on descriptive statistics and how you can solve them:

  • Central Tendency Assignment: To solve a central tendency assignment, import the dataset into SPSS, calculate the mean, median, and mode via Analyze > Descriptive Statistics > Frequencies (the "Descriptives" procedure reports the mean but not the median or mode), and interpret the results. The mean represents the average, the median is the middle value, and the mode is the most frequent value in the dataset, providing insights into the central tendencies of the data.
  • Measures of Dispersion Assignment: To solve a measures of dispersion assignment, import the dataset into SPSS, then calculate the range, standard deviation, and variance using the "Descriptives" procedure. Interpret the results to understand the spread of the data, identifying the variability and distribution characteristics.
  • Frequency Distribution Assignment: To solve a frequency distribution assignment, import the dataset into SPSS, then use the "Frequencies" option to generate frequency tables for the variables of interest. Additionally, create histograms to visualize the distribution. Analyze the frequency tables and histograms to identify patterns and trends in the data.
  • Correlation Assignment: To solve a correlation assignment, first, import the dataset into SPSS. Choose the variables you want to explore for correlation. Use the "Correlations" option to calculate correlation coefficients. Interpret the results to determine the strength and direction of the relationship between the variables, considering statistical significance using p-values.
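A minimal syntax sketch for these assignment types, assuming two hypothetical numeric variables named score and age:

```spss
* Central tendency, dispersion and a histogram in one FREQUENCIES run.
frequencies score
/format notable
/statistics mean median mode stddev variance range
/histogram.

* Pearson correlation with significance level and N.
correlations score age.
```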

Hypothesis Testing

Hypothesis testing is a fundamental concept in statistics and plays a pivotal role in research and decision-making processes. In SPSS, researchers can examine whether their hypotheses are supported or refuted based on empirical evidence. By setting up null and alternative hypotheses and using appropriate statistical tests like t-tests or ANOVA, analysts can draw conclusions about the population from a sample. Understanding p-values, significance levels, and the correct interpretation of results are essential to avoid drawing incorrect conclusions. Hypothesis testing in SPSS empowers researchers to make data-driven decisions and contributes to the validity and reliability of their research findings.

Types of Hypothesis Testing Assignments:

  • One-Sample T-Test Assignment: In this assignment, you are given a dataset with a single sample, and you need to test whether the sample mean differs significantly from a hypothesized value. Use SPSS to perform a one-sample t-test. Enter the data, set the null hypothesis, select the t-test option, and interpret the result based on the p-value and significance level.
  • Independent Samples T-Test Assignment: In this assignment, you are provided with data on two independent groups, and you need to determine if there is a significant difference in the means of the two groups. In SPSS, the data go into a single dataset with one outcome variable and one grouping variable. Input the data, set the null hypothesis, select the independent samples t-test option, and interpret the outcome based on the p-value and significance level.
  • Paired Samples T-Test Assignment: In this assignment, you are given two related measurements on the same cases (for example, before and after), stored as two variables, and your task is to examine if there is a significant difference between their means. Use SPSS to execute a paired samples t-test. Enter the paired data, set the null hypothesis, select the paired samples t-test option, and interpret the results using the p-value and significance level.
  • One-Way ANOVA Assignment: In this assignment, you are provided with a dataset containing multiple groups, and you need to ascertain if there are significant differences in means across those groups. Employ SPSS to perform a one-way ANOVA. Enter the data, set the null hypothesis, select the ANOVA option, and interpret the result based on the p-value and significance level. Additionally, post-hoc tests may be required to identify specific group differences.
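The four tests above might be run with syntax along these lines. The variable names (score, group, pre, post), the test value of 50, and the group codes 1 and 2 are assumptions made for illustration:

```spss
* One-sample t-test against a hypothesized mean of 50.
t-test
/testval = 50
/variables = score.

* Independent samples t-test comparing groups coded 1 and 2.
t-test groups = group(1 2)
/variables = score.

* Paired samples t-test on two measurements of the same cases.
t-test pairs = pre with post (paired).

* One-way ANOVA with Tukey post-hoc comparisons.
oneway score by group
/posthoc = tukey.
```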

Correlation and Regression

Correlation measures the strength and direction of the linear relationship between two variables, while regression predicts the value of a dependent variable based on one or more independent variables. These topics are often encountered in research and data analysis. Knowing how to perform correlation and regression analyses in SPSS will enable you to explore relationships and make predictions from your data.

  • Simple Correlation Analysis Assignment: For this assignment, calculate and interpret the correlation coefficient between two variables using SPSS. Identify the strength and direction of the relationship and present your findings in a clear and concise manner.
  • Multiple Regression Assignment: In this task, perform multiple regression analysis in SPSS to predict a dependent variable based on two or more independent variables. Select relevant variables, run the regression, and interpret the coefficients to draw meaningful conclusions.
  • Correlation and Regression Comparison Assignment: Compare and contrast correlation and regression analyses in SPSS. Explain their purposes, assumptions, and interpretations. Provide examples to demonstrate their applications in different scenarios.
  • Real-Life Data Analysis Assignment: Obtain a dataset with variables suitable for correlation and regression analysis. Clean the data, perform the appropriate analysis in SPSS, and interpret the results. Discuss the practical implications of the findings in a real-world context.
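A minimal sketch of both analyses, assuming a hypothetical dependent variable named outcome and predictors named x1 and x2:

```spss
* Pearson correlations among the outcome and both predictors.
correlations outcome x1 x2.

* Multiple regression predicting outcome from x1 and x2.
regression
/dependent outcome
/method enter x1 x2.
```

Note that REGRESSION uses listwise exclusion of missing values by default, so the number of cases may differ from the one reported by CORRELATIONS.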

Data Visualization

Data visualization plays a pivotal role in understanding complex datasets and communicating insights effectively. SPSS offers a wide range of visualization options, such as histograms, scatter plots, and bar charts, allowing researchers to present data in a visually engaging manner. By choosing the appropriate charts, researchers can identify patterns, trends, and outliers, making it easier to draw conclusions from the data. Furthermore, visualizations aid in conveying findings to a broader audience, making complex statistical information more accessible and comprehensible. A skillful use of data visualization in SPSS enhances the clarity and impact of research results, thereby strengthening the overall research narrative.

Types of data visualization assignments:

  • Creating Descriptive Visualizations: In this type of assignment, you may be asked to generate descriptive visualizations for a given dataset using SPSS. Start by importing the data and exploring its variables. Use appropriate chart types such as histograms, bar charts, and pie charts to visualize the distribution of categorical and numerical variables. Customize the visuals by adding labels, titles, and color schemes to improve clarity. For numerical data, consider box plots and scatter plots to identify outliers and patterns. Present the visualizations along with a brief interpretation of the main insights.
  • Comparative Visualizations: In a comparative visualization assignment, you might need to compare two or more groups or variables. Use grouped bar charts, stacked bar charts, or line graphs to demonstrate the differences between the groups. Apply color coding and legends to make the visualizations more informative. For more advanced analyses, consider using heatmaps or radar charts to display multivariate comparisons. Explain the key findings and any significant trends or patterns observed in the data.
  • Time-Series Visualizations: Time-series visualizations involve displaying data points over time. Use line graphs or area charts to represent the trends and changes in the data over specific time intervals. Pay attention to the x-axis labels and format to ensure the time is displayed accurately. Utilize different line styles or colors for multiple time series. If applicable, add annotations or callouts to highlight important events or occurrences during the time period. Analyze the visualizations to draw conclusions about any temporal patterns or fluctuations.
  • Geospatial Visualizations: In geospatial visualization assignments, you will be working with spatial data and representing it on maps. Import the geographic data into SPSS and link it with your dataset. Use choropleth maps to display numerical data for different regions or territories. You can also use bubble maps to show variations in data based on the size of the bubbles in different locations. Customize the map legend, color scales, and data ranges to enhance the visualization's clarity. Analyze the geospatial visualizations to draw insights about spatial patterns and regional differences in the data.
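For the more basic chart types mentioned above, the legacy GRAPH command offers a compact starting point. The variable names score, age, group and year are hypothetical:

```spss
* Histogram of a numeric variable.
graph /histogram = score.

* Bar chart of mean score per group.
graph /bar(simple) = mean(score) by group.

* Scatterplot of two numeric variables.
graph /scatterplot(bivar) = age with score.

* Line chart of mean score per year.
graph /line(simple) = mean(score) by year.
```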

Data Transformation and Variable Recoding

Data transformation and variable recoding are vital skills in SPSS for preparing data for analysis. Data transformation involves converting variables into different formats or scales, such as logarithmic or square root transformations, to meet statistical assumptions. Variable recoding allows researchers to combine or modify existing variables, simplifying the analysis. These techniques are useful when dealing with skewed data or categorical variables. By mastering these methods, researchers can enhance the accuracy and reliability of their analyses and derive more insightful results from their data.

  • Log Transformation for Skewed Data: To solve an assignment on log transformation for skewed data, first, identify the skewed variable. Calculate the natural logarithm (ln) of each value in the variable to create a new transformed variable; this requires all values to be greater than zero. The transformation often reduces right skew, which can make the data better suited to analyses that assume normally distributed data, so check the transformed distribution afterwards.
  • Recoding Categorical Variables: To solve an assignment on recoding categorical variables, start by identifying the specific categorical variable and the desired outcome (e.g., binary or multi-category recoding). Create a new variable, assign codes to each category accordingly, and recode the data. Validate the recoded variable's accuracy and use it in subsequent analyses for simplified interpretations.
  • Standardization of Variables: To solve an assignment on standardization of variables, calculate the mean and standard deviation for each variable. For each data point, subtract the mean and divide the result by the standard deviation. This process will transform the variables into a common scale with a mean of 0 and a standard deviation of 1, which makes variables measured in different units comparable.
  • Binning Continuous Variables: To solve an assignment on binning continuous variables, first, determine suitable bin intervals based on the data's distribution and context. Then, divide the range of the continuous variable into these intervals and create a new categorical variable. Assign data points to the corresponding bins, facilitating analysis and interpretation in distinct groups.
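The four transformations above can be sketched in syntax. The variable names (income, educ, age), the category codes and the bin boundaries are assumptions chosen for illustration; the age bins assume whole-number ages:

```spss
* Natural log, defined only for values greater than zero.
compute ln_income = ln(income).

* Recode a 5-category variable into a binary one.
recode educ (1 thru 2 = 0) (3 thru 5 = 1) into educ_high.

* Standardize: the SAVE subcommand adds z-scores as a new variable (Zincome).
descriptives income
/save.

* Bin a continuous variable into three groups and label them.
recode age (lo thru 29 = 1) (30 thru 49 = 2) (50 thru hi = 3) into agegroup.
value labels agegroup 1 'Under 30' 2 '30 to 49' 3 '50 or over'.
execute.
```

Recoding INTO a new variable leaves the original values intact, so the result can be checked against them with a crosstab.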

Mastering the essential topics in SPSS and knowing how to approach SPSS assignments will empower you to handle various data analysis tasks confidently. By understanding the basics of SPSS, data entry, hypothesis testing, correlation, regression, data visualization, and data transformation, you will be well-prepared to tackle a wide range of statistical problems. Through practice and hands-on experience with SPSS, you can enhance your analytical skills and become proficient in using this powerful statistical software for research and data analysis.

Post a comment...

Mastering spss: essential topics and how to solve assignments effectively submit your assignment, attached files.

IMAGES

  1. Missing Values in SPSS

    spss assignment missing values

  2. SPSS Tutorial #6: How to Code, Define, Analyse, and Deal with Missing

    spss assignment missing values

  3. How to Replace Missing Values in SPSS?

    spss assignment missing values

  4. Missing Values SPSS

    spss assignment missing values

  5. SPSS Tutorial #6: How to Code, Define, Analyse, and Deal with Missing

    spss assignment missing values

  6. Missing Values in SPSS

    spss assignment missing values

VIDEO

  1. Replace missing values in "SPSS" -12

  2. 7-Day Data Analysis Workshop

  3. SPSS Assignment #6

  4. SPSS Assignment #7

  5. Missed Data and Its Management in SPSS (Amharic tutorial)

  6. Quantitative assignment by SPSS data

COMMENTS

  1. Missing Values in SPSS

    In SPSS, "missing values" may refer to 2 things: System missing values are values that are completely absent from the data. They are shown as periods in data view. User missing values are values that are invisible while analyzing or editing data. The SPSS user specifies which values -if any- must be excluded. This tutorial walks you through ...

  2. Missing data

    There are two types of missing values in SPSS: 1) system-missing values, and 2) ... Missing values in assignment expressions. An assignment expression may appear on a compute or an if command. It is important to understand how missing values are handled in assignment statements. Consider the example shown below.

  3. SPSS

    Setting Missing Values in SPSS. Perhaps unsurprisingly, missing values can be specified with the MISSING VALUES command. A thing to note, however, is that missing values can be specified for multiple variables at once. Second, missing values may be specified as a range. If a range is used, a single discrete missing value can be added to it.

  4. Missing Value Analysis

    Missing value analysis helps address several concerns caused by incomplete data. If cases with missing values are systematically different from cases without missing values, the results can be misleading. Also, missing data may reduce the precision of calculated statistics because there is less information than originally planned. Another ...

  5. PDF IBM SPSS Missing Values 19

    Preface. IBM® SPSS® Statistics is a comprehensive system for analyzing data. The Missing Values optional add-on module provides the additional analytic techniques described in this manual. The Missing Values add-on module must be used with the SPSS Statistics Core system and is completely integrated into that system.

  6. PDF IBM SPSS Missing Values 22

    In the main Missing Value Analysis dialog box, select the variable(s) for which you want to estimate missing values using the EM method. Select EM in the Estimation group. To specify predicted and predictor variables, click Variables. See the topic "Predicted and Predictor Variables" on page 8 for more information.

  7. Averaging and Adding Variables with Missing Data in SPSS

    There are two ways to do this in SPSS syntax. Newvar= (X1 + X2 + X3 + X4 + X5)/5 or. Newvar=MEAN (X1,X2, X3, X4, X5). In the first method, if any of the variables are missing, due to SPSS's default of listwise deletion, Newvar will also be missing. In the second method, if any of the variables is missing, it will still calculate the mean.

  8. Missing Values

    The IBM® SPSS® Missing Values module helps you manage missing values in your data and draw more valid conclusions. Uncover the patterns behind missing data, estimate summary statistics and impute missing values using statistical algorithms. The module helps you build models that account for missing data and remove hidden bias.

  9. Missing-Value functions (COMPUTE command)

    The MISSING VALUE command declares the value 0 as missing for V1, V2, and V3. AllValid is the sum of three variables only for cases with valid values for all three variables. AllValid is assigned the system-missing value for a case if any variable in the assignment expression has a system- or user-missing value.

  10. Missing Values in SPSS

    DESCRIPTION:When analyzing real-world data in SPSS, missing values can be a real pain.In this SPSS beginners video, I'll cover everything you want to know ab...

  11. SPSS Tutorial #6: How to Code, Define, Analyse, and Deal with Missing

    To analyse the patterns of the missing values in SPSS, follow the procedure below: Click on Analyze menu > Missing Value Analysis; Click Patterns; In the Patterns dialogue box, you can select various patterns tables. Select "tabulated cases, grouped by missing value patterns." This will also select "sort variables by missing value pattern.

  12. How can I see the number of missing values and patterns of missing

    Now we know the number of missing values in each variable: the variable salepric has four and saltoapr has two missing values. This will help us to identify variables that may have a large number of missing values, and perhaps we may want exclude those from analysis. 2. Number of missing values in each observation

  13. SPSS Tutorial #6: How to Code, Define, Analyse, and Deal with Missing

    The next step is to setup the missing values in SPSS. If this is not done, SPSS wishes include the 96 values in all aforementioned analyses leading to biased results. Missing values includes assignment expressions. In assignment expression may appear on a computation or an if charge. It is important to understand how missing values ...

  14. PDF IBM SPSS Missing Values 20

    Preface. IBM® SPSS® Statistics is a comprehensive system for analyzing data. The Missing Values optional add-on module provides the additional analytic techniques described in this manual. The Missing Values add-on module must be used with the SPSS Statistics Core system and is completely integrated into that system.

  15. Overview (MISSING VALUES command)

    A space or comma must separate each value. For numeric variables, you can also specify a range of missing values. See the topic Specifying Ranges of Missing Values (MISSING VALUES command) for more information. The missing-value specification must correspond to the variable type (numeric or string). The same values can be declared missing for ...

  16. SPSS Tutorials: Defining Variables

    Written and illustrated tutorials for the statistical software SPSS. Variable definitions include a variable's name, type, label, formatting, role, and other attributes. This tutorial shows how to define variable properties in SPSS, especially custom missing values and value labels for categorical variables.

  17. spss

    Missing is a function and figures out if the value for the case is system-missing (a dot) or a user-defined missing value. If yes, the part after the then is executed. In this case, I told SPSS to assign it a "system missing value", visible as a dot. If you want to use a custom missing value, like "6", you would have to include it in the IF ...

  18. Missing Values in SPSS

    In SPSS, "missing values" may refer to 2 things: System missing values are values that are completely absent from the data. They are shown as periods in data view. User missing values are values that are invisible while analyzing or editing data. The SPSS user specifies which values -if any- must be excluded. This tutorial walks you through ...

  19. Mastering SPSS: Tips and Techniques for Excelling in Assignments

    Handling Missing Data. A ubiquitous challenge in SPSS assignments is the labyrinth of missing data, a hurdle often stemming from survey non-response or data entry mishaps. Effectively addressing this issue is imperative for robust analyses. SPSS, cognizant of the prevalence of missing data, provides a versatile toolkit encompassing various ...

  20. Essential Topics to Solve SPSS Assignments Effectively

    Data Import and Cleaning: To solve a data import and cleaning assignment, start by importing the dataset into SPSS from various file formats (Excel, CSV). Address missing values, duplicates, and outliers. Check data consistency and validity. Employ functions for data cleaning, like recoding variables or imputing missing values.