How do I scatterplot observations which have the same set of co-ordinates?

Sometimes more than one observation shares the same co-ordinates. A scatterplot, however, shows only one point for each unique x,y combination, regardless of how many observations share it.
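To check whether this applies to your data, one option is to count how many cases share each x,y combination. The following is a minimal sketch, assuming the variables are called x and y and that your version of SPSS supports MODE=ADDVARIABLES on AGGREGATE; the name ncopies is hypothetical.

```spss
* Count how many cases share each x,y combination (ncopies is a hypothetical name).
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=x y
  /ncopies=N.
* Tabulate the counts: any value above 1 indicates overlapping points.
FREQUENCIES VARIABLES=ncopies.
```

If every case has ncopies equal to 1, no points overlap exactly and jittering is not needed.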

One way of disentangling this is to add a proportionately small amount to the observed values of one of the variables, say y. The syntax below uses the RV.UNIFORM function in SPSS to add a small random amount (between 1% and 1.5% of y, so y is assumed to be positive) to the y value whenever an x,y combination has already occurred in the sorted data. The new values (ynew), which are distinct unless two random draws happen to coincide, can then be plotted against x.

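* Sort on both variables in one step so that identical x,y pairs are adjacent.
SORT CASES BY y x.
* Flag every case that repeats the x,y pair of the case before it.
COMPUTE copy=0.
DO IF ($CASENUM NE 1).
IF (x EQ LAG(x) AND y EQ LAG(y)) copy = 1.
END IF.
EXECUTE.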

compute ynew = y.
if (copy eq 1) YNEW = Y + RV.UNIFORM(Y*0.01,Y*0.015). 
EXE.
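To complete this example, the jittered values can be plotted with the same GRAPH command used elsewhere on this page. This is a sketch that assumes x and ynew exist as created above.

```spss
* Plot the jittered y values (vertical axis) against the original x values.
GRAPH
  /SCATTERPLOT(BIVAR)=x WITH ynew
  /MISSING=LISTWISE.
```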

This process, which shifts coincident points so that each is individually visible on the scatterplot, is called jittering. The script can also be modified to jitter only in the horizontal direction, as in the following self-contained example:

DATA LIST list / x(f9.3) y (f2.0).
BEGIN DATA
1.00000 2.00000
1.00000 2.00000
1.00000 2.00000
2.00000 3.00000
4.00000 4.00000
5.00000 5.00000
3.00000 6.00000
4.00000 7.00000
END DATA.

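* Sort on both variables in one step so that identical x,y pairs are adjacent.
SORT CASES BY y x.
* Flag every case that repeats the x,y pair of the case before it.
COMPUTE copy=0.
DO IF ($CASENUM NE 1).
IF (x EQ LAG(x) AND y EQ LAG(y)) copy = 1.
END IF.
EXECUTE.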

compute xnew = x.
if (copy eq 1) xNEW = x + RV.UNIFORM(x*0.001,x). 
EXE.

GRAPH
  /SCATTERPLOT(BIVAR)=xnew WITH y
  /MISSING=LISTWISE .

Steve Simon describes jittering as a slight random shifting of data to avoid overprinting. He suggests the alternative calculation of XNEW given below, which shifts each x value by a random amount between -0.1 and +0.1. See this and other helpful comments by Steve on the Do's and Don'ts of graphical plotting in the MS Word file in the 2011 EDA Graduate talk zip file [:StatsCourse2011: located here.]

COMPUTE xnew = x + 0.2 * (UNIFORM(1) - 0.5).
EXECUTE.
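As a variation on the same idea, here is a minimal sketch that jitters both variables symmetrically using RV.UNIFORM. The half-width of 0.1 matches the 0.2 * (UNIFORM(1) - 0.5) expression above. Applying it to y as well, the names xjit and yjit, and the seed value are assumptions made for illustration.

```spss
* Optional: fix the random number seed so the jittered plot is reproducible.
SET SEED=12345.
* Shift every x and y by a random amount between -0.1 and +0.1.
COMPUTE xjit = x + RV.UNIFORM(-0.1, 0.1).
COMPUTE yjit = y + RV.UNIFORM(-0.1, 0.1).
EXECUTE.

GRAPH
  /SCATTERPLOT(BIVAR)=xjit WITH yjit
  /MISSING=LISTWISE.
```

Unlike the approach that flags repeated cases, this moves every point and not just the duplicates, so the jitter width should be kept small relative to the spacing between distinct data values.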

Using open rather than solid marker shapes also makes the jittered points easier to see.

In versions 13 to 16 of SPSS there is an option for 'jittering' points in the graph menu bar. This is only available in Interactive mode: select Graphs > Interactive > Scatterplot... and create the plot. Double click on the plot and then on any data point. You should then see a 'jitter' tab. You can use this tab to 'jitter' both x and y co-ordinates in the scatterplot so that all overlapping points are visible.

In later versions of SPSS (18 and 19) the 'jitter' and other graphical options are no longer available in the menu bar and can only be accessed using the SPSS syntax given below. This syntax assumes there are two variables, min (y-axis) and max (x-axis), which we wish to scatterplot, so you will need to substitute your own variable names. You can then edit the graph (e.g. to change the scale of the axes) in the usual way by double-clicking on it.

IGRAPH
        /Y=VAR(min) TYPE=SCALE
        /X1=VAR(max) TYPE=SCALE
        /COORDINATE=VERTICAL
        /SCATTER COINCIDENT=JITTER.

Revision 19 as of 2011-09-20 10:46:48. Editor: PeterWatson