Statistical process control

There are three main packages designed specifically for creating statistical process control charts in R

  1. qcc – Access an introductory tutorial here
  2. qicharts – Access a tutorial on control charts here, Access a tutorial on run charts here
  3. spc – This is a newer package for which there are not any tutorials available outside of the package documentation which can be accessed here

In this tutorial we will have a brief look at qicharts as I think the presentation of the charts is nicer than those produced by the qcc package and the syntax is easier to understand. The tutorial for qicharts available via the hyperlinks above make use of healthcare examples. This overview is based on those two tutorials.

First we start by importing the qicharts package and setting a control for the random seed


library(qicharts)
set.seed(10)

Run charts

The input for the qic() function that is used to generate run charts is a vector of values. In the example below we use the rpois() function to generate a random vector of integers with a mean of 10. The qic() function then plots this vector of data, we include the argument runvals=TRUE to see the number of useful observations, the predicted and actual longest run and crossings. The predicted run and crossing values are in the brackets and in the example our actual run value is smaller than the maximum expected run and the number of crossings is larger than the minimum number of expected crossings indicating no unexpected variation.


Input:
y <- rpois(24, 10)
a <- qic(y, runvals=TRUE)

Output:
spcFig1

Let's introduce a shift in the mean of our data which might represent a change in process. In the plot below we can see that the centre line has changed and is now coloured red and is dashed, this indicates there being high variation in the data. We further interrogate this by looking at the run values. The longest run is now larger than the predicted maximum and the number of crossings is lower than the expected minimum. This indicates the presence of non-random variation.


Input:
y[13:24] <- rpois(12, 5)
b <- qic(y, runvals=TRUE)

Output:
spcFig2

If we were using this graph to demonstrate the impact of a change in process for example at point 12 we can freeze the centre time so that it only calculated based on the first 12 values using the freeze=12 argument. The graph is also split into before and after data by default.


Input:
qic(y,freeze=12)

Output:
spcFig3

We can then break the graph at point 12 to more clearly show the difference in centre line values using the breaks= argument.


Input:
qic(y,breaks=12)

Output:
spcFig4

We can plot the rate of defects occurring by introducing a denominator argument into the qic() function. y is now our vector of input data representing for example the number of patients for whom a performance target was missed, this will be the numerator. n becomes the total number of patients for example, this is our denominator. These two arguments in the qic() function now result in a plot of the rate of defects.


Input:
y <- rbinom(24, 30, 0.2)
n <- sample(22:30,24,replace=TRUE)
qic(y,n)

Output:
spcFig5

We can format the run chart by adding a title, x and y axis labels and notes. If the data represents defects by week we can create a vector of dates and introduce this as the x-axis ticks using the argument x=. The title, x-axis and y-axis labels can be changed using the common main=, xlab= and ylab= arguments. If we want to add a note regarding one of the points of the graph we initialise a notes variable using notes <- NA. A note can then be added to a specific point using subscript assignment notes['point number'] and assigning a string vector.


Input:
week <- seq(as.Date("2016-01-01"),length.out=24,by="week")
notes <- NA
notes[17] <- "Note for this point"
qic(y, n, x=week, main="Run chart", ylab="Proportion", xlab="Week", notes=notes)

Output:
spcFig6

Control charts

The qic() function can be used to produce several different process control charts. The chart that you need to use will depend on the data that you are using and the type of chart that you want. The chart type is given in the chart= argument of the qic() function.

Control charts: use of control limits to identify variation

  • c - count chart type which counts the number of defects, shows upper and lower control limits
  • u - this chart accounts for the variation over time and/or between samples plotting the rate of defects
  • p - the proportion of defects is plotted in a p chart
  • g - used for rare events to plot the number of units (e.g. patients) between defects
  • i - individual chart for continuous data
  • mr - chart for continuous data using a moving range i.e absolute difference one data point and the next

Below we create some data to represent the number of emergency referrals made for the admission of adult patients to acute psychiatric service for inpatient care. On average 30 referrals per week are received, referral to admission should take one day and the probability of a patient not receiving an admission within 24 hours (a defect) is 20%.


Input:
u.referrals <- 30
u.r2a <- 1
u.days <- u.referrals*u.r2a
p.Miss <- 0.2

referrals <- rpois(24,lambda=u.referrals)
patientdays <- round(rnorm(24,mean=u.days,sd=4))
n.Miss <- rpois(24, lambda=u.referrals*p.Miss)
n.pat.Miss <- rbinom(24,size=u.referrals,prob=p.Miss)
week <- seq(as.Date("2016-01-01"),length.out=24,by="week")

d <- data.frame(week,referrals,patientdays,n.Miss,n.pat.Miss)
d

Output:
week	 referrals patientdays n.Miss n.pat.Miss
2016-01-01	25	29	6	4
2016-01-08	27	30	6	8
2016-01-15	36	35	9	5
2016-01-22	34	32	4	6
2016-01-29	27	24	8	3
2016-02-05	35	24	4	8
2016-02-12	30	34	9	9
2016-02-19	26	31	6	5
2016-02-26	24	35	8	7
2016-03-04	31	27	7	5
2016-03-11	37	32	5	2
2016-03-18	25	26	6	5
2016-03-25	27	33	6	4
2016-04-01	26	27	2	6
2016-04-08	33	32	6	5
2016-04-15	28	33	7	9
2016-04-22	33	23	3	4
2016-04-29	33	34	9	12
2016-05-06	32	35	7	7
2016-05-13	30	25	7	4
2016-05-20	33	35	4	8
2016-05-27	42	28	9	12
2016-06-03	23	29	6	7
2016-06-10	27	34	4	7

A c chart is similar to a run chart but it includes upper and lower control limits to identify non-random variation > 3 standard deviations from the mean (different rules for the control limits can be introduced).


Input:
c <- qic(n.Miss, x=week, data=d, chart="c", ylab="count", xlab="week", runvals=TRUE)
c

Output:
$y
 [1] 6 6 9 4 8 4 9 6 8 7 5 6 6 2 6 7 3 9 7 7 4 9 6 4

$cl
 [1] 6.166667 6.166667 6.166667 6.166667 6.166667 6.166667 6.166667 6.166667
 [9] 6.166667 6.166667 6.166667 6.166667 6.166667 6.166667 6.166667 6.166667
[17] 6.166667 6.166667 6.166667 6.166667 6.166667 6.166667 6.166667 6.166667

$lcl
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

$ucl
 [1] 13.6165 13.6165 13.6165 13.6165 13.6165 13.6165 13.6165 13.6165 13.6165
[10] 13.6165 13.6165 13.6165 13.6165 13.6165 13.6165 13.6165 13.6165 13.6165
[19] 13.6165 13.6165 13.6165 13.6165 13.6165 13.6165

$agg.fun
[1] "mean"

$n
 [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

$labels
 [1] "2016-01-01" "2016-01-08" "2016-01-15" "2016-01-22" "2016-01-29"
 [6] "2016-02-05" "2016-02-12" "2016-02-19" "2016-02-26" "2016-03-04"
[11] "2016-03-11" "2016-03-18" "2016-03-25" "2016-04-01" "2016-04-08"
[16] "2016-04-15" "2016-04-22" "2016-04-29" "2016-05-06" "2016-05-13"
[21] "2016-05-20" "2016-05-27" "2016-06-03" "2016-06-10"

$notes
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

$parts
$parts[[1]]
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24


$n.obs
[1] 24

$n.useful
[1] 24

$longest.run
[1] 5

$longest.run.max
[1] 8

$n.crossings
[1] 14

$n.crossings.min
[1] 8

$runs.test
[1] FALSE

$signals
integer(0)

$dots.only
[1] FALSE

$runvals
[1] TRUE

$linevals
[1] TRUE

$main
[1] "C Chart of n.Miss"

$xlab
[1] "week"

$ylab
[1] "count"

$pre.text
[1] "Before data"

$post.text
[1] "After data"

$llabs
[1] "LCL" "CL"  "UCL" "TRG"

$nint
[1] 5

$cex
[1] 0.8

attr(,"class")
[1] "qic"

spcFig7

A u chart is used to plot the rate with which defect occur in the data. In the example below, we use the patient days calculation a the denominator for the rate calculation. Note the varying upper control limit accounting for the variance in the denominator across the data.


Input:
d <- qic(n.Miss, n=patientdays, x=week, data=d, chart="u", multiply=1, ylab="count", xlab="week")
d

Output:
$y
 [1] 0.20689655 0.20000000 0.25714286 0.12500000 0.33333333 0.16666667
 [7] 0.26470588 0.19354839 0.22857143 0.25925926 0.15625000 0.23076923
[13] 0.18181818 0.07407407 0.18750000 0.21212121 0.13043478 0.26470588
[19] 0.20000000 0.28000000 0.11428571 0.32142857 0.20689655 0.11764706

$cl
 [1] 0.2035763 0.2035763 0.2035763 0.2035763 0.2035763 0.2035763 0.2035763
 [8] 0.2035763 0.2035763 0.2035763 0.2035763 0.2035763 0.2035763 0.2035763
[15] 0.2035763 0.2035763 0.2035763 0.2035763 0.2035763 0.2035763 0.2035763
[22] 0.2035763 0.2035763 0.2035763

$lcl
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

$ucl
 [1] 0.4549304 0.4507057 0.4323736 0.4428583 0.4798753 0.4798753 0.4357139
 [8] 0.4466870 0.4323736 0.4640735 0.4428583 0.4690358 0.4392049 0.4640735
[15] 0.4428583 0.4392049 0.4858179 0.4357139 0.4323736 0.4742930 0.4323736
[22] 0.4593795 0.4549304 0.4357139

$agg.fun
[1] "mean"

$n
 [1] 29 30 35 32 24 24 34 31 35 27 32 26 33 27 32 33 23 34 35 25 35 28 29 34

$labels
 [1] "2016-01-01" "2016-01-08" "2016-01-15" "2016-01-22" "2016-01-29"
 [6] "2016-02-05" "2016-02-12" "2016-02-19" "2016-02-26" "2016-03-04"
[11] "2016-03-11" "2016-03-18" "2016-03-25" "2016-04-01" "2016-04-08"
[16] "2016-04-15" "2016-04-22" "2016-04-29" "2016-05-06" "2016-05-13"
[21] "2016-05-20" "2016-05-27" "2016-06-03" "2016-06-10"

$notes
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

$parts
$parts[[1]]
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24


$n.obs
[1] 24

$n.useful
[1] 24

$longest.run
[1] 3

$longest.run.max
[1] 8

$n.crossings
[1] 19

$n.crossings.min
[1] 8

$runs.test
[1] FALSE

$signals
integer(0)

$dots.only
[1] FALSE

$runvals
[1] FALSE

$linevals
[1] TRUE

$main
[1] "U Chart of n.Miss / patientdays"

$xlab
[1] "week"

$ylab
[1] "count"

$pre.text
[1] "Before data"

$post.text
[1] "After data"

$llabs
[1] "LCL" "CL"  "UCL" "TRG"

$nint
[1] 5

$cex
[1] 0.8

attr(,"class")
[1] "qic"

spcFig8

The p chart plots the proportion of defects occurring over the data.


Input:
e <- qic(n.Miss, n=referrals, x=week, data=d, chart="p", multiply=100, ylab="count", xlab="week")
e

Output:
$y
 [1] 24.000000 22.222222 25.000000 11.764706 29.629630 11.428571 30.000000
 [8] 23.076923 33.333333 22.580645 13.513514 24.000000 22.222222  7.692308
[15] 18.181818 25.000000  9.090909 27.272727 21.875000 23.333333 12.121212
[22] 21.428571 26.086957 14.814815

$cl
 [1] 20.44199 20.44199 20.44199 20.44199 20.44199 20.44199 20.44199 20.44199
 [9] 20.44199 20.44199 20.44199 20.44199 20.44199 20.44199 20.44199 20.44199
[17] 20.44199 20.44199 20.44199 20.44199 20.44199 20.44199 20.44199 20.44199

$lcl
 [1]        NA        NA 0.2781353        NA        NA        NA        NA
 [8]        NA        NA        NA 0.5524862        NA        NA        NA
[15]        NA        NA        NA        NA        NA        NA        NA
[22] 1.7738879        NA        NA

$ucl
 [1] 44.63861 43.72520 40.60584 41.19042 43.72520 40.89187 42.53038 44.16873
 [9] 45.13757 42.17120 40.33149 44.63861 43.72520 44.16873 41.50245 43.30565
[17] 41.50245 41.50245 41.82899 42.53038 41.50245 39.11009 45.66871 43.72520

$agg.fun
[1] "mean"

$n
 [1] 25 27 36 34 27 35 30 26 24 31 37 25 27 26 33 28 33 33 32 30 33 42 23 27

$labels
 [1] "2016-01-01" "2016-01-08" "2016-01-15" "2016-01-22" "2016-01-29"
 [6] "2016-02-05" "2016-02-12" "2016-02-19" "2016-02-26" "2016-03-04"
[11] "2016-03-11" "2016-03-18" "2016-03-25" "2016-04-01" "2016-04-08"
[16] "2016-04-15" "2016-04-22" "2016-04-29" "2016-05-06" "2016-05-13"
[21] "2016-05-20" "2016-05-27" "2016-06-03" "2016-06-10"

$notes
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

$parts
$parts[[1]]
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24


$n.obs
[1] 24

$n.useful
[1] 24

$longest.run
[1] 4

$longest.run.max
[1] 8

$n.crossings
[1] 13

$n.crossings.min
[1] 8

$runs.test
[1] FALSE

$signals
integer(0)

$dots.only
[1] FALSE

$runvals
[1] FALSE

$linevals
[1] TRUE

$main
[1] "P Chart of n.Miss / referrals x 100"

$xlab
[1] "week"

$ylab
[1] "count"

$pre.text
[1] "Before data"

$post.text
[1] "After data"

$llabs
[1] "LCL" "CL"  "UCL" "TRG"

$nint
[1] 5

$cex
[1] 0.8

attr(,"class")
[1] "qic"

spcFig9

The other control chart variants are used in special cases. Have a look at the tutorials flagged above for guidance on their use.