The US Food and Drug Administration's (FDA) Process Analytical Technology (PAT) initiative [1] promotes continuous process improvement throughout the pharmaceutical industry.
This is to enable manufacturers to adopt more efficient, flexible processes by:
- Reducing production cycle times
- Reducing energy and material waste
- Using automation to reduce errors
- Reducing variability
All these result in reduced costs.
PAT requires characterization, measurement, review and improvement steps. This needs an understanding of what is being done and how to measure it.
Monitoring systems (SCADA) collect data for analysis and reporting. Monitoring systems also provide current status reporting and alarm notification and management.
The amount of data collected by a monitoring system is substantial. The typical sample rate is one measurement per minute or 1440 samples per day.
Monitoring systems create large historical data sets with ease. These are a helpful resource for any PAT programme. The difficult part is knowing how to use this information.
Statistical methods can be used to find points of interest. The following describes some techniques that do not need detailed understanding of mathematics or statistics.
The Basics
These are terms that cannot be avoided. There are many sources that describe statistics in detail [2][3][4].
The Mean. The mean is the sum of the data values divided by the number of data values. For n values it is often written like this:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
The Standard Deviation (σ). The standard deviation is the spread of the data values. Every mean has a standard deviation. They are a pair of values. It is incorrect to state a mean without the standard deviation.
Without the standard deviation one cannot judge how representative the mean is of the “real” value and it cannot be judged if two measurements are different.
The standard deviation measures the significance when comparing results and sets the confidence limits of a result. The calculation of confidence limits is a non-trivial exercise [12].
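The standard deviation of a sample of n values (written s in the formulas that follow) is calculated as follows:
$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}$$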
The Standard Error. The standard error is how well the mean represents the “real” value. In a sense it is the uncertainty in the measurement.
Sample Size. As a guide, the sample size should be greater than 30 to be sure the mean and standard deviation are representative.
The standard error is the standard deviation divided by the square root of the sample size, $se = s/\sqrt{n}$, so it falls only with the square root of the sample size. This is important: to halve the uncertainty one must take four times the number of samples.
A practical point is that although taking more samples improves the accuracy of a result, there may be no improvement in the result's significance because the uncertainties may only reduce from “not much” to “not a lot”. That is, the extra work may yield no benefit.
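The quantities above can be computed in a few lines. The following is a minimal sketch rather than a definitive implementation: the function name `summarise` and the temperature readings are hypothetical, and the standard deviation uses the sample (n - 1) form.

```python
import math
import statistics

def summarise(samples):
    """Return (mean, standard deviation, standard error) for a list of readings."""
    n = len(samples)
    mean = statistics.mean(samples)
    sd = statistics.stdev(samples)   # sample standard deviation (n - 1 divisor)
    se = sd / math.sqrt(n)           # standard error of the mean
    return mean, sd, se

# Hypothetical temperature readings in degrees C, for illustration only.
readings = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 20.1, 19.7]
mean, sd, se = summarise(readings)
print(f"mean={mean:.3f} sd={sd:.3f} se={se:.3f}")

# For the same spread, four times as many samples halves the standard error.
se_four_times = sd / math.sqrt(4 * len(readings))
print(f"ratio of standard errors: {se_four_times / se:.2f}")
```

A day of one-minute samples (1440 values) comfortably exceeds the guideline of 30, so the mean of a daily data set is usually well determined.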
Distribution. A distribution is a plot of the probability of a value occurring against the value.
Natural measurements usually fall into one of two distributions, normal and Poisson. The normal distribution has the well-known bell-shaped appearance.
The normal distribution is expected from analogue data (temperature, pressure etc.) while the Poisson distribution is what one expects from particle counting data. At high counts the Poisson distribution closely approximates the normal distribution.
An important difference between the two is that the standard deviation of a Poisson distribution is the square root of the mean (which is the same as the count), whereas for a normal distribution the standard deviation is independent of the mean.
Both distributions have the important property that roughly 68% of the values lie within one standard deviation of the mean, about 95% lie within two standard deviations and about 99.7% lie within three standard deviations.
A consequence of this is that for Poisson distributions a count of zero is not significantly different from a count of four (at 2σ) or nine (at 3σ).
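Both points can be checked numerically. This is a sketch under stated assumptions: `fraction_within` and `differs_from_zero` are hypothetical helper names, the coverage figures are for an ideal normal distribution, and the observed count is used as the estimate of the Poisson mean.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

def differs_from_zero(count, k=2):
    """True if zero lies more than k standard deviations below a particle count."""
    sd = math.sqrt(count)   # Poisson: standard deviation = square root of the mean
    return count - k * sd > 0

for k in (1, 2, 3):
    print(f"within {k} standard deviations: {100 * fraction_within(k):.1f}%")

print(differs_from_zero(4, k=2))    # 4 - 2*2 = 0, not distinguishable from zero
print(differs_from_zero(9, k=3))    # 9 - 3*3 = 0, not distinguishable from zero
print(differs_from_zero(16, k=3))   # 16 - 3*4 = 4, distinguishable from zero
```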
If two sets of samples are taken, the means and standard deviations are unlikely to be the same. They may be similar, but not the same. The question is whether they are significantly different or effectively the same.
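To answer this, first calculate the standard error in the difference between the means:
$$se = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
where se is the standard error in the means, s is the standard deviation and n is the number of samples in each set. Then divide the difference between the means by this standard error:
$$\frac{\bar{x}_1 - \bar{x}_2}{se}$$
If the absolute value of this ratio is greater than 2 then the means probably represent significantly different values.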
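The comparison can be sketched as follows. It assumes the standard error of the difference is the square root of s1²/n1 + s2²/n2; the function name `means_differ` and the two data sets are hypothetical.

```python
import math
import statistics

def means_differ(a, b, threshold=2.0):
    """Return (ratio, flag): the difference in means divided by its standard
    error, and whether its absolute value exceeds the threshold."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    ratio = (statistics.mean(a) - statistics.mean(b)) / se
    return ratio, abs(ratio) > threshold

# Hypothetical readings from two production runs, for illustration only.
run_a = [10.0, 10.2, 9.8, 10.1, 9.9]
run_b = [10.5, 10.7, 10.4, 10.6, 10.8]
ratio, significant = means_differ(run_a, run_b)
print(f"ratio={ratio:.2f} significantly different={significant}")
```

The sketch does not guard against two perfectly constant data sets, for which the standard error is zero.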
Data Analysis
There are various ways to try to extract information from data. Some methods use computer programs while others can be performed by hand.
When data is analysed, information is sought on possible improvements and problem areas.
The following are some possible ways of extracting information.
Correlation
Correlation is a way to measure dependencies between different parameters.
One well-known correlation procedure is least squares curve fitting [5][6]. A difficulty with this type of analysis is that the model must be known in advance, for example that it is a polynomial of a given degree.
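As an illustration of the simplest case, the sketch below fits a straight line (a polynomial of degree one) by least squares. The function name `fit_line` and the data are hypothetical, and the straight-line model is an assumption that has to be justified for real data.

```python
def fit_line(x, y):
    """Least squares fit of y = intercept + slope * x. Returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data lying exactly on y = 1 + 2x.
intercept, slope = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(intercept, slope)
```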
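To test if two parameters are related, the coefficient of correlation r can be calculated as follows:
$$r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}}$$
where x and y are the parameters and x̄ and ȳ are their means.
If the value of r is +1 or -1 then x and y depend completely (linearly) on each other. If the value is zero then there is no linear dependency. The absolute value of r is a measure of how strongly x and y are related.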
Computer programs are usually needed for this type of analysis. These are readily available [5] as either source code or applications.
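For two parameters the calculation is short enough to write directly. This is a minimal sketch of the standard (Pearson) coefficient of correlation; the function name `correlation` and the example data are hypothetical.

```python
import math

def correlation(x, y):
    """Coefficient of correlation r between two equal-length lists of values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical parameters: y rises in step with x, so r is close to +1.
print(correlation([1, 2, 3, 4], [2, 4, 6, 8]))
# y falls in step with x, so r is close to -1.
print(correlation([1, 2, 3, 4], [8, 6, 4, 2]))
```

A value of r near zero only rules out a straight-line relationship; a strongly curved dependency can still give a small r.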
Statistical Process Control
Statistical Process Control (SPC) [11] is a range of methods intended to spot variations and trends to improve or control some activity.

The simpler methods are based on graph plotting and counting. SPC as described here is designed to be usable by people without statistical training.
The starting point for SPC is that the parameter is stable and under control. That is, the measured values are normally distributed about a mean value.
An SPC parameter has a specified mean value, an upper control limit and a lower control limit.
By plotting data on a graph and counting the following items, areas of interest can be spotted:
- The number of consecutive points outside the control limits.
- The number of consecutive points inside the control limits.
- The number of consecutive points only increasing or only decreasing.
- The number of consecutive points on only one side of the mean.
If we combine these with some elementary probability we can find points of interest. If the control limits are placed at one standard deviation then we can calculate when events are probably exceptional (> 99% confidence).
About one value in three is expected to lie outside the control limits. The chance of four consecutive values all lying outside is therefore about (1/3)^4, or roughly 1 in 81, so such a run suggests an increase in the standard deviation of a measurement.
Often when systems start to fail the mean value can remain constant but the standard deviation increases, so this type of analysis can spot the onset of a failure mode. If the control limits are too narrow there will be more false alarms, which hide the useful information.
Conversely, if more than 11 consecutive values lie within the control limits (mean crowding), this suggests that the results are too stable or that the control limits are too wide, so out-of-control events may not be seen and information is lost.
The odds are one in two for a value to increase or decrease. Finding seven consecutive values that only increase, or seven that only decrease, therefore suggests a systematic drift or trend.
The chances are also one in two that a value will be above the mean or below it. Seven consecutive values that are all above or all below the mean indicate a possible long-term drift in the mean value. Again this marks changes before they become significant.
Simple SPC finds features of interest and guides any investigation. All that is needed is the ability to count. This type of SPC analysis can be performed as measurements are made or on recorded data, in principle by hand, but more likely using software included with the monitoring system or added to it.
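The counting rules above translate directly into software. The following is a sketch, not a standard SPC rule set: the function and key names are hypothetical, the control limits are assumed to sit at one standard deviation, and "seven consecutive values increasing" is interpreted here as seven consecutive increases.

```python
def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best

def spc_flags(values, mean, sd):
    """Apply the four counting rules with control limits at mean +/- one sd."""
    upper, lower = mean + sd, mean - sd
    outside = [v > upper or v < lower for v in values]
    inside = [not o for o in outside]
    above = [v > mean for v in values]
    below = [v < mean for v in values]
    rising = [b > a for a, b in zip(values, values[1:])]
    falling = [b < a for a, b in zip(values, values[1:])]
    return {
        # Four or more consecutive points outside the limits.
        "spread_increase": longest_run(outside) >= 4,
        # More than 11 consecutive points inside the limits.
        "mean_crowding": longest_run(inside) > 11,
        # Seven or more consecutive increases, or decreases.
        "trend": max(longest_run(rising), longest_run(falling)) >= 7,
        # Seven or more consecutive points on one side of the mean.
        "mean_shift": max(longest_run(above), longest_run(below)) >= 7,
    }

# Hypothetical readings drifting steadily upwards from a target mean of 10.
drifting = [10.0 + 0.1 * i for i in range(1, 9)]
print(spc_flags(drifting, mean=10.0, sd=1.0))
```

Each flag marks a place to start looking, not a diagnosis; the thresholds are the ones quoted in the text and can be changed to suit the process.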
Back Propagation
Back Propagation (BP) [7] is associated with neural networks and artificial intelligence. It is best used when the data model is unknown, to derive a relationship between sets of inputs and sets of outputs.
Back propagation is always implemented in software. BP programs [8] usually include some form of user interface to configure the neural network and use it. Most modern monitoring systems will support this type of modelling.
To use back propagation a training set of data is needed. This is the information used by the neural network to learn the relationships between the inputs and the outputs.
Monitoring systems can supply large training sets.
Once a network is trained it can be used to predict outcomes for given inputs. That is, it can answer questions such as “If I change A, B and C, what is the result on D, E and F?”.
For example a neural network could be trained to recognise when an alarm condition is going to arise some time in the future and then prompt an investigation.
The benefits of this technique are that it is easy to implement in software, the training sets are easily obtained and no prior knowledge of the relationship between the variables and the outcomes is needed.
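To give a feel for how little code is involved, here is a minimal sketch of a network with one hidden layer trained by back propagation. It is an illustration under stated assumptions, not the method of any particular product: the class `TinyNetwork`, the learning rate, the number of hidden units and the four-row training set are all hypothetical, and a real training set would come from recorded monitoring data.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyNetwork:
    """One hidden layer of sigmoid units and a single sigmoid output."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # One row of weights per hidden unit; the last entry of each row is a bias.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(self, inputs):
        x = list(inputs) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x)))
                  for row in self.w_hidden]
        output = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden + [1.0])))
        return hidden, output

    def train_one(self, inputs, target, rate):
        x = list(inputs) + [1.0]
        hidden, output = self.forward(inputs)
        # Error signal at the output, scaled by the slope of the sigmoid.
        delta_out = (output - target) * output * (1.0 - output)
        # Propagate the error signal back to each hidden unit.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j, h in enumerate(hidden + [1.0]):
            self.w_out[j] -= rate * delta_out * h
        for j, row in enumerate(self.w_hidden):
            for i, v in enumerate(x):
                row[i] -= rate * delta_hidden[j] * v

    def error(self, data):
        """Mean squared error over a list of (inputs, target) pairs."""
        return sum((self.forward(x)[1] - t) ** 2 for x, t in data) / len(data)

# Hypothetical training set: the output is 1 when exactly one input is high.
training_set = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
                ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
net = TinyNetwork(n_in=2, n_hidden=3)
before = net.error(training_set)
for _ in range(5000):
    for inputs, target in training_set:
        net.train_one(inputs, target, rate=0.5)
after = net.error(training_set)
print(f"mean squared error before={before:.3f} after={after:.3f}")
```

The network is never told the rule that links inputs to outputs; it only sees examples, which is why the quality and coverage of the training set matter more than the code.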
Conclusions
All data models assume some ideal behaviour. It is the deviations from perfection that carry most of the useful information.
Scientific discovery is often made with the phrase “Hmm... that looks odd?”. Explaining the unexpected gives the greatest improvements in knowledge.
Any study must appreciate its underlying assumptions and limitations, and where they can be wrong. All analysis methods will give an answer; it may be significant or it may be meaningless.
The “Hitch Hiker's Guide to the Galaxy” [9] has a useful illustration. Deep Thought [9] noted that the answer to the ultimate question of “life, the universe and everything” was 42, but that it meant nothing unless the question was known. This transpired to be “six times nine” [10], which at first glance is wrong unless base 13 arithmetic is used, in which case it is correct. The number 13 is associated with bad luck, which fits with the themes in the novels. So everything appears consistent.
Although Deep Thought never said that the number system used was base 10, to then assume that the author of the novel knew that the arithmetic worked by changing from base 10 to base 13 is probably wrong too. It was just a joke.
This highlights two problems with any investigation. Firstly, unless the context and assumptions are well defined, the answers are not useful. Secondly, one must guard against drawing false conclusions based on personal expectations and beliefs.
Monitoring systems provide almost unlimited data to characterise a system. The techniques described in this article can help to identify areas of interest and to use other methods with greater confidence.
With understanding comes knowledge and then improvement. With improvement comes cost savings. Which is what it is all about.
References:
[1] Guidance for Industry, PAT – A Framework for Innovative Pharmaceutical Development, Manufacturing and Quality Assurance, http://www.fda.gov/cder
[2] An Introduction to Statistics, Keone Hon (Internet)
[3] Introduction to Probability, Charles M. Grinstead, J. Laurie Snell (Internet)
[4] Use and Abuse of Statistics, Pelican S., Reichmann, W.J.
[5] Numerical Recipes in C, Cambridge Press, W. H. Press et al.
[6] Elementary Numerical Analysis, McGraw-Hill, S. Conte, C. Boor
[7] An Introduction to Back-Propagation Neural Networks, P. McCollum, http://www.seattlerobotics.org/encoder/nov98/neural.html
[8] Back Propagation Neural Network Source Code, Gideon Pertzov (gpdev.net)
[9] The Hitch Hiker's Guide to the Galaxy, D. Adams
[10] The Restaurant at the End of the Universe, D. Adams
[11] iSix Sigma Web Site, www.isixsigma.com
[12] Peizer & Pratt, JASA, vol. 63, p. 1416