In a recent post Mike Mulcare casually mentioned "the removal of outliers according to a rigorous criteria".
When I was learning the trade, "three sigma" was the rule of thumb, and it was explained to me that 3S represented the "extreme tail" of the bell curve.
Later, when I became responsible for the quality of the data, I researched the subject.
Dr W. R. Logan, in a Survey Review article (no 97 July 1955), "The Rejection of Outlying Observations", wrote up the procedure which I essentially follow. The meat of the article is that the rejection criterion depends on the sample size.
It is a restatement of Chauvenet's approach to determining the probability band, where P = 1 - 1/(2n).
So, if any observation in a sample of size n deviates from the mean by more than the limit that encloses a probability of 1 - 1/(2n) about the mean, it should be rejected.
For example, if the sample size n is 8 angles, then 1 - 1/(2*8) = 0.9375, or about 0.94. A band holding 0.94 of the area under the normal curve, centered on the mean, extends 1.86S either side of it. So any angle whose residual error or deviation from the mean was greater than 1.86S would be rejected as an outlier.
That simple formula produces the following guide:
4 observations = 1.53S
8 observations = 1.86S
10 observations = 1.96S
20 observations = 2.24S
So, on days when the IP was cranking angles, I would compute the unweighted mean after 4 sets, calculate the residuals, the standard deviation and multiply by the rejection limit to check our progress until we hit 8 sets, then I'd do it all again with the new multiplier.
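For anyone who wants to see that field check spelled out, here is a rough sketch in Python. The multipliers are copied straight from the guide above. The readings are made up for illustration, the function name is hypothetical, and using the sample standard deviation (n - 1 divisor) is an assumption, since the post does not say which divisor was used.

```python
from statistics import mean, stdev

# Multipliers copied from the guide above: number of observations -> limit in units of S.
REJECTION_MULTIPLIER = {4: 1.53, 8: 1.86, 10: 1.96, 20: 2.24}

def flag_outliers(observations):
    """Return (mean, S, limit, flagged) for one batch of repeated observations."""
    n = len(observations)
    k = REJECTION_MULTIPLIER[n]   # raises KeyError for sample sizes not in the guide
    m = mean(observations)        # unweighted mean
    s = stdev(observations)       # sample standard deviation (n - 1 divisor): an assumption
    limit = k * s                 # rejection limit in the same units as the observations
    flagged = [x for x in observations if abs(x - m) > limit]
    return m, s, limit, flagged

# Hypothetical seconds readings from 8 sets (not real field data).
seconds = [12.0, 13.0, 12.0, 14.0, 13.0, 12.0, 13.0, 22.0]
m, s, limit, flagged = flag_outliers(seconds)
print(f"mean={m:.3f}  S={s:.3f}  limit={limit:.3f}  flagged={flagged}")
```

With these made-up readings only the 22.0 falls outside the limit. What to do with a flagged reading afterwards is a separate question that the sketch does not try to answer.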
Scott, how do you get from 0.94 to 1.86S? (Note: I am an amateur, so maybe I should know.) Basically, what is the relationship of 0.94 to 1.86?
It's the relationship between error, S or sigma, and percentage of area under a normal (Gaussian) probability distribution curve.
I've never seen a formula for it I could solve.
Just used a "z table".
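If you would rather not dig out a z table, the lookup can be sketched with the inverse normal CDF in Python's standard library. The one step that is easy to miss is that 0.94 is the area of the central band, so half of the leftover area sits in each tail and the value to look up is 1 - 1/(4n), not 1 - 1/(2n). The function name below is made up for illustration.

```python
from statistics import NormalDist

def chauvenet_multiplier(n):
    """Rejection limit, in units of S, for a sample of n observations."""
    p_band = 1 - 1 / (2 * n)   # central (two-sided) probability band, e.g. 0.9375 for n = 8
    p_cum = (1 + p_band) / 2   # one-sided cumulative probability, equal to 1 - 1/(4n)
    return NormalDist().inv_cdf(p_cum)   # z value a printed table would give

for n in (4, 8, 10, 20):
    print(n, "observations =", f"{chauvenet_multiplier(n):.2f}S")
```

This reproduces the 1.53S, 1.86S, 1.96S and 2.24S figures quoted earlier in the thread.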
Thanks. I remember that now. Had statistics once. I always just used 3 X to get a general feel for the outlier.
3 sigma works for sample sizes above 30. Surveyors generally don't take that many observations, which is why the Chauvenet probability band is preferable to the arbitrary 3 sigma rule of thumb.
However, if you ever want to start a riot, go to a statistician conference and ask about detecting and rejecting outliers!
Scott Zelenak, post: 342741, member: 327 wrote: So any angle whose residual error or deviation from the mean was greater than 1.86S would be rejected as an outlier.
In terms of practice in the field, how exactly do you "reject as an outlier" an observation? Make a note in the field book, including the information about which set, direct or reverse, etc.? Wouldn't you have to throw the set out if one observation is bogus? Do you make a note in the DC, then review the file upon import? Dump it on the spot in the DC and start over?
I'm pretty sure I understand the theory (if not the math too), but am fuzzy on how this gets put into practice in the field.
Scott: page 13, Kissam's Surveying for Civil Engineers
I do not take rejection criteria casually. As I age I seem to get even less flexible.
My comments were with respect to "improving" GPS processing results by user intervention in filtering raw data.
The relevance of Chauvenet's criterion to user-processed GPS baselines and even adjustment analysis eludes me. The "black box" GPS processing packages have rejection criteria built in. How would a user/processor of GPS data use this technique?