Least Squares Ellipse Confidence Calculations?

Kent McMillan

(@kent-mcmillan)

Posts: 11419

> The only verbiage I see in the Star*Net manual is "[t]he ellipse itself is computed by STAR*NET from the standard errors in Northing and Easting and the correlation between those standard errors." No mention of actual factors or algorithm.

I'm looking at "Appendix B - References" in the Star*Net Version 5 manual, which cites both the 6th Edition of "Surveying Theory and Practice" by Davis, Foote, Anderson, and Mikhail and the E.M. Mikhail book I mentioned above. Davis & Foote give the factor of 2.447 and attribute it to Mikhail (Pg. 33).
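
As a rough illustration of what that sentence in the manual describes, here is a minimal Python sketch of the usual textbook computation: build the 2x2 covariance from the two standard errors and their correlation, take its eigenvalues for the semi-axes, and scale by 2.447 for 95%. This is an assumption about the general method, not STAR*NET's actual code (the manual doesn't give it), and the function name and sample numbers are made up.

```python
import math

def standard_ellipse(sigma_n, sigma_e, rho):
    """Standard (k=1) error ellipse from sigma N, sigma E and their correlation.

    Returns (semi_major, semi_minor, azimuth_deg), with the azimuth of the
    semi-major axis measured clockwise from north in [0, 180).
    """
    var_n = sigma_n ** 2
    var_e = sigma_e ** 2
    cov_ne = rho * sigma_n * sigma_e
    # Eigenvalues of [[var_n, cov_ne], [cov_ne, var_e]] are mean +/- radius.
    mean = 0.5 * (var_n + var_e)
    radius = math.hypot(0.5 * (var_n - var_e), cov_ne)
    semi_major = math.sqrt(mean + radius)
    semi_minor = math.sqrt(max(mean - radius, 0.0))
    azimuth = 0.5 * math.atan2(2.0 * cov_ne, var_n - var_e)
    return semi_major, semi_minor, math.degrees(azimuth) % 180.0

K95 = 2.447  # 2-D factor for 95% quoted from Mikhail

# Made-up sample values, in the same length unit as the coordinates.
a, b, az = standard_ellipse(0.020, 0.030, 0.0)
print("95%% semi-major %.4f, semi-minor %.4f, azimuth %.1f deg"
      % (K95 * a, K95 * b, az))
```

Positive correlation between Northing and Easting swings the major axis toward NE-SW; zero correlation leaves it along whichever axis has the larger standard error.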

Posted : January 22, 2014 4:57 pm

Kent McMillan

(@kent-mcmillan)

Posts: 11419

LS Estimates for Paul

> Least squares was developed specifically to adjust true networks. A true network has a significant number of redundancies.

Sorry, but the method of least squares estimation works on any estimation problem where there is an estimate of a quantity and an uncertainty attached to it. This includes the elements of orbits and the size and figure of the Earth based upon lengths measured along meridians of longitude between stations a degree in latitude apart.

Posted : January 22, 2014 5:30 pm

bill93

(@bill93)

Posts: 9834

Topic starter

Thanks for the references. I see the names in the Star*Net documentation, but hadn't located those books.

I searched on line and found someone's extract from D, F, A, and M. that had the table of values.

I also found another book that uses the F-distribution. So differing opinions have been around for a long time. I'm surprised the academics haven't settled it long ago.

With a published reference and my own simulation, I'm pretty confident in my interpretation.

Posted : January 22, 2014 8:00 pm

Dennis Milbert

(@dennis-milbert)

Posts: 21

Registered

Two Different Statistical Problems

I think I can shed a little light on this discrepancy.

I checked Chapter 19 of Ghilani and Wolf's Adjustment Computations
4th edition, and they do refer to an F distribution in Table 19.2,
pg.380. It has 3 parameters: the significance, and two degrees
of freedom (DF). The first DF is 2, referring to a 2-D problem.
That could also be 1 or 3 if you are after univariate or 3-D
problems. The second degree of freedom is taken as the degrees
of freedom (# observations - # unknowns) of the LS adjustment.

By contrast, Section 2.6 of Mikhail's Observations and Least Squares
discusses multinormal distributions (pg 27-35). He uses a
Chi Square distribution (with 2 degrees of freedom) to show how
to scale a standard error ellipse into an error ellipse of any
desired probability (0 to 1.0). His Table 2.1 (pg. 32) shows:

k=1.000 P=0.394
k=1.177 P=0.500
k=2.146 P=0.900
k=2.447 P=0.950 <------<<<< 95%
k=3.035 P=0.990

We are all used to the notion that for 1-D distributions,
a Gaussian normal distribution tells us k=1.96 for P=0.95.
For a 2-D distribution we are trying to establish a region
with a joint probability. k=1.96 would be too small, since
some of the instances along one chosen axis will be in the
"tails" of the other axis. Intuition tells us we need a
number bigger than 1.96 for a 2-D binormal distribution.
k=2.447 is that number, and it can be established from numerical
integration of the 2-D Gaussian normal distribution.
(Or by Monte Carlo)
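
For anyone who wants to check the table, here is a minimal Python
sketch. It relies on the fact that a Chi Square distribution with
2 degrees of freedom has a closed form, so P = 1 - exp(-k^2/2) and
k = SQRT(-2 ln(1-P)). The Monte Carlo check assumes a unit,
uncorrelated binormal (any binormal can be rotated and scaled to
that case). Function names, trial count and seed are my own.

```python
import math
import random

def probability_for_k(k):
    """Probability inside the standard ellipse scaled by k (2-D normal)."""
    return 1.0 - math.exp(-0.5 * k * k)

def k_for_probability(p):
    """Scale factor k giving joint probability p (2-D normal)."""
    return math.sqrt(-2.0 * math.log(1.0 - p))

def monte_carlo_coverage(k, trials=100_000, seed=1):
    """Fraction of simulated unit binormal points falling within radius k."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = rng.gauss(0.0, 1.0)
        y = rng.gauss(0.0, 1.0)
        if x * x + y * y <= k * k:
            hits += 1
    return hits / trials

for p in (0.500, 0.900, 0.950, 0.990):
    print("P=%.3f  k=%.3f" % (p, k_for_probability(p)))

print("closed form, k=1.96 :", round(probability_for_k(1.96), 3))
print("simulated,   k=1.96 :", round(monte_carlo_coverage(1.96), 3))
print("simulated,   k=2.447:", round(monte_carlo_coverage(2.447), 3))
```

The k=1.96 lines show the shortfall described above: in 2-D that
radius captures noticeably less than 95% of the points.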

So, what's the deal with Ghilani and Wolf?

First, there is not a direct comparison with "k" and F(alpha,2,DF).
To compare Table 19.2 of (Ghilani & Wolf) with Table 2.1 of Mikhail:
the relation is k = SQRT(2 * F(alpha,2,DF) ).
Since F(alpha=0.95,2,DF=60)=3.15, we get a scale of k=2.50998.
Not so different from k=2.447.

In fact there is a formal relationship between F with DF2=infinity
and Chi Square. And, we are seeing that behavior being approached
in the limit for large degrees of freedom.
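
The approach to the limit can be tabulated with a short Python
sketch. It assumes the closed-form quantile that exists when the
first DF of the F distribution is exactly 2:
F(p,2,DF) = (DF/2) * ((1-p)^(-2/DF) - 1). That shortcut is my own
choice for avoiding a statistics library; it does not cover a
first DF of 1 or 3. Function names are hypothetical.

```python
import math

def f_quantile_2(p, dof):
    """Value of F with (2, dof) degrees of freedom at cumulative probability p."""
    return 0.5 * dof * ((1.0 - p) ** (-2.0 / dof) - 1.0)

def k_from_f(p, dof):
    """Ellipse scale factor k = SQRT(2 * F(p, 2, dof))."""
    return math.sqrt(2.0 * f_quantile_2(p, dof))

# Chi Square (2 DF) limit for comparison.
k_limit = math.sqrt(-2.0 * math.log(1.0 - 0.95))

for dof in (1, 2, 5, 10, 30, 60, 1000):
    print("DF=%4d  F=%8.3f  k=%6.3f"
          % (dof, f_quantile_2(0.95, dof), k_from_f(0.95, dof)))
print("limit            k=%6.3f" % k_limit)
```

With only a handful of redundant observations the factor is far
larger than 2.447, which is the point of the discussion below.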

What is happening is that two somewhat different statistical
problems are being considered in Mikhail vs. G&W. In Mikhail,
the binormal distribution parameters are considered known.
That is, we know sigma X, sigma Y and the covariance of X and Y.
A typical way of knowing these values is if we know the sigmas
of our observations, and we compute linear error propagation.
In this version of the problem, we can do the error propagation
in a LS adjustment, or not. We can do the propagation without
any measurements at all.
k=2.447.

In Ghilani and Wolf, they are also looking at scaling an error
ellipse to 95%. But, they assume that the variances and covariance
are not known. Rather, they are subject to a variance scaling
factor (formally: the "a posteriori variance of unit weight").
In this version of the problem, we have observations, we do a
LS adjustment, we estimate our "a posteriori variance of unit
weight", we scale all of our input observation variances, and we
do error propagation to our scaled output variances and covariances.

In this problem the observations have to do double duty. They
have to estimate coordinates (and sometimes other parameters),
and they also have to estimate the variability of the observations.
Because the coordinates can slosh around, the residuals underestimate
the error of the observations. Quite a bit of error can be hidden
in coordinates for low degrees of freedom -- hence the larger
error estimates in Table 19.2.
k>2.447 (depending on redundancy).
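
To make the "a posteriori variance of unit weight" concrete, here
is a minimal Python sketch. It assumes uncorrelated observations
(a diagonal weight matrix with weights 1/sigma^2) and uses the
usual estimate: sum of squared normalized residuals divided by the
degrees of freedom. The residuals, standard errors and covariance
values are made up. Because the propagation is linear, scaling the
output covariance by the factor is equivalent to scaling the input
variances first.

```python
import math

def variance_of_unit_weight(residuals, std_errors, n_unknowns):
    """Return (sigma0_squared, dof) for uncorrelated observations."""
    dof = len(residuals) - n_unknowns
    if dof <= 0:
        raise ValueError("no redundancy: the variance factor cannot be estimated")
    # v^T P v with a diagonal weight matrix P = diag(1 / sigma^2)
    vtpv = sum((v / s) ** 2 for v, s in zip(residuals, std_errors))
    return vtpv / dof, dof

def scale_covariance(cov, sigma0_sq):
    """Scale every element of a covariance matrix given as a list of rows."""
    return [[sigma0_sq * c for c in row] for row in cov]

# Made-up example: 3 observations of 1 unknown, each with sigma = 0.002.
residuals = [0.002, -0.004, 0.003]
std_errors = [0.002, 0.002, 0.002]
sigma0_sq, dof = variance_of_unit_weight(residuals, std_errors, n_unknowns=1)

prior_cov = [[4.0e-6, 1.0e-6], [1.0e-6, 9.0e-6]]  # hypothetical propagated covariance
post_cov = scale_covariance(prior_cov, sigma0_sq)
print("sigma0 = %.3f with DF = %d" % (math.sqrt(sigma0_sq), dof))
print("scaled covariance:", post_cov)
```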

The simplest case is direct measurements (y) of an unknown (x).
It is one dimensional. The best estimate is the mean, and
it is assumed that the measurement variance is unknown. In this
case the estimates of the means follow Student's t distribution.
When a large number of measurements are used to establish a
given mean (high DF), then the mean's t distribution approaches
the Gaussian normal distribution. For low DF, the "t" will
inflate the unknown's variance significantly; reflecting the
fact that observation error will hide in the coordinates, so
the normal estimate of variance will be too small.
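
The 1-D case is easy to simulate. The sketch below (Python, my own
function name, trial count and seed) draws repeated small samples,
estimates the standard deviation from each sample, and counts how
often mean +/- multiplier * s / SQRT(n) contains the true value.
The 4.303 is the standard two-sided 95% Student's t value for 2
degrees of freedom.

```python
import math
import random

def coverage_of_mean(n, multiplier, trials=20_000, seed=7):
    """Fraction of trials where the interval around the sample mean
    contains the true mean (0), with sigma estimated from the sample."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(sample) / n
        s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
        if abs(mean) <= multiplier * s / math.sqrt(n):
            hits += 1
    return hits / trials

print("n=3,  k=1.96 :", coverage_of_mean(3, 1.96))    # well short of 95%
print("n=3,  k=4.303:", coverage_of_mean(3, 4.303))   # t value restores ~95%
print("n=61, k=1.96 :", coverage_of_mean(61, 1.96))   # high DF: nearly normal
```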

Hope this helps...

Posted : January 23, 2014 10:36 am

Kent McMillan

(@kent-mcmillan)

Posts: 11419

Two Different Statistical Problems

> In Ghilani and Wolf, they are also looking at scaling an error
> ellipse to 95%. But, they assume that the variances and covariance
> are not known. Rather, they are subject to a variance scaling
> factor (formally: the "a posteriori variance of unit weight").
> In this version of the problem, we have observations, we do a
> LS adjustment, we estimate our "a posteriori variance of unit
> weight", we scale all of our input observation variances, and we
> do error propagation to our scaled output variances and covariances.

> The simplest case is direct measurements (y) of an unknown (x).
> It is one dimensional. The best estimate is the mean, and
> it is assumed that the measurement variance is unknown. In this
> case the estimates of the means follow Student's t distribution.
> When a large number of measurements are used to establish a
> given mean (high DF), then the mean's t distribution approaches
> the Gaussian normal distribution. For low DF, the "t" will
> inflate the unknown's variance significantly; reflecting the
> fact that observation error will hide in the coordinates, so
> the normal estimate of variance will be too small.

Aha! That explains perfectly well the differences in sizes of 95%-confidence error ellipses using the methods of Mikhail and those attributed to Ghilani and Wolf.

When the standard error of observations isn't well characterized beforehand and is to be determined from the adjustment residuals, the estimate of the standard error will have an uncertainty that depends upon the number of redundant observations from which it was determined. It's the inherent problem of estimating a standard error from small samples and there's no getting around it. Those uncertainties in the standard errors of the observations will produce corresponding uncertainties in the sizes of the error ellipses derived from them, which must be taken into account when computing the confidence limits of the ellipses.

The main point that underscores for me is how much more efficient it is for a surveyor to use measuring processes with well-characterized standard errors instead of bootstrapping standard errors from the adjustment. The chi-square test is then best used to simply validate a priori estimates of standard errors.

Posted : January 23, 2014 11:20 am

bill93

(@bill93)

Posts: 9834

Topic starter

Two Different Statistical Problems

Thanks! I think that may be what I was looking for and didn't know how to ask. At first read, that makes a lot of sense. I'll have to digest it more carefully.

I had noticed that W&G seemed to always scale by the sigma-zero to get the post sigmas, whereas Star*Net does not do that unless the chi-sq test fails due to being too large. (If Star*Net sees it fail on the low side, it remains pessimistic and keeps the prior estimates.) Your explanation ties together the two differences in the algorithms.

This whole discussion tells me that all users of commercial software should look into what it is doing for them, so they know the assumptions they are implicitly using.

Posted : January 23, 2014 11:47 am

Kent McMillan

(@kent-mcmillan)

Posts: 11419

Two Different Statistical Problems

> This whole discussion tells me that all users of commercial software should look into what it is doing for them, so they know the assumptions they are implicitly using.

It would be interesting to devise a test network so that interested posters could adjust it and post results to compare, particularly the uncertainty estimates.

Posted : January 23, 2014 5:05 pm

khaled

(@khaled)

Posts: 2

Registered

Two Different Statistical Problems

How can I solve that

Posted : April 13, 2014 6:13 pm

khaled

(@khaled)

Posts: 2

Registered

Two Different Statistical Problems

I need help
How can I solve that

Posted : April 13, 2014 6:24 pm

bill93

(@bill93)

Posts: 9834

Topic starter

Two Different Statistical Problems

This looks like it might be from a textbook or sample test. If they haven't given you a computer program, they want you to work it longhand. That's quite a lot of work when you have to linearize the coefficients for the dependencies on the angle measurements based on an initial guess for the location of E, do the least squares, re-linearize, and repeat until it settles down.
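
To give an idea of what the longhand loop involves, here is a minimal Python sketch of the linearize / solve / repeat cycle for angles measured at an unknown point between known stations. It is not the posted problem: the control coordinates, the "true" point, and the starting guess are all made up, the synthetic angles are error-free, and the weights are taken as equal. Coordinates are (N, E) pairs.

```python
import math

def azimuth(from_pt, to_pt):
    """Azimuth in radians, clockwise from north, between (N, E) points."""
    return math.atan2(to_pt[1] - from_pt[1], to_pt[0] - from_pt[0])

def angle_at(e_pt, back, fore):
    """Clockwise angle at e_pt from backsight to foresight, in [0, 2*pi)."""
    return (azimuth(e_pt, fore) - azimuth(e_pt, back)) % (2.0 * math.pi)

def azimuth_partials(e_pt, target):
    """Partials of azimuth(e_pt -> target) with respect to N and E of e_pt."""
    dn = target[0] - e_pt[0]
    de = target[1] - e_pt[1]
    d2 = dn * dn + de * de
    return de / d2, -dn / d2

def resect(control, observations, guess, tol=1e-6, max_iter=20):
    """Gauss-Newton solution for the occupied point, equal weights."""
    n, e = guess
    for _ in range(max_iter):
        n11 = n12 = n22 = t1 = t2 = 0.0   # normal equations, 2 unknowns
        for back, fore, observed in observations:
            pb, pf = control[back], control[fore]
            computed = angle_at((n, e), pb, pf)
            # observed minus computed, wrapped into [-pi, pi]
            misclosure = math.remainder(observed - computed, 2.0 * math.pi)
            bn, be = azimuth_partials((n, e), pb)
            fn, fe = azimuth_partials((n, e), pf)
            an, ae = fn - bn, fe - be     # linearized coefficients
            n11 += an * an
            n12 += an * ae
            n22 += ae * ae
            t1 += an * misclosure
            t2 += ae * misclosure
        det = n11 * n22 - n12 * n12
        dn = (n22 * t1 - n12 * t2) / det
        de = (n11 * t2 - n12 * t1) / det
        n, e = n + dn, e + de             # re-linearize about the new point
        if math.hypot(dn, de) < tol:
            break
    return n, e

# Hypothetical control and a hypothetical true position used to fake the angles.
control = {"A": (1000.0, 1000.0), "B": (2000.0, 1000.0),
           "C": (2000.0, 2000.0), "D": (1000.0, 2000.0)}
true_e = (1400.0, 1600.0)
pairs = [("A", "B"), ("B", "C"), ("C", "D")]
observations = [(b, f, angle_at(true_e, control[b], control[f])) for b, f in pairs]

north, east = resect(control, observations, guess=(1450.0, 1550.0))
print("adjusted point: %.3f, %.3f" % (north, east))
```

A real adjustment would also weight each angle by its standard error and carry the inverse of the normal matrix through to the error ellipse.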

If you are allowed to use a computer program, it's pretty simple to enter the data and let it crank.
I get coordinates 9920.87, 11295.26
and the 95% ellipse is at 132 degrees azimuth, with either
semimajor 0.072, semiminor 0.059 if you use the Star*Net philosophy
or
semimajor 0.066, semiminor 0.054 if you use the Wolf & Ghilani philosophy

You may be able to get a free download of Star*Net from the company web site
www.microsurvey.com/products/starnet/download_form.php
I don't know their policy on students, but it would be worth a try. The limitation on the free operation used to be, and may still be, to run problems with up to 10 points. That's an excellent way to get some practical experience with least squares.

Posted : April 14, 2014 11:00 am

jhframe

(@jim-frame)

Posts: 7277

Two Different Statistical Problems

I didn't run the numbers, but am assuming that Bill held the coordinates of A, B, C, and D fixed in his adjustment. This is reasonable for the exercise, in keeping with the word "given" in the problem statement. However, it's worth noting that in the real world those coordinates would have standard errors of their own that would inflate the ellipse values a bit.

Posted : April 14, 2014 11:45 am

bill93

(@bill93)

Posts: 9834

Topic starter