Kent McMillan

(@kent-mcmillan)

Posts: 11416

Member

Topic starter

Just to continue with the subject of the estimation of uncertainties in measurement processes, here's an example to consider. Suppose a coordinate of a point is determined twice independently by the same process such as, say, network RTK.

One way to estimate the uncertainty in the results is to examine the statistics of the differences between the two coordinate values from two independent repeat measurements on the same point.

If the following are the differences between the North values, N(1) and N(2) of two coordinates taken on each of twenty different points, what is the best estimate of the uncertainty of any one North coordinate value if the errors in the coordinates are assumed to be random, normally distributed, and drawn from a population with the same standard deviation? That is, the assumption is that there is some sort of uniform process at work.

I realize that in real world positioning by GPS, the assumption that all North values will have the same standard error may not always hold up, but in the case of true black box positioning, the problem is probably legitimately posed as it is.

Posted : August 28, 2012 3:53 pm

Kent McMillan

(@kent-mcmillan)

Posts: 11416

Member

Topic starter

And the same differences ready to be cut and pasted into a spreadsheet:

+0.040
+0.018
+0.038
+0.056
+0.080
-0.016
-0.022
-0.058
-0.003
-0.015
-0.057
+0.035
-0.047
+0.062
-0.038
-0.093
-0.071
-0.027
+0.007
-0.007

Posted : August 28, 2012 4:18 pm

bill93

(@bill93)

Posts: 9977

Member

The first difficulty is that these differences are probably highly correlated. I'd expect the differences to wander from day to day, and each set of measurements was probably obtained in a matter of an hour or two.

Posted : August 28, 2012 4:33 pm

Kent McMillan

(@kent-mcmillan)

Posts: 11416

Member

Topic starter

> The first difficulty is that these differences are probably highly correlated. I'd expect the differences to wander from day to day, and each set of measurements was probably obtained in a matter of an hour or two.

If you're speaking of some real world process, that may be somewhat true. But as a purely statistical exercise I can assure you that the above differences were formed from values produced by the Gaussian generator at random.org as sets from populations with the same means and standard errors.

You can investigate the differences for evidence of non-random bias, right?
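One simple way to look for a systematic bias in the posted differences is a one-sample t-test of their mean against zero. A minimal sketch in Python, assuming a two-sided 5% test; the critical value 2.093 for 19 degrees of freedom comes from a standard t table, and the variable names are only illustrative:

```python
import math
import statistics

# The 20 posted differences N(1) - N(2)
d = [0.040, 0.018, 0.038, 0.056, 0.080, -0.016, -0.022, -0.058, -0.003, -0.015,
     -0.057, 0.035, -0.047, 0.062, -0.038, -0.093, -0.071, -0.027, 0.007, -0.007]

n = len(d)
mean_d = statistics.mean(d)      # average difference; near zero if there is no bias
sd_d = statistics.stdev(d)       # sample standard deviation (n - 1 divisor)
t_stat = mean_d / (sd_d / math.sqrt(n))

T_CRIT_19DF = 2.093              # two-sided 5% critical value, 19 d.o.f. (from a t table)
biased = abs(t_stat) > T_CRIT_19DF

print(f"mean = {mean_d:+.4f}, t = {t_stat:+.2f}, bias detected: {biased}")
```

For this data the mean difference is about -0.006 and |t| is roughly 0.55, well under the critical value, so the test gives no reason to suspect a bias.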

Posted : August 28, 2012 4:48 pm

ashton

(@ashton)

Posts: 566

Member

I wondered if it was really necessary to get values from a website, rather than just using a minor modification of what Excel provides. I know spreadsheets had an evil reputation for their random number generators in the past, but I understand Microsoft overhauled Excel's statistical suite in 2007, so I thought I would see how it does today. I ran a chi-squared test by dividing the results of 1000 trials for the normal distribution with 0 mean and 0.02 sigma into 16 bins. The Excel random numbers were generated with the formula =NORM.INV(RAND(), 0, 0.02)
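For anyone working outside a spreadsheet, the same inverse-CDF trick can be sketched in Python. This is an assumed equivalent of the Excel formula, not a test of Excel itself: `random.random()` stands in for RAND() and `statistics.NormalDist.inv_cdf` for NORM.INV. The helper name and the seed are made up for the illustration.

```python
import random
import statistics

def norm_inv_rand(rng, mu=0.0, sigma=0.02):
    """Draw one normal deviate by inverting the CDF at a uniform random number."""
    u = rng.random()
    while u == 0.0:              # inv_cdf needs 0 < p < 1; random() can return 0.0
        u = rng.random()
    return statistics.NormalDist(mu, sigma).inv_cdf(u)

rng = random.Random(2012)        # arbitrary fixed seed so the run is repeatable
sample = [norm_inv_rand(rng) for _ in range(1000)]

print(f"mean = {statistics.mean(sample):+.4f}, sd = {statistics.stdev(sample):.4f}")
```

With 1000 draws the sample mean should land close to 0 and the sample standard deviation close to 0.02, though the exact figures depend on the seed.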

The value of chi squared for the Excel random numbers was 8.0; with 15 degrees of freedom, that result or a higher (worse) one would occur 92% of the time if the data were truly drawn from a normal distribution.

The value of chi squared for the random.org data was 24.4. A value that high would only occur 5.9% of the time. If I did this right, we are almost justified in rejecting the hypothesis that data from random.org is normally distributed.

Posted : August 28, 2012 5:47 pm

Kent McMillan

(@kent-mcmillan)

Posts: 11416

Member

Topic starter

> I know spreadsheets had an evil reputation for their random number generators in the past, but I understand Microsoft overhauled Excel's statistical suite in 2007, so I thought I would see how it does today. I ran a chi-squared test by dividing the results of 1000 trials for the normal distribution with 0 mean and 0.02 sigma into 16 bins. The Excel random numbers were generated with the formula =NORM.INV(RAND(), 0, 0.02)

> The value of chi squared for the random.org data was 24.4. A value that high would only occur 5.9% of the time. If I did this right, we are almost justified in rejecting the hypothesis that data from random.org is normally distributed.

I'm not quite sure I understand the procedure that you followed. I was with you up to the point that you divided each of two sets of random numbers (normally distributed with mean = 0 and sigma = 0.02) into 16 groups with 62 members and presumably tossed out the rest. Then what did you do with the two sets of 16 groups, one from Excel and the other from random.org's random Gaussian generator?

Posted : August 28, 2012 6:34 pm

ashton

(@ashton)

Posts: 566

Member

I used all 1000 values from each source. Each bin was 0.01 wide. The lowest bin was from -0.08 to -0.07. The highest bin was from 0.07 to 0.08. I used the Excel histogram feature to count the number of values in each bin. The frequencies for Excel random numbers were:

0
1
4
10
42
104
150
194
183
148
97
40
21
4
2
0

The results from random.org were

1
2
5
9
47
105
144
192
211
148
91
25
14
3
3
0

In the chi squared test the expected number of results in each bin is computed. My computed values (rounded to whole numbers) are

0
1
5
17
44
92
150
191
191
150
92
44
17
5
1
0

Then for each bin the term (observed-expected)^2/expected is computed and all the terms are summed. The lower the sum, the better the "goodness of fit".
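The calculation described above can be sketched in Python as follows. It assumes the expected counts are taken unrounded from the normal CDF (mean 0, sigma 0.02) over the sixteen 0.01-wide bins from -0.08 to 0.08; the function names are only illustrative.

```python
import math

def normal_cdf(x, mu=0.0, sigma=0.02):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def chi_squared(observed, n_total=1000, lo=-0.08, width=0.01):
    """Sum of (observed - expected)^2 / expected over equal-width bins."""
    total = 0.0
    for i, obs in enumerate(observed):
        left = lo + i * width
        # Expected count: total draws times the normal probability of the bin
        expected = n_total * (normal_cdf(left + width) - normal_cdf(left))
        total += (obs - expected) ** 2 / expected
    return total

excel = [0, 1, 4, 10, 42, 104, 150, 194, 183, 148, 97, 40, 21, 4, 2, 0]
random_org = [1, 2, 5, 9, 47, 105, 144, 192, 211, 148, 91, 25, 14, 3, 3, 0]

chi2_excel = chi_squared(excel)
chi2_random_org = chi_squared(random_org)
print(f"Excel: {chi2_excel:.1f}, random.org: {chi2_random_org:.1f}")
```

Run this way, the sums agree with the 8.0 and 24.4 quoted earlier. Note that the outermost bins have expected counts well below 1, so a single stray value there adds several units to the sum; a common rule of thumb is to merge bins whose expected count is below about 5 before drawing conclusions.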

Posted : August 28, 2012 7:13 pm

Kent McMillan

(@kent-mcmillan)

Posts: 11416

Member

Topic starter

Hmm. I don't think it should be much of a challenge to test some samples from random.org for normality of distribution. When I get a chance I'll do that as a demonstration.

In the meantime, the question posed with the data I posted still stands.

Posted : August 28, 2012 9:35 pm

jprice

(@jprice)

Posts: 23

Member

This will work as well as any other method. Errors in GPS positions are uncertain since there are effects on GPS that cannot be measured.
I gave a paper in 2000 at the ACSM meeting in Little Rock, Arkansas that explains some of these errors. Taking the mean of the results over different constellations and under different atmospheric conditions is the best way to minimize your error.

Posted : August 28, 2012 10:09 pm

conrad

(@conrad)

Posts: 515

Member

how 'bout 0.034 ft?

Posted : August 28, 2012 11:18 pm

Kent McMillan

(@kent-mcmillan)

Posts: 11416

Member

Topic starter

> how 'bout 0.034 ft?

I get an estimate of 0.033 for the standard error of the individual values of the North coordinates from which the differences were calculated, so 0.034 is probably just minor round-off error. I'll give credit for it.

The actual distribution that the 40 random values were pulled from had a standard error of 0.030 ft. There is some uncertainty in the standard error estimated from a sample size of n = 20 differences, and that is why the estimate derived from the differences didn't exactly give the value of 0.030 ft.
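To get a feel for how much an estimate from only 20 differences can wander, here is a small Monte Carlo sketch. It assumes the same setup as the exercise (two independent normal values per point, sigma = 0.030 ft) but uses Python's own generator with an arbitrary seed rather than random.org, so the individual numbers are illustrative only.

```python
import math
import random
import statistics

SIGMA = 0.030      # true standard error of one coordinate, ft
N_POINTS = 20      # differences per simulated survey
N_TRIALS = 5000    # number of simulated surveys (arbitrary)

rng = random.Random(1)  # arbitrary fixed seed for repeatability

def one_estimate():
    """Simulate 20 repeat pairs and return the pooled estimate of sigma."""
    sum_sq = 0.0
    for _ in range(N_POINTS):
        d = rng.gauss(0.0, SIGMA) - rng.gauss(0.0, SIGMA)
        sum_sq += d * d
    return math.sqrt(sum_sq / (2 * N_POINTS))

estimates = [one_estimate() for _ in range(N_TRIALS)]
spread = statistics.stdev(estimates)

print(f"mean estimate    = {statistics.mean(estimates):.4f} ft")
print(f"spread (1 sigma) = {spread:.4f} ft")
print(f"sigma/sqrt(2n)   = {SIGMA / math.sqrt(2 * N_POINTS):.4f} ft")
```

The last line is the usual large-sample approximation for the standard error of a standard deviation estimated with n degrees of freedom, about 0.005 ft here. On that scale, an estimate of 0.033 ft against a true value of 0.030 ft is an unremarkable result.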

Posted : August 29, 2012 6:22 am

Kent McMillan

(@kent-mcmillan)

Posts: 11416

Member

Topic starter

BTW, for those who want to use this technique: each difference, d, in coordinate values on repeated points, where d = N(1) - N(2), gives a variance, s^2. The mean of the two values is the best estimate of the coordinate, each value departs from that mean by d/2, and there is one degree of freedom, so

s^2 = (d/2)^2 x 2 = d^2 / 2

That means that from a single difference, s^2 is an unbiased estimate of the variance, and the corresponding estimate of the standard error is s

s = SQRT( d^2 /2) = |d| / SQRT(2)

An estimate based on such a small sample, however, has a large uncertainty. To reduce the uncertainty in the estimate, one can pool the individual estimates s(i) from a larger number, n, of differences in coordinate values on repeated points.

Call the pooled estimate s(pooled).

s(pooled) = SQRT [(s(1)^2 + s(2)^2 + ... + s(n)^2)/n]

In other words, the estimate of the standard error from n differences is just the SQRT of the mean of the variances for each of the differences.
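Applied to the 20 differences posted earlier in the thread, the pooled formula can be sketched in Python like this (variable names are only illustrative):

```python
import math

# The 20 posted differences N(1) - N(2), in feet
d = [0.040, 0.018, 0.038, 0.056, 0.080, -0.016, -0.022, -0.058, -0.003, -0.015,
     -0.057, 0.035, -0.047, 0.062, -0.038, -0.093, -0.071, -0.027, 0.007, -0.007]

# Each difference gives its own variance estimate s(i)^2 = d(i)^2 / 2
variances = [x * x / 2 for x in d]

# Pooled estimate: square root of the mean of the individual variances
s_pooled = math.sqrt(sum(variances) / len(variances))

print(f"s(pooled) = {s_pooled:.3f} ft")
```

This works out to about 0.033 ft for the posted data.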

Posted : August 29, 2012 7:04 am

conrad

(@conrad)

Posts: 515

Member

i used propagation of variances:

sd_diffN^2=(df/dN1)^2*sdN1^2+(df/dN2)^2*sdN2^2, where f=N1-N2, so df/dN1=1 and df/dN2=-1

and since sdN2=sdN1

sd_diffN^2=2*sdN1^2

sdN1=sqrt(sd_diffN^2/2)

excel gives sd_diffN=0.047591353 for your sample provided.

sqrt(0.047591353^2/2)=0.033652168=0.034 ft, or 10mm in the civilised world 😉
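The same propagation can be sketched in Python, assuming `statistics.stdev` plays the role of Excel's sample standard deviation. For comparison, the sketch also computes the root-mean-square version, which takes the mean difference to be zero and divides by n.

```python
import math
import statistics

# The 20 posted differences N(1) - N(2), in feet
d = [0.040, 0.018, 0.038, 0.056, 0.080, -0.016, -0.022, -0.058, -0.003, -0.015,
     -0.057, 0.035, -0.047, 0.062, -0.038, -0.093, -0.071, -0.027, 0.007, -0.007]

# Route 1: sample SD of the differences (mean removed, n - 1 divisor), then /sqrt(2)
sd_diff = statistics.stdev(d)
sd_single = sd_diff / math.sqrt(2)

# Route 2: RMS of the differences (mean assumed zero, n divisor), then /sqrt(2)
rms_diff = math.sqrt(sum(x * x for x in d) / len(d))
sd_single_rms = rms_diff / math.sqrt(2)

print(f"sd_diffN = {sd_diff:.6f}, sdN1 = {sd_single:.4f} ft")
print(f"rms_diffN = {rms_diff:.6f}, sdN1 = {sd_single_rms:.4f} ft")
```

The two routes give about 0.0337 and 0.0331 ft respectively, so the gap between 0.034 and 0.033 appears to come from the choice of estimator rather than from rounding.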

I think being able to analyse one's own data sets for quality estimation purposes is a useful tool sometimes. also for planning surveys when dry-running numbers to get a posteriori values to compare to required spec. luckily decent survey software will allow simulation runs of approximate networks & numbers.

Posted : August 29, 2012 7:38 am

MightyMoe

(@mightymoe)

Posts: 10534

Member

Some math for Kent

I'm just looking at ties to existing monuments. These were set in 2001 with RTK. They were set based on a different projection than the projection I'm checking them with.

Also, they are based on a different epoch of NAD83. They both, however, have common control points that are tied in both projections. So I took vectors from the old job and recalculated the corners in the new job, then located the corners in the new projection and these are the results:

NW1/16
X 0.01, Y 0.04

CN1/16
X 0.02, Y 0.01

NE1/16
X 0.02, Y 0.02

CE1/16
X 0.01, Y 0.04

Just some real world numbers spaced over ten years for a look at the accuracy of RTK.
Two different Trimble systems also.

Posted : August 29, 2012 8:47 am

Kent McMillan

(@kent-mcmillan)

Posts: 11416

Member

Topic starter

> I think being able to analyse one's own data sets for quality estimation purposes is a useful tool sometimes. also for planning surveys when dry-running numbers to get a-posteri values to compare to required spec. luckily decent survey software will allow simulation runs of approximate networks & numbers.

Yes, but naturally any simulation depends upon having realistic values of the uncertainties in the positions to begin with. When one considers how varied the conditions are in which black box technologies like RTN RTK will be employed, I can't myself see any good practice other than testing the accuracies oneself to either verify the manufacturer's claims or to revise them for one's own work.

Naturally, simple modifications such as adding essentially exact conventional measurements between nearby points positioned via GPS techniques can provide both an important test of the a priori assumptions about the standard errors in the GPS-derived positions and demonstrate a low probability of blunders being present in the GPS work.
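As a rough illustration of that kind of check, the sketch below compares the inverse between two GPS-derived points with a directly measured distance that is treated as errorless. Everything in it is hypothetical: the coordinates, the measured distance, the assumed standard error of 0.03 ft per coordinate component and the 3-sigma tolerance are placeholders, not values from this thread.

```python
import math

def distance_check(p1, p2, measured, sigma, k=3.0):
    """Compare a coordinate inverse with a directly measured distance.

    p1, p2   : (N, E) coordinates from GPS, same units as `measured`
    measured : conventional distance, treated as essentially exact
    sigma    : assumed standard error of each coordinate component
    k        : tolerance as a multiple of the distance standard error (arbitrary)
    """
    inverse = math.hypot(p2[0] - p1[0], p2[1] - p1[1])
    # Independent errors of size sigma at both ends give sigma * sqrt(2) in the distance
    sigma_dist = sigma * math.sqrt(2.0)
    misclosure = inverse - measured
    return misclosure, sigma_dist, abs(misclosure) <= k * sigma_dist

# Hypothetical numbers, for illustration only
misclosure, sigma_dist, ok = distance_check(
    (1000.00, 5000.00), (1300.03, 5400.02), measured=500.00, sigma=0.03)

print(f"misclosure = {misclosure:+.3f}, sigma = {sigma_dist:.3f}, within tolerance: {ok}")
```

A misclosure far outside the tolerance would point to either a blunder or an a priori standard error that is too optimistic; a long run of misclosures well inside it would suggest the opposite.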

Posted : August 29, 2012 8:57 am

Kent McMillan

(@kent-mcmillan)

Posts: 11416

Member

Topic starter

Some math for Kent

> Just some real world numbers spaced over ten years for a look at the accuracy of RTK.
> Two different Trimble systems also.

Those are differences, I take it? Would you also post the signs?

Were both sets of coordinates recomputed when the new vectors were added to the old project or were the new coordinates derived by simple transformation using just the common control point values (that didn't include the above cadastral monuments, I assume)? I'm trying to get an idea of the method used since it obviously can make a significant difference in how the data you posted is analyzed.

For example, your data suggests that each set of positions had a standard error of +/-5mm in N and E (if they were both of the same magnitude), but was each position in both sets the mean of several occupations or just from one shot?
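The roughly 5 mm figure can be reproduced with the pooled-difference formula. The sketch assumes the posted differences are in feet (the unit was not stated) and that X and Y share a single standard error, so all eight components are pooled together. The signs do not matter because each difference is squared.

```python
import math

FT_TO_MM = 304.8  # millimetres per foot (assumes the differences are in feet)

# Differences (X, Y) between the 2001 and 2012 positions, magnitudes as posted
diffs = {
    "NW1/16": (0.01, 0.04),
    "CN1/16": (0.02, 0.01),
    "NE1/16": (0.02, 0.02),
    "CE1/16": (0.01, 0.04),
}

# Pool X and Y components together: s^2 = mean of d^2 / 2
components = [c for xy in diffs.values() for c in xy]
s = math.sqrt(sum(c * c for c in components) / (2 * len(components)))

print(f"s = {s:.4f} ft = {s * FT_TO_MM:.1f} mm")
```

With only eight components the estimate is itself quite uncertain, so it should be read as an order of magnitude rather than a firm value.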

Posted : August 29, 2012 9:00 am

Farsites

(@farsites)

Posts: 267

Member

Some math

Good work and diligence.
Looks like there is some potential there for useful application on any number of things that surveyors do. Seems suitable for things like, for instance, a Cat 1-B Standard Survey in TX.

Posted : August 29, 2012 9:02 am

MightyMoe

(@mightymoe)

Posts: 10534

Member

Some math for Kent

> Would you also post the signs?

I kind of thought you might like those. The Y's are all negative. The X for the CE1/16 is negative.

I changed the main control point from the old job to the new lat, long, numbers. I then recomputed the locations using the old vectors to put the monuments into the new projection.

Then the existing monuments were located and those are the results. I was quite happy with the results. I figured I'd see larger differences than that.

The reason the monuments were surveyed in the first place was to fence the boundary. I set the corners and line points along the boundary lines. A big hurry job. Of course, the fence never got built, so the corners didn't get disturbed.

Posted : August 29, 2012 9:46 am

Kent McMillan

(@kent-mcmillan)

Posts: 11416

Member

Topic starter

Some math for Kent

> Would you also post the signs?
>
> I kind of thought you might like those. The Y's are all negative. The X for the CE1/16 is negative.
>
> I changed the main control point from the old job to the new lat, long, numbers. I then recomputed the locations using the old vectors to put the monuments into the new projection.
>
> Then the existing monuments were located and those are the results. I was quite happy with the results. I figured I'd see larger differences than that.
>
> The reason the monuments were surveyed in the first place was to fence the boundary. I set the corners and line points along the boundary lines. A big hurry job. Of course, the fence never got built, so the corners didn't get disturbed.

So, if I understand things, you updated the coordinates of the control point from the old job to those in the new system and recomputed the old project. Were the boundary monuments assigned new i.d. nos. or are the differences you mentioned the residuals when both sets of RTK vectors, new and old, were adjusted to the same point? I'm assuming the former, but better ask.

One missing piece of information is how the monuments were positioned in the old project. That is:

- Were they adjusted from more than one RTK occupation and, if so, how many?
- How long were the RTK occupations (approximately)?
- Were the monuments directly connected to any other control points other than the one occupied by the RTK base?

Likewise for the new coordinates. That is:

- Were they adjusted from more than one RTK occupation and, if so, how many?
- How long were the RTK occupations (approximately)?

My first guess would be that both sets of positions were computed from more than a one-beep RTK vector. What did I win?

Posted : August 29, 2012 10:13 am

MightyMoe

(@mightymoe)

Posts: 10534

Member

Some math for Kent

I set the monuments in 2001 from a point west of the property. Then I moved the base to a control point north of the property and ran line checking the monuments as I went. No adjustments were applied. The coordinates are calculated positions of a section breakdown.

The 2012 checks were done from the west control point.

I'm holding record values for my new plat, which will merge this small area into a larger boundary.

As with any established and accepted property monument they all have the same error. Which, of course, is zero.

Posted : August 29, 2012 10:24 am

Using Differences in Two Repeats to Estimate Uncertainty
