I've been tinkering with writing my own computer program to do least squares fitting, in 2D plane geometry, using an input file vaguely like that for Star*Net. I've got it running in an awkward manner (process the input into files of matrices, solve those in Matlab, read the matrices back in, and print out statistics). It is still missing some important features like centering error. I get the same answers as Star*Net's demo mode with 10 points, and it also gives reasonable-looking answers for a problem with 35 points.
Star*Net says it can handle a network of 10,000 stations. I probably won't attempt that.
But what's the biggest job you have ever run a LS fit on (if you aren't NGS), and what is more typical for a moderately big job?
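For anyone curious what the core of such a program looks like, here's a minimal numpy sketch, pared down to a single unknown point and distance observations only. The coordinates, distances, and std errs below are all made up for illustration; a real network program would also handle angles, directions, and multiple unknown stations.

```python
import numpy as np

# Fixed control (E, N) and the distances observed from the unknown point P.
control = np.array([[1000.0, 1000.0],
                    [1500.0, 1040.0],
                    [1480.0, 1520.0],
                    [ 980.0, 1490.0]])
obs_dist = np.array([346.56, 334.21, 361.26, 353.83])  # hypothetical observations
sigma    = np.full(4, 0.01)                            # std err assigned to each distance
W        = np.diag(1.0 / sigma**2)                     # weight matrix

P = np.array([1240.0, 1250.0])             # initial approximation of the unknown point

for _ in range(10):                        # Gauss-Newton iterations on the linearized model
    computed = np.hypot(*(P - control).T)
    A  = (P - control) / computed[:, None] # Jacobian of distance w.r.t. (E, N) of P
    L  = obs_dist - computed               # misclosure: observed minus computed
    N  = A.T @ W @ A                       # normal equations
    dx = np.linalg.solve(N, A.T @ W @ L)
    P  = P + dx
    if np.max(np.abs(dx)) < 1e-6:
        break

v   = obs_dist - np.hypot(*(P - control).T)  # residuals
dof = len(obs_dist) - 2                      # observations minus unknowns
s0  = np.sqrt(v @ W @ v / dof)               # standard deviation of unit weight
Qxx = np.linalg.inv(N)                       # cofactor matrix of the unknowns
print("adjusted point:", P)
print("sigma_0:", s0)
print("std dev (E, N):", np.sqrt(np.diag(Qxx)))  # a priori; scaling by s0 is a separate choice
```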
What do you consider a moderately big job? 5 miles of mapping control for a freeway?
I have 6 files that I work in very regularly that are over 15k each.
PacSoft only allows files up to 32k, which is why we have multiple files now.
Carlson allows for 99,999 so I'm good with my new projects. 🙂
> But what's the biggest job you have ever run a LS fit on (if you aren't NGS), and what is more typical for a moderately big job?
Biggest file probably a few dozen control points, several dozen side-tied boundary points (multiple sets), and a few thousand side shot topo points. Rarely all at the same time.
If that's big to you and you run least squares to fit those points, then yes. How many points do you do LS on?
Do you perform LS on all those points, or only on a smaller control network?
Least Squares Hijack
> I've been tinkering with writing my own computer program to do least squares fitting, in 2D plane geometry, using an input file vaguely like that for Star*Net.
This has been bugging me for a while. Since using least squares presupposes a normal (Gaussian) distribution of error, has anyone ever done any empirical studies to see if this is actually the case with the small set sizes of a "typical" surveyor's control network (say, fewer than 50 stations)?
On the surface it would seem to me that with modern equipment you're more likely to get a majority of the observations bunched close to the true value with one or two outliers, resulting in some sort of heavy tailed distribution.
Only in the newer files. Some of those crds are 25 years old and weren't the tightest to begin with. Compass rule for simple loops between control points.
Least Squares Hijack
Even if the distribution is heavy in the tails, with a significant amount of redundancy you still get a "good" answer although it is not mathematically the exact optimum. Nobody seems to be talking about a better way to calculate a fit under those conditions.
LS has the big advantage over simple methods of adjustment that it can accept any sufficient combination of measurements and doesn't require a particular topology.
I have looked at the distribution of residuals (normalized by std err of each) for my 35 point example, with 250 actual angle observations and a few distances. The angles had std err set according to conditions of each batch (number of D&R etc) and a couple outliers were recognized and removed. The resulting residuals were very Gaussian out to the limit of the 250 observations. It would be interesting to do this for somebody's umpteen-k point file.
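If anyone wants to try the same check on their own output, something like this is enough. The residual and std err arrays here are just stand-ins for real adjustment results:

```python
import numpy as np

rng = np.random.default_rng(0)
v       = rng.normal(0.0, 3.0, 250)   # stand-in for 250 angle residuals (seconds)
std_err = np.full(250, 3.0)           # stand-in for each observation's std err
u = v / std_err                       # normalized residuals

expected = {1: 0.683, 2: 0.954, 3: 0.997}
for k in (1, 2, 3):
    frac = np.mean(np.abs(u) < k)
    print(f"within {k} sigma: {frac:.3f}  (normal predicts {expected[k]:.3f})")
```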
> But what's the biggest job you have ever run a LS fit on (if you aren't NGS), and what is more typical for a moderately big job?
I have one that currently has around 600 stations in it, excluding sideshots. Most of my networks are much smaller, with stations numbered in the dozens.
The old and much maligned BLM CMM, back in the early DOS era (say 1990-93), would handle over 2,000 points: 900 control points, 2,200 angles and distances, and 200 fixed azimuths. And it was working geodetically.
It was only limited by DOS-era memory constraints, which evolved through EMM memory and then extended memory. The above limits probably correspond to 8 MB (not GB!) of available memory. We had people who would create a 4 MB ramdisk from that wonderful extended memory and load the software and large data sets into it and run it, data, software, and all.
Can you run anything today in 4 MB of RAM?
The GMM application has been used to adjust record data for hundreds of townships, possibly up to 400 townships or so. This was record data, not angle and distance data.
A hundred townships would be on the order of 10,000 bearings and distances, probably 400 or more control points, and I would guess about 10,000 coordinated points. Again, back to about 1995.
When CMM was new in 1992 or so, a large township dataset would take about 8 hours to run on an 8088. By 2002 or so, with the massive increase in CPU power and memory, the same adjustment would take about 3 seconds, almost all of it attributable to CPU speed increases from 20 MHz to 2 GHz over that time frame.
We had a very efficient least squares engine, but efficiency has lost out to CPU speed, so inefficient processes do not seem that bad these days, and bad programs running small datasets don't look that bad either.
Just a few data points for you.
- jlw
It Is Only The Control Points That Are Adjusted
And all the side shots get translated and rotated to the adjusted control.
Essentially the matrices need only encompass the control points, but why remove all the side shots then bring them back after the adjustment? Keeping it all in the file saves considerable time.
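One way to picture that for a single side shot (coordinates below are hypothetical): keep the original angle and distance and re-lay them off the adjusted setup and backsight.

```python
import numpy as np

def azimuth(frm, to):
    dE, dN = to[0] - frm[0], to[1] - frm[1]
    return np.arctan2(dE, dN)            # surveying-style azimuth from north

setup_old, bs_old = np.array([5000.0, 5000.0]), np.array([5000.0, 5500.0])
setup_new, bs_new = np.array([5000.012, 4999.991]), np.array([5000.004, 5500.009])
shot_old          = np.array([5123.400, 5087.200])   # side shot from the old file

# angle and distance of the shot, as originally referenced to the backsight
dist  = np.hypot(*(shot_old - setup_old))
angle = azimuth(setup_old, shot_old) - azimuth(setup_old, bs_old)

# re-lay the same angle/distance off the adjusted setup and backsight
az_new   = azimuth(setup_new, bs_new) + angle
shot_new = setup_new + dist * np.array([np.sin(az_new), np.cos(az_new)])
print(shot_new)
```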
Other than as an academic exercise, why would you want to do this? Especially limiting it to 2D?
In the end commercial programs are quite a bargain.
Paul in PA
Paul
>It Is Only The Control Points That Are Adjusted
Agreed. I probably didn't ask the original question clearly enough and got mixed answers.
>Other than as an academic exercise
That's one of my motivations. I'm interested in how to tie all that textbook stuff together, and how to interpret what commercial programs are doing and assuming for you.
For instance, I noticed an interesting difference between Star*Net and the Wolf and Ghilani text regarding the error ellipses (neither necessarily wrong), that I may make another thread about.
>limiting it to 2D?
You have to start somewhere. It's useful, and it's simpler to write and debug than 3D and geodetic.
>commercial programs are quite a bargain
If you are doing adjustments for clients' projects, agreed. I'm retired and not going to pay for a commercial program, and I want to have something to run problems with (bigger than the 10-point demo).
LARGE Least Squares Adjustment Software - FREE
There is a lot of FREE adjustment software available from various government agencies.
Consider ADJUST from NGS and USHER from NGA and also check out AUSLIG's software.
My biggest field job in star*net so far is about 4000 observations, maybe 800 points. I use the direction set data type more than anything else. Lots of trig leveling with multiple shots in both faces.
StarNet Versus Wolf And Ghilani
I did some side by side adjustments years ago in both. One might notice a difference in 0.001' at some adjusted points, but it was nothing to write home about. I did not pay attention to differences in error ellipses, but I assume it is possible to get the exact same adjusted coordinates with considerably different error ellipses.
I would suggest you start with a perfect 6-point traverse with minimal redundancy, then redo it with only distance errors, only angle errors, and then both distance and angle errors introduced, to see which way the ellipses bulge.
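For what it's worth, the direction of the bulge falls straight out of the point's 2x2 covariance block: the semi-axes are the square roots of its eigenvalues, and the major axis points along the eigenvector of the larger one. A small sketch with made-up numbers:

```python
import numpy as np

cov = np.array([[0.0004, 0.0001],     # [[var_E, cov_EN],
                [0.0001, 0.0009]])    #  [cov_EN, var_N]]  (units: ft^2, hypothetical)
vals, vecs = np.linalg.eigh(cov)
semi_minor, semi_major = np.sqrt(vals)   # eigh returns eigenvalues in ascending order
major_dir = vecs[:, 1]                   # (E, N) direction of the major axis
az = np.degrees(np.arctan2(major_dir[0], major_dir[1])) % 180
print(f"semi-major {semi_major:.4f}, semi-minor {semi_minor:.4f}, azimuth {az:.1f} deg")
```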
Also verify that the differences are not caused by slight differences in estimated errors from program to program.
I know when you are trying to learn what is going on, a simple traverse can be quite intriguing. Doing things like doubling the estimated errors has less impact on the results than you might expect.
Paul in PA
One thing to keep in mind is that systematic errors in the measurements should be corrected before a least squares adjustment. Least squares then distributes the random errors, and it does that better for networks with higher redundancy.
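As a concrete (and entirely hypothetical) example of that kind of correction, applied to raw distances before they ever reach the adjustment:

```python
import numpy as np

raw_slope_dist = np.array([612.431, 487.226, 903.118])   # meters, hypothetical
zenith_deg     = np.array([ 91.250,  88.730,  90.420])

PRISM_CONSTANT = -0.030   # m, from calibration (hypothetical value)
SCALE_PPM      = 4.2      # ppm, from calibration (hypothetical value)

corrected  = (raw_slope_dist + PRISM_CONSTANT) * (1.0 + SCALE_PPM * 1e-6)
horizontal = corrected * np.sin(np.radians(zenith_deg))  # reduce to horizontal
print(horizontal)
```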
StarNet Versus Wolf And Ghilani
The key to the difference in error statistics is found in the Star*Net help info:
If the adjustment passes the Chi Square test, the adjusted coordinate standard deviations are used to compute the error ellipses directly.
However, if the adjustment fails the Chi Square test, the standard deviations are increased by the computed Total Error Factor, to reflect the weak adjustment.
What I was seeing was that the sigma_n and sigma_e values given in Wolf & Ghilani Figure 15.11 were not equal to the "Station Coordinate Standard Deviations" in Star*Net. The W&G numbers were smaller by the ratio of the "Standard Deviation of Unit Weight", also called sigma_0 or "Total Error Factor", which in this example is 0.698.
Wolf & Ghilani seem to always do this multiplication. Star*Net doesn't let it get so optimistic if you have a better fit than predicted.
I had planned to do something vaguely similar: take the more pessimistic of the two, either the std errs you supply with your observations or the overall fit results. I would just test sigma_0 and not use any value less than 1.0 as a multiplier. Star*Net doesn't multiply unless sigma_0 is over the upper Chi-Square limit.
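Roughly, the two conventions side by side with made-up numbers (the "Star*Net-like" line here is my simplified sigma_0 > 1 rule, not Star*Net's actual Chi-Square test):

```python
import numpy as np

sigma_0    = 0.698                      # std dev of unit weight from the fit
apriori_sd = np.array([0.012, 0.009])   # N, E std devs of a point (made up)

wg_style     = apriori_sd * sigma_0             # always multiply by sigma_0
starnet_like = apriori_sd * max(sigma_0, 1.0)   # never shrink below the a priori values
print(wg_style, starnet_like)
```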
-----------
I still can't make the 95% confidence values match between W&G Example 15.9 and Star*Net's "Station Coordinate Error Ellipse" output for that point. The factor is not 0.698, so it will take more investigation.
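One thing I plan to check (purely a guess at this point) is the expansion factor from the 1-sigma ellipse to the 95% ellipse: W&G build theirs from the F distribution with the adjustment's degrees of freedom, while a fixed chi-square-based factor of about 2.447 is also commonly used. The degrees of freedom below are made up.

```python
from scipy.stats import chi2, f

dof = 13                                   # degrees of freedom of the adjustment (made up)
c_chi2 = chi2.ppf(0.95, 2) ** 0.5          # ~2.4477, independent of dof
c_f    = (2 * f.ppf(0.95, 2, dof)) ** 0.5  # W&G-style factor, depends on dof
print(c_chi2, c_f)
```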