A fellow surveyor and I need to access survey coordinate job files that were created many years ago. We have print-outs of the coordinates for all 700 or so jobs, and we need the most efficient way to get at those coordinates.
We recently heard about OCR and thought "that's exactly what we need"!
But is it?
We are hoping to create ASCII text files from those old paper print-outs. The software we use in our day-to-day work is Benchmark. (Don't anyone laugh.) Benchmark allows for the importing of ASCII text files.
We find Benchmark works just fine for the type of work we do, and since we are both old timers we don't want to learn anything new. So you vendors out there reading this, please don't try to sell us new computing software.
We just want input from surveyors who may have used this OCR thing to create ascii text files.
If there are old timers out there who still use Benchmark, and who have been in similar situations, please help us solve our problem.
Thanks in advance.
For your application, accuracy would be a primary consideration. My experience with several OCR programs has been mixed in that regard. Others may have had better luck.
I have had much better (more accurate) results scanning the original hard copy into a pdf file and then copying and pasting from the pdf to a Word document or text file. Unless the original is severely degraded, I get almost 100% accuracy with this technique, something I never got with OCR software.
I used a version of OCR software that came with a desktop scanner a few years back to convert some technical special provisions into a Word document. I found that it was somewhere around 95 to 98% accurate. I then looked around to see if there was anything more reliable and found that what I had was pretty much as good as anything commercially available.
The documents that I scanned were very legible typewritten specs. Even as such, the software did mis-recognize a fair amount of text.
So, bottom line, you will need to spend a good amount of time proofing the conversion, because the software does make mistakes.
But, it was a time saver over having to type everything by hand.
I recently used the online website "free-ocr.com". I scanned a document in German, uploaded to OCR, and used Google translate to get it into English, and it worked just fine (or about 98%).
For what you want, "free-ocr.com" will get you text files quite accurately and easily. Since it's free, you are limited to 10 OCRs per hour and you have to upload a scan of your page at 300 dpi or better.
Give it a try.
I've used http://www.onlineocr.net/ many times for coordinate files, title reports, and long legal descriptions. It isn't perfect but still a big time saver.
I would be somewhat leery about OCR. How will you ever know if an "8" was scanned as a "0"?
If you do scan it, can you change the text size and type of the coordinates? Some fonts are more easily read than others.
What I have done with old coordinate lists is 10-key them into an Excel worksheet.
If the numbers are sequential then you can make a formula to add one in the point number column of each row.
Then I key northing, Enter, northing, Enter, northing, Enter, and so on down the column.
Then I go back to the top and key easting, Enter, easting, Enter, and so on.
It is surprisingly fast, for me anyway. I learned to 10-key in the early 80s entering topo notes into a reduction program.
After I have all the coordinates in I save as a CSV file.
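If you would rather not build the worksheet by hand, the same column-by-column idea can be sketched in a few lines of Python. This is only a rough illustration: the function names, the three-decimal formatting, and the "point,northing,easting" column order are my assumptions, so check what your Benchmark import actually expects before relying on it.

```python
import csv
import io

def build_rows(start_point, northings, eastings):
    """Pair each northing with its easting and number the points sequentially."""
    if len(northings) != len(eastings):
        raise ValueError("northing and easting columns differ in length")
    return [
        (start_point + i, f"{n:.3f}", f"{e:.3f}")
        for i, (n, e) in enumerate(zip(northings, eastings))
    ]

def to_csv_text(rows):
    """Render the rows as comma-delimited text (assumed column order: point, N, E)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()

# Made-up sample values, keyed one column at a time as described above.
northings = [5000.000, 5123.456, 5250.100]
eastings = [10000.000, 10087.654, 10150.250]

rows = build_rows(1, northings, eastings)
csv_text = to_csv_text(rows)
print(csv_text)
```

The length check is there because keying one column at a time makes it easy to skip a value; a mismatch in the two column lengths is the first sign of that.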
I think Dave has the best idea; I've used it. I would suggest only doing the files as needed and dropping all the unnecessary digits.
If you have good, clear pdf's of your "printouts" Adobe Acrobat will do an awesome job of OCR'ing your coordinate lists. Perhaps not 100% error free, but pretty close. You will have fewer errors by far than you would have transcription errors.
Dirty, crinkly copies of copies - not so much. But good clear files printed or scanned to pdf's OCR very reliably.
If this is worth something to you, it's worth not trying it with some downloaded freeware. Acrobat does cost a bit of money ($300) but you do get value. It's as easy as opening the file and then saving as text.
If you would like, email me a sample pdf and I'll run it for you, so you can compare with the original.
So maybe these "old dudes" don't have the 10 key skills. They can hire a person that does and she will have the numbers keyed in in no time. Great suggestion, Dave.
Adobe Pro has OCR capability built in. Print and scan the coordinates at a high resolution.
Run an OCR. Copy and paste the resulting text into an Excel spreadsheet.
Use Save As on the Excel file to create the ASCII file you need.
For those worried about the accuracy of the OCR, print the original scanned documents on a clear film and the OCR'ed coordinates on regular paper. You can then lay one on top of the other to very quickly check for anomalies.
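A digital cousin of the overlay check, for anyone so inclined: run the same page through two independent passes (say, two different OCR tools, or one OCR pass and one hand-keyed pass) and let the computer point out the lines that disagree. The sketch below is only an illustration under that assumption; the function name and sample data are made up. It will not catch an error that both passes happen to make the same way.

```python
def compare_passes(first, second):
    """Return (line_number, first_text, second_text) for each line that differs."""
    a = first.splitlines()
    b = second.splitlines()
    diffs = []
    for i in range(max(len(a), len(b))):
        # A missing line in the shorter pass is compared as an empty string.
        left = a[i].strip() if i < len(a) else ""
        right = b[i].strip() if i < len(b) else ""
        if left != right:
            diffs.append((i + 1, left, right))
    return diffs

# Made-up sample: the second pass read an "8" as a "0" on line 2.
ocr_pass = "1,5000.000,10000.000\n2,5123.456,10087.654\n3,5250.100,10150.250\n"
second_pass = "1,5000.000,10000.000\n2,5123.456,10007.654\n3,5250.100,10150.250\n"

mismatches = compare_passes(ocr_pass, second_pass)
for line_no, left, right in mismatches:
    print(f"line {line_no}: {left!r} vs {right!r}")
```

Any line it reports still has to be checked against the paper original, since the comparison only tells you the two passes disagree, not which one is right.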
My experience has been very near perfect accuracy.
With all due respect to those who think they can "10 key" the information faster and more accurately than this process, I heartily disagree. Maybe one can do a few coordinates or a page or two. But 700 jobs? No contest. Automate, man, automate.
Larry P
I'm with the general consensus. OCR isn't perfect, and you have to edit carefully. But I don't discount it. Learn what to watch out for. A letter "O" can look a lot like the number zero. A lowercase letter "l" can look like the number one (1). Just double-check everything when you use OCR.
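The letter-for-digit mix-ups are the easy ones to catch automatically, since a coordinate field should contain nothing but digits, an optional minus sign, and a decimal point. Here is a minimal sketch, assuming comma-delimited lines with all-numeric point numbers (both assumptions on my part, as are the function names). Note that it cannot catch one digit misread as another digit, such as an "8" read as a "0".

```python
import re

# A field is accepted only if it looks like a plain number, e.g. 12 or -5123.456
NUMERIC_FIELD = re.compile(r"^-?\d+(\.\d+)?$")

def suspicious_fields(line, delimiter=","):
    """Return (field_position, text) for every field that is not a plain number."""
    fields = [f.strip() for f in line.split(delimiter)]
    return [
        (idx, f)
        for idx, f in enumerate(fields, start=1)
        if not NUMERIC_FIELD.match(f)
    ]

def scan(text):
    """Map each line number with a problem to its list of suspicious fields."""
    report = {}
    for n, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        bad = suspicious_fields(line)
        if bad:
            report[n] = bad
    return report

# Made-up sample: line 2 has a letter "l" for a 1, line 3 a letter "O" for a 0.
sample = "1,5000.000,10000.000\n2,5l23.456,10087.654\n3,5250.100,1O150.250\n"
report = scan(sample)
for line_no, bad in report.items():
    print(f"line {line_no}: check {bad}")
```

If your point numbers include letters, that first column would need to be exempted from the check.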
I agree with Norman OK. I have used the Adobe Acrobat character recognition. It seems to work pretty well. When you scan a document into Acrobat, you need to tell it whether you want to use OCR or want it to come in as basically a picture. Depending on the quality of the hard copy, things like odd characters close together or wrinkled paper can make the OCR software think it is seeing some odd character in Times New Roman or some other font. You might see a change of font styles in the document you brought in.
You could also let Excel read the numbers out loud as you sit back with the printouts in front of you. Set on FAST.
Also, if you are getting someone to check numbers, embed some intentional mistakes. Works great for quality control to make sure they are actually checking.
You may already have OCR software
Microsoft Document Imaging - a free tool within the MS-Office suite - has it built in.
Thanks to all of you who had some good ideas.
Jim
> We recently heard about OCR and thought "thats exactly what we need"!
I suggest hand-writing each set of coordinates 100 times, each on a separate piece of paper.
Wait, you said OCR, not OCD. Never mind.
Jim
you forgot...you have to wash your hands after each coordinate 😉
Jim
> you forgot...you have to wash your hands after each coordinate 😉
That would have been me last night after cleaning my plotter spittoon. Washed 'em at least a dozen times with different combinations of solvents (including the universal solvent) and detergents. Still have multi-colored hands today, but it's fading.
THEY CALLED IT "GOOP"
BACK IN THE DAY AUTO MECHANICS USED SOMETHING BRAND-NAMED "GOOP" TO CLEAN THEIR GREASY HANDS. There is probably a similar product still made and available at your local auto repair store.
Liquid dishwashing soap works very nicely for removing grease and ink. But you have to use the proper washing sequence. The sequence goes like this:
1 Do not run your hands under hot water first
2 The FIRST thing you want to do is put plenty of dishwashing liquid on your hands
3 Work the inky or greasy areas until you see the soap is being saturated with the grease and/or ink
4 Use a sparing amount of cool water to wash away the dirty soap. Do not thoroughly wash all the soap film off yet
5 Add more dishwashing liquid, scrub, lightly rinse, and repeat until you are done.
We all know that oil and water do not mix. So what happens when you stick your greasy and/or inky hands under a blast of hot water is two things:
1 The grease is forced to bind fast to itself and the natural oils in your hands
2 Your hands are heated up and the tissue expands and the grease sinks deeper into your hands.
As I write this, I think Crisco might work as well...