I was recently called back for additional work on a survey done about four years ago. The backup for this project had failed. When I discovered this, my thought was to recreate the electronic field book files from printouts using Optical Character Recognition (OCR) software.
Well, the results were not good. In some places the OCR output matches the printout exactly, and in other places it is extremely confusing.
I think the software likes letters more than numbers. It changes all the ones in Face 1 commands to the letter I, so I fixed that with a global replace. Then somewhere in the middle it just goes crazy.
If you look at the result, you may see what I mean.
Historic boundaries and conservation efforts.
I'm seeing .fbk format. Please don't show me that now. I'm trying to eat lunch here.
Oops. Please forget I posted this and enjoy your lunch.
I find screwy OCR results often in my hobby of genealogy. One source says a ship arriving at Ellis Island was the Unislellim. Another source, where the fellow riding on that ship provided the information, says it was the Constellation. A search for Unislellim produced one other return. That was for the same ship. No doubt it was the Constellation.
OCR works better with larger character sizes, and with some fonts better than others. We should keep that in mind when printing out data records like this. I wonder whether it would work any better if you scanned it and enlarged the print?
I have found that this really does help the OCR do its thing.
I use OneNote since that's the most convenient method on my work PC, and it works remarkably well when I drag and resize a snapshot to make the text a lot bigger.
I tried stretching the image in OneNote, but it still produces crazily formatted text. I'll just have to edit it by hand, but thanks for the suggestions.
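For anyone facing the same cleanup, part of the editing could probably be scripted. Below is a rough Python sketch, not a tested tool for field book files. It assumes the damage is mostly letters standing in for digits (I or l for 1, O for 0) inside tokens that are otherwise numeric, and it assumes the Face 1 command comes out of the OCR as the token `FI` when it should be `F1`. The names `LOOKALIKES`, `COMMAND_FIXES`, `fix_token`, and `fix_line` are made up for illustration, and the sample line is invented, not taken from a real file.

```python
import re

# Assumed, partial list of letters that OCR substitutes for digits.
LOOKALIKES = str.maketrans({"I": "1", "l": "1", "O": "0"})

# Hypothetical whole-token fixes; "FI" -> "F1" is an assumption about
# how a Face 1 command appears after OCR.
COMMAND_FIXES = {"FI": "F1"}

# A token with at least one real digit and otherwise only digits,
# lookalike letters, decimal points, or signs.
NUMERIC_LIKE = re.compile(r"(?=.*\d)[\dIlO.+-]+")


def fix_token(token):
    """Repair one whitespace-delimited token, leaving real words alone."""
    if token in COMMAND_FIXES:
        return COMMAND_FIXES[token]
    if NUMERIC_LIKE.fullmatch(token):
        return token.translate(LOOKALIKES)
    return token


def fix_line(line):
    """Apply fix_token to every token, preserving the original spacing."""
    return re.sub(r"\S+", lambda m: fix_token(m.group()), line)


if __name__ == "__main__":
    # Invented sample line for demonstration only.
    print(fix_line("FI VA I0l 9O.l234"))
```

Tokens with no real digit in them (point descriptions, a lone I) are left untouched, so this will not catch everything, and the places where the OCR goes completely off the rails would still need hand editing against the printout.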