AI Assistant
Notifications
Clear all

How to Open Huge Text File

19 Posts
12 Users
0 Reactions
584 Views
jeff-wright
(@jeff-wright)
Posts: 82
Member
Topic starter
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

Hi,

I have a huge data set that I'm trying to use for a recreational project that my dad and I are working on. The data file that I have is a large text file (comma separated values) that contains latitude, longitude, and depth in each row. The problem is that there are about 10 million rows of data! The file is too large for excel to open fully. I can open it in wordpad, but it runs so slow that it is essentially nonfunctional.

Those 10 million rows are bathymetry data, collected in 1996, of the entire Santa Monica Bay. I'm only interested in a small subset of the data. Specifically, I only want the points that lie between 33.95deg and 33.98deg latitude by 118.49deg and 118.52deg longitude. Once I get that subset of data, I'm going to convert it to UTM grid (zone 11) so I can make rough comparisons to nautical charts. Then, I'm going to use my handheld gps to find some of the spots represented in the data and scuba dive in those spots.

My question is: how can I extract that data from a file that is so big that I can't open it with exel?

Here's a link to the data from the USGS, http://geopubs.wr.usgs.gov/open-file/of02-162/data/bathy.zip

Thanks,
Jeff


 
Posted : January 11, 2012 9:29 pm
Farsites
(@farsites)
Posts: 267
Member
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

I found this site to be quite useful. Opened soem rather large LIDAR files foillowing their advice


 
Posted : January 11, 2012 9:54 pm
loyal
(@loyal)
Posts: 3735
Member
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

Got it (I think).

About 1MB unzipped, 193k zipped.

Where do I send it?

Loyal


 
Posted : January 11, 2012 10:02 pm
paulplatano
(@paulplatano)
Posts: 293
Member
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

If you had dbase or Foxpro, you could read a comma delimited.
Ten million records, I am not sure. With one command you
could make a new file with the smaller subset. Then you can
sort by lat or long with the subset file. Import to ACAD
and your maybe your cogo.


 
Posted : January 11, 2012 10:07 pm
BigE
 BigE
(@bige)
Posts: 2685
Member
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

Go look for NotePad++
I would be suprised if it can't handle it.


 
Posted : January 11, 2012 10:15 pm

jeff-wright
(@jeff-wright)
Posts: 82
Member
Topic starter
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

Thanks Loyal! The email address in my profile is current.
I'll also play around with some of the suggestions from the other posters. Thanks for all of the replies!

Jeff


 
Posted : January 11, 2012 10:30 pm
loyal
(@loyal)
Posts: 3735
Member
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

On its way Jeff

34,566 points (if I did it right).

I didn't reply directly to your post, so you should still be able to edit your email address out if you want to.

Loyal


 
Posted : January 11, 2012 10:40 pm
amdomag
(@amdomag)
Posts: 654
Member
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

What Excel version are you using? I haven't tried a file of that size but what I know is that Excel 2010 has greater data capacity than 2003.


 
Posted : January 11, 2012 11:31 pm
amdomag
(@amdomag)
Posts: 654
Member
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

If Excel 2010 can't help you then you must consult a database guy in your community. Database engines can handle huge amount of datasets with ease. Using SQL, your desired data can be retrieved in no time.


 
Posted : January 11, 2012 11:38 pm
christ-lambrecht
(@christ-lambrecht)
Posts: 1409
Member
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

Would be interesting to know how you did it ...
I was thinking of importing in Access and filtering on Lat/Lon ?

Chr.


 
Posted : January 11, 2012 11:56 pm

paulplatano
(@paulplatano)
Posts: 293
Member
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

If you got SQL, it would handle your file. Right on, AM!


 
Posted : January 12, 2012 12:47 am
jimcox
(@jimcox)
Posts: 2102
Member
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

There is a neat little programming language called Awk - Gawk is a free public domain version

Its designed to process text files just like this


 
Posted : January 12, 2012 2:49 am
loyal
(@loyal)
Posts: 3735
Member
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

Christ

The file was a simple three-field comma delimited ASCII text file containing 294,435,351 bytes (see below).


-118.63956,34.02068,-34.272
-118.63941,34.02068,-34.272
-118.63925,34.02068,-34.272
-118.63909,34.02068,-34.272
-118.63830,34.02068,-34.207
-118.63814,34.02068,-34.207
-118.63798,34.02068,-34.207
-118.63687,34.02068,-34.376
-118.63671,34.02068,-34.376

I just wrote a quick BASIC program on my DOS machine and compiled it (took about a minute):


OPEN "i", #1, "bathy.csv"
OPEN "o", #2, "Jeff.csv"

1 IF EOF(1) THEN 4
INPUT #1, Lon#, Lat#, D
IF Lon# > -118.52 AND Lon# < -118.49 THEN 2 ELSE 1
2 IF Lat# > 33.95 AND Lat# < 33.98 THEN 3 ELSE 1

3 WRITE #2, Lon#, Lat#, D
IF EOF(1) THEN 4
n = n + 1
GOTO 1

4 PRINT n: CLEAR : SYSTEM

Moved the executable to this Windoz Internet machine and ran it (about another minute), and BINGO.

It actually took less time to write/compile the program and run it, than it did to download and unzip the original file!

Loyal


 
Posted : January 12, 2012 8:18 am
dshearon
(@dshearon)
Posts: 195
Member
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

It does. Notepad ++ is awesome software.


 
Posted : January 12, 2012 8:55 am
christ-lambrecht
(@christ-lambrecht)
Posts: 1409
Member
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

Christ

Whow,

genius way to solve a memory problem on our Windows MB machines and do it on an old Dos machine with Q basic!

Chr.


 
Posted : January 12, 2012 9:17 am

jeff-wright
(@jeff-wright)
Posts: 82
Member
Topic starter
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

I'm using the 2007 version of excel.


 
Posted : January 12, 2012 8:04 pm
bill93
(@bill93)
Posts: 9977
Member
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

I like Loyal's solution.

I've used Awk and it is indeed powerful. It can do the same processing on a bunch of files in one command. But it has a super-condensed syntax that will take you a few tries to get right even if you are a programmer in other languages.


 
Posted : January 13, 2012 10:06 am
DavidALee
(@davidalee)
Posts: 1116
Member
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

DOS

We use these batch programs for many things, they are very handy. We use them for renaming pics, editing GPS data and many other things as well. One of my favorites is one that runs through a rinex file and will strip out certain time periods, delete sat's, strip cycle slips, whatever you tell it to do.


 
Posted : January 13, 2012 10:55 am
DeletedUser
(@deleted-user)
Posts: 8340
Member
Translate
English
Spanish
French
German
Italian
Portuguese
Russian
Chinese
Japanese
Korean
Arabic
Hindi
Dutch
Polish
Turkish
Vietnamese
Thai
Swedish
Danish
Finnish
Norwegian
Czech
Hungarian
Romanian
Greek
Hebrew
Indonesian
Malay
Ukrainian
Bulgarian
Croatian
Slovak
Slovenian
Serbian
Lithuanian
Latvian
Estonian
 

DOS

I could use some help from one of you, if you are willing please email me at my profile email. Its a small programming challenge that I am working on. Thanks.
Cliff


 
Posted : January 13, 2012 11:06 am