#1
search between specific times in a file
7/2/2008 9:49 AM, Dave wrote:
> I have a data file like this:
>
> Fri Jun 27 00:21:22 2008 2.372715 -59.46 341.998375 NA NA NA NA NA NA
> Fri Jun 27 00:21:23 2008 2.953534 NA NA -49.28 341.998250 NA NA NA NA
> Fri Jun 27 00:21:23 2008 3.551102 NA NA NA NA -58.45 341.998250 NA NA
> Fri Jun 27 00:21:24 2008 4.102576 NA NA NA NA NA NA -50.72 341.998250
> Fri Jun 27 00:21:25 2008 4.653693 -57.85 341.998250 NA NA NA NA NA NA
> Fri Jun 27 00:21:25 2008 5.233784 NA NA -51.57 341.998250 NA NA NA NA
> Fri Jun 27 00:21:26 2008 5.783372 NA NA NA NA -54.75 341.998250 NA NA
> Fri Jun 27 00:21:26 2008 6.331417 NA NA NA NA NA NA -52.60 341.998250
> Fri Jun 27 00:21:27 2008 6.912254 -52.77 341.998125 NA NA NA NA NA NA
> Fri Jun 27 00:21:27 2008 7.472123 NA NA -50.72 341.998250 NA NA NA NA
> Fri Jun 27 00:21:28 2008 8.053296 NA NA NA NA -53.51 341.998250 NA NA
> Fri Jun 27 00:21:28 2008 8.604176 NA NA NA NA NA NA -51.51 341.998250
> Fri Jun 27 00:21:29 2008 9.183829 -60.16 341.998125 NA NA NA NA NA NA
> Fri Jun 27 00:21:30 2008 9.761417 NA NA -50.10 341.998250 NA NA NA NA
> Fri Jun 27 00:21:30 2008 10.312063 NA NA NA NA -57.42 341.998250 NA NA
> Fri Jun 27 00:21:31 2008 10.860867 NA NA NA NA NA NA -51.45 341.998250
> Fri Jun 27 00:21:31 2008 11.441953 -54.70 341.998250 NA NA NA NA NA NA
> Fri Jun 27 00:21:32 2008 12.023939 NA NA -50.30 341.998250 NA NA NA NA
>
> The time covered in a data file is only a few hours, so any hour:min:sec will be unique - I don't need to worry about days wrapping around. Also, the date will always have two digits for the hour, two for the minute and two for the second, even if those numbers are zero. So one will *not* find something like
> 1:23:02 or 21:3:19
> such times would be written as
> 01:23:02 and 21:03:19.
>
> I'd like to create a second file, which has a subset of these times, perhaps between 00:21:23 and 00:21:56, between 00:29:23 and 00:30:29, 01:03:05 and 03:04:07 etc.
>
> The data file always looks similar to above - one of the other columns is never going to have a date in it.
>
> Any ideas of a Unix tool to do this?

awk '/00:21:23/,/00:21:56/' file

Ed.
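The question asks for several time windows, while the one-liner in this reply extracts a single one. A minimal sketch of one way to extend it, assuming the windows do not overlap (a line that falls inside two ranges would be printed twice) and that every start and end timestamp actually occurs in the file. The scratch directory and the names `sample.txt` and `subset.txt` are made up for the illustration; the rows are copied from the sample in the question.

```shell
# Scratch directory and file names are hypothetical, chosen for this illustration.
tmp=$(mktemp -d)

# Nine rows copied from the sample data in the question.
cat > "$tmp/sample.txt" <<'EOF'
Fri Jun 27 00:21:22 2008 2.372715 -59.46 341.998375 NA NA NA NA NA NA
Fri Jun 27 00:21:23 2008 2.953534 NA NA -49.28 341.998250 NA NA NA NA
Fri Jun 27 00:21:23 2008 3.551102 NA NA NA NA -58.45 341.998250 NA NA
Fri Jun 27 00:21:24 2008 4.102576 NA NA NA NA NA NA -50.72 341.998250
Fri Jun 27 00:21:25 2008 4.653693 -57.85 341.998250 NA NA NA NA NA NA
Fri Jun 27 00:21:25 2008 5.233784 NA NA -51.57 341.998250 NA NA NA NA
Fri Jun 27 00:21:26 2008 5.783372 NA NA NA NA -54.75 341.998250 NA NA
Fri Jun 27 00:21:26 2008 6.331417 NA NA NA NA NA NA -52.60 341.998250
Fri Jun 27 00:21:27 2008 6.912254 -52.77 341.998125 NA NA NA NA NA NA
EOF

# One range pattern per wanted window. A pattern with no action prints the
# matching lines, so each window is copied to the output in file order.
awk '
/00:21:23/,/00:21:24/
/00:21:26/,/00:21:27/
' "$tmp/sample.txt" > "$tmp/subset.txt"

cat "$tmp/subset.txt"
```

Each additional window is one more line in the awk program, which keeps the whole extraction in a single pass over the input.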
#2
I have a data file like this:

Fri Jun 27 00:21:22 2008 2.372715 -59.46 341.998375 NA NA NA NA NA NA
Fri Jun 27 00:21:23 2008 2.953534 NA NA -49.28 341.998250 NA NA NA NA
Fri Jun 27 00:21:23 2008 3.551102 NA NA NA NA -58.45 341.998250 NA NA
Fri Jun 27 00:21:24 2008 4.102576 NA NA NA NA NA NA -50.72 341.998250
Fri Jun 27 00:21:25 2008 4.653693 -57.85 341.998250 NA NA NA NA NA NA
Fri Jun 27 00:21:25 2008 5.233784 NA NA -51.57 341.998250 NA NA NA NA
Fri Jun 27 00:21:26 2008 5.783372 NA NA NA NA -54.75 341.998250 NA NA
Fri Jun 27 00:21:26 2008 6.331417 NA NA NA NA NA NA -52.60 341.998250
Fri Jun 27 00:21:27 2008 6.912254 -52.77 341.998125 NA NA NA NA NA NA
Fri Jun 27 00:21:27 2008 7.472123 NA NA -50.72 341.998250 NA NA NA NA
Fri Jun 27 00:21:28 2008 8.053296 NA NA NA NA -53.51 341.998250 NA NA
Fri Jun 27 00:21:28 2008 8.604176 NA NA NA NA NA NA -51.51 341.998250
Fri Jun 27 00:21:29 2008 9.183829 -60.16 341.998125 NA NA NA NA NA NA
Fri Jun 27 00:21:30 2008 9.761417 NA NA -50.10 341.998250 NA NA NA NA
Fri Jun 27 00:21:30 2008 10.312063 NA NA NA NA -57.42 341.998250 NA NA
Fri Jun 27 00:21:31 2008 10.860867 NA NA NA NA NA NA -51.45 341.998250
Fri Jun 27 00:21:31 2008 11.441953 -54.70 341.998250 NA NA NA NA NA NA
Fri Jun 27 00:21:32 2008 12.023939 NA NA -50.30 341.998250 NA NA NA NA

The time covered in a data file is only a few hours, so any hour:min:sec will be unique - I don't need to worry about days wrapping around. Also, the date will always have two digits for the hour, two for the minute and two for the second, even if those numbers are zero. So one will *not* find something like
1:23:02 or 21:3:19
such times would be written as
01:23:02 and 21:03:19.

I'd like to create a second file, which has a subset of these times, perhaps between 00:21:23 and 00:21:56, between 00:29:23 and 00:30:29, 01:03:05 and 03:04:07 etc.

The data file always looks similar to above - one of the other columns is never going to have a date in it.

Any ideas of a Unix tool to do this?
#3
Dave wrote:
> Ed Morton wrote:
>>> Any ideas of a Unix tool to do this?
>>
>> awk '/00:21:23/,/00:21:56/' file
>>
>> Ed.
>
> Thanks, I also found:
>
> sed -n '/03:45:14/,/03:50:56/p' infile > outfile
>
> works too.

Beware that the sed version will not work if /start/ and /end/ have the same value and there is only a single line with that timestamp in the file, whereas awk will work even in that corner case (but that may not be a problem for you).

-- 
echo 0|sed 's909=#3u)o19;s0#0ooo)];s()(0bu}=(;s#}#.1m"?0^2{#;
s)")9v2@3%"9$);so%op]t(p$e#!o;sz(z^+.z;su+ur!z"au;sxzxd?_{h)cx;:b;
s/\(\(.\).\)\(\(\)*\)\(\(.\).\)\(\(\)\6.*\2.*\)/\5\3\1\7/;
tb'|awk '{while((i+=2)<=length($1)-18)a=a substr($1,i,1);print a}'
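The corner case described in this post can be reproduced on a few rows of the sample data. This is only a sketch: the scratch directory, the file name `sample.txt` and the variables `sed_lines` and `awk_lines` are made up for the illustration. It relies on the range rule that sed starts looking for the end pattern on the line after the one that matched the start pattern, whereas awk also tests the end pattern against the start line itself.

```shell
# Scratch directory and file name are hypothetical, chosen for this illustration.
tmp=$(mktemp -d)

# Six rows from the sample data; only one of them carries 00:21:24.
cat > "$tmp/sample.txt" <<'EOF'
Fri Jun 27 00:21:22 2008 2.372715 -59.46 341.998375 NA NA NA NA NA NA
Fri Jun 27 00:21:23 2008 2.953534 NA NA -49.28 341.998250 NA NA NA NA
Fri Jun 27 00:21:23 2008 3.551102 NA NA NA NA -58.45 341.998250 NA NA
Fri Jun 27 00:21:24 2008 4.102576 NA NA NA NA NA NA -50.72 341.998250
Fri Jun 27 00:21:25 2008 4.653693 -57.85 341.998250 NA NA NA NA NA NA
Fri Jun 27 00:21:25 2008 5.233784 NA NA -51.57 341.998250 NA NA NA NA
EOF

# sed: the end pattern is searched for only after the start line, no later
# line matches it, so the range stays open until the end of the file.
sed_lines=$(sed -n '/00:21:24/,/00:21:24/p' "$tmp/sample.txt" | wc -l | tr -d ' ')

# awk: the end pattern is also tested on the start line, so the range
# opens and closes on that single line.
awk_lines=$(awk '/00:21:24/,/00:21:24/' "$tmp/sample.txt" | wc -l | tr -d ' ')

echo "sed printed $sed_lines lines, awk printed $awk_lines lines"
```

On this input sed prints the 00:21:24 line plus the two lines after it, while awk prints the 00:21:24 line alone.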
#4
Ed Morton wrote:
>> Any ideas of a Unix tool to do this?
>
> awk '/00:21:23/,/00:21:56/' file
>
> Ed.

Thanks, I also found:

sed -n '/03:45:14/,/03:50:56/p' infile > outfile

works too. So there are at least two ways of doing it. My files are relatively small, so speed is not an issue, but I guess one solution or the other might be faster for large files.

Dave
#5
Dave B wrote:
> Dave wrote:
>> Ed Morton wrote:
>>>> Any ideas of a Unix tool to do this?
>>>
>>> awk '/00:21:23/,/00:21:56/' file
>>>
>>> Ed.
>>
>> Thanks, I also found:
>>
>> sed -n '/03:45:14/,/03:50:56/p' infile > outfile
>>
>> works too.
>
> Beware that the sed version will not work if /start/ and /end/ have the same value and there is only a single line with that timestamp in the file, whereas awk will work even in that corner case (but that may not be a problem for you).

The awk solution seems noticeably slower though.

For one set I have just done, the input file is about 5000 lines, and the output is about 500 lines. The sed solution seems to be "instant" where the awk solution takes a couple of tenths of a second. It's hard to measure such small times accurately, but the sed solution is definitely quicker. The awk solution might be quicker if the input is larger/smaller or the output larger/smaller. I can't be bothered to test it though.

I'm using a Sun Blade 2000 running Solaris 10. The two CPUs run at 1200 MHz.
#6
Dave B wrote:
> Dave wrote:
>> Ed Morton wrote:
>>>> Any ideas of a Unix tool to do this?
>>>
>>> awk '/00:21:23/,/00:21:56/' file
>>>
>>> Ed.
>>
>> Thanks, I also found:
>>
>> sed -n '/03:45:14/,/03:50:56/p' infile > outfile
>>
>> works too.
>
> Beware that the sed version will not work if /start/ and /end/ have the same value and there is only a single line with that timestamp in the file, whereas awk will work even in that corner case (but that may not be a problem for you).

Thank you. That is not relevant to me, as the lengths of data I want are always 3 seconds or more, and there is at least one timestamp every second - often more than one. So I doubt any section I want will be less than about 5 lines or so.

But I will try to file that information away in the gray matter for future reference.
#7
Dave wrote:
> Dave B wrote:
>> Dave wrote:
>>> Ed Morton wrote:
>>>>> Any ideas of a Unix tool to do this?
>>>>
>>>> awk '/00:21:23/,/00:21:56/' file
>>>>
>>>> Ed.
>>>
>>> Thanks, I also found:
>>>
>>> sed -n '/03:45:14/,/03:50:56/p' infile > outfile
>>>
>>> works too.
>>
>> Beware that the sed version will not work if /start/ and /end/ have the same
>> value and there is only a single line with that timestamp in the file,
>> whereas awk will work even in that corner case (but that may not be a
>> problem for you).
>
> The awk solution seems noticeably slower though.
>
> For one set I have just done, the input file is about 5000 lines, and the output is about 500 lines. The sed solution seems to be "instant" where the awk solution takes a couple of tenths of a second. It's hard to measure such small times accurately, but the sed solution is definitely quicker. The awk solution might be quicker if the input is larger/smaller or the output larger/smaller. I can't be bothered to test it though.
>
> I'm using a Sun Blade 2000 running Solaris 10. The two CPUs run at 1200 MHz.

On one of my systems (linux x86) I don't see a noticeable difference:

$ tim.sh 20 sed -n '/00:21:23/,/00:21:56/p' file
0.04315
$ tim.sh 20 awk '/00:21:23/,/00:21:56/' file
0.0471

That runs each solution 20 times, and computes the average timings ("file" has ~5000 lines and the specified range selects ~500 lines, as per your example).

Also, please note that if there are many lines in the file whose timestamp is /end/, only the first one will be printed (both with sed and with awk). Again, that may not be a problem for you, but I guess it's better to know.
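If every line carrying the end timestamp should be kept, one possible alternative (not proposed in the thread) is to compare the time field directly instead of using a range pattern. The sketch below assumes the time is always the fourth whitespace-separated field and is always zero-padded, as the original question states, so that plain string comparison orders the times correctly. The variable names `t0` and `t1`, the scratch directory and the file name `sample.txt` are made up for the illustration.

```shell
# Scratch directory and file name are hypothetical, chosen for this illustration.
tmp=$(mktemp -d)

# Four rows from the sample data; two of them carry 00:21:30.
cat > "$tmp/sample.txt" <<'EOF'
Fri Jun 27 00:21:29 2008 9.183829 -60.16 341.998125 NA NA NA NA NA NA
Fri Jun 27 00:21:30 2008 9.761417 NA NA -50.10 341.998250 NA NA NA NA
Fri Jun 27 00:21:30 2008 10.312063 NA NA NA NA -57.42 341.998250 NA NA
Fri Jun 27 00:21:31 2008 10.860867 NA NA NA NA NA NA -51.45 341.998250
EOF

# Range pattern: stops at the first line matching the end timestamp.
range_lines=$(awk '/00:21:29/,/00:21:30/' "$tmp/sample.txt" | wc -l | tr -d ' ')

# Field comparison: keeps every line whose time ($4) lies in [t0, t1].
# The values are not numeric, so awk compares them as strings.
field_lines=$(awk -v t0=00:21:29 -v t1=00:21:30 '$4 >= t0 && $4 <= t1' \
    "$tmp/sample.txt" | wc -l | tr -d ' ')

echo "range pattern: $range_lines lines, field comparison: $field_lines lines"
```

The field comparison also selects lines when the exact start or end timestamp never appears in the file, which a range pattern cannot do. It depends on the fixed field position, though, so it is less forgiving than the regular expressions if the layout ever changes.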