Unix/Linux
 
  #1  
Old July 30th, 2008, 07:19 AM
Andrew McDermott
Comparing files

On Monday 28 July 2008 14:42 GS wrote:

> Hi all,
> I have two files like this:
>
> file A:
>
> line1
> line2
> line3
> line4
>
> file B:
>
> line5
> line3
> line2
>
> I want to get only lines from file A that do not appear in file B:
>
> line1
> line4
>
> How can I accomplish this without looping through the lines of both files?
> Is there a unix command to do it quickly? I tried "comm" and "uniq" but I
> cannot get what I want.
>
> Thanks
> Guido

sort fileA fileB fileB | uniq -u

uniq -u prints only lines that are not repeated. Because file B is included
twice, every line of file B occurs at least twice in the sorted output and is
dropped, leaving only the lines found in file A alone.

Andrew
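
For anyone reading along, here is a minimal sketch of the trick above run against the sample data from the question. The scratch directory and the `result` variable are only there for illustration and are not part of the original post.

```shell
# Recreate the two sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' line1 line2 line3 line4 > "$workdir/fileA"
printf '%s\n' line5 line3 line2 > "$workdir/fileB"

# fileB is listed twice, so every fileB line occurs at least twice after
# sorting, and uniq -u (print only lines that are not repeated) drops it.
result=$(sort "$workdir/fileA" "$workdir/fileB" "$workdir/fileB" | uniq -u)
printf '%s\n' "$result"   # prints line1 and line4

rm -r "$workdir"
```

Note that the output comes back in sorted order, not in the original order of fileA.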

  #2  
Old July 30th, 2008, 07:19 AM
Loki Harfagr

Wed, 30 Jul 2008 08:18:12 +0100, Andrew McDermott did cat :

> sort fileA fileB fileB | uniq -u
>
> uniq -u prints only unique lines. By including file B twice any lines
> unique to file B will end up being duplicated.
>
> Andrew

Excellent!
This form *may* put less strain on the system in the case of huge files:
$ sort fileA <(sort fileB fileB) | uniq -u
(though it will be a bit slower because of the two sort steps)

  #3  
Old July 30th, 2008, 07:00 PM
Ed Morton

7/30/2008 2:18 AM, Andrew McDermott wrote:

> sort fileA fileB fileB | uniq -u
>
> uniq -u prints only unique lines. By including file B twice any lines unique
> to file B will end up being duplicated.

But any lines that appear multiple times in fileA will be discarded even if they
don't appear in fileB.

Ed.
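
A small sketch of the problem Ed describes, for illustration only. The input is hypothetical: a line `dup` appears twice in fileA and never in fileB, so it should be reported, yet the `sort ... | uniq -u` pipeline loses it.

```shell
workdir=$(mktemp -d)
# Hypothetical input: "dup" occurs twice in fileA and never in fileB.
printf '%s\n' line1 dup dup line2 > "$workdir/fileA"
printf '%s\n' line2 line5 > "$workdir/fileB"

# "dup" is repeated within fileA itself, so uniq -u treats it like a
# line that also came from fileB and discards it.
result=$(sort "$workdir/fileA" "$workdir/fileB" "$workdir/fileB" | uniq -u)
printf '%s\n' "$result"   # prints only line1; "dup" is lost

rm -r "$workdir"
```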


  #4  
Old July 30th, 2008, 07:00 PM
wick end

> But any lines that appear multiple times in fileA will be discarded even if they
> don't appear in fileB.
>
> Ed.

Indeed!
It seemed too easy. I should think twice.

  #5  
Old July 31st, 2008, 07:02 AM
Loki Harfagr

Wed, 30 Jul 2008 17:24:59 -0500, Ed Morton did cat :

> 7/30/2008 2:18 AM, Andrew McDermott wrote:
>> sort fileA fileB fileB | uniq -u
>>
>> uniq -u prints only unique lines. By including file B twice any lines
>> unique to file B will end up being duplicated.
>
> But any lines that appear multiple times in fileA will be discarded even
> if they don't appear in fileB.
>
> Ed.

Then this should cure it:

$ sort <(sort FileA | uniq) <(sort FileB FileB) | uniq -u


  #6  
Old July 31st, 2008, 09:39 AM
Ed Morton

7/31/2008 3:58 AM, Loki Harfagr wrote:
> Wed, 30 Jul 2008 17:24:59 -0500, Ed Morton did cat :
>> But any lines that appear multiple times in fileA will be discarded even
>> if they don't appear in fileB.
>
> then this should cure it:
>
> $ sort <(sort FileA | uniq ) <(sort FileB FileB ) | uniq -u


You could just use "sort -u FileA" instead of "sort FileA | uniq", but the OP
probably doesn't want to get rid of duplicate lines from FileA anyway.

Ed.
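
A rough sketch of both points, under some assumptions: the input is the same hypothetical `dup` example (not data from the thread), and the pipeline is written with a pipe and `-` rather than process substitution so that it also runs in a plain POSIX shell.

```shell
workdir=$(mktemp -d)
# Hypothetical input: "dup" occurs twice in fileA and never in fileB.
printf '%s\n' line1 dup dup line2 > "$workdir/fileA"
printf '%s\n' line2 line5 > "$workdir/fileB"

# sort -u deduplicates fileA first; "-" makes the second sort read that
# result from standard input alongside the two copies of fileB.
result=$(sort -u "$workdir/fileA" | sort - "$workdir/fileB" "$workdir/fileB" | uniq -u)
printf '%s\n' "$result"   # prints dup and line1, each exactly once

rm -r "$workdir"
```

`dup` now survives, but only one copy of it, which is the side effect pointed out above: duplicate lines from fileA are collapsed.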


  #7  
Old July 31st, 2008, 10:19 AM
Loki Harfagr

Thu, 31 Jul 2008 08:18:08 -0500, Ed Morton did cat :

> 7/31/2008 3:58 AM, Loki Harfagr wrote:
>> then this should cure it:
>>
>> $ sort <(sort FileA | uniq ) <(sort FileB FileB ) | uniq -u
>
> You could just use "sort -u FileA" instead of "sort FileA | uniq",

That's right, I use the counting form '( sort - | uniq -c )' so often
that I easily forget about the "recent" extensions ;-)

> but the OP probably doesn't want to get rid of duplicate lines from FileA
> anyway.

Well, in this case I don't see an easier direct toolbox solution than
$ comm -23 <(sort fileA) <(sort <(sort fileB) <(sort fileB))
but that's not really an "easy" one :-) so I'd use an awk script
(OK, or another scripting language that has hashes and/or sorters).
But as the OP's sample was too small to tell whether the fileA data could
be unique and/or pre-sorted, I'll rest my case ;D)

(If the files are not too big, Stephane's "grep -Fxvf fileB fileA" is certainly a good way to go.)
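
For illustration, a minimal sketch of the two alternatives mentioned above. The awk one-liner is one possible way to write the hash-based script, not necessarily what was meant in the post, and the sample input with its repeated `dup` line is hypothetical.

```shell
workdir=$(mktemp -d)
# Hypothetical input: fileA is unsorted and contains "dup" twice.
printf '%s\n' line1 dup line2 dup line4 > "$workdir/fileA"
printf '%s\n' line5 line3 line2 > "$workdir/fileB"

# awk: while reading the first file named (fileB), store each line as a
# hash key; lines of the second file (fileA) print only if not stored.
# Caveat: the NR == FNR test assumes fileB is not empty.
awk_result=$(awk 'NR == FNR { seen[$0]; next } !($0 in seen)' "$workdir/fileB" "$workdir/fileA")

# grep: -F fixed strings, -x whole-line match, -v invert the match,
# -f read the patterns from fileB.
grep_result=$(grep -Fxvf "$workdir/fileB" "$workdir/fileA")

printf '%s\n' "$awk_result"   # prints line1, dup, dup, line4

rm -r "$workdir"
```

Both variants keep duplicate lines and the original order of fileA, and neither needs the input to be sorted.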
