Unix/Linux
 
  #1  
Old June 25th, 2008, 12:09 PM
Dave B
trying to wrestle sort and/or awk (not successfully anyway ;-))

Albretch Mueller wrote:

> The reason why I need this is because I need to sort some data (directory
> structures) first on the directory depth (a numeric value) and then,
> alphabetically, using the actual directory path
>
> I am using find in order to get the initial data
>
> find . -type f -printf '%T@ %A@ %C@ %M %n %u %g %s %d %h %f ' | awk '{ \
> print("\042"$12"\042" \
> "\054"$1 \
> "\054"$2 \
> "\054"$3 \
> "\054""\042"$4"\042" \
> "\054"$5 \
> "\054""\042"$6"\042" \
> "\054""\042"$7"\042" \
> "\054"$8 \
> "\054"$9 \
> "\054""\042"$10"\042" \
> "\054""\042"$11"\042" \
> "\054""\042"$13"\042"); }'
>
> but then sort does not sort on one field as numeric and the other
> alphabetically
>
> And/or I am not getting it right/I am missing something fundamental here

First, you should at least put \n at the end of find's printf format string,
or you'll end up with a single line of input. Then, assuming your filenames
do not contain newlines, you can do

find . -type f -printf '%T@ %A@ %C@ %M %n %u %g %s %d %h %f\n' |
LC_ALL=C sort -k 9,9n -k 10

if by "directory path" you mean everything from the 10th field to the end of
the line. If you want to sort only on the directory path (10th field), then use

LC_ALL=C sort -k 9,9n -k 10,10

but beware that there might be spaces in the names, so the 10th field may
contain only part of the directory name.
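
To see how the two keys interact, here is a minimal sketch on made-up input. The function name sort_depth_then_path and the sample lines are hypothetical; only the 9th field (depth) and everything from the 10th field onward matter for the ordering.

```shell
# Sort whitespace-separated lines on field 9 as a number, then on
# everything from field 10 to the end of the line, in byte order.
sort_depth_then_path() {
    LC_ALL=C sort -k 9,9n -k 10
}

# Made-up sample: eight filler fields, then depth, directory, file name.
printf '%s\n' \
    't a c m n u g s 2 ./b x' \
    't a c m n u g s 1 ./z y' \
    't a c m n u g s 2 ./a x' |
    sort_depth_then_path
```

The depth 1 line should come out first, followed by the two depth 2 lines with ./a ahead of ./b.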

--
echo 0|sed 's909=#3u)o19;s0#0ooo)];s()(0bu}=(;s#}#.1m"?0^2{#;
s)")9v2@3%"9$);so%op]t(p$e#!o;sz(z^+.z;su+ur!z"au;sxzxd?_{h)cx;:b;
s/\(\(.\).\)\(\(\)*\)\(\(.\).\)\(\(\)\6.*\2.*\)/\5\3\1\7/;
tb'|awk '{while((i+=2)<=length($1)-18)a=a substr($1,i,1);print a}'

  #2  
Old June 25th, 2008, 12:10 PM
Albretch Mueller

Dave B wrote:

> but beware that there might be spaces in the names, so the 10th field may
> contain only part of the directory name.

Well, this is why (wrongly or not) I was using awk. I thought that if I put the
last field in parentheses, and since parentheses are not allowed in
directory paths anyway, awk would process everything between the
parentheses, that is, the whole path

Am I right on that one?

Thanks
lbrtchx

  #3  
Old June 25th, 2008, 01:30 PM
Dave B

Albretch Mueller wrote:

> Dave B wrote:
>
>> but beware that there might be spaces in the names, so the 10th field may
>> contain only part of the directory name.
>
> Well, this is why (wrongly or not) I was using awk. I thought if I have the
> last field under parentheses, and since parentheses are not allowed in
> directory paths anyway,

Parentheses are allowed.

> awk would process everything between the parentheses, that means the whole path

In find's printf, why not use %p instead of %h+%f? This way, you just
sort numerically on the 9th field, alphabetically on the 10th field, and you
are done.
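
As a rough illustration of the %p idea, here is a cut-down sketch that prints only the depth and the full path. It assumes GNU find (for -printf) and names without newlines; list_by_depth is a hypothetical helper name, and the field numbers differ from the 11-field format above because only two directives are used.

```shell
# Print "<depth> <path>" for every regular file under "$1", sorted by
# depth (numeric) and then by everything from the path to the end of line.
list_by_depth() {
    find "$1" -type f -printf '%d %p\n' | LC_ALL=C sort -k 1,1n -k 2
}

# Example call (the directory is just a placeholder):
# list_by_depth .
```

Because the second key runs to the end of the line, spaces inside the path do not cut the key short.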

  #4  
Old June 25th, 2008, 01:30 PM
Dave B

Dave B wrote:

> sort numerically on the 9th field, alphabetically on the 10th field, and you

That should be "alphabetically from the 10th field to the end".

  #5  
Old June 25th, 2008, 06:49 PM
Albretch Mueller

Dave B wrote:

>>> but beware that there might be spaces in the names, so the 10th field
>>> may contain only part of the directory name.
>>
>> Well, this is why (wrongly or not) I was using awk. I thought if I have
>> the last field under parentheses, and since parentheses are not allowed in
>> directory paths anyway,
>
> Parentheses are allowed.

~
Do you mean in directory path/file names? Which FS allows them?
~
Or do you mean in sort, in the conventional way in which all characters
from the start to the end of the parentheses are taken into account?
~
>> awk would process everything between the parentheses, that means the whole
>> path
>
> In the find's printf, why not use %p instead of %h+%f? This way, you just
> sort numerically on the 9th field, alphabetically on the 10th field, and
> you are done.

~
Actually I am using %p to do the sorting, as you suggested, but then
I crop that field because I don't really need it if I have %h + %f
~
Also, I need %h + %f separate because I will then get all directories, sort
and index them, and then use the indexes to substitute the path names in the
file that contains the 'found' files
~
Let me test and polish my silly script a little more so you guys can take a
look at it
~
Thanks
lbrtchx



  #6  
Old June 25th, 2008, 08:49 PM
Albretch Mueller

Maxwell Lol wrote:

> mkdir 'a()'
~
Well, you are right. The thing is that I would never have file names like
that, but of course it isn't really about silly me ;-)
~
sh-3.1# ls -l
total 7980
drwxr-xr-x 2 root root 4096 Jun 25 09:34 "a"
drwxr-xr-x 2 root root 4096 Jun 25 09:34 a()
. .
~
Is there a way to use find safely so that it copes with all these, I would
say, weird cases?
~
lbrtchx


  #7  
Old June 27th, 2008, 09:09 AM
Dave B

Albretch Mueller wrote:

> Yes, it does work! I was just messing with some file names
> ~
> Thanks

Note that if you have filenames with commas, you will sort only on the first
part of the name, before the first comma (no, double quotes do not protect
against that). You had better use -k 5 to use all the fields from the 5th to
the end of the line. If, on the other hand, you do not have filenames with
commas, you can safely drop the double quotes, since commas will already
separate the fields.
Bottom line: in any case, you don't need double quotes.
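
The difference between -k 5,5 and -k 5 can be sketched on two made-up lines whose names contain a comma. The function names and the sample data are hypothetical; the behaviour described assumes GNU sort.

```shell
# Key is field 5 only: a comma inside the name ends the key early.
sort_on_field5_only() {
    LC_ALL=C sort -t, -k 4,4n -k 5,5
}

# Key runs from field 5 to the end of the line: the whole name is compared.
sort_from_field5() {
    LC_ALL=C sort -t, -k 4,4n -k 5
}

# Two made-up records with the same depth and names "a,b" and "a,z".
sample() {
    printf '%s\n' '"x",1,"root",1,"a,b"' '"d",1,"root",1,"a,z"'
}

sample | sort_on_field5_only
sample | sort_from_field5
```

With -k 5,5 both keys are just "a, so the tie is broken by comparing the whole lines and the "a,z" record ends up first. With -k 5 the full names are compared and "a,b" comes first, quotes or no quotes.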

  #8  
Old June 27th, 2008, 09:09 AM
Albretch Mueller

Dave B wrote:

> Albretch Mueller wrote:
>
> find . -type f -printf '%T@ %A@ %C@ %M %n %u %g %s %d %h %f\n' |
> LC_ALL=C sort -k 9,9n -k 10
>
> if by "directory path" you mean from the 10th to the end of the line. If
> you want to sort only on the directory path (10th field), then use
>
> LC_ALL=C sort -k 9,9n -k 10,10
>
> but beware that there might be spaces in the names, so the 10th field may
> contain only part of the directory name.


~
I am still not getting it right somehow
~
sort/your script:
~
sort -t, -k 4,4n -k 5,5 <file_name>
~
doesn't sort the 4th column as numeric and the 5th as text
~
"drwxr-xr-x",16,"root",0,""
"drwx",3,"root",1,".thumbnails"
"drwx",2,"root",2,".thumbnails/normal"
"drwxr-xr-x",2,"root",1,".mcop"
"drwxr-xr-x",2,"root",1,"Desktop"
"drwxr-xr-x",7,"root",1,".kde"
"drwx",4,"root",2,".kde/cache-Knoppix"
"drwx",2,"root",3,".kde/cache-Knoppix/favicons"
"drwx",2,"root",3,".kde/cache-Knoppix/background"
"drwx",2,"root",2,".kde/tmp-Knoppix"
"drwx",2,"root",2,".kde/socket-Knoppix"
"drwxr-xr-x",11,"root",2,".kde/share"
"drwx",2,"root",3,".kde/share/servicetypes"
"drwxr-xr-x",2,"root",3,".kde/share/services"
"drwxr-xr-x",5,"root",3,".kde/share/mimelnk"
"drwxr-xr-x",2,"root",4,".kde/share/mimelnk/video"
"drwxr-xr-x",2,"root",4,".kde/share/mimelnk/audio"
"drwxr-xr-x",2,"root",4,".kde/share/mimelnk/application"
"drwxr-xr-x",3,"root",3,".kde/share/icons"
"drwxr-xr-x",2,"root",4,".kde/share/icons/favicons"
"drwxr-xr-x",5,"root",3,".kde/share/fonts"
"drwxr-xr-x",2,"root",4,".kde/share/fonts/override"
"drwxr-xr-x",4,"root",3,".kde/share/config"
"drwxr-xr-x",2,"root",4,".kde/share/config/session"
"drwxr-xr-x",2,"root",4,".kde/share/config/colors"
"drwxr-xr-x",4,"root",3,".kde/share/cache"
"drwxr-xr-x",15,"root",4,".kde/share/cache/http"
"drwxr-xr-x",2,"root",5,".kde/share/cache/http/t"
"drwxr-xr-x",2,"root",5,".kde/share/cache/http/s"
"drwxr-xr-x",2,"root",5,".kde/share/cache/http/p"
"drwxr-xr-x",2,"root",5,".kde/share/cache/http/a"
"drwxr-xr-x",2,"root",4,".kde/share/cache/favicons"
"drwxr-xr-x",18,"root",3,".kde/share/apps"
"drwx",2,"root",4,".kde/share/apps/konsole"
"drwxr-xr-x",2,"root",4,""
"drwxr-xr-x",3,"root",3,".kde/share/applnk"
"drwxr-xr-x",2,"root",4,".kde/share/applnk/.hidden"
"drwxr-xr-x",2,"root",2,".kde/Autostart"
"drwxr-xr-x",4,"root",1,".mozilla"
"drwxr-xr-x",3,"root",2,".mozilla/knoppix"
"drwxr-xr-x",2,"root",3,".mozilla/knoppix/ujixazk6.slt"
"drwxr-xr-x",4,"root",2,".mozilla/firefox"
"drwx",6,"root",3,""
"drwxr-xr-x",2,"root",4,""
"drwxr-xr-x",2,"root",3,""
"drwxr-xr-x",2,"root",1,".gnome_private"
"drwxr-xr-x",3,"root",1,".gnome"
"drwxr-xr-x",2,"root",2,".gnome/accels"
"drwxr-xr-x",2,"root",1,"tmp"
"drwxr-xr-x",2,"root",1,".xmms"
"drwxr-xr-x",2,"root",1,".xine"
"drwxr-xr-x",2,"root",1,".qt"
"drwxr-xr-x",3,"root",1,".local"
"drwxr-xr-x",4,"root",2,".local/share"
"drwx",4,"root",3,".local/share/Trash"
"drwx",2,"root",4,".local/share/Trash/files"
"drwx",2,"root",4,".local/share/Trash/info"
"drwxr-xr-x",2,"root",3,".local/share/applications"
"drwxr-xr-x",2,"root",1,".links"
"drwxr-xr-x",21,"root",1,".gimp-2.2"
"drwxr-xr-x",2,"root",2,".gimp-2.2/tool-options"
"drwxr-xr-x",2,"root",2,".gimp-2.2/tmp"
"drwxr-xr-x",2,"root",2,".gimp-2.2/curves"
"drwxr-xr-x",2,"root",2,".gimp-2.2/brushes"


  #9  
Old June 28th, 2008, 05:09 PM
Albretch Mueller

Dave B wrote:

> Note that if you have filenames . . .
~
OK, what do you do in order to avoid all those kinds of nuances that can
happen with file path names, which conflict with other utilities?
~
I think the sorting part can be safely managed by somehow including a
temporary column with a hexadecimal representation of the string, but of
course you cannot feed the exec part of a find statement with that
~
Feeding "find" a directory path that contains spaces works fine if you do
it right on the command line:
~
sh-3.1# find "/home/root/New Folder () & %% ^/New Folder" -type f -exec
md5sum {} \;
/home/root/New Folder () & %% ^/New
Folder/Text File~
/home/root/New Folder () & %% ^/New
Folder/Text File
~
But if you (actually -I- couldn't do it anyway) try crafting that same
statement as a script
~
#!/bin/bash
START_DIR="/home/root/New Folder () & %% ^/New Folder"
START_DIR="\"/home/root/New Folder () & %% ^/New Folder\""
START_DIR="\'/home/root/New Folder () & %% ^/New Folder\'"

find ${START_DIR} -type f -exec md5sum {} \;
~
I expectedly got:
~
sh-3.1# sh ./script00.sh
find: invalid predicate `()'
~
What would you do to make sure that find does not stumble on such cases?
~
thanks
lbrtchx
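
The script most likely fails because ${START_DIR} is expanded without double quotes, so the shell splits the value at each space and find receives "()" as a separate argument. Quote characters stored inside the variable do not help, since they are ordinary characters by the time the value is expanded. A minimal sketch of the quoted form follows; checksum_files is a hypothetical helper name.

```shell
# Run md5sum on every regular file under "$1".
# The double quotes keep the directory name as one argument to find,
# whatever spaces, parentheses or ampersands it contains.
checksum_files() {
    find "$1" -type f -exec md5sum {} \;
}

# Example call, using the path from the post above:
# START_DIR="/home/root/New Folder () & %% ^/New Folder"
# checksum_files "$START_DIR"
```

The {} in -exec is passed to md5sum as a single argument by find itself, so the file names need no extra quoting there.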


  #10  
Old June 28th, 2008, 09:49 PM
Albretch Mueller

It seems to be working just fine for this basic script

But when I used some formatting via awk and stuff, it did not seem to like it

More to come
lbrtchx


  #11  
Old June 30th, 2008, 06:09 AM
pk

Monday 30 June 2008 03:06, Albretch Mueller wrote:

> Basically what is happening, as I see it, is that when file names contain
> spaces in a statement containing some -exec and/or awk processing, the
> processing part is not being fed the actual name of the file

Let's try to make things easy. If you have NO spaces in your names, then
just use find -printf and awk will see all the correct fields:

$ echo "field1 field2 field3" | awk '{for(i=1;i<=NF;i++) print $i}'
field1
field2
field3

If you DO have spaces, then just use a different separator, something that
does not appear elsewhere in the input (eg, a comma), and tell awk what
that separator is:

$ echo "field with space,field2,field3 space" | \
awk -F, '{for(i=1;i<=NF;i++) print $i}'
field with space
field2
field3 space

How do you produce a comma separated list with find's printf? Just do

-printf '%T@,%A@,%C@,%M,%n,%u,%g,%s,%d,%h,%f\n' | awk -F,

(the \n at the end of the printf format string is important, and I see it's
missing in one place in your post).
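
Putting the two halves together, a cut-down sketch might look like this. It assumes GNU find and that no path contains a comma or a newline; list_depth_and_dir is a hypothetical helper name, and only three of the directives are used to keep the output short.

```shell
# Print "<depth>: <directory>" for every regular file under "$1".
# find writes comma-separated fields; awk splits on the comma, so spaces
# inside the directory name stay within a single field.
list_depth_and_dir() {
    find "$1" -type f -printf '%d,%h,%f\n' |
        awk -F, '{ print $1 ": " $2 }'
}

# Example call (the directory is just a placeholder):
# list_depth_and_dir "/home/root/New Folder"
```

If commas can occur in the names, another separator that never appears in them would have to be chosen instead.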

--
All the commands are tested with bash and GNU tools, so they may use
nonstandard features. I try to mention when something is nonstandard (if
I'm aware of that), but I may miss something. Corrections are welcome.

  #12  
Old July 2nd, 2008, 11:30 PM
Albretch Mueller

Kenny McCormack wrote:

> I think it means that it got all huffy and threatened to hold its breath
> until it got its way. When that didn't work, it probably took its toys
> and went home.

~
I don't get what the deal is about getting huffy with toys or so seriously
cobbling some script, but here is what I came up with, which totally suits
my needs, in case someone else is looking for something similar:

In short, pk oversimplified my intentions and showed me something that
worked, but it was not what I was looking to do

#!/bin/bash

# __ starting directory (the second assignment overrides the first)
_BRX_DIR="/ramdisk/home/root"
_BRX_DIR="/media/sda1"

# __ timestamp used as a prefix for the output file names
_DATE=$(date +%Y%m%d%H%M%S)

# __ file metadata, as csv
_FLS_DATA="${_DATE}.fs.data.txt"

# __ file md5 signatures
_FLS_SIGN="${_DATE}.fs.sign.txt"

# __ sorted directory listing
OUT_DIRS="${_DATE}.dirs.txt"

# __
echo "Starting Directory Branch: ${_BRX_DIR}"
echo "Files data: ${_FLS_DATA}"
echo "Files signatures: ${_FLS_SIGN}"
echo "Directories: ${OUT_DIRS}"

# ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ SNAPSHOT OF FILES WITHOUT MD5SUM
# __ getting files formatted as csv
find "${_BRX_DIR}" -type f -printf '%T@,%A@,%C@,"%F","%M",%n,"%u","%g",%s,%d,"%h","%f","%P"\n' > "${_FLS_DATA}"

# ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ MD5SUMs
find "${_BRX_DIR}" -type f -print0 | xargs -0 -n1 md5sum -b > "${_FLS_SIGN}"

# ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ SNAPSHOT OF FILES' DIRECTORIES
find "${_BRX_DIR}" -type d -printf '%T@,%A@,%C@,"%M",%n,"%u","%g",%d,"%P"\n' > "${OUT_DIRS}.2sort.tmp"

# __ sorting on depth (numeric) and then on directory path (alpha)
sort -t, -k 8,8n -k 9,9 "${OUT_DIRS}.2sort.tmp" > "${OUT_DIRS}"

# __ removing the temporary unsorted listing
rm "${OUT_DIRS}.2sort.tmp"


