[TriLUG] help converting a perl script to zshell

Aaron S. Joyner aaron at joyner.ws
Tue Apr 13 22:04:48 EDT 2004


Okay, well, one obvious thing first.  The find command, as you have it 
and as I quoted it earlier, is busted.  The grep command, as you have it, 
works for me: it returns the files containing those characters and 
nothing else.  The only explanations I can offer for your result are a 
typo, or zsh doing something funny with metacharacter interpretation.  
Here's a direct copy-paste from my sample:

[asjoyner@bobjr good-test]$ ls
file1  file2  file3  file4  file5  file6  file7
[asjoyner@bobjr good-test]$ ls | grep '[]*?~=/&;!#$%^(){}<>[]'
[asjoyner@bobjr good-test]$ cd ../bad-test
[asjoyner@bobjr bad-test]$ ls -la
total 12
drwxr-xr-x    2 asjoyner asjoyner     4096 Apr 13 21:48 .
drwxrwxrwt   14 root     root         8192 Apr 13 21:46 ..
-rw-r--r--    1 asjoyner asjoyner        0 Apr 13 21:48 *
-rw-r--r--    1 asjoyner asjoyner        0 Apr 13 21:48 **
-rw-r--r--    1 asjoyner asjoyner        0 Apr 13 21:48 ***
-rw-r--r--    1 asjoyner asjoyner        0 Apr 13 21:48 Ba  D   F   I le.  zip
-rw-r--r--    1 asjoyner asjoyner        0 Apr 13 21:48 Ba & D \ @ F \ | I \ ! l e.  zip
-rw-r--r--    1 asjoyner asjoyner        0 Apr 13 21:48 BAD*(__)---}{FILE.ZIP
-rw-r--r--    1 asjoyner asjoyner        0 Apr 13 21:48 BaD@!#Fi@#^%zip.??
-rw-r--r--    1 asjoyner asjoyner        0 Apr 13 21:48 BaD#Fi@^;zip
-rw-r--r--    1 asjoyner asjoyner        0 Apr 13 21:48 Ba&D#F@^;zip.q
-rw-r--r--    1 asjoyner asjoyner        0 Apr 13 21:48 BA\"rm -rf *"\DFILE.SH
-rw-r--r--    1 asjoyner asjoyner        0 Apr 13 21:48 cd ${HOME}
-rw-r--r--    1 asjoyner asjoyner        0 Apr 13 21:48 \(date\)
-rw-r--r--    1 asjoyner asjoyner        0 Apr 13 21:48 $HOSTNAME
-rw-r--r--    1 asjoyner asjoyner        0 Apr 13 21:48 ${HOSTNAME}
[asjoyner@bobjr bad-test]$ ls | grep '[]*?~=/&;!#$%^(){}<>[]'
*
**
***
Ba & D \ @ F \ | I \ ! l e.  zip
BAD*(__)---}{FILE.ZIP
BaD@!#Fi@#^%zip.??
BaD#Fi@^;zip
Ba&D#F@^;zip.q
BA\"rm -rf *"\DFILE.SH
cd ${HOME}
\(date\)
$HOSTNAME
${HOSTNAME}
[asjoyner@bobjr bad-test]$

I'm interested to see what the reason for the difference turns out to 
be.  Unfortunately, as the wife is tugging at my collar to go to sleep, 
and I'm fighting a mean head cold, I don't have time to troubleshoot the 
find regex tonight.  Apparently my quick test of it earlier was wrong.
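
For what it's worth, one likely reason the find version matches 
everything: -regex is tested against the whole path, and every path find 
prints contains a /, which is in the character class.  A minimal sketch 
of a workaround, assuming GNU find (for -mindepth/-maxdepth); the 
function name list_bad_names is made up for illustration and is not from 
the thread.  It uses -name, which looks only at the last path component, 
with the same class minus the /:

```shell
# Hypothetical helper (not from the thread): print the names directly under
# directory "$1" that contain one of the listed metacharacters.
list_bad_names() {
    # -name is matched against the basename only, so '/' is left out of the
    # class; ']' comes first and '[' last so both are taken literally.
    find "$1" -mindepth 1 -maxdepth 1 \
        -name '*[]*?~=&;!#$%^(){}<>[]*' -exec basename {} \;
}
```

Called as list_bad_names ~/tmp, this should print only the offending 
names.  Like the grep version, it does not flag names whose only problem 
is a space.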

Aaron J.


Smith, Brett wrote:

>The problem is that find and grep are returning everything in the directory.
>
>ls | grep '[]*?~=/&;!#$%^(){}<>[]'
>and
>find ~/tmp -regex '.*[]*?~=/&;!#$%^(){}<>[].*' -exec echo {} \;
>both return all the files in the directory. Even the non-metacharacter
>files.
>
>
>-----Original Message-----
>From: Aaron S. Joyner [mailto:aaron at joyner.ws]
>Sent: Tuesday, April 13, 2004 8:50 PM
>To: Triangle Linux Users Group discussion list
>Subject: Re: [TriLUG] help converting a perl script to zshell
>
>
>Regular expressions to the rescue!  :)  Here's a regex (using plain-old 
>grep) to match precisely the char set you describe:
>grep '[]*?~=/&;!#$%^(){}<>[]'
>
>Note that the only tricky part is matching [ and ] -- you have to place 
>] as the first character, and [ as the last character, of the class.  
>The concept that makes the regular expression above work is a "character 
>class": anything enclosed in []'s is a set of characters, any one of 
>which counts as a match.  You essentially want to match if any of these 
>characters appears in the string, and that's a nice compact and 
>shell-expansion-safe way of writing it.  Enclosing the whole thing in 
>single quotes is what saves you from having to do all of the shell 
>escaping; you could have written your egrep example with about half as 
>many characters if you'd swapped the leading and ending double quotes 
>for single quotes and dropped all the backslashes.  You might also want 
>to weed out files with quotes in their names, which is only slightly 
>more difficult.  The easiest way is to do it as a separate grep, 
>something like this:
>grep '"'      or
>grep "'"
>(that's a double quote in single quotes, or a single quote in double 
>quotes, depending on which one you want to match)
>Or if you don't like nearly-illegible regex's:
>grep [\'\"]
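
A note on that last form: because it is unquoted, the shell treats 
[\'\"] as a glob before grep ever sees it.  In a directory that really 
does hold a file named ' or ", the pattern would be replaced by that file 
name, and zsh by default reports an error when a glob matches nothing.  
A minimal sketch of a quoted equivalent; the function name has_quote is 
hypothetical:

```shell
# Hypothetical helper: read names on stdin, print those containing ' or ".
has_quote() {
    # The class is wrapped in double quotes, so only the inner " needs a
    # backslash; the shell hands grep the pattern ["'] unchanged.
    grep "[\"']"
}
```

Usage would be along the lines of ls | has_quote.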
>
>Actually, I just noticed the find command that you included -- you were 
>really close with that one!  :)  Another way to do the same thing would 
>be this:
>find /your/directory -regex '.*[]*?~=/&;!#$%^(){}<>[].*'
>Note the .* at the beginning and end, and the use of -regex, not -iname.  
>-regex takes a regular expression, and the enclosing .*'s let the 
>character class match anywhere in the string (find's regex has to match 
>the whole path, as if it had a leading ^ and closing $ in traditional 
>regex terms).
>
>Hope that helps!
>
>Aaron S. Joyner
>
>PS - none of the regex's I included have a space in them.  If you want 
>to throw out files with spaces in the name, just chunk a space in the 
>middle of one of the regex's and it'll do the trick.  :)
>
>
>Smith, Brett wrote:
>
>>I will try your regex for zsh 
>>[^[:alpha:][:digit:]-/.]
>>the problem is the amount of metas I need to weed out.
>>These are the characters I need out. *?[]~=/&;!#$%^(){}<>
>>I can't get egrep or find to get all the files. White spaces are a problem.
>>
>>I realize it is my lack of knowledge concerning regex
>>
>>I did not put all the perl code up..sorry
>>this was the original
>>#!/usr/bin/perl
>>$dir = shift(@ARGV);
>>$glob = ($dir) ? $dir . "/*" : "*";
>>#print "- $glob -\n";
>>while (<${glob}>) {
>>$files++;
>>$file = $_;
>>s/\w+//g;
>>s/[\.\-\/]//g;
>>print $dirty++ . " $file\n" if ($_ ne "");
>>}
>>printf "Found $files files, $dirty with a problem in the name\n";
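
A rough port of that loop to plain shell, offered as a sketch rather 
than a drop-in replacement.  The function name report_dirty is 
hypothetical.  It assumes an sh-compatible shell and, unlike the Perl, 
tests only the file name and not the directory part of the path.  The 
case pattern plays the role of the two substitutions: anything left 
after removing word characters, '.' and '-' marks the name as dirty.

```shell
# Hypothetical port of the Perl loop (names are illustrative only).
report_dirty() {
    dir=${1:-.}
    files=0
    dirty=0
    for f in "$dir"/*; do
        # Skip the literal pattern when the directory is empty.
        [ -e "$f" ] || [ -L "$f" ] || continue
        files=$((files + 1))
        name=${f##*/}
        case $name in
            # Any character other than a letter, digit, '_', '.' or '-'
            # marks the name as dirty.
            *[!A-Za-z0-9_.-]*)
                printf '%s %s\n' "$dirty" "$f"
                dirty=$((dirty + 1))
                ;;
        esac
    done
    printf 'Found %s files, %s with a problem in the name\n' "$files" "$dirty"
}
```

The loop variable is deliberately not called path, since zsh ties that 
name to PATH.  Also note that zsh, by default, reports an error for a 
glob with no matches, so an empty directory needs extra care there.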
>>
>>
>>
>>this is the zsh function I tried but it misses a few files.
>>
>>chk4BD(){
>>#WORDCHARS=${WORDCHARS//[._-]}
>>#echo $WORDCHARS
>>#find $BDDIR -iname '*[+{;"\\=?~()<>&*|$ *]' -exec echo {} \;
>>ls $BDDIR|egrep "|\$|\#|\*|\+|\:|\@|\&|\/|\~|\%|\=|\{|\}|\^|\[|\]|\,|\S"
>>NUMBD=`echo chk4BD|wc|awk '{print $2}'`
>>}
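
Three things in that function look suspect, at least with GNU grep: the 
pattern starts with |, which is an empty alternative and matches every 
line; \S matches any non-blank character; and NUMBD counts the words in 
the literal string "chk4BD" rather than in the function's output.  A 
minimal sketch of a rework, assuming the goal is a count of bad names.  
It reuses the character class from earlier in the thread with a space 
added, and takes the directory as an argument instead of $BDDIR:

```shell
# Hypothetical rework of chk4BD: print how many names in directory "$1"
# contain a listed metacharacter or a space.
chk4BD() {
    # grep -c prints the number of matching lines; -e keeps the pattern
    # from ever being read as an option.
    ls "$1" | grep -c -e '[] *?~=&;!#$%^(){}<>[]'
}
```

The count can then be captured with NUMBD=$(chk4BD "$BDDIR").  Names 
containing newlines would be miscounted, since ls prints one line per 
name.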
>>
>>here is a list of the files I need to weed through plus I throw some good
>>files in for testing. Because I am going to tar and move the bad and the
>>good. I just need to test for the bad.
>>*
>>**
>>***
>>Ba  D   F   I le.  zip
>>Ba & D \ @ F \ | I \ ! l e.  zip
>>BAD*(__)---}{FILE.ZIP
>>BaD@!#Fi@#^%zip.??
>>BaD#Fi@^;zip
>>Ba&D#F@^;zip.q
>>BA\"rm -rf *"\DFILE.SH
>>cd ${HOME}
>>\(date\)
>>$HOSTNAME
>>${HOSTNAME}
>>\`rm\ -rf\ *`
>>rm -rf *
>>\'rm\ -rf\ \*' \(exec date\)BaD#Fi at E;zip
>>
>>
>>-----Original Message-----
>>From: Aaron S. Joyner [mailto:aaron at joyner.ws]
>>Sent: Tuesday, April 13, 2004 7:55 PM
>>To: Triangle Linux Users Group discussion list
>>Subject: Re: [TriLUG] help converting a perl script to zshell
>>
>>
>>One question about your perl code -- what does $files; do and why do you 
>>have $dirty as part of your string?  Neither are doing anything, I would 
>>guess that they're left-overs from some previous iteration where you 
>>were testing or doing something else in particular?
>>
>>If your description is correct, and you just want a list of files 
>>with meta characters in the name, that's actually quite easy to do by 
>>chaining together a few common *NIX utilities - consider this:
>>ls /your/directory | grep '[*?%$@#^!()]'
>>That will give you a list of every file whose name contains a character 
>>from the big gobbledygook of special characters.  You may of course add 
>>and remove them at will, as long as you leave the []'s, and don't put a 
>>^ as the first character after the [, or it will match precisely the 
>>opposite of what you expect.  If you want to accomplish the same task 
>>in pure zshell (say for inclusion on a tiny distro or something), then 
>>more thought will be required.  I don't do that much thought unless 
>>it's really required.  :)  Unless zsh has support for regexes or some 
>>pretty good pattern matching or substitution, it's going to be 
>>difficult, at best.
>>
>>Okay I don't know much about zsh so I got curious.  A quick skim of "A 
>>User's Guide to ZSH" turned up that you could probably do what you want 
>>with a zsh pattern match like this:
>>[^[:alpha:][:digit:]-/.]
>>getting the list of files into a zsh variable, and the rest of the 
>>fluff, is left as an exercise to the reader (think: @dir = `ls`).
>>
>>Aaron J.
>>
>>Smith, Brett wrote:
>>
>>
>>>Guys~
>>>I need some help with a zshell script. I have searched the web and just
>>>haven't found an answer.  I am trying to match metacharacters in filenames
>>>(on my ftp server) so I can mark them as bad and move them. I have
>>>everything else done in zsh. 
>>>I have the perl code to do it but I really need to convert it to zsh (or
>>>even bash). Here is the perl code.
>>>#!/usr/bin/perl
>>>$dir = shift(@ARGV);
>>>$glob = ($dir) ? $dir . "/*" : "*";
>>>while (<${glob}>) {
>>>$files;
>>>$file = $_;
>>>s/\w+//g;
>>>s/[\.\-\/]//g;
>>>print $dirty . " $file\n" if ($_ ne "");
>>>}
>>>If anyone knows the answer or could help point me in the right direction I
>>>would appreciate it. (zsh IRC channel on freenode was devoid of human
>>>interaction)
>>>Thanks,
>>>
>>>Brett Smith
>>>IS Team 
>>>Bloodhound, Inc.
>>>2520 Meridian Parkway, Suite 500
>>>Durham, N.C.  27713
>>>(919) 313-1619
>>>bsmith at bloodhoundinc.com
>>>
>>>
>>>
>>>This email message is for the sole use of the intended recipient(s) and
>>>may contain confidential and privileged information of Bloodhound
>>>Software, Inc. Any unauthorized review, use, disclosure is prohibited.
>>>If you are not the intended recipient, please contact the sender by
>>>reply email and destroy all copies of the original message.



