[TriLUG] help converting a perl script to zshell
Aaron S. Joyner
aaron at joyner.ws
Tue Apr 13 20:49:56 EDT 2004
Regular expressions to the rescue! :) Here's a regex (using plain-old
grep) to match precisely the char set you describe:
grep '[]*?~=/&;!#$%^(){}<>[]'
Note that the only tricky part is matching [ and ] -- the ] has to be
the first character in the class to be taken literally, and the very end
of the class is a safe spot for the [.
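If you want to see the class in action before pointing it at real
files, something like this should do. It's only a sketch: the sample
names are made up for illustration (a couple are borrowed from your
list), and "pattern", "names" and "bad" are just variable names I picked.

```shell
#!/bin/sh
# The class from above; ] comes first so it is taken literally.
pattern='[]*?~=/&;!#$%^(){}<>[]'

# Hypothetical sample names, one per line: two clean, four dirty.
names='good-file.zip
README.txt
BaD#Fi@^;zip
***
odd]name
odd[name'

# Keep only the names containing at least one character from the class.
bad=$(printf '%s\n' "$names" | grep -e "$pattern")
printf '%s\n' "$bad"
```

Only the four dirty names should come out the other end; the two clean
ones contain nothing from the class.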
The concept that makes the regular expression above work is a "character
class": anything enclosed in []'s is a set of characters, any one of
which counts as a match. You essentially want to match if any of these
characters appears in the string, and that's a nice, compact, and
shell-expansion-safe way of writing it. Enclosing the whole thing in
single quotes is actually what saves you from having to do all of the
shell escaping. You could have written your egrep example with about
half as many characters if you'd swapped the leading and ending double
quotes for single quotes and dropped all the backslashes. You might
also want to drop files with quotes in them, which is only slightly more
difficult. The easiest way is to do it as a separate grep,
something like this:
grep '"' or
grep "'"
(that's a double quote in single quotes, or a single quote in double
quotes, depending on which one you want to match)
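Or, if you don't mind a nearly-illegible regex, match both at once
(keep it quoted, or zsh will try to glob the brackets before grep ever
sees them):
grep "[\"']"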
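Here's a quick way to convince yourself the quote-matching greps do
what they claim. Treat it as a sketch: the three names are invented, and
the combined class is written inside double quotes here so the shell
never gets a chance to glob the brackets.

```shell
#!/bin/sh
# Made-up names: one plain, one with a double quote, one with a single quote.
names=$(printf '%s\n' 'plain.zip' 'BA"D.SH' "it's.txt")

# A double quote inside single quotes.
with_double=$(printf '%s\n' "$names" | grep '"')
# A single quote inside double quotes.
with_single=$(printf '%s\n' "$names" | grep "'")
# Both in one class; inside double quotes only the " needs a backslash.
either=$(printf '%s\n' "$names" | grep "[\"']")

printf 'double: %s\nsingle: %s\n' "$with_double" "$with_single"
printf 'either:\n%s\n' "$either"
```

The first two greps should each pick out exactly one name, and the
combined class should pick out both of them.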
Actually, I just noticed the find command that you included -- you were
really close with that one! :) Another way to do the same thing would
be this:
find /your/directory -regex '.*/[^/]*[]*?~=&;!#$%^(){}<>[][^/]*'
Note the use of -regex, not -iname, so find uses a regular expression
rather than a glob. find's -regex has to match the whole path (it
implies a leading ^ and closing $ in traditional regex terms), so the
leading .*/ soaks up the directories and the [^/]*'s on either side let
the class match anywhere in the file name itself. The / is left out of
the class on purpose: every path find looks at contains one, so leaving
it in would match every file.
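If you'd rather try the find approach somewhere harmless first, here is
a rough sketch that builds a scratch directory and runs a -regex search
over it. The file names are invented, and I'm assuming a find that
supports -regex (GNU and BSD find both do). Note that this sketch keeps
/ out of the class and ties the class to the last path component, since
-regex is matched against the whole path and every path has a slash in
it.

```shell
#!/bin/sh
# Scratch directory with made-up names so nothing real is touched.
dir=$(mktemp -d)
touch "$dir/good-file.zip" "$dir/BaD#Fi@^;zip" "$dir/what?.txt" "$dir/odd]name"

# -regex must match the whole path: .*/ eats the directories, and the
# [^/]* on either side lets the class hit anywhere in the file name.
found=$(find "$dir" -type f -regex '.*/[^/]*[]*?~=&;!#$%^(){}<>[][^/]*' | sort)
printf '%s\n' "$found"

# Clean up the scratch directory.
rm -rf "$dir"
```

Three of the four files should be listed; good-file.zip has nothing from
the class in its name and should be skipped.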
Hope that helps!
Aaron S. Joyner
PS - none of the regex's I included have a space in them. If you want
to throw out files with spaces in the name, just chuck a space in the
middle of one of the character classes and it'll do the trick. :)
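For the space trick, and for getting a count like your perl script
prints, something along these lines might work. It's a sketch under a
few assumptions: the names are made up, the space sits right after the
leading ] in the class, and the counting leans on grep -c and wc -l
rather than anything zsh-specific.

```shell
#!/bin/sh
# Same class as before, with a space added right after the leading ].
pattern='[] *?~=/&;!#$%^(){}<>[]'

# Hypothetical names: one clean, one with only spaces wrong, one with a #.
names=$(printf '%s\n' 'good-file.zip' 'Ba D F I le. zip' 'BaD#Fi.zip')

# grep -c prints the number of matching lines instead of the lines.
dirty=$(printf '%s\n' "$names" | grep -c -e "$pattern")
total=$(printf '%s\n' "$names" | wc -l | tr -d ' ')
printf 'Found %s names, %s with a problem\n' "$total" "$dirty"
```

With these three names, two of them count as dirty: one only because of
its spaces, the other because of the #.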
Smith, Brett wrote:
>I will try your regex for zsh
>[^[:alpha:][:digit:]-/.]
>the problem is the amount of metas I need to weed out.
>These are the characters I need out. *?[]~=/&;!#$%^(){}<>
>I can't get egrep or find to get all the files. White spaces are a problem.
>I realize it is my lack of knowledge concerning regex
>
>I did not put all the perl code up..sorry
>this was the original
>#!/usr/bin/perl
>$dir = shift(@ARGV);
>$glob = ($dir) ? $dir . "/*" : "*";
>#print "- $glob -\n";
>while (<${glob}>) {
>$files++;
>$file = $_;
>s/\w+//g;
>s/[\.\-\/]//g;
>print $dirty++ . " $file\n" if ($_ ne "");
>}
>printf "Found $files files, $dirty with a problem in the name\n";
>
>
>
>this is the zsh function I tried but it misses a few files.
>
>chk4BD(){
>#WORDCHARS=${WORDCHARS//[._-]}
>#echo $WORDCHARS
>#find $BDDIR -iname '*[+{;"\\=?~()<>&*|$ *]' -exec echo {} \;
>ls $BDDIR|egrep "|\$|\#|\*|\+|\:|\@|\&|\/|\~|\%|\=|\{|\}|\^|\[|\]|\,|\S"
>NUMBD=`echo chk4BD|wc|awk '{print $2}'`
>}
>
>here is a list of the files I need to weed through plus I throw some good
>files in for testing. Because I am going to tar and move the bad and the
>good. I just need to test for the bad.
>*
>**
>***
>Ba D F I le. zip
>Ba & D \ @ F \ | I \ ! l e. zip
>BAD*(__)---}{FILE.ZIP
>BaD@!#Fi@#^%zip.??
>BaD#Fi@^;zip
>Ba&D#F@^;zip.q
>BA\"rm -rf *"\DFILE.SH
>cd ${HOME}
>\(date\)
>$HOSTNAME
>${HOSTNAME}
>\`rm\ -rf\ *`
>rm -rf *
>\'rm\ -rf\ \*' \(exec date\)BaD#Fi at E;zip
>
>
>-----Original Message-----
>From: Aaron S. Joyner [mailto:aaron at joyner.ws]
>Sent: Tuesday, April 13, 2004 7:55 PM
>To: Triangle Linux Users Group discussion list
>Subject: Re: [TriLUG] help converting a perl script to zshell
>
>
>One question about your perl code -- what does $files; do, and why do you
>have $dirty as part of your string? Neither is doing anything. I would
>guess that they're left-overs from some previous iteration where you
>were testing or doing something else in particular?
>
>If your description is correct, and you just want a list of files
>with meta characters in the name, that's actually quite easy to do by
>chaining together a few common *NIX utilities - consider this:
>ls /your/directory | grep '[*?%$@#^!()]'
>That will give you a list of every file whose name contains a character
>from the big gobbledygook of special characters. You may of course add and
>remove them at will, as long as you leave the []'s, and don't put a ^ as
>the first character after the [, or it will match precisely the opposite
>of what you expect. If you want to accomplish the same task in pure
>zshell (say for inclusion on a tiny distro or something), then more
>thought will be required. I don't do that much thought unless it's
>really required. :) Unless zsh has support for regex's or some pretty
>good pattern matching or substitution, it's going to be difficult, at
>best.
>
>Okay I don't know much about zsh so I got curious. A quick skim of "A
>User's Guide to ZSH" turned up that you could probably do what you want
>with a zsh pattern match like this:
>[^[:alpha:][:digit:]-/.]
>getting the list of files into a zsh variable, and the rest of the
>fluff, is left as an exercise to the reader (think: @dir = `ls`).
>
>Aaron J.
>
>Smith, Brett wrote:
>
>
>
>>Guys~
>>I need some help with a zshell script. I have searched the web and just
>>haven't found an answer. I am trying to match metacharacters in filenames
>>(on my ftp server) so I can mark them as bad and move them. I have
>>everything else done in zsh.
>>I have the perl code to do it but I really need to convert it to zsh (or
>>even bash). Here is the perl code.
>>#!/usr/bin/perl
>>$dir = shift(@ARGV);
>>$glob = ($dir) ? $dir . "/*" : "*";
>>while (<${glob}>) {
>>$files;
>>$file = $_;
>>s/\w+//g;
>>s/[\.\-\/]//g;
>>print $dirty . " $file\n" if ($_ ne "");
>>}
>>If anyone knows the answer or could help point me in the right direction I
>>would appreciate it. (zsh IRC channel on freenode was devoid of human
>>interaction)
>>Thanks,
>>
>>Brett Smith
>>IS Team
>>Bloodhound, Inc.
>>2520 Meridian Parkway, Suite 500
>>Durham, N.C. 27713
>>(919) 313-1619
>>bsmith at bloodhoundinc.com
>>