[TriLUG] Regexp Help

Daniel Sterling sterling.daniel at gmail.com
Tue Aug 12 17:39:01 EDT 2014


Assuming this is a single-use script, you can just use:

undef $/;
$f = <>;

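instead of the open + while loop. With $/ undefined, a single read
returns the whole file, and <> automatically opens and reads from
either STDIN or the files listed on the command line (with several
files, each read returns the next file in full).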

see http://perldoc.perl.org/functions/readline.html
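
For anything longer-lived than a one-off, a slightly safer variant of the
same idea is to localize the change to $/, so the rest of the program still
reads line by line. This is only a minimal sketch: the slurp() helper name is
made up for illustration, and it reads from a filehandle you pass in (here an
in-memory one, so the example is self-contained) rather than from <>.

```perl
use strict;
use warnings;

# Hypothetical helper: return the entire contents of an open filehandle.
sub slurp {
    my ($fh) = @_;
    local $/;               # $/ is undef only inside this sub
    return scalar <$fh>;    # with $/ undefined, one read returns everything
}

# Demo on an in-memory filehandle instead of a real file.
my $text = "line one\nline two\n";
open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";
my $f = slurp($fh);
close($fh);

print $f;
```

Because of the local, $/ goes back to its previous value as soon as slurp()
returns, which a bare "undef $/;" at file scope does not do.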

-- Dan


On Tue, Aug 12, 2014 at 5:21 PM, William Sutton <william at trilug.org> wrote:
> Assuming you have a file named "bar", which you could easily change in the
> following Perl, it should do the trick for you.  It requires no special
> libraries and is thoroughly overdocumented so everyone can understand what I
> just did.  I've done this particular trick so many times, it's second
> nature.  Code follows:
>
> #---
>
> # read in the file
> open(my $fh, '<', "bar") or die "cannot open bar: $!";
> my $f;
> while (<$fh>)
> {
>     $f .= $_;
> }
> close($fh);
>
> # regex match the date/time stamp; we need this twice, so define it once
> my $dt_re = '([a-z]{3}\s+[a-z]{3}\s+\d+\s+\d+(?::\d{2}){2}\s+\d{4})';
>
> # dirty trick; replace the matched date/time stamp with a known nonprintable
> # character (\x01, or hex 01; anything unique would do), followed by the same
> # date/time stamp we found before; this marks the beginning of the record
> $f =~ s/$dt_re/\x01$1/gi;
>
> # now the nasty magic; see per-line notes, with numeric ordering so you can
> # follow what is happening
> # 9. and then print the result
> print
>
>   # 7. re-join the saved blocks on a newline (results in a single block of
>   #    text)
>   join(
>     "\n",
>
>     # 3. process the blocks from step 2 by
>     map {
>
>         # 6. then re-join the matching block lines on a newline so that it
>         #    is one text block again
>         join(
>             "\n",
>
>             # 5. then grep for only lines with the date/time stamp (the
>             #    other place we use the ugly regex) or ORA- lines
>             grep { m/$dt_re|ORA-/i }
>
>               # 4. splitting them on a newline (now we have individual lines
>               #    instead of blocks)
>               split(/\n/, $_)
>             )
>       }
>
>       # 2. then, grep out only those blocks that contain ORA- lines; any
>       #    other text blocks fall out at this point
>       grep { m/ORA-/ }
>
>       # 1. split the modified file text on our arbitrary \x01; this creates
>       #    blocks of text that start with a date/time stamp
>       split(/\x01/, $f)
>   )
>
>   # 8. then tack on a newline at the end for formatting purposes
>   . "\n";
>
> #---
>
> William Sutton
>
>
> On Tue, 12 Aug 2014, Alan Sterger wrote:
>
>> Hello Listers,
>>
>> Analyzing an Oracle alert_log.  Familiar with sed, grep and regular
>> expressions, but processing across multiple lines has always been my
>> nemesis.  What I'm trying to accomplish is to print out blocks of lines to
>> a smaller file for analysis.
>>
>> Thanks for your help,
>>
>> - Alan S.
>>
>> Alert log fragment:
>> ******************************************************************
>> Tue Mar 25 19:48:43 2014
>>  Current log# 3 seq# 141753 mem# 1: F:\ORADATA\NCP15\REDO03B.LOG
>> Tue Mar 25 19:48:47 2014
>> ARC0: Completed archiving  log 2 thread 1 sequence 141752
>> Tue Mar 25 19:52:17 2014
>> Errors in file d:\oracle\admin\ncp15\udump\ncp15_j000_5824.trc:
>> ORA-12012: error on auto execute of job 216103
>> ORA-30926: unable to get a stable set of rows in the source tables
>> ORA-06512: at "ENTREF.BGNPOSTLOAD", line 156
>> ORA-06512: at line 2
>>
>> Tue Mar 25 19:56:51 2014
>> Thread 1 advanced to log sequence 141754
>> Tue Mar 25 19:56:51 2014
>> ARC0: Evaluating archive   log 3 thread 1 sequence 141753
>> Tue Mar 25 19:56:51 2014
>>  Current log# 1 seq# 141754 mem# 0: F:\ORADATA\NCP15\REDO01A.LOG
>>  Current log# 1 seq# 141754 mem# 1: F:\ORADATA\NCP15\REDO01B.LOG
>> ******************************************************************
>>
>> Preferred output:
>> Tue Mar 25 19:52:17 2014
>> ORA-12012: error on auto execute of job 216103
>> ORA-30926: unable to get a stable set of rows in the source tables
>> ORA-06512: at "ENTREF.BGNPOSTLOAD", line 156
>> ORA-06512: at line 2
>>
>>
>> Needed information:
>> DOW & timestamp line
>> all ORA- lines
>>
>> The "Errors in file" line can be removed/skipped, as can the blank line
>> that terminates the ORA- block.
>>
>> --
>> This message was sent to: William <william at trilug.org>
>>
>> To unsubscribe, send a blank message to trilug-leave at trilug.org from that
>> address.
>> TriLUG mailing list : http://www.trilug.org/mailman/listinfo/trilug
>> Unsubscribe or edit options on the web  :
>> http://www.trilug.org/mailman/options/trilug/william%40trilug.org
>>
>> Welcome to TriLUG: http://trilug.org/welcome
>>
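
To see how the mark, split and grep steps from the quoted script fit together,
here is a condensed sketch of the same pipeline run against part of the alert
log fragment from the original question. It is not a replacement for the
script above: the names $marked, @records and $out are introduced here for
illustration, the input is a here-document rather than a file, and qr// is
swapped in for the single-quoted pattern string.

```perl
use strict;
use warnings;

# Sample input: part of the alert log fragment from the original question.
my $f = <<'END';
Tue Mar 25 19:48:47 2014
ARC0: Completed archiving  log 2 thread 1 sequence 141752
Tue Mar 25 19:52:17 2014
Errors in file d:\oracle\admin\ncp15\udump\ncp15_j000_5824.trc:
ORA-12012: error on auto execute of job 216103
ORA-30926: unable to get a stable set of rows in the source tables
ORA-06512: at "ENTREF.BGNPOSTLOAD", line 156
ORA-06512: at line 2

Tue Mar 25 19:56:51 2014
Thread 1 advanced to log sequence 141754
END

# Same date/time stamp pattern as the quoted script, precompiled with qr//.
my $dt_re = qr/[a-z]{3}\s+[a-z]{3}\s+\d+\s+\d+(?::\d{2}){2}\s+\d{4}/i;

# Mark the start of each record with \x01, then split into records and
# keep only the records that contain at least one ORA- line.
(my $marked = $f) =~ s/($dt_re)/\x01$1/g;
my @records = grep { /ORA-/ } split /\x01/, $marked;

# Within each kept record, keep only the timestamp line and the ORA- lines.
my $out = join "\n",
    map { join "\n", grep { /$dt_re|ORA-/ } split /\n/, $_ } @records;
$out .= "\n";

print $out;
```

On this input only the 19:52:17 record survives the first grep, and the
second grep drops its "Errors in file" line, leaving the timestamp line
followed by the four ORA- lines.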

