[TriLUG] Regexp Help

Tue Aug 12 17:48:16 EDT 2014

True.  I could also have just used File::Slurp and been done with it :-)
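
For anyone who would rather not pull in a module, here is a minimal sketch
of the same slurp in core Perl.  The sub name "slurp" is made up for
illustration and is not part of the script quoted below; using "local $/"
rather than "undef $/" keeps the change to the record separator from
leaking into the rest of a longer script.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not from the thread):
# read a whole file into a single scalar.
sub slurp {
    my ($path) = @_;

    # three-argument open, with an error check
    open(my $fh, '<', $path) or die "cannot open $path: $!";

    # undefine the input record separator, but only inside this sub
    local $/;

    # with $/ undefined, one read returns the entire file
    my $text = <$fh>;
    close($fh);

    return $text;
}
```

Usage would be something like "my $f = slurp('bar');" in place of the
open + while loop.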

BTW, for the non-Perl folks, this is a good example of TIMTOWTDI (There Is
More Than One Way To Do It).  It is also a fine example of chaining
split/join/map/grep to create, process, and reduce lists in-line.

The general idea is "I have a single thing (a log file) and I want another 
single thing (a cleaner log file)."  I could go through the trouble of 
explicitly creating arrays, populating them, iterating over them, and 
recombining them into scalars...

...or I could just munge it all at once and let Perl deal with the data 
manipulations.
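
For comparison, a rough sketch of what the explicit-array route could look
like, assuming the same timestamp layout as in Alan's sample.  The sub name
extract_ora_blocks and its variables are hypothetical, and unlike the
one-statement version this sketch anchors the timestamp at the start of a
line.

```perl
use strict;
use warnings;

# Hypothetical helper: for every record that contains at least one ORA-
# line, keep the timestamp line and the ORA- lines; drop everything else.
sub extract_ora_blocks {
    my ($text) = @_;

    # same date/time pattern as in the one-statement version
    my $dt_re = qr/[a-z]{3}\s+[a-z]{3}\s+\d+\s+\d+(?::\d{2}){2}\s+\d{4}/i;

    my @kept;           # lines that will end up in the output
    my @block;          # wanted lines of the record being collected
    my $has_ora = 0;    # true once the current record has an ORA- line

    for my $line (split /\n/, $text) {
        if ($line =~ /^$dt_re/) {
            # a new record starts; keep the previous one if it had errors
            push @kept, @block if $has_ora;
            @block   = ($line);
            $has_ora = 0;
        }
        elsif ($line =~ /ORA-/) {
            push @block, $line;
            $has_ora = 1;
        }
        # any other line is dropped
    }

    # keep the last record if it had errors
    push @kept, @block if $has_ora;

    return join("\n", @kept) . "\n";
}
```

It does the job, but the bookkeeping (two arrays and a flag) is exactly
what the split/grep/map/join chain lets Perl handle for you.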

William Sutton

On Tue, 12 Aug 2014, Daniel Sterling wrote:

> assuming this is a single-use script, you can just use:
>
> undef $/;
> $f = <>;
>
> instead of the open + while loop. <> automatically opens and reads
> from either STDIN or the files listed on the command line
>
> see http://perldoc.perl.org/functions/readline.html
>
> -- Dan
>
>
> On Tue, Aug 12, 2014 at 5:21 PM, William Sutton <william at trilug.org> wrote:
>> Assuming you have a file named "bar", which you could easily change in the
>> following Perl, it should do the trick for you.  It requires no special
>> libraries and is thoroughly overdocumented so everyone can understand what I
>> just did.  I've done this particular trick so many times, it's second
>> nature.  Code follows:
>>
>> #---
>>
>> # read in the file
>> open(my $fh, '<', 'bar') or die "cannot open bar: $!";
>> my $f;
>> while (<$fh>)
>> {
>>     $f .= $_;
>> }
>> close($fh);
>>
>> # regex match the date/time stamp; we need this twice, so define it once
>> my $dt_re = '([a-z]{3}\s+[a-z]{3}\s+\d+\s+\d+(?::\d{2}){2}\s+\d{4})';
>>
>> # dirty trick; replace the matched date/time stamp with a known nonprintable
>> # character (\x01, or hex 01; anything unique would do), followed by the same
>> # date/time stamp we found before; this marks the beginning of the record
>> $f =~ s/$dt_re/\x01$1/gi;
>>
>> # now the nasty magic; see per-line notes, with numeric ordering so you can
>> # follow what is happening
>> # 9. and then print the result
>> print
>>
>>   # 7. re-join the saved blocks on a newline (results in a single block
>>   #    of text)
>>   join(
>>     "\n",
>>
>>     # 3. process the blocks from step 2 by
>>     map {
>>
>>         # 6. then re-join the matching block lines on a newline so that it
>>         #    is one text block again
>>         join(
>>             "\n",
>>
>>             # 5. then grep for only lines with the date/time stamp (the
>>             #    other place we use the ugly regex) or ORA- lines
>>             grep { m/$dt_re|ORA-/i }
>>
>>               # 4. splitting them on a newline (now we have individual lines
>>               #    instead of blocks)
>>               split(/\n/, $_)
>>             )
>>       }
>>
>>       # 2. then, grep out only those blocks that contain ORA- lines; any
>>       #    other text blocks fall out at this point
>>       grep { m/ORA-/ }
>>
>>       # 1. split the modified file text on our arbitrary \x01; this creates
>>       #    blocks of text that start with a date/time stamp
>>       split(/\x01/, $f)
>>   )
>>
>>   # 8. then tack on a newline at the end for formatting purposes
>>   . "\n";
>>
>> #---
>>
>> William Sutton
>>
>>
>> On Tue, 12 Aug 2014, Alan Sterger wrote:
>>
>>> Hello Listers,
>>>
>>> Analyzing an Oracle alert_log.  I'm familiar with sed, grep and regular
>>> expressions, but processing across multiple lines has always been my nemesis.
>>> What I'm trying to accomplish is to print out blocks of lines to a smaller
>>> file for analysis.
>>>
>>> Thanks for your help,
>>>
>>> - Alan S.
>>>
>>> Alert log fragment:
>>> ******************************************************************
>>> Tue Mar 25 19:48:43 2014
>>>  Current log# 3 seq# 141753 mem# 1: F:\ORADATA\NCP15\REDO03B.LOG
>>> Tue Mar 25 19:48:47 2014
>>> ARC0: Completed archiving  log 2 thread 1 sequence 141752
>>> Tue Mar 25 19:52:17 2014
>>> Errors in file d:\oracle\admin\ncp15\udump\ncp15_j000_5824.trc:
>>> ORA-12012: error on auto execute of job 216103
>>> ORA-30926: unable to get a stable set of rows in the source tables
>>> ORA-06512: at "ENTREF.BGNPOSTLOAD", line 156
>>> ORA-06512: at line 2
>>>
>>> Tue Mar 25 19:56:51 2014
>>> Thread 1 advanced to log sequence 141754
>>> Tue Mar 25 19:56:51 2014
>>> ARC0: Evaluating archive   log 3 thread 1 sequence 141753
>>> Tue Mar 25 19:56:51 2014
>>>  Current log# 1 seq# 141754 mem# 0: F:\ORADATA\NCP15\REDO01A.LOG
>>>  Current log# 1 seq# 141754 mem# 1: F:\ORADATA\NCP15\REDO01B.LOG
>>> ******************************************************************
>>>
>>> Preferred output:
>>> Tue Mar 25 19:52:17 2014
>>> ORA-12012: error on auto execute of job 216103
>>> ORA-30926: unable to get a stable set of rows in the source tables
>>> ORA-06512: at "ENTREF.BGNPOSTLOAD", line 156
>>> ORA-06512: at line 2
>>>
>>>
>>> Needed information:
>>> DOW & timestamp line
>>> all ORA- lines
>>>
>>> "Errors in file" line can be removed/skipped as well as the ORA-
>>> terminating whitespace line.
>>>
>>> --
>>> This message was sent to: William <william at trilug.org>
>>>
>>> To unsubscribe, send a blank message to trilug-leave at trilug.org from that
>>> address.
>>> TriLUG mailing list : http://www.trilug.org/mailman/listinfo/trilug
>>> Unsubscribe or edit options on the web  :
>>> http://www.trilug.org/mailman/options/trilug/william%40trilug.org
>>>
>>> Welcome to TriLUG: http://trilug.org/welcome
>>>