[TriLUG] Regexp Help
William Sutton
william at trilug.org
Tue Aug 12 17:48:16 EDT 2014
True. I could also have just used File::Slurp and been done with it :-)
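For anyone who wants the slurp without a CPAN module, here is a minimal sketch in core Perl only. It is not the code from earlier in the thread: the sample text and the in-memory filehandle are stand-ins so the snippet runs on its own, and it uses "local $/" rather than a bare "undef $/" so the change to the record separator stays confined to one block.

```perl
use strict;
use warnings;

# Stand-in data (an assumption for illustration); a real script would
# open the log file by name instead.
my $sample = "line one\nline two\nline three\n";

# Open a read handle on the in-memory string (supported by core Perl).
open(my $fh, '<', \$sample) or die "Cannot open in-memory handle: $!";

my $f = do {
    local $/;    # input record separator is undef inside this block only
    <$fh>;       # so a single read returns everything that is left
};
close($fh);

print length($f), " characters read in one go\n";
```

Once the do block ends, $/ is back to its previous value, so any later line-by-line reads in the same script behave normally.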
BTW, for the non-Perl folks, this is a good example of TIMTOWTDI ("There's
More Than One Way To Do It"). It is also a fine example of using
split/join/map/grep to do in-line list creation, processing, and reduction.
The general idea is "I have a single thing (a log file) and I want another
single thing (a cleaner log file)." I could go through the trouble of
explicitly creating arrays, populating them, iterating over them, and
recombining them into scalars...
...or I could just munge it all at once and let Perl deal with the data
manipulations.
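A minimal sketch of that "one string in, one string out" style, on made-up data rather than the alert log. The record layout (blank-line separated, "rec" headers, "ERR" lines) is purely hypothetical and chosen to keep the example short. Read the chain from the bottom up: split, then grep, then map, then join.

```perl
use strict;
use warnings;

# Hypothetical sample: four records separated by blank lines, two of
# which contain an "ERR" line.
my $text = join("\n",
    "rec 1", "ok",
    "",
    "rec 2", "noise", "ERR disk full",
    "",
    "rec 3", "ok",
    "",
    "rec 4", "ERR timeout", "noise",
);

# No named arrays anywhere: the lists exist only between the operators.
my $clean = join(
    "\n",
    # within each surviving record, keep only header and ERR lines
    map  { join("\n", grep { /^rec |^ERR / } split(/\n/, $_)) }
    # keep only records that contain an ERR line
    grep { /^ERR /m }
    # one string becomes a list of records
    split(/\n\n/, $text)
) . "\n";

print $clean;
```

Records 1 and 3 drop out at the outer grep, and the "noise" lines drop out at the inner one, leaving just the header and ERR lines of records 2 and 4.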
William Sutton
On Tue, 12 Aug 2014, Daniel Sterling wrote:
> assuming this is a single-use script, you can just use:
>
> undef $/;
> $f = <>;
>
> instead of the open + while loop. With $/ undefined, each read returns a
> whole file rather than one line, and <> automatically opens and reads
> from either STDIN or the files listed on the command line.
>
> see http://perldoc.perl.org/functions/readline.html
>
> -- Dan
>
>
> On Tue, Aug 12, 2014 at 5:21 PM, William Sutton <william at trilug.org> wrote:
>> Assuming your log is in a file named "bar" (easy to change in the code),
>> the following Perl should do the trick for you. It requires no special
>> libraries and is thoroughly overdocumented so everyone can understand what I
>> just did. I've done this particular trick so many times, it's second
>> nature. Code follows:
>>
>> #---
>>
>> # read in the file
>> open(my $fh, '<', 'bar') or die "Cannot open bar: $!";
>> my $f;
>> while (<$fh>)
>> {
>>     $f .= $_;
>> }
>> close($fh);
>>
>> # regex match the date/time stamp; we need this twice, so define it once
>> my $dt_re = '([a-z]{3}\s+[a-z]{3}\s+\d+\s+\d+(?::\d{2}){2}\s+\d{4})';
>>
>> # dirty trick; replace the matched date/time stamp with a known nonprintable
>> # character (\x01, or hex 01; anything unique would do), followed by the
>> # same date/time stamp we found before; this marks the beginning of the
>> # record
>> $f =~ s/$dt_re/\x01$1/gi;
>>
>> # now the nasty magic; see per-line notes, with numeric ordering so you can
>> # follow what is happening
>> # 9. and then print the result
>> print
>>
>> # 7. re-join the saved blocks on a newline (results in a single block of
>> # text)
>> join(
>>     "\n",
>>
>>     # 3. process the blocks from step 2 by
>>     map {
>>
>>         # 6. then re-join the matching block lines on a newline so that it
>>         # is one text block again
>>         join(
>>             "\n",
>>
>>             # 5. then grep for only lines with the date/time stamp (the
>>             # other place we use the ugly regex) or ORA- lines
>>             grep { m/$dt_re|ORA-/i }
>>
>>             # 4. splitting them on a newline (now we have individual lines
>>             # instead of blocks)
>>             split(/\n/, $_)
>>         )
>>     }
>>
>>     # 2. then, grep out only those blocks that contain ORA- lines; any
>>     # other text blocks fall out at this point
>>     grep { m/ORA-/ }
>>
>>     # 1. split the modified file text on our arbitrary \x01; this creates
>>     # blocks of text that start with a date/time stamp
>>     split(/\x01/, $f)
>> )
>>
>> # 8. then tack on a newline at the end for formatting purposes
>> . "\n";
>>
>> #---
>>
>> William Sutton
>>
>>
>> On Tue, 12 Aug 2014, Alan Sterger wrote:
>>
>>> Hello Listers,
>>>
>>> I'm analyzing an Oracle alert_log. I'm familiar with sed, grep and regular
>>> expressions, but processing across multiple lines has always been my nemesis.
>>> What I'm trying to accomplish is to print out blocks of lines to a smaller
>>> file for analysis.
>>>
>>> Thanks for your help,
>>>
>>> - Alan S.
>>>
>>> Alert log fragment:
>>> ******************************************************************
>>> Tue Mar 25 19:48:43 2014
>>> Current log# 3 seq# 141753 mem# 1: F:\ORADATA\NCP15\REDO03B.LOG
>>> Tue Mar 25 19:48:47 2014
>>> ARC0: Completed archiving log 2 thread 1 sequence 141752
>>> Tue Mar 25 19:52:17 2014
>>> Errors in file d:\oracle\admin\ncp15\udump\ncp15_j000_5824.trc:
>>> ORA-12012: error on auto execute of job 216103
>>> ORA-30926: unable to get a stable set of rows in the source tables
>>> ORA-06512: at "ENTREF.BGNPOSTLOAD", line 156
>>> ORA-06512: at line 2
>>>
>>> Tue Mar 25 19:56:51 2014
>>> Thread 1 advanced to log sequence 141754
>>> Tue Mar 25 19:56:51 2014
>>> ARC0: Evaluating archive log 3 thread 1 sequence 141753
>>> Tue Mar 25 19:56:51 2014
>>> Current log# 1 seq# 141754 mem# 0: F:\ORADATA\NCP15\REDO01A.LOG
>>> Current log# 1 seq# 141754 mem# 1: F:\ORADATA\NCP15\REDO01B.LOG
>>> ******************************************************************
>>>
>>> Preferred output:
>>> Tue Mar 25 19:52:17 2014
>>> ORA-12012: error on auto execute of job 216103
>>> ORA-30926: unable to get a stable set of rows in the source tables
>>> ORA-06512: at "ENTREF.BGNPOSTLOAD", line 156
>>> ORA-06512: at line 2
>>>
>>>
>>> Needed information:
>>> DOW & timestamp line
>>> all ORA- lines
>>>
>>> The "Errors in file" line can be removed/skipped, as can the blank line
>>> that terminates the block of ORA- lines.
>>>
>>> --
>>> This message was sent to: William <william at trilug.org>
>>>
>>> To unsubscribe, send a blank message to trilug-leave at trilug.org from that
>>> address.
>>> TriLUG mailing list : http://www.trilug.org/mailman/listinfo/trilug
>>> Unsubscribe or edit options on the web :
>>> http://www.trilug.org/mailman/options/trilug/william%40trilug.org
>>>
>>> Welcome to TriLUG: http://trilug.org/welcome
>>>