[Dev] Performance Profiler
M. Mueller/bhu5nji
dev@trilug.org
Tue, 8 Jan 2002 16:22:05 -0500
> compile your prog with 'gcc -pg ...'. When you run the prog, a
> gmon.out file will be created, which can be processed by gprof to list
> called routines, and time spent in routines.
>
> $ gcc -g -pg src.c -o exe
> $ ./exe
> $ gprof exe gmon.out
Cool. I tried it but it busted on a "select". I will play with this more in
the future.
I may have worked around the problem. I have a proprietary wrapper around
datagrams as they pass through the box. I added 6 new fields to the wrapper
as follows:
struct timeval timeStampA;
struct timeval timeStampB;
struct timeval timeStampC;
struct timeval timeStampD;
struct timeval timeStampE;
struct timeval timeStampF;
I loaded the fields with gettimeofday() after the recvfrom() on entry and
before the sendto() on exit from each of the 3 daemons in my app.
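A minimal sketch of that kind of timestamping, assuming a cut-down wrapper.
The names hop_stamps, stamp_now and elapsed_usec are hypothetical and are not
taken from the actual code; only gettimeofday() and struct timeval are real.
Differences between stamps taken on different boxes are only as good as the
clock synchronization between those boxes.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical stand-in for the timestamp fields of the wrapper. */
struct hop_stamps {
    struct timeval timeStampA;   /* e.g. taken after recvfrom() on entry */
    struct timeval timeStampB;   /* e.g. taken before sendto() on exit   */
};

/* Record the current wall-clock time; returns 0 on success, -1 on error. */
int stamp_now(struct timeval *tv)
{
    return gettimeofday(tv, NULL);
}

/* Elapsed microseconds from *start to *end (negative if end is earlier). */
long long elapsed_usec(const struct timeval *start, const struct timeval *end)
{
    return (long long)(end->tv_sec - start->tv_sec) * 1000000LL
         + (long long)(end->tv_usec - start->tv_usec);
}
```

A daemon would call stamp_now(&w.timeStampA) right after its recvfrom() and
stamp_now(&w.timeStampB) right before its sendto(); elapsed_usec() on any two
stamps then gives the time spent between those points.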
I found the D to E transfer in one direction was consistently taking about
half a minute. After substituting code from a "fast" daemon into the "slow"
daemon and gnashing my teeth in frustration for a while, I noticed that the
fast daemon had a blocking select (wait-forever). The "slow" daemon used a
polling select (no-wait). I changed the slow daemon's select to wait for 10000
usecs. Voila. Overall transfer time went down dramatically. I then went through
the entire code body and changed every polling select to a select that waits
for 10000 usecs.
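For illustration, a select with a 10000 usec timeout might look roughly like
the sketch below. The helper name wait_readable is hypothetical and this is
not the actual daemon code. Passing 0 as the timeout gives the polling
(no-wait) behaviour, where select returns immediately and a loop around it
never sleeps; passing 10000 lets the process sleep for up to 10 ms per call.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_usec for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error (errno set by select). */
int wait_readable(int fd, long timeout_usec)
{
    fd_set readfds;
    struct timeval tv;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    /* Rebuild the timeout on every call: on Linux, select() may modify
     * the struct to reflect the time that was not slept. */
    tv.tv_sec  = timeout_usec / 1000000L;
    tv.tv_usec = timeout_usec % 1000000L;

    int n = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

A receive loop would call wait_readable(sock, 10000) and only call recvfrom()
when it returns 1.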
The change is dramatic. Before the change, using LOTS of syslogging, I got
delays across the 2 boxes of about 30 secs, and with almost no syslogging I
got 0.9 sec delays. With the new change I get delays of 0.02-0.2 secs with LOTS
of syslogging. These numbers are much more reasonable. (Great sigh of relief.)
This result does not seem consistent with how I read Stevens' explanation of
no-wait select in section 5.6 of Unix Network Programming Vol 1.
I am using a 2.2.14 kernel.
Mike