HP-UX gprof vs. GNU binutils' gprof (2024)

(Originally posted to: http://devresource.hp.com/forums/thread.jspa?threadID=2646&tstart=0&forumID=4)

The original HP-UX gprof seems to process the gmon.out file of a multithreaded application incorrectly, while GNU binutils' gprof seems to handle it more correctly.

Alternatively, HP-UX may be producing a gmon.out file that is itself wrong. Either way, GNU gprof gives different output from the same data.

Our product is a rather complicated multithreaded application that uses a lot of CPU. We are trying to profile the application on different platforms to find its major time-consuming parts, since it behaves slightly differently on each platform (because of the OS implementations, etc.).

One of the tests was done on an HP-UX 11.11 machine, with our application built as a 32-bit application (it can also be built as a 64-bit application).

The test ran for some time (~100 seconds), and the gmon.out file was generated at the end. When gprof was run on the tested binary and the gmon.out file, the results were strange. Here are the start and the end of the output:

---
granularity: each sample hit covers 4 byte(s) for 0.01% of 111.62 seconds
%time c*msecs seconds calls msec/call name
19.1 21.27 21.27 _mcount
3.2 24.84 3.57 $$remU
3.0 28.23 3.38 $$divU
3.0 31.57 3.35 $$div2I
1.0 32.72 1.14 _mcleanup
0.9 33.77 1.05 $$dyncall_external
0.4 34.27 0.50 4378429 0.00 hrMalloc(unsigned long)
0.2 34.49 0.23 8756076 0.00 hrCheckMemory()
0.1 34.65 0.15 $$dyncall
0.1 34.73 0.08 532042 0.00 SmartPtr::~SmartPtr()
0.1 34.80 0.07 $$mulU
0.0 34.85 0.05 $$divoI
0.0 34.89 0.04 $$rem2U
0.0 34.91 0.02 $$bit_adrs_store
0.0 34.92 0.01 1 10.00 printMallocCountSet()
0.0 34.92 0.00 2780939 0.00 hrFree(void *)
0.0 34.92 0.00 44581182 0.00 Mutex::getOwnership()
0.0 34.92 0.00 44580989 0.00 Mutex::release()
0.0 34.92 0.00 31745415 0.00 DimManager::getArrByMode() const
... [skipped ]
0.0 34.92 0.00 1597376 0.00 _strncpy
... [skipped ]
0.0 34.92 0.00 1 0.00 Cube::writeXml(Xml *,int) const
0.0 34.92 0.00 1 0.00 xmlLoadDLL(Xml *)
---

1. The last (total) "c*msecs" value, 34.92, is only about 1/3 of the total time printed in the "granularity" line (111.62 seconds).
2. The "top" entries are very strange: $$remU, $$divU, $$div2I, ...
3. Our functions are counted, but their "seconds" (self time) values are almost always "0.00".
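
The first two observations can be checked mechanically. The following is only a minimal sketch in Python: the sample text is copied from the HP-UX listing above (with the "[skipped]" parts left out), and the helper names parse_rows and total_seconds are hypothetical, not part of any gprof tool. It assumes every data row starts with three numeric columns (%time, cumulative seconds, self seconds) and groups _mcount, _mcleanup and the "$$" symbols together purely by name prefix.

```python
import re

# Rows copied from the HP-UX gprof listing above; "[skipped]" parts omitted.
HPUX_FLAT = """\
granularity: each sample hit covers 4 byte(s) for 0.01% of 111.62 seconds
19.1 21.27 21.27 _mcount
3.2 24.84 3.57 $$remU
3.0 28.23 3.38 $$divU
3.0 31.57 3.35 $$div2I
1.0 32.72 1.14 _mcleanup
0.9 33.77 1.05 $$dyncall_external
0.4 34.27 0.50 4378429 0.00 hrMalloc(unsigned long)
0.2 34.49 0.23 8756076 0.00 hrCheckMemory()
0.1 34.65 0.15 $$dyncall
0.1 34.80 0.07 $$mulU
0.0 34.85 0.05 $$divoI
0.0 34.89 0.04 $$rem2U
0.0 34.91 0.02 $$bit_adrs_store
0.0 34.92 0.00 44580989 0.00 Mutex::release()
0.0 34.92 0.00 1 0.00 xmlLoadDLL(Xml *)
"""

NUMBER = re.compile(r"^\d+(?:\.\d+)?$")


def parse_rows(text):
    """Yield (cumulative_seconds, self_seconds, name) for each data row."""
    for line in text.splitlines():
        fields = line.split()
        # Data rows begin with three numeric columns; skip everything else.
        if len(fields) < 4 or not all(NUMBER.match(f) for f in fields[:3]):
            continue
        lead = 0
        while lead < len(fields) and NUMBER.match(fields[lead]):
            lead += 1
        yield float(fields[1]), float(fields[2]), " ".join(fields[lead:])


def total_seconds(text):
    """Return the total taken from the 'granularity' line."""
    return float(re.search(r"of ([\d.]+) seconds", text).group(1))


rows = list(parse_rows(HPUX_FLAT))
total = total_seconds(HPUX_FLAT)
accounted = rows[-1][0]  # last cumulative value in the listing
support = sum(s for _, s, name in rows if name.startswith(("$$", "_mc")))

print(f"accounted / total      = {accounted / total:.2f}")
print(f"support symbols' share = {support / accounted:.2f}")
```

On the rows shown, the listing accounts for roughly a third of the reported total, and nearly all of that is attributed to _mcount, _mcleanup and the "$$" symbols rather than to our own functions.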

The same report was then generated with GNU gprof from GNU binutils 2.15, built on an HP-UX 11.00 machine, and the result differs.
The GNU gprof result seems more realistic than that of the original gprof.
Here is the GNU gprof output (from the same application binary and the same gmon.out file):

---
granularity: each sample hit covers 4 byte(s) for 0.01% of 89.22 seconds
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls s/call s/call name
8.27 7.38 7.38 _recv_sys
7.74 14.29 6.91 _select_sys
4.33 18.15 3.86 _send_sys
3.83 21.57 3.42 regular_seq
3.74 24.91 3.34 pthread_mutex_lock
3.40 27.94 3.03 44580989 0.00 0.00 Mutex::release(void)
3.16 30.76 2.82 normal
3.10 33.53 2.77 44581182 0.00 0.00 Mutex::getOwnership(void)
2.73 35.97 2.44 pthread_mutex_unlock
1.98 37.74 1.77 __mutex_unlock_handoff_enabled
1.83 39.37 1.63 InnerLoop
1.71 40.90 1.53 532236 0.00 0.00 Worker::processStreamPoint(QueryStream *, AskedCubesInfo *, CubesBuffers *, PointToSend *, bool &)
1.52 42.26 1.36 __spin_lock
1.36 43.47 1.21 __spin_unlock
1.34 44.67 1.20 $$dyncall_external
1.18 45.72 1.05 12477282 0.00 0.00 indexToCoord__10DimManagerCFPUpi15CalculationMode
1.12 46.72 1.00 8525168 0.00 0.00 moveToSuccessor__13DirEntryArrayFPUlN21PUpb15CalculationModeUl
1.10 47.70 0.98 13586523 0.00 0.00 putIntoIndex__10DimManagerCFPUpUlT215CalculationMode
1.09 48.67 0.97 4378219 0.00 0.00 real_malloc
1.02 49.58 0.91 31745415 0.00 0.00 DimManager::getArrByMode( const(CalculationMode))
0.92 50.40 0.82 10008954 0.00 0.00 inLimits__11AbstrLimitsCFUlN21b
... [skipped ]
0.11 81.48 0.10 1597376 0.00 0.00 strncpy
... [skipped ]
0.00 89.22 0.00 1 0.00 0.00 Cube::writeXml( const(Xml *, int))
0.00 89.22 0.00 1 0.00 0.01 xmlLoadDLL(Xml *)
---
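
One detail worth noting when comparing the two listings is that the call counts agree while the self times do not. The sketch below, again in Python and only illustrative, lines up three functions that appear in both outputs. The rows are copied from the listings above; the parse helper and the rule of matching names by the text before the first "(" are assumptions made for this example.

```python
import re

NUMBER = re.compile(r"^\d+(?:\.\d+)?$")

# Rows copied from the HP-UX gprof listing.
HPUX_ROWS = """\
0.0 34.92 0.00 44581182 0.00 Mutex::getOwnership()
0.0 34.92 0.00 44580989 0.00 Mutex::release()
0.0 34.92 0.00 31745415 0.00 DimManager::getArrByMode() const
"""

# Rows copied from the GNU gprof listing.
GNU_ROWS = """\
3.40 27.94 3.03 44580989 0.00 0.00 Mutex::release(void)
3.10 33.53 2.77 44581182 0.00 0.00 Mutex::getOwnership(void)
1.02 49.58 0.91 31745415 0.00 0.00 DimManager::getArrByMode( const(CalculationMode))
"""


def parse(text):
    """Map function name (text before the first '(') to (calls, self_seconds)."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        lead = 0
        while lead < len(fields) and NUMBER.match(fields[lead]):
            lead += 1
        if lead < 4 or lead == len(fields):
            continue  # need at least a call count column and a name
        key = " ".join(fields[lead:]).split("(")[0]
        # Column 3 is self seconds, column 4 is the call count in both formats.
        result[key] = (int(fields[3]), float(fields[2]))
    return result


hpux, gnu = parse(HPUX_ROWS), parse(GNU_ROWS)
shared = sorted(hpux.keys() & gnu.keys())
for name in shared:
    (h_calls, h_self), (g_calls, g_self) = hpux[name], gnu[name]
    print(f"{name}: calls {h_calls} vs {g_calls}, self {h_self:.2f} vs {g_self:.2f}")
```

For these rows the call counts are identical in both reports, and only the self time differs. That would be consistent with both tools reading the call-count data the same way and disagreeing on how the time samples are attributed, although these few rows do not prove it.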

The final question is: what is wrong?
Should I install some patches? Should I compile/build differently?
Should I reconfigure something?

---

System and tools version:
what /usr/bin/ld
/usr/bin/ld:
$Revision: 92453-07 linker linker crt0.o B.11.37 040218 $
HP aC++ B3910B A.03.52 Classic Iostream Library
HP aC++ B3910B A.03.52 Language Support Library
ld_msgs.cat: $Revision: 1.85 $
92453-07 linker command s800.sgs ld PA64 B.11.43 REL 050124
what /opt/aCC/bin/aCC
/opt/aCC/bin/aCC:
HP aC++ B3910B A.03.13
HP aC++ B3910B X.03.11.10 Language Support Library
what /usr/bin/gprof
/usr/bin/gprof:
gprof.c $Date: 2003/03/03 00:46:07 $Revision: r11.11/2 PATCH_11.11 (PHCO_27848)
calls.c $Date: 2002/11/13 07:47:41 $Revision: r11.11/1 PATCH_11.11 (PHCO_27848)
printgprof.c $Date: 2002/11/13 07:48:25 $Revision: r11.11/1 PATCH_11.11 (PHCO_27848)
fixbounds.c $Date: 2003/03/03 00:46:07 $Revision: r11.11/2 PATCH_11.11 (PHCO_27848)
$Revision: @(#) all CUP11.11_BL2003_0310_3 PATCH_11.11 PHCO_27848
Mon Mar 10 21:58:20 PST 2003 $
/usr/local/bin/gprof --version
GNU gprof 2.15
Based on BSD gprof, copyright 1983 Regents of the University of California.
This program is free software. This program has absolutely no warranty.
what /usr/lib/libpthread.sl
/usr/lib/libpthread.sl:
Pthread Interfaces
$Revision: libpthread.1: @(#) depot-32pa CUP11.11_BL2002_0405_3 PATCH_11.11 PHCO_26466 Fri Apr 5 12:25:38 PST 2002 $
what /usr/lib/libc.sl
/usr/lib/libc.sl:
$ PATCH_11.11/PHCO_29955 Jan 28 2004 01:24:54 $
what /usr/lib/libp/libc.a
/usr/lib/libp/libc.a:
$ PATCH_11.11/PHCO_29955 Jan 28 2004 01:26:40 $

Patches tried (those that relate to any of the words gprof, crt0, gcrt0, _mcount):
PHCO_27848 s700_800 11.11 gprof(1) patch
PHSS_30970 s700_800 11.11 ld(1) and linker tools cumulative patch
PHSS_28435 s700_800 11.11 linker startup code / SLLIC ELF support
(does not include gcrt0.o file)

Compile time options:
for release version:
aCC +z \
-DRWSTD_MULTI_THREAD -D_LARGEFILE64_SOURCE -D__STDC_EXT__ \
-D_POSIX2_SOURCE -D_POSIX_SOURCE -D_POSIX_C_SOURCE=199506L \
-D_HPUX_SOURCE -D_XOPEN_SOURCE -D_XOPEN_SOURCE_EXTENDED \
-D_XPG4 \
+O2 -D_FILE_OFFSET_BITS=32 +W495 -D_REENTRANT \
-I -D \
-c -o
for profile version:
aCC +O2 \
-DRWSTD_MULTI_THREAD -D_LARGEFILE64_SOURCE -D__STDC_EXT__ \
-D_POSIX2_SOURCE -D_POSIX_SOURCE -D_POSIX_C_SOURCE=199506L \
-D_HPUX_SOURCE -D_XOPEN_SOURCE -D_XOPEN_SOURCE_EXTENDED \
-D_XPG4 \
-G -D_FILE_OFFSET_BITS=32 +W495 -D_REENTRANT \
-I -D \
-c -o
Note that the "-G" option cannot be combined with "+z" (PIC).
Could this be a problem?

Build time options:
for release version:
aCC +O2 -o -L \
-l -ldld -lpthread -lrt -Wl,-N
for profile version:
aCC -G -o -L \
-l -ldld -lpthread -lrt -Wl,-N
Here the object files are compiled using the appropriate compile options above, and the libraries are created with the "ar(1)" command, with "ranlib" run afterwards.

---

swlist -l file | egrep 'gprof|crt0|libp/|bin/aCC|bin/ld'

ACXX.ACXX: /opt/aCC/bin/aCC
# Auxiliary-Opt.LANG-STARTUP B.11.01.06 Family of Startup crt0.o files
Auxiliary-Opt.LANG-STARTUP: /opt/langtools/lib/crt0.o
Auxiliary-Opt.LANG-STARTUP: /opt/langtools/lib/gcrt0.o
Auxiliary-Opt.LANG-STARTUP: /opt/langtools/lib/icrt0.o
Auxiliary-Opt.LANG-STARTUP: /opt/langtools/lib/mcrt0.o
Auxiliary-Opt.LANG-STARTUP: /opt/langtools/lib/pa20_64/crt0.o
Auxiliary-Opt.LANG-STARTUP: /opt/langtools/lib/scrt0.o
BLINKLINK.BLINKLINK: /opt/blinklink/obsolete/bin/ld
HPPAK.HPPAK: /opt/langtools/hppak/ui/icons/gprof_icon.pm
OS-Core.C-KRN: /usr/ccs/bin/ld
OS-Core.C-KRN: /usr/ccs/lbin/ld32
OS-Core.C-KRN: /usr/ccs/lbin/ld64
OS-Core.C-MIN: /usr/ccs/lib/crt0.o
OS-Core.C-MIN-32ALIB: /usr/lib/libp/libpthread.a
OS-Core.C-MIN-64ALIB: /usr/ccs/lib/pa20_64/crt0.o
OS-Core.C-MIN-64ALIB: /usr/lib/pa20_64/libp/libpthread.a
OS-Core.CMDS-AUX: /usr/ccs/bin/ldd
OS-Core.CMDS-AUX: /usr/ccs/lbin/ldd32
OS-Core.CMDS-AUX: /usr/ccs/lbin/ldd64
OS-Core.CORE-64SLIB: /usr/lib/pa20_64/libgprof.1
OS-Core.CORE-64SLIB: /usr/lib/pa20_64/libgprof.a
OS-Core.CORE-64SLIB: /usr/lib/pa20_64/libgprof.sl
OS-Core.CORE-KRN: /usr/conf/sys/gprof.h
OS-Core.CORE-SHLIBS: /usr/lib/libgprof.1
OS-Core.CORE-SHLIBS: /usr/lib/libgprof32.sl
# PHCO_27848 1.0 gprof(1) patch.
PHCO_27848.PROG-AUX: /usr/ccs/bin/gprof
PHCO_29955.PROG-AUX: /usr/lib/libp/libc.a
PHCO_29955.PROG-AX-64ALIB: /usr/lib/pa20_64/libp/libc.a
PHSS_28435.LANG-STARTUP: /opt/langtools/lib/crt0.o
PHSS_28435.LANG-STARTUP: /opt/langtools/lib/icrt0.o
PHSS_28435.LANG-STARTUP: /opt/langtools/lib/pa20_64/crt0.o
PHSS_28435.LANG-STARTUP: /opt/langtools/lib/scrt0.o
PHSS_30970.C-INC: /usr/include/crt0.h
PHSS_30970.C-KRN: /usr/ccs/bin/ld
PHSS_30970.C-KRN: /usr/ccs/lbin/ld32
PHSS_30970.C-KRN: /usr/ccs/lbin/ld64
PHSS_30970.C-MIN: /usr/ccs/lib/crt0.o
PHSS_30970.C-MIN-64ALIB: /usr/ccs/lib/pa20_64/crt0.o
PHSS_30970.CMDS-AUX: /usr/ccs/bin/ldd
PHSS_30970.CMDS-AUX: /usr/ccs/lbin/ldd32
PHSS_30970.CMDS-AUX: /usr/ccs/lbin/ldd64
ProgSupport.C-INC: /usr/include/crt0.h
ProgSupport.C-INC: /usr/include/sys/gprof.h
ProgSupport.PAUX-ENG-A-MAN: /usr/share/man/man1.Z/gprof.1
ProgSupport.PAUX-ENG-A-MAN: /usr/share/man/man3.Z/crt0.3
ProgSupport.PAUX-ENG-A-MAN: /usr/share/man/man3.Z/crt0.o.3
ProgSupport.PAUX-ENG-A-MAN: /usr/share/man/man3.Z/gcrt0.o.3
ProgSupport.PAUX-ENG-A-MAN: /usr/share/man/man3.Z/mcrt0.o.3
ProgSupport.PAUX-JPN-E-MAN: /usr/share/man/ja_JP.eucJP/man1.Z/gprof.1
ProgSupport.PAUX-JPN-E-MAN: /usr/share/man/ja_JP.eucJP/man3.Z/crt0.3
ProgSupport.PAUX-JPN-E-MAN: /usr/share/man/ja_JP.eucJP/man3.Z/crt0.o.3
ProgSupport.PAUX-JPN-E-MAN: /usr/share/man/ja_JP.eucJP/man3.Z/gcrt0.o.3
ProgSupport.PAUX-JPN-E-MAN: /usr/share/man/ja_JP.eucJP/man3.Z/mcrt0.o.3
ProgSupport.PAUX-JPN-S-MAN: /usr/share/man/ja_JP.SJIS/man1.Z/gprof.1
ProgSupport.PAUX-JPN-S-MAN: /usr/share/man/ja_JP.SJIS/man3.Z/crt0.3
ProgSupport.PAUX-JPN-S-MAN: /usr/share/man/ja_JP.SJIS/man3.Z/crt0.o.3
ProgSupport.PAUX-JPN-S-MAN: /usr/share/man/ja_JP.SJIS/man3.Z/gcrt0.o.3
ProgSupport.PAUX-JPN-S-MAN: /usr/share/man/ja_JP.SJIS/man3.Z/mcrt0.o.3
ProgSupport.PROG-AUX: /usr/ccs/bin/gprof
ProgSupport.PROG-AUX: /usr/lib/gprof.callg
ProgSupport.PROG-AUX: /usr/lib/gprof.flat
ProgSupport.PROG-AUX: /usr/lib/libp/libc.a
ProgSupport.PROG-AX-64ALIB: /usr/lib/pa20_64/libp/libc.a
gcc.gcc-MAN: /usr/local/man/man1/gprof.1
gcc.gcc-RUN: /usr/local/bin/gprof
gcc.gcc-RUN: /usr/local/info/gprof.info
gcc.gcc-RUN: /usr/local/share/locale/da/LC_MESSAGES/gprof.mo
gcc.gcc-RUN: /usr/local/share/locale/de/LC_MESSAGES/gprof.mo
gcc.gcc-RUN: /usr/local/share/locale/es/LC_MESSAGES/gprof.mo
gcc.gcc-RUN: /usr/local/share/locale/fr/LC_MESSAGES/gprof.mo
gcc.gcc-RUN: /usr/local/share/locale/id/LC_MESSAGES/gprof.mo
gcc.gcc-RUN: /usr/local/share/locale/pt_BR/LC_MESSAGES/gprof.mo
gcc.gcc-RUN: /usr/local/share/locale/sv/LC_MESSAGES/gprof.mo
gcc.gcc-RUN: /usr/local/share/locale/tr/LC_MESSAGES/gprof.mo
Total=17:05.53 (CPU/TOTAL=1.2%, user=9.56 kernel=3.51)

--- END ---
