Here is how I ran the scripts, along with their respective output:
$ time nfsdsanalysis -Z common archive/lindump_total.ds | ./stats_basic.php > stats_basic.txt
real 551m56.349s
user 605m38.263s
sys 28m44.936s
$ time nfsdsanalysis -Z common archive/lindump_total.ds | perl stats_basic.pl > stats_basic1.txt
real 712m16.792s
user 677m48.698s
sys 66m32.526s
The file lindump_total.ds is an 80 GB file. The output of nfsdsanalysis (which is what gets piped to the script) looks something like this:
# Extent, type='Trace::NFS::common'
packet_at source source_port dest dest_port is_udp is_request nfs_version transaction_id op_id operation rpc_status payload_length record_id
1253831523212739 3a163121 790 01c633c7 2049 TCP request V3 21ff6e38 3 lookup null 56 0
1253831523212743 3a163121 790 01c633c7 2049 TCP request V3 21ff6e38 3 lookup null 56 1
1253831523212746 3a163121 897 01c633c7 2049 TCP request V3 2eff9a5e 1 getattr null 36 2
1253831523212748 3a163121 897 01c633c7 2049 TCP request V3 2eff9a5e 1 getattr null 36 3
1253831523214877 2a2622c2 2049 1a264421 790 TCP response V3 2ffdae28 3 lookup 0 216 4
1253831523214886 2a2622c2 2049 1a264421 897 TCP response V3 2ffca15e 1 getattr 0 88 5
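To make the input format concrete, here is a minimal Python sketch of how a script could consume this stream. It is not any of the benchmarked scripts: the column names are copied from the header line above, while the helper names `parse_records` and `count_operations`, and the choice of statistic (requests per operation), are assumptions made purely for illustration.

```python
from collections import Counter

# Column names copied from the header line of the nfsdsanalysis output above.
FIELDS = ("packet_at source source_port dest dest_port is_udp is_request "
          "nfs_version transaction_id op_id operation rpc_status "
          "payload_length record_id").split()


def parse_records(lines):
    """Yield one dict per data line, skipping '#' comments and the header."""
    for line in lines:
        parts = line.split()
        if not parts or parts[0].startswith("#") or parts[0] == FIELDS[0]:
            continue
        if len(parts) != len(FIELDS):
            continue  # assumption: malformed lines are silently ignored
        yield dict(zip(FIELDS, parts))


def count_operations(lines):
    """Hypothetical statistic: number of requests seen per NFS operation."""
    counts = Counter()
    for record in parse_records(lines):
        if record["is_request"] == "request":
            counts[record["operation"]] += 1
    return counts
```

In a pipeline like the ones above, `count_operations(sys.stdin)` would process the stream one line at a time, so memory use stays flat regardless of the size of the dump.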
Some people asked me to run the scripts in isolation, i.e., not in parallel like last time. I received optimized versions from several people, and even some versions in other languages, such as Python and C.
Apparently, the Perl version was so slow due to a serious performance bug related to list assignment. Thanks to Pedro Figueiredo for the tip. Just by installing Perl 5.10.1 I got a 37% performance improvement. Even though the improvement was significant, Perl still came in last.
Below are the results of running the optimized scripts in the different languages, ordered by run time from fastest to slowest:
C Version (By Jose Celestino):
$ time nfsdsanalysis -Z common archive/lindump_total.ds | ./stats_basic > stats_basic4.txt
real 202m37.347s
user 265m46.817s
sys 9m39.888s
PHP Version (Optimized by Diogo Neves, and modified by me since there were several bugs):
$ time nfsdsanalysis -Z common archive/lindump_total.ds | ./stats_basic_optimized.php >stats_basic5.txt
real 270m48.511s
user 444m43.480s
sys 8m56.562s
Python Version (by Andre Cruz):
$ time nfsdsanalysis -Z common archive/lindump_total.ds | python stats_basic.py > stats_basic3.txt
real 322m55.569s
Perl Version (Original by Carlos Pires, Optimized version by Joao Pedro):
$ time nfsdsanalysis -Z common archive/lindump_total.ds | perl stats_basic_optimized.pl > stats_basic2.txt
real 419m11.267s
user 508m26.699s
sys 16m20.717s
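To put these numbers side by side, the following Python sketch converts the "real" times reported above into seconds and expresses each run as a multiple of the fastest one. Only wall-clock time is compared, since the Python run lists only its real time. The helper names `to_seconds` and `slowdown_vs_fastest` are made up for this illustration.

```python
import re


def to_seconds(stamp):
    """Convert a `time` stamp such as '202m37.347s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time stamp: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# "real" times copied from the runs listed above.
REAL_TIMES = {
    "C": "202m37.347s",
    "PHP": "270m48.511s",
    "Python": "322m55.569s",
    "Perl": "419m11.267s",
}


def slowdown_vs_fastest(times):
    """Map each name to its run time divided by the fastest run time."""
    seconds = {name: to_seconds(stamp) for name, stamp in times.items()}
    fastest = min(seconds.values())
    return {name: value / fastest for name, value in seconds.items()}


# Print the runs from fastest to slowest with their relative slowdown.
ratios = slowdown_vs_fastest(REAL_TIMES)
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:<6} {ratio:.2f}x")
```

By this measure the optimized Perl run takes roughly twice as long as the C version in wall-clock terms.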