Hi! Please consider following me on twitter: @hanekomu.

Perl benchmarks: Nested data structures

The conclusion first: Nested arrays and hashes in Perl are slow. They get slower the more levels you nest. For most applications this won't be a problem, but if you cache values in a hash, you might want to use as few levels of nesting as possible.

First, let's benchmark reading from and writing to nested array elements. We use different arrays for six different levels of nesting, and all element indices are constants.

use Benchmark qw(:all);

our (@array1, @array2, @array3, @array4, @array5, @array6);

cmpthese(timethese(10_000_000, {
    L1w => sub { $array1[1] = 1 },
    L2w => sub { $array2[1][2] = 1 },
    L3w => sub { $array3[1][2][3] = 1 },
    L4w => sub { $array4[1][2][3][4] = 1 },
    L5w => sub { $array5[1][2][3][4][5] = 1 },
    L6w => sub { $array6[1][2][3][4][5][6] = 1 },
    L1r => sub { $array1[1] },
    L2r => sub { $array2[1][2] },
    L3r => sub { $array3[1][2][3] },
    L4r => sub { $array4[1][2][3][4] },
    L5r => sub { $array5[1][2][3][4][5] },
    L6r => sub { $array6[1][2][3][4][5][6] },
}));

The results:

Benchmark: timing 10000000 iterations of L1r, L1w, L2r, L2w, L3r, L3w, L4r, L4w, L5r, L5w, L6r, L6w...
       L1r:  0 wallclock secs ( 0.20 usr + -0.00 sys =  0.20 CPU) @ 50000000.00/s (n=10000000)
            (warning: too few iterations for a reliable count)
       L1w:  1 wallclock secs ( 0.72 usr +  0.01 sys =  0.73 CPU) @ 13698630.14/s (n=10000000)
       L2r:  1 wallclock secs ( 1.19 usr +  0.01 sys =  1.20 CPU) @ 8333333.33/s (n=10000000)
       L2w:  1 wallclock secs ( 2.01 usr +  0.02 sys =  2.03 CPU) @ 4926108.37/s (n=10000000)
       L3r:  2 wallclock secs ( 1.82 usr +  0.02 sys =  1.84 CPU) @ 5434782.61/s (n=10000000)
       L3w:  3 wallclock secs ( 2.67 usr +  0.03 sys =  2.70 CPU) @ 3703703.70/s (n=10000000)
       L4r:  3 wallclock secs ( 2.63 usr +  0.01 sys =  2.64 CPU) @ 3787878.79/s (n=10000000)
       L4w:  4 wallclock secs ( 3.54 usr +  0.02 sys =  3.56 CPU) @ 2808988.76/s (n=10000000)
       L5r:  3 wallclock secs ( 3.04 usr +  0.00 sys =  3.04 CPU) @ 3289473.68/s (n=10000000)
       L5w:  4 wallclock secs ( 4.13 usr +  0.04 sys =  4.17 CPU) @ 2398081.53/s (n=10000000)
       L6r:  4 wallclock secs ( 3.76 usr +  0.01 sys =  3.77 CPU) @ 2652519.89/s (n=10000000)
       L6w:  5 wallclock secs ( 4.46 usr +  0.02 sys =  4.48 CPU) @ 2232142.86/s (n=10000000)
          Rate   L6w   L5w   L6r   L4w   L5r   L3w   L4r  L2w  L3r  L2r  L1w  L1r
L6w  2232143/s    --   -7%  -16%  -21%  -32%  -40%  -41% -55% -59% -73% -84% -96%
L5w  2398082/s    7%    --  -10%  -15%  -27%  -35%  -37% -51% -56% -71% -82% -95%
L6r  2652520/s   19%   11%    --   -6%  -19%  -28%  -30% -46% -51% -68% -81% -95%
L4w  2808989/s   26%   17%    6%    --  -15%  -24%  -26% -43% -48% -66% -79% -94%
L5r  3289474/s   47%   37%   24%   17%    --  -11%  -13% -33% -39% -61% -76% -93%
L3w  3703704/s   66%   54%   40%   32%   13%    --   -2% -25% -32% -56% -73% -93%
L4r  3787879/s   70%   58%   43%   35%   15%    2%    -- -23% -30% -55% -72% -92%
L2w  4926108/s  121%  105%   86%   75%   50%   33%   30%   --  -9% -41% -64% -90%
L3r  5434783/s  143%  127%  105%   93%   65%   47%   43%  10%   -- -35% -60% -89%
L2r  8333333/s  273%  248%  214%  197%  153%  125%  120%  69%  53%   -- -39% -83%
L1w 13698630/s  514%  471%  416%  388%  316%  270%  262% 178% 152%  64%   -- -73%
L1r 50000000/s 2140% 1985% 1785% 1680% 1420% 1250% 1220% 915% 820% 500% 265%   --

Next we do the same for hashes, again with constant hash keys:

use Benchmark qw(:all);

our (%hash1, %hash2, %hash3, %hash4, %hash5, %hash6);

cmpthese(timethese(10_000_000, {
    L1w => sub { $hash1{L1} = 1 },
    L2w => sub { $hash2{L1}{L2} = 1 },
    L3w => sub { $hash3{L1}{L2}{L3} = 1 },
    L4w => sub { $hash4{L1}{L2}{L3}{L4} = 1 },
    L5w => sub { $hash5{L1}{L2}{L3}{L4}{L5} = 1 },
    L6w => sub { $hash6{L1}{L2}{L3}{L4}{L5}{L6} = 1 },
    L1r => sub { $hash1{L1} },
    L2r => sub { $hash2{L1}{L2} },
    L3r => sub { $hash3{L1}{L2}{L3} },
    L4r => sub { $hash4{L1}{L2}{L3}{L4} },
    L5r => sub { $hash5{L1}{L2}{L3}{L4}{L5} },
    L6r => sub { $hash6{L1}{L2}{L3}{L4}{L5}{L6} },
}));

The results:

Benchmark: timing 10000000 iterations of L1r, L1w, L2r, L2w, L3r, L3w, L4r, L4w, L5r, L5w, L6r, L6w...
       L1r:  0 wallclock secs ( 0.63 usr + -0.00 sys =  0.63 CPU) @ 15873015.87/s (n=10000000)
       L1w:  0 wallclock secs ( 1.53 usr +  0.01 sys =  1.54 CPU) @ 6493506.49/s (n=10000000)
       L2r:  2 wallclock secs ( 1.58 usr + -0.00 sys =  1.58 CPU) @ 6329113.92/s (n=10000000)
       L2w:  2 wallclock secs ( 2.35 usr +  0.02 sys =  2.37 CPU) @ 4219409.28/s (n=10000000)
       L3r:  3 wallclock secs ( 2.58 usr +  0.01 sys =  2.59 CPU) @ 3861003.86/s (n=10000000)
       L3w:  4 wallclock secs ( 3.67 usr +  0.02 sys =  3.69 CPU) @ 2710027.10/s (n=10000000)
       L4r:  4 wallclock secs ( 3.07 usr +  0.02 sys =  3.09 CPU) @ 3236245.95/s (n=10000000)
       L4w:  5 wallclock secs ( 4.69 usr +  0.01 sys =  4.70 CPU) @ 2127659.57/s (n=10000000)
       L5r:  5 wallclock secs ( 4.67 usr +  0.02 sys =  4.69 CPU) @ 2132196.16/s (n=10000000)
       L5w:  5 wallclock secs ( 5.20 usr +  0.02 sys =  5.22 CPU) @ 1915708.81/s (n=10000000)
       L6r:  6 wallclock secs ( 5.77 usr +  0.01 sys =  5.78 CPU) @ 1730103.81/s (n=10000000)
       L6w:  7 wallclock secs ( 6.35 usr +  0.01 sys =  6.36 CPU) @ 1572327.04/s (n=10000000)
          Rate  L6w  L6r  L5w  L4w  L5r  L3w  L4r  L3r  L2w  L2r  L1w  L1r
L6w  1572327/s   --  -9% -18% -26% -26% -42% -51% -59% -63% -75% -76% -90%
L6r  1730104/s  10%   -- -10% -19% -19% -36% -47% -55% -59% -73% -73% -89%
L5w  1915709/s  22%  11%   -- -10% -10% -29% -41% -50% -55% -70% -70% -88%
L4w  2127660/s  35%  23%  11%   --  -0% -21% -34% -45% -50% -66% -67% -87%
L5r  2132196/s  36%  23%  11%   0%   -- -21% -34% -45% -49% -66% -67% -87%
L3w  2710027/s  72%  57%  41%  27%  27%   -- -16% -30% -36% -57% -58% -83%
L4r  3236246/s 106%  87%  69%  52%  52%  19%   -- -16% -23% -49% -50% -80%
L3r  3861004/s 146% 123% 102%  81%  81%  42%  19%   --  -8% -39% -41% -76%
L2w  4219409/s 168% 144% 120%  98%  98%  56%  30%   9%   -- -33% -35% -73%
L2r  6329114/s 303% 266% 230% 197% 197% 134%  96%  64%  50%   --  -3% -60%
L1w  6493506/s 313% 275% 239% 205% 205% 140% 101%  68%  54%   3%   -- -59%
L1r 15873016/s 910% 817% 729% 646% 644% 486% 390% 311% 276% 151% 144%   --

I hadn't expected the differences to be so dramatic...

Tags: .

Write a comment | Bookmark and Share

posted at: 16:38 | path: /benchmarks | permalink | 5 comments | 0 trackbacks