ℹ️ Select 'Choose Exercise', or randomize 'Next Random Exercise' in selected language.

Choose Exercise:
Timer 00:00
WPM --
Score --
Acc --
Correct chars --

WAT Loop Unrolling for Array Summation

WAT

Goal -- WPM

Ready
Exercise Algorithm Area
1;; Array Summation with Loop Unrolling (Factor of 4) in WAT
2;; Computes the sum of elements in an i32 array.
3
4(func $sum_array_unrolled (param $array_ptr i32) (param $array_len i32) (result i32)
5(local $sum i32) ;; Accumulator for the sum
6(local $i i32) ;; Loop counter
7(local $unrolled_len i32) ;; Length after unrolling
8(local $remainder i32) ;; Number of remaining elements
9
10;; Initialize sum and loop counter
11(set_local $sum (i32.const 0))
12(set_local $i (i32.const 0))
13
14;; Determine the length for the unrolled loop
15;; We unroll by a factor of 4.
16(set_local $unrolled_len (i32.mul (i32.div_s $array_len (i32.const 4)) (i32.const 4)))
17(set_local $remainder (i32.sub $array_len $unrolled_len))
18
19;; Unrolled loop (process 4 elements at a time)
20(loop $unrolled_loop
21(if (i32.ge $i $unrolled_len) (br $end_unrolled_loop))
22
23;; Load and add elements
24(set_local $sum (i32.add $sum (memory.load (i32.add $array_ptr (i32.mul $i (i32.const 4))))))
25(set_local $sum (i32.add $sum (memory.load (i32.add $array_ptr (i32.mul (i32.add $i (i32.const 1))) (i32.const 4)))))
26(set_local $sum (i32.add $sum (memory.load (i32.add $array_ptr (i32.mul (i32.add $i (i32.const 2))) (i32.const 4)))))
27(set_local $sum (i32.add $sum (memory.load (i32.add $array_ptr (i32.mul (i32.add $i (i32.const 3))) (i32.const 4)))))
28
29;; Increment counter by 4
30(set_local $i (i32.add $i (i32.const 4)))
31(br $unrolled_loop)
32)
33(label $end_unrolled_loop)
34
35;; Handle remaining elements (if any)
36(loop $remainder_loop
37(if (i32.ge $i $array_len) (br $end_remainder_loop))
38
39(set_local $sum (i32.add $sum (memory.load (i32.add $array_ptr (i32.mul $i (i32.const 4))))))
40
41;; Increment counter by 1
42(set_local $i (i32.add $i (i32.const 1)))
43(br $remainder_loop)
44)
45(label $end_remainder_loop)
46
47;; Return the total sum
48(return $sum)
49)
50
51;; Helper function to create an array for testing
52(func $create_test_array (param $size i32) (result i32)
53(local $array_ptr i32)
54(local $i i32)
55(local $value i32)
56
57;; Allocate memory for the array
58(set_local $array_ptr (memory.grow (i32.add (i32.div (local.get $size) (i32.const 65535)) (i32.const 1))))
59
60;; Fill array with values (e.g., 1, 2, 3, ...)
61(set_local $i (i32.const 0))
62(set_local $value (i32.const 1))
63(loop $fill_loop
64(if (i32.ge $i $size) (br $end_fill_loop))
65(memory.store (i32.add $array_ptr (i32.mul $i (i32.const 4))) $value)
66(set_local $i (i32.add $i (i32.const 1)))
67(set_local $value (i32.add $value (i32.const 1)))
68(br $fill_loop)
69)
70(label $end_fill_loop)
71
72(return $array_ptr)
73)
74
75;; Example usage
76(func $example_unroll_sum
77;; Create an array of size 10: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
78(local $arr_ptr i32)
79(local $arr_len i32)
80(local $total_sum i32)
81
82(set_local $arr_len (i32.const 10))
83(set_local $arr_ptr (call $create_test_array $arr_len))
84
85;; Calculate sum using unrolled function
86(set_local $total_sum (call $sum_array_unrolled $arr_ptr $arr_len))
87
88;; Expected sum for [1..10] is 55
89
90;; Test with an array size not divisible by 4
91(set_local $arr_len (i32.const 7)) ;; [1, 2, 3, 4, 5, 6, 7]
92(set_local $arr_ptr (call $create_test_array $arr_len))
93(set_local $total_sum (call $sum_array_unrolled $arr_ptr $arr_len))
94;; Expected sum for [1..7] is 28
95
96;; Test with empty array
97(set_local $arr_len (i32.const 0))
98(set_local $arr_ptr (call $create_test_array $arr_len))
99(set_local $total_sum (call $sum_array_unrolled $arr_ptr $arr_len))
100;; Expected sum is 0
101
102(return (i32.const 0))
103)
Algorithm description viewbox

WAT Loop Unrolling for Array Summation

Algorithm description:

This WAT program demonstrates loop unrolling for array summation. Instead of processing one element per iteration, the loop is unrolled by a factor of 4, meaning it processes four elements in each iteration. This reduces loop overhead and can improve performance by exposing more parallelism to the processor. The code also handles any remaining elements that don't form a full group of four.

Algorithm explanation:

The `sum_array_unrolled` function calculates the sum of elements in an i32 array. It first determines how many full groups of 4 elements can be processed and how many remain. The main loop iterates `unrolled_len` times, but in each iteration, it loads and adds four elements at once. This reduces the number of loop control instructions (increments, comparisons, branches). After the unrolled loop, a separate loop handles the `remainder` elements. The time complexity remains O(N), but the constant factor is reduced due to fewer loop overheads, potentially leading to faster execution. Space complexity is O(1). Edge cases include empty arrays and arrays whose lengths are not multiples of the unrolling factor (4). The `create_test_array` helper is for demonstration.

Pseudocode:

function sum_array_unrolled(array, length):
  sum = 0
  i = 0
  unrolled_length = floor(length / 4) * 4
  remainder = length - unrolled_length

  // Unrolled loop (process 4 elements at a time)
  while i < unrolled_length:
    sum = sum + array[i]
    sum = sum + array[i+1]
    sum = sum + array[i+2]
    sum = sum + array[i+3]
    i = i + 4

  // Handle remaining elements
  while i < length:
    sum = sum + array[i]
    i = i + 1

  return sum