Levenstein distance

machineauthorcommitcommit dateplatformtimeresultcheckrel timebonus

Single-threaded version of an assignment from NPRG042 Programming in Parallel Environment.

Levenstein distance of words A[0..M-1], B[0..N-1] is defined as:

        D(0,0) = 0;
        D(0,j) = j;
        D(i,0) = i;
        D(i+1,j+1) = min( D(i,j+1) + 1, D(i,j) + (A[i]==B[j] ? 0 : 1), D(i+1,j) + 1);
        D = D(M,N);
    

The word elements A[i], B[j] as well as the distances D[i,j] are unsigned 32-bit numbers.

Interface

        namespace levensol {

            template< typename policy>
            class levenstein {
            public:
                levenstein(std::size_t a_size, std::size_t b_size);
        
                std::uint32_t compute(const std::uint32_t* a, const std::uint32_t* b);
            };
        
            struct policy_sse {
            };
        
            struct policy_avx {
            };
        
            struct policy_avx512 {
            };
        
        }
    

The class constructor accepts two numbers defining the sizes of the two input words. The constructor shall allocate all buffers used later by the compute function. Time consumed by the constructor is not measured.

The compute function accepts two pointers to the two input words, represented as arrays of std::uint32_t. The sizes of the words were specified as arguments to the constructor; i.e. all invocations of compute on the same object work with words of the same size. The function shall compute and return the levenstein distance of the two words.

The test wrapper displays the wall time divided by the M*N factor, i.e. the time per one matrix element, in nanoseconds.

Test parameters

a_size and b_size parameters specify the sizes of the two words; both are iterated through the set { 128, 2048, 32768 }, resulting in 9 test configurations on each platform. In Debug mode, only { 128, 2048 } is used.

repeats is an auto-adjusted parameter, used to increase running time by repeatedly calling compute. The range of the parameter is set so that the expected run time ranges from fractions of a second to seconds (stopped by the auto-adjustment mechanism after exceeding a second). For the largest configuration (32768*32768), there is usually only one repetition.