A very effective pattern-matching algorithm, developed by Rabin and Karp [54], relies on hashing to achieve very good expected performance. Recall that the brute-force algorithm compares the pattern to each possible placement in the text, spending $O(m)$ time, in the worst case, for each such comparison. The premise of the Rabin-Karp algorithm is to compute a hash function, $h(\cdot)$, on the length-$m$ pattern, and then to compute the hash function on all length-$m$ substrings of the text. The pattern $P$ occurs at substring $T[j..j+m-1]$ only if $h(P)$ equals $h(T[j..j+m-1])$. If the hash values are equal, the match at that location must then be verified with the brute-force approach, since distinct strings can coincidentally collide on the same hash value. But with a good hash function, there will be very few such false matches.

The next challenge, however, is that computing a good hash function on a length-$m$ substring would presumably require $O(m)$ time. If we did this for each of the $O(n)$ possible locations, the algorithm would be no better than the brute-force approach. The trick is to rely on a polynomial hash code, as originally introduced in Section 10.2.1, such as

$(x_0 a^{m-1} + x_1 a^{m-2} + \cdots + x_{m-2} a + x_{m-1}) \bmod p$

for a substring $(x_0, x_1, \ldots, x_{m-1})$, randomly chosen $a$, and large prime $p$. We can compute the hash value of each successive substring of the text in $O(1)$ time each by using the following formula:

$h(T[j+1..j+m]) = (a \cdot h(T[j..j+m-1]) - x_j a^m + x_{j+m}) \bmod p.$

Implement the Rabin-Karp algorithm and evaluate its efficiency.
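As a starting point for the implementation the exercise asks for, here is a minimal Python sketch of the rolling-hash search described above. The function name `rabin_karp`, the default base `a=256`, and the default prime `p=1_000_000_007` are assumptions made for illustration; the exercise calls for a randomly chosen $a$ and only says that $p$ should be a large prime. Characters are mapped to integers with `ord`, which is also an assumption.

```python
def rabin_karp(text, pattern, a=256, p=1_000_000_007):
    """Return the lowest index at which pattern occurs in text, or -1.

    a and p are illustrative defaults (assumptions), not values
    prescribed by the exercise.
    """
    n, m = len(text), len(pattern)
    if m == 0:
        return 0          # the empty pattern matches at index 0
    if m > n:
        return -1

    a_m = pow(a, m, p)    # a^m mod p, used to remove the outgoing character

    # Hash the pattern and the first length-m window with Horner's rule.
    h_pat = 0
    h_win = 0
    for k in range(m):
        h_pat = (a * h_pat + ord(pattern[k])) % p
        h_win = (a * h_win + ord(text[k])) % p

    for j in range(n - m + 1):
        # Equal hashes may be a collision, so verify character by character.
        if h_win == h_pat and text[j:j + m] == pattern:
            return j
        if j + m < n:
            # h(T[j+1..j+m]) = (a*h(T[j..j+m-1]) - x_j*a^m + x_{j+m}) mod p
            h_win = (a * h_win - ord(text[j]) * a_m + ord(text[j + m])) % p
    return -1
```

For example, `rabin_karp("abracadabra", "cad")` returns `4`. Each window's hash is updated in constant time, so the expected running time is $O(n + m)$ when collisions are rare; if many windows collide with the pattern's hash, the verification step can push the worst case to $O(nm)$. One way to evaluate efficiency empirically is to time this function against a brute-force search on long texts while varying $m$.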

