add ons in string directory - Boyer_Moore_Search #933

Merged
merged 1 commit into from
Jul 2, 2019
88 changes: 88 additions & 0 deletions strings/Boyer_Moore_Search.py
@@ -0,0 +1,88 @@
"""
The algorithm finds the pattern in the given text using the following rule.

The bad-character rule considers the mismatched character in Text.
The next occurrence of that character to the left in Pattern is found.

If the mismatched character occurs to the left in Pattern,
a shift is proposed that aligns that occurrence with the
mismatched character in Text.

If the mismatched character does not occur to the left in Pattern,
a shift is proposed that moves the entirety of Pattern past
the point of mismatch in the text.

If there is no mismatch then the pattern matches with the text block.

Time Complexity : O(n/m) in the best case, O(n*m) in the worst case
    n=length of main string
    m=length of pattern string
"""


class BoyerMooreSearch:
    def __init__(self, text, pattern):
        self.text, self.pattern = text, pattern
        self.textLen, self.patLen = len(text), len(pattern)


    def match_in_pattern(self, char):
        """ finds the index of the last occurrence of char in pattern

        Parameters :
            char (chr): character to be searched

        Returns :
            i (int): index of the last occurrence of char in pattern
            -1 (int): if char is not found in pattern
        """

        # scan the pattern from right to left
        for i in range(self.patLen-1, -1, -1):
            if char == self.pattern[i]:
                return i
        return -1


    def mismatch_in_text(self, currentPos):
        """ finds the index of the mismatched character in text,
        comparing pattern with the text block from right to left

        Parameters :
            currentPos (int): index in text where the text block starts

        Returns :
            i (int): index in text of the rightmost mismatched char
            -1 (int): if there is no mismatch between pattern and text block
        """

        # compare the pattern with the text block from right to left
        for i in range(self.patLen-1, -1, -1):
            if self.pattern[i] != self.text[currentPos + i]:
                return currentPos + i
        return -1


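    def bad_character_heuristic(self):
        # searches pattern in text and returns index positions
        positions = []
        i = 0
        # a while loop is needed here: reassigning the loop variable of a
        # for loop over range() has no effect on the next iteration
        while i <= self.textLen - self.patLen:
            mismatch_index = self.mismatch_in_text(i)
            if mismatch_index == -1:
                positions.append(i)
                i += 1
            else:
                match_index = self.match_in_pattern(self.text[mismatch_index])
                # shifting index: align the last occurrence in pattern with
                # the mismatched character, always moving forward by at least 1
                i = max(i + 1, mismatch_index - match_index)
        return positions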


if __name__ == "__main__":
    text = "ABAABA"
    pattern = "AB"
    bms = BoyerMooreSearch(text, pattern)
    positions = bms.bad_character_heuristic()

    if len(positions) == 0:
        print("No match found")
    else:
        print("Pattern found in following positions: ")
        print(positions)
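
Possible follow-up, not part of this commit: `match_in_pattern` rescans the pattern on every mismatch. A common variant of the bad-character rule precomputes a last-occurrence table once, so each lookup is a dictionary access. The sketch below is a standalone illustration under that assumption; the names `bad_character_table` and `bad_character_search` are hypothetical and do not appear in the PR.

```python
def bad_character_table(pattern):
    """Map each character to the index of its last occurrence in pattern."""
    # later occurrences overwrite earlier ones, so the rightmost index wins
    return {char: index for index, char in enumerate(pattern)}


def bad_character_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    last = bad_character_table(pattern)
    n, m = len(text), len(pattern)
    positions = []
    i = 0
    while i <= n - m:
        # compare the pattern with the text block from right to left
        j = m - 1
        while j >= 0 and pattern[j] == text[i + j]:
            j -= 1
        if j < 0:
            # no mismatch: record the match and move on by one
            positions.append(i)
            i += 1
        else:
            # align the last occurrence of the mismatched text character
            # with the mismatch position; never shift by less than one
            i += max(1, j - last.get(text[i + j], -1))
    return positions


print(bad_character_search("ABAABA", "AB"))  # prints [0, 3]
```

The `max(1, ...)` guard matters because the table stores the last occurrence in the whole pattern, which can lie to the right of the mismatch and would otherwise produce a backward shift.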