Added Python Implementation of Suffix Arrays and LCP Arrays #12171

Open: wants to merge 20 commits into base: master
Showing changes from 1 commit

Commits (20):
6132d40  Suffix Array and LCP implementation.py (putul03, Oct 19, 2024)
06a7be7  Delete divide_and_conquer/Suffix Array and LCP implementation.py (putul03, Oct 19, 2024)
1e8f767  Added Suffix Array and LCP implementation (putul03, Oct 19, 2024)
0094577  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 19, 2024)
123e6f0  Suffix Array and LCP implementation.py (putul03, Oct 19, 2024)
848a358  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 19, 2024)
d950f57  Delete divide_and_conquer/Suffix Array and LCP implementation.py (putul03, Oct 19, 2024)
dae072c  Suffix Array and LCP Array Implementation (putul03, Oct 19, 2024)
70c3869  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 19, 2024)
c7f137e  suffix_array_lcp.py (putul03, Oct 19, 2024)
8038826  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 19, 2024)
8b0e74e  suffix_array_lcp.py (putul03, Oct 19, 2024)
1b37c1c  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 19, 2024)
a4073ca  suffix_array_lcp.py (putul03, Oct 19, 2024)
8dcffa3  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 19, 2024)
81c09d1  Longest Palindromic Subsequence (putul03, Oct 19, 2024)
b01fbff  Delete dynamic_programming/longest_palindromic_subsequence.py (putul03, Oct 19, 2024)
ada767d  Add files via upload (putul03, Oct 19, 2024)
0018a8e  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 19, 2024)
6380f89  Delete data_structures/persistent_segment_tree.py (putul03, Oct 19, 2024)
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
pre-commit-ci[bot] committed Oct 19, 2024
commit 848a358d80945b27cfa76e50ff0847d01da57034
9 changes: 6 additions & 3 deletions divide_and_conquer/Suffix Array and LCP implementation.py
@@ -1,4 +1,4 @@
from typing import List

Check failure on line 1 in divide_and_conquer/Suffix Array and LCP implementation.py
GitHub Actions / ruff: Ruff (N999)
divide_and_conquer/Suffix Array and LCP implementation.py:1:1: N999 Invalid module name: 'Suffix Array and LCP implementation'

Check failure on line 1 in divide_and_conquer/Suffix Array and LCP implementation.py
GitHub Actions / ruff: Ruff (UP035)
divide_and_conquer/Suffix Array and LCP implementation.py:1:1: UP035 `typing.List` is deprecated, use `list` instead
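
The UP035 failure above (and the UP006 failures further down) can be resolved by dropping the `typing.List` import and annotating with the built-in `list`, which is subscriptable from Python 3.9 onward. A minimal sketch of the annotation style follows; `suffix_starts` is a hypothetical helper used only for illustration and is not part of this PR.

```python
# No `from typing import List` needed: the built-in `list` accepts
# type parameters directly on Python 3.9+.


def suffix_starts(text: str) -> list[int]:
    """Return every suffix start index of ``text`` in increasing order.

    Hypothetical helper; it exists only to show the `list[int]` annotation.
    """
    return list(range(len(text)))


print(suffix_starts("abc"))  # [0, 1, 2]
```

The same substitution would apply to the return annotations of `build_suffix_array` and `build_lcp_array`.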


class SuffixArray:
@@ -10,10 +10,10 @@
        self.suffix_array = self.build_suffix_array()
        self.lcp_array = self.build_lcp_array()

    def build_suffix_array(self) -> List[int]:

Check failure on line 13 in divide_and_conquer/Suffix Array and LCP implementation.py
GitHub Actions / ruff: Ruff (UP006)
divide_and_conquer/Suffix Array and LCP implementation.py:13:37: UP006 Use `list` instead of `List` for type annotation
        """
        Builds the suffix array for the input string.
        Returns the suffix array (a list of starting indices of suffixes in sorted order).

Check failure on line 16 in divide_and_conquer/Suffix Array and LCP implementation.py
GitHub Actions / ruff: Ruff (E501)
divide_and_conquer/Suffix Array and LCP implementation.py:16:89: E501 Line too long (90 > 88)

        Example:
        >>> sa = SuffixArray("banana")
@@ -22,10 +22,12 @@
        """
        n = len(self.text)
        # Create a list of suffix indices sorted by the suffixes they point to
-        sorted_suffix_indices = sorted(range(n), key=lambda suffix_index: self.text[suffix_index:])
+        sorted_suffix_indices = sorted(
+            range(n), key=lambda suffix_index: self.text[suffix_index:]
+        )
        return sorted_suffix_indices
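
The key-based sort in this hunk can be tried in isolation. Below is a minimal sketch, assuming a free function named `naive_suffix_array` (a hypothetical name; the PR implements this as a method on `SuffixArray`). Each key is a full suffix slice, so a single comparison can cost O(n) and the whole sort is roughly O(n² log n) in the worst case, which is fine for short inputs but not for large texts.

```python
def naive_suffix_array(text: str) -> list[int]:
    """Return suffix start indices ordered by the suffixes they point to.

    Hypothetical standalone version of the sort shown in the diff.
    """
    # Sort the start positions, comparing the suffix text[start:] at each one.
    return sorted(range(len(text)), key=lambda start: text[start:])


text = "banana"
print(naive_suffix_array(text))  # [5, 3, 1, 0, 4, 2]
```

For "banana" the sorted suffixes are "a", "ana", "anana", "banana", "na", "nana", which gives the start indices above.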

    def build_lcp_array(self) -> List[int]:

Check failure on line 30 in divide_and_conquer/Suffix Array and LCP implementation.py
GitHub Actions / ruff: Ruff (UP006)
divide_and_conquer/Suffix Array and LCP implementation.py:30:34: UP006 Use `list` instead of `List` for type annotation
        """
        Builds the LCP (Longest Common Prefix) array for the suffix array.
        LCP[i] gives the length of the longest common prefix of the suffixes
@@ -41,7 +43,7 @@
        rank = [0] * n
        lcp = [0] * n

        # Build the rank array where rank[i] gives the position of the suffix starting at index i

Check failure on line 46 in divide_and_conquer/Suffix Array and LCP implementation.py
GitHub Actions / ruff: Ruff (E501)
divide_and_conquer/Suffix Array and LCP implementation.py:46:89: E501 Line too long (97 > 88)
        for rank_index, suffix in enumerate(suffix_array):
            rank[suffix] = rank_index

@@ -49,7 +51,9 @@
        for i in range(n):
            if rank[i] > 0:
                j = suffix_array[rank[i] - 1]  # Previous suffix in the sorted order
-                while (i + h < n) and (j + h < n) and self.text[i + h] == self.text[j + h]:
+                while (
+                    (i + h < n) and (j + h < n) and self.text[i + h] == self.text[j + h]
+                ):
                    h += 1
                lcp[rank[i]] = h
                if h > 0:
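
The loop in this hunk has the shape of Kasai's algorithm: walk the suffixes in text order, compare each with its predecessor in sorted order, and carry the match length `h` forward minus one. The hunk is cut off after `if h > 0:`, so the sketch below is an assumption about how the rest would look, not the PR's code. `kasai_lcp` is a hypothetical free function, and the reset of `h` when a suffix has no predecessor is a conventional choice rather than something shown in the diff.

```python
def kasai_lcp(text: str, suffix_array: list[int]) -> list[int]:
    """Return lcp where lcp[k] is the longest common prefix length of the
    suffixes at sorted positions k - 1 and k (lcp[0] is 0).

    Hypothetical standalone sketch of Kasai's algorithm.
    """
    n = len(text)
    # rank[i] is the position of the suffix starting at i in sorted order.
    rank = [0] * n
    for position, start in enumerate(suffix_array):
        rank[start] = position

    lcp = [0] * n
    h = 0  # Length of the current match, reused between iterations.
    for i in range(n):
        if rank[i] == 0:
            # The smallest suffix has no predecessor to compare against.
            h = 0
            continue
        j = suffix_array[rank[i] - 1]  # Previous suffix in sorted order.
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1  # The next suffix shares at least h - 1 characters.
    return lcp


print(kasai_lcp("banana", [5, 3, 1, 0, 4, 2]))  # [0, 1, 3, 0, 0, 2]
```

Because `h` drops by at most one per iteration and never exceeds `n`, the inner `while` does O(n) work in total, so the LCP array is built in linear time once the suffix array is available.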
@@ -77,7 +81,6 @@
            print(f"{suffix_index}: {self.text[suffix_index:]}")


-
# Example usage:
if __name__ == "__main__":
    text = "banana"