-
Notifications
You must be signed in to change notification settings - Fork 0
Solution with comments (see "unknown_alphabet" function) #28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| import unittest | ||
| from unknown_alphabet import * | ||
|
|
||
| class TestUnknownAlphabet(unittest.TestCase): | ||
|
|
||
| def test_0(self): | ||
| self.assertEqual(unknown_alphabet([]), []) | ||
|
|
||
| def test_1(self): | ||
| self.assertEqual(unknown_alphabet(["a"]), ["a"]) | ||
| self.assertEqual(unknown_alphabet(["1", "2", "3"]), ["1", "2", "3"]) | ||
| self.assertFalse(unknown_alphabet(["1", "2", "3", "2"])) | ||
|
|
||
| def test_2(self): | ||
| self.assertEqual(unknown_alphabet(["1a", "1b", "a1", "ab"]), ["1", "a", "b"]) | ||
|
|
||
| def test_3(self): | ||
| self.assertTrue(unknown_alphabet(["ART", "RAT", "CAT", "CAR"]) == ["A", "T", "R", "C"] or | ||
| unknown_alphabet(["ART", "RAT", "CAT", "CAR"]) == ["T", "A", "R", "C"]) | ||
|
|
||
| if __name__ == '__main__': | ||
| unittest.main() | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,49 @@ | ||
| from collections import defaultdict | ||
|
|
||
| def dfs(ordered_vertices, edges, color, v): | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This dfs is seriously scary. I think it needs at least some comments. What does it return and why? My current guess is topsort order and a color array if it exists or None if the graph has a cycle, but why return them if the function modifies them anyway? Also imho searching for cycles and topsorting in the same function causes unnecessary confusion. Maybe comments would help, but why not split this stuff into two functions anyway?
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh I think we should look for cycles during the topsort, because they both require searching on graph. I liked the coloring, which helps detecting if it is possible or not, but not sure how to use it for finding maximal satisfying subset. Also, comments would be great indeed.
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, I'll add comments soon |
||
| if color[v] == 1: | ||
| return (ordered_vertices, None) | ||
| if color[v] == 2: | ||
| return (ordered_vertices, color) | ||
| color[v] = 1 | ||
| if v in edges.keys(): | ||
| for v_ in edges[v]: | ||
| ordered_vertices, color = dfs(ordered_vertices, edges, color, v_) | ||
| if v_ in edges.keys() and color == None: | ||
| return (ordered_vertices, None) | ||
| ordered_vertices.append(v) | ||
| color[v] = 2 | ||
| return (ordered_vertices, color) | ||
|
|
||
| def topological_sort(edges, symbols): | ||
| ordered_vertices = [] | ||
| N = len(symbols) | ||
| color = dict() | ||
| for v in symbols: | ||
| color[v] = 0 | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. stuff like this could look cleaner as
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I agree |
||
| for v in symbols: | ||
| ordered_vertices, color = dfs(ordered_vertices, edges, color, v) | ||
| if color == None: | ||
| return False # False if there is no correct alphabet possible | ||
| residual_symbols = list(symbols.difference(set(ordered_vertices))) | ||
| ordered_vertices.extend(residual_symbols) | ||
| return ordered_vertices[::-1] | ||
|
|
||
| def unknown_alphabet(dictionary): | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the name of the function is not definitive. Something like "find_alphabet" might be better? |
||
| if len(dictionary) < 1: | ||
| return [] | ||
| edges = defaultdict(set) | ||
| symbols = set() | ||
| symbols.update(dictionary[0]) | ||
|
|
||
| for w1, w2 in zip(dictionary[:-1], dictionary[1:]): | ||
| for i in range(min(len(w1), len(w2))): | ||
| symbols.add(w1[i]) | ||
| if w1[i] != w2[i]: | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Once again, I do get the feeling that there's a lot going on in these functions. Possibly splitting up some of the logic could help?
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added some comments here. Hope it will be more clear now. |
||
| edges[w1[i]].add(w2[i]) | ||
| symbols.add(w2[i]) | ||
| break | ||
| symbols.update(w2) | ||
|
|
||
| alphabet = topological_sort(edges, symbols) | ||
| return alphabet | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice tests, we should merge them with @crossopt 's unittests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I've already think about it