search engine: first draft of code and also (slightly desynchronized) chapter text! #33

Merged
merged 67 commits into from
Feb 3, 2016
e6daaaf
adding my name
Feb 24, 2014
1817e9a
some initial search engine rethoughts
invalid-email-address Feb 25, 2014
3fd499d
a little more search engine
invalid-email-address Feb 25, 2014
5cc4dd0
more writing about search engines
invalid-email-address Feb 25, 2014
b6a4ecf
implemented most of indexing
invalid-email-address Feb 25, 2014
a040a2a
most of the search engine is now working
invalid-email-address Feb 25, 2014
4e24010
made querying the search engine work
invalid-email-address Feb 26, 2014
0d7b145
updating search-engine text
invalid-email-address Feb 26, 2014
a621ddc
search engine quibble comment
kragen Feb 26, 2014
7a40518
documenting merge strategy and index structure
kragen Feb 26, 2014
1472b46
search engine: more discussion of merge policy
kragen Feb 26, 2014
8c85365
renamed indexdir vars to index_dir
kragen Feb 26, 2014
6d84782
made index size sim more forgiving
kragen Feb 26, 2014
3c483fe
total overhaul of search engine
kragen Feb 27, 2014
04c3a14
more updates to search-engine text
kragen Feb 27, 2014
5c2431d
added search-engine todo
kragen Feb 27, 2014
646324d
more search-engine notes
kragen Feb 27, 2014
39a162a
fixed my contributor row
kragen Feb 27, 2014
bba0290
simplified search engine
kragen Feb 27, 2014
0b97e99
fixed persistence to cope with spaces
kragen Feb 27, 2014
af29885
updated search engine todo
kragen Feb 27, 2014
d856016
failed attempt to port search-engine to Jython 2.5
kragen Feb 27, 2014
82ef17f
updated search-engine todo
kragen Feb 27, 2014
f156979
search-engine: more text
kragen Feb 27, 2014
1f39e58
search-engine: explained generating indices
kragen Feb 28, 2014
ec97985
search-engine: removed todo item
Feb 28, 2014
7364dfd
search-engine: measured performance
kragen Feb 28, 2014
1f4c5df
search-engine: some readability/correctness updates
kragen Feb 28, 2014
33d823a
search-engine: updated TODO
kragen Feb 28, 2014
c5669fd
Merge branch 'master' of github.com:kragen/500lines
kragen Feb 28, 2014
9e9ab14
search-engine text updates
Feb 28, 2014
d50802d
search-engine: exit on ^C
Feb 28, 2014
eb799c7
search-engine: updated TODO
Feb 28, 2014
ca0ea1c
search-engine: more text updates
Feb 28, 2014
443901c
search-engine: more cleanup and simplification
Feb 28, 2014
482c64d
search-engine: a little more on performance
Feb 28, 2014
8f58e51
search-engine: adding postings filters
Feb 28, 2014
82b5e60
search-engine: more text on performance
Feb 28, 2014
a7bab13
search-engine: adding crude ranking
Mar 1, 2014
f5e1aa5
search-engine: documenting very crude ranking
Mar 1, 2014
7e929d1
search-engine: added stopwords
Mar 1, 2014
10af720
search-engine: fixed quotes and apostrophes
kragen Mar 1, 2014
2f416cc
search-engine: added distinct tokenizers
kragen Mar 1, 2014
3463c6e
search-engine: recording file metadata
kragen Mar 1, 2014
101deb3
search-engine: a couple of tiny shortenings
kragen Mar 1, 2014
0b67acb
a couple more items for TODO
kragen Mar 1, 2014
0abf6ec
search-index: more updates to the text
kragen Mar 1, 2014
4c7abfb
search-engine: a couple of text fixes
kragen Mar 1, 2014
defcef7
search-engine: properly handling relative paths
kragen Mar 2, 2014
34637de
Merge branch 'master' of github.com:kragen/500lines
kragen Mar 2, 2014
fa49cd7
search-engine: incorporated most of ayust's suggestions
kragen Mar 3, 2014
458cec4
search-engine: integrated ayust's other comment
kragen Mar 4, 2014
3e2924d
search engine: adding litprog system
kragen Mar 30, 2014
4f5a6bc
search engine: explaining context of handaxeweb
kragen Mar 30, 2014
d6a936c
search engine: handaxewebifying README.md
kragen Mar 30, 2014
7c463f2
search engine: updated code inside chapter text
kragen Mar 30, 2014
6df4409
search engine: gitignoring temp files from handaxeweb
kragen Mar 30, 2014
2228111
search-engine: added Makefile
Jul 27, 2014
ecfd922
Merge github.com:aosabook/500lines
Jul 27, 2014
d43d1df
updating search-engine/TODO.md
kragen Aug 7, 2014
84f74dd
updating search-engine in response to feedback
kragen Aug 7, 2014
5da83b3
search-engine: added diagrams
kragen Aug 13, 2014
3aa120f
search-engine: refactored diagrams
kragen Aug 13, 2014
8bb480e
search-engine: refactored diagrams more
kragen Aug 13, 2014
20fa9e7
search-engine: add xmlpi on diagrams
kragen Aug 13, 2014
4f9ab85
search-engine: refactored and docstringed diagrams
kragen Aug 13, 2014
9ee37cc
search-engine: more fixes
kragen Aug 15, 2014
search-engine: updated TODO
user committed Feb 28, 2014
commit eb799c7d6990a8842e462cff3558b5130fe99cdb
3 changes: 2 additions & 1 deletion search-engine/TODO.md
@@ -14,7 +14,8 @@
 * Find indices on the path to root and update them automatically,
 perhaps niced in the background.
 * Perhaps some kind of abstraction of different pieces: storage
-engine, term extraction engine, etc. Term extraction is already
+engine, term extraction engine, stemming, posting contents, etc.
+Term extraction is already
 pretty abstract, in the sense that everything else just treats its
 output as a stream of (term, docid) tuples.
 * Deal with fsync and incomplete segments to enable concurrency and
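The TODO item in this hunk says that everything downstream of term extraction just treats its output as a stream of (term, docid) tuples. A minimal sketch of what that boundary could look like follows. It is an illustration only, not the code from this pull request: the names `extract_terms`, `build_index`, and the `normalize` parameter are hypothetical, and the regex tokenizer is an assumption standing in for whatever tokenizer the chapter actually uses.

```python
import re
from typing import Callable, Dict, Iterable, Iterator, Set, Tuple


def extract_terms(
    docs: Iterable[Tuple[str, str]],
    normalize: Callable[[str], str] = str.lower,
) -> Iterator[Tuple[str, str]]:
    """Yield (term, docid) pairs, one per distinct term in each document.

    `docs` is an iterable of (docid, text) pairs. `normalize` is a
    hypothetical hook where a stemmer could be plugged in later.
    """
    for docid, text in docs:
        # Assumed tokenizer: runs of word characters.
        terms = {normalize(word) for word in re.findall(r"\w+", text)}
        for term in sorted(terms):
            yield term, docid


def build_index(postings: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Consume a (term, docid) stream without knowing how it was produced."""
    index: Dict[str, Set[str]] = {}
    for term, docid in postings:
        index.setdefault(term, set()).add(docid)
    return index


docs = [("a.txt", "Hello world"), ("b.txt", "hello again")]
index = build_index(extract_terms(docs))
```

Because `build_index` only sees tuples, swapping in a different tokenizer or a stemming `normalize` function would not require touching it, which is the kind of separation the TODO item seems to be after.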