Skip to content

kush-jain/grep-python

Repository files navigation

Basic Grep Implementation

In this, implement grep like functionality

Inspiration

Inspired by codecrafters.

Dependencies

Tested with Python 3.12+

Features

Supports:

  • Basic character support, including \d, \w and wildcards
  • Character Classes - both positive and negative [abc] and [^abc]
  • Beginner and End Carats (^, $)
  • Quantifiers like *, +, ?
  • Parenthesis support so (a*b)+ ops are possible
  • Support for Alternation |

Design Approach and History

  • We started with very basic recursive algorithm
  • However, we soon realised that we needed to tokenise pattern since it make quantifier search pretty easy For example, with no tokenisation, it is hard to check for \\w+, [abc]* or 2? since each is different length But with tokens, you can check that if token[n+1] is quantifier, then token[n] would be thing where this applies
  • This approach scaled very well for lot of use cases
  • However, once we tried to handle parenthesis - we started running into issues. It quickly became apparant that our token array would quickly became nested structure
  • Looking and investigating into that, it turned out that we would be better served by converting Token Arrays wholesale into AST-like structure
  • At this stage, we converted Parser, AST and Matcher into their own separate modules

About

Implementation of Grep in Python

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •