Version 2.0 initial commit

asherAgs · Apr 4, 2018 · commit dd2ec2e

Showing 57 changed files with 2,125 additions and 0 deletions.
diff --git a/LICENSE.txt b/LICENSE.txt
@@ -0,0 +1,9 @@
+MIT License
+
+Copyright (c) 2017 Asher Silvers
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
diff --git a/MANIFEST b/MANIFEST
@@ -0,0 +1,6 @@
+setup.cfg
+setup.py
+amzsear/__init__.py
+amzsear/api.py
+amzsear/cli.py
+amzsear/consts.py
diff --git a/README.md b/README.md
@@ -0,0 +1,154 @@
+# amzSear
+
+The unofficial Amazon Product CLI & API. Easily search the Amazon product directory from the command line without the need for an Amazon API key.
+
+Wondering about an Amazon product listing? Find the amzSear!
+
+__Version 2 has been released!__ See [below](#whats-new) for more info.
+
+
+```
+$ amzsear 'Harry Potter Books'
+```
+
+
+```
+    Title                                               Prices             Rating
+0   Harry Potter Paperback Box Set (Books 1-7)          $21.20 - $52.99    *****
+1   Harry Potter and the Sorcerer's Stone               $0.00 - $10.99     *****
+2   Harry Potter And The Chamber Of Secrets             $0.00 - $10.99     *****
+3   Harry Potter And The Goblet Of Fire                 $0.00 - $12.99     *****
+4   Harry Potter and the Prisoner of Azkaban            $0.00 - $10.99     *****
+5   Harry Potter And The Order Of The Phoenix           $0.00 - $12.99     *****
+6   Harry Potter and the Deathly Hallows (Book 7)       $0.00 - $14.99     *****
+7   Harry Potter and the Half-Blood Prince (Book 6)     $0.00 - $12.99     *****
+8   [Sponsored]Hudson James & the Baker Street Legacy   $0.00 - $3.07      -----
+9   [Sponsored]Kids' Travel Guide - London: The fun wa  $8.37 - $10.90     *****
+10  Harry Potter and the Sorcerer's Stone: The Illustr  $9.23 - $39.99     ****
+11  Harry Potter Complete Book Series Special Edition   $64.88 - $81.95    *****
+12  Harry Potter and the Cursed Child, Parts One and T  $3.15 - $12.99     ****
+13  Harry Potter Books Set #1-7 in Collectible Trunk-L  $73.96 - $157.95   ****
+14  Harry Potter Complete Collection 7 Books Set Colle  $146.89 - $163.99  *****
+15  Harry Potter and the Chamber of Secrets: The Illus  $20.51 - $39.99    *****
+16  Harry Potter and the Prisoner of Azkaban: The Illu  $15.92 - $275.00   *****
+17  The Unofficial Harry Potter Spellbook: Wizard Trai  $0.00 - $13.95     ****
+18  [Sponsored]Widdershins – Part One: The Boy with Ab  $0.00 - $0.76      *****
+19  [Sponsored]Missions Accomplished: And some funny b  $0.00 - $3.96      *****
+```
+
+![Amazon Comparison Shot](amazon_screenshot.png)
+
+<a name="installation"></a>
+### Installation
+
+amzSear runs on Python 3 or later with minimal additional dependencies.
+
+Install the dependencies and main package using pip.
+
+```
+$ pip install amzsear
+```
+
+For those wanting to upgrade to version 2, use the command:
+
+```
+$ pip install amzsear --upgrade
+```
+
+Note: The [Pandas](https://pandas.pydata.org/) package is not a required dependency of amzSear. A few methods that integrate with Pandas do use it (see [AmzSear.md](core/AmzSear.md#to_dataframe), [AmzBase.md](core/AmzBase.md#to_series)); to use them, install Pandas separately:
+```
+$ pip install pandas
+```
+
+<a name="usage"></a>
+### Usage
+
+amzSear can be used in two ways: from the command line and as a Python package.
+
+#### CLI
+The amzSear CLI allows Amazon search queries to be performed directly from the command line. In its simplest form, the CLI only requires a query.
+
+```
+$ amzsear 'Harry Potter Books'
+```
+
+However, additional options can be set to select the page number, item number, region or the output format. For example:
+
+```
+$ amzsear 'Harry Potter' -p 2 -i 35 --output json
+```
+
+The above query will display the item at index 35 on page 2 as a JSON object. For more examples and for extended usage information see the [CLI Readme](cli/README.md).
+
+
+#### API
+
+```python
+from amzsear import AmzSear
+amz = AmzSear('Harry Potter')
+```
+
+In the latest version of amzSear, dedicated `AmzSear` and `AmzProduct` classes allow easier extraction of Amazon product information in a Python program. For example:
+```python
+>>> from amzsear import AmzSear
+>>> amz = AmzSear('Harry Potter', page=2, region='CA')
+>>> 
+>>> last_item = amz.rget(-1) # retrieves the last item in the AmzSear object
+>>> print(last_item)
+title               "[Sponsored]Kids' Travel Guide - London: The fun way to discover Lo..."
+product_url         'https://www.amazon.com/gp/slredirect/picassoRedirect.html/ref=pa_s...'
+image_url           'https://images-na.ssl-images-amazon.com/images/I/61CatLnbhQL._AC_U...'
+rating              ratings_text          '4.6 out of 5 stars'
+                    ratings_count_text    '29'
+                    <Valid AmzRating object>
+prices              {'Perfect Paperback': '$8.37', '1': '$10.90'}
+extra_attributes    {}
+subtext             ['by Sarah-Jane Williams and FlyingKids']
+<Valid AmzProduct object>
+>>> 
+>>> print(last_item.get_prices()) # retrieves all price values as floats
+[8.37, 10.9]
+```
+
+For a complete explanation of the intricacies of the amzSear core API, see the [API docs](core/).
+
+
+
+<a name="whats-new"></a>
+### What's New in Version 2.0
+
+| Feature                                                        | v 1.0 | v 2.0 |
+|----------------------------------------------------------------|-------|-------|
+| Command line Amazon queries                                    | ✔     | ✔     |
+| Command line conversion to CSV or JSON                         |       | ✔     |
+| Support for US Amazon                                          | ✔     | ✔     |
+| Support across __all__ Amazon regions                          |       | ✔     |
+| Single page API queries                                        | ✔     | ✔     |
+| Multiple page API queries                                      |       | ✔     |
+| Dedicated AmzSear class & subclasses                           |       | ✔     |
+| Extraction of (title, url, prices & rating)                    | ✔     | ✔     |
+| Extraction of (image_url, rating's count, extra text, subtext) |       | ✔     |
+| Consistent extraction across Amazon sites                      |       | ✔     |
+| Support for API input from query or url or html directly       |       | ✔     |
+
+
+##### Summary
+* Support across all Amazon regions (Australia, India, Spain, UK, US, etc.)
+* Dedicated AmzSear class & subclasses
+* Better scraping & extraction to retrieve all data
+* Additional fields - including image_url, subtitle/subtext, rating's count
+* Simpler usability and clearer command line interface
+* Multiple command line export formats - CSV, JSON, etc.
+
+A more in-depth look at the latest CLI features is available in the [CLI Readme](cli/README.md). A complete breakdown of the core API's extended features is in the core [API docs](core/).
+
+### About
+
+##### Articles
+
+* [OSTechNix](https://www.ostechnix.com/search-amazon-products-command-line/)
+* [CrackWare](http://crackware.me/technology/search-amazon-products-from-command-line/)
+* [Linux-OS.net](http://linux-os.net/amzsear-busca-productos-en-amazon-desde-la-linea-de-comandos/)
+* [MasLinux](http://maslinux.es/buscar-productos-de-amazon-desde-la-linea-de-comandos/)
+
+This library was designed to facilitate the use of Amazon search on the command line whilst also providing a utility to easily scrape basic product information from Amazon (for those without access to Amazon's Product API). The developer does, however, append an Amazon Affiliate Tag in order to track usage of this software and to monetize this and other publicly accessible projects. We are a participant in the Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for us to earn fees by linking to Amazon.com and affiliated sites.
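
The README's API example shows `get_prices()` returning `[8.37, 10.9]` for the `prices` dict `{'Perfect Paperback': '$8.37', '1': '$10.90'}`. The code behind `get_prices()` is not part of this excerpt, so the following is only a minimal sketch of how such a conversion might work. The `parse_price` and `prices_as_floats` names are hypothetical, and the sketch assumes US-style price text (a currency symbol, optional thousands commas, a decimal point).

```python
import re


def parse_price(text):
    """Hypothetical helper: return the first decimal number in a price string, or None."""
    match = re.search(r'\d+(?:\.\d+)?', text.replace(',', ''))
    return float(match.group()) if match else None


def prices_as_floats(prices):
    """Convert a {label: price_text} dict to a list of floats, skipping unparsable entries."""
    values = (parse_price(p) for p in prices.values())
    return [v for v in values if v is not None]


# The prices dict printed in the README example
prices = {'Perfect Paperback': '$8.37', '1': '$10.90'}
print(prices_as_floats(prices))  # prints [8.37, 10.9]
```

Other Amazon regions format prices differently (for example a decimal comma), so a real implementation would likely need region-aware parsing.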
diff --git a/amazon_screenshot.png b/amazon_screenshot.png
diff --git a/amzsear/__init__.py b/amzsear/__init__.py
@@ -0,0 +1,4 @@
+try:
+    from amzsear.core.AmzSear import AmzSear
+except ImportError:
+    from .amzsear.core.AmzSear import AmzSear
diff --git a/amzsear/__pycache__/AmzProduct.cpython-36.pyc b/amzsear/__pycache__/AmzProduct.cpython-36.pyc
diff --git a/amzsear/__pycache__/__init__.cpython-36.pyc b/amzsear/__pycache__/__init__.cpython-36.pyc
diff --git a/amzsear/__pycache__/api.cpython-36.pyc b/amzsear/__pycache__/api.cpython-36.pyc
diff --git a/amzsear/__pycache__/consts.cpython-36.pyc b/amzsear/__pycache__/consts.cpython-36.pyc
diff --git a/amzsear/__pycache__/temp.cpython-36.pyc b/amzsear/__pycache__/temp.cpython-36.pyc
diff --git a/amzsear/cli/__init__.py b/amzsear/cli/__init__.py
diff --git a/amzsear/cli/__pycache__/Funcs.cpython-36.pyc b/amzsear/cli/__pycache__/Funcs.cpython-36.pyc
diff --git a/amzsear/cli/__pycache__/RangeOrList.cpython-36.pyc b/amzsear/cli/__pycache__/RangeOrList.cpython-36.pyc
diff --git a/amzsear/cli/__pycache__/__init__.cpython-36.pyc b/amzsear/cli/__pycache__/__init__.cpython-36.pyc
diff --git a/amzsear/cli/__pycache__/cli.cpython-36.pyc b/amzsear/cli/__pycache__/cli.cpython-36.pyc
diff --git a/amzsear/cli/cli.py b/amzsear/cli/cli.py
@@ -0,0 +1,122 @@
+try:
+    from amzsear import AmzSear
+    from amzsear.core.consts import DEFAULT_REGION, REGION_CODES
+except ImportError:
+    from .amzsear import AmzSear
+    from .amzsear.core.consts import DEFAULT_REGION, REGION_CODES
+
+import argparse
+import webbrowser
+import json
+import sys
+import csv
+
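+def run(*passed_args):
+    """Parse CLI arguments, run the search, print the results
+    and, unless disabled, open the result URLs in the browser."""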
+    parser = get_parser()
+    args = parser.parse_args(*passed_args) # the parser defaults to sys args if nothing passed
+    args = vars(args)
+
+    amz_args = {x:y for x,y in args.items() if x not in ['item','output','dont_open']}
+    out = AmzSear(**amz_args)
+
+    if args['item'] is not None:
+        # single item selection
+        prod = out[args['item']]
+        out = AmzSear(products=[prod]) # error raised if not found
+        out._urls = [prod.product_url] 
+
+
+    # handle output types
+    if args['output'] == 'short':
+        print_short(out)
+    elif args['output'] == 'verbose':
+        print_verbose(out)
+    elif args['output'] == 'csv':
+        print_csv(out)
+    elif args['output'] == 'json':
+        print_json(out)
+    # args['output'] == 'quiet' --> no output
+
+
+    if not args['dont_open']:
+        for url in out._urls:
+            webbrowser.open(url)
+
+
+
+def get_parser():
+    parser = argparse.ArgumentParser(description='The unofficial Amazon search CLI')
+
+    parser.add_argument('query', type=str, help='The query string to be searched')
+    parser.add_argument('-p','--page', type=int,
+        help='The page number to be searched (defaults to 1)', default=1)
+    parser.add_argument('-i','--item', type=str,
+        help='The item index to be displayed (relative to the page)', default=None)
+    parser.add_argument('-r','--region', type=str, choices=REGION_CODES,
+        default=DEFAULT_REGION, help='The amazon country/region to be searched')
+
+    parser.add_argument('-d','--dont-open', action='store_true',
+        help='Stop the page from opening in the default browser')
+
+    parser.add_argument('-o','--output', type=str, choices=['short','verbose','quiet','csv','json'],
+        default='short', help='The output type to be displayed (defaults to short)')
+
+    return parser
+
+
+def print_csv(cls):
+    # flattens to list of dicts with index value
+    data = [{**v.to_dict(flatten=True), '_index': k} for k,v in cls.items()]
+    if not data:
+        return  # no products to write; avoids an IndexError on data[0]
+    writer = csv.DictWriter(sys.stdout, data[0].keys(), quoting=csv.QUOTE_ALL)  # quote every field
+    writer.writeheader()
+    writer.writerows(data)
+
+def print_json(cls):
+    print(json.dumps({k: v.to_dict() for k,v in cls.items()}))
+
+def print_verbose(cls):
+    print(cls)
+
+
+def print_short(cls):
+    fields = ['','Title','Prices','Rating']
+
+    rows = [{f:f for f in fields}]
+    for index, product in cls.items():
+        temp_dict = {}
+
+        temp_dict[''] = index
+
+        # get price in format '$nn.nn - $mm.mm' (lowest and highest price text)
+        price_tup = {product.prices[k]:product.get_prices(k) for k in product.prices}
+        if len(price_tup) > 0:
+            price_tup = (min(price_tup, key=lambda x: price_tup[x]), max(price_tup, key=lambda x: price_tup[x]))
+            if price_tup[0] == price_tup[-1]:
+                temp_dict['Prices'] = price_tup[0] # one price
+            else:
+                temp_dict['Prices'] = price_tup[0] + ' - ' + price_tup[-1] # range of prices
+        else:
+            temp_dict['Prices'] = '------------'
+        temp_dict['Title'] = product.get('title','----------')[:50] # limit title length
+
+        temp_dict['Rating'] = product.get('rating','-----')
+        if temp_dict['Rating'] != '-----':
+            temp_dict['Rating'] = temp_dict['Rating'].get_star_repr()
+
+        rows.append(temp_dict)
+
+    format_str = []
+    for field in fields:
+        #get longest in each field into format_str
+        format_str.append('{:%d}' % (max(len(x[field]) for x in rows)))
+    format_str = '  '.join(format_str)
+
+    for row in rows:
+        print(format_str.format(*[row.get(x,'') for x in fields])) # print in order
+
+if __name__ == '__main__':
+    run()
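
The column alignment used by `print_short` above can be tried in isolation. The sketch below reproduces the same idea (pad every column to the width of its widest cell, header included) on made-up rows. The `format_table` name and the sample data are assumptions for illustration, not part of this commit, and unlike the original it converts every cell with `str()` so that non-string indexes are also handled.

```python
def format_table(fields, rows):
    """Return one string per row, each column left-aligned to its widest cell."""
    table = [{f: f for f in fields}] + rows  # header row first, as in print_short
    widths = [max(len(str(r.get(f, ''))) for r in table) for f in fields]
    # Build a format string such as '{:2}  {:11}  {:15}'; width is at least 1
    fmt = '  '.join('{:%d}' % max(w, 1) for w in widths)
    return [fmt.format(*[str(r.get(f, '')) for f in fields]).rstrip() for r in table]


# Made-up sample rows (illustrative only)
fields = ['', 'Title', 'Prices']
rows = [
    {'': 0, 'Title': 'Box Set', 'Prices': '$21.20 - $52.99'},
    {'': 10, 'Title': 'Illustrated', 'Prices': '$9.23'},
]
lines = format_table(fields, rows)
print('\n'.join(lines))
```

Computing the widths from the data rather than hard-coding them keeps the table aligned when titles are truncated to different lengths or when an index grows to two digits.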