Franky12/py-web-search

 
 

py-web-search

A Python module to fetch and parse results from different search engines.

Warning: Do not send queries in rapid succession! The search engines may block you.
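
One way to respect this warning in your own scripts is to space calls out on the client side. The following is a minimal sketch, not part of py-web-search: `throttled`, `min_interval`, `clock` and `sleeper` are hypothetical names introduced here for illustration, and the 2-second interval is an arbitrary assumption rather than a limit documented by any search engine.

```python
import time


def throttled(func, min_interval=2.0, clock=time.monotonic, sleeper=time.sleep):
    """Wrap func so consecutive calls are at least min_interval seconds apart.

    Hypothetical helper, not part of py-web-search. clock and sleeper are
    injectable only so the wrapper can be tested without real waiting.
    """
    last_call = [None]  # time of the previous call, None before the first one

    def wrapper(*args, **kwargs):
        if last_call[0] is not None:
            # How much of the minimum interval is still left to wait.
            wait = min_interval - (clock() - last_call[0])
            if wait > 0:
                sleeper(wait)
        last_call[0] = clock()
        return func(*args, **kwargs)

    return wrapper


# Assumed usage with this module (performs real network requests):
#     from pws import Google
#     search = throttled(Google.search)
#     for query in ['hello world', 'github']:
#         print(search(query, 5))
```

This is in addition to the `sleep` argument described under Usage, which controls the module's own pause.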

Search engines supported

  • Google
  • Bing

Installation

Requires Python 3. Install using pip:

    pip install py-web-search

Usage

Web search

    from pws import Google
    from pws import Bing

    print(Google.search('hello world', 5, 2))
    print(Bing.search('hello world', 5, 2))
    
    # Arguments:
    # search(query, num, start, sleep, recent)
    # query: Required. The keyword that will be searched.
    # num: Default 10. The number of results returned.
    # start: Default 0. The number of top results that are to be ignored.
    # sleep: Default True. If True, the program will wait for a second, when applicable, to avoid overwhelming the servers.
    # recent: Default None. Allowed values: 'h' (hour), 'd' (day), 'w' (week), 'm' (month) and 'y' (year). Currently buggy.

Each call prints 5 results starting from the third result (the first 2 are skipped) in the following format.

    {
        'url': '...',
        'num': 5,
        'start': 2,
        'search_engine': 'google',
        'results':
        [
            {
                'link': '...',
                'link_text': '...',
                'link_info': '...',
                'related_queries': [...],
                'total_results': ...,
                'additional_links':
                {
                    linktext: link,
                    ...
                }
            },
            ...
        ]
    }
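
To work with the returned dictionary rather than print it, something like the sketch below may help. It assumes only the keys shown in the format above; `iter_links` is a hypothetical helper and `sample` is made-up data in the documented shape, not real search output.

```python
def iter_links(response):
    """Yield (link_text, link) pairs from a search() return value.

    Hypothetical helper. Uses .get() so a missing key yields an empty
    string instead of raising KeyError.
    """
    for result in response.get('results', []):
        yield result.get('link_text', ''), result.get('link', '')
        # Each result may also carry extra links as {link_text: link}.
        for text, link in result.get('additional_links', {}).items():
            yield text, link


# Made-up data shaped like the documented output.
sample = {
    'url': 'https://example.com/search?q=hello+world',
    'num': 5,
    'start': 2,
    'search_engine': 'google',
    'results': [
        {
            'link': 'https://example.com/a',
            'link_text': 'Example A',
            'link_info': 'First example page',
            'additional_links': {'Cached': 'https://example.com/a/cached'},
        },
        {
            'link': 'https://example.com/b',
            'link_text': 'Example B',
            'link_info': 'Second example page',
            'additional_links': {},
        },
    ],
}

for text, link in iter_links(sample):
    print(text, link)
```

With a real call, `sample` would be replaced by the value returned from `Google.search(...)` or `Bing.search(...)`.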

News search

    from pws import Bing
    from pws import Google

    print(Bing.search_news('github', 10, 0, True, 'h'))
    print(Google.search_news('github', 10, 0, True, 'd'))
    
    # Arguments:
    # search_news(query, num, start, sleep, recent)
    # query: Required. The keyword that will be searched.
    # num: Default 10. The number of results returned.
    # start: Default 0. The number of top results that are to be ignored.
    # sleep: Default True. If True, the program will wait for a second, when applicable, to avoid overwhelming the servers.
    # recent: Default None. Allowed values: 'h' (hour), 'd' (day), 'w' (week), 'm' (month) and 'y' (year). Currently buggy.

Each call prints 10 results starting from the first result (start is 0, so none are skipped) in the following format.

    {
        'url': '...',
        'num': 10,
        'start': 0,
        'search_engine': 'bing',
        'results':
        [
            {
                'link': '...',
                'link_text': '...',
                'link_info': '...',
                'source': '...',
                'time': '...',
                'additional_links': {},  # Always empty for Bing.
            },
            ...
        ]
    }
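
News results carry `source` and `time` fields, so they can be grouped or filtered after the fact. The sketch below groups headlines by source. `group_by_source` is a hypothetical helper, `sample_news` is made-up data in the documented shape, and the string format of `time` is not specified in this README, so the values shown are placeholders.

```python
from collections import defaultdict


def group_by_source(response):
    """Map each news source to the list of its headline texts.

    Hypothetical helper. Results without a 'source' key are filed
    under 'unknown'.
    """
    grouped = defaultdict(list)
    for item in response.get('results', []):
        grouped[item.get('source', 'unknown')].append(item.get('link_text', ''))
    return dict(grouped)


# Made-up data shaped like the documented output.
sample_news = {
    'url': 'https://example.com/news?q=github',
    'num': 10,
    'start': 0,
    'search_engine': 'bing',
    'results': [
        {'link': 'https://example.com/1', 'link_text': 'Story one',
         'link_info': '...', 'source': 'Example Times', 'time': '...',
         'additional_links': {}},
        {'link': 'https://example.com/2', 'link_text': 'Story two',
         'link_info': '...', 'source': 'Example Post', 'time': '...',
         'additional_links': {}},
        {'link': 'https://example.com/3', 'link_text': 'Story three',
         'link_info': '...', 'source': 'Example Times', 'time': '...',
         'additional_links': {}},
    ],
}

for source, headlines in group_by_source(sample_news).items():
    print(source, headlines)
```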

Todo

  • Other search engines
  • Images etc.

Contribution

Feel free to add any features that you think might be useful.
