Timid Robot Zehta edited this page Aug 29, 2016 · 2 revisions
A handy tool for sophisticated, ad-hoc analysis of webserver logs.

logrep  [--mode MODE] [--include | --exclude CLASSES] [-H | -R]
        [--output FIELDS] [--filter FILTERS] [--last LAST_N]
        [--sort LIM:FIELDS:DIRECTION] [--config CFG_FILE] [--quiet]
        [--debug] [--line-buffered] [LOG_FILE]


   -m MODE           There are three modes:
   --mode              - "grep" parses an entire log file (default).
                       - "tail" reads from the end of the file.
                       - "top"  shows running performance stats.

   -i, -e CLASSES    Include or exclude the given URL "classes". You can
   --include         configure logrep to classify URLs by a set of
   --exclude         regular expressions. See the installation docs and
                     /etc/wtop.cfg for how to configure your own classes.
                     --include and --exclude are mutually exclusive.

                     Examples:
                        --include "home,search,wiki"
                        --exclude "img,xml,js"

   -f FILTERS        Filters act on named fields. Values can be strings
   --filter          or numbers. Supported operators: greater than (>),
                     less than (<), equals (=), not-equals (!=), regular
                     expression match (~), and non-match (!~). Separate
                     multiple filters with commas; a record must pass
                     all of them.

                     For example, to select successful requests over
                     10 kB in size whose Referer domain does not match
                     "example.com":

                        -f "status=200,bytes>10000,refdom!~example.com"

                     AVAILABLE FIELDS:
                        msec       millisecond response time
                        fbmsec     millisecond response time (first byte)
                        ip         The IP address of the client
                        lip        The IP address of the server
                        url        The path of the request, ex. "/home"
                        ref        "Referer" header
                        refdom     domain part of the "Referer" header
                        bytes      Bytes sent
                        ua         User-agent header
                        uas        First 30 characters of ua
                        class      URL class, configurable in wtop.cfg
                        status     HTTP status code, eg 200, 301, 404
                        proto      Protocol version, eg "HTTP/1.1"
                        method     HTTP method, eg "GET", "POST"
                        bot        Is a robot? 1 or 0. Only a guess.
                        botname    eg "Googlebot", "Nutch", "Slurp", etc
                        ts         Unix timestamp of the request
                        year
                        month
                        day
                        hour
                        minute
                        country    country name (see Geocoding, below)
                        cc         ISO 3166 country code (see Geocoding, below)
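
The --filter syntax described above can be illustrated with a short Python sketch. This is not logrep's own parser: `parse_filters` and `matches` are hypothetical names, and the sketch assumes that values never contain commas, that clauses are ANDed together, and that regular expressions are matched anywhere in the field.

```python
import re

# One clause has the shape FIELD OP VALUE. Two-character operators are
# tried before one-character ones.
CLAUSE = re.compile(r"^(\w+)(!=|!~|>|<|=|~)(.*)$")


def parse_filters(spec):
    """Split 'status=200,bytes>10000' into (field, op, value) triples."""
    triples = []
    for clause in spec.split(","):
        m = CLAUSE.match(clause.strip())
        if m is None:
            raise ValueError("bad filter clause: %r" % clause)
        triples.append(m.groups())
    return triples


def matches(record, triples):
    """Return True if the record (a dict) passes every filter clause."""
    for field, op, value in triples:
        actual = record[field]
        if op == "~" or op == "!~":
            found = re.search(value, str(actual)) is not None
            ok = found if op == "~" else not found
        else:
            # Compare as numbers when the record holds a number.
            expected = float(value) if isinstance(actual, (int, float)) else value
            ok = {
                "=": actual == expected,
                "!=": actual != expected,
                ">": actual > expected,
                "<": actual < expected,
            }[op]
        if not ok:
            return False
    return True
```

Calling `matches(record, parse_filters("status=200,bytes>10000,refdom!~example.com"))` on a dict with `status`, `bytes`, and `refdom` keys mirrors the example filter.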


   -H, -R            Shorthand for a useful but incomplete filter of
                     robot user-agents. -H (humans) is equivalent to
                     --filter 'bot=0'; -R (robots) is equivalent to
                     --filter 'bot=1'.


   -o FIELDS         Output only the given fields, tab-delimited. All
   --output          of the fields listed for --filter are available.

                     Example:
                     $ logrep -o 'cc,msec,url'
                        UK      34      /Madonna.jpg
                        CA      34      /Padma-Lakshmi.jpg
                        UK      34      /Shaun-Woo.jpg
                        US      184     /Ben-Stiller.jpg
                        ...

                     AGGREGATE FUNCTIONS:
                     In -m grep mode you can use aggregate functions
                     on numeric fields such as bytes and msec. Any
                     non-aggregate fields in the list will be used to
                     group records together.
                        avg(FIELD)  mean average
                        count(*)    record count
                        dev(FIELD)  standard deviation (square root of variance)
                        iqm(FIELD)  Interquartile Mean (see IQM below)
                        max(FIELD)  highest seen value
                        min(FIELD)  lowest seen value
                        miqm(FIELD) moving interquartile mean (see IQM below)
                        sum(FIELD)  summation of all values
                        var(FIELD)  population variance

                     Example (grouped by status):
                     $ logrep -o 'status,count(*),avg(msec)'
                        200 4196    242.58
                        302 5       79.75
                        404 1       9.00
                        304 798     15.76
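
To make the grouping behaviour concrete, here is a minimal Python sketch of what an output list such as 'status,count(*),avg(msec),var(msec),dev(msec)' computes. The function name `group_stats` and the dict-based record shape are assumptions for illustration, not logrep internals.

```python
from collections import defaultdict


def group_stats(records, key, field):
    """Group records by `key`; return {key: (count, avg, var, dev)} for `field`."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec[field])

    stats = {}
    for group, values in groups.items():
        n = len(values)
        mean = sum(values) / n
        # Population variance: divide by n, not n - 1.
        var = sum((v - mean) ** 2 for v in values) / n
        stats[group] = (n, mean, var, var ** 0.5)
    return stats
```

Each non-aggregate field plays the role of `key` here; with several such fields the group key would be a tuple of their values.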

   -s LIM:FIELDS:DIRECTION
   --sort            Use this option to sort & limit aggregate records.
                     LIM is the number of records to return, FIELDS
                     is a comma-delimited list of column positions
                     starting with 1, and DIRECTION is either
                     'descending' (default) or 'ascending'.

                     Example (total bytes sent, by hour & minute):
                     $ logrep -o 'hour,minute,sum(bytes)' -s '3600:1,2:a'
                        12  0   1895927
                        12  1   7418972
                        12  2   2103828
                        12  3   7419371
                        12  4   1680468
                        ...

                     Example (the 10 most popular URLs):
                     $ logrep -o 'url,count(*)' -s '10:2'
                        /home    23718
                        /wiki    8211
                        /about   2703
                        ...
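
The LIM:FIELDS:DIRECTION format can be read as in the following Python sketch. `parse_sort` and `sort_rows` are hypothetical helpers, and accepting any prefix of "ascending" (such as the "a" used in the example above) is an assumption about how abbreviations are handled.

```python
def parse_sort(spec):
    """Parse 'LIM:FIELDS[:DIRECTION]' into (limit, zero-based columns, reverse)."""
    parts = spec.split(":")
    limit = int(parts[0])
    # Column positions in the spec start at 1; Python indexes from 0.
    columns = [int(c) - 1 for c in parts[1].split(",")]
    direction = parts[2] if len(parts) > 2 and parts[2] else "descending"
    # Assumption: any prefix of "ascending" selects ascending order.
    reverse = not "ascending".startswith(direction)
    return limit, columns, reverse


def sort_rows(rows, spec):
    """Sort aggregate rows (tuples) by the given columns and keep the first LIM."""
    limit, columns, reverse = parse_sort(spec)
    ordered = sorted(rows, key=lambda r: tuple(r[c] for c in columns), reverse=reverse)
    return ordered[:limit]
```

With rows of (url, count) pairs, `sort_rows(rows, "10:2")` keeps the ten rows with the highest counts, matching the "most popular URLs" example.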

   -l LAST_N         (grep mode) Only read the last N log lines.
   --last
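
One simple way to keep only the last N lines of a stream is a bounded deque, sketched below in Python. This illustrates the idea only; `last_lines` is a hypothetical name, and logrep may well seek from the end of the file instead of reading it all.

```python
from collections import deque


def last_lines(lines, n):
    """Return the final n items of any iterable of lines."""
    # A deque with maxlen=n silently discards the oldest entries.
    return list(deque(lines, maxlen=n))
```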

   -c CFG_FILE       Feed logrep a custom config file. By default it
   --config          will search for a file to use in the following
                     order:

                        VirtualEnv + /etc/wtop.cfg
                        PYTHONUSERBASE + /etc/wtop.cfg
                        USER_BASE + /etc/wtop.cfg
                        Python Lib + /etc/wtop.cfg
                        /etc/wtop.cfg

                        Platform-appropriate path separators are used.
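
The search order above could be approximated in Python roughly as follows. This is a sketch under assumptions, not the code logrep uses: the helper names are hypothetical, "VirtualEnv" is taken to mean `sys.prefix` inside a virtual environment, and "Python Lib" is taken to mean the site-packages directory.

```python
import os
import site
import sys
import sysconfig


def candidate_config_paths(name="wtop.cfg"):
    """Return candidate config paths, most specific location first."""
    prefixes = []
    # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
    if sys.prefix != getattr(sys, "base_prefix", sys.prefix):
        prefixes.append(sys.prefix)
    if os.environ.get("PYTHONUSERBASE"):
        prefixes.append(os.environ["PYTHONUSERBASE"])
    prefixes.append(site.getuserbase())
    # Assumption: "Python Lib" refers to the site-packages directory.
    prefixes.append(sysconfig.get_paths()["purelib"])
    prefixes.append(os.sep)
    return [os.path.join(p, "etc", name) for p in prefixes]


def find_config(paths):
    """Return the first path that exists as a file, or None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None
```

Building paths with `os.path.join` is what yields platform-appropriate separators.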

   -q, --quiet       Quiet mode. Does not print warnings to stderr.

   -d, --debug       Print debug messages to stderr.

   --line-buffered   Force output to be line buffered. By default, output is
                     buffered when standard output is not a tty.

   LOG_FILE          The path to a log file. By default logrep will
                     read from the file path specified in wtop.cfg.
                     If you specify "-", logrep will read from STDIN.

 GEOCODING:
    logrep will use the MaxMind GeoIP library if it is installed. This
    will enable two extra fields for filtering and output: country
    (eg "United Kingdom"), and cc (ISO 3166 country code, eg "GB"). These
    are a *guess* at the country the HTTP client is from.

 IQM:
    logrep will use the python-iqm module if it is installed. This will
    enable two extra aggregate functions: iqm and miqm.
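
For readers unfamiliar with the interquartile mean, the Python sketch below shows a simplified version: sort the values, drop the lowest and highest quarter, and average what remains. It is not the python-iqm implementation, which may weight boundary values differently when the count is not divisible by four, and it does not cover the moving variant (miqm). It expects a non-empty list.

```python
def interquartile_mean(values):
    """Mean of the middle half of the values (simplified, whole-item cut)."""
    ordered = sorted(values)
    cut = len(ordered) // 4
    middle = ordered[cut:len(ordered) - cut]
    return sum(middle) / len(middle)
```

Because the extremes are discarded, a single very slow request has far less effect on iqm(msec) than on avg(msec).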

 KNOWN BUG:
    Some installations of Apache have HostnameLookups defaulted to On.
    This means that the %h field will contain the fully-qualified domain
    name of the client (xdsl456.foo.example.com) instead of the IP
    address (123.1.2.3). Geocoding will work but will require a DNS
    lookup to resolve the IP address. Using the "cc" or "country"
    field in this case will generate a *LOT* of DNS traffic and can
    hang the program. It is recommended to explicitly set
    HostnameLookups Off in your Apache configuration.
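
If you post-process logs yourself and cannot change the Apache setting, one way to sidestep the DNS lookups is to check whether the client field is already an IP address before geocoding it. The Python sketch below uses the standard `ipaddress` module; `is_ip_address` is a hypothetical helper and is not part of logrep.

```python
import ipaddress


def is_ip_address(host):
    """Return True if `host` is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False
```

Records for which this returns False could be skipped or geocoded in a separate, rate-limited pass.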


 EXAMPLES:

 "wtop" for all human traffic:
     $ logrep -m top -f 'bot=0' access.log

 Status code & response times for all Googlebot homepage hits:
     $ logrep -f 'botname=Googlebot' -i home -o status,msec

 Tail requests for pages about Angelina Jolie or Brad Pitt referred by example.com:
     $ logrep -m tail -f 'url~jolie|pitt,ref~example.com' access.log

 Get maximum response size and average response time for requests
 grouped by URL class:
     $ logrep -o 'class,max(bytes),avg(msec)' access.log


0.7.9   2014 Oct 03     https://github.com/ClockworkNet/wtop