Queries

Built-in queries

hcprequestanalytics comes with a set of pre-defined queries:

$  hcprequestanalytics showqueries
available queries:
        500_highest_throughput          The 500 records with the highest throughput (Bytes/sec)
        500_httpcode_409                The 500 newest records with http code 409
        500_httpcode_413                The 500 newest records with http code 413
        500_httpcode_503                The 500 newest records with http code 503
        500_largest_req_httpcode_node   The records with the 500 largest requests by req, httpcode, node
        500_largest_size                The records with the 500 largest requests sorted by size
        500_worst_latency               The records with the 500 worst latencies
        clientip                        No. of records per client IP address
        clientip_httpcode               No. of records per http code per client IP address
        clientip_node                   No. of records per clientip per node
        clientip_request_httpcode       No. of records per http code per request per client IP address
        count                           No. of records, overall
        day                             No. of records per day
        day_hour                        No. of records per hour per day
        day_hour_req                    No. of records per request per hour per day
        day_req                         No. of records per request per day
        day_req_httpcode                No. of records per http code per request per day
        mapi_endp_req_http              MAPI request: endpoints, request, http code
        mapi_user_req_http              MAPI requests by user
        node                            No. of records per node
        node_req                        No. of records per request per node
        node_req_httpcode               No. of records per http code per request per node
        percentile_req                  No. of records per request analysis, including percentiles for size and latency
        percentile_throughput_128kb     No. of records per request, with percentiles on throughput (Bytes/sec) for objects >= 128KB
        req                             No. of records per request
        req_httpcode                    No. of records per http code per request
        req_httpcode_node               No. of records per node per http code per request
        ten_ns_proto_clientip_httpcode  No. of records per Tenant / Namespace / protocol / client IP address / http code
        ten_ns_proto_httpcode           No. of records per Tenant / Namespace / protocol / http code
        ten_ns_proto_percentile_req     No. of records per Tenant / Namespace / protocol, including percentiles for size and latency
        ten_ns_proto_user_httpcode      No. of records per Tenant / Namespace / protocol / user / http code
        ten_proto_httpcode              No. of records per Tenant / protocol / http code
        ten_user_ns_req_http            Tenants with all users accessing Namespaces, incl. request and httpcode

Tip

More queries might have been added with newer versions - always check with the command above!

Adding individual queries

If additional queries are wanted, hcprequestanalytics can be easily extended by creating a query file and adding it to the call:

$ cat addqueries
[add_count]
comment = count all records
query = SELECT count(*) FROM logrecs

[add_req_count]
comment = count records per request
query = SELECT request, count(*) FROM logrecs GROUP BY request
freeze pane : C3

[add_node_req_http]
comment = node-per-request-per-httpcode analysis
query = SELECT node, request, httpcode, count(*),
        min(size), avg(size), max(size),
        percentile(size, 10), percentile(size, 20),
        percentile(size, 30), percentile(size, 40),
        percentile(size, 50), percentile(size, 60),
        percentile(size, 70), percentile(size, 80),
        percentile(size, 90), percentile(size, 95),
        percentile(size, 99), percentile(size, 99.9),
        min(latency), avg(latency),
        max(latency),
        percentile(latency, 10), percentile(latency, 20),
        percentile(latency, 30), percentile(latency, 40),
        percentile(latency, 50), percentile(latency, 60),
        percentile(latency, 70), percentile(latency, 80),
        percentile(latency, 90), percentile(latency, 95),
        percentile(latency, 99), percentile(latency, 99.9)
        FROM logrecs GROUP BY node, request, httpcode
freeze pane : E3

You can check the available queries, including the additional ones:

$ hcprequestanalytics -d dbfile.db -a addqueries showqueries
available queries:
    500_highest_throughput          The 500 records with the highest throughput (Bytes/sec)
    500_largest_req_httpcode_node   The records with the 500 largest requests by req, httpcode, node
    500_largest_size                The records with the 500 largest requests sorted by size
    500_worst_latency               The records with the 500 worst latencies
    add_count                       count all records
    add_node_req_http               node-per-request-per-httpcode analysis
    add_req_count                   count records per request
    clientip                        No. of records per client IP address
    clientip_httpcode               No. of records per http code per client IP address
    clientip_request_httpcode       No. of records per http code per request per client IP address
    count                           No. of records, overall
    day                             No. of records per day
    day_hour                        No. of records per hour per day
    day_hour_req                    No. of records per request per hour per day
    day_req                         No. of records per request per day
    day_req_httpcode                No. of records per http code per request per day
    node                            No. of records per node
    node_req                        No. of records per request per node
    node_req_httpcode               No. of records per http code per request per node
    percentile_req                  No. of records per request analysis, including percentiles for size and latency
    percentile_throughput_128kb     No. of records per request, with percentiles on throughput (Bytes/sec) for objects >= 128KB
    req                             No. of records per request
    req_httpcode                    No. of records per http code per request
    req_httpcode_node               No. of records per node per http code per request
    ten_ns_proto_httpcode           No. of records per Tenant / Namespace / protocol / http code
    ten_ns_proto_percentile_req     No. of records per Tenant / Namespace / protocol, including percentiles for size and latency
    ten_ns_proto_user_httpcode      No. of records per Tenant / Namespace / protocol / user / http code
    ten_proto_httpcode              No. of records per Tenant / protocol / http code

Rules:

  • You need to stick to the format as shown above - not doing so will most likely result in a crash
  • the [term] is the name of the query, which you can use in the analyze call
  • the comment entry is what is shown in when calling showqueries
  • the query entry is where to put the query in
  • The QUERY has to follow the SQLite3 SELECT rules
  • You can use all the column names listed below, the aggregate functions offered by SQLite as well as the private functions listed below

Columns in the logrecs table

column type description
node TEXT the HCP nodes backend IP address
clientip TEXT the requesting clients IP address
user TEXT the user who did the request
timestamp FLOAT the point in time of the request (seconds since Epoch)
timestampstr TEXT the point in time of the request (string)
request TEXT the HTTP request
path TEXT the requested object
httpcode INT the HTTP return code
size INT the size of the transfers body
namespace TEXT the HCP Namespace accessed (usually, in the form of namespace.tenant[@protocol])
latency INT the internal latency needed to fullfil the request

Private SQL functions that can be used in queries

  • getNamespace(path, namespace)

    Extract the name of the Namespace (bucket, container) from the path and namespace database columns.

  • getTenant(namespace)

    Extract the name of the Tenant from the namespace database column.

  • getProtocol(namespace)

    Extract the access protocol used from the namespace database column. Returns either S3, Swift or native REST.

  • percentile(column, float)

    Aggregate function that calculates the percentage (given by float) of column from all selected records.

    Warning

    Due to it’s nature, percentile() collects a list of the columns’ value from each selected row. As this list is held in memory, it can consume a lot of it. A rough calculation would be:

      no. of percentile() occurrences in the query
    * no of rows selected
    * 24 bytes
    
  • tp(size, latency)

    Calculates the throughput (in bytes/second) from an objects size and the internal latency.