Commit 1835e5d5 authored by Paul Asmuth

fnordmetric enterprise doc WIP

parent 0e86b3f9
+1 −1
GEM
  remote: http://rubygems.org/
  remote: https://rubygems.org/
  specs:
    backports (2.7.1)
    monkey-lib (0.5.4)
+12 −15
@@ -43,25 +43,22 @@ sitemap:

  fnordmetric_enterprise:
    -
      title: "Introduction"
      title: "Getting Started"
      url: "/enterprise_index"
    -
      title: "Deployment"
      url: "/enterprise_deployment"
    -
      title: "HTTP API"
      url: "/enterprise_http_api"
    -
      title: "TCP/UDP API"
      url: "/enterprise_tcp_udp_api"
      title: "API Reference"
      url: "/enterprise_api_reference"
    #-
    #  title: "HTTP API"
    #  url: "/enterprise_http_api"
    -
      title: "Configuration"
      url: "/enterprise_usage"
    -
      title: "Clients"
      url: "/enterprise_usage"
    -
      title: "Examples"
      url: "/enterprise_examples"
    #-
    #  title: "Clients"
    #  url: "/enterprise_usage"
    #-
    #  title: "Examples"
    #  url: "/enterprise_examples"

+169 −0
API Reference
-------------

There are three basic operations: `add_sample`, `value_at` and `values_in`.
`add_sample` adds a sample to a metric, `value_at` retrieves the measured value at
a specified time and `values_in` retrieves all aggregated measured values in a specified
time interval.

The metric type and interval are implicitly specified by the metric key; all keys have to
end with "$type-$interval".

For example: if you want a metric `response_time` to record the average/mean of all sampled
values in an aggregation interval of 60 seconds, use the key `response_time-mean-60`. For a
metric `total_clicks` that sums up all measurements in one-hour intervals, you could use
`total_clicks.sum-3600`.
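
The key convention can be illustrated with a small parser. This is only a sketch: the `MetricKey` type and `parse_metric_key` function are hypothetical helpers, not part of FnordMetric, and the sketch assumes that the type is one of the two documented ones (`sum`, `mean`) and is separated from the metric name by either `-` or `.`, as in the two example keys.

```python
import re
from typing import NamedTuple


class MetricKey(NamedTuple):
    name: str       # free-form metric name, e.g. "response_time"
    type: str       # metric type, e.g. "mean"
    interval: int   # aggregation interval in seconds


# Assumption: keys end with "<type>-<interval>", preceded by "-" or ".".
_KEY_RE = re.compile(r"^(?P<name>.+)[-.](?P<type>sum|mean)-(?P<interval>\d+)$")


def parse_metric_key(key: str) -> MetricKey:
    """Split a metric key into its name, type and interval."""
    match = _KEY_RE.match(key)
    if match is None:
        raise ValueError(f"key does not end with $type-$interval: {key!r}")
    return MetricKey(match["name"], match["type"], int(match["interval"]))


print(parse_metric_key("response_time-mean-60"))
print(parse_metric_key("total_clicks.sum-3600"))
```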

These are all metric types that are currently supported:

<table>
  <tr>
    <th>sum</th>
    <td>
      Records the sum of all sampled values in an interval. <i>e.g. total_sales-sum-60</i>
    </td>
  </tr>
  <tr>
    <th>mean</th>
    <td>
      Records the mean / average of all sampled values in an interval. <i>e.g. response_time-mean-60</i>
    </td>
  </tr>
</table>


### Protocol

FnordMetric Enterprise offers a simple, line-based serial request/response protocol using
US-ASCII text. Requests may contain any ASCII character and must end with a newline (`\n`).
Responses are also plain ASCII: `OK`, `null`, or formatted numbers separated by whitespace
and colons. Responses always end with a newline character as well.

_Example: Increment total\_requests-sum-60 by 5_

    >> SAMPLE total_requests-sum-60 5
    << OK

_Example: Retrieve the current value of total\_requests-sum-60_

    >> VALUE_AT total_requests-sum-60 now
    << 23087


#### TCP / WebSockets

When connecting with TCP, there is no application-level handshake: you just open the
connection and start sending commands. The protocol does not support multiplexing or pipelining:
you have to wait for one response line after every request you send.

Example with netcat:

    echo "SAMPLE total_requests-sum-60 5" | nc localhost 8922
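
The same exchange can be sketched in Python. The `LineClient` class and `connect` function are hypothetical helpers written for this illustration; the host and port are assumptions taken from the examples in this documentation. The client sends one command and blocks until one response line arrives, since the protocol does not support pipelining.

```python
import socket


class LineClient:
    """Minimal request/response client: one command out, one line back."""

    def __init__(self, sock: socket.socket):
        self._sock = sock
        # Text-mode reader so responses can be read line by line.
        self._reader = sock.makefile("r", encoding="ascii", newline="\n")

    def request(self, command: str) -> str:
        if "\n" in command:
            raise ValueError("a command must be a single line")
        self._sock.sendall(command.encode("ascii") + b"\n")
        line = self._reader.readline()
        if not line:
            raise ConnectionError("connection closed before a response arrived")
        return line.rstrip("\n")


def connect(host: str = "localhost", port: int = 8922) -> LineClient:
    # Assumed defaults; adjust to wherever the server listens.
    return LineClient(socket.create_connection((host, port)))
```

With a server running, `connect().request("SAMPLE total_requests-sum-60 5")` would return the response line without its trailing newline.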


When connecting with WebSockets, you have to use `fnordmetric_enterprise` as the protocol string.

Example in JavaScript:

    var ws = new WebSocket("ws://localhost:8080/", "fnordmetric_enterprise")
    ...
    ws.send("SAMPLE total_requests-sum-60 5")


The generic error response format is the same for TCP and WebSocket connections. The error
message never contains a newline character:

    ERROR something went wrong\n
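
A client might turn such a line into an exception. The following is a minimal sketch; `FnordMetricError` and `check_response` are hypothetical names, and the sketch assumes that an error line always starts with the word `ERROR` followed by a space and the message, as in the format above.

```python
class FnordMetricError(Exception):
    """Raised when the server answers with an ERROR line."""


def check_response(line: str) -> str:
    """Return the response payload, or raise if the line is an error."""
    payload = line.rstrip("\n")
    if payload == "ERROR" or payload.startswith("ERROR "):
        # Everything after the "ERROR" keyword is the human-readable message.
        raise FnordMetricError(payload[len("ERROR"):].strip())
    return payload
```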


#### UDP

When you are using UDP as the transport layer, you won't receive any responses. Therefore the only
command that works over UDP is `SAMPLE`. Note that UDP doesn't guarantee delivery.

You can put one or more commands into one UDP packet. Every command needs to end with a newline
character (`\n`).

Example UDP packet that increments three counters:

    SAMPLE my_counter1.sum-3600 123\n
    SAMPLE my_counter2.sum-3600 456\n
    SAMPLE my_counter3.sum-3600 789\n
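
A packet like this could be built and sent from Python as follows. This is a sketch under assumptions: `build_udp_payload` and `send_samples` are hypothetical helper names, and the default host and port are placeholders that must match your server's UDP settings.

```python
import socket


def build_udp_payload(samples) -> bytes:
    """Join (key, value) pairs into newline-terminated SAMPLE commands."""
    lines = [f"SAMPLE {key} {value}\n" for key, value in samples]
    return "".join(lines).encode("ascii")


def send_samples(samples, host: str = "localhost", port: int = 8922) -> None:
    # Fire and forget: UDP gives no response and no delivery guarantee.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(build_udp_payload(samples), (host, port))


payload = build_udp_payload([
    ("my_counter1.sum-3600", 123),
    ("my_counter2.sum-3600", 456),
    ("my_counter3.sum-3600", 789),
])
print(payload.decode("ascii"), end="")
```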



### Time format

  here be dragons



### Commands

#### SAMPLE

Samples/adds a value to a metric. What exactly this does depends on the type of the metric.

*Format:*

    SAMPLE [metric] [value]

*Response:*

    "OK"

*Example:*

    >> SAMPLE my_application.response_time.avg-30 23
    << OK

<br />


#### VALUE_AT

Retrieves the measured value of a metric at a specific point in time. See above
for the time format.

*Format:*

    VALUE_AT [metric] [at]

*Response:*

    Numeric value as a string or "null"

*Example:*

     >> VALUE_AT my_application.response_time.avg-30 -3hours
     << 17.42

     >> VALUE_AT my_application.response_time.avg-30 now
     << null

<br />


#### VALUES_IN

Retrieves all measured values of a metric in a specific time interval. See above
for the time format.

*Format:*

    VALUES_IN [metric] [from] [until]

*Response:*

    whitespace-separated timestamp:value tuples or "null"

*Example:*

     >> VALUES_IN my_application.response_time.avg-30 -2hours now
     << 1360804571:4233.52 1360804581:4312.36 1360804591:6323.12

     >> VALUES_IN my_application.response_time.avg-30 -6hours -5hours
     << null
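
On the client side, a `VALUES_IN` response line can be split into data points. The sketch below uses a hypothetical `parse_values_in` helper and assumes that timestamps are integers and values parse as floats, as in the example above. Mapping `null` to an empty list is a design choice of this sketch, not something the protocol prescribes.

```python
def parse_values_in(line: str):
    """Parse 'timestamp:value' tuples into a list of (int, float) pairs."""
    line = line.strip()
    if line == "null":
        return []  # no data recorded in the requested interval
    points = []
    for token in line.split():
        timestamp, _, value = token.partition(":")
        points.append((int(timestamp), float(value)))
    return points


print(parse_values_in("1360804571:4233.52 1360804581:4312.36 1360804591:6323.12\n"))
```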


+23 −0

### In-memory vs. disk storage

FnordMetric Enterprise stores the values as 64-bit double-precision floats.

With an example flush timeout of 10 seconds, one metric uses 0.065 MB of
memory per day or 0.4 MB per week. The default ring buffer size is x,.

That means with only 4 GB of RAM, you could access the last month of data of
2500 counters/metrics with 10-second granularity, all from the in-memory
ring buffers (without ever hitting an HDD).
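
The per-metric figures can be approximated with a short calculation. This sketch assumes that each aggregated value occupies exactly 8 bytes with no per-entry overhead, and that "MB" is read as MiB (2^20 bytes). Both are assumptions made for illustration, not statements about the actual ring buffer layout.

```python
BYTES_PER_VALUE = 8  # one 64-bit double per aggregated value (assumed, no overhead)


def ring_buffer_mib(days: float, interval_seconds: int = 10) -> float:
    """Approximate memory in MiB for one metric over the given number of days."""
    values = days * 86400 / interval_seconds
    return values * BYTES_PER_VALUE / 2**20


per_day = ring_buffer_mib(1)
per_week = ring_buffer_mib(7)
print(f"{per_day:.3f} MiB per day, {per_week:.2f} MiB per week")
```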

Requests that cannot be served from memory require one sequential disk read.


### FnordMetric Enterprise vs. StatsD

+ allows for flush intervals as low as one second
+ rendered in the browser, interactive
+ much more scalable
+ highly customizable with CSS
+ requires only a single deployment
+ i18n (proper timezones in graphs due to in-browser rendering, etc.)
+28 −33
FnordMetric Enterprise
----------------------

FnordMetric Enterprise is a JVM-based timeseries database. It's a key-value store
(much like redis or memcached) where each key holds a "measurement". There are
different measurement methods like sum, mean, min/max, 90th percentile, etc. You
continuously add data to these keys/measurements which is aggregated and periodically
persisted.
FnordMetric Enterprise is a key-value store (much like redis or memcached) where each
key holds a "metric". There are different metric types like sum, mean, min/max, 90th
percentile, etc. You continuously add data to these keys/metrics which is aggregated
and periodically persisted.

FnordMetric Enterprise features disk persistence, a HTTP, TCP and UDP API, native
WebSockets support, CSV/JSON Export and a turnkey-ready HTML5 visualization solution
(FnordMetric UI). FnordMetric Enterprise can be used as a drop-in replacement for
StatsD+Graphite (it is a lot faster, see Benchmarks).
FnordMetric Enterprise features a TCP/UDP, HTTP and WebSocket API, CSV/JSON Export and a
turnkey-ready HTML5 visualization solution ([FnordMetric UI](/documentation/ui_index)).
It can be used as a drop-in replacement for StatsD+Graphite.


### Semantics

There are three basic operations: `add_sample`, `value_at` and `values_in` that
add a sample to an ongoing measurement, retrieve the measurement value at a
specified time, or retrieve all aggregated measurement values in a specified time
interval.
There are three basic operations: `add_sample`, `value_at` and `values_in`.
`add_sample` adds a sample to a metric, `value_at` retrieves the measured value at
a specified time and `values_in` retrieves all aggregated measured values in a specified
time interval.

The measurement method and flush_interval are implicitly specified by the key;
all keys have to be postfixed with "$method-$flush_timeout". For example if
you want a key "response_time" to operate in average/mean mode and flush every 60
seconds, use the key `response_time-mean-60`, for a key "total_clicks" that
aggregates/sums a value and flushes every hour, you could use "total_clicks.sum-3600"
The metric type and interval are implicitly specified by the metric key; all keys have to
end with "$type-$interval".

For example: if you want a metric `response_time` to record the average/mean of all sampled
values in an aggregation interval of 60 seconds, use the key `response_time-mean-60`. For a
metric `total_clicks` that sums up all measurements in one-hour intervals, you could use
`total_clicks.sum-3600`.

### In-memory vs. disk storage
You can find a list of all metric types and the [full API Reference here](/documentation/fnordmetric_api_reference/).

FnordMetric Enterprise stores the values as 64bit double precision floats.

With an example flush timeout of 10 seconds, one metric uses 0.065 MB of
memory per day or 0.4 MB per week. The default ring buffer size is x,.
### Installation

That means with only 4GB of ram, you could access the last month of data of
2500 counters/measurements with 10 second granularity all from the in-memory
ringbuffers (without ever hitting a HDD).
Installing FnordMetric Enterprise is straightforward. Download the latest release
[here](/documentation/downloads) and run the jarfile with this command:

Requests that can not be served from memory require one sequential disk read.
    java -jar FnordMetric-Enterprise-v1.2.7.jar --tcp 8922 --udp 8922 --websocket 8080

This will start FnordMetric, listen on UDP and TCP port 8922 and start a WebSocket
server on port 8080.

### FnordMetric Enterprise vs. StatsD

+ allows for flush intervals as low as one second
+ rendered in the browser, interactive
+ much much more scalable
+ highly customizable with css
+ requires only a single deployment
+ i18n (proper timezones in graphs due to in browser rendering etc)
### Getting Started

...