Thursday, April 23, 2015

Logging FreeNAS performance data into Graphite

Update 12/23/2015 - I now have an updated post which supersedes this one.

Update 12/2/2015 - This information is dated, and there's a really good way to handle FreeNAS logging to Graphite with FreeNAS 9.3 that I need to document. I'll update this post with a link once I get that post done. In the meantime, this is just a placeholder.

FreeNAS is a great NAS appliance that I have been known to install just about anywhere I can. One of the things that makes it so cool is the native support for RRD graphs which you can view in the "Reporting" tab. What would be cooler, though, is if it could log its data to the seriously awesome metrics collection software Graphite. I've been working on getting Graphite installed at work, and have always wanted metrics collection at my house (because: nerd) and so once I got a Graphite server up and running, one of the first things I did was modify my FreeNAS system to point to the Graphite server.

Here are the steps I followed on FreeNAS 9.2.1.8.  I'm overdue for FreeNAS 9.3 and once I've done the upgrade, I'll update these instructions as necessary.


  1. Install a Graphite server.  Four little words which sound so easy, but mask the thrill and heartbreak that can come with trying to accomplish this task. I tried several guides and was a little daunted when I saw most of them mentioning how much of a pain in the ass installing Graphite can be, but then I managed to find this nice, simple guide that used the EPEL packages on a CentOS box. Following those instructions, I managed to get two CentOS 6 boxes up and running in pretty short order, and then with some slight modifications, I set up a CentOS 7 graphite server at home.
  2. Make sure port 2003 on your Graphite box is visible to your FreeNAS box. This usually involves opening some firewall rules.
  3. SSH into your FreeNAS box. (If you don't know what this means, you probably never got to this step, as "Install a Graphite server" would have entirely broken your will to live.) You will also need to log into your FreeNAS box as either root, or a user that has sudo permission.
  4. Edit the collectd config file:
    1. sudo vi /etc/local/collectd.conf
    2. At the top of the file, change "Hostname" from "localhost" to your hostname for the NAS. Otherwise, your NAS will report to the Graphite host as "localhost", and that's less than useful.

      Hostname "nastard"
      ...
    3. There is a block of lines all starting with "LoadPlugin". At the bottom of this block, add "LoadPlugin write_graphite":

      ...
      LoadPlugin processes
      LoadPlugin rrdtool
      LoadPlugin swap
      LoadPlugin uptime
      LoadPlugin syslog
      LoadPlugin write_graphite

      ...
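    4. At the bottom of the file, add the following block, substituting your Graphite server's hostname for "graphite.example.net". The Node name ("graphite" here) is just a label for this block, so use whatever you like:

      ...

      <Plugin write_graphite>
        <Node "graphite">
          Host "graphite.example.net"
          Port "2003"
          Protocol "tcp"
          LogSendErrors true
          Prefix "servers."
          Postfix ""
          StoreRates true
          AlwaysAppendDS false
          EscapeCharacter "_"
        </Node>
      </Plugin>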
  5. Change to the directory "/var/db/collectd/rrd". This is where FreeNAS stores the RRD metrics that are visible in the GUI. If we just restart collectd, it's going to start logging under the new hostname (since we changed that above), and that'll break the UI's RRD graphs. While we'd still have the data in Graphite, we can have our Graphite cake and RRD eat it too by doing the following steps.
  6. Shut down collectd:

    sudo service collectd stop
  7. Move the "localhost" directory (under /var/db/collectd/rrd) to whatever you set the hostname to in the collectd.conf above:

    sudo mv localhost nastard
  8. Symlink the directory back to "localhost":

    sudo ln -s nastard localhost
  9. Restart collectd:

    sudo service collectd start
  10. That's it! At this point, you can reload the FreeNAS GUI and see that you still have your RRD data. More importantly, if you go to your Graphite GUI, you should now see metrics coming in.
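
If no metrics show up, it can help to rule out the network before blaming collectd. Here's a minimal sketch, assuming you have Python 3 on some machine that should be able to reach the Graphite box, that pushes a single hand-made metric to port 2003 using Carbon's plaintext protocol (one "path value timestamp" line per metric). The metric path "servers.nastard.test.manual" is made up for illustration, and the hostnames are the example ones from the config above.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one line of Carbon's plaintext protocol: '<path> <value> <timestamp>\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(host, path, value, port=2003, timeout=5.0):
    """Open a TCP connection to Carbon and send a single metric line.

    Raises OSError (for example a timeout or connection refused) if the
    port is not reachable, which usually points at a firewall rule.
    """
    line = format_metric(path, value)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))


# Example call (hypothetical metric path, example hostname from this post):
# send_metric("graphite.example.net", "servers.nastard.test.manual", 1)
```

If the call returns without an exception and "servers.nastard.test.manual" appears in the Graphite tree shortly afterwards, the port is open and Carbon is accepting data, so any remaining problem is more likely on the collectd side.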

Protips:

  • Collectd writes data every 10 seconds by default. If you write all your collectd data with the "servers." prefix as I've shown above, you can make sure your whisper files are configured for this interval with the following block in your /etc/carbon/storage-schemas.conf:

    [collectd]
    pattern = ^servers\.
    retentions = 10s:90d,1m:1y

    This will retain your full 10-second metrics for 90 days and 1-minute metrics for a year. That config results in each whisper file being 15MB, and with my NAS config (with 6 running jails) I have 220 whisper files for a total disk space of 1.2G. Considering disk space is pretty cheap, you could easily adjust these numbers up to retain more data for longer.  You should also read up on Graphite's storage-aggregation.conf, which controls how the data is pared down when it's rolled up into the coarser intervals.

    Thanks to Ben K for pointing out that more than one or two aggregations will greatly increase the amount of disk access. Initially I had a four stage aggregation, but that would require a crapload of access happening with each write. Since Graphite is very IO intensive to begin with, that's not a good idea.
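
If you want to try other retention settings before committing to them, the per-file size is easy to estimate. The sketch below is my own back-of-the-envelope helper, not part of Graphite. It assumes the standard whisper layout: a 16-byte header, 12 bytes of metadata per archive, and 12 bytes per stored data point, with a year counted as 365 days.

```python
# Seconds per unit suffix, using a 365-day year.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def parse_duration(text):
    """Convert a string such as '10s' or '90d' into seconds."""
    return int(text[:-1]) * UNIT_SECONDS[text[-1]]


def whisper_file_size(retentions):
    """Estimate the size in bytes of one whisper file for a retentions string."""
    archives = [a.strip() for a in retentions.split(",")]
    total_points = 0
    for archive in archives:
        step, keep = archive.split(":")
        # Number of slots = how long we keep data / how often a point is stored.
        total_points += parse_duration(keep) // parse_duration(step)
    # 16-byte header + 12 bytes per archive definition + 12 bytes per point.
    return 16 + 12 * len(archives) + 12 * total_points


print(whisper_file_size("10s:90d,1m:1y"))  # 15638440
```

That works out to roughly 15MB per file for the retentions shown above. Multiply by the number of metrics you expect to get a rough upper bound on disk usage.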