Thursday, December 24, 2015

Getting started with data logging on the Raspberry Pi

I've got a friend who just got a Raspberry Pi and wants to try doing some projects. One of the first things he wants to do is track temperature and humidity in his house, which is a really good first project because it's not TOO difficult - a great way to get your feet wet playing with "physical computing".

So, he got the Pi, got an OS on it, got it booted up, and then said "I have no idea what to do with it." So I thought a little, and realized, if the goal is temp logging, there's a really, really easy place to start.

I sent him this:

Open a terminal and copy this into a file called "log-cpu-temp": (do you know vi? if not, use nano)

#!/bin/bash

eval $(date +'now=%s date=%y%m%d')

echo "cpu.temp,$(< /sys/class/thermal/thermal_zone0/temp),$now" >> \
    $HOME/cputemp-$date.csv


Then, make it executable:

chmod +x log-cpu-temp

Test it by running it:

./log-cpu-temp

It should create a file in your home directory named cputemp-151223.csv containing the current CPU temperature in millidegrees C - something along the lines of "35780", which means 35.780 deg C.
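If you want to read that value in plain degrees, a quick awk one-liner does the division (shown here on a literal value so it works anywhere; on the Pi you'd point it at the /sys file used above):

```shell
# The sensor reports millidegrees C; divide by 1000 for degrees.
echo 35780 | awk '{ printf "%.3f\n", $1 / 1000 }'   # prints 35.780
```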

Once it's running, then add it to your crontab:

crontab -e

and add to the bottom:

* * * * * /home/pi/log-cpu-temp

Once you save and exit, it'll log the CPU temperature to the CSV file every minute.

Welcome to data logging!

Wednesday, December 23, 2015

Update on using Graphite with FreeNAS

A while back, I posted on using Graphite with FreeNAS. Well, there have been some changes with the recent versions, and this makes integrating Graphite with FreeNAS even easier, so it's time for an update. This applies to FreeNAS-9.3-STABLE.

FreeNAS collects metrics on itself using collectd. This is a nice program which does nothing but gather metrics, and gather them well. FreeNAS gathers basic metrics on itself - cpu, disk performance, disk space, network interfaces, memory, processes, swap, uptime, and ZFS stats, and logs it to RRD databases which can be accessed via the Reporting tab. However, as nice as that is, I much prefer the Graphite TSDB (time-series database) for storing and displaying metrics.

Previously, I was editing collectd.conf directly, but since collectd.conf is dynamically generated, I'd have to re-add the same block of code every time it was regenerated. So I decided to move my additions into files stored on my zpool, and use an Include directive added to the end of the native collectd.conf to pull in those files. At this point, all I add to the native collectd.conf is this line:

Include "/mnt/sto/config/collectd/*.conf"

This makes my edits really easy, and allows me to create a script to check for it and fix it if necessary - more on that later.

In the /mnt/sto/config/collectd/ directory, I have several files - graphite.conf, hostname.conf, ntpd.conf, and ping.conf.

The graphite.conf loads and defines the write_graphite plugin:

LoadPlugin write_graphite
<Plugin "write_graphite">
  <Node "graphite">
    Host "graphite.example.net"
    Port "2003"
    Protocol "tcp"
    LogSendErrors true
    Prefix "servers."
    Postfix ""
    StoreRates true
    AlwaysAppendDS false
    EscapeCharacter "_"
  </Node>
</Plugin>

It's worth mentioning that some of the other TSDBs out there accept Graphite's native plain-text format, so this could be used with them just as well. Or, if you had another collectd host, you could use collectd's "network" plugin to send to those.

The hostname.conf redefines the hostname. The native collectd.conf uses "localhost", and that does no good when logging to a graphite server which is receiving metrics from many hosts, so I force it to the hostname of my FreeNAS system:

Hostname "nas"

In order for this to not break the Reporting tab in FreeNAS (not that I use that anymore with the metrics in Graphite), I first need to move the local RRD databases to my zpool by checking "Reporting Database" under "System Dataset" in the "System" tab:



I then go to the RRD directory, move "localhost" to "nas", and then symlink nas to localhost:

lrwxr-xr-x   1 root  wheel       3 May 19  2015 localhost -> nas
drwxr-xr-x  83 root  wheel      83 Dec 20 10:23 nas

This way, redefining the hostname in collectd causes the RRD data to be written to the "nas" directory, but when the GUI looks for the "localhost" directory, it still finds what it's looking for and displays the metrics properly.
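In commands, that's just a mv and an ln -s. Demonstrated here in a scratch directory so it's safe to run anywhere - on the NAS itself you'd run the mv and ln in the actual RRD directory, wherever your system dataset puts it:

```shell
# Scratch-directory demo of the rename-and-symlink trick.
rrd=$(mktemp -d)
mkdir "$rrd/localhost"          # stands in for the existing RRD data dir
mv "$rrd/localhost" "$rrd/nas"  # rename to the real hostname
ln -s nas "$rrd/localhost"      # the GUI still finds "localhost"
ls -l "$rrd"
```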

The ntpd.conf enables ntpd logging, which I use to monitor the time offsets on my FreeNAS box on my Icinga2 monitoring host:

LoadPlugin ntpd
<Plugin "ntpd">
        Host "localhost"
        Port 123
        ReverseLookups false
</Plugin>


Finally, ping.conf calls the Exec plugin to echo a value of "1" all the time:

LoadPlugin "exec"
<Plugin "exec">
  Exec "nobody:nobody" "/bin/echo" "PUTVAL nas/collectd/ping N:1"
</Plugin>


I use this on my Icinga2 server to check the health of the collectd data, and have a dependency on this check for all the other Graphite-based checks. This way, if collectd breaks, I get alerted on collectd being broken - the actual problem. This prevents a flurry of alerts on all the things I'm checking from Graphite, which makes deciphering the actual problem more difficult.

So, I define the Graphite writer, I change the hostname so the metrics show up on the Graphite host with the proper servers.nas.* path, and I add two more groups of metrics to the default configuration. These configuration files are stored on my zpool, so even if my FreeNAS boot drive craps out (which actually happened last week) and I have to reload the OS from scratch, I don't lose these files.

Since I'm only adding one line to the bottom of the collectd.conf file, it becomes very easy to check for my additions, and if necessary, add them. I have a short script which I run via cron: (the "Tasks" tab in the FreeNAS GUI)

#!/bin/bash

# Set the file path and the line I want to add
conf=/etc/local/collectd.conf
inc='Include "/mnt/sto/config/collectd/*.conf"'

# Fail if I'm not running as root
if (( EUID ))
then
  echo "ERROR: Must be run as root. Exiting." >&2
  exit 1
fi

# Check to see if the line is in the config file
if grep -qF "$inc" $conf
then
    : All good, exit quietly.
else
    : Missing the include line! Add it!
    echo "$inc" >> $conf
    service collectd restart
    logger -p user.warn -t "collectd" \
         "Added Include line to collectd.conf and restarted."

    echo "Added include to collectd.conf" | \
         mail -s "Collectd fixed on NAS" mymyselfandi@example.com
fi


If I reboot my FreeNAS system, the collectd.conf gets reverted. Not a huge problem if I can wait no more than 30 minutes for my cron job to run, but in 9.3, I can do even better. I can call the script at boot time as a postinit script from the Init/Shutdown Scripts section of "Tasks":

 

This way, when I boot the system, it runs the check script, which sees the missing Include line, adds it automatically, and restarts collectd so it resumes logging to my Graphite server.

This setup has proven to be wonderfully reliable, and unless/until native Graphite support is added to FreeNAS, should keep on working.

Wednesday, November 18, 2015

How to use tip tinner

So a buddy who knows things about how to solder told me I had to get tip tinner. Perfect, I got R&R Lotion tip tinner. (fun side note - when you get a notification on your phone that your "R&R Lotion..." shipped, it might not be what first comes to mind.) It arrived with no instructions. Looks pretty simple, but hey, I don't know what I'm doing, and I tend not to just guess, especially when I've just shelled out good money for a nice adjustable soldering station. So I searched around, and found this - the best and simplest guide to using tip tinner that I came across. It really is pretty easy.

Tuesday, October 20, 2015

Logging output from a DHT22 temp/humidity sensor to Graphite via collectd on the Raspberry Pi

Update: 11/29/2015 - Since I initially wrote this, I'm deciding that this should be in the "just because you can, doesn't mean you should" category. I am finding it much easier to query the sensor and then submit the metrics to Graphite directly using the plaintext protocol. You can do it with collectd, but it just introduces far too many complications to really be worth it.
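For reference, the plaintext protocol is nothing more than "metric value timestamp" sent to port 2003, so submitting a reading by hand is a one-liner. A sketch (the metric path is made up for illustration, and graphite.example.net stands in for your Graphite host):

```shell
# Build the line first so you can see exactly what goes over the wire:
line="servers.pi.dht22.temperature 23.5 $(date +%s)"
echo "$line"

# ...then ship it off (uncomment, using your real Graphite host):
# echo "$line" | nc -w1 graphite.example.net 2003
```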

This is a bit of a fringe case, but if I don't write down what I just learned, I'll totally forget it. I got a DHT22 temperature and humidity sensor, and unlike the 1-wire DS18B20 temperature sensor, there isn't a convenient kernel module so I can't just read from the /sys filesystem. Thankfully, Adafruit has a nice guide to using the DHT22 with the Raspberry Pi, and they've got a GitHub repository with the code needed to query the sensor.

Of course, due to my love of Graphite, I need to immediately get my DHT22 not only working, but logging to Graphite, because METRICS. (funny how that word used to annoy me) I could simply modify the Adafruit code to output in Graphite plaintext format, but since I use collectd for gathering my host-based metrics anyway, let's have it do the work and submit everything to graphite together.

I could modify the python script to output in the collectd format, and call that with the Exec plugin, but since the python code needs to be run as root, I decided to keep it pretty minimal, and write a shell wrapper around it, because I know shell. ^_^

The biggest problem I ran into was getting my data to log, even though the script was working, and I discovered this limitation: with the exec data format, the identifier - the path of the metric being collected - has to have a very specific format: hostname/exec-instance/type-instance. "hostname" is pretty obvious, and is defined by COLLECTD_HOSTNAME. (as documented in the Exec plugin docs) exec-instance just has to be a unique instance name, and with this being the only exec plugin I'm running, uniqueness is easy. The last entry, type-instance has to have "type" being a valid type as defined in /usr/share/collectd/types.db, and "instance" again is any unique name. Once I changed my metric path identifier to match this standard, my stuff started logging.
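Concretely, a PUTVAL line following that identifier format would look like this (the hostname and instance names here are made up for illustration; "temperature" is the part that has to be a valid type from types.db):

```shell
# identifier = hostname/plugin-instance/type-instance
# "exec-dht22" = exec plugin, instance "dht22"
# "temperature-room" = type "temperature", instance "room"
echo 'PUTVAL "pi/exec-dht22/temperature-room" interval=60 N:23.5'
```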

Here's my modified Adafruit python script to gather the data from the DHT22:
https://github.com/ChrisHeerschap/lakehouse/blob/master/dht.py

Here's the shell script wrapper called by collectd:
https://github.com/ChrisHeerschap/lakehouse/blob/master/dht-collectd

And, here's what it looks like when the data gets into Graphite, referencing the DHT22 against a DS18B20 1-Wire sensor:


Monday, August 31, 2015

Creating an access jail "jump box" on FreeNAS

If you wish to have external access to your network through SSH, it's a very good idea to have a very limited-purpose "jump box" that provides the only external access, tightly limited as to who can log into it and what they can do when they get there. Here is what I've developed using a jail on a FreeNAS system.

I've stolen some ideas from DrKK's Definitive Guide to Installing OwnCloud in FreeNAS (or FreeBSD)

  1. Start with the latest version of FreeNAS. I'll leave it up to you to figure that part out.
  2. Create a standard jail, choose Advanced mode, make sure the IP is valid, and uncheck "VIMAGE"
  3. Log into the jail via "jls" and "jexec"
    jls
    sudo jexec access csh
  4. Remove all installed packages that aren't the pkg command:
    pkg info | awk '$1 !~ /^pkg-/ {print $1}' | xargs pkg remove -y
  5. Update installed files using the pkg command:
    pkg update
    pkg upgrade -y
    pkg will likely update itself.
  6. Install bash and openssh-portable via the pkg command:
    pkg install -y bash openssh-portable
     
  7. Move the old /etc/ssh directory to a safe place and create a symlink to /usr/local/etc
    mv /etc/ssh /etc/oldssh
    ln -s /usr/local/etc/ssh /etc/ssh
    NOTE: this step is purely for convenience and is not necessary but may avoid confusion since the native ssh files won't be used.
  8. Make sure your /usr/local/etc/sshd_config contains at least the following:
    Port 22
    AllowGroups user
    AddressFamily inet
    PermitRootLogin no
    PasswordAuthentication no
    PermitEmptyPasswords no
    PermitUserEnvironment yes
  9. Enable the openssh sshd and start it:
    echo openssh_enable=YES >> /etc/rc.conf
    service openssh start
  10. Verify that openssh is listening on port 22:
    sockstat -l4 | grep 22
  11. Create the users' restricted bin directory:
    mkdir -m 555 /home
    mkdir -m 0711 /home/bin
    chown root:wheel /home/bin

    This creates the directory owned by root and without read permission for the users.
  12. You can create symlinks in here for commands that the users will be allowed to run in their restricted shell. I prefer to take this a step farther - since it's only a jump box, its only purpose is to ssh in, and ssh on to another system. I further restrict this by creating a shell script wrapper around the ssh command which restricts the hosts that the user can login to from the jump box.

    If you have half a clue, you'll wonder how this prevents them from ssh'ing to another host when they get to one that they are allowed access to, and the answer is, if they have the permissions on that host - it doesn't. So it's not a fantastic level of security, but I wanted to see if I could do it. You'll also notice that you need to create a file /home/bin/sshauth.cfg which has the format of "username ALL" or "username host1 host2 ..." which dictates access.
  13. Symlink in the "logger" command to the /home/bin directory:
    ln -s /usr/bin/logger /home/bin
  14. Create the user group "user" (as called out in the sshd_config above) so the users can log in:
    pw groupadd user
  15. Create the users with each home directory under /home, with the shell /usr/local/bin/rbash, no password based authentication, and the group created in the previous step.
    adduser
  16. Change to the user's home directory and remove all the dot files
    cd /home/user
    rm .??*
  17. Create the following .bash_profile in the user's home directory:
    export PATH=/home/bin
    FROM=${SSH_CLIENT%% *}
    logger -p user.warn -t USER_LOGIN "User $LOGNAME logged in from $FROM"
    export HISTFILE=/dev/null
    [[ $TERM == xterm* ]] && echo -ne "\033]0;JAIL-$HOSTNAME\007"
    PS1="\!-$HOSTNAME\$ "
  18. The file permissions should be set, but confirm:
    chmod 644 .bash_profile
    chown root:wheel .bash_profile
  19. Create the .ssh directory and give it to the user:
    mkdir -m 700 .ssh
    chown user:user .ssh
  20. Install the user's authorized_keys file in the .ssh directory, and make sure the permissions are right:
    chown user:user .ssh/authorized_keys
    chmod 600 .ssh/authorized_keys
  21. Your user should be able to login at this point, and do nothing beyond what you've given them access to in the /home/bin directory.
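To give an idea of the restricted-ssh wrapper from step 12, here's a minimal sketch. The sshauth.cfg format follows the description above; everything else (the function name, the SSHAUTH_CFG/JUMPUSER overrides for testing, the messages) is illustrative, not the exact script I run:

```shell
# Minimal sketch of the jump box ssh wrapper, written as a function.
# sshauth.cfg lines look like: "username ALL" or "username host1 host2 ..."
jumpssh() {
    cfg=${SSHAUTH_CFG:-/home/bin/sshauth.cfg}
    target=$1

    # Pull this user's list of allowed hosts out of the config file
    allowed=$(awk -v u="${JUMPUSER:-$LOGNAME}" '$1 == u { $1 = ""; print }' "$cfg")

    for host in $allowed
    do
        case $host in
            ALL|"$target")
                # The real script does: exec /usr/bin/ssh "$target"
                echo "would run: ssh $target"
                return 0
                ;;
        esac
    done

    echo "Access to '$target' denied." >&2
    return 1
}
```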

Saturday, August 22, 2015

Logging weather data with Raspberry Pi and collectd

I have a Raspberry Pi that I'm using to track temperature inside a house that isn't occupied year-round. I have a DS18B20 1-wire temperature sensor sticking out of the case of the Pi, but I also have it connecting to an Ambient Weather WS-1400-IP weather station, tracking the inside and outside temperatures. Initially, I was using curl with some bash tomfoolery to get the values, and alert if necessary. (inside temperature too cold, risk of pipes freezing, etc) This works, and works well, but recently I've been doing a bunch of work with Graphite and Collectd as part of a monitoring system at work. While Graphite is really, really awesome (I have several blog posts I want to do on it) I've been more and more impressed with the flexibility of Collectd. I've even got a Graphite server running at home with Collectd running on all my systems (except laptops, at the moment) just because it's cool and I'm a metrics nerd.

Well, I took another look at the Raspberry Pi that I have in this remote location, and decided that I wanted to set up a simple collectd to start logging some of the metrics on this system. Since I don't have a Graphite server there, I decided I'd just write the metrics to CSV files, so I'd have them, and if I needed, I could import those into my home Graphite server.

I started out with just some basic metrics, cpu, df, disk, load, memory, ntpd, ping, and swap. Got it up and running, looked good, started logging to the CSV files in my home directory. Cool. I then used the collectd APCUPSD plugin to grab metrics from the UPS that's connected to the system. Easy peasy.

Then I remembered coming across this awesome post a while back where the writer uses the collectd cURL plugin to read both CPU temps and connected 1-Wire sensors. I thought that was just cooler than the Poconos in the middle of February, so I had to go ahead and add the following config:

<Plugin curl>
  <Page "CpuTemp">
    URL "file:///sys/class/thermal/thermal_zone0/temp"
      <Match>
        Regex "([0-9]*)"
        DSType "GaugeLast"
        Type "temperature"
        Instance "CPUTemp"
      </Match>
  </Page>

  <Page "1WireTemp">
    URL "file:///sys/bus/w1/devices/28-0004330af2ff/w1_slave"
      <Match>
        Regex "(-?[0-9][0-9][0-9]+)"
        DSType "GaugeLast"
        Type "temperature"
        Instance "Room"
      </Match>
  </Page>
</Plugin>

I adjusted the file path to the 1Wire sensor, as the writer mentions "1wire-filesystem", which I don't use, so I just used the normal path to the 1-wire sensor. A couple small adjustments, and blammo, I'm logging CPU temperature and room temperature from the 1-wire sensor. Okay, granted - the 1-wire temperature is being written in millidegrees C, so 23 degrees C logs as "23000". At some point I'll discover how to change the data before it gets logged, but if I'm accessing this data in Graphite, I'll be able to use the wonderful Graphite functions (specifically, offset and scale) to convert it into whatever I want. It doesn't matter how I log it, just that I do.

If you're curious, the Graphite target for this conversion from millidegrees C to degrees F would be:

target=offset(scale(path.to.1wire.metric,0.0018),32)

That's 9/5, divided by 1000, and offset by 32 degrees. All calculated live on the Graphite server. Slick.
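You can sanity-check that factor on the command line - 23000 millidegrees C is 23°C, which should come out to 73.4°F:

```shell
# offset(scale(x, 0.0018), 32) is just x * 0.0018 + 32
echo 23000 | awk '{ printf "%.1f\n", $1 * 0.0018 + 32 }'   # prints 73.4
```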

Then I realize -- this is the cURL plugin. The WS-1400-IP weather station has a base unit which makes the live weather data available on a web page, http://{weather station IP}/livedata.htm - why couldn't I just use the cURL plugin and hit that page?

Well, no reason aside from not being familiar with the cURL plugin - yet. Using the example above as a template, I started off with a simple config (added to the block above) to grab the indoor and outdoor temperatures:

  <Page "WeatherStation">
    URL "http://{weather station IP}/livedata.htm"
      <Match>
        Regex "inTemp.*value=\"(-?[0-9]+\.?[0-9]*)"
        DSType "GaugeLast"
        Type "temperature"
        Instance "insideTemp"
      </Match>
      <Match>
        Regex "outTemp.*value=\"(-?[0-9]+\.?[0-9]*)"
        DSType "GaugeLast"
        Type "temperature"
        Instance "outsideTemp"
      </Match>
  </Page>

Restarted Collectd, and what do you know - it's logging those temperatures! Well, that was easy!

Let me just explain how this works. I define a page "WeatherStation" which has the URL. Then, I define any number of matches against that one page (so I only have to load it once for all the metrics I'm collecting) to grab metrics. In the case of the inside temp, the output line in the HTML looks like this:

 <td bgcolor="#EDEFEF"><input name="inTemp" disabled="disabled" type="text" class="item_2" style="WIDTH: 80px" value="72.3" maxlength="5" /></td>

So, I write the regex to match against that line, starting with inTemp, then matching anything up to value=". Because the regex is surrounded by double-quotes, I had to escape the double quote in the regex with a backslash. I then put parenthesis around the actual metric that I want to extract. One thing to note is that I'm looking for zero or one "-" indicators (it gets cold up there) followed by one or more number, optionally followed by a literal "." (the decimal point) and more numbers. This allows my regex to match positive or negative integer or decimal values, which is important as the weather station reports temps to 0.1 degree, and humidity is logged as an integer. (percentage)
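You can test that regex outside of collectd against the sample line above - here with sed, pulling out just the capture group (sed's -E syntax is POSIX extended regex, the same flavor the cURL plugin matches with):

```shell
# The sample livedata.htm line, and the same regex extracting the value
html='<td bgcolor="#EDEFEF"><input name="inTemp" disabled="disabled" type="text" class="item_2" style="WIDTH: 80px" value="72.3" maxlength="5" /></td>'
echo "$html" | sed -E 's/.*inTemp.*value="(-?[0-9]+\.?[0-9]*)".*/\1/'   # prints 72.3
```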

Starting with this template, it's only a moment of work to add the remaining metrics - inside and outside humidity, absolute and relative pressure, wind direction, speed and gust speed, solar radiation, UV value and UV index, as well as rainfall accumulations.

If you read the description on the weather station, you'll see that it can report all of these metrics to Weather Underground, and I do have it configured to do just that. That's all well and good, but when I'm only sending that data to them, I don't have direct access to it and can't graph it exactly the way I want. Now that I'm logging the data myself and can get it into Graphite - I can.

SVC mini-script Testing

In a former life, I was a storage administrator for Thomas Jefferson University Hospital in Center City, Philadelphia. Aside from the wide array of lunch options we had to choose from, another thing I really enjoyed was working with IBM's SVC, or SAN Volume Controller. As a sysadmin, there are few software packages that you truly enjoy working with and you think are really, really good, but the SVC was one of those. One of the reasons was because the command-line interface was just a restricted Bash shell - something I knew very, very well. This allowed me to do pretty amazing things, as the shell is a wonderfully capable programming language. Back then, I had a Wiki that I used as a notepad, and I had several pages on the cool things you could do with SVC on the command line. Here is one of those pages, cut-and-pasted from my internal webserver. Be warned, though - none of this has been updated since about 2007!

Thanks to the awesome Aussie Storage Blog for the inspiration to resurrect these old pages.

If you're not really familiar with bash and one-liner bash scripts, you might not feel comfortable running them on your SVC, considering the damage you can do to your SAN. A better idea might be to take a Linux system that's running the bash shell, and test your miniscripts there.

Easy way

With any general user login, you won't have a restricted shell, but as long as you stick to builtin commands, it'll be effectively the same. The big problem you might have is not having access to the "svcinfo" or "svctask" commands. Here's a simple way to get around that... create simple shell scripts called "svcinfo" and "svctask".
Since svcinfo doesn't do anything except output information, you can make a very simple script:
#!/bin/bash

cat << EOF
0:DS4800:online:6:65:8445.0GB:256:2156.2GB
1:DS6800:online:1:20:772.5GB:256:68.5GB
2:DS4500:online:3:3:1628.2GB:256:928.2GB
3:DS4500-SATA:online:1:1:930.5GB:256:430.5GB
EOF
This will give the same output as the "svcinfo lsmdiskgrp -delim : -nohdr" command in SVC... since that's how I generated the output.
To get a little fancier, I could have it take different commands:
#!/bin/bash

mdiskgrp="0:DS4800:online:6:65:8445.0GB:256:2156.2GB
1:DS6800:online:1:20:772.5GB:256:68.5GB
2:DS4500:online:3:3:1628.2GB:256:928.2GB
3:DS4500-SATA:online:1:1:930.5GB:256:430.5GB"

# Names have been changed to protect the guilty
lsvdisk="0:Miata-data:1:io_grp1:online:0:DS4800:40.0GB:striped:::::60050768018F8148F800000000000094
1:Integra-db:0:io_grp0:online:0:DS4800:40.0GB:striped:::::60050768018F8148F800000000000095
2:WRX-data:1:io_grp1:online:0:DS4800:40.0GB:striped:::::60050768018F8148F800000000000096
3:WRX-data2:0:io_grp0:online:0:DS4800:15.0GB:striped:::::60050768018F8148F800000000000097
4:WRX-Archive:1:io_grp1:online:3:DS4500-SATA:500.0GB:striped:::::60050768018F8148F800000000000098"


case $1 in
    lsmdiskgrp)
        echo "$mdiskgrp"
        ;;
    lsvdisk)
        echo "$lsvdisk"
        ;;
esac
So, now if I run "svcinfo lsvdisk" I'll get five lines of output like I would get from "svcinfo lsvdisk -delim : -nohdr" on a very small SVC implementation. If I instead run "svcinfo lsmdiskgrp", I'll see the output I would have seen above.
We could get even fancier and make the script handle the options, like changing "-delim" and handling "-nohdr", but that's starting to get a bit too complicated for just testing.

Testing the script

Now that I have an actual "svcinfo" command, I can run miniscripts on my linux box:
svcinfo lsmdiskgrp -delim : -nohdr | while IFS=: read id name stat nummd numvd size extsize free
do
    [[ $name == *DS4500* ]] && echo "Mdiskgrp $name has $free free of $size"
done
  • I specified the "-delim : -nohdr" on my command line, so I could cut & paste my tested miniscript to the actual SVC. The script above ignores anything past $1.
This will give us the following output:
Mdiskgrp DS4500 has 928.2GB free of 1628.2GB
Mdiskgrp DS4500-SATA has 430.5GB free of 930.5GB

What about svctask?

Well, with svctask, you're actually *doing* something, which you won't want to do in a test environment. One possible approach would be to make an svctask command which just tells you what would be run:
#!/bin/bash

echo "RUN: svctask $*"
Now, whenever our test miniscript calls the svctask command, we'll just see the command line that it would run:
svcinfo lsvdisk -delim : -nohdr | while IFS=: read id name iogid iogrp mdgid mdgrp rest
do
     [[ $mdgrp = DS4800 ]] && svctask chvdisk -name "DS4800-$name" "$name"
done
  • Note that I only grabbed variables I needed, leaving everything else (the rest of the data) in the variable "rest". This example would assign the fields from size to vdisk UID into "rest".
This script would find all vdisks from the mdiskgrp "DS4800" and rename them, prefixing their old name with "DS4800-". Using our "svctask" test script, we would get output like this:
RUN: svctask chvdisk -name "DS4800-Miata-data" "Miata-data"
RUN: svctask chvdisk -name "DS4800-Integra-db" "Integra-db"
RUN: svctask chvdisk -name "DS4800-WRX-data" "WRX-data"
RUN: svctask chvdisk -name "DS4800-WRX-data2" "WRX-data2"
That way we can check to make sure the commands that are output are what we would expect.
However, you might want to ask yourself how lucky you feel before you start scripting svctask commands, especially ones that can do damage. Something like this example, "chvdisk" renames, is pretty tame, so the possibility of doing major damage is low, but just use your brain and test first. Also, before you run the command on the SVC itself, make sure everything is right by prefixing your "svctask" command with "echo"... then you get the same output as you have above and nothing breaks.

SVC mini-script storage

In a former life, I was a storage administrator for Thomas Jefferson University Hospital in Center City, Philadelphia. Aside from the wide array of lunch options we had to choose from, another thing I really enjoyed was working with IBM's SVC, or SAN Volume Controller. As a sysadmin, there are few software packages that you truly enjoy working with and you think are really, really good, but the SVC was one of those. One of the reasons was because the command-line interface was just a restricted Bash shell - something I knew very, very well. This allowed me to do pretty amazing things, as the shell is a wonderfully capable programming language. Back then, I had a Wiki that I used as a notepad, and I had several pages on the cool things you could do with SVC on the command line. Here is one of those pages, cut-and-pasted from my internal webserver. Be warned, though - none of this has been updated since about 2007!

Thanks to the awesome Aussie Storage Blog for the inspiration to resurrect these old pages.

All of the miniscripts are copied via straight cut and paste, so you'll have to scroll horizontally. The upside is that you can cut and paste from here straight into SVC.

very limited "grep" function

function grep { typeset my; while read my; do [[ $my == *$1* ]] && echo $my; done; }
Very rudimentary, just searches stdin for a simple match anywhere on the line, prints the line if it's a match.

Fancy monitor for MDiskGrp Migration with size and progress bar:

function progress { bar="########################################"; echo "Waiting: $(svcinfo lsmigrate | while read x y; do [[ $x = progress ]] && p=$y; [[ $x = migrate_source_vdisk_index ]] && vdisk=$y; if [[ $x = migrate_source_vdisk_copy_id ]]; then if [[ $p = 0 ]]; then echo -n "$vdisk "; else eval $(svcinfo lsvdisk $vdisk | while read a b; do [[ $a = name ]] && echo vdn=$b; [[ $a = real_capacity ]] && b=${b//.00/} && echo sz=$b; done ); printf "Vdisk %16s %3d%% of %7s [%-${#bar}s]\n" "$vdn" "${p}" "${sz}" "${bar:0:$((${#bar}*p/100))}" >&2; fi; fi; done)"; }

Show info about the vdisks assigned to a host

function lshostvdisk { [[ -z $1 ]] && return; FMT="%3s %-16s %6s %10s %7s %-7s\n"; echo "===== VDisks assigned to host $1 ====="; printf "$FMT" "ID" " VDisk" "Size" "MDiskGrp " "IOGrp " " UID"; svcinfo lshostvdiskmap -nohdr $1 | while read a a a vdid vd a; do svcinfo lsvdisk $vd | while read x y; do case $x in IO_group_name) iogrp=$y;; mdisk_grp_name) mdg=$y;; capacity) sz=${y/.??};; vdisk_UID) uid=${y:28};; grainsize) printf "$FMT" "$vdid" "$vd" "$sz" "$mdg" "$iogrp" "...$uid";; esac; done; done; }
Takes a single argument, returns a formatted list of vdisks assigned to the hosts, showing vdisk ID, name, size, mdiskgrp, iogrp, and the last four characters of the UID.

Match an arg to a host WWPN

function hostwwpn { [[ -z $1 ]] && echo Missing argument. && return; svcinfo lshost -nohdr | while read a h a a; do f=$(svcinfo lshost $h | grep $1); [[ -n $f ]] && echo $h $f; done; }

List dead hosts

function lsdeadhosts { svcinfo lshost -nohdr | while read a h a a; do z=$(svcinfo lshost $h | while read x y; do [[ $x = node_logged_in_count ]] && echo -n "$y"; [[ $x = state ]] && printf "/%-8s " "$y"; done ); [[ $z = *0* || $z == *inactive* ]] && printf "%-16s %-s\n" "$h" "$z"; done; }
This function searches through the host objects and looks for ones that have a zero in the "node_logged_in_count". It then displays the hosts as well as the node_logged_in_count numbers, and anything with all zeros is not currently connected to the SAN.

List live hosts

function lslivehosts { svcinfo lshost -nohdr | while read a h a a; do z=$(svcinfo lshost $h | while read x y; do [[ $x = node_logged_in_count ]] && echo -n "$y "; done ); [[ $z == *[1-9]* ]] && printf "%-16s %-s\n" "$h" "$z"; done; }
The converse of the above, useful if you have to quiesce the SAN and need to see who is still active.

basic "free" function to show mdiskgrp usage

function free() { FORMAT="%-12s %9s/%9s %5s\n"; printf "$FORMAT" "MDiskGrp" "Free" "Capacity" "Pct "; svcinfo lsmdiskgrp -delim " " -nohdr | while read a b c d e f g h i j k l m n o p; do pct="$((${i%.*}*1000/${f%.*}))"; printf "$FORMAT" "$b" "$h" "$f" "${pct%?}.${pct:((-1))}%"; done; }
Looks at the output of svcinfo lsmdiskgrp, displays free space, total capacity, and percentage. Fakes the integer math, if the units (GB/MB) aren't the same for free space and capacity, it'll break the math.

Migrate all extents from one mdisk to another:

sourcemdisk=10; targetmdisk=1; svcinfo lsmdiskextent -nohdr $sourcemdisk | while read vdisk extents copy; do echo "Starting migration of $extents extents of vdisk $vdisk to mdisk $targetmdisk"; svctask migrateexts -source $sourcemdisk -target $targetmdisk -exts $extents -vdisk $vdisk -threads 1; done
  • If there are a large number of vdisks with extents on this mdisk, the command will start to fail. Wait until the migration is complete, and re-run this command as necessary.
  • The "-nohdr" option on the lsmdiskextent prevents the headers from being printed, which would confuse things - not badly, it would just error out with this error:
    CMMVC5716E Non-numeric data was entered for a numeric field ([number_of_extents]). Enter a numeric value.

Show status of extents migration (above):

function progress { echo "Waiting: $(svcinfo lsmigrate | while read x y; do [[ $x = progress ]] && p=$y; [[ $x = migrate_vdisk_index ]] && vdisk=$y; if [[ $x = number_extents ]]; then if [[ $p = 0 ]]; then echo -n "$vdisk "; else printf "Vdisk %3d %3d%% of %4d extents\n" "$vdisk" "${p}" "$y" >&2; fi; fi; done)"; }

Show the space used by vdisks for a series of hosts

hosts="Integra Miata Impreza Element Fit Saab"; eval echo $(( $(for x in $hosts; do svcinfo lshostvdiskmap -nohdr -delim : $x | while IFS=: read a b c d e f; do svcinfo lsvdisk $e | while read z y; do [[ $z = real* ]] && echo -n "${y%.??GB}+"; done; done; done; echo "0" ) ))
The hosts are defined in the space-delimited list in variable "hosts"; they must match the SVC host object name exactly, or can be the host object IDs.
Note: makes the assumption that all of the vdisks are listed in GB. Will not work with vdisks that show size in MB or TB. could be adjusted to work with this situation, though.
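The sum-by-string-building trick above can be tried anywhere with made-up sizes:

```shell
# Made-up real_capacity values standing in for the lsvdisk output
sizes="10.00GB 25.50GB 100.00GB"
# Each iteration emits "NN+"; the trailing 0 completes the expression,
# so $(( )) evaluates the string "10+25+100+0"
total=$(( $(for s in $sizes; do echo -n "${s%.??GB}+"; done; echo 0) ))
echo "Total: ${total}GB"    # Total: 135GB
```

The `${s%.??GB}` strips exactly ".NNGB" from the end, which is why the whole thing only works when every size is reported in GB with two decimal places.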

Show useful information about an mdisk

function mdinfo { typeset md=$1; if [[ -z "$md" ]] || ! lsmdisk="$(svcinfo lsmdisk $md)"; then echo "Error - bad mdisk..."; return 1; fi; eval $(echo "$lsmdisk" | while read a b; do [[ $a == id ]] && echo "mdid=$b"; [[ $a == name ]] && echo "mdname=$b"; [[ $a == capacity ]] && echo "mdcap=$b"; [[ $a == UID ]] && echo "uid=${b:28:4}"; done ); mdiskusedexts=$(($(svcinfo lsmdiskextent -nohdr $md | while read a b; do echo -n "$b+"; done)0)); mdiskfreeexts=$(svcinfo lsfreeextents $md | while read a b; do [[ $a == number_of_extents ]] && echo $b; done); mdisksize=$((mdiskusedexts+mdiskfreeexts)); printf "%2s %-16s %5d/%5dexts (%-8s) %3d%%used %5d free - ID %4s\n" "$md" "$mdname" "$mdiskusedexts" "$mdisksize" "$mdcap" "$((mdiskusedexts*100/mdisksize))" "$mdiskfreeexts" "$uid"; }
Example output:
DS4800-Array-9     698/ 4359exts (1089.9GB) 16%used  3661 free

List the quorum MDisks

svcinfo lsmdisk -nohdr | while read id name rest; do svcinfo lsmdisk $id | while read key value; do if [ "$key" == "quorum_index" ]; then if [ "$value" != "" ]; then echo "MDisk $id ($name) is quorum disk $value"; fi; fi; done; done

List the VDisks which are not mapped to a host

function lsfreevdisk { svcinfo lsvdisk -nohdr | while read id name rest;do if [[ -z $(svcinfo lsvdiskhostmap -nohdr $id) ]] ; then echo "VDisk '$name' is not mapped to a host"; fi; done; }

Show SCSI ID and last four digits of LUN ID for DS4800 disk:

svcinfo lsmdisk | while read id mdisk stat manage mdg mdiskgrp size scsi controller diskid; do [[ $controller != DS4800 ]] && continue; echo "$id $mdisk ${scsi:14:2} ${diskid:28:4}"; done
  • Shows mdisk ID, mdisk name, SCSI ID (two digits), and LUN ID (last four digits)

Generate a CSV of mDisk/vDisk extent distribution

 echo vDisk,mDisk,Controller,mDisk Group,Extents;

 vdiskIds=(`svcinfo lsvdisk -nohdr | while read id rest; do echo -n "$id "; done`)
 vdiskNames=(`svcinfo lsvdisk -nohdr | while read id name rest; do echo -n "$name "; done`)
 vdiskNameMap=()
 for (( i = 0 ; i < ${#vdiskNames[@]} ; i++ ))
 do
 vdiskNameMap[${vdiskIds[$i]}]=${vdiskNames[$i]}
 done

 svcinfo lsmdisk -nohdr | while read mdiskId mDiskName status mode mdgId mdgName capacity LUN controllerName UniqueID;
 do
 svcinfo lsmdiskextent -nohdr $mdiskId | while read vdiskId extents;
    do
     echo ${vdiskNameMap[$vdiskId]},$mDiskName,$controllerName,$mdgName,$extents;
    done
 done

Redirect the output of your SSH command (with which you submitted the above) to a CSV file.
You can then open the CSV with Excel and do some Pivot Table magic to get pretty graphs for your management.

Storage of deprecated functions

Show status of MDiskGrp Migration: function progress { echo "Waiting: $(svcinfo lsmigrate | while read x y; do [[ $x = progress ]] && p=$y; [[ $x = migrate_source_vdisk_index ]] && vdisk=$y; if [[ $x = migrate_source_vdisk_copy_id ]]; then if [[ $p = 0 ]]; then echo -n "$vdisk "; else printf "Vdisk %3d %3d%%\n" "$vdisk" "${p}" >&2; fi; fi; done)"; }

Handy SVC mini-scripts

In a former life, I was a storage administrator for Thomas Jefferson University Hospital in Center City, Philadelphia. Aside from the wide array of lunch options we had to choose from, another thing I really enjoyed was working with IBM's SVC, or SAN Volume Controller. As a sysadmin, there are few software packages that you truly enjoy working with and you think are really, really good, but the SVC was one of those. One of the reasons was because the command-line interface was just a restricted Bash shell - something I knew very, very well. This allowed me to do pretty amazing things, as the shell is a wonderfully capable programming language. Back then, I had a Wiki that I used as a notepad, and I had several pages on the cool things you could do with SVC on the command line. Here is one of those pages, cut-and-pasted from my internal webserver. Be warned, though - none of this has been updated since about 2007!

Thanks to the awesome Aussie Storage Blog for the inspiration to resurrect these old pages.


    SVC's command line interface (CLI) is a restricted bash shell. You might not be able to cd to other directories or run commands like "grep" or "awk" (see below), but you can still do some really useful stuff using the shell builtins available to you.

    Basic info


    • The "for var in blah blah blah; do ...; done" and "while read; do ...; done" loops are very powerful tools available to the shell script programmer, and are the backbone of virtually every miniscript I write.
    • SVC's bash is restricted, but otherwise the same as the bash you would find on any Linux box. Prior to SVC 4.2, the bash version is 2.05. At 4.2, the bash is updated to 3.1. If you've got a Linux system available to play with, you can try out these miniscripts there before you try them on your production SVC. If you want to test these scripts on a different Linux system, you might want to see the SVC miniscript testing page.
    • Before you actually run commands, ESPECIALLY on an SVC where you can cause some serious damage to your SAN, prefix the command with an "echo" so you can see the command it would run without actually running the command!

      So, before you run "svctask rmvdisk $x" from inside a loop; first run "echo svctask rmvdisk $x" and verify that it's going to do what you think it's going to do. Much better to see that you messed something up on the screen than let your SVC try to remove every vdisk you've got assigned. Clients will not be happy nor will they be too understanding when you tell them you were using some ultra-fancy "bash miniscripts" to make working on the SVC more efficient.

    Possible gotchas


    Beyond the obvious gotchas of really screwing up the configuration of your SVC if you get your miniscripts wrong, there are a couple little gotchas to be aware of.
    • Variables set inside loops are not available outside of loops.
    found=0
    svcinfo lsvdisk -nohdr | while read line
    do
        [[ $line == *DS4800* ]] && (( ++found ))
    done
    echo "Found $found vdisks on the DS4800"
    This will always report 0 vdisks, even if it does find vdisks, because the loop executes in a subshell, and thus any modification to the variable won't be available to the parent shell. A workaround is to use command substitution and catch the STDOUT of the loop:
    found=$(svcinfo lsvdisk -nohdr | while read x; do [[ $x == *DS4800* ]] && echo -n X; done)
    echo "Found ${#found} vdisks on the DS4800"
    For each disk I find, I echo an "X". At the end of the loop, I've got stored in the variable "found" something which looks like "XXXXX" for five vdisks. Echoing "${#found}" gives me the length of the variable, and thus, the number of vdisks. Painful workaround? Yep.
    This is a bash thing -- don't blame SVC. Korn shell (ksh) works just fine either way, but we're not running on the ksh, are we?
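You can reproduce both behaviors without an SVC by substituting printf for svcinfo:

```shell
# The gotcha, with canned lines standing in for lsvdisk output
found=0
printf '%s\n' vdisk0-DS4800 vdisk1-DS4300 vdisk2-DS4800 | while read line
do
    [[ $line == *DS4800* ]] && (( ++found ))
done
echo "Broken count: $found"    # still 0 - the loop ran in a subshell

# The command-substitution workaround: catch the loop's STDOUT instead
found=$(printf '%s\n' vdisk0-DS4800 vdisk1-DS4300 vdisk2-DS4800 | \
    while read x; do [[ $x == *DS4800* ]] && echo -n X; done)
echo "Workaround count: ${#found}"    # 2
```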

    Multi-line entry


    To make things more readable, you can enter your mini scripts on multiple lines:
    svcinfo lsmdisk -delim : | while read line
    do
        echo "$line"
    done
    (this miniscript doesn't really accomplish anything... just echoes what it reads, as if we ran "svcinfo lsmdisk | cat")
    Bash will convert your entry to a one-liner (separating lines with ";" as appropriate) when it enters it into the history file, so the above will look like this in the history file:
    svcinfo lsmdisk -delim : | while read line; do echo "$line"; done
    Because of this, when I'm writing mini-scripts, I just edit them in the one-line format. Your option.

    Shortening command lines


    if/then with only one command can be simplified:
    if [[ $x = "yes" ]]
    then
        echo "yes"
    fi
    can be simplified to:
    [[ $x = "yes" ]] && echo "yes"
    if/then/else with only one command can be simplified:
    if [[ $x == "yes" ]]
    then
        echo "yes"
    else
        echo "no"
    fi
    can be shortened to:
    [[ $x == "yes" ]] && echo "yes" || echo "no"
    Multiple commands can be run as well:
    if [[ $x == "yes" ]]
    then
        echo "yes"
        runcommand
    fi
    can be shortened to:
    [[ $x == "yes" ]] && echo "yes" && runcommand
    • caveat: The shortened version will not work exactly the same as the long version. In the short version, "runcommand" is only run if the first command, echo "yes" is successful. Odds are good that an echo isn't going to fail, but if "runcommand" was first and it failed, the next command doesn't run. In the "long" if/then format, both commands will be run, even if the first fails.
    • Basically, unless you really understand how the "&&" and "||" dividers work, stick to single commands in your shortened if/then commands. Anyhow, the if/then syntax for multiple commands (on one line) is still pretty easy:

      if test; then command1; command2; fi
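Here's a runnable illustration of that caveat; "step" is a made-up helper that prints its name and then returns whatever status you give it:

```shell
x=yes
# Made-up helper: prints "ran <name>", exits with the given status
step() { echo "ran $1"; return "$2"; }

# Short form: the chain stops when step one returns non-zero
out1=$([[ $x == yes ]] && step one 1 && step two 0)

# Long if/then form: both steps run regardless of step one's status
out2=$(if [[ $x == yes ]]; then step one 1; step two 0; fi)

echo "short: $out1"    # only "ran one"
echo "long:  $out2"    # "ran one" and "ran two"
```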

    No "ls"? No problem!


    SVC's restricted bash shell gives you very, VERY few commands. You can still get by without some of them using the bash builtins. Here's how we can simulate the "ls" command:
    echo /dumps/*
    This lists the contents of the /dumps/ directory, but in a pretty ugly format... all the files are listed separated by a space, and wrapped around lines. A little hard to read. Here's how we can have something more like "ls -F" format: ("/" appended to the name of directories)
    for file in /dumps/*
    do
        [[ -d $file ]] && echo "$file/" || echo "$file"
    done
    I don't know of any bash builtins that could give us the equivalent output of "ls -l", though. There's no way to get file size, permissions, or ownership. You can check your own permissions for the file (can I read, write, execute?) but you can't see who owns it. That said, if the file is there, as admin you will have permissions to access it.
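You can try the ls-alike against a scratch directory (the paths here are made up) instead of /dumps/:

```shell
# Build a throwaway directory with two files and a subdirectory
mkdir -p /tmp/lsdemo/subdir
touch /tmp/lsdemo/file1 /tmp/lsdemo/file2

# The "ls -F"-style loop: directories get a trailing "/"
out=$(for file in /tmp/lsdemo/*
do
    [[ -d $file ]] && echo "$file/" || echo "$file"
done)
echo "$out"
```

The glob expands in sorted order, so the output lists file1, file2, then subdir/ with its trailing slash.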

    No "grep"? No worries!


    Wouldn't it be nice to "svcinfo lsvdisk | grep MyDisk"? Here's how: (basic form)
    svcinfo lsvdisk -nohdr -delim : | while read line
    do
        [[ $line = *match* ]] && echo $line
    done
    • The bash used in SVC is kinda old (version 2.05 on SVC v4.1) which is too bad. If it were a more modern version (v3.x) there are many REALLY powerful pattern matching capabilities available as builtins... including regular expressions!
    • WUHU! SVC 4.2 has upgraded bash to version 3.x! This opens up some REAL possibilities for powerful pattern matching. See the bash doc for more info.
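The grep-alike can be exercised with canned lines in place of svcinfo output:

```shell
# Made-up -delim : style records standing in for lsvdisk output
matches=$(printf '%s\n' "0:MyDisk1:online" "1:OtherDisk:online" "2:MyDisk2:offline" |
    while read line
    do
        [[ $line = *MyDisk* ]] && echo $line
    done)
echo "$matches"    # the two MyDisk lines, OtherDisk filtered out
```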

    No "awk"? No concerns!


    Since most SVC output is field-specific, you can modify the "grep" code above to have bash split up the stuff that you're reading:
    svcinfo lsmdiskgrp -nohdr -delim : | while IFS=: read id name status nummdisk numvdisk size extsize free
    do
        [[ $name == *4500* ]] &&  echo "MDiskGroup $name (id $id) has $free available of $size"
    done
    Note: Using the ":" delimiter and setting IFS allows us to handle blank fields. F'rinstance, "lsvdisk" often has blanks for FC_id, FC_name, RC_id, and RC_name (unless you've got Flash Copy on that vdisk), but the vdisk UID might be important to you. If you don't set IFS and leave "-delim" unset, the blank fields just disappear into the extra whitespace, and the vdisk UID winds up being read into "FC_id". Setting IFS to the delimiter makes the blank fields read as blank. If you don't have to deal with blank fields (like, say, lsmdiskgrp) then you can leave the delim unset and just use a plain read: (you should still turn off the header so you don't process that!)
    svcinfo lsmdiskgrp -nohdr | while read id name status nummdisk numvdisk size extsize free
    do
       ...
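Here's a runnable sketch of the difference, using a simplified made-up record (not the real lsvdisk field order) with blank FC fields:

```shell
# Simplified mock record: id:name:status:FC_id:FC_name:capacity:UID
line="12:vd12:online:::10.00GB:600507680abcdef"

# With IFS set to the delimiter, the blanks stay blank and every
# field lands in the right variable
IFS=: read id name status fcid fcname cap uid <<EOF
$line
EOF
goodcap=$cap    # 10.00GB

# With plain whitespace splitting, the blanks collapse and every
# later field shifts left
read id name status fcid fcname cap uid <<EOF
12 vd12 online   10.00GB 600507680abcdef
EOF
echo "fcid=$fcid cap=$cap"    # fcid grabbed the capacity; cap is empty
```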

    No "sleep"? Now that's a problem.


    If you wanted to run a command repeatedly, separated by a certain time delay, I haven't figured out how to do that yet. The "sleep" command, oddly enough, isn't a bash builtin. (even though it seems like it could be, very easily... and not to harp on ksh93, but it's a builtin there as well, and takes fractional delays, to go with ksh93's capability for floating-point math... oh, and it leaves variables set in loops available to the original shell...)
    One option I considered was checking the first field of /proc/uptime in a loop, and exiting the loop when the value equals (( initialvalue + sleeptime )), but the problem there is that bash will check the contents of the file as often as it can, many times a second, creating a possibly significant load on the system, which isn't a good idea.
    • Interactive workaround for no "sleep" command:
    Instead of having the system delay for a certain period of time, you can have the system wait for your input, using the "read" command to read a line of input. You can then CTRL-C or put in a specific exit string:
    while read keyb
    do
        svc command goes here
        [[ $keyb == x ]] && break
    done
    This will execute the command every time you hit ENTER, and if you type a single "x", it will return you to a command prompt. Don't use "exit" instead of "break" or you'll be logged out when you type "x".
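Fed canned "keystrokes" from a pipe, the same loop can be tested non-interactively: two plain ENTERs, then an "x" to break out.

```shell
# Two empty lines (ENTER), then "x"; the last line is never reached
count=$(printf '%s\n' "" "" "x" "never read" | while read keyb
do
    echo -n "."    # stand-in for the SVC command
    [[ $keyb == x ]] && break
done)
echo "the command ran ${#count} times"    # 3 - the "x" pass still runs it
```

Note the command runs before the break check, just as in the interactive version, so the "x" line still gets one last execution.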

    Building associative arrays


    One of the biggest pains in the butt with bash is the variable scoping. If you want to build an associative array of, say, mDisk name mapping to its state, you can't just set up a while loop with the output of lsmdisk piped into it. The reason is that the pipe spawns a subshell, and the array that you're merrily building belongs to the subshell, so it ceases to exist once you exit the loop.
    I worked out the following method.
    1. Build an array of your keys, using command substitution
    2. Build an array of your values, using command substitution
    3. Iterate over your values and generate your associative array
    eg
    vdiskIds=(`svcinfo lsvdisk -nohdr | while read id rest; do echo -n "$id "; done`)
    vdiskNames=(`svcinfo lsvdisk -nohdr | while read id name rest; do echo -n "$name "; done`)
    vdiskNameMap=()
    for (( i = 0 ; i < ${#vdiskNames[@]} ; i++ ))
    do
        vdiskNameMap[${vdiskIds[$i]}]=${vdiskNames[$i]}
    done
    Because no subshell was spawned for the last loop, vdiskNameMap is available later. I use this in my extent summary script.
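The same method with generic stand-in data (the "list" function below is a made-up substitute for svcinfo lsvdisk; the ids index a plain array, which works because vdisk ids are numeric):

```shell
# Canned "id name" records in place of svcinfo lsvdisk output
list() { printf '%s\n' "3 vd3" "7 vd7" "12 vd12"; }

# Step 1 and 2: capture keys and values via command substitution
ids=( $(list | while read id rest; do echo -n "$id "; done) )
names=( $(list | while read id name; do echo -n "$name "; done) )

# Step 3: no pipe here, so no subshell - the array survives the loop
nameMap=()
for (( i = 0 ; i < ${#names[@]} ; i++ ))
do
    nameMap[${ids[$i]}]=${names[$i]}
done
echo "id 7 is ${nameMap[7]}"    # id 7 is vd7
```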

    Functions in bash


    If you wanted to use the "grep" code above and actually call it "grep", you could do that by defining a function:
    function grep { while read line; do [[ $line == *$1* ]] && echo "$line"; done }
    • Since you can't edit your .bash_profile or .bashrc, every time you log into the SVC, you'd have to re-enter the function. This could be done via cut & paste, or if you wanted to get fancy, an expect script which logged in, defined your functions, and then gave you interactive control.
    Since we don't have the capability to set aliases in restricted shell, you can use this if you want to get rid of the annoying need to prefix everything with "svcinfo" and "svctask":
    function lsmdiskgrp { svcinfo lsmdiskgrp $*; }
    "lsmdiskgrp" now works just like "svcinfo lsmdiskgrp", taking options (like "-nohdr") and arguments (like the name or id or an mdiskgrp) without having to type "svcinfo" first.
    • bash can be finicky when you define a function and put commands in { } brackets. Usually bash isn't like perl in needing each line terminated with ";", but if you have a simple command like we have here, end the command with a ";" before the closing "}". Watch your whitespace, too.
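To see the function in action with canned input:

```shell
# Define the grep-alike, then exercise it with a pipeline
function grep { while read line; do [[ $line == *$1* ]] && echo "$line"; done; }
hit=$(printf '%s\n' alpha beta alphabet | grep alpha)
echo "$hit"      # alpha and alphabet; beta filtered out
unset -f grep    # put the real grep back when you're done experimenting
```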

    A more powerful miniscript


    An example. If you're running extents migrations, the progress information given by "svcinfo lsmigrate" is a little hard to read:
    IBM_2145:TJUH_SVC:admin>svcinfo lsmigrate
    migrate_type MDisk_Extents_Migration
    progress 33
    migrate_vdisk_index 13
    migrate_source_mdisk_index 10
    migrate_target_mdisk_index 1
    number_extents 160
    max_thread_count 1
    migrate_type MDisk_Extents_Migration
    progress 90
    migrate_vdisk_index 12
    migrate_source_mdisk_index 10
    migrate_target_mdisk_index 1
    number_extents 60
    max_thread_count 1
    ...and so forth. That's pretty ugly. Wouldn't it be nice if we could see output in a format like:
    Vdisk  13  52% of  160 extents
    Vdisk   1  10% of  819 extents
    Vdisk  10  37% of   64 extents
    Vdisk   7  22% of   96 extents
    Vdisk   4   0% of  194 extents
    Much nicer! It's actually pretty easy. Here's the command line:
    svcinfo lsmigrate | while read x y; do
        [[ $x = progress ]] && p=$y
        [[ $x = migrate_vdisk_index ]] && vdisk=$y
        [[ $x = number_extents ]] && printf "Vdisk %3d %3d%% of %4d extents\n" "$vdisk" "$p" "$y"
    done
    Normally I'll just write the whole command on one line with semicolons separating commands, since it's easier to return to that line and edit it.
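To test the parser away from a live cluster, you can feed it a canned copy of the output ("lsmigrate_mock" is a made-up stand-in with the same field layout):

```shell
# Two canned migration stanzas, mimicking "svcinfo lsmigrate" output
lsmigrate_mock() {
    printf '%s\n' "migrate_type MDisk_Extents_Migration" \
        "progress 33" "migrate_vdisk_index 13" "number_extents 160" \
        "migrate_type MDisk_Extents_Migration" \
        "progress 90" "migrate_vdisk_index 12" "number_extents 60"
}

# The same parser loop as above, reading the mock instead
out=$(lsmigrate_mock | while read x y; do
    [[ $x = progress ]] && p=$y
    [[ $x = migrate_vdisk_index ]] && vdisk=$y
    [[ $x = number_extents ]] && printf "Vdisk %3d %3d%% of %4d extents\n" "$vdisk" "$p" "$y"
done)
echo "$out"
```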
    In this instance, 0% means it hasn't started the migration... it can only run four threads at once, so a vdisk at 0% is waiting. Let's fancy it up some:
    svcinfo lsmigrate | while read x y; do
        [[ $x = progress ]] && p=$y
        [[ $x = migrate_vdisk_index ]] && vdisk=$y
        if [[ $x = number_extents ]]; then
            if [[ $p = 0 ]]
            then
                echo "Vdisk $vdisk is waiting..."
            else
                printf "Vdisk %3d %3d%% of %4d extents\n" "$vdisk" "${p}" "$y"
            fi
        fi
    done
    That produces output like this:
    Vdisk   1  35% of  819 extents
    Vdisk   4  85% of  194 extents
    Vdisk   5  82% of  160 extents
    Vdisk   0   6% of   16 extents
    Vdisk 14 is waiting...
    Vdisk 17 is waiting...
    Vdisk 32 is waiting...
    ...and so forth!
    As you can see, the command lines can get a little complicated, but you can do some REALLY powerful stuff.

    More advanced stuff


    • A progress bar could be simulated like so:

      bar="########################################"       # That's 40 pound signs
      barlen=${#bar}                                       # Sets "barlen" to 40 - the length of "bar"
      space="                                        "     # 40 spaces
      blocks=$((p*barlen/100))                             # How many blocks for that percentage.
      echo ">${bar:0:$blocks}${space:0:$((barlen-blocks))}<"

      For a value of p=70, the output would be 28 pound signs followed by 12 spaces, keeping the whole width between the > and < characters at 40 characters:
      >############################            <
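As a self-contained check, here's the bar wrapped in a function ("drawbar" is a made-up name) and run at p=70:

```shell
bar="########################################"      # 40 pound signs
space="                                        "    # 40 spaces
barlen=${#bar}

# Draw a 40-wide bar for a given percentage
drawbar() {
    local p=$1
    local blocks=$(( p * barlen / 100 ))
    echo ">${bar:0:$blocks}${space:0:$(( barlen - blocks ))}<"
}

line=$(drawbar 70)    # ">", 28 pound signs, 12 spaces, "<"
echo "$line"
```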

    Wednesday, August 12, 2015

    Adjustments to the Mac OS X Dock

    I don't use the dock in OS X. I don't find it necessary - I can open apps with CMD-space (using Spotlight to search) and I can switch between apps with CMD-tab. So, I auto-hide it, which gives me much more usable space on the screen, but this winds up being an annoyance when I have to click something down near the bottom of the screen, as the dock may pop up in the way of what I'm trying to click. So, after some searching, here are some command-line adjustments I made to the dock:

    To increase the delay before the dock hides:

    defaults write com.apple.dock autohide-delay 10 && killall Dock

    This sets the autohide delay to 10 seconds. Initially, I ran into a problem where this seemed to work only momentarily: after running it, moving to the bottom of the screen resulted in the dock not popping up, but if I moved away and back, the Dock popped back up immediately. I'd set the delay as high as 1000000 and it still acted this way. Further searching revealed that some guides write the domain with a capital D, which makes the setting not work properly. Make sure you use lower case for "dock" in com.apple.dock.

    Making the dock size as small as possible also helps as it covers less screen if it does pop up. This can be done from System Preferences... Dock.

    To make the dock only be active applications:

    defaults write com.apple.dock static-only -bool TRUE && killall Dock

    (from: http://www.makeuseof.com/tag/customise-mac-os-x-dock-hidden-terminal-commands/)

    This reduces the size further and makes it less intrusive.