It's that time of the year again. Happy SysAdmin Day, everyone.
If today is dragging, you might want to refresh your memory of the great OddTodd... always a pick-me-up.
I've missed FAST 2010 yet again.... but, good news! The complete FAST 2010 Proceedings (PDF) are available for free. USENIX members can also view the presentation videos online.
An interesting discussion has been taking place on the OpenSolaris SysAdmin Community list, and I sense it will lead us toward some important changes in Solaris. Essentially it all comes down to the lack of spit and polish. What has always been something we perhaps ignored or downplayed has become far more starkly contrasted by truly easy to use yet complex things such as ZFS or SMF.
The clearest examples are technologies that currently are essentially useless without custom scripting. Such examples include LDAP, Extended Accounting, and BSM Auditing.
LDAP is one that's really concerned me. Almost any Solaris environment would benefit greatly from an LDAP/Kerberos implementation, for ease of management and increased security... but frankly, just dropping in a directory server and authenticating to it isn't so straightforward. Populating and maintaining the DIT is complex, commonly requiring custom scripts and possibly a 3rd party LDAP browser. While the aging idsconfig script is supposed to jumpstart your experience, it's not perfect and is tailored to Sun DSEE. In the community we commonly see people scratching their heads, wondering whether other directory servers, such as OpenLDAP, even work with Solaris and how to get started.
Microsoft hit a home run with ActiveDirectory, and it pains me in the same way that NetApp kicked Sun's ass at building NFS servers. Sun is a systems company and the leading provider of directory/identity management products, but if you want to use them in conjunction with Solaris you've got a lot of custom work to do. As far as Kerberos, most of the use continues to be in academic environments, which means that the best means to secure NFS in a corporate environment just isn't used.
Sun is very good at engineering the big things, but I've noticed that when it comes to connecting all the dots they tend to turn toward the path of acquisition. A need arises for a management app or something, they find a decent software company doing it, acquire them, and then slowly let the thing rot. I mean, how many people still use Sun Management Center or N1 Provisioning Server? (Or ever did, for that matter.)
A lot of focus has gone into the GNU-ification of Solaris and improving the desktop experience with Indiana... I mean OpenSolaris... but at some point we've got to get back around to focusing on what Solaris does best, being the enterprise class server operating system we know and love.
This is especially important in the face of Cloud Computing. The cloud needs solid server operating systems, and Solaris leads the pack. If we've proved one thing with Solaris 10, it's that making Solaris more like Linux doesn't have nearly the impact we hoped it would, but making the complex simple and straightforward (ZFS, DTrace, SMF, FMA, ...) is dramatic.
Monitoring, Management, and Infrastructure are what we need. Easy, quick, and powerful. We have the technology underneath, we just need to bring it all together.
What say you?
The Storage Networking Industry Association's (SNIA) Storage Developer Conference (SDC) is, as the fancy name suggests, not a place for storage hobbyists or the faint-hearted. Attendees are leaders in our industry, highly informed and knowledgeable. If they are interested in it, we all will be soon. If you follow the storage press at all, the two big things on their minds won't surprise you: de-duplication and SSD.
From performance talks, to corruption analysis talks, to ZFS talks, to NFSv4 talks, every session either included a slide on both of these or drew a question about them. Frankly, there were very few answers. One of the few was Sun's "hybrid storage architecture" for ZFS (for those in the know, this is L2ARC and ZIL offload, which are put on special SSDs). Most of the talks only noted "SSD will change everything... it's too early to tell how." Given that the concern of the show is largely primary storage, not secondary backup, de-dup constantly came up but rarely had a place.
If de-duplication is a new term for you, here's the quick and dirty pitch. Imagine having to architect backups for 300 helpdesk PCs, all running a standardized Windows XP, office stack, plus helpdesk support and naturally other user applications. Let's say the average PC has 80GB of data on its local drive. So that's 300 * 80GB to back up, perhaps nightly. A nightmare. Historically, you'd reduce the backup load either by putting user home directories on a centralized file server and backing up only the file server, not the PCs, or by excluding paths such as C:/Windows (or whatever the hell they call it now). De-duplication typically uses hashing algorithms, either on the client or on the backup server, to avoid storing duplicate data blocks. So that means you only back up one copy of Windows XP, and then 299 references to it. If someone sends out a PDF of the company handbook that's 5MB, and there are 300 local copies of it, that's 1.5GB of the same file, but with de-duplication we store only a single 5MB file plus references to it.
From the example you can see that customers backing up Oracle databases or customized purpose-built servers might not be in dire need of this technology (although they are interested too), but if you're backing up server farms or desktop systems this is something you can't wait another second to get your hands on; especially if you're backing up to tape!
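If you want to see the idea in miniature, here's a rough file-level sketch in shell. It assumes whole-file matching with the POSIX cksum utility, which is far weaker than the block-level hashing real products use; the dedup_summary function and the demo files are made up purely for illustration.

```shell
#!/bin/sh
# dedup_summary DIR: print "<total files> <unique payloads>" for DIR.
# Two files count as duplicates when their cksum CRC and byte length match.
# Illustration only: a CRC is much too weak for a real de-dup store.
dedup_summary() {
    dir=$1
    total=$(find "$dir" -type f | wc -l)
    unique=$(find "$dir" -type f -exec cksum {} + |
        awk '{print $1, $2}' | sort -u | wc -l)
    # Arithmetic expansion strips the padding some wc implementations add.
    echo "$((total)) $((unique))"
}

# Demo: three "PCs" each hold the same handbook; one also has a private file.
demo=$(mktemp -d)
for pc in pc1 pc2 pc3; do
    mkdir "$demo/$pc"
    echo "company handbook" > "$demo/$pc/handbook.pdf"
done
echo "my notes" > "$demo/pc1/notes.txt"
dedup_summary "$demo"    # prints "4 2": four files on disk, two to store
rm -rf "$demo"
```

The gap between the two numbers is the whole pitch: everything beyond the unique payloads becomes a reference instead of another copy.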
I should note, de-dup is becoming more than just a backup technology. Storage admins see applications for file servers and other applications. I'm certain that in 5 years de-duplication methodology will be used in ways I'd laugh at today.
As for SSD: it's coming. I remember 10 years ago in a lab where we had a "Solid State Disk", which in the pre-flash era meant a box with bank upon bank of RAM and a big battery. Today SSD is cheap and getting cheaper. But how will they be used?
Today we have the concept of "tiered storage". This means different things depending on who you talk to. In some cases, such as Pillar Data, this is done by partitioning drive cylinders so that tier 1 data is on the outer (faster) tracks and tier 2, 3, 4 on the inner (slower) tracks. In other cases this means putting important fast access data on smaller 15K or 10K RPM FC or SAS disks as "tier 1", and bulk data on larger "nearline" 7,200 RPM SATA disks. For customers using HSM (Hierarchical Storage Management) you can even automate the data migration back and forth across tiers, all the way out to tape, which was until recently cheaper per gig than disk.
So many storage administrators and architects seem to see SSD pushing into tier1 and pushing 15K spinning media down the stack. Instead of Fast, Slow, Tape, you get Super-Fast, Fast, Slow and potentially just dump tape.
I know I'm a zealot, but Sun really is leading the charge here. The Hybrid Storage Pool architecture is really brilliant because it views SSD not as faster disks, but rather as slow (relatively, of course) non-volatile memory. Traditionally you have an in-memory filesystem cache (ZFS's is called "ARC"); data flows through the cache and eventually is ejected to make room for fresher data, meaning that if you call that data again you go out to disk. ZFS's L2ARC (Level 2 ARC) extends your in-memory disk cache using SSD, so if you go back for data you don't have to go all the way out to disks. On busy file servers this is a massive win! A 64GB SSD is a really small disk, but as a secondary disk cache it's massive! Plus, there is no management involved on the administrator's part, no data policy or data classification to work out; the filesystem handles it for you.
Sun's other component of the ZFS Hybrid Storage Architecture is ZIL Offload. Most data access is asynchronous and can be nicely cached, with writes flushed to disk when it's convenient. However, some applications such as databases or NFS do synchronous (O_DSYNC) IO; this flag requires that the filesystem immediately flush the data to stable storage. On a busy file server this is a performance killer. The ZFS ZIL (ZFS Intent Log) is where these synchronous writes go; by putting those writes on super-fast SSD you get several orders of magnitude performance improvement without relying on things like RAID Controller Write Back Caches.
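In practice each half is a one-liner. The sketch below only prints the commands rather than running them; the pool name "tank" and the device names are made up, and hybrid_pool_cmds is a helper invented here for illustration. The "zpool add POOL cache DEVICE" and "zpool add POOL log DEVICE" forms are the standard way to attach an L2ARC device and a separate intent-log device.

```shell
#!/bin/sh
# hybrid_pool_cmds POOL CACHE_DEV LOG_DEV
# Print (do not run) the zpool commands that attach one SSD as an L2ARC
# cache device and another as a separate ZIL (log) device.
hybrid_pool_cmds() {
    pool=$1
    cache_dev=$2
    log_dev=$3
    echo "zpool add $pool cache $cache_dev"
    echo "zpool add $pool log $log_dev"
}

# Hypothetical pool and device names; review the output before running it.
hybrid_pool_cmds tank c2t0d0 c2t1d0
```

Printing first and running second is deliberate: adding devices to a production pool is not something you want to fat-finger.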
Since we're talking about SSD, let me point out that not all SSD's are created the same. There are two main types of SSD on the market right now: MLC and SLC. Here's the 60 second explanation:
If you see a Sun presentation on Hybrid Storage, you'll see them refer to these as "Read Biased" (MLC, slower but higher capacity) and "Write Biased" (SLC, faster but less capacity). By using the appropriate technology in the appropriate role they significantly reduce cost for an SSD deployment. If you look at everyone else out there just viewing SSD as "fast disk", the decision between SLC and MLC is really just a matter of cost; if you can afford SLC, great, if not, MLC, or perhaps even sub-tiering SLC to MLC SSD.
So that's de-dup and SSD. If you haven't heard of these, you will. Familiarize yourself with the basics now, and you'll be better prepared for the future.
On a closing note, I talked to several people about SMART data. I'm shocked by how many people tell me to ignore SMART data as untrustworthy and unreliable. I was hoping someone at the show would disagree... I was disappointed. Most other experts agree: vendors don't trust SMART data and in some cases outright "fudge" the data or at the least disregard conclusions based on the data. One person remarked that most drives sent to Seagate due to a SMART suggested failure are simply scrubbed, cleared, and re-shipped. So, the belief that SMART data is something to be seriously monitored by admins continues. If you have it, nifty, but if not, oh well. As for me... I love telemetry, so SMART still has a warm spot in my heart, wrinkles and all.
The very first episode of SA Pro is here!
In the podcast we'll use one of two formats, classic 1-on-1 interview style and a round-table discussion format. This episode is the latter.
Together with Joe Moore of Siemens and Mark Imbriaco of 37signals we discuss the following questions:
What's really new and unique is that Joe, Mark, and I don't know each other. They both responded to a request for participants on the OpenSolaris SA's list and matched the qualifications I was aiming for; that's the extent of it. This is interesting because even though the three of us are in very different circumstances, have different histories, and are geographically separated, we're not very dissimilar. It amazes me how much unity there is among a group with so few governing institutions.
The podcast is 1hr 6 mins and definitely worth a listen. Feedback is appreciated, but this was the first one, so be kind. (Yes, I know my audio was too low.)
A huge thank you goes out to Joe and Mark for participating!
If 10 years ago someone said "One day your wife will carry an extra hard drive in her purse", I'd have rolled my eyes. On a recent trip to pick up a hard drive (to replace the piece of crap that died in my MacBook Pro; so far every Apple laptop we've owned has had an OEM drive die) I saw, to my amazement, this:
CaseLogic, the folks that made those CD cases we all used to have in our cars, is now making neoprene sleeves for 2.5" hard drive enclosures. This is telling to me... CaseLogic decided that there was enough of a market to start peddling these. This says something about modern storage, says something about the expected reliability and mobility of spinning storage, and says something about the capacity of the ever more affordable flash storage in USB keys and such. And, the strange thing is, I just had to buy one.
But wait, there's more! The wall of 3.5" enclosures had been pushed aside by a giant selection of 2.5" enclosures, most of them powered by the USB line alone, no need for an external DC plug. And in the corner of the rack was this interesting toy:
This is a Thermaltake BlacX HDD Docking Station; it accommodates 2.5" and 3.5" SATA drives... like a damned Nintendo cartridge! And the really funny thing is you'll find yourself blowing dust off the SATA paddle before inserting... oh the memories.
Most geeks, like myself, probably have a growing stack of SATA drives that aren't terribly old but have fallen by the wayside as storage capacities have skyrocketed and prices plummeted in the last 3 years. Sure, there are lots of snazzy USB/Firewire/eSATA enclosures out there, but generally the drives aren't worth it... but no longer is this a problem! Your old hard drives are now very easy to use removable media for all your backup or temporary storage needs, no adapters or sleds required; just dust one off and slide it into the dock.
Combine these two things with the fact that your grandma's new Dell is probably going to have a 1TB drive (something that didn't seem possible in a 3.5" form factor just a couple of years ago), some hope that areal density will provide 2.5" drives with capacities well beyond 300GB in the future, and the coming wave of SSD solutions... storage looks to be at the peak of a wave that's going to crash out a lot of interesting things in the next couple of years.
Of course, what concerns me is that while bus speeds increase and capacities grow, throughput in real world situations is still low. 30MB/s is still considered pretty good in real-world usage because those poor little heads can only move so fast. Tiered storage combined with RAID is interesting considering the increases in areal density, because the outer cylinders contain so much data, but with COW filesystems such as ZFS growing in use, the data is increasingly spread around the platters if left unchecked, which leads to slower transfer rates outside of the benchmarks. Bigger buffers can help, but in random workloads prefetch doesn't help as the drive doesn't know what sector to prefetch.
It wasn't long ago that I was begging a storage vendor to keep sending me 72GB drives because the rebuild times for a failed 167GB drive scared me. Gigabit speed networks increase the utilization of storage over the network, but again, those drive heads can only move so fast. I'm really interested to see what comes in the next couple of years to try and catch up the random throughput of drives with the capacities. Will SSD be the solution, or can spinning media vendors pull a rabbit out of their hats? Unless they do, my hunch is that in 10 years enterprise systems will be shipping with SAS SSD drives and relegating spinning media to secondary storage.
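The rebuild worry is easy to put rough numbers on. This back-of-the-envelope sketch assumes the 30MB/s real-world figure holds for the entire rebuild and that 1GB is 1000MB; both are simplifications, and rebuild_minutes is just a name I picked for the helper.

```shell
#!/bin/sh
# rebuild_minutes CAPACITY_GB RATE_MB_PER_S
# Best-case minutes to stream an entire drive at a sustained rate,
# using 1 GB = 1000 MB and integer (truncating) arithmetic.
rebuild_minutes() {
    echo $(( $1 * 1000 / $2 / 60 ))
}

for gb in 72 167 1000; do
    echo "${gb}GB at 30MB/s: $(rebuild_minutes "$gb" 30) minutes"
done
```

That works out to 40 minutes for the 72GB drive, 92 minutes for the 167GB drive, and 555 minutes (over nine hours) for a 1TB drive, all before any production load competes for the heads.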
Any way you look at it, some kool stuff is coming; storage geeks stay vigilant!
A couple things here and there have kept me from continuing my series of posts regarding systems management solutions. One of the monitoring solutions I've planned to write about is Up.Time. While I haven't had the time to write it, I was thrilled to check my favorite site, SunHelp.org, and see that Super Admin Bill Bradford wrote an excellent review himself: Software Review: up.time 4 Enterprise Monitoring.
In my professional opinion, Up.Time is the best, most comprehensive, and most polished out of the box solution available at any price. Yes, it's proprietary closed source commercial software... but, whether you're using Zenoss, HP OpenView, Hyperic, or any other solution out there, you're only going to get a small subset of monitoring capability without spending some time extending it yourself or digging around for modules written by someone else. Most, such as NetNMS or Zenoss, are limited by the OIDs exposed by SNMP and then extended by creating custom scripts that SSH into boxes every n seconds. Others, such as Zabbix and Hyperic, provide a client side agent that gathers up fairly generic information such as disk usage, CPU and memory usage, and maybe an odd and end on top. But Up.Time gathers a massive range of metrics, stores them all, and provides useful graphing and reporting capabilities, including report automation, to make it all very useful.
I've solved more than a few problems because of the realization that the historical data I needed to analyze a problem was already right under my nose because Up.Time had been gathering it and I didn't even realize it. A great example is IO response time! I spent quite a bit of time ripping apart iostat.c to learn how to extend Zabbix, Hyperic, or other solutions to record asvc_t... then I realized that all that data was already being gathered with Up.Time right out of the box. Not only does it gather single return metrics, it also stores useful multi-string return data such as the top CPU consuming processes during a given period. Just knowing that the CPU was saturated on Monday of last week isn't enough! What was actually using that CPU? Up.Time can tell you, no modification required.
With all my searching to date, there is only one "install and forget" solution on the market, and that's Up.Time. If you want to solve your monitoring problems with money, it's hands down the solution you need to use. I'm not saying it's perfect, there are a couple things here and there I'd like to change, but I'm hard pressed to find anything as powerful as it.
Read Bill Bradford's excellent review for a better look at Up.Time.
Chances are you've heard of the Intelligent Platform Management Interface, or IPMI. And chances are very good that you view it as little more than a way to remotely reboot servers. But IPMI is oh so much more than that... wonders await you, should you just take the time to explore a little. So let's start with the basics and work outwards.
Almost any modern server is going to have a Baseboard Management Controller, or BMC for short, on the mainboard. On whitebox motherboards such as Tyan or ASUS this is normally an add-on option, but any purpose built server from Sun, HP, IBM, SuperMicro, etc, is going to have one on the board out of the box. The BMC acts as a hub for all the various sensor data on the board(s). In years past you might have heard of I2C and SMBus buses and sensors accessible via "lm-sensors"; the BMC is the hub for these various sensor buses. The BMC therefore has access to all the various sensors on a given system and is rich with useful data. The most common way to retrieve that data is via IPMI.
IPMI can be accessed in several ways; these methods are referred to as "channels", as in communications channels. The two most common are via the LAN ("lan") or, if your OS has a BMC driver, via the local device (commonly "/dev/bmc"). Please note that in many places the acronyms MC and BMC are used interchangeably. Now, to exploit those channels the most common method is to use the Open Source "ipmitool". This tool is included with most OS's (including Solaris, in /usr/sfw/bin) or can be downloaded from the IPMItool SourceForge page. Other projects exist, including OpenIPMI and GNU's FreeIPMI. All these implementations offer a rich API for writing custom applications and CLI tools for interaction. As I said, IPMItool is by far the most common, so I'll discuss it here.
First, to clear up a common misconception... let's take a Dell PowerEdge server. Many are convinced that you need a Dell Remote Access Card (or Controller, depending on who you ask), better known as a DRAC, in order to use IPMI. You do not! Service Processors (SP), such as Sun's ILOM and ELOM (on X86, we're ignoring SPARC here) or Dell's DRAC, are not BMCs. Rather, they are mini-computers on a card, typically running Linux, powered by "trickle" or "standby" power so that they keep running even if the mainboard is not. These cards simply act as a conduit to access the BMC and other functions of the system. The web interfaces on the SPs, for instance, are commonly just passing IPMI commands back to the BMC. Thus, if you click "Power On" in the SP web interface you're really just sending an IPMI "power on" command to the BMC. The point is, you do not need an SP to use IPMI with a system! The caveat is that, depending on the architecture of the system, you may require an SP to talk to the BMC if the system is not running. For instance, on a Dell PowerEdge you can talk IPMI to the BMC without a DRAC by "sharing" the first gigabit port, meaning that you really don't need a DRAC at all unless you want the ability to, for instance, get SNMP data, which is really just an SNMP agent on the DRAC pulling data from the BMC, returning it as OIDs, and branded up as "Dell OpenManage". To keep going with the Dell example, if you SSH onto a DRAC and use the "connect com2" command to do serial redirection, you're actually doing a local IPMI Serial-over-LAN session; you're just doing it inside the chassis.
Okay, so, IPMI is everywhere. So what can we do with it? Like I said, most people are familiar with this:
$ ipmitool power status
Chassis Power is on
$ ipmitool chassis status
System Power         : on
Power Overload       : false
Power Interlock      : inactive
Main Power Fault     : false
Power Control Fault  : false
Power Restore Policy : always-off
Last Power Event     : command
Chassis Intrusion    : inactive
Front-Panel Lockout  : inactive
Drive Fault          : false
Cooling/Fan Fault    : false
Front Panel Control  : none
$ ipmitool chassis power cycle
...
In the above examples I'm using the local "bmc" communications channel. The command "power" is actually a shortcut for "chassis power", so "ipmitool power cycle" and "ipmitool chassis power cycle" are the same thing. When "-I (channel)" is not specified, the local "bmc" channel is used. Here is a LAN example. (The examples above were from a Sun X4150; those below are from a Dell 2950):
$ ipmitool -I lanplus -H 10.0.50.60 -U root -f /ipmi.pass power status
Chassis Power is on
$ ipmitool -I lanplus -H 10.0.50.60 -U root -f /ipmi.pass chassis status
System Power         : on
Power Overload       : false
Power Interlock      : inactive
Main Power Fault     : false
Power Control Fault  : false
Power Restore Policy : always-off
Last Power Event     :
Chassis Intrusion    : inactive
Front-Panel Lockout  : inactive
Drive Fault          : false
Cooling/Fan Fault    : false
Sleep Button Disable : not allowed
Diag Button Disable  : allowed
Reset Button Disable : not allowed
Power Button Disable : allowed
Sleep Button Disabled: false
Diag Button Disabled : true
Reset Button Disabled: false
Power Button Disabled: true
The syntax above is fairly straightforward. I'm using the "lanplus" channel (the "lan" channel is for IPMI 1.5 commands, whereas "lanplus" is for IPMI 2.0 RMCP+), -H specifies the IP address of the IPMI interface, and -U is the IPMI user (typically "root"). Recent releases of ipmitool let you use "-f /file" in place of the -P "password" option; the file contains the password in plaintext, which ensures that the IPMI password isn't viewable in a process listing such as "ps -ef" or the SNMP process tables. The default password on Dell PowerEdge servers is "calvin", on Sun Fire servers it's "changeme"; in both cases the user is "root".
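Since that file holds the password in plaintext, it's worth locking it down when you create it. Here's a minimal sketch; write_ipmi_passfile and the "$HOME/.ipmi.pass" path are my own inventions for illustration, while the host, user, and -f option come from the example above.

```shell
#!/bin/sh
# write_ipmi_passfile FILE PASSWORD
# Store PASSWORD in FILE and make sure only the owner can read or write it.
write_ipmi_passfile() {
    (
        umask 077                    # a newly created file gets mode 0600
        printf '%s\n' "$2" > "$1"
    )
    chmod 600 "$1"                   # also tighten a file that already existed
}

# Usage against a real BMC (path is hypothetical):
#   write_ipmi_passfile "$HOME/.ipmi.pass" 'calvin'
#   ipmitool -I lanplus -H 10.0.50.60 -U root -f "$HOME/.ipmi.pass" power status
```

Keep in mind that if you call a wrapper script with the password as an argument, you've put it right back into the process listing; inside a shell function it never shows up as a separate process.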
So... what else is there to see? There are two really interesting things to look at...
The first is the Sensor Data Repository (SDR). Here you will find thresholds and values for all the available sensors. Here is an example on a Sun X4100 M2:
$ ipmitool sdr elist
sys.id           | 00h | ok  | 23.0 | State Asserted
sys.intsw        | 01h | ok  | 23.0 |
sys.psfail       | 02h | ok  | 23.0 | Predictive Failure Deasserted
sys.tempfail     | 03h | ok  | 23.0 | Predictive Failure Deasserted
sys.fanfail      | 04h | ok  | 23.0 | Predictive Failure Deasserted
mb.t_amb         | 05h | ok  |  7.0 | 34 degrees C
mb.v_bat         | 06h | ok  |  7.0 | 2.88 Volts
mb.v_+3v3stby    | 07h | ok  |  7.0 | 3.18 Volts
mb.v_+3v3        | 08h | ok  |  7.0 | 3.34 Volts
mb.v_+5v         | 09h | ok  |  7.0 | 5.02 Volts
mb.v_+12v        | 0Ah | ok  |  7.0 | 12.10 Volts
mb.v_-12v        | 0Bh | ok  |  7.0 | -12.35 Volts
mb.v_+2v5core    | 0Ch | ok  |  7.0 | 2.54 Volts
mb.v_+1v5core    | 0Dh | ok  |  7.0 | 1.53 Volts
mb.v_+1v2core    | 0Eh | ok  |  7.0 | 1.22 Volts
fp.t_amb         | 14h | ok  | 12.0 | 24 degrees C
pdb.t_amb        | 1Bh | ok  | 19.0 | 23 degrees C
io.t_amb         | 22h | ok  | 15.0 | 22 degrees C
bp.power         | 0Fh | ok  | 13.1 | State Deasserted
bp.locate        | 10h | ok  | 13.2 | State Deasserted
bp.locate.btn    | 11h | ok  | 13.2 | State Deasserted
bp.alert         | 12h | ok  | 13.3 | State Deasserted
fp.prsnt         | 13h | ok  | 12.0 | Device Present
fp.usbfail       | 15h | ok  | 12.0 | Predictive Failure Deasserted
fp.power         | 16h | ok  | 12.1 | State Asserted
fp.locate        | 17h | ok  | 12.2 | State Deasserted
fp.locate.btn    | 18h | ok  | 12.2 | State Deasserted
fp.alert         | 19h | ok  | 12.3 | State Deasserted
fp.ledbd.prsnt   | 1Ah | ok  | 12.0 | Device Present
ps0.prsnt        | 1Ch | ok  | 10.0 | Device Present
ps0.vinok        | 1Eh | ok  | 10.0 | State Asserted
ps0.pwrok        | 1Dh | ok  | 10.0 | State Asserted
ps1.prsnt        | 1Fh | ok  | 10.1 | Device Absent
ps1.vinok        | 21h | ns  | 10.1 | Disabled
ps1.pwrok        | 20h | ns  | 10.1 | Disabled
io.id0.prsnt     | 23h | ok  | 15.0 | Device Present
io.id1.prsnt     | 24h | ok  | 15.0 | Device Absent
io.hdd0.fail     | 25h | ok  |  4.0 | Predictive Failure Deasserted
io.hdd1.fail     | 26h | ok  |  4.1 | Predictive Failure Deasserted
io.hdd2.fail     | 27h | ok  |  4.2 | Predictive Failure Deasserted
io.hdd3.fail     | 28h | ok  |  4.3 | Predictive Failure Deasserted
p0.t_core        | 29h | ok  |  3.0 | 24 degrees C
p0.v_vdd         | 2Ah | ok  |  3.0 | 1.38 Volts
p0.v_vddio       | 2Bh | ok  |  3.0 | 1.85 Volts
p0.v_vtt         | 2Ch | ok  |  3.0 | 0.91 Volts
p0.fail          | 2Dh | ok  |  3.0 | Predictive Failure Deasserted
p0.d0.fail       | 2Eh | ok  | 32.0 | Predictive Failure Deasserted
p0.d1.fail       | 2Fh | ok  | 32.1 | Predictive Failure Deasserted
p0.d2.fail       | 30h | ok  | 32.2 | Predictive Failure Deasserted
p0.d3.fail       | 31h | ok  | 32.3 | Predictive Failure Deasserted
p1.t_core        | 32h | ok  |  3.1 | 21 degrees C
p1.v_vdd         | 33h | ok  |  3.1 | 1.38 Volts
p1.v_vddio       | 34h | ok  |  3.1 | 1.85 Volts
p1.v_vtt         | 35h | ok  |  3.1 | 0.91 Volts
p1.fail          | 36h | ok  |  3.1 | Predictive Failure Deasserted
p1.d0.fail       | 37h | ok  | 32.4 | Predictive Failure Deasserted
p1.d1.fail       | 38h | ok  | 32.5 | Predictive Failure Deasserted
p1.d2.fail       | 39h | ok  | 32.6 | Predictive Failure Deasserted
p1.d3.fail       | 3Ah | ok  | 32.7 | Predictive Failure Deasserted
ft0.fm0.fail     | 3Bh | ok  | 29.0 | Predictive Failure Deasserted
ft0.fm1.fail     | 3Ch | ok  | 29.1 | Predictive Failure Deasserted
ft0.fm2.fail     | 3Dh | ok  | 29.2 | Predictive Failure Deasserted
ft1.fm0.fail     | 3Eh | ok  | 29.3 | Predictive Failure Deasserted
ft1.fm1.fail     | 3Fh | ok  | 29.4 | Predictive Failure Deasserted
ft1.fm2.fail     | 40h | ok  | 29.5 | Predictive Failure Deasserted
ft0.fm0.f0.speed | 41h | ok  | 29.0 | 7900 RPM
ft0.fm2.f0.speed | 43h | ok  | 29.2 | 7200 RPM
ft0.fm1.f0.speed | 42h | ok  | 29.1 | 7400 RPM
ft1.fm0.f0.speed | 44h | ok  | 29.3 | 9200 RPM
ft1.fm1.f0.speed | 45h | ok  | 29.4 | 9100 RPM
ft1.fm2.f0.speed | 46h | ok  | 29.5 | 8400 RPM
ft0.fm0.f1.speed | 47h | ok  | 29.0 | 7800 RPM
ft0.fm1.f1.speed | 48h | ok  | 29.1 | 7400 RPM
ft0.fm2.f1.speed | 49h | ok  | 29.2 | 7100 RPM
ft1.fm0.f1.speed | 4Ah | ok  | 29.3 | 9100 RPM
ft1.fm1.f1.speed | 4Bh | ok  | 29.4 | 9000 RPM
ft1.fm2.f1.speed | 4Ch | ok  | 29.5 | 8400 RPM
You can see that some of these are boolean failure warnings, such as "io.hdd0.fail". By using the "elist" option the status is de-referenced, so we can see that it's set as "Predictive Failure Deasserted" (without "elist" this reports as 0x01). The fans, however, output the speed, and the temp sensors output the current reading.
While a full dump of the sensor repository is neat to look at, you'll want to cherry pick values for practical purposes such as monitoring. For instance, let's get just the motherboard ambient temperature reading using the "sensor" command, a sister to "sdr":
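A quick way to put that status column to work is to filter for anything that isn't "ok". This is only a sketch: sdr_not_ok is a helper name I made up, it assumes the pipe-delimited "sdr elist" layout shown above, and the sample rows are copied from that X4100 M2 listing.

```shell
#!/bin/sh
# sdr_not_ok: read "ipmitool sdr elist" output on stdin and print the name
# and status of every sensor whose status column is not "ok".
sdr_not_ok() {
    awk -F'|' '
        NF >= 3 {
            name = $1; status = $3
            gsub(/^ +| +$/, "", name)      # trim padding around each column
            gsub(/^ +| +$/, "", status)
            if (status != "ok") print name, status
        }'
}

# Sample rows from the listing; live use: ipmitool sdr elist | sdr_not_ok
sdr_not_ok <<'EOF'
ps0.pwrok        | 1Dh | ok  | 10.0 | State Asserted
ps1.prsnt        | 1Fh | ok  | 10.1 | Device Absent
ps1.vinok        | 21h | ns  | 10.1 | Disabled
ps1.pwrok        | 20h | ns  | 10.1 | Disabled
EOF
```

On this box it flags only the two ps1 sensors, which makes sense: the second power supply is absent, so its sensors report "ns" (no reading) rather than a fault.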
$ ipmitool sensor reading "mb.t_amb"
mb.t_amb | 34
If we want to feed this value to our monitoring application, such as Zabbix, Nagios, Cacti, and friends, we just parse that to display only the value, and we're good to go:
$ ipmitool sensor reading "mb.t_amb" | awk '{print $3}'
34
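One caution: printing the third whitespace-separated field only works while the sensor name has no spaces in it. On platforms whose sensor names do contain spaces (the "Ambient Temp" name in the comment below is a made-up example), splitting on the pipe is sturdier. A small sketch, with sensor_value being my own helper name:

```shell
#!/bin/sh
# sensor_value: read "ipmitool sensor reading" output on stdin and print
# only the value column, with surrounding whitespace trimmed.
sensor_value() {
    awk -F'|' 'NF >= 2 { v = $2; gsub(/^ +| +$/, "", v); print v }'
}

echo 'mb.t_amb | 34' | sensor_value    # prints 34
# Also handles names with spaces, e.g. 'Ambient Temp | 23' gives 23.
# Live use: ipmitool sensor reading "mb.t_amb" | sensor_value
```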
We can apply the same method to anything else in the SDR, allowing us to create pretty graphs and useful alerts based on voltages, fan speed, temperatures, or failure warnings. If you want greater clarity into a given sensor item, use "sensor get", for example:
$ ipmitool sensor get 'mb.t_amb'
Locating sensor record...
--
BMC req.fn         : 0x4
BMC req.lun        : 0x0
BMC req.cmd        : 0x2d
BMC req.datalength : 0x1
BMC req.data       : 0x5
--
--
BMC req.fn         : 0x4
BMC req.lun        : 0x0
BMC req.cmd        : 0x27
BMC req.datalength : 0x1
BMC req.data       : 0x5
--
Sensor ID              : mb.t_amb (0x5)
Entity ID              : 7.0
Sensor Type (Analog)   : Temperature
Sensor Reading         : 34 (+/- 0) degrees C
Status                 : ok
Lower Non-Recoverable  : na
Lower Critical         : na
Lower Non-Critical     : na
Upper Non-Critical     : 70.000
Upper Critical         : 75.000
Upper Non-Recoverable  : 80.000
--
BMC req.fn         : 0x4
BMC req.lun        : 0x0
BMC req.cmd        : 0x2b
BMC req.datalength : 0x1
BMC req.data       : 0x5
--
--
BMC req.fn         : 0x4
BMC req.lun        : 0x0
BMC req.cmd        : 0x29
BMC req.datalength : 0x1
BMC req.data       : 0x5
--
Assertions Enabled     : ucr+ unr+
Deassertions Enabled   : ucr+ unr+
This output spells out the various thresholds more explicitly; this information is also useful to your monitoring or reporting solution. Spend some time on your platform playing with the "sdr" and "sensor" commands. Hours of fun.
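Those thresholds map neatly onto the exit codes that Nagios-style checks expect. The following is a rough sketch rather than a finished plugin: check_sensor is a name I made up, it assumes the "sensor get" field labels shown above, and it only looks at the upper thresholds.

```shell
#!/bin/sh
# check_sensor: read "ipmitool sensor get" output on stdin and exit with a
# Nagios-style status: 0 OK, 1 WARNING (reading >= Upper Non-Critical),
# 2 CRITICAL (reading >= Upper Critical), 3 UNKNOWN (no reading found).
check_sensor() {
    awk '
        $1 == "Sensor" && $2 == "Reading"      { reading = $4; seen = 1 }
        $1 == "Upper"  && $2 == "Non-Critical" { unc = $4 }
        $1 == "Upper"  && $2 == "Critical"     { ucr = $4 }
        END {
            if (!seen) { print "UNKNOWN"; exit 3 }
            if (ucr != "" && ucr != "na" && reading + 0 >= ucr + 0) {
                print "CRITICAL " reading; exit 2
            }
            if (unc != "" && unc != "na" && reading + 0 >= unc + 0) {
                print "WARNING " reading; exit 1
            }
            print "OK " reading
        }'
}

# Values from the mb.t_amb example; prints "OK 34" and returns 0.
# Live use: ipmitool sensor get 'mb.t_amb' | check_sensor
check_sensor <<'EOF'
Sensor Reading         : 34 (+/- 0) degrees C
Upper Non-Critical     : 70.000
Upper Critical         : 75.000
EOF
```

Reading the thresholds from the BMC on every run means the check follows whatever the vendor programmed, instead of numbers hard-coded in your monitoring config.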
The second important feature is the System Event Log (SEL), it is exactly what you think it is:
$ ipmitool sel elist
100 | 08/21/2007 | 13:25:45 | Voltage mb.v_+1v2core | Lower Non-critical going low | Reading 0 < Threshold 1 Volts
200 | 08/21/2007 | 13:25:45 | Voltage p0.v_vdd | Lower Non-critical going low | Reading 0 < Threshold 1.00 Volts
300 | 08/21/2007 | 13:25:46 | Power Supply ps0.pwrok | State Asserted
400 | 08/21/2007 | 13:25:46 | Processor p0.fail | Predictive Failure Asserted
500 | 08/21/2007 | 13:25:48 | Power Supply ps1.pwrok | State Asserted
600 | 08/21/2007 | 13:25:50 | Voltage mb.v_+1v2core | Lower Non-critical going high | Reading 1.22 > Threshold 1 Volts
700 | 08/21/2007 | 13:25:50 | Voltage p0.v_vdd | Lower Non-critical going high | Reading 1.38 > Threshold 1.00 Volts
800 | 08/21/2007 | 13:25:52 | System Firmware Progress | Motherboard initialization | Asserted
900 | 08/21/2007 | 13:25:52 | System Firmware Progress | Video initialization | Asserted
a00 | 08/21/2007 | 13:25:58 | System Firmware Progress | USB resource configuration | Asserted
b00 | 08/21/2007 | 13:26:09 | System Firmware Progress | Option ROM initialization | Asserted
c00 | 08/21/2007 | 13:26:53 | System Firmware Progress | User-initiated system setup | Asserted
d00 | 08/21/2007 | 13:27:11 | System Firmware Progress | Motherboard initialization | Asserted
e00 | 08/21/2007 | 13:27:11 | System Firmware Progress | Video initialization | Asserted
f00 | 08/21/2007 | 13:27:17 | System Firmware Progress | USB resource configuration | Asserted
1000 | 08/21/2007 | 13:27:25 | Power Supply ps0.pwrok | State Deasserted
1100 | 08/21/2007 | 13:27:27 | Power Supply ps1.pwrok | State Deasserted
1200 | Pre-Init Time-stamp | Power Supply ps1.vinok | State Asserted
1300 | Pre-Init Time-stamp | Entity Presence ps1.prsnt | Device Present
1400 | Pre-Init Time-stamp | Power Supply ps0.pwrok | State Deasserted
1500 | Pre-Init Time-stamp | Power Supply ps0.vinok | State Deasserted
1600 | Pre-Init Time-stamp | Physical Security sys.intsw | General Chassis intrusion | Asserted
1700 | Pre-Init Time-stamp | Entity Presence ps0.prsnt | Device Present
1800 | Pre-Init Time-stamp | Power Supply ps1.pwrok | State Asserted
1900 | 11/14/2007 | 21:34:32 | System Firmware Progress | Motherboard initialization | Asserted
1a00 | 11/14/2007 | 21:34:32 | System Firmware Progress | Video initialization | Asserted
1b00 | 11/14/2007 | 21:34:38 | System Firmware Progress | USB resource configuration | Asserted
1c00 | 11/14/2007 | 21:35:09 | System Firmware Progress | Option ROM initialization | Asserted
1d00 | 11/14/2007 | 21:35:48 | System Firmware Progress | Motherboard initialization | Asserted
1e00 | 11/14/2007 | 21:35:48 | System Firmware Progress | Video initialization | Asserted
1f00 | 11/14/2007 | 21:35:54 | System Firmware Progress | USB resource configuration | Asserted
2000 | 11/14/2007 | 21:36:25 | System Firmware Progress | Option ROM initialization | Asserted
2100 | 11/14/2007 | 21:44:26 | System Firmware Progress | Motherboard initialization | Asserted
2200 | 11/14/2007 | 21:44:26 | System Firmware Progress | Video initialization | Asserted
2300 | 11/14/2007 | 21:44:32 | System Firmware Progress | USB resource configuration | Asserted
2400 | 11/14/2007 | 21:45:03 | System Firmware Progress | Option ROM initialization | Asserted
2500 | 11/14/2007 | 21:45:50 | System Firmware Progress | System boot initiated | Asserted
2600 | 11/14/2007 | 21:59:17 | Power Supply ps1.pwrok | State Deasserted
2700 | 11/14/2007 | 21:59:22 | Power Supply ps1.pwrok | State Asserted
2800 | 11/14/2007 | 21:59:35 | System Firmware Progress | Motherboard initialization | Asserted
2900 | 11/14/2007 | 21:59:35 | System Firmware Progress | Video initialization | Asserted
2a00 | 11/14/2007 | 21:59:41 | System Firmware Progress | USB resource configuration | Asserted
2b00 | 11/14/2007 | 22:00:12 | System Firmware Progress | Option ROM initialization | Asserted
2c00 | 11/14/2007 | 22:00:56 | System Firmware Progress | System boot initiated | Asserted
2d00 | 11/14/2007 | 22:14:01 | Power Supply ps1.pwrok | State Deasserted
2e00 | 11/14/2007 | 22:14:06 | Power Supply ps1.pwrok | State Asserted
2f00 | 11/14/2007 | 22:14:20 | System Firmware Progress | Motherboard initialization | Asserted
3000 | 11/14/2007 | 22:14:20 | System Firmware Progress | Video initialization | Asserted
3100 | 11/14/2007 | 22:14:26 | System Firmware Progress | USB resource configuration | Asserted
3200 | 11/14/2007 | 22:14:57 | System Firmware Progress | Option ROM initialization | Asserted
3300 | 11/14/2007 | 22:15:42 | System Firmware Progress | System boot initiated | Asserted
3400 | 11/14/2007 | 22:23:46 | System Firmware Progress | Motherboard initialization | Asserted
3500 | 11/14/2007 | 22:23:46 | System Firmware Progress | Video initialization | Asserted
3600 | 11/14/2007 | 22:23:52 | System Firmware Progress | USB resource configuration | Asserted
3700 | 11/14/2007 | 22:24:03 | System Firmware Progress | Option ROM initialization | Asserted
3800 | 11/14/2007 | 22:24:46 | System Firmware Progress | System boot initiated | Asserted
3900 | 11/14/2007 | 23:36:17 | Power Supply ps1.pwrok | State Deasserted
3a00 | Pre-Init Time-stamp | Power Supply ps0.pwrok | State Deasserted
3b00 | Pre-Init Time-stamp | Power Supply ps0.vinok | State Asserted
3c00 | Pre-Init Time-stamp | Entity Presence ps0.prsnt | Device Present
3d00 | Pre-Init Time-stamp | Power Supply ps0.pwrok | State Asserted
3e00 | 11/15/2007 | 01:36:22 | System Firmware Progress | Motherboard initialization | Asserted
3f00 | 11/15/2007 | 01:36:22 | System Firmware Progress | Video initialization | Asserted
4000 | 11/15/2007 | 01:36:28 | System Firmware Progress | USB resource configuration | Asserted
4100 | 11/15/2007 | 01:36:59 | System Firmware Progress | Option ROM initialization | Asserted
4200 | 11/15/2007 | 01:37:44 | System Firmware Progress | System boot initiated | Asserted
So here we see the event history of our system. Both Dell and Sun SPs and firmware use this event log to send warnings and such. For instance, if you want to clear a mysterious warning light on a Dell's chassis, just clear the SEL.
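If you want to do more than eyeball the log, the pipe-delimited lines are easy to pick apart. Here is a rough Python sketch (mine, not part of ipmitool) that assumes the line layout shown in the listing above, including the "Pre-Init Time-stamp" records that carry no date or time. The function name `parse_sel_line` is made up for illustration, and the sample lines are copied from that listing.

```python
from collections import Counter

def parse_sel_line(line):
    """Split one 'ipmitool sel elist' line into a small dict."""
    fields = [f.strip() for f in line.split("|")]
    record = {"id": fields[0]}
    if fields[1] == "Pre-Init Time-stamp":
        # Logged before the BMC clock was set, so there is no date/time pair.
        record["timestamp"] = None
        rest = fields[2:]
    else:
        record["timestamp"] = fields[1] + " " + fields[2]
        rest = fields[3:]
    record["sensor"] = rest[0]
    record["event"] = " | ".join(rest[1:])
    return record

# A few lines taken from the listing above.
SAMPLE = """\
1000 | 08/21/2007 | 13:27:25 | Power Supply ps0.pwrok | State Deasserted
1100 | 08/21/2007 | 13:27:27 | Power Supply ps1.pwrok | State Deasserted
1200 | Pre-Init Time-stamp | Power Supply ps1.vinok | State Asserted
2600 | 11/14/2007 | 21:59:17 | Power Supply ps1.pwrok | State Deasserted
2700 | 11/14/2007 | 21:59:22 | Power Supply ps1.pwrok | State Asserted
"""

records = [parse_sel_line(l) for l in SAMPLE.splitlines() if l.strip()]

# Count how often each sensor reported "State Deasserted".
drops = Counter(r["sensor"] for r in records if r["event"] == "State Deasserted")
for sensor, count in sorted(drops.items()):
    print(sensor, count)
# Power Supply ps0.pwrok 1
# Power Supply ps1.pwrok 2
```

In practice you would feed it the captured output of the command rather than a hard-coded sample.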
The SEL's best friend is Platform Event Filtering (PEF). Here we can create rules which dictate alerting policy. When an event occurs that matches a PEF rule, an alert is sent in the form of an SNMP trap, which is called a "Platform Event Trap" (PET). The default event rules list is short on a Sun X4100:
$ ipmitool pef list
1 | active, pre-configured | 0x11 | Any | Any | Warning | OEM | OEM | Alert,OEM-defined | 2
2 | active, pre-configured | 0x11 | Any | Any | Critical | OEM | OEM | Alert,OEM-defined | 3
3 | active, pre-configured | 0x11 | Any | Any | Non-recoverable | OEM | OEM | Alert,OEM-defined | 4
4 | active, pre-configured | 0x11 | Any | Any | Information | OEM | Any | Alert,OEM-defined | 1
On the Dells it's a bit more fine-grained:
$ ipmitool -I lanplus -H 10.0.50.60 -U root -f /ipmi.pass pef list
1 | active | 0x11 | Fan | Any | Critical | Threshold | (0x01/0x0004),
But where do these traps go? That's defined by the PEF policy, which is commonly configured via your SP. On Sun systems this means using the ELOM/ILOM interface; on Dell you can do it in the BIOS.
$ ipmitool -I lanplus -H 10.0.50.60 -U root -f /ipmi.pass pef policy
1 | 1 | Match-always | 1 | 802.3 LAN | PET | public | 0 | 0 | 0.0.0.0 | 00:00:00:00:00:00
2 | 1 | Match-always | 1 | 802.3 LAN | PET | public | 0 | 0 | 0.0.0.0 | 00:00:00:00:00:00
3 | 1 | Match-always | 1 | 802.3 LAN | PET | public | 0 | 0 | 0.0.0.0 | 00:00:00:00:00:00
4 | 1 | Match-always | 1 | 802.3 LAN | PET | public | 0 | 0 | 0.0.0.0 | 00:00:00:00:00:00
While these two, SDR and SEL, are extremely useful, there is one more IPMI feature that you may not even be aware of... Serial over LAN, or SoL for short. It does what it sounds like: serial console redirection via IPMI over the LAN! This means that systems that once required a console server physically connected to each server's serial port can now be accessed simply using "ipmitool". SoL was standardized in IPMI v2.0 and almost all modern servers support it and, as noted earlier, some SPs' console redirection (such as DRAC 'connect com2') is in fact IPMI SoL in disguise.
$ ipmitool -I lanplus -H 10.0.50.60 -U root -f /ipmi.pass sol activate
[SOL Session operational. Use ~? for help]
# <-- this is a console prompt
The controls are those of SSH: use tilde-dot (~.) to disconnect. IPMI SoL isn't foolproof, however. I've run into several instances where 'ipmitool' would segfault and dump my connection for no repeatable reason, and I've seen this both "remotely" from another system and on a DRAC. So don't throw away all your console servers just yet, but there are plenty of cases where it can sure come in handy.
So, let me recap, since this is a lot to digest if you're new to it:
- IPMI is your friend.
- Modern server motherboards possess a Baseboard Management Controller (BMC) which is the heart of your box accessed via IPMI.
- SPs are not BMCs, they just provide lights out access to them.
- IPMI is useful for more than checking power status and power cycling.
- The IPMI Sensor Data Repository (SDR) is accessible with the IPMItool "sdr" and "sensor" commands and provides access to all system sensors.
- IPMItool can be used locally and remotely. Didn't buy an SP or forgot to configure LAN settings in the BIOS? Just install IPMItool locally and see if you can say hello!
- Sensor data can easily be output and formatted to be input into your monitoring solution whether it be Uptime, Nagios, Zabbix or Cacti.
- The System Event Log (SEL) can provide meaningful insight into previous events that the OS may not have been aware of.
- Platform Event Filtering (PEF) can be used to alert on specified events from the SEL, sending Platform Event Traps (PET) in SNMP Trap format, providing a means for asynchronously alerting on error conditions.
- IPMI provides remote console capability, IPMI Serial-over-LAN (SoL), which can provide a low-cost/no-cost remote console access method where no other solution may be applicable. Can't access a system? Try SoL before you power cycle!
- What's in that box? IPMItool can also output a FRU list (ipmitool fru) to assist in your auditing needs.
- ... and much more.
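To make the "format sensor data for your monitoring solution" point a little more concrete, here is a small Python sketch. It assumes pipe-delimited "name | reading | status" lines of the kind "ipmitool sdr" prints. The sample text below is illustrative rather than captured from a real box, and `parse_sdr` is a name I made up. Discrete sensors and sensors with no reading are simply skipped.

```python
def parse_sdr(text):
    """Return {sensor name: (numeric value, status)} for numeric sensors only."""
    readings = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 3:
            continue  # not a sensor line
        name, reading, status = fields[0], fields[1], fields[2]
        first_token = reading.split()[0] if reading else ""
        try:
            readings[name] = (float(first_token), status)
        except ValueError:
            # Discrete values such as "0x02" or "no reading" are not numeric.
            continue
    return readings

# Illustrative sample only; sensor names and values are assumptions.
SAMPLE = """\
Ambient Temp0 | 24 degrees C | ok
fan0 | 9000 RPM | ok
ps0.prsnt | 0x02 | ok
cpu1.temp | no reading | ns
"""

readings = parse_sdr(SAMPLE)
for name, (value, status) in sorted(readings.items()):
    print(name, value, status)
```

From here it is a short step to printing one value per key in whatever format your monitoring tool expects.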
I hope this gives you a new appreciation for just what IPMI can do for you and how you might be able to exploit it. In many cases it's there, right now, waiting to be used on your servers. Just because you didn't assign an IP address doesn't mean you can't use it, so please install IPMItool and give it a shot. If you had to buy servers without a DRAC or SP, don't despair, you're not missing out as much as you think.
With a new site comes new opportunity. When Joyent recently added a new facility I decided to look at new monitoring solutions. We were previously standardized on Sun X4100s, Sun T1000s, and Sun X4500s with a network using Force10 and F5. This new site would replace the X4100s with Sun X4150s and Dell 2950s. Having largely avoided Dell in the past, I took my time to get to know the system well, and knowing that IPMI is far more utilized in that arena than with most Sun systems, I was given the opportunity to gain a whole new appreciation for IPMI 2.0. Armed with this new knowledge I wanted to take our monitoring much further than we had in the past, down to monitoring individual BMC sensors via IPMI. This meant looking at monitoring solutions in a much deeper way.
Zabbix quickly rose to the top of my list. In the initial phase I grabbed Zabbix and Zenoss and planned for a face-off. Zabbix compiles nicely and easily on Solaris/x86, and within about 30 minutes I had a server up and had created an agent tarball complete with installation script, so that deploying an agent was as easy as a wget, untar, and "install.sh". Zenoss, however, gave me problems. Dealing with Python isn't my strong point, several dependencies were required, and then I had problem after problem getting things to build properly... after about an hour I decided that as pretty as Zenoss looks in those screenshots, it was out of the running for the time being.
Zabbix isn't pretty... I'll be honest. If you want a "tactical view" to put on an overhead projector in your NOC, Zabbix won't give you that Hollywood movie feel. Zabbix does make up for that shortcoming in raw power. Let me explain...
Zabbix is agent based; you can choose to avoid this but you'll lose most of what makes Zabbix so great. It is implemented in C and easily portable, so you don't need to worry about having Java or Python on all the monitored hosts like many of its competitors. It adds to this the standard assortment of SNMP, custom script, and external checks (icmp, ftp, etc) you expect. Where it adds something really interesting is its WEB (why they put this in all caps I don't know) monitoring capabilities, which allow you to do more than just fetch a page: you can supply "steps" such as logging into a site and navigating around, and the response times are stored, allowing you to alert someone if the login process takes more than 5 seconds or something. Very handy indeed.
Zabbix agents and other checks associate data with "keys"; these keys are then bound to "items" that describe the data and define how often to update it. These items are then associated with various alert conditions called "triggers". For instance, the agent by default returns the number of users connected with the key "system.users.num", and the stock template associates the "Number of users connected" item with that key, which is polled every 60 seconds. So now we can create any number of different alert conditions based on the number of users logged in by creating triggers. One default trigger is "Too may users connected on server {HOSTNAME}" (where HOSTNAME is replaced ultimately with the appropriate host), which uses the following expression/condition:
{Template_Joyent:system.users.num.last(0)}>50
In this case, the trigger is associated with my "Template_Joyent". The function "last(0)" (meaning the last value you polled) is applied to the key "system.users.num", and if the value is greater than 50 the condition is true. In Zabbix all conditions should evaluate to false when things are fine. Each trigger has a Severity associated with it, in this case "Average". So the result here is that any user configured to get alerts for Average or higher severity will get a notification when more than 50 users are logged in. Zabbix provides a rich set of functions by which to create your triggers, such as average change over time, min and max, absolute difference, etc. This means that I could, for instance, create a trigger that alerted me if since the last polling interval more than 10 users logged in or if the average number of users over the course of an hour exceeded 50.
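To show what these trigger functions boil down to, here is a rough Python sketch. It is not Zabbix's implementation or syntax, just plain functions over a list of polled values. The sample history is invented, and the helper names (`last`, `change`, `avg`) are only loosely modelled on the functions described above.

```python
def last(history):
    """Most recently polled value."""
    return history[-1]

def change(history):
    """Difference between the last two polled values."""
    return history[-1] - history[-2]

def avg(history, count):
    """Average of the most recent `count` values."""
    window = history[-count:]
    return sum(window) / len(window)

# Invented sample: users logged in, one value per 60-second poll.
users = [12, 14, 15, 48, 61]

# Each condition is False when things are fine and True when it should fire.
too_many_now = last(users) > 50                # the stock-style trigger
sudden_jump = change(users) > 10               # more than 10 new users since last poll
busy_on_average = avg(users, len(users)) > 50  # average over the window exceeds 50

print(too_many_now, sudden_jump, busy_on_average)  # True True False
```

Note how the same history can trip one condition and not another, which is why it pays to think about which function actually describes the situation you care about.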
Where this becomes really powerful is when you choose to extend your agents. In each agent's configuration file you can supply a "UserParameter" directive which runs some command and returns its output under a given key. Here are some simple examples:
# SMF
UserParameter=smf.online,svcs -a | grep online | wc -l
UserParameter=smf.offline,svcs -a | grep offline | wc -l
UserParameter=smf.maintance,svcs -a | grep maint | wc -l
# X4150 IPMI (BMC Direct)
UserParameter=ipmi.amb,/usr/sbin/ipmitool sensor reading "Ambient Temp0" | cut -f2 -d'|'
UserParameter takes two comma-delimited arguments: the key name and the command to run. In the above, I can return the number of "online" SMF services by grepping the output of "svcs -a" and returning the count as "smf.online". Restart the agent, go back to the server, add a new Item for this key, and start creating triggers. Now I can be alerted if, for instance, the number of online services decreases by some number in a given time. The IPMI example there provides a workaround for environments where you may not have access to the management port of your server and instead want to return IPMI data directly from the OS using 'ipmitool'.
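Since a typo in one of these lines only shows up after you restart the agent, a quick sanity check can save a round trip. The Python sketch below is my own helper, not anything shipped with Zabbix. It assumes the simple "UserParameter=key,command" form described above and splits on the first comma only, so commands containing commas survive intact.

```python
def parse_user_parameter(line):
    """Return (key, command) for a UserParameter line, or None for other lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line or comment
    name, _, value = line.partition("=")
    if name.strip() != "UserParameter":
        return None  # some other directive
    key, sep, command = value.partition(",")
    if not sep or not key.strip() or not command.strip():
        raise ValueError("expected UserParameter=key,command: " + line)
    return key.strip(), command.strip()

# Lines in the style of the examples above.
CONFIG = """\
# SMF
UserParameter=smf.online,svcs -a | grep online | wc -l
UserParameter=smf.offline,svcs -a | grep offline | wc -l
"""

for raw in CONFIG.splitlines():
    parsed = parse_user_parameter(raw)
    if parsed:
        print(parsed[0], "->", parsed[1])
# smf.online -> svcs -a | grep online | wc -l
# smf.offline -> svcs -a | grep offline | wc -l
```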
The examples above are simplistic; you can get more advanced by allowing the server to pass arguments. For example, if you want to monitor disk ops, you probably don't want to add a separate key for each disk, so you could instead specify the disk name back on the server and have it passed to your agent as an argument.
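As I understand it, Zabbix handles this with a "flexible" form, UserParameter=key[*],command, where $1 through $9 in the command are replaced with the arguments given in the item key on the server (check the documentation for your version). The Python sketch below only imitates that substitution to show the idea. The key name "disk.ops" and the script path are hypothetical, and this is not how the agent is actually implemented.

```python
import re

def split_key(item_key):
    """Split 'name[arg1,arg2]' into ('name', ['arg1', 'arg2'])."""
    match = re.fullmatch(r"([^\[\]]+)(?:\[(.*)\])?", item_key)
    if not match:
        raise ValueError("malformed item key: " + item_key)
    name, raw_args = match.group(1), match.group(2)
    args = [a.strip() for a in raw_args.split(",")] if raw_args is not None else []
    return name, args

def expand(template, args):
    """Replace $1..$9 in the command template with the matching argument."""
    def repl(m):
        index = int(m.group(1)) - 1
        return args[index] if index < len(args) else ""
    return re.sub(r"\$([1-9])", repl, template)

# Hypothetical agent-side definition: disk.ops[*] -> /opt/scripts/disk_ops.sh $1
template = "/opt/scripts/disk_ops.sh $1"

name, args = split_key("disk.ops[sd0]")
print(name, args)              # disk.ops ['sd0']
print(expand(template, args))  # /opt/scripts/disk_ops.sh sd0
```

One key definition on the agent then serves every disk, and the server decides which disk each item asks about.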
These abilities remove a lot of the cruft and put the power into your hands. If you have a highly customized environment Zabbix is a great choice.
However, in Zabbix 1.4 there is a lot more that is needed. Features like escalations are planned but not expected until Zabbix 1.6. Without features like limiting repeat pages, you're forced to get your triggers properly defined rather than masking away false positives in your alerting policy, the way Nagios handles "flapping".
The "Overview" (all in one view of the world) page is something that takes some getting used to. Rather than a pretty Nagios-style page in black with "0 Services Down, 0 Servers Down", you instead get a list of defined triggers and a color-coded column for each monitored host. If the box is green, life is good. If the box is red, life isn't. Several nifty things, like flashing green if the trigger is fine now but wasn't less than 15 minutes ago, are handy but sometimes annoying, especially during testing.
But, and this is a big but, Zabbix does give you something exceedingly useful... all the monitored keys can be graphed. Click on the color block of any monitored trigger and you can view its value history as a graph. This means that you can take add-on graphing applications like Cacti or MRTG and roll the functionality directly into Zabbix. Want to know what load average has looked like for the last week? No problem. And, add to this the ability to custom create graphs which can combine multiple keys into a single view.
If you see a screenshot with pretty graphs all over it on the Zabbix front page, that's a "Screen". A screen is a custom page laid out with several custom graphs. So if you want a graph that has the memory usage and load average of every system on your network, you can put those together in a custom graph and then place that on your "Screen". If you take some time and create some nifty graphs like this, you may find yourself looking more at your screen than at the Overview page.
Zabbix isn't as pretty as Zenoss or as streamlined as Nagios, but it really is a SysAdmin's tool. It doesn't hide functions away from you or gloss over details to make you feel nurtured; it's all out there, gritty and raw, for you to use and manipulate. There is a natural learning curve in wrapping your head around the concepts of "items" and "triggers" and how you can combine them in really powerful ways, but before long you'll be frustrated by the limits of other solutions. Having a full-featured and easily extensible agent really is my favorite aspect, and it frees you from the concerns that come with having to pass around the SSH keys required to make external scripts work with other solutions.
That said, if you decide to implement Zabbix, expect to spend some time crafting triggers to suit your needs. If you want to get beyond the basics you'll need to get your hands dirty, which is easy to do after playing with it a bit, but if you want something you can just deploy and forget, consider a commercial tool like Uptime.
I'm only grazing the basics here. If you want to learn more check out Zabbix.com.
I've recently been delving deeper into the area of systems management. At Joyent we have a variety of unique problems that I've not faced in the past. For instance, I have a largely homogeneous environment; we have strict controls on what we use and always strive to be as consistent as possible. We also have more machines than I've ever managed before. While I've worked in many very large environments, I typically worked on big iron systems with organizations that divided administrative duties across multiple teams. This all collides into a perfect test bed for some really deep systems management practices.
So what is Systems Management? I think we can divide data center duties into 4 main groups:
In between these layers lives Systems Management.
To make an analogy... 'root' may be omni-potent, having "God Power", but unlike God, a SysAdmin is not omni-present. You may be the sole administrator of a 50-system installation, but do you really know the temps that the CPUs are running at? Do you know the load average of every system? Do you know if a disk has failed? Sure, you can find out right now, but what were those data points 2 days ago? A week ago?
The core focus of systems management, to me, is making an attempt to be omni-present. God has the ability to know all things at one time... we don't, so the critical elements are:
This second point is vital and one of the reasons that so many people struggle with DTrace... knowing what question to ask tends to be self evident with a little thought, and determining how to ask that question has become easier with tools like DTrace or SNMP, but WTF do those numbers really mean? A classic sysadmin interview question is to drop some printed vmstat output in front of a candidate and ask for an analysis on the spot.
Let me give you an example. At Joyent we've been learning a lot about modern storage. Despite the advances in CPU and memory technology, disk technology hasn't really come that far; we are largely in the same boat that we were in 10 years ago. But how do you know when your storage solution just isn't measuring up? How do you know when you're in trouble? Look at the output of 'iostat'. Most people incorrectly interpret the '%b' column as "blocked", saying that if it's consistently 100 %b then you're in trouble. But that's not true... %b is "Busy", meaning: during a given time interval, how much of that time was IO actively occurring? Thus, 100% simply means that over that time interval you were doing IO to a given device for the entire period... but that might be at 1MB/s or 200MB/s, or it may vary. It's similar to saying that if the CPUs are 100% in use there is a problem, which isn't strictly true if the load average remains lower than the number of cores. A fundamental flaw in most people's interpretation of data is that 100% utilization automatically means something is broken... sure, that's almost always the case, but fundamentally it's still wrong. So, in the case of IO, the better number to look at is the asvc_t column, which reports the "Active Service Time": the average time it took for an IO to be handled by the device, from the time it was sent to the device until it returned (time spent waiting to be sent to the device is wsvc_t, the wait time spent queued in the driver).
So how do I know if IO has dropped off to unacceptable levels? If asvc_t consistently exceeds 50ms, life is probably unpleasant, and if it exceeds 100ms life is just unbearable. That's the number that I'm really interested in because it most accurately describes a condition of which I want to be apprised... and this is where monitoring comes in.
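Those two thresholds translate directly into a check. In the Python sketch below I've interpreted "consistently exceeds" as "every sample in the window exceeds", which is my assumption and one you may want to loosen. The function name and the sample values are made up. The samples are asvc_t readings in milliseconds collected however you like.

```python
def classify_asvc_t(samples_ms, warn_ms=50.0, crit_ms=100.0):
    """Classify a window of asvc_t samples (milliseconds) for one device."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    floor = min(samples_ms)  # the lowest sample decides what is "consistent"
    if floor > crit_ms:
        return "unbearable"
    if floor > warn_ms:
        return "unpleasant"
    return "ok"

print(classify_asvc_t([12.0, 65.0, 8.5]))      # ok (one spike is not "consistent")
print(classify_asvc_t([55.0, 72.0, 61.0]))     # unpleasant
print(classify_asvc_t([130.0, 105.0, 210.0]))  # unbearable
```

Using the minimum of the window means a single slow IO burst won't page anyone, while a device that never recovers will.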
Monitoring makes us omni-present. It allows us to be constantly aware of changes in our environment that we should know about, and to record data points over a period of time to allow for trend analysis, capacity planning, and fault isolation down the road. Thus, we can divide monitoring applications into 2 categories:
From these have emerged several hybrids that attempt to combine the two, such as Zabbix and Zenoss. It's not uncommon to see environments use multiple applications, however; I commonly see environments with Ganglia, Nagios, and Cacti, for instance. And, of course, there are specialty varieties, such as Snort or pmacct, that operate on specific types of data.
And so, when we move beyond network or systems administration we have to take a more holistic approach: not just paging someone when a system crashes, but asking questions and thinking deeply about what there is to know, how to obtain that data, how to feed it to something, and then how to build some intelligence around it so it only bugs you when you really should be bugged.
Modern servers provide us with a rich set of tools and capabilities to exploit. IPMI, for example, is not just for remotely rebooting systems! SMBus and sensors are not just for putting some nifty graphic in the corner of your desktop! Even on commodity systems, boards now commonly include a Baseboard Management Controller (BMC), which forms the basis of most Lights Out Management (LOM) solutions, aggregating data together, collecting sensor data and making it available via IPMI. Want a list of components in your system? Want to know the temps and fan speeds in your box? Want to see system events that may not have been seen or felt by your OS but occurred nonetheless? IPMI talking to your BMC can help, even if you don't have a Service Processor (SP) such as ILOM or DRAC.
In the next couple of weeks I'm going to try to set aside some time to dig into some of these topics in more depth. I don't think most administrators are truly aware of all the resources available to them, because unless you go looking for something or have a need or just stumble across it, how are you supposed to know? We're all busy enough as it is without going looking for new things to learn. So that's where I can try to help out. Hopefully we'll all learn some new tricks to add to our arsenal of magic.
Talk Like A Pirate Day is coming on Wed, the 19th. Be prepared to either: A) Have fun annoying your coworkers, or B) Bring your boxing gloves to beat the sh*t out of your annoying coworkers.
I'm a weird fellow, always have been. I chalk this up to personality. But I've got some strange issues that I've spent a long time trying to figure out, such as:
An example of all these together is my Oracle book. I wrote the entire book, examples and all, in 4 days... the bulk of it in 2. It then took over a year to edit, and it was an unbearably painful task; several times I wanted to send the check I was paid for it back to USENIX and call the whole thing off. Suddenly something I wanted to do became something I had to do. I can't bear to even read my own writing, and I spent more time thinking about how much I hated editing the stupid book than actually doing it.
Now, before I continue let me say that I have always personally felt that ADD was a pile of crap... it's just the way some people are, they think differently, have different needs, different styles, etc. Most of that sentiment comes from a childhood of watching kids go from energetic and excited into Ritalin zombies. I saw plenty of kids whose parents preferred their kid suicidal rather than give them some love and attention, and just drugged them into soulless kids with suddenly improved table manners. Give the kid a shovel and point 'em toward a nice deep place to dig up some adventure, let 'em chop some wood and build a fort, stop with the zombie drugs.
But, that said, there isn't harm in at least having classifications for people and behavior. Better to know thyself and have a term to boot than be in the dark... but that doesn't mean the Rx pad has to obligatorily appear.
Tam started looking into ADD and ADHD just to better educate herself should similar symptoms ever come up with our kids. In discussing it I begrudgingly noted that what little I knew about the symptoms pissed me off because they matched me pretty well. Frankly, I think they fit a lot of people pretty well. The world is a place that we're allowing to get faster and faster all the time without really much solid reason, but we keep stretching ourselves thinner and thinner. I tend to chalk this up to the American culture of harder, faster, more.
But I do have issues, I know this. I just assume they are normal. Thus, my problem isn't some hooty-tooty disorder, I'm just lazy and have screwed up priorities, etc. I noted before, in my blog entry about Sysadmin Mentoring, that I've spent a lot of time reflecting on my own behaviors, beliefs, and ideas. I've spent time reading a pile of great books that all fit into the dreaded "personal growth" category, most of which sent me back to reading more definitive and foundational texts, namely Thoreau and Emerson. But, alas, I've sort of hit that 'self help gulch' in which you walk away with lots of questions, some answers, a couple new tips and tricks, but ultimately end up close to where you started, although perhaps a little wiser for the journey. Many of my issues listed above aren't answered by any of the books regarding management, leadership, integrity and character that I've read. Perhaps the most useful was Henry Cloud's Integrity, which emphasizes a firm handle on reality, that which is rather than that which may have, could have, would have been... this addressed the wandering mind issue to some extent and has been helpful to some degree.
Then the thought occurred to me that the symptoms of ADD not only fit me well, but many sysadmins. Lack of focus, desire to do large numbers of tasks at once and hopping constantly between them all the while unable to get any large tasks complete. That sounds like a lot of sysadmins out there. Is it possible that this job is a magnet to those who fit the ADD profile because it suits the symptoms so well? Can you sweep these symptoms away as "information overload" or is there something deeper?
And so, this is why I post this... I'm curious to hear what other sysadmins think. We know that some of the most gifted coders in the world have "disorders" which give them superior mental, mathematical, and focus abilities. Is it possible that sysadmins are, potentially, the converse? Do our abilities to juggle large numbers of tasks simultaneously, to not get too bogged down in any one thing, and to deal with disorder come from "disorders" themselves, many of which our ego-driven minds stamp out as "bullshit excuses for losers" rather than accept that it might be something with an unfriendly stamp like ADD?
I'm still just researching. I'm not diagnosed, and frankly have no interest in ever being. If I were going to look into anything it'd probably be sleep apnea (I fit the profile and it would explain my sucky memory), but I'm too lazy to get tested for that either... too much to do. But I will say that in my research thus far one book caught my attention: You Mean I'm Not Lazy, Stupid or Crazy?!: The Classic Self-Help Book for Adults with Attention Deficit Disorder.
But, seriously, how many of you, my fellow sysadmins, feel the same? Anyone actually gone down this road before? And, if there is any validity to it, wtf do you do with this new found knowledge?
It snuck up on me this year... I guess that's a good sign that we're all working hard. Happy SA Day! May your coffee cups stay full, donuts fresh, and systems stable.
This is a very important year for me. I'm going through an incredible amount of growth, personally and professionally. I've hit that point in my life where I'm out of culturally imposed goals... the midlife lull, I'll call it. When you're a kid you have clear-cut goals... namely, move the hell out. When you do, you have more clear cultural goals: get an education, good job, nice car, nice home (whatever "home" means to you), wife, kids, build a career, build a chain of accomplishments, etc. The midlife lull is where you find yourself once you've reached all those goals. You have a good job, nice car, wife and kids, etc, and are now into that "just keep doing it till you retire" phase of life. I'm asking new questions strikingly similar to those you ask when you're young... "Am I doing what I want to?", "Where do I really want to go next?", "What is the purpose of my life (aside from or in addition to any religious beliefs)?" The big review.
All of these questions have jolted me free of my youthful belief that I'm alone in the universe and no one else can help. I'm reading books on every topic of personal development and growth, looking for both answers and, perhaps most importantly, the right questions to ask of myself. I'm glad to report that it's paying massive dividends so far, and I still have a long way to go.
Along with all this, my coming onto Joyent as a Director has given me a new career direction that I hadn't previously planned on. Initially just adding a yet larger title to my resume was a draw (along with the ability to solve just about every problem known to man using iSCSI). As time went on and I started building out a team I found that what I really enjoyed, even more than the work, was working with other sysadmins in a managerial role. That's something I never would have imagined.
We've all worked for managers that didn't understand our roles, talents, abilities, or jobs. I've grown very tired of quarterly reviews that would follow the template "Ben is excellent technically, but has trouble being on time to meetings, interacting with co-workers, ...." The focus always being on everything but the job. I'm not trying to shrug off my non-technical skills; indeed, I've worked very hard to improve in those areas so as not to overshadow my other abilities. But the point is that the parts of our jobs that we really work hard at commonly get smoothed over because managers simply don't know how to evaluate performance in technical areas. I crave constructive feedback. I'm self-analytical and want to constantly improve myself and my performance, and that requires input. Targeted, dispassionate criticism is just as useful as praise and accolades; both are essential in equal quantities.
This extends into something even more important to me... career growth. We, as a profession, don't tend to stay anywhere long (long being defined as more than 4 years). Promotion almost always occurs due to changing companies. Very few companies that I've worked with promote from within. A lot of companies don't even think about promotion, I think; they simply heap more responsibility on employees. The mail admin wants to be more involved with storage and so starts working with the storage team here and there and maybe takes over more control of the storage systems related to his area, but rarely do admins get shifted between operational groups, given a new office, title, etc. Those that do work at very large corps, but it still doesn't happen enough in those organizations; it's easier to take someone from the outside than to train someone up, even if that just means personally challenging them.
Most admins are, therefore, islands unto themselves, reinventing the wheel all the time. We are our own career advisers, training specialists, and head hunters. Why? Because we're also very ego-driven people. We pride ourselves on going it alone and doing it ourselves. For me this has come and gone in waves... I crave a job where I'm alone in the data center, able to set my pace, set my priorities, do it my way. After a time I get very lonely and crave a team. I then transition to a team environment and enjoy sharing ideas, learning from peers, and being collaborative... but this too gets old after a while due to frustrations with certain members of the team, ignorant management, or a host of other issues that come up in a big team environment. I've gone back and forth enough times that I got to see the patterns emerge in myself. I'm extremely thankful to Taos and my time there, which moved me between gigs frequently enough that I was able to learn these things about myself. It really was a gift; most people will work for 10 or more years before having a dataset of that size from which to analyze themselves and their performance.
I really enjoy thinking about members of my team: who they are, what they are strong in, where they have potential that is potentially unknown even to themselves, and working with them to draw out that potential in a way that will benefit both the organization and the individual long term. Being a technical guide or at least a sounding board. Being able to task them outside of their comfort zone to challenge them. Being able to acknowledge their work and commitment, and provide the constant feedback which improves morale and personal satisfaction. It's vitally important, I think, that sysadmins know that they are doing a good job, are appreciated, and just know where they stand. Leaving "the office" on Friday knowing that you've done the best job that you can, whether it was a good week or bad, is really important. Stress builds up when you have that nagging voice in your head saying "You're not doing enough!", "They are gonna fire you any day now!", "You're a fraud, you've got them fooled now, but next week you'll hit a problem you can't solve and you're done!" Being able to just answer those fears on a weekly basis from your management can go a long, long way to making your life a lot less stressful and more enjoyable both on and off the job.
Now... I've watched organizations like SAGE and now LOPSA for years. I like that feeling of belonging to an organization. But I constantly ask myself why I should get behind these organizations. I'll be perfectly honest, in my mind at least, they are bureaucratic discount clubs. You save a couple bucks on books from this place or that, they organize conferences that I can't afford to attend, and .... um... provides meaningless mailing lists? Whats the point? They "advance the profession of systems administration". Um, how? By doing conferences? LISA and USENIX are great conferences, I'm told, but I've got a family that I don't want to leave for a week, a job that demands constant attention, and paying for the conference, travel, hotels, etc, is only something thats possible by working for a company that is willing to flip the bill. So when a conference isn't running, what value do I get for my membership fees? For this reason I basically join up when I spruce up my resume simply to have it on there. I'll admit my own hypocritical view that I look down on any job applicant that isn't a member of one of these. At least IEEE and ACM give you access to some great resources (the ACM library is awesome).
Now... we pair these things up. I've grown more and more convinced that a useful thing to have would be a really solid mentoring program at some admin organization. A "Career Counselor In The Sky" of sorts. A formal program of peers which could provide long term support for admins in both technical and non-technical aspects of the job. Mentors that could answer day-to-day questions, but also ask the "Where do you want to go with your career?" questions, and help create personalized paths for admins to take regardless of where they work. Applicants would be paired up with a mentor, and mentors would themselves have mentors. If a solid program were in place that really helped people, I think it could actually achieve the goals that these organizations espouse.
Such a program would have to be very structured. Not just an informal "You email this guy if you have questions" hookup service. Rather, every participant should have a long term file to allow a stateful transition over time and chart progress. Weekly discussions by phone would provide the mentoring people need and the dispassionate feedback that fuels growth. Periodic full skills reviews could allow participants to see their own growth over time and home in on areas where they want to grow, rather than just floundering day-by-day based on immediate needs. This needs to be more than a "Sendmail is b0rked, any idea why?" Rather this is a higher level "What would you do in my situation?" ongoing conversation that provides the external viewpoint that helps individuals overcome roadblocks due to perspective. It provides that "You wanted to grow into Network Administration, but you keep growing your database skills, are you looking for opportunities to re-focus your energy? Are you growing into the wrong field?" check.
SA Mentoring isn't a new concept. Long before I'd even considered such a thing, discussions were flying. As recently as 2006, it was a topic at LOPSA, and apparently a very passionate one. In that note, Some thoughts on "mentoring", it's stated that it's been tried and failed, apparently several times, in the past. What those attempts were and how they failed I don't know. It's certainly not a simple task; you're going to have to go through all the same hurdles that a new business would go through. You've got to bootstrap an organization, implement procedures, bring in participants, connect everyone up, track their progress, and ensure that things don't break down after the initial "Seems like a good idea!" buzz wears off. There need to be incentives and rewards for participation, and they need to be recognized in the field. You have to build a track record and respect in the industry and become a stable facility that managers and peers can send people to who might benefit from it. You've got to really publicize it, so that the people who need the most help, those that feel alone and isolated, find it and have the opportunity to embrace it. It's no small task to be sure, and not something that can be done half-heartedly.
Is such a thing even possible? I think so. It would require a lot of commitment and a lot of time, but I think it's something that, if well executed, could become an invaluable resource that would take massive strides toward bringing the field together. Most consulting firms, such as Taos, already have systems like this in place for their consultants. It's not a unique or novel venture, it just isn't done on such a massive scale.
Will it happen? I don't know. Ego is a big problem and a tough one to overcome. Mentoring only works when there is real trust, and building that trust takes time and requires participants to drop their guard and open themselves up. Commitment comes from results, otherwise it's just another time sink, so you have to be improving people's lives in a measurable way, not just giving them one more person to answer to. Can people open up? A better question is, because of the ego thing, would such a program have far more volunteers to be mentors than to be mentored? Because of that I think you have to ensure that mentors are themselves mentored. And would potential participants be willing to take that first step? All that gets easier over time as you build a reputation around the program, but initially it's really gonna be hard.
If something like this happens I'll be thrilled... but until then, I'll just continue to try to develop myself to be the best possible technical manager I can be and do everything I can for my team.
I was asked to respond to this post: System Administration; an insider’s perspective. It's sort of general, but tries to capture some ideas on where a well rounded sysadmin should have some proficiency.
Lots of lists have appeared over time detailing what someone thinks a good sysadmin should know or be experienced with... but honestly, I've found that it really depends on what environment you're talking about. In a large corporate environment with lots of admins it's all about specialization. In small and medium sized companies you want a "jack of all trades" who can do punchdowns for the new PBX and implement HA databases. In startup environments it's all about drive and raw talent, not so much experience.
I've been lucky in my career to have worked at very large sites (Sun, Cadence, Fujitsu, MCI Systemhouse), small and medium sites (Homestead, Clarify), and now a "startup" (Joyent). The demands of each are very different.
Of the 3 different types of sites, I can definitely tell you I don't like small and mediums. They tend to be more conservative than they should because of cash constraints and they often want to play like the big dogs but without spending big dog bucks. Startups are far more agile and many of them that I've worked with at Joyent have a refreshing "whatever it takes, just make it happen" approach.
Many admins don't like big mega-corps... but I've always liked them. You get to work with a big crew and specialization is a key to success. All those big fat perks are kinda nifty too. Best of all, it's nice to have some redundancy so that you can take a vacation (or a weekend) without the pager. Besides, I have this weird fascination with tile bathrooms... can't say why, but it's like my little "You've made it buddy!" benchmark. Linoleum bathrooms are a big "Run as fast as you can!" indicator.
My advice for admins is all about attitude and approach...
There is a lot of ego in this job. There needs to be to survive in a job where no matter how much you know, no matter how much you've learned, you're always confronted with the one question or problem you just don't know the damned answer to and you can't let that show. And, frankly, we hold the keys to a company. If a sysadmin goes psycho we can put a business out of business... push this button and kiss your company goodbye. It's actually shocking that doesn't happen more often. It's the famous Office Space "My, My Stapler...I can set fire to the building."
It's a tough gig. Long hours, huge expectations, a pat on the back if you're lucky when things are good, a gun to the head if you make a mistake. It's not easy. It takes a very special type of character to do what we do, and I think anyone brave enough to do it and to stick with it is commendable.
And so the best advice I can give is...
When you talk with an admin during an interview or just over a Guinness you can tell a lot about them, and it has nothing to do with how many times they've configured Postfix or used Sun Cluster. It's about their character and their passion for technology.