Tuesday, July 31, 2012

File transfer through Windows Remote Desktop client

I can't believe I have never known about this feature of the Windows Remote Desktop client. I have been using Windows since version 3.1 and I use the Remote Desktop client (RDP) almost every day at work. I have done an informal poll of several people I work with and none of them knew about this either. The feature I am talking about is mapping local drives to a remote machine using only the Remote Desktop client. I only ran across this myself while doing some reading about managing Amazon EC2 instances. I'm not sure why this feature is buried so deep in the options. Anyway, here is how to use it:


Launch Remote Desktop and click on the 'Local Resources' tab.

I have used the Local Resources tab many times to change printer and audio settings. That 'More...' button hides some neat features. Click on 'More...' 

After clicking 'More...' you will have a list of all the drives available on your local PC. Check the box next to the drives you want to have mapped to the remote system.

After you have made the Remote Desktop connection to the remote machine, this is how the drive appears on the remote system. There is an 'Other' section in the list of drives. Now you can copy files back and forth to the remote system directly through the Remote Desktop client connection!
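
If you save your connection settings out to a .rdp file, the same drive mapping can be turned on there as well. This is just a sketch (the server name below is a placeholder); as far as I can tell, the drivestoredirect line is what those 'More...' checkboxes end up setting, with '*' meaning all local drives:

full address:s:remote-server.example.com
drivestoredirect:s:*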

Wednesday, July 11, 2012

NetApp deduplication for VMware

Being a skeptic I usually don't believe something until I see some hard evidence. When I am dealing with claims made by a company trying to sell me something, I don't believe anything until I see results with my own two eyes. At work we recently began upgrading from a NetApp FAS270C to a FAS2240-2. We needed to upgrade to a faster filer anyway, so deduplication wasn't the only selling point, but it definitely was of interest to me. We are planning on migrating our VMware virtual machine disks from local storage on our various ESXi servers to centralized storage on the NetApp over NFS mounts. NetApp deduplication has been around for a while now and NetApp recommends enabling dedup for all VMware volumes. My NetApp sales rep also told me that tons of his other customers were using NFS mounts along with dedup and seeing disk space savings of 20-50%. Based on all of that info, and finally having the budget to purchase a new filer, I decided it was time to try dedup out in my environment.

I began testing a few weeks ago by creating a few test VMs on the NFS mounted volume, and after that went well I moved on to migrating a few existing non-critical VMs to the NFS mount. The performance over NFS was quite good, and after letting things run for about a week I did not see anything obviously wonky with the virtual machines, so I decided to enable deduplication, or "Storage Efficiency" as NetApp calls it. One thing to note is that deduplication only works on data added after it has been enabled. So if you have an existing volume that is already filled with data you won't see much benefit unless you tell the NetApp to scan all the data on the volume.

HOW TO

So let's start with the command to manage dedup on a NetApp. The command is named 'sis'. Running sis with no options will give you the list of available options:
netapp2240-1> sis
The following commands are available; for more information
type "sis help "
config              off                 revert_to           status
help                on                  start               stop

The sis status command will show you if dedup is enabled.
netapp2240-1> sis status          
Path                           State      Status     Progress
/vol/testvol                   Disabled   Idle       Idle for 02:12:30
/vol/vol_prod_data             Enabled    Active     70 GB Scanned

The sis on /vol/volname command will enable dedup on a volume.
netapp2240-1> sis on /vol/testvol
SIS for "/vol/testvol" is enabled.
Already existing data could be processed by running "sis start -s /vol/testvol".
Notice that helpful message about processing already existing data? The default schedule once dedup is enabled is to run the process once a day at midnight. You can kick off the process manually with the sis start /vol/volname command. The start command has a '-s' option which will cause the dedup scan to process all of the existing data looking for duplication.
netapp2240-1> sis start -s /vol/testvol
The file system will be scanned to process existing data in /vol/testvol.
This operation may initialize related existing metafiles.
Are you sure you want to proceed (y/n)? y
The SIS operation for "/vol/testvol" is started.
netapp2240-1> Wed Jul 11 14:10:06 CDT [aus-netapp2240-1:wafl.scan.start:info]: Starting SIS volume scan on volume testvol.
You can use the sis status command to monitor the progress of the deduplication process.
netapp2240-1> sis status
Path                           State      Status     Progress
/vol/testvol                   Enabled    Active     4 KB (100%) Done
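
To see how much disk space dedup is actually saving on a volume, the 'df -s' command reports the used space, the saved space, and the percentage saved. I've left the numbers out of this example since it is only meant to show the format:

netapp2240-1> df -s /vol/testvol
Filesystem                used      saved       %saved
/vol/testvol/              ...        ...          ...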

RESULTS

For my volume that is storing VMware virtual machine disks I am seeing an unbelievable 59% savings of disk space. It's pretty crazy. I keep adding virtual machine disks to the volume and the used space hardly grows at all. So far all of the virtual machines I have put on this volume are Linux. I expect once I start adding some Windows VMs the savings will go down somewhat.


To highlight the importance of using the '-s' option to process all existing data I have this example from a volume that is used as a file share for user data. We enabled dedup and after several nightly dedup runs we were disappointed to see almost no savings.

Dedup enabled but without initially using the '-s' option.
I knew something wasn't right. I had a hunch that the users had more than 122MB of duplicate files out of 450GB of data. In doing research for this blog post I discovered the '-s' option. We kicked off a manual dedup process with the '-s' option and check out the results.

After reprocessing with '-s'.
We freed up 225GB of disk space with one simple command (and the command wasn't rm * ;-).

I recommend enabling deduplication on any file share volumes or VMware volumes. You will probably see more savings with the VMware volumes because multiple copies of operating systems will have lots of duplicate files. So far I have seen between 15-30% savings for file share volumes and up to 59% savings for VMware volumes.

Friday, July 6, 2012

pyMCU and DS18B20 temperature sensors

As part of my Garage Monitor project I am using a Maxim DS18B20 digital temperature sensor with a pyMCU microcontroller. I found lots of pages and sample code for using the DS18B20 with Arduino boards and various other microcontrollers but nothing for the pyMCU. The pyMCU uses a Python library published by the board manufacturer to control it. The DS18B20 uses the 1-Wire communication bus and the pyMCU Python library includes functions for 1-Wire communication called owWrite and owRead. The tricky part is figuring out the command sequence to coax a temperature reading out of the sensor. The communication with the sensor involves writing several hex values to it and then reading a hex temperature value. That hex value is then converted to a decimal value.

It took me the better part of three evenings to figure out the sequence of writes and reads. The other thing that was a bit difficult was figuring out when to send the reset pulses.

The pyMCU 1-Wire function uses this format:

owWrite(pinNum, mode, writeData)

The mode value determines if a reset pulse is sent before or after the data. Here is how I wired it:


And here is the code that I wrote:

#!/usr/bin/python
#
# Written by Matthew McMillan
# matthew.mcmillan at gmail dot com
#
# This code reads a temperature value from
# a DS18B20 sensor using a pyMCU board. It only
# reads from a single sensor.
#
import pymcu           # Import the pymcu module

# Function to read value from DS18B20 temperature sensor
# Need to pass digital pin number and flag for F or C units
#
# 0x33  READ ROM. Read single device address.
# 0xCC  SKIP ROM. Address all devices on the bus
# 0x44  CONVERT T. Initiate temperature conversion
# 0xBE  READ SCRATCHPAD. Initiate read temp stored in the scratchpad.

def readtemp(pin,ForC):
    mb.owWrite(pin,1,0x33)          # READ ROM (mode 1 adds a reset pulse)
    ReadVal = mb.owRead(pin,0,8)    # 64-bit ROM code (unused here, only one sensor)
    mb.owWrite(pin,1,0xCC)          # SKIP ROM - address all devices (with reset pulse)
    mb.owWrite(pin,0,0x44)          # CONVERT T - start a temperature conversion
    mb.owWrite(pin,1,0xCC)          # SKIP ROM again before the next command
    mb.owWrite(pin,0,0xBE)          # READ SCRATCHPAD
    ReadVal = mb.owRead(pin,0,12)   # scratchpad data; temp LSB is byte 0, MSB is byte 1
    # Combine the MSB (byte 1) and LSB (byte 0) of the temperature
    # register into a single 16-bit value. Shifting the bytes avoids
    # dropping leading zeros, which slicing the hex() strings could do.
    DecVal = (ReadVal[1] << 8) | ReadVal[0]
    TempC = (DecVal * 0.0625)
    if ForC:
        TempF = ((TempC*9)/5)+32
        TempFR = round(TempF, 1)
        return TempFR
    else:
        TempCR = round(TempC, 1)
        return TempCR

################
# Main program
################

# Initialize mb (My Board) with mcuModule Class Object
mb = pymcu.mcuModule() 

#  Need to pass digital pin number and flag for ForC
tempout = readtemp(7,1)

print 'Temp: ' + str(tempout)
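
If you want to watch the temperature over time, a simple polling loop on top of the readtemp function above does the trick. This is just a sketch; the five second interval and pin 7 are arbitrary choices:

import time

# Read the sensor on pin 7 every 5 seconds and print the value in Fahrenheit
while True:
    tempout = readtemp(7,1)
    print 'Temp: ' + str(tempout)
    time.sleep(5)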


Please leave a comment if you found this helpful. I found this page on hackaday.com very helpful in figuring this out.

Sunday, July 1, 2012

Raspberry Pi Heat Sinks

A friend of mine noticed that his Raspberry Pi got quite warm while playing 1080p video. I am planning on using my Raspberry Pi in the garage, which gets pretty dang hot here in Texas, so we were both interested in adding additional cooling to the Pi. He found a post talking about making custom heat sinks out of old CPU heat sinks. One of the commenters on that post suggested using Zalman VGA Ram Heatsinks. Fry's Electronics has the silver-colored version of these heat sinks for only $4.99. They come in a pack of eight and you only need three for each Raspberry Pi.

The Raspberry Pi CPU (or SoC to be completely accurate) is 12mm x 12mm and so are the Zalman VGA Ram Heatsinks, so adding a heat sink to the CPU is easy enough. You can just peel off the backing of the thermal tape and stick it on the CPU. There are two other components that would benefit from cooling as well. One is the USB/Ethernet controller and the other is the voltage regulator. For these two chips I had to cut down the heat sinks so they would fit without touching other components on the board. I used my Dremel tool with a cutoff wheel to trim them down to size and then a sanding drum to clean up the edges. The thermal tape that was on the heat sinks got pretty messed up during the cutting, but the pack of eight came with two extra pieces of thermal tape so I trimmed down those extra pieces. You could also use thermal compound to stick them on if you have some of that.

Location of the various chips that may need cooling.

I just eyeballed the closest fins that would fit the chip and marked it with an ultra-fine-point Sharpie marker.

After cutting the heat sinks with my Dremel tool.

Heat sinks installed on my Raspberry Pi.

Saturday, March 17, 2012

Measuring light levels and temperature with a pyMCU board

My pyMCU board running a light sensor and a Maxim DS18B20 digital temperature sensor.

Detailed logging for chrooted sftp users


At work we have been migrating some of our customers from ftp to sftp. This gives us and the customer better security, but one drawback with my initial sftp setup was that we didn't have detailed logs like most ftp servers produce. All we were getting in the logs were records of logins and disconnects. We didn't have any information on what a client was doing once they were connected: file uploads, file downloads, etc. I had some time this morning to take a look at this. I started by doing some Google searches for 'sftp logging'.

I found a lot of blog posts saying that all you had to do was change this line in sshd_config:
ForceCommand internal-sftp
to:
ForceCommand internal-sftp -l VERBOSE
I tried this but didn't get any additional logging. What I finally figured out is that the logging setup for chrooted sftp is a bit more involved. I ran across this blog which spells out what needs to be done quite clearly. The meat of the problem is that the chrooted sftp process can't open /dev/log because it is not within the chrooted filesystem. An additional layer of complexity is that my sftp home directories exist on an NFS mount. Here are the steps from bigmite.com's blog that I used for my CentOS system.

1. Modify /etc/ssh/sshd_config
Edit /etc/ssh/sshd_config and add -l VERBOSE -f LOCAL6 to the internal-sftp line.
Match group sftpuser
 ChrootDirectory /sftp/%u
 X11Forwarding no
 AllowTcpForwarding no
 ForceCommand internal-sftp -l VERBOSE -f LOCAL6

2. Modify the syslog configuration
If the users' sftp directories are not on the root filesystem, syslog will need to use an additional logging socket within the users' filesystem. For example, /sftp is the separate sftp filesystem (like my setup, with the sftp home directories on an NFS mount). For syslog on Redhat/CentOS edit /etc/sysconfig/syslog so that the line:
SYSLOGD_OPTIONS="-m 0"
reads:
SYSLOGD_OPTIONS="-m 0 -a /sftp/sftp.log.socket"
To log the sftp information to a separate file the syslog daemon needs to be told to log messages for LOCAL6 to /var/log/sftp.log. Add the following to /etc/syslog.conf:
#For SFTP logging
local6.* /var/log/sftp.log
Restart syslog with the command service syslog restart. When syslog starts up it will create the sftp.log.socket file.


3. Create links to the log socket
Now you will need to create a link in each user's chrooted home directory so the chrooted sftp process can write to the log. This will also need to be done every time you create a new user (see the loop sketch after these commands).
mkdir /sftp/testuser1/dev
chmod 755 /sftp/testuser1/dev
ln /sftp/sftp.log.socket /sftp/testuser1/dev/log
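
Since those three commands have to be repeated for every new account, a small loop can redo the links for everyone in one shot. This is just a sketch that assumes every chrooted home lives directly under /sftp and that the log socket is /sftp/sftp.log.socket:

for userdir in /sftp/*/ ; do
    mkdir -p "${userdir}dev"
    chmod 755 "${userdir}dev"
    ln -f /sftp/sftp.log.socket "${userdir}dev/log"
done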


And that's it! Now sftp will log everything an sftp user does while connected to your server. Here is a sample of what the logs look like:

Mar 16 15:36:45 sftpsrvname internal-sftp[2449]: session opened for local user sftpusername from [192.168.1.10]
Mar 16 15:36:45 sftpsrvname internal-sftp[2449]: received client version 3
Mar 16 15:36:45 sftpsrvname internal-sftp[2449]: realpath "."
Mar 16 15:37:13 sftpsrvname internal-sftp[2449]: lstat name "/"
Mar 16 15:37:13 sftpsrvname internal-sftp[2449]: lstat name "/"
Mar 16 15:37:13 sftpsrvname internal-sftp[2449]: opendir "/"
Mar 16 15:37:13 sftpsrvname internal-sftp[2449]: closedir "/"
Mar 16 15:37:21 sftpsrvname internal-sftp[2449]: realpath "/backup"
Mar 16 15:37:21 sftpsrvname internal-sftp[2449]: stat name "/backup"
Mar 16 15:37:33 sftpsrvname internal-sftp[2449]: lstat name "/backup"
Mar 16 15:37:33 sftpsrvname internal-sftp[2449]: lstat name "/backup/"
Mar 16 15:37:33 sftpsrvname internal-sftp[2449]: opendir "/backup/"
Mar 16 15:37:33 sftpsrvname internal-sftp[2449]: closedir "/backup/"
Mar 16 15:37:37 sftpsrvname internal-sftp[2449]: open "/backup/testfile" flags WRITE,CREATE,TRUNCATE mode 0664
Mar 16 15:37:37 sftpsrvname internal-sftp[2449]: close "/backup/testfile" bytes read 0 written 288
Mar 16 15:41:45 sftpsrvname internal-sftp[2449]: lstat name "/backup"
Mar 16 15:41:45 sftpsrvname internal-sftp[2449]: lstat name "/backup/"
Mar 16 15:41:45 sftpsrvname internal-sftp[2449]: opendir "/backup/"
Mar 16 15:41:45 sftpsrvname internal-sftp[2449]: closedir "/backup/"
Mar 16 15:42:16 sftpsrvname internal-sftp[2449]: lstat name "/backup/testfile"
Mar 16 15:42:16 sftpsrvname internal-sftp[2449]: remove name "/backup/testfile"
Mar 16 15:42:24 sftpsrvname internal-sftp[2449]: session closed for local user sftpusername from [192.168.1.10]


Sunday, March 4, 2012

Troubleshooting a NetApp Snapmirror problem

Last night I started getting Nagios alerts that one of my Snapmirror replication sessions was falling way behind its normal schedule. The source filer is a critical part of every web application we run, and this failing Snapmirror session is critical to our disaster recovery plan.
I started by checking 'Filer At-A-Glance' on the source and destination filers and my MRTG charts. The CPU utilization was very high on the source filer and it was throwing syslog errors every minute saying "pipeline_3:notice]: snapmirror: Network communication error" and "SnapMirror source transfer from srcname to dstname: transfer failed." The CPU was at 45% - 50% when it normally runs in the 3% - 8% range. The destination filer was throwing syslog errors about failed transfers and about destination snapshots that were in use and couldn't be deleted. For example "[snapmirror.dst.snapDelErr:error]: Snapshot dst-netapp(543534345435)_volname.12313 in destination volume volname is in use, cannot delete."

The high CPU was really worrying because I was picturing that come Monday morning the additional load of web site traffic might cause the source NetApp to fall over dead. I let it run overnight thinking the CPU load might be caused by a batch job, sort of hoping it would clear up by the morning and the Snapmirror would recover once the load dropped. Unfortunately I woke up to the exact same situation.

This morning I started with some triage. The first task was to get the CPU load down so we could at least function tomorrow. Since this all seemed related to Snapmirror I started by turning off Snapmirror altogether on both filers. The command line to do this is 'snapmirror off'. Voilà, the CPU load on the source filer immediately dropped down to normal levels. Now, worst case, we could at least function come Monday morning even if that meant not having the DR site up to date. I also disabled the Snapmirror schedule for this particular job, which normally runs every fifteen minutes. With the schedule for the troublesome job disabled I turned Snapmirror back on (snapmirror on).

Next I wanted to test Snapmirror on a separate volume to see if Snapmirror was generally broken or if the problem was specific to that particular volume. To do this I created a new small (1GB) volume on both the source and destination filers. I copied a few hundred megs of data into it and then set up snapmirror for the volume. I hit Initialize and it worked just fine. Hmmm... ok, so Snapmirror works. It must be a problem with that particular volume.

From my CPU chart I had a pretty good idea of what time things went wrong. I was starting to consider the possibility that some strange combination of events took place at around 7pm which somehow interrupted/damaged the Snapmirror sync that took place at that time. Looking at the snapshots on both the source and destination filers, I saw that I still had a couple of hourly snapshots and one nightly snapshot in common between both filers from before the CPU spike. So my next plan was to delete snapshots on the destination filer until I was back at a spot prior to the CPU spike. I had to break the snapmirror relationship to make the destination filer read/write (snapmirror break volname). Then I deleted the destination snapshots that took place after and just prior to the CPU spike. Now replication on the destination filer was set back in time to a point prior to the problem. I was desperately trying to avoid starting over with a full sync of 125GB of data! (I've summarized the destination-side commands below, after the resync output.) Fingers crossed, I ran resync (on the destination filer!) to reestablish the snapmirror relationship:

dst-netapp> snapmirror resync dst-netapp:volname
The resync base snapshot will be: hourly.1
These older snapshots have already been deleted from the source and will be deleted from the destination:
    hourly.4
    hourly.5
    nightly.1
Are you sure you want to resync the volume? y
Sun Mar  4 11:43:26 CST [snapmirror.dst.resync.info:notice]: SnapMirror resync of volname to src-netapp:volname is using hourly.1 as the base snapshot.
Volume volname will be briefly unavailable before coming back online.
Sun Mar  4 11:43:28 CST [wafl.snaprestore.revert:notice]: Reverting volume vol1a to a previous snapshot.
Sun Mar  4 11:43:28 CST [wafl.vol.guarantee.replica:info]: Space for replica volume 'volname' is not guaranteed.
Revert to resync base snapshot was successful.
Sun Mar  4 11:43:28 CST [snapmirror.dst.resync.success:notice]: SnapMirror resync of volname to src-netapp:volname successful.
Transfer started.
Monitor progress with 'snapmirror status' or the snapmirror log.

Phew!! Success! CPU is at 3% and snapmirror is working again.
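
For reference, the destination-side rollback boiled down to a short sequence of commands. The snapshot names here are only illustrative; the idea is to delete every destination snapshot newer than the last known-good one and then resync:

dst-netapp> snapmirror break volname
dst-netapp> snap list volname
dst-netapp> snap delete volname hourly.0
dst-netapp> snapmirror resync dst-netapp:volname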

The error messages that the source filer was throwing were less than helpful. Network communication error? WTH? The problem had nothing to do with networking. It must be a general error message when a snapmirror transfer fails. The destination filer syslog messages gave me the best clue to the problem. One word of caution: make sure you run the resync command on the correct filer! In this situation, if I had run the resync command on the source filer it would have reverted the source volume back to the previous afternoon, losing all the data in between.