Tuesday, May 20, 2008

VMware Server and Xen

In the past, I've played with VMware (as a "workstation" or "server" product, not the bare-metal one we have access to now) but was never quite happy with it. Some of the problems I had might have been because I ran it on Windows Vista, not Linux. However, from a VM point of view, the VM itself should be more or less identical.

Recently I tried out Xen on the same hardware, but using NetBSD/amd64 as the "host" OS.

Hardware

The machine is a Gateway GM5446E with a dual-core Intel Core 2 Duo and 3 GB of RAM. The machine has three SATA hard drives, connected via an Intel AHCI controller running in native AHCI mode.

Bare Hardware Baseline

I booted a standard NetBSD-current/amd64 kernel and ran some speed tests, which give a baseline for the dom0 and guest OS disk I/O speed tests. See below.

VMware

In my VMware install, I used the machine running "Windows Vista Media Center" -- it was what came pre-installed on the machine.

Host OS: Windows Vista Media Center

Guest OSs tried:

  • NetBSD-4.0/i386
  • NetBSD-4.0/amd64
  • NetBSD-current/i386
  • NetBSD-current/amd64
  • Linux Ubuntu (server, then-current version)
  • Linux Debian (then-current version)

Linux booted in 64-bit mode, but would crap out at some later point with issues similar to those NetBSD had.

One machine, named "nfsd", was dedicated to serving out home directories and source trees of NetBSD. The other guest OSs mounted /home or /netbsd-src from nfsd.

Each machine had a "local" disk to store object files from pkgsrc, src, and other OS-related builds.

General VMware Problems

I could not get NetBSD/amd64 (current or 4.0 release) to "self host" -- build /usr/src and kernels -- reliably. The guests would either silently lock up without reason, or they would crash with an odd CPU exception.

Timekeeping was wacky. Without running the VMware-supplied (closed source) tools on a guest, time was off, and apparently in uncorrectable ways. Running ntp made things seriously wacky, as the time would drift wildly while ntp tried to correct a guest.

The VMware closed-source tools are only available for a limited number of OS types, and then only for specific versions of many of them. They do supply a .so that is, in theory, linkable on many versions of Linux, but the install procedure warns loudly about compatibility.

VMware running on anything but Intel chips with synchronized cycle counters (which most OSs use for high-res timekeeping these days) was a disaster.

VMware Strengths

If the problems above are solved, VMware is a true virtual machine architecture that will run any OS without modification. VMware could run Windows guests, right along with unmodified NetBSD, FreeBSD, Linux, and Solaris/x86 guests.

Xen

I tried Xen 3.1.3.

Xen is a different architecture than VMware in that it prefers to use "paravirtualization" rather than a full virtual machine. It has a host machine (called "domain 0" or "dom0") which attaches to the physical hardware and acts as a conduit between the Xen hypervisor and hardware.

The boot process is that the Xen kernel is booted first, which then boots the dom0 host. Multiple domains can be created, serving different hardware, but in practice this is rarely done.

Each guest OS has a config file, and is started with "xm create /path/to/file.conf". This boots the guest OS and connects a serial console, which can be used with "xm console <name>".

Since the "dom0" is a fully functional OS in its own right, I have it serve NFS to the guest OSs.

I created the following guest OSs:

  • NetBSD-current/i386
  • NetBSD-current/amd64
  • Windows XP Pro (32-bit)
  • Windows Server 2003 (32-bit)

Yes, I managed to install Windows XP Pro and Server 2003. They run in a "vnc" console, and for all practical purposes look like Windows. This is using Xen's "hvm" -- which is a full hardware-emulated virtual machine, and allows running unmodified guest OSs. People have Vista running in a virtual machine under Xen, but I do not have "real" Vista install media or licenses, just the copy that came with, and is tied to, my hardware.

Xen also supports both realtime and offline "migration." As I have only one machine of the same type, I have not yet read up on how this works. The basics: A realtime copy is made of the guest's ram, device state, and other data. It is transmitted to the new destination, and synced up until a very small switchover time can be used to swap where that guest is running. Xen claims 100 ms switchover time is possible, but there are restrictions: The disks are NOT migrated, so must reside on a shared volume. The physical network each dom0 is on is also shared, in order to avoid disruption of TCP connections. I also believe fairly identical CPU and dom0 operating systems should be used.

Offline migration involves shutting the guest down, copying the disks over, and restarting it on a new dom0. This will, of course, interrupt service.

Xen Problems

It is difficult to configure for the first time. The documentation is... lacking. It is also only as solid as the host OS is, but VMware has the same issue in its "server" or "workstation" incarnations.

Xen is also very, very Linux-specific in documentation and examples. Most of these can be translated -- I certainly did so easily enough -- and this is being corrected in their documentation as more OSs are able to boot as dom0.

Xen Strengths

Timekeeping in Xen, since it is paravirtualized, is almost perfect. Small drifts will occur without running ntp, but all guests (and the host) can run ntp and obtain sanity.

It also appears that all guests and the dom0 "drift" identically, so this is probably related to hardware timekeeping issues. The measured drift of an uncorrected NetBSD guest was 4 seconds in two weeks. ntp correction kept the others in perfect real-world sync.
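
To put that drift figure in more familiar units, here is a minimal sketch of the arithmetic. It only uses the numbers quoted above (4 seconds over two weeks); the helper name drift_ppm is made up for illustration.

```ruby
# Convert an observed clock drift into parts per million (ppm).
# drift_ppm is a hypothetical helper name, not part of any tool.
def drift_ppm(drift_seconds, elapsed_seconds)
  drift_seconds.to_f / elapsed_seconds * 1_000_000
end

two_weeks = 14 * 24 * 60 * 60          # 1,209,600 seconds
ppm       = drift_ppm(4, two_weeks)    # 4 seconds of drift, as measured above
per_day   = 4.0 / 14                   # seconds of drift per day

puts format("%.2f ppm, %.2f s/day", ppm, per_day)
# prints "3.31 ppm, 0.29 s/day"
```

A few ppm is well within what an uncorrected PC clock typically manages, which fits the guess that this is a hardware timekeeping issue rather than a Xen one.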

It is as free as you want it to be. Support and commercial versions exist, but the free stuff works amazingly well.

Performance

On Xen, all tests were performed with the domUs running but idle, and no hvm guests running (Windows is just too unpredictable). On VMware, only one VM was active at once, and the Vista host was as idle as it could be made.

I measured three main things here:

  1. Boot speed: How fast a kernel gets from loading to the first /etc/rc message.
  2. Disk speed: read/write speed.
  3. CPU Performance.

Boot Speed

The dom0 boots as fast as any other kernel boots; it must probe the hardware, wait for hardware to change state, etc. No measured difference between a standard NetBSD-current/amd64 kernel and the dom0 kernel.

The domUs (guests) boot so fast it is nearly impossible to measure. This is because the devices they have access to are known -- all are on a virtual bus, and are directly enumerable, so there is no need to probe for devices, wait for them to change state, or time out when not present. As best I can measure, just under 2 seconds is a fair estimate.

The hvm (Windows) guest seems to be about as fast as Windows is. I did not analyze this one much.

On VMware, the guest OSs boot at about the same speed as a "real" machine boots unless a custom kernel is built with just "known present" devices. Even then, boot times are 15-20 seconds.

Disk Speed

In all host/dom0 tests, "iozone" version 3.263 was used, with a 1 GB file on the same disk. Each test was performed only once; for real comparison data we'd want to run it more than once, but this is just a first-pass test.

For native NetBSD/amd64, I had to increase the file size to 4 GB to avoid the cache, as the machine has 3 GB of ram.

  • wd0 is a 500 GB SATA 3.0Gb/sec disk.
  • wd1 is present but unused.
  • wd2 is a 320 GB SATA 1.5Gb/sec disk.

All are on different channels of an Intel AHCI controller running in native SATA mode.

For the Xen tests, all disks were mounted as files on the dom0 host. From dom0's point of view, the file is mounted on a "vnd" virtual disk, and that virtual disk is exported to the guest.

For the VMware test, all disks were mounted as files in the Windows filesystem.

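OS                           Disk  Block Size  Read   Write
netbsd-current/amd64 native  wd0   8192        60762  59949
netbsd-current/amd64 native  wd0   16384       60545  60141
netbsd-current/amd64 native  wd2   8192        78342  76342
netbsd-current/amd64 native  wd2   16384       78311  75252
netbsd-current/amd64 dom0    wd0   8192        60641  60109
netbsd-current/amd64 dom0    wd0   16384       60459  61919
netbsd-current/amd64 dom0    wd2   8192        80258  79102
netbsd-current/amd64 dom0    wd2   16384       80187  80295
netbsd-current/amd64 domu    wd0   8192        51205  24004
netbsd-current/amd64 domu    wd0   16384       51714  27971
netbsd-current/amd64 domu    wd2   8192        77990  23997
netbsd-current/amd64 domu    wd2   16384       77282  22496
netbsd-current/i386 domu     wd0   8192        41730  25012
netbsd-current/i386 domu     wd0   16384       42008  24543
netbsd-current/i386 domu     wd2   8192        66401  26048
netbsd-current/i386 domu     wd2   16384       66201  28910
netbsd-current/i386 vmware   wd0   8192        25014  13912
netbsd-current/i386 vmware   wd0   16384       25417  13771
netbsd-current/i386 vmware   wd2   8192        38831  16100
netbsd-current/i386 vmware   wd2   16384       38994  16332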
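
The raw numbers are easier to compare as a fraction of the bare-hardware baseline. The following is a minimal sketch that does this for the wd0, 8192 block size rows only. The figures are copied from the table above; the hash layout and the helper name percent_of are assumptions made for illustration, and single-run numbers like these should not be read too precisely.

```ruby
# Throughput relative to the native baseline, for wd0 at block size 8192.
# All figures are copied from the iozone table above.
NATIVE = { read: 60762, write: 59949 }

RESULTS = {
  "amd64 dom0"  => { read: 60641, write: 60109 },
  "amd64 domU"  => { read: 51205, write: 24004 },
  "i386 domU"   => { read: 41730, write: 25012 },
  "i386 vmware" => { read: 25014, write: 13912 },
}

# Express value as a percentage of baseline, rounded to one decimal place.
def percent_of(value, baseline)
  (100.0 * value / baseline).round(1)
end

RESULTS.each do |name, r|
  puts format("%-12s read %5.1f%%  write %5.1f%%",
              name,
              percent_of(r[:read], NATIVE[:read]),
              percent_of(r[:write], NATIVE[:write]))
end
```

Seen this way, the file-backed domU keeps most of its read speed but loses more than half of its write speed, while the VMware guest loses more than half of both.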

I also repeated one test with a raw, physical partition mounted in the netbsd-current/amd64 domU, which bypasses the "double filesystem" issue:

OS                                         Disk  Block Size  Read   Write
netbsd-current/amd64 domu, physical mount  wd2   8192        79915  76992
netbsd-current/amd64 domu, physical mount  wd2   16384       79744  77102

CPU Performance

Each CPU speed test was run with: Dhrystone Benchmark, Version 2.1 (Language: C) Program compiled without 'register' attribute.

I used an iteration count of 1,000,000,000 for each test.

Operating System             Dhrystones per second
netbsd-current/amd64 native  11,013,216
netbsd-current/amd64 dom0    10,365,917
netbsd-current/amd64 domu    11,130,899
netbsd-current/i386 domu      4,935,347
netbsd-current/i386 vmware    5,012,123

Just for grins, I ran the following tests, one dhrystone on one guest and another on a different one. Since each guest is uniprocessor in my configuration, I did not run two benchmarks on the same guest.

Operating Systems                       Speed 1     Speed 2
Running both domu/i386 and domu/amd64   4916421.0   11135857.0
Running both dom0/amd64 and domu/amd64  10298661.0  11135857.0
Running both dom0/amd64 and domu/i386   10373444.0  4921260.0

Conclusions

Xen is production ready.

When the guest OS can be modified, much higher performance numbers are obtained vs. the low-end VMware server I ran.

While it might be extremely tempting to build one guest for each very specific function, this probably does not scale: memory is pre-allocated and dedicated to a guest, and while some swapping is allowed, it will slow the guest at seemingly random times; disk can be overcommitted, but the OS sees failure to allocate a block as a hardware failure; and the more guests there are, the higher the maintenance cost: maintaining packages on each guest, upgrading, etc.

VMware "hmx", or whatever the run-on-bare-metal product is called, should be tested.

I'd love to install Xen on a huge machine with lots of ram and many, many CPUs as a test. Would someone like to ship me a 4 CPU quad core with 64 GB?

Wednesday, April 23, 2008

Your cable company owns you

Well, ok, perhaps not entirely... yet.

This is actually a rant on something cable modems allow your cable internet provider to do to you.

They restrict access to your own hardware.

Why would they do this? Paranoia. A while back, there was a security hole in a network monitoring protocol called Simple Network Management Protocol, or SNMP. This security issue allowed people to crash other people's modems, break into their own and change upload/download speeds, and do other nasty things.

All of these have been fixed. However, people are still breaking into their modems to "uncap" them -- change speed settings. They just don't use SNMP to do it anymore. They've become more advanced and use things like internal serial ports or JTAG ports.

So, why do cable companies still restrict access to SNMP, and worse, to some of your modem's diagnostic features? I suspect it is because they don't want to have to answer questions about why they suck. They hide the real details of what your modem is doing from you.

Why is this a big deal?

For one, I own the hardware, but my cable company configures it against my wishes. I can understand rate limiting -- I pay for the fastest service already -- but I cannot understand restricting diagnostic tools.

For two, I have spent, in the last 6 months, perhaps 40 hours debugging a cable internet issue with techs from Cox Communications. After many, many rounds of techs who report "all signal levels are good" I finally got a real live network engineer on the line, who, in 5 minutes, could look at all the statistics on my modem. And solve problems.

Monday, March 24, 2008

Checking Credit Card Numbers in Ruby

This is not meant to be an exhaustive list of all possible numbers, nor the only or best method to verify that they pass the "checksum" test, but here's what I came up with.

I wrote this mostly to link a Ruby version of the code to Wikipedia's article on Luhn checksum validation, since nearly every other language in use was listed, but Ruby was sadly missing.

#!/usr/bin/env ruby

#
# Copyright (c) 2008 Michael Graff.  All rights reserved.
#
# Redistribution and use in source and binary forms, with or
# without modification, are permitted provided that the following
# conditions are met:
# 1. Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above
#    copyright notice, this list of conditions and the following
#    disclaimer in the documentation and/or other materials provided
#    with the distribution.
# 3. The name of Michael Graff may not be used to endorse or promote
#    products derived from this software without specific prior
#    written permission.
#
# THIS SOFTWARE IS PROVIDED BY Michael Graff ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
# PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL Michael Graff
# BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
# TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
# OF SUCH DAMAGE.
#

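class Luhn
  # Returns true if the digits in s pass the Luhn checksum.
  # Non-digit characters (spaces, dashes) are ignored.
  def self.check_luhn(s)
    # Use gsub rather than gsub! so the caller's string is not modified.
    digits = s.gsub(/[^0-9]/, "").reverse.split(//)

    alternate = false
    total = 0
    digits.each do |c|
      if alternate
        total += double_it(c.to_i)
      else
        total += c.to_i
      end
      alternate = !alternate
    end
    (total % 10) == 0
  end

  # Double a digit; if the result has two digits, add them together.
  def self.double_it(i)
    i = i * 2
    if i > 9
      i = i % 10 + 1
    end
    i
  end
  # A bare "private" does not apply to "def self." methods, so hide
  # the helper explicitly.
  private_class_method :double_it

end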

if $0 == __FILE__
  def test_valid(s)
    result = Luhn::check_luhn(s)
    if result
      puts "VALID: #{s}"
    else
      puts "INVALID: #{s} (should be valid)"
    end
  end

  test_valid('5105 1051 0510 5100') # Mastercard
  test_valid('5555 5555 5555 4444') # Mastercard

  test_valid('4222 2222 2222 2')    # Visa
  test_valid('4111 1111 1111 1111') # Visa
  test_valid('4012 8888 8888 1881') # Visa

  test_valid('3782 8224 6310 005')  # American Express
  test_valid('3714 4963 5398 431')  # American Express
  test_valid('3787 3449 3671 000')  # American Express Corporate
  test_valid('3782 8224 6310 005')  # Amex
  test_valid('3400 0000 0000 009')  # Amex
  test_valid('3700 0000 0000 002')  # Amex

  test_valid('38520000023237')      # Diners Club (14 digits)
  test_valid('30569309025904')      # Diners Club (14 digits)

  test_valid('6011111111111117')    # Discover (16 digits)
  test_valid('6011 0000 0000 0004') # Discover
  test_valid('6011 0000 0000 0012') # Discover
  test_valid('6011000990139424')    # Discover (16 digits)
  test_valid('6011601160116611')    # Discover (16 digits)

  test_valid('3530111333300000')    # JCB (16 digits)
  test_valid('3566002020360505')    # JCB (16 digits)

  test_valid('5431111111111111')    # Mastercard (16 digits)
end
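
The self-test above only feeds in numbers that should pass. As a complement, here is a minimal, self-contained sketch (independent of the Luhn class, with made-up helper names luhn_sum, luhn_valid? and luhn_check_digit) that also shows a failing number and how the final check digit can be derived from the other digits.

```ruby
# Luhn sum of the digits in number, ignoring any non-digit characters.
# Counting from the right, every second digit is doubled, and doubled
# values above 9 have 9 subtracted (the same as adding their two digits).
def luhn_sum(number)
  digits = number.gsub(/[^0-9]/, "").reverse.chars.map(&:to_i)
  digits.each_with_index.map do |d, i|
    next d if i.even?
    doubled = d * 2
    doubled > 9 ? doubled - 9 : doubled
  end.sum
end

# A number passes when its Luhn sum is a multiple of 10.
def luhn_valid?(number)
  luhn_sum(number) % 10 == 0
end

# Digit that must be appended to payload to make the whole number pass.
def luhn_check_digit(payload)
  (10 - luhn_sum(payload + "0") % 10) % 10
end

puts luhn_valid?("4111 1111 1111 1111")   # true
puts luhn_valid?("4111 1111 1111 1112")   # false, last digit altered
puts luhn_check_digit("411111111111111")  # 1
```

Passing the checksum only says the number is well formed. It says nothing about whether the card exists.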

Wednesday, March 19, 2008

JavaScript application framework 'extjs' and privacy

Out of the box, extjs version 2.0.2 leaks privacy information.

If you fail to change the value of Ext.BLANK_IMAGE_URL to something local, it will default to http://extjs.com/s.gif. At first this might not seem bad, but remember that every time this image is fetched the referring URL is sent to the extjs.com web server.

At best, this is a minor information leak. Depending on what you might place in your URL line, this could be a major issue.

I have posted a comment on the extjs forums, but so far the developers don't see the problem. They say it is well documented in their FAQ, and that it is documented in the API docs.

I would prefer they opt for a warning message saying "You did not set ..." rather than leaking information by default. I'll probably have to post a CERT on this one.

Wednesday, October 31, 2007

Fun With Apache and Virtual Hosts

Specifically, name-based virtual hosts.

I recently tried to add IPv6 support to my web server. I used to have it, I remember having it, so this should not be all that hard.

After an hour of hacking, I ended up finding two gotchas:

  • Make certain, and I mean certain, that all virtual hosts for name-based servers have a unique ServerName line.
  • Make certain, and I mean certain, to save your original configuration files.

Yeah, I know, I should have known better. But this is a simple thing to change, right?

A very useful tool is apachectl -S, which lists all virtual hosts. Even better is to run that output through sort.
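
The first gotcha can also be checked before Apache is ever restarted. This is a minimal sketch under simplifying assumptions: it looks at the raw configuration text, treats every ServerName line as if it belonged to the same address and port, and does not follow Include directives. The function name and the example.org host names are hypothetical.

```ruby
# Return the ServerName values that appear more than once in the given
# Apache configuration text. Host names are compared case-insensitively.
# Simplification: address/port scoping and Include files are ignored.
def duplicate_server_names(config_text)
  names = config_text.each_line.map do |line|
    m = line.match(/^\s*ServerName\s+(\S+)/i)
    m && m[1].downcase
  end.compact

  names.group_by { |n| n }.select { |_, seen| seen.size > 1 }.keys
end

# Hypothetical sample configuration with one repeated name.
SAMPLE = <<~CONF
  <VirtualHost *:80>
    ServerName www.example.org
  </VirtualHost>
  <VirtualHost *:80>
    ServerName blog.example.org
  </VirtualHost>
  <VirtualHost [::]:80>
    ServerName www.example.org
  </VirtualHost>
CONF

puts duplicate_server_names(SAMPLE).inspect
```

A report from this is only a hint. The same name on two different addresses can be perfectly legitimate, so apachectl -S remains the authority on what Apache actually parsed.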

Sunday, October 28, 2007

Mongrel, Apache, and Rails

When I first started running Rails applications on my web server, I chose to use FastCGI. Specifically, the mod_fcgid module, which had some features I wanted. It also has the unfortunate by-product of corrupting Apache's memory. Bad news.

I've since removed FastCGI entirely and moved to a proxy to mongrel_cluster setup. And I've started deploying with Capistrano.

Capistrano

I had a certain amount of concern about moving to a deployment system I knew very little about. Just like a new backup system, I feel like I'm handing the keys to my data over to something not written by me. And, while it is fairly simple to set up, Capistrano is somewhat complicated internally.

I already push out my operating system upgrades in an automated way. I compile NetBSD on one machine here at home, and push the binaries out to all my machines. This means about 7 machines rsync from the build box with one command. This can be scary, but I've been doing it for 5 years now, and it just works. How can a web site be scary compared to kernels and system binaries?

The answer is, it's not. If something breaks it is fairly easy to manually reconfigure if I need to. So, I've relaxed a bit. My concerns are still there, and I'm keeping a careful watch on how Capistrano runs each time I deploy. I have yet to do a *real* deployment after all! So far, I've not done a single migration, and have not had to roll back. And I'm pushing to a single machine, which runs the database as well as the site.

I suspect that, as I become comfortable with this new method to update my web sites, I'll start thinking of it as rsync++. It really is that simple.

mongrel_cluster

Mongrel is a very amazing little widget. Sure, it's slower than Apache, but that's ok. Mongrel is still far, far faster than restarting Rails for each web hit, and far more reliable than mod_fcgid.

In my configuration, I run each site on ports 10000, 10010, 10020, etc. with up to 3 servers per. This means application #1 is on 10000 through 10002, with room to grow should I need to run more. If I find myself running more than 10 servers for a site it needs a new machine anyway, or more machines. And if that happens, I hope I'll have a budget.
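
The numbering scheme is simple enough to write down as code. This is a minimal sketch of the arithmetic described above; the constant and function names are made up for illustration, and sites are counted from zero.

```ruby
BASE_PORT      = 10_000  # first port of the first site
PORTS_PER_SITE = 10      # each site gets a block of ten ports

# Ports used by the Mongrel servers of one site.
# site_index 0 is application #1, site_index 1 is the next one, and so on.
def mongrel_ports(site_index, servers = 3)
  if servers > PORTS_PER_SITE
    raise ArgumentError, "a site has room for at most #{PORTS_PER_SITE} servers"
  end
  first = BASE_PORT + site_index * PORTS_PER_SITE
  (first...(first + servers)).to_a
end

puts mongrel_ports(0).inspect  # [10000, 10001, 10002]
puts mongrel_ports(1).inspect  # [10010, 10011, 10012]
```

Leaving gaps between sites means a busy site can grow to ten servers without any other site having to be renumbered.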

Apache load balancing

This is a new feature in Apache 2.1, and apparently is very reliable with Apache 2.2. This is currently my favorite way to run a web site.

My configuration, which happens to be for this site:

<proxy balancer://blog>
  BalancerMember http://localhost:10010
  BalancerMember http://localhost:10011
  BalancerMember http://localhost:10012
</proxy>
<VirtualHost blog.flame.org:80>
  DocumentRoot /www/blog/flame-blog/current/public
  <directory "/www/blog/flame-blog/current/public">
    Options FollowSymLinks
    AllowOverride None
    Order allow,deny
    Allow from all
  </directory>

  ProxyRequests off
  <proxy *>
    order deny,allow
    allow from all
  </proxy>

  RewriteEngine on

  # Check for maintenance file. Let apache load it if it exists
  RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
  RewriteRule . /system/maintenance.html [L]

  # Send requests for the site root to the balancer
  RewriteRule ^/$ balancer://blog%{REQUEST_URI} [L,P,QSA]

  # Let Apache serve static files; send everything that is *not* an
  # existing file (!-f) to the balancer via mod_proxy
  RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
  RewriteRule .* balancer://blog%{REQUEST_URI} [L,P,QSA]
</VirtualHost>

It is important, at least on my host, to use localhost in the balancer destinations. This is due to mongrel suddenly running on IPv6 loopback (::1) rather than the usual IPv4 loopback (127.0.0.1). I don't know why this happened, but the localhost trick makes Apache try both addresses, and whichever works it will use.
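
To see which loopback address a Mongrel actually ended up on, something like the following can help. It is a minimal sketch using Ruby's standard socket library; the function names are made up, and the port passed in is whatever your own Mongrel listens on.

```ruby
require "socket"

# All IP addresses the resolver returns for host, e.g. "::1" and
# "127.0.0.1" for localhost on a dual-stack machine.
def loopback_candidates(host = "localhost")
  Addrinfo.getaddrinfo(host, nil, nil, :STREAM).map(&:ip_address).uniq
end

# The subset of those addresses that accept a TCP connection on port.
def listening_addresses(port, host = "localhost")
  loopback_candidates(host).select do |addr|
    begin
      # The block's value is returned and the socket is closed afterwards.
      Socket.tcp(addr, port, connect_timeout: 1) { true }
    rescue StandardError
      false  # refused, timed out, or address family not available
    end
  end
end
```

Calling listening_addresses(10010) on the web server would show whether the first Mongrel of this site answers on ::1, on 127.0.0.1, or on both.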

This configuration makes Apache serve static content, and sends all other requests off to one of the Mongrel processes.

Saturday, October 27, 2007

Ursae-Lyons

Last weekend my wife and I attended a lovely little event in the Barony of Bjornsborg in the Kingdom of Ansteorra. We had a wonderful time. There was music, a bardic circle, and lots of singing (most of it good!) coming from Cynric's tavern.

The best things to happen there, in my opinion, were that Baron Cynric of Bedwyn was awarded the Kingdom's highest persona award, Lions of Ansteorra, Defenders of the Dream, and his lady wife Baroness Seraphina Maslowska was awarded a Pelican! Congratulations to both of you!