Machine virtualization on the Linux desktop
In the past I’ve used virtualization mostly in server environments: Xen as a sysadmin, and VMware and Virtuozzo as a user. They have worked well enough. When there’ve been problems they’ve mostly been traceable to network configuration trouble.
Lately I’ve been playing with virtualization on the desktop, specifically on Ubuntu desktops, using Xen, kvm, and VirtualBox. Here are a few notes.
Xen: Requires hardware virtualization support for full virtualization, and paravirtualization is of course only for certain types of guests. It feels a little heavier on resource usage, but I haven’t tried to move beyond lame anecdote to confirm that.
kvm: Rumored not to be quite ready for prime time, but when used from libvirt with virt-manager it has been very nice for me. It requires hardware virtualization support. One major problem with kvm on Ubuntu 8.04 is the CD/DVD driver when using RHEL/CentOS guests; to work around that, I used the net install instead, and it worked fine.
VirtualBox: This was for me the simplest of all for desktop stuff. I’ve used both the OSE (Open Source Edition) in Ubuntu and Sun’s cost-free but proprietary package on Windows Vista. The current release of VirtualBox only …
environment hosting
Know your tools under the hood
Git supports many workflows; one common model that we use here at End Point is having a shared central bare repository that all developers clone from. When changes are made, the developer pushes the commit to the central repository, and other developers see the relevant changes on subsequent pulls.
We ran into an issue today where, after a commit/push cycle, pulls from the shared repository were suddenly broken for downstream developers. It turns out that one of the commits had been created by root and pushed to the shared repository. The push itself worked fine, since root had read-write privileges on the filesystem; however, it meant that the loose objects created by that commit were in turn owned by root. Filesystem permissions on those loose objects and on the updated refs/heads/branch prevented other users from reading the appropriate files, and hence broke pulls for downstream developers.
Trying to debug this purely from the messages reported by the tool itself would have resulted in more downtime at a critical time in the client’s release cycle.
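The fix, once the cause was clear, was to restore sane ownership and group permissions on the objects and refs root had created (and setting core.sharedRepository to group on the shared repository helps avoid this class of problem in the first place). Here is a rough Python sketch of that kind of cleanup; the repository path, group name, and permission bits are illustrative assumptions, not the exact commands we ran.

    #!/usr/bin/env python
    # Sketch: find files in a shared bare repository that are not owned by the
    # developers' shared group, or that are not group-readable, and optionally
    # fix them. The repo path and group name below are hypothetical.
    import grp
    import os
    import stat
    import sys

    REPO = "/srv/git/project.git"   # hypothetical shared bare repository
    DEV_GROUP = "developers"        # hypothetical shared group

    def fix_permissions(repo, group, apply_changes=False):
        gid = grp.getgrnam(group).gr_gid
        for dirpath, dirnames, filenames in os.walk(repo):
            for name in dirnames + filenames:
                path = os.path.join(dirpath, name)
                st = os.lstat(path)
                wrong_group = st.st_gid != gid
                # loose objects and refs must at least be group-readable
                unreadable = not (st.st_mode & stat.S_IRGRP)
                if wrong_group or unreadable:
                    print("%s %s" % ("fixing" if apply_changes else "would fix", path))
                    if apply_changes:
                        os.chown(path, -1, gid)  # keep the owner, change the group
                        os.chmod(path, st.st_mode | stat.S_IRGRP | stat.S_IWGRP)

    if __name__ == "__main__":
        fix_permissions(REPO, DEV_GROUP, apply_changes="--apply" in sys.argv)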
There are a couple of morals here:
- Don’t do anything as root that doesn’t need root privileges. :-)
- Understanding how git works at a low level enabled a …
openafs git
Fun with 72GB disks: Filesystem performance testing
If you haven’t heard, the Linux Plumbers Conference is happening September 17-19, 2008 in Portland, OR. It’s a gathering designed to attract Linux developers—kernel hackers, tool developers and problem solvers.
I knew a couple of people from the Portland PostgreSQL User Group (PDXPUG) who were interested in pitching an idea for a talk on filesystem performance. We wanted to take conventional wisdom about filesystem performance and put it to the test on some sweet new hardware, recently donated for Postgres performance testing.
Our talk was accepted, so the three of us have been furiously gathering data, and drawing interesting conclusions, ever since. We’ll be sharing six assumptions about filesystem performance, tested on five different filesystems under five types of loads generated by fio, a benchmarking tool designed by kernel hacker Jens Axboe to test I/O.
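For anyone who hasn’t seen fio before, a job is just a small INI-style description of a workload. The sketch below writes one hypothetical random-read job and runs it; the block size, file size, runtime, and target directory are made-up examples, not the settings from our testing.

    # Sketch: generate and run a single hypothetical fio job (a random-read load).
    # All of the parameters here are illustrative, not the ones used for the talk.
    import subprocess

    JOB_LINES = [
        "[randread-sample]",
        "rw=randread",        # random reads
        "bs=4k",              # 4 KB block size
        "size=1G",            # work against a 1 GB file
        "ioengine=sync",
        "direct=1",           # bypass the page cache
        "runtime=60",
        "time_based=1",
        "directory=/mnt/testfs",  # hypothetical mount point of the filesystem under test
    ]

    with open("randread-sample.fio", "w") as jobfile:
        jobfile.write("\n".join(JOB_LINES) + "\n")

    subprocess.call(["fio", "randread-sample.fio"])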
Look forward to seeing you there!
conference performance
Small changes can lead to significant improvements
Case in point: We’ve been investigating various system management tools, both for internal use and possibly for some of our clients. One of these, Puppet from Reductive Labs, has a lot of features that I like and good references (Google uses it to maintain hundreds of Mac OS X laptop workstations).
I was asked to see if I could identify any performance bottlenecks and perhaps fix them. With the aid of dtrace (on my own Mac OS X workstation) and the Ruby dtrace library it was easy to spot that a lot of time was being eaten up in the “checksumming” routines.
As with all system management tools, security is really important, and part of that security is making sure the files you are looking at and using are exactly the files you think they are. Thus, as part of surveying a system for modified files, each file is checksummed using an MD5 hash.
To speed things up, at a small reduction in security, the Puppet checksumming routines have a “lite” option which feeds only the first 512 bytes of a file into the MD5 algorithm instead of the entire file, which can be quite large.
As with most security packages these days, the way you implement an MD5 hash is to get a “digest” object, …
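Puppet’s implementation is in Ruby, but the “lite” idea itself is easy to illustrate. Here is a rough Python sketch of a full checksum versus a first-512-bytes checksum; it is only an illustration of the concept, not Puppet’s actual code.

    import hashlib

    def md5_full(path):
        """Checksum the entire file, reading it in chunks so large files fit in memory."""
        digest = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(8192), b""):
                digest.update(chunk)
        return digest.hexdigest()

    def md5_lite(path):
        """Checksum only the first 512 bytes, trading some certainty for speed."""
        digest = hashlib.md5()
        with open(path, "rb") as f:
            digest.update(f.read(512))
        return digest.hexdigest()

The speedup comes from reading a single small block instead of the whole file; the tradeoff is that a change past the first 512 bytes goes undetected.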
security
Stepping into version control
It’s no secret that we here at End Point love and encourage the use of version control systems to generally make life easier, both for ourselves and for our clients. While a full-fledged development environment is ideal for maintaining and developing new client code, not everyone has the time to implement one.
A situation we’ve sometimes found is clients editing/updating production data directly. This can be through a variety of means: direct server access, scp/sftp, or web-based editing tools which save directly to the file system.
I recently implemented a script to provide transparent version control for a client who uses a web-based tool to manage their content. While they are still making changes to their site directly, we now have the ability to roll back any change on a file-by-file basis as files are created, modified, or deleted.
I wanted something that was: (1) fast, (2) useful, and (3) stayed out of the user’s way. I turned naturally to Git.
In the user’s account, I executed git init to create a new Git repository in their home directory. I then git add-ed the relevant parts that we definitely wanted under version control. This included all of the …
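The rest of the script is specific to this client’s setup, but the general shape of a transparent, cron-driven auto-commit pass is simple. The sketch below is only an outline of that idea, with a hypothetical home directory path and commit message; it stages every change, including deletions, and commits only when something actually changed.

    # Sketch of a cron-driven auto-commit pass over the user's home directory.
    # WORK_TREE and the commit message are hypothetical placeholders.
    import subprocess

    WORK_TREE = "/home/siteuser"

    def git(*args):
        return subprocess.run(["git"] + list(args), cwd=WORK_TREE,
                              capture_output=True, text=True)

    def auto_commit():
        # stage new, modified, and deleted files alike
        git("add", "--all")
        # "git diff --cached --quiet" exits non-zero only when something is staged
        if git("diff", "--cached", "--quiet").returncode != 0:
            git("commit", "-m", "Automatic snapshot of site edits")

    if __name__ == "__main__":
        auto_commit()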
git
Standardized image locations for external linkage
Here’s an interesting thought: https://boingboing.net/2008/09/01/publishers-should-al.html
Nutshell summary: publishers should put cover images of books into a standard, predictable location (like http://www.acmebooks.com/covers/{ISBN}.jpg).
This could be extended for almost any e-commerce site where the product image might be useful for reviews, links, etc.
At the very least, with Interchange action maps, a site could capture external references to such image requests for further study. (E.g., internally you might reference a product image as [image src=“images/products/current{SKU}”], but externally as “/products/{SKU}.jpg”; the actionmap wouldn’t be used by the site itself, only by other sites linking to your images.)
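Interchange actionmaps themselves are written in Perl, so the following is only a language-neutral sketch of the logic, in Python: resolve the public “/products/{SKU}.jpg” form to the internal image location and record who asked for it. All paths and filenames here are hypothetical.

    # Sketch of the URL-mapping idea (not Interchange code): serve
    # /products/{SKU}.jpg from the internal image directory, and log the
    # referrer so external linking can be studied later.
    import os
    import time

    INTERNAL_IMAGE_DIR = "/var/www/images/products/current"  # hypothetical
    LOG_FILE = "/var/log/external-image-hits.log"             # hypothetical

    def resolve_product_image(request_path, referrer=""):
        filename = os.path.basename(request_path)   # e.g. "12345.jpg"
        if not filename.endswith(".jpg"):
            return None
        sku = filename[:-len(".jpg")]
        with open(LOG_FILE, "a") as log:
            log.write("%s\t%s\t%s\n" % (time.strftime("%Y-%m-%d %H:%M:%S"), sku, referrer))
        internal = os.path.join(INTERNAL_IMAGE_DIR, sku + ".jpg")
        return internal if os.path.exists(internal) else None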
interchange
Authorize.Net Transaction IDs to increase in size
In a sign of their success, Authorize.Net’s transaction ID numbers are about to exceed 2,147,483,647 (that is, 2^31 - 1), which happens to be the maximum value of a signed MySQL int() column and of the default Postgres “integer” type.
It probably makes sense to proactively ensure that your transaction ID columns are large enough; this would not be a fun bug to run into after the fact.
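A quick check makes the boundary concrete. The snippet below is just arithmetic plus a hypothetical example of the kind of column change involved; the table and column names are made up.

    # The largest value a signed 32-bit integer column can hold:
    MAX_INT32 = 2**31 - 1
    print(MAX_INT32)               # 2147483647

    # A transaction ID just past that boundary no longer fits:
    print(2147483648 > MAX_INT32)  # True

    # A hypothetical Postgres fix would look something like:
    #   ALTER TABLE payments ALTER COLUMN transaction_id TYPE bigint;
    # bigint is a signed 64-bit type, with a maximum of 2**63 - 1.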
database postgres payments ecommerce
Major rumblings in the browser world
Wow. There’s a lot going on in the browser world again all of a sudden.
I recently came across a new open source browser, Midori, still in alpha status. It’s based on Apple’s WebKit (used in Safari) and is very fast. Surprisingly fast. Of course, it’s not done, and it shows. It crashes, many features aren’t yet implemented, etc. But it’s promising and worth keeping an eye on. It’s nice to have another KHTML/WebKit-based browser on free operating systems, too.
Now today news has come out about Google’s foray into the browser area, with a browser also based on WebKit called Chrome. It’ll be open source, include a new fast JavaScript engine, and feature compartmentalized JavaScript for each page, so memory and processor usage will be easy to monitor per application, and individual pages can be killed without bringing the whole browser down. Code’s supposed to become available tomorrow.
A new generation of JavaScript engine for Mozilla is now in testing, called TraceMonkey. It has a just-in-time (JIT) compiler, and looks like it makes many complex JavaScript sites very fast. It sounds like this will appear formally in Firefox 3.1. Information on how to test it now is at John Resig’s …
browsers