Going to Debconf11


See you in Banja Luka!

Christmas Lights (redux)

Last year, if you'll remember, I did a half-assed job of putting together a musically coordinated Christmas light rig, and promised to Do Better this year. Lucky for me I was vague, because under-promising made over-delivering a lot easier. :)

What I did manage to do was tackle last year's technical debt, and get the code cleaned up. I've named it Lumen and put it up on GitHub.

Lumen has two modes, record and playback. When launched in record mode, you use the a, s, d, f, j, k, l, and ; keys to "play" the 8 outputs to the music. You can think of it as a sort of reverse Guitar Hero. It's not as easy as it sounds (not for me anyway). It takes practice, at which point you're so sick of the song that if you never hear it again it'll be too soon.
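To illustrate the record-mode idea, here is a rough sketch of how eight keys could map onto eight outputs, one bit each. This is hypothetical and not Lumen's actual code; the names KEYS, KEY_BITS, and state_for are made up for the example.

```python
# Hypothetical sketch of the key-to-output mapping; not taken from Lumen.
KEYS = "asdfjkl;"  # one key per output, left to right

# 'a' drives bit 0, 's' bit 1, ... and ';' drives bit 7.
KEY_BITS = {key: 1 << index for index, key in enumerate(KEYS)}


def state_for(pressed):
    """Return the 8-bit output state for the keys currently held down."""
    state = 0
    for key in pressed:
        # Keys outside the mapping contribute nothing.
        state |= KEY_BITS.get(key, 0)
    return state
```

Holding a and s together would give a state of 3 (bits 0 and 1 set), and holding all eight keys gives 255.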

Playback mode can be used to either drive the lights, or to run a simulation to check your work. Here is what it looks like in simulation:

And here is what it looks like For Real:

Christmas Lights

We put up lights each year for the holidays, and while I don't mind having the house decorated, I do not like having to put them up. Despite this, I feel mounting pressure each year to Do Better, which by default means more lights and decorations, which in turn mean even more work.

The year before last I had the idea that if I worked smarter I might avoid working harder, and that one of those musically synchronized setups would be pretty sweet. Problem is I came to this conclusion in October of 2008, and that didn't leave enough time to properly procrastinate before throwing something together at the last minute, so I was forced to postpone. This past year though I was able to spend a solid 11 months procrastinating, which still left a couple short weeks to hurriedly throw something together.

So long story short, I did it, I put together a controller that sets Christmas lights to music. And, providing that you weren't privy to all of the nasty hacks and ugly short-cuts, it was actually kind of neat to watch. Obviously though, I'm not entirely happy with the results, so I'm considering this year a practice run, and hope that with this as a basis to build upon, next year it will be pretty sweet. So treat the rest of this post as more of a rough brain-dump than a recipe or step-by-step, and hopefully it will prove interesting for comparison purposes next year.


Normally this is where I'd expound on some of my research, the options I investigated, costs, ease of use, etc. That's not going to happen because with all of the procrastination this project required, I simply didn't have the time. Instead I went straight for a pre-assembled parallel port relay board, and I took a page out of this guy's book and mounted it in a plastic tool box.

I used some cheap extension cords, fixed to the ends of the toolbox with cable connectors, and wired it all together inside. The relay board needs a power supply, so there is an outlet inside for that.

A cord that exits the rear of the toolbox gets plugged into the mains to supply power to the whole thing, but since we're combining electricity and the great outdoors, a GFI is a must.

I scored this parallel cable in an old pile of hardware. It worked great once I got the zip drive that was attached to it off and in the trash.

Finally, I made use of an old PIII notebook.

Christmas light controller

I know, it's not much to look at. Sue me.


There are 8 pins on a parallel port that are (were) used to send character data to printers, and it's these 8 pins that are used for outputs. That means controlling the outputs is as simple as writing to that byte. The pyparallel library makes this even easier, so for example, I was able to use something like the following, run from a cron job, to start and stop the lights each day.

python -c 'import parallel; parallel.Parallel().setData(0)'

I wired everything up to the normally closed contacts of the relay board so the lights would fail-safe. In other words, you have to turn the relay on in order to turn the corresponding lights off. The setData(0) above switches on all of the lights by turning the relays off; killing the lights is as easy as changing that to setData(255).
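Because of the normally closed wiring, the byte written to the port is the inverse of the set of lights you want lit. A minimal sketch of that inversion, assuming the outputs are numbered 0 through 7 (the helper name relay_byte is hypothetical):

```python
def relay_byte(lights_on):
    """Return the data byte that lights exactly the given outputs (0-7).

    With normally closed contacts, a 1 bit energises the relay and
    switches that light OFF, so the wanted pattern has to be inverted.
    """
    wanted = 0
    for channel in lights_on:
        if not 0 <= channel <= 7:
            raise ValueError("channel must be between 0 and 7")
        wanted |= 1 << channel
    # Flip every bit, then keep only the low 8 bits.
    return ~wanted & 0xFF
```

All eight outputs lit gives 0 and none lit gives 255, matching the two setData calls above. With pyparallel the result could be passed straight through, for example parallel.Parallel().setData(relay_byte([0, 3])) to light only outputs 0 and 3.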

Initially I had the idea that I'd whip up something to analyze an audio track; that the light show would essentially be a visualization of the waveform. That, as it turns out, isn't the panacea that it would seem. Sure, the lights will flash in a way that seems vaguely in response to the music, but the results are just not as coordinated, or ordered, as the samples you see on the Internet.

So I then moved on to the idea of creating a time-series of output states that could be "played" along with the audio, but I was naive to believe that I could hand-craft this data file, so before all was said and done, I'd also written a PyGame application for keying in the outputs as the music played, and visualizing it during playback.
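The time-series idea can be sketched roughly as follows. The post does not describe the actual data file, so this assumes a made-up format: a list of (seconds, output byte) pairs sorted by time, where each state holds until the next entry. The function name state_at is hypothetical.

```python
import bisect


def state_at(events, t):
    """Return the output byte in effect at time t (seconds into the song).

    events is a list of (time_seconds, state) pairs sorted by time.
    Before the first event, all outputs are treated as off (0).
    """
    times = [when for when, _ in events]
    # Index of the first event strictly after t.
    index = bisect.bisect_right(times, t)
    if index == 0:
        return 0
    return events[index - 1][1]
```

A playback loop would then poll the audio position, look up state_at for that position, and write the result to the port (or draw it, in simulation).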

Finally, only after getting everything working I found out that the way "professionals" do this is actually pretty similar to what I came up with, only using MIDI, so I will definitely be looking into that before next year.

Going to FOSDEM

Due to a scheduling conflict, Jonathan won't be able to present on Cassandra in the NoSQL devroom at this year's FOSDEM, so I'll be going in his stead.

I've always wanted to go to a FOSDEM, and getting to see Brussels will be a real treat as well. I can't wait!

I'm going to FOSDEM, the Free and Open Source Software Developers' European Meeting

NP: Black & White, In Flames

NoSQL: What's in a name?

Depending on the circles you travel in, you might be aware of the whole NoSQL "movement". If not, I'm not going to try and explain it at this time (explaining it is sort of the problem), but you can get the general idea from Wikipedia.

I've spent the last couple of days at nosqleast and one of the hot topics here is the name "nosql". Understandably, there are a lot of people who worry that the name is Bad, that it sends an inappropriate or inaccurate message. While I make no claims to the idea, I do have to accept some blame for what it is now being called. How's that? Johan Oskarsson was organizing the first meetup and asked the question "What's a good name?" on IRC; it was one of 3 or 4 suggestions that I spouted off in the span of like 45 seconds, without thinking.

My regret however isn't about what the name says, it's about what it doesn't. When Johan originally had the idea for the first meetup, he seemed to be thinking Big Data and linearly scalable distributed systems, but the name is so vague that it opened the door to talk submissions for literally anything that stored data, and wasn't an RDBMS.

I don't have a problem with projects like Neo4J, Redis, CouchDB, MongoDB, etc, but the whole point of seeking alternatives is that you need to solve a problem that relational databases are a bad fit for. MongoDB and Voldemort for example set out to solve two very different problems and lumping them together under a single moniker isn't very meaningful. This is why people are continually interpreting nosql to be anti-RDBMS; it's the only rational conclusion when the only thing some of these projects have in common is that they are not relational databases.

The cat is out of the bag though, and the "movement" has enough momentum that I don't think it's going anywhere. And I'm not really advocating that; it's had the effect of bringing a lot of attention to some very interesting projects, and that's a Good Thing. Maybe Emil Eifrem has the right idea by encouraging people to overload the term with Not Only SQL.

Upcoming travel

I have several trips lined up for the next few weeks:

There is also a NoSQL meetup on November 2 as a part of ApacheCon; I've offered to present on Cassandra there. I'm also thinking of giving a session at BarcampApache, and I'm scheduled to sit on a "SQL vs. NoSQL" panel at OpenSQL, though I'll probably submit a session idea or two there as well.

There are a lot of Cassandra people in the Bay Area; it'd be great if we could set up a hack-a-thon/bug squashing party/meetup/whatever during ApacheCon. Ping me or post something to the list if you are interested! :)

NP: Calling Dr Love, Kiss

Ooops, I did it again

I rewrote my blog software again (actually, it was done months ago but I just now got around to deploying it). The last one used TurboGears, but the 1.x branch is getting long in the tooth, and 2.0 came a little too late. Besides, Django is the new hotness these days.

Somehow the rewrite resulted in about half as much code, which is always cool, and I finally got to make use of mod_wsgi, (it is everything that I had ever dreamed it would be, and more :)).

All of the old permalinks should still be valid, and with any luck I managed to avoid DoS'ing everyone's feed reader.

Transitioning My GPG Key

A few months ago a group of researchers announced a fairly serious attack that shattered everyone's faith in SHA-1. It has frightening implications for anyone who relies on cryptographic signatures, and while consensus is that there is little danger in the near-term, most people agree that now is the time to start a move to something stronger.

So, I've begun my transition, (document here), and submitted my new key for the Debconf9 signing party later this month. I intentionally left out any mention of a time-line in the transition doc, and I'm in no big hurry. I'll retire the old key once I have enough signatures, or once there is evidence of a real threat, whichever comes first.

Thrift Packaging

My latest project at work is Cassandra, a distributed, eventually consistent, column oriented data store. It's somewhere between Dynamo (Cassandra's original author worked on Dynamo), and Google's BigTable. It was developed as an internal application at Facebook, later open sourced, and is now an Apache incubator project.

The external interface to Cassandra is thrift-based. Thrift is a framework for creating network services, services that communicate using a compact binary data format. It's similar to Google's Protocol Buffers, but with more of a focus on RPC, and greater language coverage, (much greater actually). The bottom line, any application that uses Cassandra for structured data storage is going to need Thrift. So, I filed an ITP (Intent To Package) and have started work on packaging it for Debian.

Thrift is an interesting project to package as it has an architecture specific application (C++), 6 architecture specific and 5 architecture independent libraries, and covers 12 different languages. That's right, 12.

I'm still somewhat undecided on a game plan; the options I've considered so far are:

  1. Convince upstream to split their source tree and distribute all of these libraries separately, allowing them to be packaged by people with the skills and/or motivation for each.
  2. Split the source myself and package the bits that are most important to me.
  3. One source package based on the official upstream release, with binary packages for each of the components that I need/am comfortable maintaining. Folks interested in the parts not packaged could step up to the plate and contribute their time.
  4. Best-effort packaging of most/all of the libraries upstream ships, with the proviso that any I'm not comfortable seeing in a release, and for which no one has stepped forward, would be removed prior to Squeeze.

I've already taken a stab at #1 and it didn't seem promising. #2 is an option I still consider on the table but I'm a little concerned that it could lead to a mess. #3 and #4 really boil down to the same thing, collaborating with others to package as much as possible while maintaining the standards everyone expects from Debian. I guess I'm currently leaning toward some variation of #3 or #4, probably through the use of collab-maint or a dedicated Alioth project.

For the time being, my efforts can be tracked in Git here, so drop me a line if you're interested in joining the fun!

NP: Sand and Mercury, The Gathering

Brain Rewiring

A co-worker of mine uses one of the stranger keyboards I've seen, a Kinesis Advantage.

Kinesis Advantage Keyboard

He picked his up after a bout with tendinitis and was sold on it. He was kind enough to let me borrow his spare for about a week so I could try it out. It's been an interesting week. :)

The Advantage differs from conventional keyboards in a number of ways, the ones I think most relevant are:

  • The separation of the left and right sides of the keyboard, done to keep you from pivoting your hands side-to-side at the wrist as you type. A lot of keyboards address this by creating a break in the middle and angling the two sides outward (everyone has seen the MS Naturals), but not having to turn your arms inward feels more comfortable/natural to me.
  • Keys that are arranged into a concave surface as opposed to a flat one. This might seem strange, but the curvature lines up well with the arc your finger tips travel in, and positions the keys within closer reach of one another.
  • The keys are also arranged on a vertical axis to one another, as opposed to being staggered. So for example the C key is directly below D, not below and to the right. Moving your fingers from their keys on the home row to the corresponding keys above and below is a much more natural movement.
  • Key layout is different as well. You're expected to do quite a bit more with your thumbs. The Backspace, Delete, Home, End, and Control and Alt keys are positioned within reach of your left thumb; your right works Space, Enter, Page Up and Down, in addition to another Control key, and a Windows key (which I remap to Alt). This really makes sense if you think about it: why waste two perfectly good fingers on the same key, when you could put them to use and eliminate all of that reaching?
  • The keys have outstanding tactile feedback, in addition to an audible feedback (something between a faint click and a beep emitted by a speaker somewhere inside). I find this feedback helpful in maintaining a light touch on the keys since I often catch myself banging keys pretty hard on normal keyboards.

I'm not going to lie though, it does take some getting used to. The biggest problem I had was Space vs. Backspace, which are the right-most thumb key, and left-most thumb key respectively. Prior to all of this I heavily favored my left thumb for striking the space bar, and muscle memory is a bitch when it causes you to Backspace when you meant Space.

Other points of frustration were the tilde/back-tick key (located bottom-left instead of top-left), and the left and right bracket/brace keys (located bottom-right). These keys are used a lot in a shell or when coding, which probably made the pain even more pronounced for someone like me.

I managed to force myself to use nothing else for several days, at which point I felt I was doing quite well. I still had the occasional problem here and there, but it seemed like I was well on my way to normalcy. Then I tried using the built-in keyboard on my laptop. Wow. Epic fail. It took a few more days and plenty of patience before I was able to move back and forth (and truth be told it's still a little awkward).

So was it worth it? Yeah, I think so. I've had RSI troubles of my own and a week of typing on this keyboard has felt pretty good. I've ordered one of my own to use at work, and I'll probably grab a second one for home.

Lenny Released On Time

Lenny released yesterday. This is great news, and congratulations all around to everyone that worked their asses off making it happen.

By my calculations this comes 677 days after the initial release of Etch, (or 22 months and change). I've said before that Debian releases When Ready and that (to the best of my observation) consensus seems to be that somewhere between 18 and 24 months is the sweet spot. Not only does this make for the second "on-time" release in a row, but there was an Etch-And-A-Half sporting new kernels and video drivers in the meantime.

With any luck the various "Debian is too old/can't release" memes will finally die.

Publishing divergence from upstream

On Monday I attended Martin Krafft's talk, Packaging with version control systems. Martin has started a project, coordinated via http://vcs-pkg.org, to explore work patterns for packaging and cross-distro collaboration using distributed version control systems. This is a topic that I've spent a fair amount of time on so it was interesting to see Martin's packaging work flow, and hear him discuss its evolution.

Today I attended a BoF organized by Luciano Bello. Luciano is the developer that discovered the recent OpenSSL vulnerability. The point of the BoF was to discuss ways of preventing this sort of thing from happening in the future. The vulnerability in question was introduced in a Debian-specific patch, so a good bit of the discussion centered around code review and the need to make Debian's upstream divergences more transparent.

There were quite a few in attendance that felt that the best way to publish divergences is by using a patch series, (something that recently received first class support by way of the new dpkg v3 format). I used to fall into this camp, but a blog post from Joey Hess got me to reevaluate my work flow, and I now feel pretty strongly that using a patch series is not the answer.

I switched to Git a while back and have adopted a work flow similar to Martin's with an upstream branch, a Debian packaging branch, and topic branches for each customization or bug fix. Obtaining the divergence from upstream is a simple matter of diff'ing the topic branches against the upstream branch and the entire change history is preserved. Using a patch system alone seems woefully inadequate when compared to any of the modern VCS, and generating a patch series from branches in a VCS feels like, as Martin likes to say, Yak Shaving.
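As a rough sketch of what "diff'ing the topic branches against the upstream branch" could look like, the helper below builds one git diff command per topic branch. The branch names and the function name divergence_commands are hypothetical; the three-dot form (git diff A...B) compares B against the merge base of A and B, so each diff shows only what the topic branch adds on top of upstream.

```python
def divergence_commands(upstream, topic_branches):
    """Build one `git diff` argv per topic branch.

    Each command shows the changes a topic branch carries relative to
    the point where it forked from the upstream branch.
    """
    return [["git", "diff", f"{upstream}...{topic}"]
            for topic in topic_branches]


if __name__ == "__main__":
    # Example branch names are made up for illustration.
    for argv in divergence_commands("upstream", ["fix-manpage", "debian-paths"]):
        print(" ".join(argv))
```

Each argv could be handed to subprocess.run inside the packaging repository to publish the divergence as a set of per-topic diffs.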

I plan to subscribe to the vcs-pkg.org mailing list and follow the discussions taking place there. It should be interesting.

TXOSS 2008

The Texas Open Source Symposium in San Angelo is a wrap. This was a small one-day event organized by Jeremy Fluhmann (who did an excellent job by the way). I rather enjoyed it, and providing that it becomes an annual event (which I understand is the idea), I will certainly try and make it back next year.

I gave a talk on Mercurial during the 11:00am slot. I'm not convinced that I did a very good job, but most of the people seated in the audience seemed to be paying attention and several people asked questions both during and after, so maybe it was alright.

The talks I attended were of excellent quality, I especially enjoyed both of the talks given by Patrick Michaud on Parrot and Perl6 (even if they did pretty much convince me that my shift in focus to Python was a Good Idea :).

Accommodations in San Angelo

I'm giving a talk on Saturday at the Texas Open Source Symposium entitled An Introduction to Mercurial. I'll be driving there but hadn't considered making a room reservation until today. How hard could it be to book a hotel room in San Angelo, Texas, right? Sheesh.

I started out with a Google maps search that included the zip code of the venue, and began working through the options ordered by proximity and user submitted reviews. So first up were places like Comfort Suites and Holiday Inn, about a mile from the venue with reviews like, "I've never had customer service so good!". It ended with obscure motor inns on the opposite side of town with reviews like, "WHATEVER YOU DO, NEVER STAY IN THIS DISGUSTING PLACE". I called 18 different places and they were all booked.

Apparently there is some kind of triathlon or something going on this weekend.

Fortunately for me, Tarus Balog is also speaking. Fortunately for me he is also a smooth talker because he was able to get his room upgraded to one with two beds, (after finding out that he had made his reservations for the wrong month and getting them to give him a room anyway, I might add).

Update: It's been suggested that this is the reason all of the hotels are at capacity.

Oh Joy

Radeon R5xx 3D programming guide released

NP: Running Out Of Pain, 12 Stones

Die Disk, Die

There's been a lot of "Ubuntu kills laptop hard drives" buzz going around lately. The implication is that overly aggressive power management is causing excessive load/unload cycles, exceeding a reasonable duty cycle, and drastically shortening the life of your drive. I run Debian unstable on my laptop but I looked into it anyway and sure enough it's something which is affecting me as well.

As Matthew Garrett points out, it doesn't have anything to do with Ubuntu, Debian, or Linux in general; the culprit is aggressive power management settings in the drive firmware, or settings applied by the BIOS.

If this is happening to you, it's possible that it can be rectified with a firmware upgrade or by updating settings in the BIOS. The solution I chose was to allow laptop-mode-tools to control power management, applying maximum power savings when on battery (hdparm -B 1 /dev/$device), and disabling it when on AC power (hdparm -B 254 /dev/$device).

If you have an otherwise default install of laptop-mode-tools on Debian you can accomplish this by setting CONTROL_HD_POWERMGMT=1 in /etc/laptop-mode/laptop-mode.conf and issuing an /etc/init.d/laptop-mode restart.
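The policy described above boils down to picking one of two hdparm -B values depending on the power source. A small sketch of that choice, using the two values from this post (the function name hdparm_args and the example device path are hypothetical, and this is not how laptop-mode-tools itself is implemented):

```python
def hdparm_args(on_battery, device):
    """Return the hdparm argv for the current power source.

    -B 1   : maximum power savings, used on battery
    -B 254 : least aggressive power management, used on AC power
    """
    level = 1 if on_battery else 254
    return ["hdparm", "-B", str(level), device]


if __name__ == "__main__":
    # /dev/sda is only an example device name.
    print(" ".join(hdparm_args(on_battery=True, device="/dev/sda")))
    print(" ".join(hdparm_args(on_battery=False, device="/dev/sda")))
```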

Duplicity backport for Etch

I've been backing up all of my important machines to Amazon S3 using Duplicity for some time now. It's worked out really well but required just enough hackery to prevent me from providing straightforward instructions for others.

I'm all about sharing the love so I submitted a new S3 backend to upstream using the excellent boto (http://code.google.com/p/boto/) library from Mitch Garnaat, and I packaged boto for Debian. The new backend made it into the 0.4.3 release of Duplicity, which in turn migrated to testing a couple of weeks ago. Now there is an Etch backport of 0.4.3 on backports.org.

In addition to installing duplicity from backports.org you will also need to pin python-boto from Lenny.


Free Software Driver for Radeon R5xx/R6xx

Wow. Less than two weeks ago AMD announced that they would be opening the specs for their graphics cards, a few days later they followed through, and yesterday a driver for R5xx/R6xx cards was released. How's that for fast?

AMD to open up graphics specs

This is excellent news. I look forward to the day when I can be oblivious of my graphics adapter.

New Blog

I've been using pyblosxom for years. I chose it because it was dead simple and I wasn't interested in a long-term commitment with a bloated PHP app, (or any PHP app for that matter). Unfortunately it has always been just a bit too simple.

I went shopping for new blog software, but sadly things aren't much better these days than they were when I originally set up my blog. There are a few more choices than there used to be, but the list of options is still pretty short if you aren't willing to use PHP (again, I'm not).

Ultimately I succumbed to NIH and wrote my own. It has a simple CRUD interface for inputting markdown which is stored to a database and converted on-the-fly to HTML. Writing one from scratch was probably the worst thing I could have done, but hey, it's like a rite of passage these days, isn't it?

All of the original postings have been migrated over and any bookmarked permalinks should still be valid (shoot me an email if you find any that aren't). The feed links have changed, but I put a redirect in place for the old RSS 0.9 feed which points to the new RSS 2.0 one, (hopefully that's transparent). There are also Atom 0.3 and 1.0 feeds.

If I created a mess out of your feed reader, I apologize.

When Technology and Stupidity Meet

Recently I sent an, (on-topic), email to a public mailing list that exists for the discussion of network management. Shortly after I received the following, (edited to protect the stupid).

We have enabled a Spam (Unsolicited e-mails) filtering software package in
an attempt to reduce unsolicited e-mails.

The following e-mail was blocked because our filter detected an improper
word or phrase. See the Rule section below for the word or phrase that
stopped the transmission of the message.

Rule = Spam;Bulk Email;Bulk Email Product/Service;Products/Services__3;Penis
Enlargement;Penis Enlargement Terms

My install of SpamAssassin, (default configuration on Debian unstable), gave this same mail a score of -4.9, (definitely NOT spam), so I sifted through it manually. Out of 151 lines and 986 words, (minus headers of course), I could only find two words that by any stretch of the imagination could contribute to the rejection, the words "performance" and "response", (remember, this is a mailing list pertaining to network management).

Not only does this win the prize for the most crack-headed bit of content filtering that I've seen to date, but on top of that, they take it one step further and email the product of their stupidity to the person they *think* has sent the UCE. Considering how prevalent it is for real spammers to forge the sender address, the only people likely to even see it are either the victims of a false positive, or the victims of a JoeJob. Dumb-asses.

In yur DVD, stealin yer DRM

09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0

Etch Released On Time

Etch released yesterday. Awesome news. Now I rant.

There is no one-size-fits-all release cycle. Some people wait on pins and needles for the next release of their favorite distro, and six months is almost more than they can bear. Others milk their vendor's support for a full six years and are upset when it is EOL'd and they are forced to upgrade. Releasing every six months and supporting each release for six years is a costly endeavor because it means floating up to twelve releases at once. Someone once said that you can make some of the people happy some of the time, but not all of the people all of the time. They must have been talking about distribution release cycles.

It's very difficult (some would say impossible) to obtain consensus among volunteers in a project the size of Debian. I don't speak for the Debian Project as a whole, but I do however perceive this as an area where consensus has been reached, and the time-line is: When It Is Ready. "Ready" as it is used here relates to the completion of certain goals outlined at the beginning of the cycle combined with the absence of release critical bugs. "When" is targeted at 18 to 24 months after the previous release, but only if it is "Ready".

Debian does not have a fixed release schedule. If that bothers you then you are free to jump in and try to change things (it is a community-based project). If it bothers you then perhaps you should consider a distribution that does have a fixed release cycle. You could also just try getting over it.

Etch released when it was Ready (22 months and 2 days after Sarge). Etch released on time.
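For anyone who wants to check the "22 months and 2 days" figure, here is a small sketch of the arithmetic. The release dates are not stated in this post; the ones used below (Sarge on 2005-06-06, Etch on 2007-04-08) are filled in as assumptions, and the helper names are made up.

```python
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, keeping the day of the month.

    Raises ValueError if that day does not exist in the target month,
    which is fine for this sketch.
    """
    index = d.year * 12 + (d.month - 1) + months
    return d.replace(year=index // 12, month=index % 12 + 1)


def months_and_days(start, end):
    """Return (whole months, leftover days) between two dates."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if add_months(start, months) > end:
        months -= 1  # the last month was not a full one
    return months, (end - add_months(start, months)).days


# Assumed release dates; not taken from the post itself.
SARGE = date(2005, 6, 6)
ETCH = date(2007, 4, 8)

if __name__ == "__main__":
    print(months_and_days(SARGE, ETCH))
```

With those assumed dates the result is 22 whole months plus 2 days, comfortably inside the 18 to 24 month window.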

Bad Form

After the draft version of GPLv3 came out, I intently followed all of the various discussions for about a month. It got pretty boring, pretty fast. Boring because it seems obvious to me that it isn't going to be earth shattering.

Interpretations of the text range from "riddled with unacceptable restrictions" and "<< 3 incompatible", to "not significantly different from past versions". Whatever. I'm confident that when v3 goes gold and the dust settles, we'll all still be here. Life will go on.

This guy over at Forbes disagrees. Of course, it's a little difficult for me to take him seriously with all of the vitriol aimed at RMS. See if you can count the number of personal insults.

Debconf6: Pictures

I'm back home now, but before settling into my normal routine I thought I'd take the time to get some pictures put up.