Datastax's yearly Cassandra Summit has grown into a two-day event this year, June 11-12 in San Francisco. If you are a user of Cassandra (or are considering it), then you probably want to attend this conference (use code SFSummit25 for 25% off).
I'll be there to present on virtual nodes; find me if you want to chat about databases, network management, beer, or the Can't Hug Every Cat phenomenon.
NP: The Malkin Jewel, The Mars Volta
I'm fortunate enough to be speaking at Berlin Buzzwords again this year. As usual, chats over beer or currywurst (or both) are always welcome. Hope to see you there!
As usual, if you think we'll cross paths and want to meet for beer, coffee, or currywurst, let me know!
This year, the organizers are arranging for a number of hackathons and workshops to precede and follow the main conference. One of those will be a Cassandra event hosted by Acunu and Datastax (date to be announced).
If you're in the Berlin area (or can be), and are interested in search, data analysis and NoSQL (and especially if you're interested in Cassandra), I'd recommend you plan to attend.
If you've been around for more than a few years, you've probably borne witness to how susceptible the tech industry is to hype. Some new-shiny comes along, people lose their minds, and seemingly overnight The Next Big Thing has spread like wildfire. Like it or not, you find yourself bombarded by blog posts, tweets, articles, and water cooler chat from wild-eyed co-workers. Clearly, Ted Dziuba knows what I'm talking about.
Ironically though, what Ted is missing is the corollary, the equally annoying contrarian who takes it upon himself to set the world right by refuting The Next Big Thing, usually with straw-men and a lot of hand-waving. Seriously dude, don't feed the trolls.
Because I can be a contrarian too, let's have a closer look at some of Ted's points.
The idea is that object relational databases like MySQL and PostgreSQL have lapsed their useful lifetimes, and that document-based or schemaless databases are the wave of the future. Never mind of course that MySQL was the perfect solution to everything a few years ago when Ruby on Rails was flashing in the pan. Never mind that real businesses track all of their data in SQL databases that scale just fine. (For Silicon Valley readers, Walmart is a real business, Twitter is not.)
No, the idea is not that relational databases have lapsed their useful lifetimes, and at least in the case you later cite (Cassandra), the data-model is not considered a feature when compared to relational databases. And Twitter has indicated that they aren't using Cassandra as a replacement for everything; they are still using MySQL, so maybe there's still hope that they can be a Real Business someday.
Also, just because a "real business" like Walmart can cope using a relational database doesn't disqualify any business that can't from being "real"; that's just dumb. Even where it is technically possible, there are cases when the economics of running the business preclude the costs of using, say, Oracle.
So you've magically changed your backend from MySQL to Cassandra. Stuff will just work now, right? Well, no. Did you know that Cassandra requires a restart when you change the column family definition? Yeah, the MySQL developers actually had to think out how ALTER TABLE works, but according to Cassandra, that's a hard problem that has very little business value. Right.
I had considered trying to explain here how the differing use-cases and data-model made this less of a problem than Ted perceived it to be, but it's probably easier to just point out that it's basically fixed (see CASSANDRA-44).
I'm not just singling out Cassandra - by replacing MySQL or Postgres with a different, new data store, you have traded a well-enumerated list of limitations and warts for a newer, poorly understood list of limitations and warts, and that is a huge business risk.
I can't speak for other NoSQL projects, but I can assure you that if you have a work-load that can be reasonably accommodated by MySQL or Postgres, then that is what we will recommend you use. For those that can't, they're just going to have to live with a newer, less understood list of limitations and warts, because otherwise there is no business.
The sooner your company admits [that you are not Google], the sooner you can get down to some real work. Developing the app for Google-sized scale is a waste of your time, plus, there is no way you will get it right. Absolutely none. It's not that you're not smart enough, it's that you do not have the experience to know what problems you will see at scale.
The takeaways here, I believe, are: don't prematurely optimize, Google alone has the scale to justify distributed systems, and you are dumb. One out of three, not bad.
NoSQL will never die, but it will eventually get marginalized, like how Rails was marginalized by NoSQL. In the meantime, DBAs should not be worried, because any company that has the resources to hire a DBA is likely has decision makers who understand business reality.
Any company that has decision makers who understand reality will know to use the right tool for the right job.
As I mentioned earlier, I was fortunate enough to be able to attend FOSDEM this year. The sheer scale of FOSDEM is amazing, with literally thousands of people in attendance, dozens of projects represented, and hundreds(?) of talks. It's doubly impressive when you consider that it is entirely volunteer-driven and 100% sponsored (it's no cost to attend).
The NoSQL track organized by Steven Noels on Sunday turned out quite well too I thought, and it seemed to generate a lot of interest (the room was continually filled to capacity and the doors barred). There were talks from some of the usual players (MongoDB, HBase, and of course Cassandra), along with some less heard of projects (GT.M). Mine was the last talk of the morning and seemed to be pretty well received. I got a lot of great questions both during and after the session, and ended up talking shop with several attendees until the next session was starting.
Finally, here is the video of my talk, or you can view it here with the slides.