First of all, this post isn’t for people who think a developer’s job is only to design, write, and test code. The job goes far beyond that. One important responsibility is shipping your code to production. How do you do that safely?
Certain classes of exciting problems surface only in massively distributed systems. This post is about one of them. It’s rare, it’s real, and if it happens, it will take your system down. The root cause, however, is easy to overlook.
It’s surprising how fast the volume of data is growing around the world and on the Internet. Who would have thought 10 years ago that a physics experiment would one day generate 25 petabytes (26 214 400 GB) of data every year? Yes, I’m looking at you, LHC. Times are changing, companies are changing. Everyone designs for scale a tad differently. And that’s good: it’s important to design for the right scale.
Let’s assume you are considering Cassandra for log storage or, more generally, for time series storage. You are well prepared: you have searched Google extensively. Yet there is a trap waiting to kill your cluster a few weeks after launch.
As they say, there are two kinds of people in the world:
those who pick up the ice cube that falls on the floor and those who kick it under the fridge; those who back up their files and those who haven’t experienced losing all their files yet.
Which category do you fall in?
I decided to set up a backup system with Resilio Sync, the heir apparent of the BitTorrent Sync software. Well, that wasn’t a good idea, and I don’t recommend this software to anyone.