Why we switched from Papertrail to SplunkStorm

A few months ago, during the great search for Loggly alternatives, I ran across two major solutions, Papertrail and SplunkStorm. At first we used Papertrail, since it was very familiar to us as developers (in fact it's really just "tail" with an optional UI). It turned out to be incredibly useless when we wanted real details from the logs, more than just a simple "what's going on right now". We upgraded our plan, shoved about 50GB/month into the system, and it soon became apparent that Papertrail could not reliably handle the massive amount of logging data we had. What's worse, there's no built-in graphing support.

We realized that SplunkStorm, despite being a lot more "enterprise-esque", offered us the one thing we really needed from logging: traceability. There's a big difference between accepting log messages and understanding log messages. For example, we include timing metrics in some of our log messages, to help us see which areas need attention for performance tuning. We were able to parse this data into meaningful fields, and then run a query like this:

index_time > 10 | timechart count

This told us how many of our events took more than 10 seconds, and plotted them out in a very nifty little time chart.
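To make the idea concrete, here is a rough Python sketch of what that query does conceptually: pull a timing field out of each log line, keep the slow events, and count them per time bucket. This is not how Splunk works internally. The log line format, the `slow_events_per_hour` function, and the sample data are all made up for illustration; only the `index_time` field name and the 10 second threshold come from the query above.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical log format: "<ISO timestamp> <message> index_time=<seconds>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s.*\bindex_time=(?P<index_time>\d+(?:\.\d+)?)"
)


def slow_events_per_hour(lines, threshold=10.0):
    """Count events whose index_time exceeds threshold, bucketed by hour."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # this line carries no timing metric
        if float(match.group("index_time")) > threshold:
            ts = datetime.fromisoformat(match.group("ts"))
            # Truncate the timestamp to the start of its hour
            bucket = ts.replace(minute=0, second=0, microsecond=0)
            counts[bucket] += 1
    return dict(sorted(counts.items()))


# Made-up sample data, only for illustration
sample = [
    "2013-02-01T10:05:00 indexed document 17 index_time=12.4",
    "2013-02-01T10:40:00 indexed document 18 index_time=3.1",
    "2013-02-01T11:15:00 indexed document 19 index_time=15.0",
    "2013-02-01T11:20:00 worker heartbeat",
]

for hour, count in slow_events_per_hour(sample).items():
    print(hour.isoformat(), count)
```

The point of the comparison is that with Splunk the field extraction and the bucketing are a one-line search, while a plain tail-style tool leaves you writing and running something like the above yourself.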

What's even better, SplunkStorm is actually cheaper than Papertrail, and it's incredibly fast. With Papertrail, you can tell they're not indexing events quite properly. Specifically, if you search for something that happened days ago, it might take hours to get any sort of response (in our case we have upwards of 50GB of log files, and it appears to simply be grepping through them in reverse chronological order). When you need to see a pattern, that's simply not acceptable.



The folks at Papertrail certainly did a good job with tailing log files. They do that way better than Splunk does, and they even have a CLI client for it. However, that one feature does not make up for the fact that it fails very badly at any sort of searching, which really is the point of log aggregation, isn't it?

If you want more from your statistics, if you can't wait to dive into details directly within your logging platform and really find out what your logs mean, not just what they say, then take a look at Splunk. It's powerful enough for the advanced developer (can you say "regex" everywhere!), yet simple enough for our non-technical users to also understand. Being able to produce sexy graphics on-demand of statistics you didn't even know you had? Yeah, that's Splunk.

There are two major features that Splunk is working on, and they are the reason we switched over now: a real-time streaming API for searching (think CLI client for tailing Splunk logs), and, more importantly for most, alerts. While Papertrail does both of these things now, it's not very good with alerts, and its current features do not make up for its lack of fast searching through the history of your logs.

What do you think? Do you use Splunk? What are some of the cool things you've done with your logging systems?

logos/graphics are trademarked and copyrighted by Splunk and used with their consent
