For how many data sources? The whole reason everyone goes to Splunk is that it scales, and scales incredibly well.
Large enterprises can generate hundreds of terabytes to petabytes every day. Splunk has all sorts of issues, but pretending you can replace it in any large shop with a 1,200-line Python script and SQLite is just disingenuous. This acquisition falls right into Cisco's sweet spot; they aren't chasing shops that can dump all their security and infrastructure logging into a SQLite database and not have it tip over in an hour.
It's around 6 data sources on ~25 machines, but it could be easily scaled to way more than that with a bit of work. And I mean less work than it takes to do even trivially simple things using the horrible Splunk API. There are many thousands of small companies using Splunk and getting totally ripped off for a very mediocre product with a rapacious and annoyingly aggressive salesforce.
You'd be surprised how many companies with infra that small have CTOs get consultant buzzword pilled into buying every SaaS under the sun nonetheless...
How many servers does Stack Overflow run on? It’s not a good measure of data volume or criticality.
I think “expensive” here is basically relative to revenue/margin. Where margins are high, spending on Splunk (etc.) isn’t meaningful. Where margins are thin, it hurts.
Basically, the arguments here seem to reflect the markets and business model folks are working under. Some pay, some can’t and some won’t - all valid.
I haven't developed it yet. But my Splunk-killer solution actually scales so big we can use it to walk to the center of the universe. And it's only 1 line of Rust and a bash script that runs whenever the Unix clock has 420 in the number string.
I think we're talking about very different levels of scale. Enterprises are generally feeding tens to hundreds of thousands of datapoints into Splunk, depending on their size, from servers, networking gear, endpoint devices, etc.
Wait, what? This is such an important detail. Log aggregators like Splunk start being something to consider when you get to about 25 THOUSAND machines, not 25 machines. I hope that for you, humility will come with experience.
Splunk isn't perfect; managing it is more work than it should be, for example. But I've got hundreds of systems I'm pulling logs from, and that's not counting infra and applications as well. And my deployment isn't even a large one by their standards. Your use case just isn't at the scale where Splunk makes sense.
Splunk does not scale to large data sources. It fucks out at a few TB and then you have to spend hours on the phone trying to work out which combination of licenses and sales reps you need to get going again.
By which time you can just suck the damn log file and grep it on the box.
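For the single-box case this comment describes, it really is about that much tooling. A minimal sketch (the log contents and pattern here are made up for illustration; on a real box you'd point grep at something under /var/log):

```shell
# Create a throwaway log so the example is self-contained.
tmpdir=$(mktemp -d)
printf 'INFO boot ok\nERROR disk full\nINFO retry\n' > "$tmpdir/app.log"

# Show the error lines and count them. For gzip-rotated logs,
# zgrep works the same way on the compressed files.
grep 'ERROR' "$tmpdir/app.log"
matches=$(grep -c 'ERROR' "$tmpdir/app.log")
echo "matches: $matches"
```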
But, and this is not meant as criticism or insult as I have no idea how Splunk works, it is just based on other comments; do you know what license your company has with them? It appears that if you are paying them millions, it scales fine, otherwise, it does not?
Well, usually you have to overpurchase up front, and they sell you a 3-year lock-in to make it an affordable capital cost. Then when you eke over it temporarily, the sales guy calls you up within 10 nanoseconds to bill you for more.
I was getting 2-4 calls a week.
It was so fucking annoying and expensive ($1.2M spend each cycle) we shitcanned the entire platform.
First thing they hear of this is when our ingress rate drops to zero and they phone us up to ask what is happening. Then we don't go to the numerous catch up and renewal meetings and calls. Then we stop answering the phone.
Had a similar experience with them; they are truly the worst. We wasted a bunch of time trying to figure out how the ingestion volume could be so high, and then realized that 99% of it was from the ridiculous default settings of their universal collector agent, which was dumping detailed system stats every few seconds, all to drive up usage so they can harass you about spending more money on their awful product. I did the renewal call with them just to tell them how outrageous their company is.
Yeah, because that is what I meant. A lot of services are usable without paying through the nose; this one apparently is not. But thanks for the excellent input.
I'm certainly not a Splunk expert, and I CERTAINLY have no insight into the nature of our financial arrangement with them, but yeah, it's expensive.
I think there's not much of a useful "flat rate" tier; you pay based on usage. People can accidentally spin up a ton of EC2 instances and get a huge surprise AWS bill, too. And yeah our logging needs are high and monotonically increasing but they're also relatively predictable at our scale.
It ALSO turns out though that Splunk is really really good at their job and matching their expertise would require tons of engineering effort and it's not like the disk space alone is THAT cheap if you want it to be searchable.
I've worked at companies with objectively large amounts of data. Splunk scaled to meet their workloads. At no enterprise doing this is someone able to just isolate a single log file and grep through it at scale.
Well, according to what people write in this thread, a distributed grep or some other way to organize a decent central logging system might be a necessary part of the core competency. Because if they buy splunk instead, they might go bankrupt.
You don’t have to be splunk to make money out of distributed grep but it turns out to not be that easy… as proven by the fact that there are quite a few competitors
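To make the "distributed grep" idea concrete: at its core it's fanning the same pattern match out across hosts and merging the tagged results. A toy sketch, with local directories standing in for remote hosts (a real version would ssh out to each box instead of reading local files):

```shell
# Two fake "hosts", each with its own made-up log file.
tmp=$(mktemp -d)
mkdir -p "$tmp/web1" "$tmp/db1"
printf 'INFO boot\nERROR disk full\n' > "$tmp/web1/app.log"
printf 'ERROR replication lag\n'      > "$tmp/db1/app.log"

results=""
for host in web1 db1; do
  # Real version: hits=$(ssh "$host" grep 'ERROR' /var/log/app.log)
  hits=$(grep -h 'ERROR' "$tmp/$host/app.log" | sed "s/^/$host: /")
  results="$results$hits
"
done
printf '%s' "$results"
```

The hard parts such products actually sell — indexing, retention, access control, not hammering every box at query time — start after this point, which is the parent comment's point.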
Uhhhh, Splunk scales no matter the size, for pure ingest at least. Now, if you got duped into the SVC model, I can see what you mean. But for pure gigs/day ingest, if you know what you're doing, it can scale infinitely.