ChaosSearch, based in Boston, is a log analytics solution that aims to provide enterprises with data lakes that turn cloud object storage into analytics engines. ChaosSearch features a stateless architecture that separates storage from compute, and data is stored in Amazon S3. It is accessible via SSO, data is protected with role-based access control and encrypted while in motion, and the service provides audit logging. It stores and analyzes data within S3 cloud storage directly so that there is no…
Logstash is a must in an ELK stack, which I am sure is going to be the #1 use case. Any time you have several sources, Logstash can be the common point that aggregates and categorizes the data, then sends the new data to its destination. Very handy. It is free and open source. It may not be appropriate for analyzing data sets that depend on each other but come from different data sources, because Logstash works on the data at hand and does not wait for other data to arrive. It would also be unwise to have Logstash handle complicated, long-running transformations, because data is passed in and straight back out. The faster you do it, the safer.
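To illustrate the "works on data at hand" point, here is a minimal Python sketch of an event-at-a-time pipeline. It is not Logstash code or Logstash's API; the function names (`categorize`, `pipeline`) and the sample events are hypothetical. It only shows the general pattern: several sources are merged, and each event is tagged using nothing but its own fields, so no step can wait for a matching event from another source.

```python
from typing import Iterable, Iterator


def categorize(event: dict) -> dict:
    """Tag a single event using only the fields it already carries."""
    message = event.get("message", "")
    level = "error" if "ERROR" in message else "info"
    return {**event, "level": level}


def pipeline(*sources: Iterable[dict]) -> Iterator[dict]:
    """Merge several sources and transform each event as soon as it is read."""
    for source in sources:
        for event in source:
            # No lookup into other sources happens here: the event is
            # processed with the data at hand and passed straight on.
            yield categorize(event)


# Hypothetical sample data standing in for two log sources.
app_logs = [{"source": "app", "message": "ERROR db timeout"}]
web_logs = [{"source": "web", "message": "GET /health 200"}]

results = list(pipeline(app_logs, web_logs))
```

Anything that needs to correlate an `app` event with a later `web` event would have to hold state across events, which is exactly the kind of work that fits poorly in this model.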
Memory: Logstash is a HOG. If you are deploying it on commodity (i.e. cheap and old) hardware, you will need at least 2GB just for Logstash. So don't expect to run your entire ELK stack on one AMD Athlon machine.
Overlap: Logstash fills the area of the ELK stack where it makes the most sense: as a log file transformer / shipper. However, if you start breaking up that stack by adding other components, you start seeing where features of Logstash may be implemented or solved much more easily (or better, or to a higher degree of resolution) in the additional components.
More Overlap: Since my team employs Syslog-ng extensively, Logstash can sometimes get in the way (and this may be a problem for DevOps stacks overall). You can configure Syslog-ng to record certain information from a source, filter that data, and even export that data in a particular format. Logstash will pick that data up and then parse it. However, if you don't keep your Syslog-ng configuration files and your Logstash configuration files in sync, your results will not be what you expected, and this will (sometimes) translate into hours or days of work hunting down a line item in a configuration file.
As I said earlier, for a production-grade OpenStack Telco cloud, Logstash brings high value in flexibility, compliance, and troubleshooting efficiency. However, this comes with a higher infra and ops cost in resources, which is not a problem in big datacenters because there is no resource crunch in terms of servers or CPU/RAM.
MongoDB and Azure SQL Database are just that: databases. They allow you to pipe data into a database, which means that a lot of the log filtering becomes a simple exercise of querying information from a DBMS. However, Logstash was chosen for its ease of integration with our choice of ELK. Elasticsearch is an obvious inclusion: using Logstash with its native DevOps stack is really rational.
It is very difficult to give any figures on ROI, as it depends on many factors, and in a Telco cloud environment it is much more complex to work out; however, I would give some points below on ROI:
ROI based on flexibility is very high, as it reduces the time needed to reach a root cause analysis (RCA).
ROI based on integration is very high because it supports multi-vendor environments, avoids vendor lock-in, and works across multi-cloud setups.
ROI on resource consumption is lower because Logstash is 2-3 times more resource-intensive than its lightweight alternatives, resulting in latency.