Creator of stockideashq.com here.
I've been working in software development for a while, and for most of that time Datadog has been the tool I used to monitor my infrastructure. Roughly speaking, Datadog is software you install on your machines so you can visualize things like VM CPU, memory, or HTTP traffic and detect potential issues, so your software can run smoothly.
I've used a few of its components: Dashboards, Monitors, APM, Traces, Logs, Synthetics, and RUM. The workflow for my website has usually been this: you set up alerts on your monitors or APM, and once a metric crosses a threshold, Datadog sends you an alert. Then you take a look at the dashboards and logs to see what's happening with your services.
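To make the "crosses a threshold, sends an alert" idea concrete, here's a toy sketch in Python. This is not Datadog's API or how their monitors are implemented; the function name, the averaging window, and the numbers are all made up for illustration.

```python
from typing import Sequence


def should_alert(samples: Sequence[float], threshold: float) -> bool:
    """Return True when the average of the recent samples is above the threshold.

    Hypothetical helper: a real monitor has more options (windows, recovery
    thresholds, no-data handling); this only shows the basic comparison.
    """
    if not samples:
        # No data means nothing to compare, so this sketch stays quiet.
        return False
    average = sum(samples) / len(samples)
    return average > threshold


# Example: CPU percentages from the last few minutes, alert above 90%.
cpu_last_5_min = [88.0, 93.5, 95.0, 97.2, 96.1]
if should_alert(cpu_last_5_min, threshold=90.0):
    print("ALERT: average CPU above 90%")
```

Averaging over a window instead of alerting on a single sample is a common way to avoid getting paged for a one-off spike.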
Another use case for me is Synthetics. Synthetics is basically a health check for your service: it sends a request and expects a success or error response. In addition to health checks, I've also used it to hit endpoints that need cache warming from many locations around the world, so that when visitors from around the world come to my website, the information they request has already been pushed to and cached at the edges. This means a better UX for visitors.
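Here's a rough sketch of what the health check plus cache warming boils down to, using only the Python standard library. The function names and URLs are hypothetical, and this is not how Synthetics works internally. Note also that running it from one machine only warms the edge closest to that machine; the point of using Synthetics is that the same request gets fired from many locations.

```python
import urllib.request
from typing import Callable, Dict, Iterable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Fetch a URL and return its HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def warm(urls: Iterable[str],
         fetch: Callable[[str], int] = http_status) -> Dict[str, bool]:
    """Request every URL once and record whether it answered with a 2xx.

    Requesting the URL is what makes the CDN edge pull and cache the
    content; the returned dict doubles as a simple health report.
    """
    results: Dict[str, bool] = {}
    for url in urls:
        try:
            results[url] = 200 <= fetch(url) < 300
        except OSError:
            # urlopen raises URLError/HTTPError (both OSError subclasses)
            # for network failures and 4xx/5xx responses.
            results[url] = False
    return results
```

You would call it with something like warm(["https://example.com/some-page"]) on a schedule; the fetch parameter is only there so the function can be tried out without touching the network.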
Traces also provide useful insights into the performance and behavior of services. By analyzing distributed traces, you can identify and understand bottlenecks, latency issues, and errors in your systems, and see how requests traverse different services and how those services talk to each other.
I currently use Datadog on the free plan since I don't have a lot of machines to monitor. But imagine you're a company with a lot of (say, N) machines handling your business traffic, and the Datadog agent costs P per month on each machine. Then you would be paying N * P per month to Datadog. A growing company's infrastructure grows roughly in proportion, and this is how Datadog makes its baseline revenue.
A typical reason why this business model scales really well: let's say I am Facebook and I've created a new service called Threads (this is an updated example, of course). I do some capacity planning and realize I might need N machines to handle the extra traffic. But since there is no guarantee about how the access pattern to other services will change, there's no reason to consolidate and clean up existing services. This means that if your company is growing, you will be paying more and more to companies like Datadog to make sure your infrastructure is ok. Also, the cost of installing a Datadog agent on a machine is not high. Imagine you have 1,000 VMs and pay 33 USD a month for each: you would only be paying 33,000 USD a month. That's quite cheap if you think about how expensive it is to hire a dev, and there's absolutely no reason to build your own observability solution in-house when it's not something you're good at.
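The N * P math is simple enough to do in your head, but here it is as a small Python sketch. The 33 USD per machine is just the number from my example above, not an official price, and the function name is made up.

```python
def monthly_cost(hosts: int, price_per_host: int) -> int:
    """Baseline monthly bill: N machines times P per machine."""
    if hosts < 0 or price_per_host < 0:
        raise ValueError("hosts and price_per_host must be non-negative")
    return hosts * price_per_host


# The example from the text: 1,000 VMs at 33 USD a month each.
print(monthly_cost(1000, 33))  # 33000

# The bill grows linearly as the fleet grows.
for hosts in (1000, 2000, 4000):
    print(hosts, monthly_cost(hosts, 33))
```

Doubling the fleet doubles the bill, which is exactly why a customer that keeps adding machines is a customer that keeps paying more.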
Also, Datadog says it wants to be your single pane of glass, so that teams can collaborate on shared data using the same view. From my past experience, it's very easy to speak "Datadog" as an infrastructure-as-code language, and as more devs learn to speak this language, it will get even harder for APM competitors to compete with them in the future.
This one should have an ok future at least.