Developers are building modern analytics applications with Apache Druid to deliver interactive data experiences for customer-facing insights.
By David Wang, Vice President, Product Marketing, Imply (https://imply.io/)
Every year, industry pundits predict that data and analytics will become more valuable the following year. But it doesn’t take a crystal ball to predict that. There’s actually something much more interesting happening that’s going to change everything in the analytics world: the rise of a new hero, the software developer.
If the past is any indication of the future, then what we are seeing is a major transformation unfolding across every industry: a changing of the guard, so to speak, of the ones who are creating value from data.
Today, the industry at large equates analytics with data warehousing and business intelligence. It’s a traditional approach of BI experts querying historical data “once in a while” for the executive dashboards and reports that have been around for decades.
But for bleeding-edge companies like Netflix, Target, and Salesforce, their use of analytics is much more progressive – and much more impactful and real-time. Companies like these see the true game-changer for data in the hands of their software developers.
Their developers are building modern analytics applications and doing it with Apache Druid to deliver interactive data experiences for investigative, operational, and customer-facing insights.
But what’s causing the emergence of these apps, and what does it mean for developers?
Let’s break down the Top 5 reasons:
Increasingly, analytics are needed to understand a situation or investigate a problem. This requires the freedom to slice and dice and interact with data live with sub-second query response at any scale. It’s a dynamic user experience that can be best created via a developer-built application.
No one wants to sit around waiting for a query to process. And while many databases will claim the checkbox for interactivity and speed, they’ll come with lots of scale constraints. They’ll rely on tricks like roll-ups, pre-aggregation, or querying recent data only to make queries appear faster, but that just restricts the insights you can actually get. So the operative word here is “scale”.
The days of relying on a few BI analysts to write SQL queries are seemingly in the rear-view. Data-driven companies today want to give everyone – from product managers to ops teams to data scientists – free access to explore. And multi-tenancy pushes the user count even higher. But concurrency doesn’t just come from the number of users. Developers are being asked to build analytics apps with dozens of visualizations, each firing off several concurrent SQL queries.
Now I’ll admit – it’ll be hard to find a modern database today that doesn’t claim high concurrency. You obviously wouldn’t want to force-fit Postgres (or even Elastic) into uncomfortable positions. But what about scale-out cloud data warehouses? Doesn’t elasticity = scale = high concurrency? Of course, but elasticity without insane compute efficiency (the kind you get with Apache Druid) is going to make for a really expensive app.
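To make the concurrency point concrete, here is a minimal Python sketch of a single dashboard load fanning out into several simultaneous SQL queries. It posts to Druid's SQL endpoint (`/druid/v2/sql`). The host and port, the `events` datasource, the `country` column, and the panel names are all assumptions made for illustration, not details from any real deployment.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumption: a Druid router reachable at this address; adjust for your cluster.
DRUID_SQL_URL = "http://localhost:8888/druid/v2/sql"


def build_request(sql):
    """Build (but do not send) the HTTP POST for Druid's SQL endpoint."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        DRUID_SQL_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql):
    """Send one SQL query and return the decoded JSON rows."""
    with urllib.request.urlopen(build_request(sql), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def load_dashboard(panels, run=run_query, max_workers=8):
    """Run every panel's query concurrently and return {panel name: rows}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(run, sql) for name, sql in panels.items()}
        return {name: future.result() for name, future in futures.items()}


# Hypothetical panels against a hypothetical "events" datasource.
PANELS = {
    "events_last_hour": (
        "SELECT COUNT(*) AS n FROM events "
        "WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR"
    ),
    "top_countries": (
        "SELECT country, COUNT(*) AS n FROM events "
        "GROUP BY country ORDER BY n DESC LIMIT 5"
    ),
}
```

Calling `load_dashboard(PANELS)` against a running cluster issues one request per panel at the same time. Multiply that by the number of panels per page and the number of users with the page open, and the query load on the database adds up quickly.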
Businesses of all kinds are rapidly adopting event-streaming platforms like Apache Kafka. Our friends at Confluent, the creators of Kafka, have built a data mesh that puts data ‘in motion’. With data swirling around constantly, what better use of it than to analyze it for continuous, real-time insights?
Companies like Netflix are doing this and their developers are creating a huge competitive advantage by bringing together Apache Kafka and Druid to build an analytics app that enables a high quality, always-on, user experience.
With an eye on real-time analytics, several things have to be taken into account. Is analyzing streams alone enough – or does the use case need to compare streams against historical data? For Intercontinental Exchange, it’s the full spectrum from present to past that gives them the right security visibility. Does ingestion scalability matter – do you need to process millions of events per second? What about latency or data quality?
Analytics of the past were about making better decisions for the business. While that is still very relevant – and a huge opportunity to create more value – we are increasingly seeing companies build analytics apps to deliver insights to their customers.
Companies like Twitter, Cisco ThousandEyes, and Citrix are doing this and driving material revenue. They’re giving their customers visibility and insights – and that in turn creates big business for them.
But building a customer-facing analytics app on just any database can get pretty hairy. There’s way more on the line than with internal use cases when you think about SLAs and the customer experience. It’s in these apps where microseconds of latency make a difference, downtime is costly, and concurrency and cost go through the roof. Thankfully there’s a database for that!
At this point in tech, I think we all see that every company is becoming a software company. But with everyone having easy access to the cloud, simply building cloud software and services isn’t enough to sustain an advantage. That’s why companies like Salesforce and Airbnb build analytics apps to optimize how they build their products.
Developers there – and at the best software companies – are building analytics apps to help them create the best product experiences. Whether it’s next-gen observability, user behavior insights, live A/B testing, or even recommendation engines, an analytics app is at work.
There you have it. Our prediction for this year. We see the world of analytics expanding rapidly to modern analytics apps – with developers becoming the new analytics heroes in organizations.
Here’s to 2022!