Rockset, the serverless search and analytics company that enables SQL on NoSQL data, today announced the capability to analyze raw events from Apache Kafka in real time. Kafka, backed by Confluent, is one of the most popular distributed streaming platforms and is capable of handling trillions of events a day. Rockset takes an entirely new approach to ingesting, analyzing and serving data so that developers and business stakeholders can run powerful SQL analytics, including joins, on raw event data from Kafka. With this release, Rockset is also announcing a partnership with Confluent, with Rockset’s Kafka Connect Plugin listed as a Verified Gold Connector in Confluent Hub.
Increasingly, businesses are capturing real-time data to drive intelligent actions on the fly. However, traditional databases are not built to handle semi-structured data, making it difficult to operationalize event data in real time. To solve this issue and unlock analytics, organizations invest considerable data engineering effort in building complex data pipelines that schematize and load NoSQL data from Kafka event streams into SQL-based systems. These pipelines are difficult to build and expensive to maintain, and they deliver insights hours after the events occur, making “real-time” operational analytics on event data next to impossible.
Rockset complements Kafka’s KSQL stream processing capabilities by serving as the “sink” that ingests the processed stream. With Rockset, new event data from Kafka is automatically represented as a dynamic SQL table and available for querying in seconds. Rockset uses Converged Indexing™ and a Distributed SQL Processing Engine under the hood to enable customers to filter, aggregate and join across different datasets from different sources in milliseconds, without upfront schema definitions.
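To make the idea of querying events without an upfront schema concrete, the following is a minimal Python sketch. It is not Rockset's implementation or API: the `DynamicTable` class, its methods and the sample events are hypothetical names chosen for illustration, and the simple in-memory index only stands in for the general idea of indexing every field as events arrive.

```python
from collections import defaultdict


class DynamicTable:
    """Toy schemaless collection (hypothetical, for illustration only).

    Every scalar field of every event is indexed on arrival, so new
    fields become queryable without declaring a schema first.
    """

    def __init__(self):
        self.rows = []                  # raw events, in arrival order
        self.index = defaultdict(set)   # (field, value) -> set of row ids
        self.columns = set()            # every field name seen so far

    def ingest(self, event):
        """Store one event and index each of its scalar fields."""
        row_id = len(self.rows)
        self.rows.append(event)
        for field, value in event.items():
            self.columns.add(field)
            # Only simple scalar values are indexed in this sketch.
            if isinstance(value, (str, int, float, bool)):
                self.index[(field, value)].add(row_id)
        return row_id

    def where(self, field, value):
        """Return events whose `field` equals `value`, using the index."""
        row_ids = self.index.get((field, value), set())
        return [self.rows[i] for i in sorted(row_ids)]


def count_by(rows, field):
    """Aggregate: count rows per distinct value of `field`."""
    counts = defaultdict(int)
    for row in rows:
        if field in row:
            counts[row[field]] += 1
    return dict(counts)


# Events with differing shapes, as they might arrive from a stream.
table = DynamicTable()
table.ingest({"type": "order", "product": "shoes", "amount": 40})
table.ingest({"type": "order", "product": "hat", "amount": 15, "coupon": "SPRING"})
table.ingest({"type": "refund", "product": "shoes"})

orders = table.where("type", "order")
orders_per_product = count_by(orders, "product")
```

In this sketch the `coupon` field appears only in the second event, yet it is queryable as soon as that event is ingested; a production system would additionally need persistence, distribution and a real SQL layer.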
“When you embrace modern real-time technologies like Kafka, you discover that NoSQL databases do not support the type of powerful analytics you need, and that’s when you turn to SQL databases. But it will take you hours to extract-transform-load these events into a traditional SQL database and that is just not fast enough for real-time use cases,” said Venkat Venkataramani, co-founder and CEO of Rockset. “Our goal is to give Kafka users the speed and simplicity they need for deriving maximum value from their event streams in seconds.”
With this release, Rockset supports the ability to:
• Visualize event data in leading real-time SQL dashboards with JDBC support, including Tableau, Apache Superset, Redash and Grafana.
• Create developer APIs for building microservices and applications for the Internet of Things (IoT), e-commerce, operational monitoring and more.
• Join Kafka event streams with business data in Amazon DynamoDB, Amazon Kinesis, Amazon S3, Google Cloud Storage and more.
Customers embrace real-time event analytics with Kafka and Rockset.
“We need to carefully monitor our growth in real time,” said Amboj Goyal, principal engineer at Fynd. “Is a certain product suddenly selling more? Is there a fraudulent transaction? We easily generate 20-30 million events per day, all captured in Kafka streams. Our applications query the data every few seconds. By sending our raw event data directly from Kafka to Rockset, we save a lot of time and energy. We track over 40 metrics in real time and constantly take immediate actions.”