Technology | May 15, 2024

Empowering On-Premises Deployments with Generative AI: Introducing Vector Search for a Self-Managed Modern Architecture

Introducing DataStax Hyper-Converged Data Platform and DataStax Enterprise 6.9 for cloud-native operations and generative AI capabilities on modern data center infrastructure.
For a detailed overview of HCDP and DSE 6.9, contact us.

We’re thrilled to announce Hyper-Converged Data Platform (HCDP), a platform with modular software services for hyper-converged infrastructure data needs. It includes Hyper-Converged Database (HCD), Hyper-Converged Streaming (HCS), OpenSearch, and Mission Control. These releases herald significant advances in cloud-native operations and generative AI capabilities on modern data center infrastructure and powerful new AI and edge systems. HCDP is a self-managed, enterprise-ready data platform for GenAI applications.

HCD represents the future of self-managed multi-model databases, built on Apache Cassandra® 4.0—the leading NoSQL database known for its unlimited scalability and exceptionally low latencies. Cassandra 4.0 represents the most stable and extensively tested major release to date, with improved scaling, significant performance and security upgrades, and reduced costs. 

HCD addresses the needs of new and modern data workloads. It can also take over DataStax Enterprise (DSE) workloads, which then benefit from HCD's cloud-native architecture and the latest Cassandra features.

It is designed for enterprises modernizing their data centers and adopting hyper-converged infrastructure (HCI) to support new AI and advanced workloads. By leveraging HCD, these enterprises can maximize the benefits of data center modernization and optimize total cost of ownership (TCO). 

For existing DSE customers, we’re introducing DSE 6.9, which offers a simple upgrade path to integrate powerful GenAI and vector database capabilities into their workloads.

With Kubernetes-powered DataStax Mission Control, both HCD and DSE 6.9 deliver cloud-like operational ease across on-premises environments, bare-metal data centers, virtual machines, and public/private clouds.

“We're looking forward to DSE 6.9 for its advancements in real-time digital banking services. Currently, we use DSE for critical tasks like data offloading and real-time data services. Its full-text search enhances customer engagement without adding to core system loads,” said Marcin Dobosz, director of technology, Neontri. “With features like the vector add-on and DataStax Mission Control, DSE 6.9 aligns with our goal of modernizing technology stacks for financial institutions, enhancing productivity, search relevance, and operational efficiency to better meet clients' needs.”

Hyper-Converged Database (HCD)

DataStax HCD caters to enterprises investing in data center modernization, HCI, and advanced GenAI data workloads. Its benefits include:

  • Built on proven open source technology - HCD is built on Cassandra 4.x, which is renowned for its reliability and scalability in handling AI workloads. Trusted by industry leaders like Netflix and Uber, Cassandra ensures robust performance and scalability with HCD.

  • Cloud-native data operations and observability - HCD offers a cloud-native architecture with elastic scalability. Mission Control is designed to support modern DevOps practices. It enables seamless deployment and management of data infrastructure while providing comprehensive observability capabilities.

  • Enhanced developer productivity - With HCD, developers benefit from rapid provisioning and intuitive data APIs, streamlining the development process. Additionally, HCD provides a comprehensive GenAI stack, facilitating retrieval-augmented generation (RAG) for enterprise applications.

  • Designed for hybrid cloud - HCD shares the same platform as DataStax Astra Serverless and has a completely cloud-native architecture, making it ideal for workloads distributed across public clouds and self-managed infrastructure. This is a step toward true hybrid-cloud workloads.

  • Hybrid vector search - HCD offers a vector search add-on, powered by the open-source JVector vector search engine, bringing SAI-powered vector search alongside OpenSearch-based full-text search capabilities.

HCD is tailored for enterprise operators and architects spearheading data center modernization initiatives. It excels in facilitating:

  • Data center modernization - HCD enables enterprises that have adopted HCI to optimize infrastructure and hardware costs, driving data center modernization efforts.

  • TCO initiatives - HCD aids in reducing TCO through modern DevOps practices and enhanced developer productivity, ensuring efficient resource use and operational excellence.

  • GenAI - HCD enhances GenAI capabilities with powerful vector search functionalities, leveraging the open-source JVector project.

Hyper-Converged Streaming (HCS)

DataStax HCS provides the latest in data streaming and event processing, built and designed to take advantage of cutting-edge hyper-converged infrastructure. Features include:

  • Separation of storage and compute - Built on Apache Pulsar, HCS natively provides the ability to separate event storage and event processing for high-performance data distribution.

  • Native multi-protocol support - Native API support for Apache Kafka and JMS workloads provides the ability to seamlessly integrate existing messaging and streaming applications, resulting in better performance, consolidated operations, and infrastructure modernization for new and existing development.

  • Built for global scale - Designed for highly scalable global data distribution, HCS provides the foundation for having the right data in the right place at the right time without having to manage third-party data replication.

HCS is built and designed to provide data communications for a modern infrastructure. With native support of inline data processing and embedding, HCS is built for:

  • GenAI - HCS provides the ability to bring vector data to the edge, as close to the application as possible, allowing for faster response times and enabling event data for better contextual generative AI experiences.

  • Application communication modernization - Designed and built to leverage the latest in HCI, HCS provides the ability to consolidate workloads from legacy applications, while also providing a modern approach to data streaming that supports inline data processing and embedding for traditional and AI workloads.

  • Hybrid cloud integration - When it comes to operating across multiple cloud environments or between on-premises and cloud, HCS provides a high-performance, scalable data communications layer that simplifies data integration across all environments.

Hybrid Search with JVector and OpenSearch

The introduction of vector search in HCD empowers developers to seamlessly harness proprietary data stored in Cassandra/DSE databases for their large language model, AI assistant, and real-time GenAI projects without compromising data security. With HCD's innovative JVector technology, users get a 10x enhancement in vector search performance, which surpasses the capabilities of traditional Lucene-based search. This represents a significant leap forward in search relevancy and accuracy.
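For readers new to the idea, the following is a minimal sketch of what a vector search does conceptually: it ranks stored embeddings by their similarity to a query embedding. This is an illustration only, not HCD's or JVector's API. It uses exact, brute-force cosine similarity in plain Python, whereas a production engine relies on an index to avoid scanning every vector. The document names and three-dimensional embeddings are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, documents, k=2):
    """Return the ids of the k documents most similar to the query embedding."""
    scored = [
        (cosine_similarity(query, embedding), doc_id)
        for doc_id, embedding in documents.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical embeddings; real ones come from an embedding model
# and typically have hundreds or thousands of dimensions.
DOCUMENTS = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "return-labels": [0.8, 0.2, 0.1],
}

print(top_k([1.0, 0.0, 0.0], DOCUMENTS))
```

In a RAG application, the ids returned this way would be used to fetch the matching records and pass them to the language model as context.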

In alignment with our steadfast commitment to open source technologies, we are also pleased to introduce seamless integration with OpenSearch, a robust platform known for its versatile and cost-effective search solutions for enterprises. OpenSearch offers a comprehensive suite of features tailored for enterprise search, encompassing full-text search capabilities, advanced analytics, comprehensive monitoring tools, and robust security functionalities.

By harnessing the combined power of HCD and OpenSearch, users can expect lightning-fast search results coupled with unparalleled precision and relevance, empowering organizations to extract maximum value from their data infrastructure investments.

This integration marks a significant milestone in the evolution of hybrid search solutions, representing a convergence of cutting-edge technologies to redefine the search experience for businesses. 
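As a rough illustration of how a hybrid search can combine a vector ranking with a full-text ranking, the sketch below uses reciprocal rank fusion (RRF), a common general-purpose merging technique. This announcement does not describe how HCD and OpenSearch combine results, so the choice of RRF here is an assumption made purely for illustration, and the function name, the constant `k=60`, and the document ids are all hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of ids into one list.

    Each id earns 1 / (k + rank) from every list it appears in,
    so ids ranked well by more than one list rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by id for a stable result.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from two separate searches over the same data.
vector_hits = ["doc-a", "doc-b", "doc-c"]  # semantic (vector) ranking
text_hits = ["doc-b", "doc-d", "doc-a"]    # keyword (full-text) ranking

print(reciprocal_rank_fusion([vector_hits, text_hits]))
```

Documents that appear in both lists ("doc-a" and "doc-b") outrank those found by only one search, which is the basic intuition behind combining semantic and keyword retrieval.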

DSE 6.9

DSE 6.9 offers enterprises a quick upgrade to GenAI workloads, with features that support enterprise requirements for: 

  • Developer productivity - DSE 6.9 gives developers Java 11 support, rapid provisioning, and an intuitive Data API for modern data modeling and advanced workloads.

  • Search relevance - DSE 6.9 improves SAI (storage-attached indexes) with analyzers that improve search relevance, and it adds support for ‘OR’ queries.

  • Cloud-native operations - Mission Control gives operations teams a single, cloud-like interface for observability and operational management, powered by Kubernetes.

  • Hybrid vector search - DSE 6.9 offers a vector search add-on, powered by the open-source JVector vector search engine, bringing SAI-powered vector search alongside Solr-based full-text search capabilities.

We’re excited about the possibilities that HCDP and DSE 6.9 bring to developers and enterprises alike, enabling advanced AI use cases, real-time data processing, and seamless cloud-native operations.

Download the HCD 1.0 preview now, or contact us to learn more.
