Welcome to “Vector Vanguard: Tracking the Pulse of Vector Tech 09/2024” – a source for the latest developments in vector databases, vector indexes, RAG (Retrieval-Augmented Generation), similarity search, and related technologies that caught my attention in the last month.

Featured Vector Tech Topic: Hybrid search with PostgreSQL and pgvector

Jonathan Katz writes about hybrid search. Delivering precise and relevant search results is crucial for enhancing user experience, yet traditional keyword-based searches often fall short in understanding the context or semantic meaning behind queries. This is where hybrid search comes into play: combining the strengths of lexical (keyword-based) and semantic (vector-based) search methods to provide more accurate and comprehensive results.

Hybrid search leverages vector embeddings to capture the semantic relationships between data points. By integrating vector similarity search with traditional full-text search, applications can understand not just the literal terms but also the intent and context of user queries. This approach improves recall and precision, making it particularly useful in applications like recommendation systems, conversational AI, and content discovery platforms.
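A common way to merge the ranked lists from a lexical search and a vector search is Reciprocal Rank Fusion (RRF). The following is a minimal pure-Python sketch; the document ids and the constant k=60 (a value often used in RRF literature) are illustrative assumptions, not something from Katz's article:

```python
def rrf_fuse(result_lists, k=60):
    """Combine several ranked result lists with Reciprocal Rank Fusion.

    Each list is an ordered sequence of document ids, best hit first.
    A document's fused score is the sum of 1 / (k + rank) over every
    list it appears in; a higher fused score ranks it earlier.
    """
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort document ids by fused score, best first
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical top-3 lists from a keyword search and a vector search
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]
fused = rrf_fuse([keyword_hits, vector_hits])
```

Documents that appear high in both lists (here `doc_b`) float to the top, which is exactly the behavior hybrid search is after.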

Enter pgvector, an open-source extension for PostgreSQL that brings vector similarity search directly into your database. With pgvector, you can:

  • Store vector embeddings efficiently within PostgreSQL tables.
  • Perform similarity searches using distance metrics such as Euclidean distance, cosine distance, or inner product.
  • Combine vector searches with SQL queries, enabling seamless integration with existing data and operations.
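The three distance metrics listed above are easy to state directly. A small pure-Python sketch (the example vectors are made up; pgvector computes these in-database, its `<#>` operator actually returns the negative inner product so that smaller is always "closer"):

```python
import math

def euclidean(a, b):
    # L2 distance: straight-line distance between the two vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    # 1 - cosine similarity: 0 for parallel vectors, 2 for opposite ones
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def inner_product(a, b):
    # Larger inner product = more similar
    return sum(x * y for x, y in zip(a, b))

a, b = [1.0, 0.0], [0.0, 1.0]
```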

By using pgvector, developers can eliminate the need for separate vector databases or external services, simplifying the architecture and improving performance. It allows for the execution of complex queries that consider both the presence of specific keywords and the semantic similarity between data entries.
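As a rough illustration of such a combined query: pgvector's cosine-distance operator is `<=>`, and PostgreSQL's built-in full-text search provides `plainto_tsquery` and `ts_rank`. The table and column names below are invented for the sketch, and the query is only assembled as a string here, not executed against a database:

```python
# Hypothetical schema: items(id, content text, content_tsv tsvector,
#                            embedding vector(1536))
hybrid_query = """
SELECT id,
       ts_rank(content_tsv, plainto_tsquery('english', %(q)s)) AS keyword_score,
       embedding <=> %(query_vec)s::vector AS cosine_distance
FROM items
WHERE content_tsv @@ plainto_tsquery('english', %(q)s)
ORDER BY embedding <=> %(query_vec)s::vector
LIMIT 10;
"""
```

A single statement filters on keyword matches and ranks by semantic distance, which is the kind of combined lexical-plus-vector query the paragraph above describes.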

My take

Filtering in vector search is crucial to enhance precision and reduce resource consumption. Future vector databases should integrate filtering methods that work efficiently across large, complex datasets. Two primary approaches exist: pre-filtering, which narrows down results before vector search, and post-filtering, which applies filters after the search. Both methods have pros and cons, but future systems need to balance speed with accuracy. Techniques like payload indexing and hybrid pre/post-filtering may become key to ensuring scalability, performance, and precision in large-scale vector databases.
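The difference between the two approaches can be shown with a toy in-memory example; the corpus, the metadata filter, and the brute-force distance scan below are all illustrative stand-ins for a real vector index:

```python
def l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Toy corpus: (id, embedding, metadata)
corpus = [
    ("a", [0.0, 0.0], {"lang": "en"}),
    ("b", [0.1, 0.0], {"lang": "de"}),
    ("c", [0.2, 0.0], {"lang": "en"}),
    ("d", [0.9, 0.9], {"lang": "en"}),
]

def pre_filter_search(query, predicate, k):
    # Pre-filtering: restrict candidates first, then rank by distance.
    candidates = [(i, e) for i, e, m in corpus if predicate(m)]
    return [i for i, e in sorted(candidates, key=lambda t: l2(query, t[1]))][:k]

def post_filter_search(query, predicate, k):
    # Post-filtering: rank everything, keep top-k, then apply the filter.
    # Note: this can return fewer than k hits if the filter removes results.
    top = sorted(corpus, key=lambda t: l2(query, t[1]))[:k]
    return [i for i, e, m in top if predicate(m)]

is_en = lambda m: m["lang"] == "en"
```

With the query `[0, 0]` and k=2, pre-filtering returns two English documents, while post-filtering returns only one, because the second-nearest overall hit is German and gets discarded after the search. This shrinking result set is exactly the precision/completeness trade-off mentioned above.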

Additional Vector Tech 09/2024 resources

Summary of selected articles that caught my attention.

Local AI – what it is and why we need it

Local AI, or on-device AI, enables artificial intelligence to operate directly on devices rather than relying on cloud servers. This shift offers numerous benefits, including enhanced privacy by keeping data local, increased accessibility by working offline, and improved sustainability by reducing the energy demands of large cloud data centers. Local AI is particularly useful for real-time applications and resource-constrained devices. With advancements like smaller language models and vector databases, local AI has the potential to transform sectors ranging from healthcare to retail by providing secure, energy-efficient, and always-available AI solutions. For such scenarios around semantic search, a vector database can run directly on the edge device, with the option to sync data into the cloud.

Source: Local AI – what it is and why we need it by Objectbox.io, 11-Sep-2024

Charting a path to the data- and AI-driven enterprise of 2030

The article from McKinsey explores the future of data and AI-driven enterprises by 2030, emphasizing the growing importance of data ubiquity, AI integration, and strategic data leadership. It highlights key challenges such as managing structured and unstructured data, unlocking competitive advantage through proprietary data, and evolving data architectures like data mesh.

Organizations will require robust governance, scalable data pathways, and new talent strategies to leverage AI, while addressing security and compliance risks. Success depends on strong leadership, cross-functional collaboration, and a culture of continuous data innovation. A clear view of new skills is required as mentioned in the article “Data engineers, for example, will need to develop a new range of skills, such as database performance tuning, data design, DataOps (which combines DevOps, data engineering, and data science), and vector database development. New roles might include prompt engineers, AI ethics stewards, and unstructured-data specialists.”

Source: Charting a path to the data- and AI-driven enterprise of 2030 by McKinsey Digital, 05-Sep-2024

Understanding Indexing Efficiency for Approximate Nearest Neighbor Search in High-dimensional Vector Databases

The thesis titled “Understanding Indexing Efficiency for Approximate Nearest Neighbor Search in High-dimensional Vector Databases” by Yuting Qin provides a detailed and technical exploration of vector databases, focusing on the efficiency of Approximate Nearest Neighbor (ANN) search in high-dimensional data using graph-based indexing. The novel contribution of this work is the development of a machine learning-based framework for predicting the search performance of different graph structures.

Machine learning is applied to model the trade-offs between graph quality, search speed, and accuracy. A regression model is trained on features extracted from various graph indices (e.g., node degree, edge length, clustering coefficient) to predict the search performance of a given graph structure. This model helps guide the selection of graph-building heuristics that optimize both accuracy and speed.
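As a toy illustration of that idea only (the thesis uses richer features and real benchmark measurements; the numbers, the single feature, and the hand-rolled ordinary least squares below are all assumptions for the sketch), one could fit a line relating average node degree to per-query cost:

```python
def fit_line(xs, ys):
    # Ordinary least squares for y = slope * x + intercept
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# Synthetic data: average node degree vs. distance computations per query
degrees = [8, 16, 32, 64]
cost = [120, 200, 360, 680]
slope, intercept = fit_line(degrees, cost)

# Predict the cost for an unseen graph configuration
predicted_cost_at_48 = slope * 48 + intercept
```

A model like this, trained on measured index builds, can steer the choice of graph-construction parameters before paying for a full build, which is the practical appeal of the thesis's framework.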

The experimental evaluation is performed on two benchmark datasets, SIFT1M and Deep1M, which are commonly used in ANN benchmarking. The experiments show that:

  • HNSW graphs consistently outperform other methods in terms of search speed and accuracy, though their construction is more complex and memory-intensive.
  • Graph degree has a significant impact on search performance: increasing the degree improves accuracy but also increases the computational cost of each query.
  • Edge length plays a crucial role in balancing search efficiency and accuracy. Graphs with shorter edges perform better in terms of both accuracy and speed.

The regression model trained on graph features provides insights into how specific graph properties influence search performance. For example, increasing the average node degree improves accuracy but also increases the computational cost, while reducing edge lengths consistently improves both accuracy and speed.

Source: Understanding Indexing Efficiency for Approximate Nearest Neighbor Search in High-dimensional Vector Databases thesis by Yuting Qin

Looking Ahead: Vector tech conferences or events

A selection of conferences or events containing vector tech sessions:

For more articles around vector tech, see my blog.