High-level overview of some Columnar Stores

Columnar Stores exist since a long time. There are well-known players like Vertica, Sybase IQ or Exasol. Recent trends like Cassandra and HBase or announcements from SAP, IBM, Microsoft and Oracle caused some hype for (in-memory) columnar stores.

The term “columnar store” is ambiguous: columnar features for non-relational and relational databases should be differentiated.

Wide column Stores

Google BigTable, Cassandra and HBase belong to a category that should be named wide column stores. These databases belong to the NoSQL movement but they require some kind of data modeling. The data model is not relational. Databases in this category group columns/attributes into column families. Attributes that are used together should be grouped into the same column family. Data within one column family is stored row by row though.

Relational column Stores

Vendors from relational databases added columnar features to their databases. Such features (if stand-alone database or feature within existing relational database) should be labeled relational columnar stores. The approaches differ substantially, but all of the approaches have analytical queries as focus:

Reading data is much more frequent than writing data
Reading (including grouping or sorting) selective columns is much more frequent than reading many or even all columns in one query

High-level overview of some vendors:

Vendor / Product	Characteristics
SAP HANA	Row store and column storage are distinguished during table creation. You have to choose row or column table syntax. There are tow engines (row and columnar) that operate on the compressed data. CREATE ROW TABLE R1 (A INT, …); CREATE COLUMN TABLE C1 (A INT, …);
IBM DB2	Similiar to SAP HANA but with different syntax. The feature is available since version 10.5 (“DB2 BLU”). Columnar data is compressed in-memory and data is operated by a new columnar engine (data in row tables is operated by the existing engine). CREATE TABLE R1 (A INT, …) ORGANIZE BY ROW; CREATE TABLE C1 (A INT, …) ORGANIZE BY COLUMN;
Microsoft SQL Server	Microsoft offers the possibility to create columnar indexes in SQL Server 2012. The index can contain one or more columns. Updates and deletes on tables with columnar indexes are not allowed (This restriction does not apply to Microsoft Parallel Data Warehouse PDW. SQL Server 2014 is expected to come without the restriction). CREATE NONCLUSTERED COLUMNSTORE INDEX i1 on ;
Oracle Database	Not available yet. The feature was announced during OOW 2013 for Oracle version 12.1.0.2. Expected functionality: a memory area for columnar data must be configured. Tables are still stored in row based format on disc. Tables that have to go into the columnar memory area must be enumerated.

Wide column Stores

Relational column Stores

Leave a reply Cancel reply

Recent Posts

Archives

Categories

High-level overview of some Columnar Stores

Wide column Stores

Relational column Stores

Related Posts

Schemaless revolution and data migration burden

DataBeat week 01/2015

NoSQL, NewSQL, cloud-native databases

SQL – Only the Strong Survive

Leave a reply Cancel reply

Recent Posts

Archives

Categories