Cassandra, Apache Cassandra & Its Database

Apache Cassandra is a NoSQL database. It was open-sourced by Facebook in July 2008. The original version was written by a former Amazon employee and a former Microsoft employee. Cassandra is mainly influenced by Amazon Dynamo and Google Bigtable. It is a highly scalable, high-performance distributed database designed to handle large amounts of data across many servers. Its main features are fault tolerance, no single point of failure, robust support, and consistency, and it is a column-oriented database. Let us first understand what a NoSQL (column-oriented) database is.

What Is a NoSQL Database?

NoSQL stands for “non SQL”, “non relational”, or “not only SQL”. The NoSQL concept differs somewhat from that of relational database management systems (RDBMS). Databases are broadly divided into relational (row-oriented) databases and NoSQL databases. Relational databases include Oracle Database, Microsoft SQL Server, MySQL, IBM DB2, IBM Informix, Teradata, etc. NoSQL databases include Cassandra, HBase, MongoDB, CouchDB, ToroDB, etc. NoSQL databases are further divided into four main types:

  1. Key-Value Store – A big hash table of keys and values. {Example: Riak, Amazon S3 (Dynamo)}
  2. Document-based Store – Stores documents made up of tagged elements. {Example: CouchDB}
  3. Column-based Store – Each storage block contains data from only one column. {Example: HBase, Cassandra}
  4. Graph-based – A network database that uses edges and nodes to represent and store data. {Example: Neo4J}

Cassandra is a column-based store. NoSQL databases in general share these characteristics:

  • NoSQL databases are schema-free
  • Horizontal scaling (nodes can be added or removed horizontally)
  • Support for easy replication
  • Easy APIs for other languages
  • Ability to handle huge amounts of data

NoSQL databases came into the picture to solve problems found in RDBMS. Let us see the main differences between NoSQL and RDBMS.

NoSQL | RDBMS
Uses non-relational data models (key-value, document, column, graph). | Uses a relational, row-oriented data model.
Follows the BASE model, shaped by the trade-offs of the CAP theorem. | Follows ACID (Atomicity, Consistency, Isolation, and Durability).
Does not support a powerful query language with joins, subqueries, etc. | Supports a powerful query language.
Has no fixed schema. | Has a fixed schema and structured data.
Offers limited support for OLTP (transactions). | Supports OLTP.
Is often only “eventually consistent”. | Is strongly consistent.
NoSQL vs RDBMS

The most popular NoSQL databases are:

  • Cassandra
  • MongoDB
  • HBase

What Is the Apache Cassandra Database?

Apache Cassandra is an open-source database that distributes data across all nodes in a cluster, handles large amounts of data, and provides high availability with no single point of failure. The highlights of Apache Cassandra are:

  • It is a column-oriented database.
  • It is scalable, fault-tolerant, and tunably consistent, and supports a configurable replication factor.
  • It has no single point of failure.
  • Cassandra follows a ring architecture rather than a master-slave architecture.
  • Cassandra is inspired by Amazon Dynamo and Google Bigtable.
  • The original version of Cassandra was written by a former Amazon employee and a former Microsoft employee.
  • Cassandra was open-sourced by Facebook in 2008.
  • Cassandra implements a Dynamo-style replication model with no single point of failure, but adds a more powerful “column family” data model.
  • Cassandra has client drivers for languages such as Python, C#/.NET, C++, Java, and Ruby.

Which Companies Use Cassandra in Real-Time Projects?

Many companies, including some very large ones, have successfully deployed and benefited from Apache Cassandra:

  • Apple
  • Twitter
  • Facebook
  • Rackspace
  • Cisco
  • eBay
  • Netflix
  • Comcast
  • Instagram
  • Spotify
  • Uber

and many more. The larger production environments hold petabytes of data in clusters of over 75,000 nodes.

Features of Cassandra

Decentralized 

Every node in the cluster has the same role. There is no single point of failure. Data is distributed across the cluster (so each node contains different data), but there is no master as every node can service any request.
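
To make the idea of a masterless ring concrete, here is a minimal sketch in Python. It is not Cassandra's implementation: the `Ring` class, the node names, and the use of SHA-256 as a stand-in for the real partitioner are all assumptions made for illustration. It shows how a partition key can be hashed to a position on a ring and mapped to the node that owns that range, so that any node can work out where a row lives without asking a master.

```python
import bisect
import hashlib


def token_for(value: str) -> int:
    """Map a string to a ring position (SHA-256 here, purely for illustration)."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """A toy token ring: each node owns the range that ends at its own token."""

    def __init__(self, nodes):
        # Sort nodes by token so the ring can be searched with bisect.
        self._entries = sorted((token_for(name), name) for name in nodes)
        self._tokens = [token for token, _ in self._entries]

    def owner(self, partition_key: str) -> str:
        """Return the node whose token is the first one >= the key's token."""
        index = bisect.bisect_left(self._tokens, token_for(partition_key))
        # Wrap around to the first node when the key hashes past the last token.
        return self._entries[index % len(self._entries)][1]


# Hypothetical node names; every node could build this same ring locally.
ring = Ring(["node-a", "node-b", "node-c", "node-d"])
for key in ["user:1", "user:2", "user:3"]:
    print(key, "->", ring.owner(key))
```

Because the mapping depends only on the key and the list of nodes, every node computes the same answer, which is what lets any of them accept a request.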

Replication

Cassandra lets you set a replication factor for each keyspace and also supports replication across multiple data centers.
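
The replication factor is simply the number of nodes that hold a copy of each row. The sketch below is a simplified illustration, not Cassandra's code: `replicas_for`, the node names, and the SHA-256 hashing are hypothetical, and data-center-aware placement is left out. It picks the node that owns a key and then walks clockwise around the ring until it has collected as many nodes as the replication factor asks for.

```python
import hashlib


def token_for(value: str) -> int:
    """Map a string to a ring position (SHA-256 is an illustrative stand-in)."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


def replicas_for(partition_key: str, nodes: list[str], replication_factor: int) -> list[str]:
    """Pick the owner of the key plus the next nodes clockwise on the ring."""
    if not 1 <= replication_factor <= len(nodes):
        raise ValueError("replication factor must be between 1 and the node count")
    ring = sorted(nodes, key=token_for)
    key_token = token_for(partition_key)
    # First node whose token is >= the key's token, wrapping to index 0.
    start = next((i for i, node in enumerate(ring) if token_for(node) >= key_token), 0)
    return [ring[(start + offset) % len(ring)] for offset in range(replication_factor)]


# Hypothetical five-node cluster with three copies of every row.
nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
print(replicas_for("user:42", nodes, replication_factor=3))
```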

Scalability

Read and write throughput both increase linearly as new machines are added, with no downtime or interruption to applications. You can keep adding nodes to scale the cluster horizontally.

Fault-tolerant

Data is automatically replicated to multiple nodes for fault-tolerance. Replication across multiple data centers is supported. Failed nodes can be replaced with no downtime.
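
The following toy model illustrates why replication gives fault tolerance. It is an in-memory sketch under stated assumptions: `ToyCluster`, its byte-sum placement rule, and the node names are hypothetical and bear no relation to Cassandra's storage engine. Each write goes to every live replica, and a read is served by the first live replica that holds the row, so losing one node does not lose the data.

```python
class ToyCluster:
    """In-memory stand-in for a cluster; not how Cassandra stores data."""

    def __init__(self, nodes, replication_factor):
        self.nodes = list(nodes)
        self.replication_factor = replication_factor
        self.storage = {name: {} for name in self.nodes}
        self.down = set()  # names of nodes currently considered failed

    def replicas(self, key):
        # Simplified placement: the byte sum of the key picks the first replica.
        start = sum(key.encode("utf-8")) % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)]
                for i in range(self.replication_factor)]

    def write(self, key, value):
        live = [node for node in self.replicas(key) if node not in self.down]
        if not live:
            raise RuntimeError("no live replica for " + key)
        for node in live:
            self.storage[node][key] = value

    def read(self, key):
        # Try each replica in turn and skip the ones that are down.
        for node in self.replicas(key):
            if node not in self.down and key in self.storage[node]:
                return self.storage[node][key]
        raise KeyError(key)


cluster = ToyCluster(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
cluster.write("user:1", "Alice")
cluster.down.add(cluster.replicas("user:1")[0])  # simulate a failed node
print(cluster.read("user:1"))                    # Alice
```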

Tunable consistency

Cassandra lets you choose a consistency level for each read and write. The default consistency level is ONE. Depending on your requirements, you can raise the read and write consistency levels from ONE to TWO, THREE, or QUORUM.
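
A short worked example helps show what these levels mean in practice. The helper names below are hypothetical. A quorum is a majority of the replicas, and a read is guaranteed to overlap with the most recent write when the number of replicas read plus the number of replicas written exceeds the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set overlaps every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


rf = 3
print(quorum(rf))                                          # 2
print(reads_see_latest_write(1, 1, rf))                    # False (ONE / ONE)
print(reads_see_latest_write(quorum(rf), quorum(rf), rf))  # True (QUORUM / QUORUM)
```

With a replication factor of 3, reading and writing at ONE is fastest but may return stale data, while QUORUM on both sides trades some latency for reads that always reach at least one up-to-date replica.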

MapReduce support

Cassandra integrates with Hadoop, including MapReduce support, and also works with Hive and Pig.

Query Language

Cassandra introduced the Cassandra Query Language (CQL). CQL is a simple interface for accessing Cassandra, and it looks very much like SQL. In CQL, a database is called a keyspace and a table is called a column family.
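
To show what CQL looks like, the sketch below builds two CQL statements as plain Python strings: one that creates a keyspace and one that creates a table inside it. The helper functions and the `demo` and `users` names are hypothetical, and the statements are only printed here. Running them would require a Cassandra cluster and a client such as cqlsh or a language driver.

```python
def create_keyspace_cql(keyspace: str, replication_factor: int) -> str:
    """Build a CREATE KEYSPACE statement using SimpleStrategy replication."""
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{'class': 'SimpleStrategy', "
        f"'replication_factor': {replication_factor}}};"
    )


def create_table_cql(keyspace: str, table: str,
                     columns: dict[str, str], primary_key: str) -> str:
    """Build a CREATE TABLE statement from a column-name-to-type mapping."""
    column_defs = ", ".join(f"{name} {cql_type}" for name, cql_type in columns.items())
    return (
        f"CREATE TABLE IF NOT EXISTS {keyspace}.{table} "
        f"({column_defs}, PRIMARY KEY ({primary_key}));"
    )


# Illustrative names only; do not interpolate untrusted input this way.
print(create_keyspace_cql("demo", 3))
print(create_table_cql("demo", "users", {"user_id": "uuid", "name": "text"}, "user_id"))
```

The keyspace plays the role of a database in SQL, and the table it contains is what this article calls a column family.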

Fast Writes

Cassandra is well suited to fast writes and can store hundreds of terabytes of data without sacrificing read efficiency.

Supports Different Languages

Cassandra has client drivers for languages such as Python, C#/.NET, C++, Java, and Ruby.

History of Cassandra

  • Cassandra was developed at Facebook for inbox search.
  • It was open-sourced by Facebook in July 2008.
  • Cassandra was accepted into the Apache Incubator in March 2009.
  • It became an Apache top-level project in February 2010.

Cassandra Version Releases

The list below starts with version 0.6, and the latest version of Cassandra at the time of writing is 3.9. Here are the Cassandra versions with their release dates.

Version | Original release date | Latest version | Release date | Status
0.6 | 2010-04-12 | 0.6.13 | 2011-04-18 | No longer supported
0.7 | 2011-01-10 | 0.7.10 | 2011-10-31 | No longer supported
0.8 | 2011-06-03 | 0.8.10 | 2012-02-13 | No longer supported
1.0 | 2011-10-18 | 1.0.12 | 2012-10-04 | No longer supported
1.1 | 2012-04-24 | 1.1.12 | 2013-05-27 | No longer supported
1.2 | 2013-01-02 | 1.2.19 | 2014-09-18 | No longer supported
2.0 | 2013-09-03 | 2.0.17 | 2015-09-21 | No longer supported
2.1 | 2014-09-16 | 2.1.16 | 2016-10-10 | Still supported
2.2 | 2015-07-20 | 2.2.8 | 2016-09-28 | Most stable release
3.0 | 2015-11-09 | 3.0.9 | 2016-09-16 | Stable release
3.9 | 2015-12-08 | 3.9 | 2016-09-29 | Latest release

 Cassandra For Developers 

Always on Architecture — A true masterless architecture (unlike other master/slave RDBMS and NoSQL databases) delivers continuous availability for your applications.

Natively Distributed — The gold standard in multi-data center and cloud replication supplies real write/read anywhere capabilities, allowing you to easily put data where it’s needed anywhere in the world.

Fast Linear-Scale Performance — Enables millisecond response times with linear scalability (double your throughput with two nodes, quadruple it with four, and so on) to deliver response time speeds your customers have come to expect.

Language Drivers – Cassandra has client drivers for languages such as Python, C#/.NET, C++, Java, and Ruby.

Flexible Data Model – Cassandra's data model is very flexible: new entities or attributes can be added to a column family at any time, without restrictions.

CQL – Cassandra supports the Cassandra Query Language, which looks like SQL and makes it easy for developers to create keyspaces and column families.

Apache Cassandra For Administrators

Always Online Architecture – Cassandra follows a ring architecture and has no master-slave arrangement.

Native Multi-Data Center Replication – Cross data center (in multiple geographies) and multi-cloud availability zone support for writes/reads.

Transparent Fault Detection and Recovery – Nodes that fail can easily be restored or replaced.

Tunable Data Consistency – Support for strong or eventual data consistency across a widely distributed cluster.

OpsCenter Monitoring/Management Tool – OpsCenter gives system operators the flexibility to monitor and manage even the most complex workloads with ease from any web browser.

Commodity Hardware – Cassandra runs on commodity hardware, so there is no need to invest in large, high-memory machines.

Mitigate Risks of Downtime – Apache Cassandra follows a ring architecture, so if one node fails, neither the service nor the data is affected. Another node takes over, so there is no downtime.

Improved Customer Experience – Apache Cassandra’s high availability and superior performance give businesses, and their mission-critical applications, the ability to provide customers with a superior user experience.

DataStax Enterprise – DataStax Enterprise supports deploying, operating, and maintaining Apache Cassandra in production.

This is a complete overview of the Apache Cassandra database. For more information, please follow our blog.
