Cassandra Introduction - Apache Cassandra

Introduction to Cassandra

Apache Cassandra is a high-performance distributed database designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. It is a type of NoSQL database. Let us first understand what a NoSQL database does.


A NoSQL database (sometimes called Not Only SQL) is a database that provides a mechanism to store and retrieve data other than the tabular relations used in relational databases. These databases are typically schema-free, support easy replication, have simple APIs, are eventually consistent, and can handle huge amounts of data.

The primary objective of a NoSQL database is to have

  • simplicity of design,
  • horizontal scaling, and
  • finer control over availability.

NoSQL databases use different data structures from those of relational databases, which makes some operations faster in NoSQL. The suitability of a given NoSQL database depends on the problem it must solve.

NoSQL vs. Relational Database

The following table lists the points that differentiate a relational database from a NoSQL database.

Relational Database                                Nosql Database
-------------------------------------------------  --------------------------------------
Supports powerful query language.                  Supports very simple query language.
It has a fixed schema.                             No fixed schema.
Follows ACID (Atomicity, Consistency, Isolation,   It is only “eventually consistent”.
and Durability).
Supports transactions.                             Does not support transactions.
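
To make the "eventually consistent" entry in the table more concrete, here is a minimal Python sketch. It is a toy model, not Cassandra's actual implementation: the Replica class, the sync function, and the sample keys and values are all hypothetical names chosen for illustration. It assumes conflicts are resolved by keeping the value with the newest timestamp (last write wins). Two replicas briefly disagree after separate writes and then converge once they exchange their data.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica (hypothetical): maps key -> (timestamp, value)."""
    data: dict = field(default_factory=dict)

    def write(self, key, value, timestamp):
        # Keep only the value with the newest timestamp (last write wins).
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def sync(replicas):
    """Merge all entries so every replica holds the newest value per key."""
    merged = {}
    for replica in replicas:
        for key, (ts, value) in replica.data.items():
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    for replica in replicas:
        for key, (ts, value) in merged.items():
            replica.write(key, value, ts)


a, b = Replica(), Replica()
a.write("user:1", "alice@old.example", timestamp=1)
b.write("user:1", "alice@new.example", timestamp=2)

stale = a.read("user:1")  # replica a has not yet seen the newer write
sync([a, b])              # replicas exchange data in the background
converged = a.read("user:1") == b.read("user:1") == "alice@new.example"
```

Before sync runs, a read from replica a returns the older value; afterwards both replicas return the newer one. A relational database following ACID would not expose the stale intermediate state to readers.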

In addition to Cassandra, the following NoSQL databases are quite popular:

  • Apache HBase - HBase is an open source, non-relational, distributed database modeled after Google’s BigTable and written in Java. It is developed as part of the Apache Hadoop project and runs on top of HDFS, providing BigTable-like capabilities for Hadoop.
  • MongoDB - MongoDB is a cross-platform, document-oriented database system that avoids the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas, making the integration of data in certain types of applications easier and faster.

What is Apache Cassandra?

Apache Cassandra is an open source, distributed, and decentralized storage system (database) for managing very large amounts of structured data spread out across the world. It provides a highly available service with no single point of failure.
Listed below are some of the notable points of Apache Cassandra:

  • It is scalable, fault-tolerant, and offers tunable consistency.
  • It is a column-oriented database.
  • Its distribution design is based on Amazon’s Dynamo and its data model on Google’s Bigtable.
  • Created at Facebook, it differs sharply from relational database management systems.
  • Cassandra implements a Dynamo-style replication model with no single point of failure, but adds a more powerful “column family” data model.
  • Cassandra is used by some of the biggest companies, such as Facebook, Twitter, Cisco, Rackspace, eBay, Netflix, and more.
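
The Dynamo-style distribution and replication mentioned above can be sketched as a hash ring. The following Python is a simplified illustration only, not Cassandra's partitioner: the Ring class, the token function, the node names, and the use of MD5 are all assumptions made for this example. Each node gets a position (token) on the ring, a key is stored on the first node at or after the key's token, and additional copies go to the next nodes clockwise.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a ring position (MD5 is used here only for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy hash ring (hypothetical): every node is a peer, there is no master."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        # Sort nodes by token so the ring can be searched with bisect.
        self.entries = sorted((token(node), node) for node in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, key: str):
        """Return the nodes holding `key`: the owner plus the next nodes clockwise."""
        # First node whose token is >= the key's token, wrapping around the ring.
        start = bisect.bisect_left(self.tokens, token(key)) % len(self.entries)
        count = min(self.replication_factor, len(self.entries))
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
owners = ring.replicas("user:42")

# If the first replica goes down, the remaining replicas can still serve the key.
survivors = [node for node in owners if node != owners[0]]
```

Because any node can compute the replica set for a key and every key lives on several nodes, losing a single node does not make the data unavailable, which is the "no single point of failure" property described above.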

Features of Cassandra

Cassandra has become so popular because of its outstanding technical features. Given below are some of the features of Cassandra:

  • Elastic scalability - Cassandra is highly scalable; it allows you to add more hardware to accommodate more customers and more data as required.
  • Always on architecture - Cassandra has no single point of failure and it is continuously available for business-critical applications that cannot afford a failure.
  • Fast linear-scale performance - Cassandra is linearly scalable, i.e., it increases your throughput as you increase the number of nodes in the cluster. Therefore it maintains a quick response time.
  • Flexible data storage - Cassandra accommodates all possible data formats including: structured, semi-structured, and unstructured. It can dynamically accommodate changes to your data structures according to your need.
  • Easy data distribution - Cassandra provides the flexibility to distribute data where you need by replicating data across multiple data centers.
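  • Transaction support - Cassandra offers atomicity, isolation, and durability for writes to a single partition, plus lightweight (compare-and-set) transactions, but not the full multi-statement ACID transactions of a relational database.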
  • Fast writes - Cassandra was designed to run on cheap commodity hardware. It performs blazingly fast writes and can store hundreds of terabytes of data, without sacrificing the read efficiency.
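
As an illustration of replicating data across multiple data centers, the sketch below builds a CQL CREATE KEYSPACE statement that uses the NetworkTopologyStrategy replication class. The helper function create_keyspace_cql, the keyspace name shop, and the data center names dc_east and dc_west are hypothetical. In a real cluster the data center names must match the names the cluster itself reports, and the statement would be run through cqlsh or a client driver, which this example does not do.

```python
def create_keyspace_cql(keyspace: str, replicas_per_dc: dict) -> str:
    """Build a CREATE KEYSPACE statement with one replica count per data center."""
    options = ["'class': 'NetworkTopologyStrategy'"]
    options += [f"'{dc}': {count}" for dc, count in replicas_per_dc.items()]
    replication = "{" + ", ".join(options) + "}"
    return f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = {replication};"


# Hypothetical layout: three copies in one data center, two in another.
statement = create_keyspace_cql("shop", {"dc_east": 3, "dc_west": 2})
print(statement)
```

With these settings every row in the keyspace would be stored five times in total, so either data center could serve the data if the other became unreachable.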

History of Cassandra

  • Cassandra was developed at Facebook for inbox search.
  • It was open-sourced by Facebook in July 2008.
  • Cassandra was accepted into the Apache Incubator in March 2009.
  • It became an Apache top-level project in February 2010.

All rights reserved © 2020 Wisdom IT Services India Pvt. Ltd