MongoDB Sharding - MongoDB

What is Sharding and how does it work?

Sharding in MongoDB is the process of storing data records across multiple machines. It is MongoDB's approach to meeting the demands of data growth. As the data size increases, a single machine might not be able to store all the data or provide acceptable read and write throughput. Sharding solves this problem through horizontal scaling: you add more machines to support data growth and the demands of read and write operations.
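The idea of spreading records across machines can be illustrated with a small, self-contained Python sketch. It is a toy model only, not MongoDB's actual placement logic: the names shard_for and distribute are hypothetical, and the simple "hash modulo number of shards" rule is an assumption made for brevity (MongoDB assigns ranges of shard key values, called chunks, to shards).

```python
import hashlib
from collections import defaultdict


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (toy rule, not MongoDB's)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap to a shard index.
    return int.from_bytes(digest[:8], "big") % num_shards


def distribute(records, key_field, num_shards):
    """Group records by the shard that their key hashes to."""
    shards = defaultdict(list)
    for record in records:
        shards[shard_for(str(record[key_field]), num_shards)].append(record)
    return shards


# 1000 hypothetical records spread over 3 simulated machines.
records = [{"user_id": f"user{i}", "score": i} for i in range(1000)]
placement = distribute(records, "user_id", 3)

for shard_id in sorted(placement):
    print(f"shard {shard_id}: {len(placement[shard_id])} records")
```

Each simulated machine ends up holding only part of the data, which is the essence of horizontal scaling: adding machines reduces the amount of data and load that any single machine must handle.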

Why Sharding?

  • In replication, all writes go to the primary (master) node
  • Latency-sensitive queries still go to the primary
  • A single replica set is limited to 50 members (12 before MongoDB 3.0)
  • Memory cannot be large enough when the active dataset is big
  • The local disk is not big enough
  • Vertical scaling is very expensive

Sharding in MongoDB

The diagram below shows sharding in MongoDB using a sharded cluster.


The diagram above has three main components:

  • Shards − Shards store the data. They provide high availability and data consistency. In a production environment, each shard is a separate replica set.
  • Config Servers − Config servers store the cluster's metadata, which contains a mapping of the cluster's data set to the shards. The query router uses this metadata to target operations to specific shards. In a production environment, older sharded clusters had exactly 3 config servers; since MongoDB 3.4, config servers must be deployed as a replica set.
  • Query Routers − Query routers are mongos instances that interface with client applications and direct operations to the appropriate shard. The query router processes each operation, targets it to the relevant shards, and then returns the results to the client. A sharded cluster can contain more than one query router to divide the client request load, and a client sends each request to one query router.
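How the three components cooperate can be sketched in plain Python. This is a simplified in-memory model, not the MongoDB implementation: ConfigMetadata, QueryRouter, the user_id shard key, the chunk boundaries and the shard names are all hypothetical, and each shard is represented by an ordinary dictionary rather than a replica set.

```python
import bisect
from dataclasses import dataclass


@dataclass
class ConfigMetadata:
    """Stand-in for config server metadata: shard key ranges mapped to shards."""
    upper_bounds: list  # sorted, exclusive upper bound of every chunk but the last
    owners: list        # owners[i] holds chunk i; the last chunk is unbounded above

    def shard_for(self, key):
        # bisect_right returns the index of the first chunk whose upper bound exceeds the key.
        return self.owners[bisect.bisect_right(self.upper_bounds, key)]


class QueryRouter:
    """Stand-in for a query router: uses the metadata to pick target shards."""

    def __init__(self, metadata, shards):
        self.metadata = metadata
        self.shards = shards  # shard name -> dict acting as that shard's storage

    def insert(self, doc):
        name = self.metadata.shard_for(doc["user_id"])
        self.shards[name][doc["user_id"]] = doc
        return name

    def find_by_key(self, user_id):
        # Targeted operation: the shard key is known, so only one shard is contacted.
        name = self.metadata.shard_for(user_id)
        return self.shards[name].get(user_id)

    def find_all(self, predicate):
        # Without the shard key, the router asks every shard and merges the results.
        results = []
        for store in self.shards.values():
            results.extend(doc for doc in store.values() if predicate(doc))
        return results


# Chunks: keys below 100 on shardA, 100 to 199 on shardB, 200 and above on shardC.
metadata = ConfigMetadata(upper_bounds=[100, 200], owners=["shardA", "shardB", "shardC"])
shards = {name: {} for name in metadata.owners}
router = QueryRouter(metadata, shards)

for user_id in (5, 100, 150, 250):
    router.insert({"user_id": user_id, "name": f"user{user_id}"})

print(router.find_by_key(150))
print(sorted(d["user_id"] for d in router.find_all(lambda d: d["user_id"] >= 100)))
```

The client only ever talks to the router. A lookup by shard key touches a single shard, while a query that does not include the shard key has to be sent to all shards, which is why the choice of shard key matters for query performance.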

All rights reserved © 2020 Wisdom IT Services India Pvt. Ltd