Sharding
Sharding is the process of storing data records across
multiple machines. As data volumes grow, a single machine
may not be able to store all the data or provide acceptable read and
write throughput. MongoDB addresses this problem with sharding, which scales
horizontally: you add more machines to support
data growth and the demands of read and write operations.
Why Sharding?
- In replication, all writes go to the primary (master) node
- Latency-sensitive queries still go to the primary
- A single replica set is limited in size: 12 nodes in older MongoDB versions, 50 members (at most 7 voting) since MongoDB 3.0
- The memory of a single machine can't hold a large active dataset
- Local disk is not big enough
- Vertical scaling is too expensive
Sharding in MongoDB
The diagram below shows sharding in MongoDB using a
sharded cluster, which has three main components.
Shards: Shards
store the data. To provide high availability and data consistency, each shard
is a separate replica set in a production environment.
Config Servers:
Config servers store the cluster's metadata. This data contains a mapping of
the cluster's data set to the shards. The query router uses this metadata to
target operations to specific shards. In older MongoDB versions, a production
sharded cluster had exactly 3 config servers; since MongoDB 3.4, config servers
must be deployed as a replica set.
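To make the idea of metadata mapping concrete, here is a minimal Python sketch of a lookup table that maps ranges of shard-key values to shards. It is only a toy model, not MongoDB's actual implementation: the names CHUNK_LOWER_BOUNDS, CHUNK_OWNERS, and shard_for_key, the shard names, and the integer shard key are all assumptions made for illustration.

```python
from bisect import bisect_right

# Hypothetical metadata: each range covers shard-key values from its lower
# bound (inclusive) up to the next range's lower bound (exclusive).
CHUNK_LOWER_BOUNDS = [0, 1000, 2000, 3000]
CHUNK_OWNERS = ["shardA", "shardB", "shardC", "shardA"]


def shard_for_key(key: int) -> str:
    """Return the name of the shard that owns the range containing `key`."""
    if key < CHUNK_LOWER_BOUNDS[0]:
        raise ValueError("key is below the lowest range bound")
    # bisect_right finds the first bound greater than key; the owning range
    # is the one just before it.
    index = bisect_right(CHUNK_LOWER_BOUNDS, key) - 1
    return CHUNK_OWNERS[index]


if __name__ == "__main__":
    for key in (0, 999, 1000, 2500, 5000):
        print(key, "->", shard_for_key(key))
```

A single shard can own several ranges, as shardA does here, so the table maps ranges rather than whole shards.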
Query Routers:
Query routers are mongos instances that interface with client
applications and direct operations to the appropriate shard. The query router
processes and targets operations to shards and then returns results to the
clients. A sharded cluster can contain more than one query router to divide the
client request load, and generally has many. Each client sends its requests to
one query router.
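The routing behaviour can be sketched with a small in-memory model in Python. This is an illustrative toy, not how mongos is implemented: SHARDS, owner_of, and route_find are hypothetical names, and the modulo placement rule merely stands in for the metadata held by the config servers. It shows that a query containing the shard key (user_id here, an assumed key) can be sent to one shard, while a query without it has to be sent to every shard.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory "shards": shard name -> documents stored there.
SHARDS: Dict[str, List[dict]] = {
    "shardA": [{"user_id": 1, "name": "Ann"}, {"user_id": 4, "name": "Dev"}],
    "shardB": [{"user_id": 2, "name": "Bo"}],
    "shardC": [{"user_id": 3, "name": "Cy"}],
}
SHARD_NAMES = ["shardA", "shardB", "shardC"]


def owner_of(user_id: int) -> str:
    """Hypothetical placement rule standing in for the config metadata."""
    return SHARD_NAMES[(user_id - 1) % len(SHARD_NAMES)]


def route_find(
    user_id: Optional[int] = None, name: Optional[str] = None
) -> Tuple[List[str], List[dict]]:
    """Return (shards contacted, matching documents)."""
    if user_id is not None:
        targets = [owner_of(user_id)]  # shard key known: contact one shard
    else:
        targets = list(SHARD_NAMES)    # no shard key: contact every shard
    results = [
        doc
        for shard in targets
        for doc in SHARDS[shard]
        if (user_id is None or doc["user_id"] == user_id)
        and (name is None or doc["name"] == name)
    ]
    return targets, results


if __name__ == "__main__":
    print(route_find(user_id=4))  # contacts only the owning shard
    print(route_find(name="Bo"))  # contacts all shards
```

The client sees the same interface in both cases; only the number of shards contacted differs.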