Skip to main content

$sample operator in MongoDB

If we want select document randomly in fixed number of document then we have to use $sample. $sample operator randomly selects specific number of data from input.


{ $sample: { size: <positive integer> } }

  1. If size is greater than or equal to 5% of data (total documents), $sample scan the collection and sort the document.
  2. If size is less than 5% of data (total documents) then $sample use the pseudo random cursor on the collection if using WiredTiger Storage Engine.
  3. $sample use the _id to select random data if using MMAPv1 Storage Engine.
  4. $sample can select same document more than once in its result set.
Example: Suppose we have a collection named Books and wants to select 3 document randomly.

{ "_id" : 1, "Book Title": "book1", "price" : 20, "quantity" : 1, "date" : ISODate("2016-08-05T07:00:00Z") }
{ "_id" : 2, " Book Title " : " book2", "price" : 10, "quantity" : 2, "date" : ISODate("2016-08-05T08:00:00Z") }
{ "_id" : 3, " Book Title " : " book3", "price" : 30, "quantity" : 4, "date" : ISODate("2016-08-17T10:00:00Z") }
{ "_id" : 4, " Book Title " : " book4", "price" : 10, "quantity" : 2, "date" : ISODate("2014-09-01T11:20:39.736Z") }
{ "_id" : 5, " Book Title " : " book5", "price" : 20, "quantity" : 6, "date" : ISODate("2014-09-04T20:23:13.331Z") }

Run following query

db.Books.aggregate({ $sample: { size: 3 } })


{ "_id" : 5, " Book Title " : " book5", "price" : 20, "quantity" : 6, "date" : ISODate("2014-09-04T20:23:13.331Z") }
{ "_id" : 1, "Book Title" : "book1", "price" : 20, "quantity" : 1, "date" : ISODate("2016-08-05T07:00:00.000Z") }
{ "_id" : 3, " Book Title " : " book3", "price" : 30, "quantity" : 4, "date" : ISODate("2016-08-17T10:00:00.000Z") }

Popular posts from this blog

Merge and Merge join transformation in SSIS

Using Merge Transformation we can combine two sorted data-set into single data-set basically Merge Transformation used to combines rows from two sorted data flows into one sorted data flow. Following tasks you may perform using Merge Transformation: 1.Suppose we have a scenario like, we need to merge data from a database table and excel means we want to merge data from two different data sources. For such type of scenario, you can use Merge Transformation. 2.If we want to merge data from two same structured tables but exists two different servers. 3.Sometimes we get an error due to data in a row, after correcting errors in the data we can re-merge rows easily. See below explanations may help you to understand Merge Transformation: I do evaluate here, you already know about the data source, data conversion, data flow, task flow, control flow etc. Note:Before Merge transformation, we need to sort the data using Sort Transformation. After sorting data add data path to Merge…

Add day to ISODate in MongoDB

We can use $add operator to add days in ISODate in mongodb, $add is the Arithmetic Aggregation Operator which adds number and date in mongodb.

{ $add: [ <expression1>, <expression2>, ... ] }

Note:  If one of the argument is date $add operator treats to other arguments as milliseconds to add to the date.
Example: Suppose we have a Test collection as below.

{"Title" : "Add day to ISODate in MongoBD","CreatedDate" : ISODate("2016-07-07T08:00:00.000Z")}

Query to add 2 days in CreatedDate

db.Test.aggregate([      { $project: { Title: 1, AddedDate: { $add: [ "$CreatedDate", 2*24*60*60000 ] } } }    ])


{ "_id" : ObjectId("579a1567ac1b3f3732483de0"), "Title" : "Add day to ISODate in MongoBD", "AddedDate" : ISODate("2016-07-09T08:00:00.000Z") }

Note: As mentioned in above note we have to convert days in millisecond because $add operator treat to other arg…

What is difference between UNION and UNION ALL in SQL Server

We use UNION and UNION ALL operator to combine multiple results set into one result set.
UNION operator is used to combining multiple results set into one result set but removes any duplicate rows. Basically, UNION is used to performing a DISTINCT operation across all columns in the result set. UNION operator has the extra overhead of removing duplicate rows and sorting result.
UNION ALL operator use to combine multiple results set into one result set but it does not remove any duplicate result. Actually, this does not remove duplicate rows so it is faster than the UNION operator. If you want to combine multiple results and without duplicate records then use UNION otherwise UNION ALL is better.
Following some rules for using UNION/UNION ALL operator
1.The number of the column should be the same in the query's when you want to combine them. 2.The column should be of the same data type. 3.ORDER BY clause can be applied to the overall result set not within each result set.
4.Column name of …