美文网首页
mongo回顾(九:聚合查询三)

mongo回顾(九:聚合查询三)

作者: supremecsp | 来源:发表于2021-04-15 17:32 被阅读0次

mongo的聚合查询操作符是比较多的,平常也无需全部用法都记住,有个大体概念后,使用时翻阅下官方文档也是一个不错的方法,今天继续介绍mongo比较常用的聚合操作

常见步骤中的运算符

运算符.png
这几个运算符见名知意就不做具体介绍了
接下来分别介绍几个常用聚合操作;
$addFields

$addFields将新字段追加到现有文档中。您可以$addFields在聚合操作中包括一个或多个阶段,我一般用于操作文档中的数组并返回结果,当然在project中用sum也能达到这个效果

{
  _id: 1,
  student: "Maya",
  homework: [ 10, 5, 10 ],
  quiz: [ 10, 8 ],
  extraCredit: 0
}
{
  _id: 2,
  student: "Ryan",
  homework: [ 5, 6, 5 ],
  quiz: [ 8, 8 ],
  extraCredit: 8
}

db.scores.aggregate( [
   {
     $addFields: {
       totalHomework: { $sum: "$homework" } ,
       totalQuiz: { $sum: "$quiz" }
     }
   },
   {
     $addFields: { totalScore:
       { $add: [ "$totalHomework", "$totalQuiz", "$extraCredit" ] } }
   }
] )
//result
{
  "_id" : 1,
  "student" : "Maya",
  "homework" : [ 10, 5, 10 ],
  "quiz" : [ 10, 8 ],
  "extraCredit" : 0,
  "totalHomework" : 25,
  "totalQuiz" : 18,
  "totalScore" : 43
}
{
  "_id" : 2,
  "student" : "Ryan",
  "homework" : [ 5, 6, 5 ],
  "quiz" : [ 8, 8 ],
  "extraCredit" : 8,
  "totalHomework" : 16,
  "totalQuiz" : 16,
  "totalScore" : 40
}

$facet

facet将多个聚合操作的结果结合起来,成为最终的输出文档

$group

同sql的group,对字段分组之后进行处理
group的内存限制为100 MB,超过会报错,或者运行磁盘操作allowDiskUse:true,不建议

{
  $group:
    {
      _id: <expression>, // Group By Expression
      <field1>: { <accumulator1> : <expression1> },//可选
      ...
    }
 }
名称 描述
$addToSet 返回每个组的唯一表达式值的数组。数组元素的顺序未定义。
$avg 返回数值的平均值。忽略非数字值。
$first 从每个组的第一个文档返回一个值。仅当文档按定义的顺序定义顺序。
$last 从每个组的最后一个文档返回一个值。仅当文档按定义的顺序定义顺序。
$max 返回每个组的最高表达式值。
$mergeObjects 返回通过组合每个组的输入文档而创建的文档。
$min 返回每个组的最低表达式值。
$push 返回每个组的表达式值数组。
$stdDevPop 返回输入值的总体标准偏差。
$stdDevSamp 返回输入值的样本标准偏差。
$sum 返回数值的总和。忽略非数字值。

如果id为null的group,不建议使用,mongo不适合大范围聚合

db.sales.aggregate([
  {
    $group : {
       _id : null,
       totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } },
       averageQuantity: { $avg: "$quantity" },
       count: { $sum: 1 }
    }
  }
 ])
//等效于以下SQL语句:
SELECT Sum(price * quantity) AS totalSaleAmount,
       Avg(quantity)         AS averageQuantity,
       Count(*)              AS Count
FROM   sales

group push的使用

db.books.insertMany([
  { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
  { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
  { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 },
  { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
  { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
])
db.books.aggregate([
   // First Stage
   {
     $group : { _id : "$author", books: { $push: "$$ROOT" } }
   },
   // Second Stage
   {
     $addFields:
       {
         totalCopies : { $sum: "$books.copies" }
       }
   }
 ])

根据author分组后把根数据都放进book数组中,再往数组中增加totalCopies字段,结果为

{ "_id" : "Homer",
  "books" :
    [
       { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
       { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
    ]
 },
 { "_id" : "Dante",
   "books" :
     [
       { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
       { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
       { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
     ]
 }

$out

$out如果当前数据库中尚不存在一个新集合,则该操作将创建一个新集合。在聚合完成之前,该集合不可见。如果聚合失败,则MongoDB不会创建集合
out只能用于聚合操作的最后一个管道,4.2版本后可以用功能更完善的merge来替代
{ $out: { db: "<output-db>", coll: "<output-collection>" } }
若输出的集合不存在则新建集合,若存在,更新集合内容,并不会影响到索引

集合books包含以下文档:

{ "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 }
{ "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 }
{ "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
{ "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 }
{ "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }

以下聚合操作将books 集合中的数据按作者分组,然后将结果写入到authors集合中。

db.books.aggregate( [
                      { $group : { _id : "$author", books: { $push: "$title" } } },
                      { $out : "authors" }
                  ] )

操作后,authors集合包含以下文档:

{ "_id" : "Homer", "books" : [ "The Odyssey", "Iliad" ] }
{ "_id" : "Dante", "books" : [ "The Banquet", "Divine Comedy", "Eclogues" ] }

$sortByCount

$sortByCount等效于$group+ $sort
即为分组后排序

{ $group: { _id: <expression>, count: { $sum: 1 } } },
{ $sort: { count: -1 } }
{ "_id" : 1, "title" : "The Pillars of Society", "artist" : "Grosz", "year" : 1926, "tags" : [ "painting", "satire", "Expressionism", "caricature" ] }
{ "_id" : 2, "title" : "Melancholy III", "artist" : "Munch", "year" : 1902, "tags" : [ "woodcut", "Expressionism" ] }
{ "_id" : 3, "title" : "Dancer", "artist" : "Miro", "year" : 1925, "tags" : [ "oil", "Surrealism", "painting" ] }
{ "_id" : 4, "title" : "The Great Wave off Kanagawa", "artist" : "Hokusai", "tags" : [ "woodblock", "ukiyo-e" ] }
{ "_id" : 5, "title" : "The Persistence of Memory", "artist" : "Dali", "year" : 1931, "tags" : [ "Surrealism", "painting", "oil" ] }
{ "_id" : 6, "title" : "Composition VII", "artist" : "Kandinsky", "year" : 1913, "tags" : [ "oil", "painting", "abstract" ] }
{ "_id" : 7, "title" : "The Scream", "artist" : "Munch", "year" : 1893, "tags" : [ "Expressionism", "painting", "oil" ] }
{ "_id" : 8, "title" : "Blue Flower", "artist" : "O'Keefe", "year" : 1918, "tags" : [ "abstract", "painting" ] }

db.exhibits.aggregate( [ { $unwind: "$tags" },  { $sortByCount: "$tags" } ] )

该操作返回以下文档,按计数降序排列:
{ "_id" : "oil", "count" : 4 }
{ "_id" : "Expressionism", "count" : 3 }
{ "_id" : "Surrealism", "count" : 2 }
{ "_id" : "abstract", "count" : 2 }
{ "_id" : "woodblock", "count" : 1 }
{ "_id" : "woodcut", "count" : 1 }
{ "_id" : "ukiyo-e", "count" : 1 }
{ "_id" : "satire", "count" : 1 }
{ "_id" : "caricature", "count" : 1 }

$unwind

用于展开数组

{
  $unwind:
    {
      path: <field path>,
      includeArrayIndex: <string>,
      preserveNullAndEmptyArrays: <boolean>
    }
}

includeArrayIndex 可选值。一个新字段的名称,用于保存元素的数组索引。

preserveNullAndEmptyArrays 可选值。

  • 如果为true,则如果path为null,丢失或为空数组,则$unwind输出文档。
  • 如果为false,如果path为null,缺少或为空数组,$unwind则不会输出文档。
    默认值为false
db.inventory2.insertMany([
  { "_id" : 1, "item" : "ABC", price: NumberDecimal("80"), "sizes": [ "S", "M", "L"] },
  { "_id" : 2, "item" : "EFG", price: NumberDecimal("120"), "sizes" : [ ] },
  { "_id" : 3, "item" : "IJK", price: NumberDecimal("160"), "sizes": "M" },
  { "_id" : 4, "item" : "LMN" , price: NumberDecimal("10") },
  { "_id" : 5, "item" : "XYZ", price: NumberDecimal("5.75"), "sizes" : null }
])

db.inventory2.aggregate( [ { $unwind: "$sizes" } ] )

{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "S" }
{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "M" }
{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "L" }
{ "_id" : 3, "item" : "IJK", "price" : NumberDecimal("160"), "sizes" : "M" }

若是preserveNullAndEmptyArrays 为true时

db.inventory2.aggregate( [
   { $unwind: { path: "$sizes", preserveNullAndEmptyArrays: true } }
] )

{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "S" }
{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "M" }
{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "L" }
{ "_id" : 2, "item" : "EFG", "price" : NumberDecimal("120") }
{ "_id" : 3, "item" : "IJK", "price" : NumberDecimal("160"), "sizes" : "M" }
{ "_id" : 4, "item" : "LMN", "price" : NumberDecimal("10") }
{ "_id" : 5, "item" : "XYZ", "price" : NumberDecimal("5.75"), "sizes" : null }

总结:mongo的聚合操作对文档的操作还是很全面的,也比较灵活,对于处理数据量不大的情况,大胆使用
下面一篇准备聊聊mongo是内存数据库吗?

相关文章

网友评论

      本文标题:mongo回顾(九:聚合查询三)

      本文链接:https://www.haomeiwen.com/subject/ltkrlltx.html