美文网首页
aggregate技术说明

aggregate技术说明

作者: 陆遥远 | 来源:发表于2020-05-11 14:59 被阅读0次

    大家在开发的时候注意,多表关联、统计汇总才允许用aggregate,它是pipline,效率较高,支持分片。
    但其中用于多表关联的lookup和输出结果的out,只允许运行在主分片上,其中from对应的集合不能分片,且大家注意左连接和内连接的区别,特别注意的坑请看第五点

    一、测试数据准备
    use test
    db.product.insert({"_id":1,"productname":"商品1","price":15})
    db.product.insert({"_id":2,"productname":"商品2","price":36})
    db.product.insert({"_id":3,"productname":"商品3","price":100})
    db.product.insert({"_id":10,"productname":"商品10","price":300})

    db.orders.insert({"_id":1,"pid":1,"ordername":"订单1"})
    db.orders.insert({"_id":2,"pid":2,"ordername":"订单2"})
    db.orders.insert({"_id":3,"pid":2,"ordername":"订单3"})
    db.orders.insert({"_id":4,"pid":1,"ordername":"订单4"})
    db.orders.insert({"_id":5,"pid":3,"ordername":"订单5"})
    db.orders.insert({"_id":6,"pid":4,"ordername":"订单6"})

    二、左连接
    A、语句:以订单为主,原生就是左连接
    db.orders.aggregate([
    {
    $lookup:
    {
    from: "product",
    localField: "pid",
    foreignField: "_id",
    as: "product_doc"
    }
    }
    ])

    B、结果,注意右边不符合的,最后2条为空

    { "_id" : 1, "pid" : 1, "ordername" : "订单1", "product_doc" : [ { "_id" : 1, "productname" : "商品1", "price" : 15 } ] }
    { "_id" : 2, "pid" : 2, "ordername" : "订单2", "product_doc" : [ { "_id" : 2, "productname" : "商品2", "price" : 36 } ] }
    { "_id" : 3, "pid" : 2, "ordername" : "订单3", "product_doc" : [ { "_id" : 2, "productname" : "商品2", "price" : 36 } ] }
    { "_id" : 4, "pid" : 1, "ordername" : "订单4", "product_doc" : [ { "_id" : 1, "productname" : "商品1", "price" : 15 } ] }
    { "_id" : 5, "pid" : 3, "ordername" : "订单5", "product_doc" : [ ] }
    { "_id" : 6, "pid" : 4, "ordername" : "订单6", "product_doc" : [ ] }

    三、等值连接,需要加match条件
    A、语句:
    db.orders.aggregate([
    {
    lookup: { from: "product", localField: "pid", foreignField: "_id", as: "product_doc" } }, {match :
    {
    "product_doc":{ $ne: []}
    }
    }
    ])

    B、结果,最后2条为空
    { "_id" : 1, "pid" : 1, "ordername" : "订单1", "product_doc" : [ { "_id" : 1, "productname" : "商品1", "price" : 15 } ] }
    { "_id" : 2, "pid" : 2, "ordername" : "订单2", "product_doc" : [ { "_id" : 2, "productname" : "商品2", "price" : 36 } ] }
    { "_id" : 3, "pid" : 2, "ordername" : "订单3", "product_doc" : [ { "_id" : 2, "productname" : "商品2", "price" : 36 } ] }
    { "_id" : 4, "pid" : 1, "ordername" : "订单4", "product_doc" : [ { "_id" : 1, "productname" : "商品1", "price" : 15 } ] }

    四、显示字段的过滤:加project
    db.orders.aggregate([
    {
    lookup: { from: "product", localField: "pid", foreignField: "_id", as: "product_doc" } }, {match :
    {
    "product_doc._id":{ $exists: true }
    }
    },
    ,}
    ])

    五、坑的问题解决:

    1、数组拉平的坑和问题解决:
    A、存在的坑,对空或者不存在数组的,会丢弃该行,需要特别注意的坑
    db.product.aggregate([
    {
    lookup: { from: "orders", localField: "_id", foreignField: "pid", as: "inventory_docs" } }, {unwind : "$inventory_docs"
    }
    ])

    B、#解决办法:3.2版本之后可以通过加参数解决,不丢空数组数据
    db.product.aggregate([
    {
    lookup: { from: "orders", localField: "_id", foreignField: "pid", as: "inventory_docs" } }, {unwind : {
    "path": "$inventory_docs",
    "preserveNullAndEmptyArrays": true
    }
    }
    ])

    2、每个pipline的输出不能超过100M(内存),否则报错,在实际开发中请加上磁盘作为交换参数解决:

    db.orders.aggregate([
    {
    lookup: { from: "product", localField: "pid", foreignField: "_id", as: "products" } }, {match :
    {
    "products._id":{ exists: true } } }, {project:
    {
    "pid":1,"ordername":1,"product_doc.productname":1
    }
    },
    {
    allowDiskUse: true
    }
    ])

    3、输出结果单文档超过16M的坑,mongodb的查询数据的一个大坑,通过返回结果给一个游标来规避
    var cur = 查询语句

    4、二个表多字段关联需求目前3.2原生不支持,(但3.6原生支持),变通做法是下面再加对数组过滤来实现,性能会差点

    5、$lookup限制的坑:from对应的colection不能分片,必须2个表在同一个库上。

    六、优化:
    1、普通优化优化原则
    减少初始数据量,命中率

    2、aggregate优化原则,一般顺序上有要求:
    project oraddFields + matchsort + matchredact + matchskip + limitproject + skip orlimit

    3、看执行计划
    db.orders.aggregate([
    {
    lookup: { from: "product", localField: "pid", foreignField: "_id", as: "products" } }, {match :
    {
    "products._id":{ exists: true } } }, {project:
    {
    "pid":1,"ordername":1,"product_doc.productname":1
    }
    },
    {
    allowDiskUse: true
    },
    {
    explain: true
    }
    ])

    aggregate的分页
    db.orders.aggregate([
    {
    {
    match : {limit:50,
    $skip:50,

    }
    },
    lookup: { from: "product", localField: "pid", foreignField: "_id", as: "products" } }, {match :
    {
    "products._id":{ exists: true } } }, {project:
    {
    "pid":1,"ordername":1,"product_doc.productname":1
    }
    },
    {
    allowDiskUse: true
    },
    {
    explain: true
    }
    ])

    相关文章

      网友评论

          本文标题:aggregate技术说明

          本文链接:https://www.haomeiwen.com/subject/skbtgftx.html