背景
电信卡返利系统中,核心功能就是按照用户的维度,统计用户已经销售的卡的充值金额和返利金额,并区分出返利是否已经支付,最后按照未支付的返利金额为客户支付返利。之前是希望直接系统自动统计,支付完成后,在网页上选中,修改支付状态,生成返利支付订单。但是由于查询速度较慢,因此暂时改为了我在数据库中手动查询,然后导出查询记录,邮件发送给财务人员。然而,当前在号码充值订单增加到了 近 20k 条记录后,统计语句的效率急剧下降,统计一次的时间竟然达到了 1700s。完整的统计语句如下。
// 根据用户维度统计充值返利
db.users.aggregate([
{$unwind: "$acceptOrder"},
{$project: {_id: "$_id", company: "$company", acceptOrder: "$acceptOrder"}},
{$lookup: {
from: "orders",
localField: "acceptOrder",
foreignField: "_id",
as: "orders"
}},
{$unwind: "$orders"},
{$match: {"orders.freePackage": true}},
{$project: {_id: "$_id", company: "$company", haveNo: "$orders.haveNo"}},
{$unwind: "$haveNo"},
{$lookup: {
from: "rechargeorders",
localField: "haveNo",
foreignField: "_simcardId",
as: "rechargeOrders"
}},
{$unwind: "$rechargeOrders"},
{$match: { "rechargeOrders.paid": true, "rechargeOrders.createdAt": {$lt: ISODate("2016-12-15T00:00:00.000+08:00") }}},
{$project: {_id: "$_id", company: "$company", fee: "$rechargeOrders.fee", rebateFee: "$rechargeOrders.rebateFee", rebatePaid: "$rechargeOrders.rebatePaid"}},
{$group: {
_id: {_id: "$_id", company: "$company"}, fee: {$sum: "$fee"}, rebateFeeTotal: {$sum: "$rebateFee"}, count: {$sum: 1},
rebateFeePaid: {$sum: {$cond: ["$rebatePaid", "$rebateFee", 0]}},
rebateFeeUnpaid: {$sum: {$cond: ["$rebatePaid", 0, "$rebateFee"]}},
}},
// 瓶颈应该在上面这句,之前的语句大概耗时 1s 左右
{$project: {_id: "$_id._id", company: "$_id.company", fee: {$divide: ["$fee", 100]}, rebateFeeTotal: {$divide: ["$rebateFeeTotal", 100]}, count: "$count", rebateFeePaid: {$divide: ["$rebateFeePaid", 100]}, rebateFeeUnpaid: {$divide: ["$rebateFeeUnpaid", 100]}}}
])
// 根据用户 id 查询未支付的返利订单
db.users.aggregate([
{$match: {_id: {$in: [
ObjectId("58622985b38bdf35d45adc9b"),
...
]}}},
{$unwind: "$acceptOrder"},
{$project: {_id: "$_id", company: "$company", acceptOrder: "$acceptOrder"}},
{$lookup: {
from: "orders",
localField: "acceptOrder",
foreignField: "_id",
as: "orders"
}},
{$unwind: "$orders"},
{$match: {"orders.freePackage": true}},
{$project: {_id: "$_id", company: "$company", haveNo: "$orders.haveNo"}},
{$unwind: "$haveNo"},
{$lookup: {
from: "rechargeorders",
localField: "haveNo",
foreignField: "_simcardId",
as: "rechargeOrders"
}},
{$unwind: "$rechargeOrders"},
{$match: { "rechargeOrders.paid": true, "rechargeOrders.rebatePaid": false, "rechargeOrders.createdAt": {$lt: ISODate("2017-01-11T00:00:00.000+08:00") }}},
// 查询条件设置为 false 的情况下会比 设置为 true 的情况下慢 17 倍的时间左右,当前情况下是 17 s
{$project: {_id: "$rechargeOrders._id", rebateFee: "$rechargeOrders.rebateFee"}},
])
相关的集合定义如下:
users: // 用户,当前记录为 150 条
{
_id: ObjectId;
company: String; // 公司名称
acceptOrder: [ObjectId]; // 订单 id
}
{ "_id": ObjectId("users_1"), "company": "第一通讯", "acceptOrder": [ ObjectId("orders_1"), ObjectId("orders_3")] },
{ "_id": ObjectId("users_2"), "company": "第二通讯", "acceptOrder": [ ObjectId("orders_2")] }
orders: // 号码订单,当前记录为 600 条
{
_id: ObjectId;
haveNo: [ObjectId]; // 号码 id
}
{ "_id": ObjectId("orders_1"), "haveNo": [ObjectId("simcards_1"), ObjectId("simcards_2")] },
{ "_id": ObjectId("orders_1"), "haveNo": [ObjectId("simcards_3"), ObjectId("simcards_4")] },
{ "_id": ObjectId("orders_1"), "haveNo": [ObjectId("simcards_5"), ObjectId("simcards_6")] },
rechargeOrders: // 充值订单,当前记录为 20k 条
{
_id: ObjectId;
_simcardId: ObjectId; // 号码 id
createdAt: Date; // 充值订单创建时间
paid: Boolean; // 话费是否已经支付
rebatePaid: Boolean; // 返利是否已经支付
fee: Number; // 充值金额
rebateFee: Number; // 返利金额
}
{ "_id": ObjectId("rechargeOrders_1"), "_simcardId": ObjectId("simcards_1"), createdAt: ISODate("2016-10-28T17:00:33.152+08:00"), "paid": true, rebatePaid: "true", "fee": NumberInt("10000"), rebateFee: NumberInt("1500") },
{ "_id": ObjectId("rechargeOrders_2"), "_simcardId": ObjectId("simcards_3"), createdAt: ISODate("2016-10-29T17:00:33.152+08:00"), "paid": true, rebatePaid: "false", "fee": NumberInt("1500"), rebateFee: NumberInt("225") },
{ "_id": ObjectId("rechargeOrders_3"), "_simcardId": ObjectId("simcards_1"), createdAt: ISODate("2016-10-30T17:00:33.152+08:00"), "paid": true, rebatePaid: "false", "fee": NumberInt("600"), rebateFee: NumberInt("90") },
{ "_id": ObjectId("rechargeOrders_4"), "_simcardId": ObjectId("simcards_2"), createdAt: ISODate("2016-11-24T16:00:33.152+08:00"), "paid": true, rebatePaid: "false", "fee": NumberInt("4000"), rebateFee: NumberInt("600") },
{ "_id": ObjectId("rechargeOrders_5"), "_simcardId": ObjectId("simcards_6"), createdAt: ISODate("2016-11-25T17:00:33.152+08:00"), "paid": true, rebatePaid: "false", "fee": NumberInt("10000"), rebateFee: NumberInt("1500") },
目标
- 暂时可以通过任何手段在最短时间内解决查询慢的问题;
- 结合 MongoDB 本身的特性,找到最合理的解决方案;
解决思路
- 直接优化聚合语句或数据库性能;
- MongoDB 聚合语句和程序代码结合;
- 修改数据库结构;
直接优化聚合语句或数据库性能
最直接的解决办法。但是需要 MongoDB 本身支持。至于MongoDB 本身是否支持,由于我对 MongoDB 掌握地远远不够,现在还无法得出结论。
优化方向:
- 加索引;
- 调整语句;
- 在 Pipeline 框架内调整;
- 修改为 Map-Reduce;
MongoDB 聚合语句和程序代码结合
部分功能用程序代码实现。
修改数据库结构
根据 MongoDB 本身的特性,修改原数据库结构,以达到空间和时间的平衡。
问题解决途径
- 自学 MongoDB,更深入理解 MongoDB 的机制;
- 官方文档;
- 畅销书籍;
- 社区求助:
- CNode
- CSDN
- stackoverflow
- 熟人求助;
- MongoDB 官方咨询服务;
网友评论