1. Preface
This article introduces commonly used MongoDB tools: the mongo shell, MongoDB import and export, MongoDB in Docker, and MongoDB indexes.
2. mongo-tools
2.1. Installation
apt install mongo-tools
2.2. mongo shell
For common usage, see the earlier articles in this series.
mongo mongodb://test6:123456@192.168.3.33:37017
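When the connection string is assembled in a script, the username and password should be percent-encoded, because characters such as `@` or `:` would otherwise be read as URI delimiters. The following is a minimal sketch; the helper name `mongo_uri` is hypothetical, and the host and credentials are the ones used in this article.

```python
from urllib.parse import quote_plus


def mongo_uri(user: str, password: str, host: str, port: int, db: str = "") -> str:
    """Build a mongodb:// URI, percent-encoding the credentials."""
    # Reserved characters such as '@', ':' or '/' in the credentials would
    # otherwise be parsed as URI delimiters.
    return f"mongodb://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{db}"


if __name__ == "__main__":
    print(mongo_uri("test6", "123456", "192.168.3.33", 37017))
    # A hypothetical password containing reserved characters:
    print(mongo_uri("test6", "p@ss:word", "192.168.3.33", 37017))
```

The resulting string can be passed to the mongo shell or to the `--uri` option of the tools below.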
2.3. mongodump && mongorestore
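Recommended: mongodump writes binary BSON, so the dump is compact and export and import are faster. It can also export only a subset of documents.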
mongodump -d test_dataset -o .
mongorestore --uri mongodb://test:123456@192.168.3.33:37017 --drop /mnt
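Exporting only a subset of documents is done with mongodump's `--query` option, which is only accepted together with `--collection`. The sketch below just builds the command line; the helper name `build_dump_cmd` and the example filter on an `enabled` field are assumptions made for illustration, not part of the original setup.

```python
import json
import shlex


def build_dump_cmd(db: str, collection: str, query: dict, out_dir: str) -> list:
    """Return the argv for a mongodump run limited to documents matching `query`."""
    return [
        "mongodump",
        "--db", db,
        "--collection", collection,    # --query requires --collection
        "--query", json.dumps(query),  # the filter is passed as a JSON string
        "--out", out_dir,
    ]


if __name__ == "__main__":
    # Hypothetical filter: only documents whose `enabled` field is true.
    argv = build_dump_cmd("test_dataset", "config", {"enabled": True}, ".")
    print(shlex.join(argv))  # shell-quoted form, ready to paste into a terminal
    # To actually run it: subprocess.run(argv, check=True)
```

Passing the arguments as a list avoids having to quote the JSON filter for the shell by hand.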
2.4. mongoexport && mongoimport
Exports plain-text JSON. Use this method when the data needs to be imported into another kind of data store, such as MySQL.
mongoexport \
--host=192.168.3.33 \
--port=37017 \
--db=test_dataset \
--username=test \
--password=123456 \
--authenticationDatabase=admin \
--collection=config \
--query "$QUERY" \
--out=config.json
mongoimport -h 192.168.3.33:37017 -u test -p 123456 \
--authenticationDatabase=admin \
--db test_dataset \
-c config config.json
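By default mongoexport writes one JSON document per line, which most relational databases cannot load directly. As a rough sketch of the conversion step, the code below turns such lines into CSV text that a tool like MySQL's CSV loader could read. The function name and the sample field names (`key`, `value`) are hypothetical; nested values are simply kept as JSON strings.

```python
import csv
import io
import json


def json_lines_to_csv(lines, fields):
    """Convert mongoexport-style output (one JSON document per line) to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(fields)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        doc = json.loads(line)
        row = []
        for name in fields:
            value = doc.get(name, "")  # missing fields become empty cells
            # Nested values (e.g. {"$oid": ...}) are kept as compact JSON strings.
            if isinstance(value, (dict, list)):
                value = json.dumps(value, separators=(",", ":"))
            row.append(value)
        writer.writerow(row)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical sample rows; the field names are made up for illustration.
    sample = [
        '{"_id": {"$oid": "5d6f0c7e8b3e4a0001a1b2c3"}, "key": "timeout", "value": 30}',
        '{"_id": {"$oid": "5d6f0c7e8b3e4a0001a1b2c4"}, "key": "retries"}',
    ]
    print(json_lines_to_csv(sample, ["key", "value"]))
```

For flat documents, mongoexport can also write CSV itself via `--type=csv` together with `--fields`.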
2.5. mongo in docker
Starting the server and client with Docker is quick and convenient, and is especially handy for testing.
docker run -d --name mongo-test \
-e MONGO_INITDB_ROOT_USERNAME=test \
-e MONGO_INITDB_ROOT_PASSWORD=123456 \
-v /u/data/mongo.0224:/data/db \
-p 37017:27017 \
mongo:4.2.0
docker run -it --rm --name mongo-cli \
mongo:4.2.0 \
bash
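In a test script, the container started above may need a moment before it accepts connections. One possible approach, sketched here with only the standard library, is to poll the mapped port (37017 in this article) until a TCP connection succeeds. The helper name `wait_for_port` is hypothetical, and a successful connection only shows that the port is open, not that mongod has finished initializing users.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # not accepting connections yet; retry shortly
    return False


# Example call for the container above (not executed here):
#   wait_for_port("127.0.0.1", 37017)
```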
3. mongo index
Official documentation: https://docs.mongodb.com/manual/indexes/
3.1. index types
- Single Field
db.collection.createIndex( { name: -1 } )
- Compound Index
db.products.createIndex(
{ item: 1, quantity: -1 } ,
{ name: "query for inventory" }
)
- Multikey Index
MongoDB uses multikey indexes to index the content stored in arrays.
If you index a field that holds an array value, MongoDB creates separate index entries for every element of the array. These multikey indexes allow queries to select documents that contain arrays by matching on element or elements of the arrays.
MongoDB automatically determines whether to create a multikey index if the indexed field contains an array value; you do not need to explicitly specify the multikey type.
- Geospatial Index
To support efficient queries of geospatial coordinate data, MongoDB provides two special indexes: 2d indexes that use planar geometry when returning results and 2dsphere indexes that use spherical geometry to return results.
- Text Indexes
MongoDB provides a text index type that supports searching for string content in a collection. These text indexes do not store language-specific stop words (e.g. “the”, “a”, “or”) and stem the words in a collection to only store root words.
- Hashed Indexes
Hashed indexes index the hash of a field's value, so they only support equality matches and cannot support range-based queries.
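The multikey behaviour described earlier in this list (one index entry per array element) can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not of how MongoDB stores its indexes internally; the function name and the sample documents are hypothetical.

```python
from collections import defaultdict


def build_multikey_index(docs, field):
    """Map each indexed value to the ids of the documents that contain it."""
    index = defaultdict(list)
    for doc in docs:
        value = doc.get(field)
        # An array value yields one index entry per distinct element;
        # a scalar value yields a single entry.
        elements = value if isinstance(value, list) else [value]
        for element in dict.fromkeys(elements):  # de-duplicate, keep order
            index[element].append(doc["_id"])
    return dict(index)


if __name__ == "__main__":
    # Hypothetical documents, used only to illustrate the idea.
    docs = [
        {"_id": 1, "tags": ["db", "nosql"]},
        {"_id": 2, "tags": ["db"]},
        {"_id": 3, "tags": "cache"},
    ]
    index = build_multikey_index(docs, "tags")
    print(index["db"])  # ids of the documents whose tags contain "db"
```

A lookup on a single element then finds every document whose array contains that element, which is what lets a query match on array contents.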
3.2. Index Properties
- Unique Indexes
The unique property for an index causes MongoDB to reject duplicate values for the indexed field.
Other than the unique constraint, unique indexes are functionally interchangeable with other MongoDB indexes.
db.members.createIndex( { "user_id": 1 }, { unique: true } )
- Partial Indexes
Partial indexes only index the documents in a collection that meet a specified filter expression.
By indexing a subset of the documents in a collection, partial indexes have lower storage requirements and reduced performance costs for index creation and maintenance.
Partial indexes offer a superset of the functionality of sparse indexes and should be preferred over sparse indexes.
db.restaurants.createIndex(
{ cuisine: 1, name: 1 },
{ partialFilterExpression: { rating: { $gt: 5 } } }
)
- Sparse Indexes
The sparse property of an index ensures that the index only contains entries for documents that have the indexed field. The index skips documents that do not have the indexed field.
You can combine the sparse index option with the unique index option to prevent inserting documents that have duplicate values for the indexed field(s) and skip indexing documents that lack the indexed field(s).
db.addresses.createIndex( { "xmpp_id": 1 }, { sparse: true } )
- TTL Indexes
TTL indexes are special indexes that MongoDB can use to automatically remove documents from a collection after a certain amount of time.
This is ideal for certain types of information like machine generated event data, logs, and session information that only need to persist in a database for a finite amount of time.
db.eventlog.createIndex( { "lastModifiedDate": 1 }, { expireAfterSeconds: 3600 } )
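The shell commands in this section map closely onto pymongo, where index properties are passed as keyword arguments to `create_index`. The sketch below assumes `db` is a pymongo `Database` object (or anything with the same interface); the function name `ensure_indexes` is hypothetical, and the collection names are the ones from the shell examples above.

```python
def ensure_indexes(db):
    """Create the four indexes from this section on a pymongo-style database object."""
    created = []
    # Unique index: rejects duplicate user_id values.
    created.append(db.members.create_index([("user_id", 1)], unique=True))
    # Partial index: only documents with rating > 5 are indexed.
    created.append(db.restaurants.create_index(
        [("cuisine", 1), ("name", 1)],
        partialFilterExpression={"rating": {"$gt": 5}},
    ))
    # Sparse index: documents without xmpp_id are skipped.
    created.append(db.addresses.create_index([("xmpp_id", 1)], sparse=True))
    # TTL index: documents expire 3600 seconds after lastModifiedDate.
    created.append(db.eventlog.create_index(
        [("lastModifiedDate", 1)], expireAfterSeconds=3600,
    ))
    return created  # create_index returns the name of each index


# Example call against a real server (requires pymongo; not executed here):
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://test:123456@192.168.3.33:37017")["test_dataset"]
#   ensure_indexes(db)
```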
3.3. Python example code
Create the indexes on the collection when connecting to MongoDB; an index that already exists is skipped.
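from datetime import datetime
from typing import List

# DataContract, data_member, PluTrxInfo and db come from the author's project and are not shown here.
class StatisticsPluModel(DataContract):
    storeid: str = data_member()
    day: datetime = data_member()
    active: List[PluTrxInfo] = data_member()
    # Note: datetime.utcnow() is evaluated once, when the class is defined, so every instance shares this default.
    created_at: datetime = data_member(default=datetime.utcnow())
    updated_at: datetime = data_member(default=datetime.utcnow())

    def __init__(self):
        # create_index does nothing if an identical index already exists.
        db.statistics_plu.create_index([('storeid', 1)], background=True)
        db.statistics_plu.create_index([('created_at', -1)], background=True)
        db.statistics_plu.create_index([('day', -1)], background=True)
        # Unique compound index: at most one record per store per day.
        db.statistics_plu.create_index([('storeid', 1), ('day', -1)], background=True, unique=True)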
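In the class above, `default=datetime.utcnow()` is evaluated once, when the class body runs, so every instance would receive the same timestamp. Whether `data_member` accepts a factory is not shown in the article, so the sketch below swaps in the standard library's `dataclasses` to illustrate the per-instance alternative; `StatisticsRecord` and `utc_now` are hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Return the current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


@dataclass
class StatisticsRecord:  # hypothetical stand-in for StatisticsPluModel
    storeid: str
    # default_factory is called for every new instance, so each record
    # gets its own timestamp instead of one shared import-time value.
    created_at: datetime = field(default_factory=utc_now)
    updated_at: datetime = field(default_factory=utc_now)


if __name__ == "__main__":
    record = StatisticsRecord("store-1")
    print(record.created_at.isoformat())
```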
4. Scripts
Batch-import JSON data into MongoDB
from functools import partial
from subprocess import check_call as _call

# Run every command through the shell; check_call raises if mongoimport fails.
call = partial(_call, shell=True)

# The trailing backslashes join the lines into a single shell command.
cmd = '''
mongoimport -h 192.168.3.33:37017 -u test -p 123456 \
--authenticationDatabase=admin \
--db test_dataset \
-c {collection} {json_file}
'''

files = ('apilog.json', 'config.json', 'errorlog.json', 'gallery.json', 'gallerybuffer.json', 'image.json', 'transaction.json')
for file in files:
    # The collection name is the file name without its extension.
    call(cmd.format(collection=file.split('.')[0], json_file=file))
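A variant of the script above that avoids `shell=True` is to pass mongoimport its arguments as a list, so file names never go through shell parsing. This is only a sketch under the same connection settings as the script; the helper name `build_import_cmd` is hypothetical. Note that `Path(...).stem` strips only the last extension, which differs from `split('.')[0]` for names containing more than one dot.

```python
import shlex
from pathlib import Path


def build_import_cmd(json_file: str) -> list:
    """Return the mongoimport argv for one file; the collection is the file stem."""
    return [
        "mongoimport",
        "-h", "192.168.3.33:37017",
        "-u", "test",
        "-p", "123456",
        "--authenticationDatabase=admin",
        "--db", "test_dataset",
        "-c", Path(json_file).stem,
        json_file,
    ]


if __name__ == "__main__":
    for name in ("apilog.json", "config.json"):
        print(shlex.join(build_import_cmd(name)))
        # To actually run it: subprocess.run(build_import_cmd(name), check=True)
```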