Basic Neo4j Operations


Author: Jlan | Published 2021-01-13 18:11

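    I. Installing Neo4j

    1. Install the JDK

    OpenJDK works. Neo4j 4.0 and later require OpenJDK 11; Neo4j 3.5 requires OpenJDK 8.
    If the default package sources do not provide OpenJDK, add the PPA shown below.
    On older Ubuntu releases (such as 16.04), installing OpenJDK 11 can be troublesome, so installing OpenJDK 8 is the easier choice.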

    sudo add-apt-repository -y ppa:openjdk-r/ppa
    sudo apt-get update
    sudo apt-get install openjdk-8-jdk
    

    2. Install Neo4j

    wget -O - https://debian.neo4j.org/neotechnology.gpg.key | sudo apt-key add -
    echo 'deb https://debian.neo4j.org/repo stable/' | sudo tee -a /etc/apt/sources.list.d/neo4j.list
    sudo apt-get update
    sudo apt-get install neo4j
    sudo apt-get install cypher-shell
    

    3. Start or stop the service

    neo4j status
    neo4j start
    neo4j stop
    

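    Running cypher-shell opens the Neo4j interactive shell. The default username and password are both "neo4j".
    Inside the shell, the password can be changed with CALL dbms.changePassword('password');. On Neo4j 4.x, the Cypher statement ALTER CURRENT USER SET PASSWORD FROM 'old' TO 'new'; serves the same purpose.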

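    4. Enable remote browser access

    By default Neo4j only accepts connections from localhost. To allow remote access, edit /etc/neo4j/neo4j.conf and uncomment the following line by removing the leading # (in Neo4j 4.x the setting is named dbms.default_listen_address):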

    #dbms.connectors.default_listen_address=0.0.0.0
    

    II. Using py2neo

    Nodes and relationships

    In [1]: from py2neo import Graph, Node, Relationship
    
    In [2]: a = Node("Person", name="Alice")
    In [3]: b = Node("Person", name="Bob")
    In [4]: ab = Relationship(a, "KNOWS", b)
    
    In [5]: print(type(a))
    <class 'py2neo.data.Node'>
    In [6]: print(a)
    (:Person {name: 'Alice'})
    
    In [7]: print(type(ab))
    <class 'py2neo.data.KNOWS'>
    In [8]: print(ab)
    (Alice)-[:KNOWS {}]->(Bob)
    

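    This creates two Nodes and a Relationship between them. Both Node and Relationship inherit from PropertyDict, so properties can be read and assigned with dictionary-style syntax, for example a['age'] = 20.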

    Subgraph
    A Subgraph is a collection of Nodes and Relationships. The simplest way to build one is with the set operators, as follows:

    # Create a subgraph
    In [10]: s = a | b | ab
    
    In [11]: print(type(s))
    <class 'py2neo.data.Subgraph'>
    In [12]: print(s)
    Subgraph({Node('Person', name='Alice'), Node('Person', name='Bob')}, {KNOWS(Node('Person', name='Alice'), Node('Person', name='Bob'))})
    
    # The nodes and relationships attributes return all Nodes and Relationships in the subgraph
    In [20]: type(s.nodes)
    Out[20]: py2neo.collections.SetView
    In [18]: list(s.nodes)
    Out[18]: [Node('Person', name='Alice'), Node('Person', name='Bob')]
    In [19]: list(s.relationships)
    Out[19]: [KNOWS(Node('Person', name='Alice'), Node('Person', name='Bob'))]
    
    # Intersection of two subgraphs
    In [21]: s2 = a | b
    
    In [22]: s&s2
    Out[22]: Subgraph({Node('Person', name='Alice'), Node('Person', name='Bob')}, {})
    

    Walkable
    A Walkable is a Subgraph with added traversal information. One can be built by joining relationships with the + operator, for example:

    In [34]: a = Node("Person", name="Alice")
    In [35]: b = Node("Person", name="Bob")
    In [36]: c = Node("Person", name="Jack")
    In [37]: d = Node("Dog", name="Pupy")
    In [38]: ab = Relationship(a, "KNOWS", b)
    In [39]: bc = Relationship(b, "LIKES", c)
    In [40]: cd = Relationship(c, "HAS", d)
    # Create a Walkable object
    In [41]: w = ab+bc+cd
    
    In [42]: print(type(w))
    <class 'py2neo.data.Path'>
    In [43]: print(w)
    (Alice)-[:KNOWS {}]->(Bob)-[:LIKES {}]->(Jack)-[:HAS {}]->(Pupy)
    
    In [44]: from py2neo import walk
    
    # Use walk() to traverse from the start node to the end node
    In [45]: for item in walk(w):
        ...:     print(item)
    (:Person {name: 'Alice'})
    (Alice)-[:KNOWS {}]->(Bob)
    (:Person {name: 'Bob'})
    (Bob)-[:LIKES {}]->(Jack)
    (:Person {name: 'Jack'})
    (Jack)-[:HAS {}]->(Pupy)
    (:Dog {name: 'Pupy'})
    
    # The start_node, end_node, nodes and relationships attributes return the start Node, the end Node, all Nodes and all Relationships
    In [47]: w.start_node
    Out[47]: Node('Person', name='Alice')
    In [48]: w.end_node
    Out[48]: Node('Dog', name='Pupy')
    In [49]: w.nodes
    Out[49]:
    (Node('Person', name='Alice'),
     Node('Person', name='Bob'),
     Node('Person', name='Jack'),
     Node('Dog', name='Pupy'))
    In [50]: w.relationships
    Out[50]:
    (KNOWS(Node('Person', name='Alice'), Node('Person', name='Bob')),
     LIKES(Node('Person', name='Bob'), Node('Person', name='Jack')),
     HAS(Node('Person', name='Jack'), Node('Dog', name='Pupy')))
    

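    Graph

    1. Initialization
      Graph is the most important API for interacting with Neo4j data and provides many methods for operating on the database. A Graph is initialized with a connection URI or with individual connection settings; the parameters include bolt, secure, host, http_port, https_port, bolt_port, user and password. For details see http://py2neo.org/v3/database.html#py2neo.database.Graph. An example of initialization: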
    g = Graph(host='localhost', auth=('neo4j', 'passwd'))
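
To keep the password out of source code, the connection settings can be read from the environment before they are passed to Graph. The following is only a sketch: the variable names NEO4J_HOST, NEO4J_USER and NEO4J_PASSWORD are assumptions made for this example (neither py2neo nor Neo4j reads them), and the helper simply produces the same host= and auth= keywords used above.

```python
import os


def graph_settings(env=None):
    """Build keyword arguments for py2neo.Graph from environment variables.

    NEO4J_HOST, NEO4J_USER and NEO4J_PASSWORD are hypothetical names chosen
    for this sketch; py2neo and Neo4j do not read them on their own.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("NEO4J_HOST", "localhost"),
        "auth": (env.get("NEO4J_USER", "neo4j"), env.get("NEO4J_PASSWORD", "")),
    }


def connect(env=None):
    """Open a connection with the same host= and auth= keywords as above."""
    # Imported lazily so that graph_settings() can be used without py2neo installed.
    from py2neo import Graph
    return Graph(**graph_settings(env))
```

Calling connect() with no arguments reads the real environment; passing a plain dict makes the helper easy to check without a database.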
    
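    2. Creating data
      A whole subgraph can be created in one call, or individual nodes and relationships can be created one at a time.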
    In [34]: a = Node("Person", name="Alice")
    In [35]: b = Node("Person", name="Bob")
    In [36]: c = Node("Person", name="Jack")
    In [37]: d = Node("Dog", name="Pupy")
    In [38]: ab = Relationship(a, "KNOWS", b)
    In [39]: bc = Relationship(b, "LIKES", c)
    In [40]: cd = Relationship(c, "HAS", d)
    In [41]: ss = a|b|c|d|ab|bc|cd
    In [42]: g.create(ss)
    

    This creates the four nodes and the three relationships between them in the database.



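    Add one more relationship (its type is spelled 'KONWS' here, so it is stored as a type distinct from 'KNOWS' and appears that way in the query output later on):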

    r = Relationship(a, 'KONWS', c)
    g.create(r)
    

    Alice and Jack are now connected by the new relationship.


    3. Finding nodes
      Use NodeMatcher to find nodes.
    In [40]: from py2neo import NodeMatcher, RelationshipMatcher
    
    In [41]: nm = NodeMatcher(g)
    
    In [43]: res = nm.match('Person')
    In [44]: list(res)
    Out[44]:
    [Node('Person', name='Bob'),
     Node('Person', name='Alice'),
     Node('Person', name='Jack')]
    
    # Return the first match
    In [58]: res = nm.match('Person').first()
    In [59]: res
    Out[59]: Node('Person', name='Bob')
    
    In [49]: res = nm.match('Dog', name='Pupy')
    In [50]: list(res)
    Out[50]: [Node('Dog', name='Pupy')]
    
    # Query with a regular expression
    In [56]: res = nm.match('Person').where('_.name=~"A.*"')
    In [57]: list(res)
    Out[57]: [Node('Person', name='Alice')]
    

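    first() returns the first matching node, or None if nothing matches
    limit(amount) returns at most amount nodes
    skip(amount) skips the first amount nodes
    order_by(*fields) sorts the results by the given fields
    where(*conditions, **properties) filters the results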
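
The ordering and paging methods listed above combine in a fixed way: results are sorted first, then the first skip rows are dropped, then at most limit rows are kept. The sketch below models that behaviour on a plain Python list so it can be run without a database; page() is a hypothetical helper written for this illustration, not part of py2neo. page_people() shows the corresponding py2neo chain and assumes a running Neo4j instance containing Person nodes.

```python
from itertools import islice


def page(rows, key, skip=0, limit=None):
    """Plain-Python model of order_by -> skip -> limit on a list of dicts."""
    ordered = sorted(rows, key=lambda row: row[key])  # order_by
    stop = None if limit is None else skip + limit
    return list(islice(ordered, skip, stop))          # skip, then limit


def page_people(graph, skip=0, limit=10):
    """The equivalent py2neo query; requires a live Neo4j connection."""
    # Imported lazily so that page() can be run without py2neo installed.
    from py2neo import NodeMatcher
    matcher = NodeMatcher(graph)
    return list(matcher.match("Person").order_by("_.name").skip(skip).limit(limit))


people = [{"name": "Bob"}, {"name": "Alice"}, {"name": "Jack"}]
print(page(people, "name", skip=1, limit=1))  # [{'name': 'Bob'}]
```

Without order_by the order of results is not guaranteed, so skip and limit are most useful together with an explicit sort.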

    4. Finding relationships
      Relationships can be found with g.match or with RelationshipMatcher; the latter is more powerful.
    In [40]: from py2neo import NodeMatcher, RelationshipMatcher
    
    In [42]: rm = RelationshipMatcher(g)
    
    In [96]: list(g.match())
    Out[96]:
    [LIKES(Node('Person', name='Bob'), Node('Person', name='Jack')),
     KONWS(Node('Person', name='Alice'), Node('Person', name='Jack')),
     KNOWS(Node('Person', name='Alice'), Node('Person', name='Bob')),
     HAS(Node('Person', name='Jack'), Node('Dog', name='Pupy'))]
    
    In [63]: res = g.match(r_type='LIKES')
    In [64]: list(res)
    Out[64]: [LIKES(Node('Person', name='Bob'), Node('Person', name='Jack'))]
    
    # Find a given relationship type starting from a given node, e.g. the complications (并发症) of leukemia (白血病)
    In [293]: a = nm.match('疾病', name='白血病').first()
    In [294]: a
    Out[294]: Node('疾病', name='白血病')
    In [295]: list(g.match(r_type='并发症', nodes=[a]))
    Out[295]:
    [并发症(Node('疾病', name='白血病'), Node('疾病', name='白血病性中枢神经感染')),
     并发症(Node('疾病', name='白血病'), Node('疾病', name='白血病脑出血')),
     并发症(Node('疾病', name='白血病'), Node('疾病', name='肠功能衰竭')),
     并发症(Node('疾病', name='白血病'), Node('疾病', name='卡氏肺囊虫感染'))]
    
    In [66]: res2 = rm.match(r_type='LIKES')
    In [67]: list(res2)
    Out[67]: [LIKES(Node('Person', name='Bob'), Node('Person', name='Jack'))]
    
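    5. Batch insertion
      When inserting in bulk, take care not to insert many copies of the same node. Nodes built by separate Node(...) calls are distinct objects even when their labels and properties are identical, because nodes that have not yet been saved to the database are compared by identity. For example:
    In [258]: a1 = Node('Person', name='小明')
    In [259]: a2 = Node('Person', name='小明')

    In [260]: a1==a2
    Out[260]: False
    In [261]: id(a1)
    Out[261]: 139971127871536
    In [262]: id(a2)
    Out[262]: 139971551445936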
    

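    So when inserting in bulk, especially from tabular data, avoid constructing several nodes with the same label and value. Before building a node with Node, use NodeMatcher to check whether a node with the same label and value already exists. Below is a concrete batch-insertion example: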

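    from py2neo import Graph, Node, NodeMatcher, Relationship

    g = Graph(host='localhost', auth=('neo4j', 'password'))
    nm = NodeMatcher(g)

    # data is assumed to be loaded elsewhere: an iterable of records,
    # each holding a list of triples under the key 'spo_list'
    for i in data:
        spos = i['spo_list']
        for spo in spos:
            # Relies on the key order of each triple dict
            p, sub, obj, sub_type, obj_type = spo.values()
            sub_existed = nm.match(sub_type, name=sub).first()  # check whether a node with this label and name already exists
            obj_existed = nm.match(obj_type, name=obj).first()
            if sub_existed and obj_existed:  # assumes at most one relationship between two nodes, so skip when both already exist
                continue
            elif sub_existed:
                obj_node = Node(obj_type, name=obj)  # only sub exists, so build a new obj node
                rel = Relationship(sub_existed, p, obj_node)
            elif obj_existed:
                sub_node = Node(sub_type, name=sub)  # only obj exists, so build a new sub node
                rel = Relationship(sub_node, p, obj_existed)
            else:
                sub_node = Node(sub_type, name=sub)  # neither exists, so build both
                obj_node = Node(obj_type, name=obj)
                rel = Relationship(sub_node, p, obj_node)
            g.create(rel)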
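
The loop above issues two lookups and one create per triple. An alternative, sketched here under stated assumptions, is to deduplicate the triples in memory first and then build each node exactly once. plan_inserts() and load() are hypothetical helpers written for this illustration; the triples are assumed to be plain tuples of (sub_type, sub, predicate, obj_type, obj), and the sketch only removes duplicates within the batch. It does not check what is already stored in the database, so on a non-empty graph the NodeMatcher lookup shown above is still needed.

```python
def plan_inserts(triples):
    """Deduplicate (sub_type, sub, predicate, obj_type, obj) triples.

    Returns the unique node keys and the unique relationships,
    both in first-seen order.
    """
    nodes, rels = {}, {}
    for sub_type, sub, pred, obj_type, obj in triples:
        head, tail = (sub_type, sub), (obj_type, obj)
        nodes.setdefault(head, None)
        nodes.setdefault(tail, None)
        rels.setdefault((head, pred, tail), None)
    return list(nodes), list(rels)


def load(graph, triples):
    """Create every unique node once, then all relationships, in one call."""
    # Imported lazily so that plan_inserts() can be run without py2neo installed.
    from py2neo import Node, Relationship, Subgraph
    node_keys, rel_keys = plan_inserts(triples)
    if not node_keys:
        return  # nothing to insert
    # One Node object per (label, name) key, shared by all its relationships
    built = {key: Node(key[0], name=key[1]) for key in node_keys}
    edges = [Relationship(built[h], p, built[t]) for h, p, t in rel_keys]
    graph.create(Subgraph(list(built.values()), edges))
```

Because every relationship refers to the same Node object for a given label and name, the identity problem shown earlier does not arise within the batch.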
    

