SQL On Big Data

作者: perryn | 来源:发表于2017-03-09 17:24 被阅读41次

    The query engine parses the query, verifies the semantics of the query against the
    catalog store, and generates the logical query plan that is then optimized by the query
    optimizer to generate the optimal physical execution plan, based on CPU, I/O, and
    network costs to execute the query. The final query plan is then used by the storage
    engine to retrieve the underlying data from various data structures, either on disk or in
    memory. It is in the storage engine that processes such as Lock Manager, Index Manager,
    Buffer Manager, etc., get to work to fetch/update the data, as requested by the query.
    Most RDBMS are SMP-based architectures. Traditional database platforms operate
    by reading data off disk, bringing it across an I/O interconnect, and loading it into
    memory for further processing. An SMP-based system consists of multiple processors,
    each with its own memory cache. Memory and I/O subsystems are shared by each of the
    processors. The SMP architecture is limited in its ability to move large amounts of data, as
    required in data warehousing and large-scale data processing workloads.
    The major drawback of this architecture is that it moves data across backplanes and
    I/O channels. This does not scale and perform when large data sets are to be queried,
    and more so when the queries involve complex joins that require multiple phases of
    processing. A huge inefficiency lies in delivering large data off disk across the network
    and into memory for processing by the DataBase Management System (DBMS).

    相关文章

      网友评论

        本文标题:SQL On Big Data

        本文链接:https://www.haomeiwen.com/subject/qaqggttx.html