美文网首页
Data Flow Transformations(2):sor

Data Flow Transformations(2):sor

作者: 鲸鱼酱375 | 来源:发表于2019-10-02 09:56 被阅读0次

    Sort

    • Sorts input data in ascending or descending order.

    • Multiple sorts can be applied; each sort is identified by a numeral that determines the sort order.

    • Positive number denotes ascending order, and a negative number denotes descending order.

    • Column with the lowest number is sorted first, second lowest number is sorted next, and so on.

    • It can also remove duplicate rows as part of its sort.

    • It has one input and one output. It does not support error outputs.

    • full blocking


      image.png

    how to sort the output
    change the source output, false to true


    image.png
    image.png

    Aggregate

    • Used to perform following operations on input columns:
      ▶Sum, Average, Count, Count Distinct, Minimum, Maximum
      ▶Group By
      ▶The comparison options of the aggregation

    • Handles null values in the same way as the SSMS does.

    • IsBig property can be set for handling big or high-precision numbers.

    • Performance Can be Improved by:
      ▶Set Keys or KeysScale properties When performing a Group by operation
      ▶Set CountDistinctKeys or CountDistinctScale properties while performing Distinct count operation.
      ▶because the tranformation is able to allocate adequate memory for the data that the transformation caches.

    • The Aggregate transformation has one input and one or more outputs. It does not support an error output.

    • full blocking

    count distinct ask more space from ram???
    isbig

    Union All

    • Combines multiple inputs into one output.
    • Inputs are added to the transformation output one after the other; no reordering of rows occurs.
    • At least one input must be mapped to each output column.
    • Metadata of the columns must match. dayatype must be matched
    • Columns that are not mapped, are set to null values in output.
    • This transformation has multiple inputs and one output. It does not support an error output.
    • semi-blocking

    Merge

    • Combines two sorted datasets into a single dataset.
    • Merging is based on values in their key columns.
    • It can be used to perform:
      ▶Merge data from two data sources.
      ▶Create complex datasets by nesting Merge transformations.
      ▶Remerge rows after correcting errors in the data.
    • Inputs have matching metadata.
    • This transformation has two inputs and one output. It does not support an error output.
    • semi blocking

    Merge Join

    • Combines two sorted datasets into a single dataset by using a FULL, LEFT, or INNER join.
    • It can configured as follows:
      ▶Specify the join: FULL, LEFT, or INNER join.
      ▶Specify the columns the join uses.
      ▶Specify whether the transformation handles null values as equal to other nulls.
    • Joining columns have matching metadata.
    • This transformation has two inputs and one output. It does not support an error output.
    • semi blocking
    • swap input

    sorting in the property, it can improve performance(?)

    Conditional Split

    • It can route data rows to different outputs depending on the content of the data.

    • It evaluates expressions, and based on the results, directs the data row to the specified output.

    • It can configured as follows:
      ▶Specify an expression that evaluates to a Boolean for each condition.
      ▶Specify the order in which the conditions are evaluated.
      ▶Specify the default output for the transformation.

    • This transformation has one input, one or more outputs and one error output.

    • like case statement in sql

    • non-blocking

    • 注意大小写

    eg: [id]>1 && [id]<5

    Multicast

    • Multicast transformation distributes its input to one or more outputs.
    • Multicast transformation directs every row to every output, unlike Conditional Split directs a row to a single output.
    • You configure the Multicast transformation by adding outputs.
    • This transformation has one input and multiple outputs. It does not support an error output.
    • non- blocking
    • like copy the dataset

    区别: conditional的是符合条件的去,multicast是复制粘贴,不管条件

    Row Count

    • Count the number of rows, as they pass through a data flow and stores the final count in a variable.
    • Variable used Must already exist.
    • Variable must be in the scope of Data Flow Task using Row Count.
    • Row count value is stored in the variable only after the last row has passed the transformation.
    • This transformation has one input and one output. It does not support an error output
    • non-blocking
    • https://www.c-sharpcorner.com/UploadFile/muralidharan.d/how-to-use-rowcount-in-ssis/

    Audit

    • Audit transformation enables the data flow in a package to include data about the environment in which the package runs.
    • SSIS includes system variables that provide this information.
    • It can be configured as:
      ▶Provide the name of a output column.
      ▶Map the system variable to the output column.
    • Single system variable can be mapped to multiple columns.
    • This transformation has one input and one output. It does not support an error output.
    • non blocking

    how to show the num

    相关文章

      网友评论

          本文标题:Data Flow Transformations(2):sor

          本文链接:https://www.haomeiwen.com/subject/bnvaectx.html