esc0rp posted on 2024-8-4 12:18:10

The difference between dual-stream merge and multi-stream merge in ETL


    <h1 style="color: black; text-align: left; margin-bottom: 10px;">1. ETL Tool</h1>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">ETLCloud is a data integration platform that combines real-time data integration, offline data integration, and API publishing. Compared with other open-source data integration tools, it uses a lightweight architecture and offers faster deployment, faster data transfer, and lower operations costs. It also supports multi-tenant team collaboration and can meet a wide range of complex enterprise data processing needs. It ships with a rich set of ETL components, and flows are built by drag and drop, which makes it friendly to beginners and non-developers.</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-axegupay5k/e32d92c88b534c2990cb7a5c5d33491f~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723342725&amp;x-signature=sKiYr%2BPBkZjMYsIQwbssPBCTgtE%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/8b823d6381994944affdace17dd62df7~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723342725&amp;x-signature=ingZ%2BVzx%2BEoiniklocjxaEwkrws%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Today we introduce two components that are used frequently in ETL: the dual-stream join merge component and the multi-stream UnionAll merge component.</p>
    <h1 style="color: black; text-align: left; margin-bottom: 10px;">2. Component Demo</h1>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">1. Dual-stream join merge component</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">First create a flow, then find the dual-stream merge component under the data operation components. As the name suggests, this component merges two data streams; the join is the same as the inner and outer joins of SQL. We therefore need to drag in two input components, and here we use table input components. The flow design is as follows:</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/1229801aff574a80a66758eefa003eb2~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723342725&amp;x-signature=kwPcAL87aVtwOPlcPtUZHXvf1dI%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Table input configuration: configure the data source, choose the table, and set the input fields.</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/c2376339ad124cdc9f7aacab1f9a2c94~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723342725&amp;x-signature=gYjKypt0BsLLIm5FfChQ6HRDPz8%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Dual-stream join component: click the component to open its configuration page. Anyone who understands the join operation in SQL can get started right away.</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/2bedeb4953f04adb8b570883b37ddb6e~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723342725&amp;x-signature=fOQNtVlvkuEXwQ9Gls9L0%2Bc28AY%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The join mode has three options: left join, inner join, and Cartesian product.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">LEFT JOIN returns all rows of the left table plus the rows of the right table that match them. Where the right table has no matching row, NULL values are returned. INNER JOIN is the most commonly used join; it joins two tables on the values of their common columns and returns only the rows that satisfy the join condition, that is, the rows of the two tables that the condition links together. The Cartesian product combines every row of one table with every row of the other, so the size of the result set is the product of the two row counts.</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/98297a074aca495ba6030448e93085f8~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723342725&amp;x-signature=91cZcRreFjRlhnfcX%2B1NTafD54k%3D" style="width: 50%; margin-bottom: 20px;"></div>
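    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The three join modes can be sketched in plain Python to make the row counts concrete. This is only an illustration of the SQL semantics described above, not ETLCloud's implementation; the sample streams <code>orders</code> and <code>customers</code>, their fields, and the function names are all hypothetical.</p>

```python
from itertools import product

# Hypothetical sample streams; names and fields are illustrative only.
orders = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 2, "customer_id": 20},
    {"order_id": 3, "customer_id": 99},  # has no matching customer
]
customers = [
    {"customer_id": 10, "name": "Ann"},
    {"customer_id": 20, "name": "Bob"},
]


def inner_join(left, right, key):
    """Keep only the pairs of rows whose key values are equal."""
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in right
        if left_row[key] == right_row[key]
    ]


def left_join(left, right, key):
    """Keep every left row; right-side fields become None when unmatched."""
    right_fields = {field for row in right for field in row if field != key}
    result = []
    for left_row in left:
        matches = [row for row in right if row[key] == left_row[key]]
        if matches:
            result.extend({**left_row, **row} for row in matches)
        else:
            result.append({**left_row, **{f: None for f in right_fields}})
    return result


def cartesian_product(left, right):
    """Pair every left row with every right row; no condition is applied."""
    return list(product(left, right))
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">With these samples the inner join yields 2 rows, the left join yields 3 rows (the unmatched order carries <code>None</code> for <code>name</code>), and the Cartesian product yields 3 × 2 = 6 pairs.</p>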
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Taking left join as the example: decide which data stream serves as the left table and which as the right table according to your own needs.</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/376044c9ea9349ea8a02fc3e83c356dd~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723342725&amp;x-signature=6uWEKhRwm0aGyfAsVbZCF4EZWQI%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Join condition configuration: rows that satisfy the condition are kept.</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/d0d28411764d46a2b2512680bbb458c8~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723342725&amp;x-signature=uZBTScZxVSio1UnyEnTnYBYXC6Q%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The field configuration determines which fields are kept and which are dropped. The merged data for table A follows this field configuration; any field not in this list is removed.</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/69d092edc41d462e8f66afd34e7914ef~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723342725&amp;x-signature=dLOt8At%2FawJDVWnYmCBHh9oyeaY%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Specify the fields of table B that should be added to table A, and delete the fields that are not needed.</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/bd399d6dccb74bccb0b57f47d7685fc8~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723342725&amp;x-signature=N8NhG8zdX%2B%2BpxRLIRtALfpHWcpo%3D" style="width: 50%; margin-bottom: 20px;"></div>
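    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The effect of the two field lists can be sketched as a projection applied to each joined pair of rows. This is a minimal sketch under assumptions: the function <code>merge_fields</code>, its parameters <code>a_keep</code> and <code>b_add</code>, and the sample rows are hypothetical and are not part of ETLCloud.</p>

```python
def merge_fields(a_row, b_row, a_keep, b_add):
    """Build one output row from a joined pair of rows.

    a_keep: fields of table A to keep (everything else in A is dropped).
    b_add:  fields of table B to add to the output row.
    b_row may be None for an unmatched left-join row; the added
    fields are then filled with None.
    """
    out = {field: a_row[field] for field in a_keep}
    for field in b_add:
        out[field] = b_row[field] if b_row is not None else None
    return out


# Hypothetical rows from table A and table B.
a_row = {"order_id": 1, "customer_id": 10, "internal_note": "x"}
b_row = {"customer_id": 10, "name": "Ann", "phone": "000"}

merged = merge_fields(a_row, b_row,
                      a_keep=["order_id", "customer_id"],
                      b_add=["name"])
unmatched = merge_fields(a_row, None, a_keep=["order_id"], b_add=["name"])
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Here <code>internal_note</code> is dropped because it is not in the A list, and <code>phone</code> never reaches the output because it is not in the B list.</p>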
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Click Save and run the flow. The result is shown below; the data is output according to our configuration.</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/1550429e03a74780b2cd15b509a00bb3~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723342725&amp;x-signature=BJh8CmPeW6IhIvCSlC6IoJQC3QQ%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">2. Multi-stream merge component</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Drag in the multi-stream Union merge component and create the flow shown below. Compared with the dual-stream join component, the multi-stream merge component differs in two respects: it merges multiple streams into a single stream, and it combines the data of different nodes into new data.</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/c052182199a44b90a005b0f8bf63d3d8~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723342725&amp;x-signature=paJ2os1WuU%2FvCOqkXvjuB6GBLh4%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Open the multi-stream Union merge configuration page. It simply keeps the fields that are needed, drops the rest, and then merges the data of all the streams for output.</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/52f150b8723e48329328cbb9c1f78120~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723342725&amp;x-signature=jd9%2F7Y5Pgzaas%2BfhL%2FZpydGdIOU%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">In the field configuration, select the fields we need.</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/fa26e0a17f7b49d0b0cf02a21fbb83b8~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723342725&amp;x-signature=kIYG%2F5RAboOWuM3Qi6xtxcTMjDo%3D" style="width: 50%; margin-bottom: 20px;"></div>
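    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">A union of several streams with a field selection can be sketched as follows. The function <code>union_all</code> and the sample streams are hypothetical. Two behaviors here are assumptions borrowed from SQL's UNION ALL and from convenience, not confirmed ETLCloud behavior: duplicate rows are kept, and a selected field that a row lacks is filled with <code>None</code>.</p>

```python
def union_all(streams, fields):
    """Append the rows of every stream, keeping only the selected fields.

    Assumptions (not confirmed for ETLCloud): duplicates are kept, as in
    SQL UNION ALL, and a missing field is filled with None.
    """
    return [
        {field: row.get(field) for field in fields}
        for stream in streams
        for row in stream
    ]


# Hypothetical input streams with partly different fields.
stream_a = [{"id": 1, "name": "Ann", "city": "X"}]
stream_b = [{"id": 2, "name": "Bob"}, {"id": 1, "name": "Ann"}]
stream_c = [{"id": 3, "age": 30}]

combined = union_all([stream_a, stream_b, stream_c], fields=["id", "name"])
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The output has one row per input row (1 + 2 + 1 = 4), so a union grows the row count, whereas a join widens each row with fields from the other stream.</p>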
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Run the merge and check the log, which shows the following:</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/f5128dcaf55f408a855fbd73c7b8e0d1~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723342725&amp;x-signature=wTdx8FEgh1l%2BjuRrHU9%2F8wjamCg%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The dual-stream join merges rows according to the join condition, while the multi-stream union merges the fields of each stream's data and then outputs everything together as new data.</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/30824b10e8e44fc697cfa25b9d34c99d~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723342725&amp;x-signature=%2BFS5e2um73S23%2Bi9iZOkRmr985k%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <h1 style="color: black; text-align: left; margin-bottom: 10px;">3. Summary</h1>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Two components used frequently in ETL are the dual-stream join merge component and the multi-stream UnionAll merge component.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Dual-stream join merge component: merges two data streams and supports left join, inner join, and Cartesian product. Depending on the need, users can choose a left join to keep all rows of the left table, an inner join to return only the rows that satisfy the condition, or a Cartesian product to return every possible combination of rows from the two tables. Configuration is simple and intuitive: data is merged according to the join condition and the field configuration, and the result is output.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Multi-stream UnionAll merge component: merges multiple data streams into a single stream, combining the data of different nodes into new data. Users select the fields to keep, and the data of all the streams is then merged and output. After selecting the required fields in the field configuration, run the merge and check the log to see the merged data.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">In short, the dual-stream join merge component is suited to merging data streams according to a condition, while the multi-stream UnionAll merge component is suited to merging several streams into one new data stream. These components give ETLCloud powerful data processing capabilities and make data integration and processing convenient for users.</p>



