l14107cb 发表于 2024-8-31 00:36:39

字节跳动微服务架构下的高性能优化实践


    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/mmbiz_gif/YriaiaJPb26VPQqHC66RJFpttVIMWG83T3lWHahUD4bvhxlKSayjeV2ibvC5ydqklP9QHDPD3qHJM07TV3IfHstjA/640?wx_fmt=gif&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1" style="width: 50%; margin-bottom: 20px;"></p>作者 |&nbsp;<span style="color: black;">CloudWeGo 开源团队-王卓炜</span>
    <span style="color: black;">01 前言</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">2019 年,字节跳动服务框架组针对大规模微服务架构下遇到的功能和性能痛点,以及吸收历史上旧框架下<span style="color: black;">累积</span>的经验与教训,着手<span style="color: black;">研发</span>了 RPC 框架 Kitex 以及周边一系列<span style="color: black;">关联</span><span style="color: black;">基本</span>库,并在 2021 年正式在 Github 上开源。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">从 2019 年走到如今的 2023 年,内部微服务规模经历了巨大的扩张,Kitex 框架<span style="color: black;">亦</span><span style="color: black;">这里</span>过程中,经历了一次又一次的性能优化与考验。这篇<span style="color: black;">文案</span><span style="color: black;">期盼</span>分享<span style="color: black;">这里</span>过程中<span style="color: black;">咱们</span>所<span style="color: black;">累积</span>的<span style="color: black;">有些</span>性能优化实践,<span style="color: black;">亦</span>为<span style="color: black;">咱们</span>过去几年的优化工作做一个系统性的梳理总结。</p>
    <span style="color: black;">02 Kitex 的前世今生</span>

    <span style="color: black;"><span style="color: black;">为何</span>需要 RPC 框架</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">虽然 RPC 框架的历史由来已久,但真正被大规模<span style="color: black;">做为</span>核心组件广泛<span style="color: black;">运用</span>,其实与微服务架构的流行是分不开的。<span style="color: black;">因此</span><span style="color: black;">咱们</span>有必要回顾下历史,探究<span style="color: black;">为何</span><span style="color: black;">咱们</span>需要 RPC 框架。</p>
    单体架构时代 <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">这段时期系统服务的<span style="color: black;">重点</span>特点有:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">经过</span>函数分割<span style="color: black;">区别</span>业务<span style="color: black;">规律</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">性能压力<span style="color: black;">重点</span>集中在数据库,于是产生了数据库层面从分库分表这种手动分布式到真正的自动分布式架构演进过程</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">平常</span>的业务代码如下:</p><span style="color: black;"><span style="color: black;"><span style="color: black;">func</span> <span style="color: black;">BuySomething</span><span style="color: black;">(userId <span style="color: black;">int</span>, itemId <span style="color: black;">int</span>)</span></span> {</span><span style="color: black;"> user := GetUser(userId)</span><span style="color: black;"> sth := GetItem(itemId)</span><span style="color: black;">}</span><span style="color: black;"><span style="color: black;"><span style="color: black;">func</span> <span style="color: black;">GetUser</span><span style="color: black;">(userId)</span></span> {</span><span style="color: black;"> <span style="color: black;">return</span> db.users.GetUser(userId)</span><span style="color: black;">}</span><span style="color: black;"><span style="color: black;"><span style="color: black;">func</span> <span style="color: black;">GetItem</span><span style="color: black;">(itemId)</span></span> {</span><span style="color: black;"> <span style="color: black;">return</span> db.items.GetItem(itemId)</span><span style="color: black;">}</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/sz_mmbiz_png/BvYvvlJIDbIaFKIUtTYEcRyYUbBKu38FLpx2OGApDfJqIvb2ZtgETxUUoJLjAs6fHtLgrRsF76qgZrXy1GyuSw/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">这种编码模式简单直接,在本身具备良好的设计模式时,非常易于重构和编写单元测试。<span style="color: black;">日前</span>许多 IT 系统依然采用这种模式。<span style="color: black;">然则</span>随着互联网业务极速发展,在<span style="color: black;">有些</span>超大型的互联网项目中,<span style="color: black;">亦</span>触碰到了<span style="color: black;">有些</span>天花板:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">计算能力天花板</strong>:一个请求<span style="color: black;">持有</span>的计算能力上限 &lt;= 单服务器总计算能力 / <span style="color: black;">同期</span>处理请求数</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">开发</span>效率天花板</strong>:代码仓库体积,团队人数与编码<span style="color: black;">繁杂</span>度不是线性增长的关系,越往后<span style="color: black;">守护</span>难度越大,上线难度<span style="color: black;">亦</span>越大。</p>
    微服务架构时代<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">为<span style="color: black;">认识</span>决单体架构的<span style="color: black;">以上</span>问题,<span style="color: black;">咱们</span>来到了微服务架构的时代。微服务架构的典型代码如下:</p><span style="color: black;"><span style="color: black;">func</span> <span style="color: black;">BuySomething</span>(<span style="color: black;">userId</span> <span style="color: black;">int</span>, <span style="color: black;">itemId</span> <span style="color: black;">int</span>) {</span><span style="color: black;"> <span style="color: black;">user </span>:= client.<span style="color: black;">GetUser</span>(userId) // RPC call</span><span style="color: black;"> sth := client.<span style="color: black;">GetItem</span>(itemId) // RPC call</span><span style="color: black;">}</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/sz_mmbiz_png/BvYvvlJIDbIaFKIUtTYEcRyYUbBKu38FZZ3LDwiazWOSY5bfjMrZgyfaiaickeJ5OiaL6M9dgnH6WFVp5lxPBXCr6Q/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">RPC(Remote Procedure Call,远程过程调用) 的<span style="color: black;">道理</span>在于:让业务能够像调用本<span style="color: black;">地区</span>法<span style="color: black;">同样</span>调用远程服务,对业务感知度降到最低,<span style="color: black;">从而</span>做好从单体架构向微服务架构演进过程中,对业务编码习惯的改变降到最小。</p>
    性能优化的方向 <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">在不<span style="color: black;">运用</span> RPC 的<span style="color: black;">状况</span>下,如下图的代码<span style="color: black;">独一</span>的调用开销仅仅只是一个函数调用的开销,不<span style="color: black;">思虑</span>内联优化的<span style="color: black;">状况</span>下,是一个纳秒级的开销。</p><span style="color: black;"><span style="color: black;"><span style="color: black;">func</span> <span style="color: black;">client</span><span style="color: black;">()</span> <span style="color: black;">(response)</span></span> {</span><span style="color: black;"> response = server(request) <span style="color: black;">// function call</span></span><span style="color: black;">}</span><span style="color: black;"><span style="color: black;"><span style="color: black;">func</span> <span style="color: black;">server</span><span style="color: black;">(request)</span> <span style="color: black;">(response)</span></span> {</span><span style="color: black;"> response.Message = request.Message</span><span style="color: black;">}</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">而将其换成 RPC 调用后,调用开销直接飙升到了毫秒级:</p><span style="color: black;"><span style="color: black;"><span style="color: black;">func</span> <span style="color: black;">client</span><span style="color: black;">()</span> <span style="color: black;">(response)</span></span> {</span><span style="color: black;"> response = client.RPCCall(request) <span style="color: black;">// rpc call - network</span></span><span style="color: black;">}</span><span style="color: black;"><span style="color: black;"><span style="color: black;">func</span> <span style="color: black;">server</span><span style="color: black;">(request)</span> <span style="color: black;">(response)</span></span> {</span><span style="color: black;"> response.Message = request.Message</span><span style="color: black;">}</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">这是一个 10^6 级别的延迟差异,既证明了 RPC 的代价很大,<span style="color: black;">亦</span>证明了其中的优化空间<span style="color: black;">亦</span>很大。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">一次<strong style="color: blue;"> RPC 调用的完整过程</strong>如下,后面<span style="color: black;">咱们</span>会针对每一个环节给出<span style="color: black;">咱们</span>在其上所做的性能优化实践:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/sz_mmbiz_png/BvYvvlJIDbIaFKIUtTYEcRyYUbBKu38FrGu9qiburBPy53aeSG0klRtyEpXKs708oFicsRsibteqmG9PicIsokcTuw/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;"></p>
    <span style="color: black;"><span style="color: black;">为何</span>自研 RPC 框架</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">在<span style="color: black;">认识</span>性能实践之前,<span style="color: black;">咱们</span>需要解释一件事情是,<span style="color: black;">为何</span>在已有众多 RPC 框架时,<span style="color: black;">咱们</span>依然<span style="color: black;">选取</span>自研一个新的 RPC 框架。<span style="color: black;">重点</span>有以下<span style="color: black;">原由</span>:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">机构</span>内部<span style="color: black;">重点</span>以 Thrift 协议通信,而主流 Go 框架大多不支持 Thrift 协议,<span style="color: black;">亦</span><span style="color: black;">不易</span>做多协议的扩展</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">机构</span>内部对性能有极致的<span style="color: black;">需求</span>,需要从全链路上做深度优化(后面会举例)</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">机构</span>内部微服务规模巨大,场景<span style="color: black;">繁杂</span>,需要一个支持深度定制的高扩展性框架</p>
    <span style="color: black;">Kitex 是什么</span>
    发展历程<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Kitex 从 2019 年正式立项,2020 年在内部发布,2021 年正式开源,直到 2023 年 2 月,已有超过 6 万微服务在<span style="color: black;">运用</span>。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/sz_mmbiz_png/BvYvvlJIDbIaFKIUtTYEcRyYUbBKu38FlkIY7Ef8Tpagb5NiacHQibgb6liaP267vsE0afUrulQrHoCe9LV6Bxykg/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;"></p>
    CloudWeGo <span style="color: black;">大众</span>族<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">在<span style="color: black;">研发</span> Kitex 主框架的<span style="color: black;">同期</span>,<span style="color: black;">咱们</span><span style="color: black;">亦</span>把许多与 Kitex 并不耦合的高性能组件<span style="color: black;">亦</span>一一开源出来,<span style="color: black;">从而</span>形<span style="color: black;">成为了</span> CloudWeGo 的<span style="color: black;">大众</span>族生态:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/sz_mmbiz_png/BvYvvlJIDbJ2USrk55YJLHZX9F2Inmu1HEXRdSuEJGdAoSH9ibUusGWjhibTfQb5cx5oJxaqicB6z6uGgwAnzY7dA/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/sz_mmbiz_png/BvYvvlJIDbIaFKIUtTYEcRyYUbBKu38Fk9JMHaeEdviaibtnycfp5gL997J3fDBlg2uwczyNZrqOPfVetPc3tRcg/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;"></p>
    <span style="color: black;">Kitex 与其他框架对比</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Kitex <span style="color: black;">同期</span>支持 Thrit 与 gRPC 协议,但 Go 生态下支持 Thrift 的框架并不多,<span style="color: black;">因此</span><span style="color: black;">这儿</span><span style="color: black;">咱们</span><span style="color: black;">选取</span><span style="color: black;">运用</span> gRPC 协议来与 grpc-go 框架进行横向对比:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">gRPC Unary 对比:</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;"><img src="https://mmbiz.qpic.cn/sz_mmbiz_png/BvYvvlJIDbJ2USrk55YJLHZX9F2Inmu1816Ec38ZoqsicibYiaCsEqnHcuQiattYUbjD0pPmeZb3zyyaia6yrDepKPQ/640?wx_fmt=png&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1&amp;tp=webp" style="width: 50%; margin-bottom: 20px;"></span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;"><img src="https://mmbiz.qpic.cn/sz_mmbiz_png/BvYvvlJIDbJ2USrk55YJLHZX9F2Inmu1UQiaVKAcepCicHSBkBrMkoakC1VG2cjTgMRa0iaSYSibV9zO3vmLYdJelg/640?wx_fmt=png&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1&amp;tp=webp" style="width: 50%; margin-bottom: 20px;"></span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">gRPC Steaming 对比:</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/sz_mmbiz_png/BvYvvlJIDbJ2USrk55YJLHZX9F2Inmu1Xl2skCiaV0V8RRgMsN2H5sLIq7cTKD4ATnBib6CCiaKAmJicic4s7GUsuAw/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/sz_mmbiz_png/BvYvvlJIDbJ2USrk55YJLHZX9F2Inmu1Z17M24p8K8n95TjvluCvRGQicvslODVibfOdzryAjaLWJjBDLicf5icgKA/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;"></p>
    <span style="color: black;">02 Kitex 框架性能优化实践</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Kitex 许多性能优化的思路其实并不与 Go 语言相绑定,但<span style="color: black;">这儿</span>为方便,<span style="color: black;">咱们</span><span style="color: black;">重点</span>以 Go 来举例。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">接下来<span style="color: black;">咱们</span>会顺着前面一次 RPC 调用的完整流程图,一一介绍 Kitex 的性能优化实践。</p>
    <span style="color: black;">编解码优化</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/sz_mmbiz_png/BvYvvlJIDbIaFKIUtTYEcRyYUbBKu38FSpNiaukK7LZbDbSvekdzx90icicQSMRQFZEHsUbpcsbE0g35puhmNZQBA/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;"></p>
    <span style="color: black;">平常</span>编解码的问题 <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">以 Protobuf 为例:</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">计算开销:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">需要运行时<span style="color: black;">经过</span>反射获取额外信息</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">需要调用众多函数并创建众多小对象</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">GC 开销:<span style="color: black;">不易</span>重用内存</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/sz_mmbiz_png/BvYvvlJIDbJ2USrk55YJLHZX9F2Inmu1Dx0aQ4nKy3ZfiaHpaAqz9BsupWdKZpDLAR2Dz1BJTKX3lF2XqJtRYxw/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;"></p>
    生成代码优化:FastThrift &amp; FastPB <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">咱们</span>在 Kitex 支持的两种协议 Thrift 和 Protobuf 中,都实现了<span style="color: black;">经过</span><span style="color: black;">海量</span>生成代码来实现编解码的能力。<span style="color: black;">因为</span>生成代码<span style="color: black;">能够</span>最大化提前预置好运行时的信息,<span style="color: black;">因此</span><span style="color: black;">能够</span><span style="color: black;">供给</span>以下好处:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">1. 预计算好 Size,并重用该内存</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">序列化时,<span style="color: black;">咱们</span><span style="color: black;">能够</span>用极低的成本调用 Size() 并以此提前创建好<span style="color: black;">一起</span>固定<span style="color: black;">体积</span>的内存空间。</p><span style="color: black;"><span style="color: black;">type</span> User <span style="color: black;">struct</span> {</span><span style="color: black;"> Id <span style="color: black;">int32</span></span><span style="color: black;"> Name <span style="color: black;">string</span></span><span style="color: black;">}</span><span style="color: black;"><span style="color: black;"><span style="color: black;">func</span> <span style="color: black;">(x *User)</span> <span style="color: black;">Size</span><span style="color: black;">()</span> <span style="color: black;">(n <span style="color: black;">int</span>)</span></span> {</span><span style="color: black;"> n += x.sizeField1()</span><span style="color: black;">n += x.sizeField2()</span><span style="color: black;"> <span style="color: black;">return</span> n</span><span style="color: black;">}</span><span style="color: black;"><span style="color: black;">// Framework Process</span></span><span style="color: black;">size := msg.Size()</span><span style="color: black;">data = Malloc(size)</span><span style="color: black;">Encode(user, data) <span style="color: black;">// encoding user object directly into the allocated memory to save one time of copy</span></span><span style="color: black;">Send(data)</span><span style="color: black;">Free(data) <span style="color: black;">// reuse the allocated memory at next Malloc</span></span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">2. 尽可能减少函数调用和中间对象创建</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">虽然函数调用和小对象创建成本都很低,<span style="color: black;">然则</span><span style="color: black;">针对</span>编解码这种热路径,对这些低成本高频次代码的优化<span style="color: black;">亦</span>能带来非常大的收益,尤其是 Go 是带 GC 的语言。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">能够</span>看到,<span style="color: black;">因为</span>底层 fastWriteField 函数会在编译时被内联,<span style="color: black;">因此</span>序列化 FastWrite 函数本质上是在<strong style="color: blue;"><span style="color: black;">次序</span>写入<span style="color: black;">一起</span>固定内存空间</strong>(FastRead <span style="color: black;">亦</span>是类似)。</p><span style="color: black;"><span style="color: black;"><span style="color: black;">func</span> <span style="color: black;">(x *User)</span> <span style="color: black;">FastWrite</span><span style="color: black;">(buf []<span style="color: black;">byte</span>)</span> <span style="color: black;">(offset <span style="color: black;">int</span>)</span></span> {</span><span style="color: black;"> offset += x.fastWriteField1(buf)</span><span style="color: black;"> offset += x.fastWriteField2(buf)</span><span style="color: black;"> <span style="color: black;">return</span> offset</span><span style="color: black;">}</span><span style="color: black;"><span style="color: black;">// inline</span></span><span style="color: black;"><span style="color: black;"><span style="color: black;">func</span> <span style="color: black;">(x *User)</span> <span style="color: black;">fastWriteField1</span><span style="color: black;">(buf []<span style="color: black;">byte</span>)</span> <span style="color: black;">(offset <span style="color: black;">int</span>)</span></span> {</span><span style="color: black;"> offset += fastpb.WriteInt32(buf, <span style="color: black;">1</span>, x.Id)</span><span style="color: black;"> <span style="color: black;">return</span> offset</span><span style="color: black;">}</span><span style="color: black;"><span style="color: black;">// inline</span></span><span style="color: black;"><span style="color: black;"><span style="color: black;">func</span> <span style="color: black;">(x *User)</span> <span style="color: black;">fastWriteField2</span><span style="color: black;">(buf []<span style="color: black;">byte</span>)</span> <span style="color: black;">(offset <span style="color: black;">int</span>)</span></span> {</span><span style="color: black;">offset += fastpb.WriteString(buf,<span style="color: black;">2</span>, x.Name)</span><span style="color: black;"> <span style="color: black;">return</span> offset</span><span style="color: black;">}</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">优化效果</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">从前面的 3.58% 优化到了 0.98%:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    JIT 替代生成代码:Frugal(Thrift) <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">在前面硬编码的<span style="color: black;">办法</span>上,取得了不错收益后,<span style="color: black;">咱们</span><span style="color: black;">亦</span>接到了<span style="color: black;">有些</span>反馈,<span style="color: black;">例如</span>:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">生成代码体积随着字段<span style="color: black;">增多</span>而线性膨胀</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">生成代码依赖<span style="color: black;">运用</span>者命令行版本,多人协作易互相覆盖</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">因此</span>,<span style="color: black;">咱们</span>自然而然地产生了一个疑问,前面生成的代码能否<span style="color: black;">经过</span>运行时自动生成?这个问题本身其实<span style="color: black;">便是</span>一个答案,即需要引入 JIT(Just-in-time compilation) 技术来优化代码生成。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">优良</span>:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">运用</span>寄存器传递参数,以及更深度的内联,<span style="color: black;">加强</span>函数调用效率</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">核心计算函数<span style="color: black;">运用</span>被充分优化的汇编代码</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">优化效果</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">从前面的 3.58% 优化到了 0.78%:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Frugal VS Apache Thrift 编解码性能对比:</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    <span style="color: black;">网络库优化</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    原生 Go Net 在 RPC 场景的缺陷 <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">一连接一协程,在上下游实例数众多时,Goroutines 数量涨到<span style="color: black;">必定</span>程度之后性能会骤降,尤其<span style="color: black;">有害</span>于大规格实例业务。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">没法</span>自动连接感知关闭状态</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">一个 struct 在做 NoCopy 序列化时,产物<span style="color: black;">常常</span>是多维数组,而 Go 的 Write([]byte) 接口<span style="color: black;">没法</span>支持非连续内存数据的读写。</p><span style="color: black;">name := <span style="color: black;">"Steve Jobs"</span> <span style="color: black;">// 0xc000000020</span></span><span style="color: black;">req := &amp;Request{Id:<span style="color: black;">int32</span>(<span style="color: black;">1</span>), Name: name}</span><span style="color: black;"><span style="color: black;">// ===&gt; Encode to [][]byte</span></span><span style="color: black;">[</span><span style="color: black;"> [<span style="color: black;">4</span> bytes],</span><span style="color: black;"> [<span style="color: black;">10</span> bytes], <span style="color: black;">// no copy encoding, 0xc000000020</span></span><span style="color: black;">]</span><span style="color: black;"><span style="color: black;">// ===&gt; Copy to []byte</span></span><span style="color: black;">buf := [<span style="color: black;">4</span>bytes +<span style="color: black;">10</span> bytes] <span style="color: black;">// new address</span></span><span style="color: black;"><span style="color: black;">// ===&gt; Write([]byte)</span></span><span style="color: black;">net.Conn.Write(buf)</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">与 Go Runtime 强绑定,<span style="color: black;">有害</span>于改造支持<span style="color: black;">有些</span>新的实验特性。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    Netpoll 优化实践 <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">重点</span>优化点:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">协程优化:连接数与协程数不绑定;尽可能复用协程</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">中间层 Buffer:支持零拷贝读写和重用内存,最大化避免编解码时 GC 开销</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">针对 RPC 小包高并发的场景深度定制:协程调度优化,TCP 参数调优等</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">针对内场环境深度定制,<span style="color: black;">包含</span>:改造 Go Runtime <span style="color: black;">加强</span>调度优先级,内核支持批系统调用等</p>
    <span style="color: black;">通信层优化</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    同机通信优化:Service Mesh 下的通信效率问题<img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;">
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">在引入 Service Mesh 后,业务进程<span style="color: black;">重点</span>是和同机的另一个 sidecar 进程通信,由此产生了多一级的时延。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">传统的 Service Mesh <span style="color: black;">方法</span><span style="color: black;">通常</span>都过 iptables 劫持实现流量转发到 sidecar 进程,可想而知从各个层面看性能损耗都是非常夸张的。Kitex 在通信层做了<span style="color: black;">许多</span>的性能优化尝试,并<span style="color: black;">最后</span>产出了一套系统化的<span style="color: black;">处理</span><span style="color: black;">方法</span>。</p>
    同机通信优化:UDS 替代 TCP<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">UDS 与 TCP 的性能对比:</p><span style="color: black;"><span style="color: black;">======== IPC Benchmark - TCP ========</span></span><span style="color: black;"> Type Conns Size Avg P50 P99</span><span style="color: black;"> Client 10 4096 127μs 76μs 232μs</span><span style="color: black;">Client-R 10 4096 2μs 1μs 1μs</span><span style="color: black;"> Client-W 10 4096 9μs 4μs 7μs</span><span style="color: black;"> Server 10 4096 24μs 13μs 18μs</span><span style="color: black;"> Server-R 10 4096 1μs 1μs 1μs</span><span style="color: black;"> Server-W 10 4096 7μs 4μs 7μs</span><span style="color: black;"><span style="color: black;">======== IPC Benchmark - UDS ========</span></span><span style="color: black;"> Type Conns Size Avg P50 P99</span><span style="color: black;"> Client 10 4096 118μs 75μs 205μs</span><span style="color: black;"> Client-R 10 4096 3μs 2μs 3μs</span><span style="color: black;"> Client-W 10 4096 4μs 1μs 2μs</span><span style="color: black;">Server 10 4096 24μs 11μs 16μs</span><span style="color: black;"> Server-R 10 4096 4μs 2μs 3μs</span><span style="color: black;"> Server-W 10 4096 3μs 1μs 2μs</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">从性能测试中,<span style="color: black;">咱们</span><span style="color: black;">能够</span><span style="color: black;">发掘</span>两个结论:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">UDS 各指标都优于 TCP</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">然则</span>优化幅度并算不上非常大</p>
    同机通信优化:ShmIPC 替代 UDS<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">为了进一步压榨进程间通信的性能,<span style="color: black;">咱们</span><span style="color: black;">开发</span>了基于共享内存的通信模式。共享内存通信的难点在于,<span style="color: black;">怎样</span>处理好各个通信状态的进程间同步,<span style="color: black;">因此</span><span style="color: black;">咱们</span>采用的是自研的通信协议,并<span style="color: black;">保存</span> UDS <span style="color: black;">做为</span>事件<span style="color: black;">通告</span>管道(IO Queue),共享内存<span style="color: black;">做为</span>数据传输管道(Buffer):</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">更加多</span> shmipc 的技术细节参考过去<span style="color: black;">咱们</span>发布的<span style="color: black;">文案</span>:<a style="color: black;"><strong style="color: blue;">字节跳动开源 Sh</strong></a><strong style="color: blue;">mipc:基于共享内存的高性能 IP</strong>C。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">性能测试:</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    跨机转同机通信:合并<span style="color: black;">安排</span><span style="color: black;">处理</span><span style="color: black;">方法</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">前面<span style="color: black;">咱们</span>在同机通讯上进行了极致的优化,但这仅限于服务进程与 Service Mesh 的数据面通信,对端的服务大概率并不<span style="color: black;">安排</span>在本机。<span style="color: black;">那样</span><span style="color: black;">怎样</span>去优化跨机通信呢?</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">一个“取巧”的思路<span style="color: black;">便是</span>让<strong style="color: blue;">跨机问题转变成同机问题</strong>。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">要在大规模的微服务通信中完成这个事情,需要架构上下多层组件通力<span style="color: black;">协同</span>,<span style="color: black;">因此</span><span style="color: black;">咱们</span><span style="color: black;">开发</span>了合并<span style="color: black;">安排</span><span style="color: black;">处理</span><span style="color: black;">方法</span>:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">容器调度层改造</strong>:容器调度系统会基于合并的关系和上下游服务的实例<span style="color: black;">状况</span>,进行亲和性调度,将上下游的实例尽可能调度到一个<span style="color: black;">理学</span>机上。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">流量调度层改造</strong>:服务<span style="color: black;">掌控</span>面需要识别到某一个上游容器有<span style="color: black;">那些</span>同机下游,并在<span style="color: black;">思虑</span>全局负载<span style="color: black;">平衡</span>的<span style="color: black;">状况</span>下,针对<span style="color: black;">每一个</span>上游实例<span style="color: black;">状况</span>,计算其<span style="color: black;">拜访</span>下游实例的动态权重,尽可能让<span style="color: black;">更加多</span>的流量<span style="color: black;">能够</span>进行本地通信。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">框架改造</strong>:扩展定制支持合并<span style="color: black;">安排</span>特殊的通信方式,基于流量调度层的计算结果将请求发给同机实例或 Mesh Proxy。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    <span style="color: black;">03 微服务线上调优实践</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">除去在框架层<span style="color: black;">咱们</span>所做的性能优化外,其实线上还有很大一部分性能瓶颈是来自业务<span style="color: black;">规律</span><span style="color: black;">自己</span>。对此,<span style="color: black;">咱们</span><span style="color: black;">亦</span><span style="color: black;">累积</span>了<span style="color: black;">有些</span>实践经验。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    <span style="color: black;">自动化 GC 调优</span>

    Go 原生 GC 策略存在的问题 <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Go 并不是一门专门针<span style="color: black;">针对</span>微服务场景设计的语言,<span style="color: black;">因此</span>自然其 GC 策略并不侧重于在延迟<span style="color: black;">敏锐</span>的业务上做优化。但 RPC 服务<span style="color: black;">常常</span>是对 P99 延迟是有<span style="color: black;">必定</span><span style="color: black;">需求</span>的。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">咱们</span><span style="color: black;">能够</span>先来<span style="color: black;">瞧瞧</span> Go GC 的<span style="color: black;">基本</span>原理:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">GOGC 原理:</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">经过</span> GOGC 参数设置一个百分比值,默认 100,计算下一次 GC 触发时的堆<span style="color: black;">体积</span>:NextGC = HeapSize + HeapSize * (GOGC / 100) 。即默认为上一次 GC 后 Heap Size 的 2 倍。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">因此</span>假设某服务活跃内存占用是 100MB,则每次堆增长到 200MB 的时候就做触发 GC。即便这个服务有 4GB 的内存。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">缺点:</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">微服务场景,服务内存<span style="color: black;">广泛</span>利用率极低,但依然在做着较为激进的 GC</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">针对</span> RPC 场景,<span style="color: black;">海量</span>对象本身是可高度复用的,对这些复用对象对频繁 GC 会降低复用率</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">核心诉求:在<span style="color: black;">保准</span>安全的<span style="color: black;">状况</span>下,降低 GC 频率,<span style="color: black;">提高</span>微服务资源复用率</p>
    gctuner : 自动化调优 GC 策略 <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">用户<span style="color: black;">经过</span>设置阈值,来<span style="color: black;">掌控</span>自己想要的 GC 激进程度,例如设置成 memory_limit * 0.7,<span style="color: black;">小于</span>该值时,会尽可能增大 GCPercent。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">内存未达到设置的阈值时,GOGC 参数往大了设置,超过时,往小了设置。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">无论<span style="color: black;">怎样</span> GOGC 最小 50,最大 500。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">优良</span>:</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">低内存利用率时,延迟 GC</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">高内存利用率时,恢复到原生 GC 策略</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">需要<span style="color: black;">重视</span>的点:</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">倘若</span>内存资源并非当前进程<span style="color: black;">独霸</span>,需要为其他进程预留内存资源</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">不适用于内存易<span style="color: black;">显现</span>过分极端峰值的服务</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">gctuner <span style="color: black;">日前</span>已开源在 github 上:</span></strong><span style="color: black;"><span style="color: black;">https://github.com/bytedance/gop<span style="color: black;">公斤</span>/tree/develop/util/gctuner</span><span style="color: black;"> 。</span></span></p>
    <span style="color: black;">并发调优</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">我到底能用多少 CPU ?- 容器的谎言</strong></p><span style="color: black;"><span style="color: black;">apiVersion</span>: <span style="color: black;">v1</span></span><span style="color: black;"><span style="color: black;">kind</span>: <span style="color: black;">Pod</span></span><span style="color: black;"><span style="color: black;">spec</span>:</span><span style="color: black;"> <span style="color: black;">containers</span>:</span><span style="color: black;"> <span style="color: black;">-</span> <span style="color: black;">resources:</span></span><span style="color: black;"> <span style="color: black;">limits</span>:</span><span style="color: black;"> <span style="color: black;">cpu</span>: <span style="color: black;">"4"</span></span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">微服务的发展<span style="color: black;">伴同</span>着容器化技术的蓬勃发展,<span style="color: black;">日前</span>业内大部分微服务<span style="color: black;">乃至</span>数据库都运行于容器的环境中。<span style="color: black;">这儿</span><span style="color: black;">咱们</span>只讨论主流的基于 cgroup 技术的容器。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">平常</span>的业务<span style="color: black;">研发</span>模式是,<span style="color: black;">开发</span>人员在容器平台上申请了 4 核 CPU 的容器,<span style="color: black;">而后</span>自然而然认为自己的程序最多只能<strong style="color: blue;"><span style="color: black;">同期</span><span style="color: black;">运用</span></strong> 4 个 CPU,并对自己的程序套上这个假设进行参数调优。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">上线后,进到容器用 top 一看,各项指标确实是<span style="color: black;">根据</span> 4 核的标准在进行:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">乃至</span>用 <span style="color: black;">cat /proc/cpuinfo</span> 一看,<span style="color: black;">亦</span>能不多不少刚好看到 4 个 CPU。</p><span style="color: black;">processor : <span style="color: black;">0</span></span><span style="color: black;"><span style="color: black;">// ...</span></span><span style="color: black;">processor : <span style="color: black;">1</span></span><span style="color: black;"><span style="color: black;">// ...</span></span><span style="color: black;">processor : <span style="color: black;">2</span></span><span style="color: black;"><span style="color: black;">// ...</span></span><span style="color: black;">processor : <span style="color: black;">3</span></span><span style="color: black;"><span style="color: black;">// ...</span></span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">但<span style="color: black;">实质</span>上,这一切都只是容器为你封装出来的一个美好的假象。之<span style="color: black;">因此</span>要把这个假象做的这么逼真,只是为了让你摆脱编程时的心智<span style="color: black;">包袱</span>,顺便再让<span style="color: black;">哪些</span>传统的 Linux Debug 工具在容器环境中<span style="color: black;">亦</span>能正常运行。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">然而<span style="color: black;">实质</span>上,基于 cgroups 实现的容器技术限制的只是 CPU 时间,而非 CPU 个数。<span style="color: black;">倘若</span><span style="color: black;">实质</span>登陆到<span style="color: black;">设备</span>去看进程每一个线程正在<span style="color: black;">运用</span>的 CPU 号码,会发现加起来很可能是超过容器 CPU 设置的:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">容器申请 4 个 CPU 单位,<span style="color: black;">寓意</span>着<span style="color: black;">能够</span>在一个计算周期(<span style="color: black;">通常</span> 100ms)内以等价于 4 个 CPU 的时间运行,而非只能<span style="color: black;">运用</span> 4 个<span style="color: black;">理学</span> CPU,<span style="color: black;">亦</span>不<span style="color: black;">寓意</span>着最少<span style="color: black;">必定</span>能<span style="color: black;">同期</span>有 4 个 CPU 给程序<span style="color: black;">运用</span>。<span style="color: black;">倘若</span>超过了<span style="color: black;">运用</span>时间,该容器所有进程会被暂停执行直到计算周期结束 —— <span style="color: black;">亦</span><span style="color: black;">便是</span>程序可能<span style="color: black;">显现</span>卡顿(throttled)。</p>
    上游并行处理越快越好吗?- 并发与超时的关系 <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">当<span style="color: black;">咱们</span><span style="color: black;">晓得</span>了原来自己程序<span style="color: black;">准许</span>的<span style="color: black;">理学</span>并行计算能力其实上限很高后,<span style="color: black;">咱们</span>便<span style="color: black;">能够</span><span style="color: black;">运用</span>这个技巧调大 / 调小自己服务的工作线程数(GOMAXPROCS)<span style="color: black;">或</span>程序内的请求并发度。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">例如以下调用场景,业务以 4 并发度向同一个下游发送请求,<span style="color: black;">每一个</span>请求下游需要 50ms 时间处理,<span style="color: black;">因此</span>上游设置超时时间为 100ms。听起来很<span style="color: black;">恰当</span>,但<span style="color: black;">倘若</span>下游恰好那个时候仅有 2 个 CPU 能够处理请求,且恰好中间有<span style="color: black;">有些</span> GC 工作<span style="color: black;">或</span>其他工作,<span style="color: black;">那样</span>第 3 个 RPC 请求就会超时。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">而<span style="color: black;">倘若</span><span style="color: black;">咱们</span>设置并发度为 2,则超时概率会大大降低。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">当然这并不是说调小并发度<span style="color: black;">便是</span>好的,<span style="color: black;">倘若</span>下游计算能力远远冗余,<span style="color: black;">那样</span>调大并发度<span style="color: black;">才可</span>充分释放下游的处理能力。</p>
    拒绝内卷 - 为其他进程预留计算能力 <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">倘若</span>容器中存在其他进程,你需要<span style="color: black;">思虑</span>给其他进程预留资源。尤其是像 Service Mesh 数据面以同容器 sidecar <span style="color: black;">安排</span>的场景下,<span style="color: black;">倘若</span>一个上游进程把计算周期内分配的时间片用完了,轮到下游进程时,非常容易被 throttled ,<span style="color: black;">这般</span>服务总体的延迟依然是劣化的。</p>
    <span style="color: black;">怎样</span>调节服务并发度 <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">调节</span>工作线程数</strong>:例如在 Go 中开放了 GOMAXPROCS 来<span style="color: black;">调节</span>工作线程数。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">修改代码中的请求并发量</strong>:业务需要自己权衡并<span style="color: black;">持续</span>尝试,在调高并发<span style="color: black;">得到</span>的延迟收益和丧失的高峰期稳定性之间做权衡,找到一个合适的并发值。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">运用</span>批量接口</strong>:当然<span style="color: black;">倘若</span>业务场景<span style="color: black;">准许</span>的话,更好的做法是把这个接口更换成批量接口。</p><span style="color: black;">04 <span style="color: black;">将来</span>展望</span>
    <span style="color: black;">最后的堡垒:Kernel</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">日前</span><span style="color: black;">独一</span>的优化空白区:Kernel。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">在线上业务中,<span style="color: black;">咱们</span>常常<span style="color: black;">发掘</span><span style="color: black;">这般</span>的现象,即便<span style="color: black;">咱们</span>把 RPC 优化到同机通信的程度,<span style="color: black;">针对</span> IO 密集型服务,RPC 的通信开销依然时常能占总开销的 20% 以上。而当前<span style="color: black;">咱们</span><span style="color: black;">已然</span>把进程间通信优化到了非常极致的地步,<span style="color: black;">倘若</span>还想要进一步优化,只能触及到彻底打破当下 Linux 进程间通信中的约束了。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">咱们</span>在这方面<span style="color: black;">已然</span>取得了<span style="color: black;">有些</span>初步成果,在<span style="color: black;">将来</span>的<span style="color: black;">文案</span>中,会继续就这<span style="color: black;">一起</span>内容进行分享,敬请期待。</p>
    <span style="color: black;">重新思考 TCP 协议</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">数据中心内通信场景中, TCP 的缺陷:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">内网网络质量优异,丢包率极低,TCP 的<span style="color: black;">许多</span>设计存在浪费</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">大规模点对点通信,TCP 长连接容易退化为短连接</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">应用层以「<span style="color: black;">信息</span>」为单位,而 TCP 数据流无<span style="color: black;">信息</span>边界</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">这个<span style="color: black;">原由</span><span style="color: black;">从而</span>让<span style="color: black;">咱们</span><span style="color: black;">起始</span>思考,<span style="color: black;">是不是</span>应该有专有的数据中心协议来进行 RPC 通信?</p>
    <span style="color: black;"><span style="color: black;">连续</span>深耕现有组件</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">针对</span>现有组件,<span style="color: black;">咱们</span>依然会<span style="color: black;">连续</span>投入精力去进一步<span style="color: black;">提高</span>性能与<span style="color: black;">运用</span>场景:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">编解码器 Frugal</strong>:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">支持 ARM 架构</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">优化 SSA 后端</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">利用 SIMD 加速</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">网络库 &nbsp;Netpoll:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">重构接口,支持无缝接入现有 Go 生态库</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">SMC-R ( RDMA ) 支持</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">合并<span style="color: black;">安排</span>:</p>从同机走向同机柜粒度<img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;">
    项目<span style="color: black;">位置</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">GitHub:<span style="color: black;">https://github.com/cloudwego</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">官网:<span style="color: black;">www.cloudwego.io</span></p>
    活动<span style="color: black;">举荐</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">以「启航·AIGC 软件工程变革」为主题的 QCon <span style="color: black;">全世界</span>软件<span style="color: black;">研发</span>大会·北京站将于 9 月 3-5 日在北京•富力万丽酒店举办,此次大会策划了微服务架构治理、大前端新场景探索、大前端融合提效、大模型应用落地、面向 AI 的存储、AIGC 浪潮下的<span style="color: black;">开发</span>效能<span style="color: black;">提高</span>、LLMOps、异构算力、业务安全技术、构建<span style="color: black;">将来</span>软件的编程语言、FinOps 等近 30 个精彩专题。</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="data:image/svg+xml,%3C%3Fxml version=1.0 encoding=UTF-8%3F%3E%3Csvg width=1px height=1px viewBox=0 0 1 1 version=1.1 xmlns=http://www.w3.org/2000/svg xmlns:xlink=http://www.w3.org/1999/xlink%3E%3Ctitle%3E%3C/title%3E%3Cg stroke=none stroke-width=1 fill=none fill-rule=evenodd fill-opacity=0%3E%3Cg transform=translate(-249.000000, -126.000000) fill=%23FFFFFF%3E%3Crect x=249 y=126 width=1 height=1%3E%3C/rect%3E%3C/g%3E%3C/g%3E%3C/svg%3E" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">现已确认 130+ 名嘉宾,咨询购票可联系票务经理 18514549229(<span style="color: black;">微X</span>同手机号)。点击「<strong style="color: blue;">阅读原文</strong>」<span style="color: black;">就可</span>查看<span style="color: black;">所有</span>专题,期待与各位<span style="color: black;">研发</span>者现场交流。</p>




qzmjef 发表于 2024-10-4 20:24:08

“沙发”(SF,第一个回帖的人)‌
页: [1]
查看完整版本: 字节跳动微服务架构下的高性能优化实践