Posted by m5k1umn on 2024-9-28 19:09:22

AI Assistant in Practice: Tencent Docs' Road of Exploration


    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Author: tensorchen</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">This article describes how the Tencent Docs AI assistant was explored and built, from the perspectives of application architecture and large-model enablement. As an all-in-one AI product, Tencent Docs integrates AI deeply across every document type to improve efficiency in both daily life and office work. With Tencent Docs AI, a sudden idea can quickly be turned into detailed content that flows between different document types from a single source. Faced with large volumes of complex information, Tencent Docs AI can also analyze and process it, helping you extract what is valuable and turn it into your own understanding.</span></p>
    <h1 style="color: black; text-align: left; margin-bottom: 10px;">Chapter 1: The Challenges Large Models Bring to Productivity Tools</h1>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">With the release and explosive popularity of ChatGPT, the whole world seemed to turn its attention to large language models. The emergence of key capabilities such as strong language understanding and generation, context memory, learning from corrections, and chain-of-thought reasoning marked a technical inflection point for "AIGC". Developers everywhere found themselves holding a hammer as mighty as Thor's and wanted to hit every nail in sight, so in the early days of the large-model boom a claim went around: "Every app in the world can be rebuilt with a large model." The claim does not hold up to scrutiny, but large models can deliver striking efficiency gains in some fields, productivity tools above all. Viewed from the following macro perspectives, they do present productivity tools with a major opportunity.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Technology: text generation is relatively mature</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The earliest application of large models was text generation, and text generation is also where the technology has advanced fastest and matured most. Productivity tools, which are where users do their writing, are a natural fit: large models can greatly lower the barrier to creating content and raise the efficiency of doing so.</span></p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-axegupay5k/e7c34a8f28a945c2abcdd4911e38cc47~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1727623940&amp;x-signature=htiq4dUOckhh4TL%2F%2FPIOJyJNQX4%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Users: extremely high attention</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">User attention is an angle of analysis that people easily overlook. However impressive a new technology or concept is when it appears, it ultimately has to become a product that serves users. A technology or product that wins praise but not adoption is not a truly good one.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The Baidu keyword search index shows that since its debut ChatGPT has reached a broad user base with very high acceptance and interest. The search index peaked at 850,000, which arguably makes it this year's "Spring Festival Gala of the internet". Comparing against historical data makes the scale of this search interest more concrete:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The metaverse, the previous breakout concept, peaked at a search index of only 100,000, less than 1/8 of ChatGPT's peak.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">On Lunar New Year's Eve 2022, the search index for the Spring Festival Gala keyword was 1,500,000; ChatGPT's peak has already reached more than half of that.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">● </span></strong><strong style="color: blue;"><span style="color: black;">Historical pattern: tools are always the first to change</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">History does not repeat itself, but it rhymes. In every wave of new technology up to the present, tools have been the first to be transformed. Each generation has its own productivity tools.</span></p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/b2d91ae2c1af47c0acc902ebb0e86f0b~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1727623940&amp;x-signature=C6z%2Boe1bbAmMpUT0BmKIPEs564k%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Competitor data, both worldwide and in China, confirms the same thing: users readily accept the combination of document tools and AI, demand is strong, and this is a high-quality area for landing large models after the current boom.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Among the 100 most-visited AI products worldwide, 12 are competing document tools; among the 100 most-visited AI products in China, 26 are competing document tools.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">This is a new opportunity, but also a real challenge. Transforming established productivity workflows does not happen overnight: educating users, building and shipping product capabilities, differentiating from competitors, and finding a business model are all new challenges. This article focuses on how AI technology lands in the product and on the model side. The other topics are not covered at length here and are left for a later update.</span></p>
    <h1 style="color: black; text-align: left; margin-bottom: 10px;">Chapter 2: Technical Thinking and Architecture of Docs AI</h1>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">This chapter describes, from a technical perspective, the concrete architecture behind the Tencent Docs AI engineering effort, along with our technical thinking on landing AI applications.</span></p>
    <h1 style="color: black; text-align: left; margin-bottom: 10px;">2.1 Technical Thinking for AI Applications</h1>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">In practice, our way of thinking can be summarized as:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">1. </span></strong><strong style="color: blue;"><span style="color: black;">What is hard for people is also hard for AI</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">2. </span></strong><strong style="color: blue;"><span style="color: black;">If a program can do it, do not make AI do it</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Here is an analogy that may not fit perfectly:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Consider a person catching fish: the person thinks, decides, and uses a fishing net (the tool) to catch the fish. An ordinary person does not actually make the net. Making one would require someone to teach the relevant skills, and the process is slow, laborious, and yields poor results.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Borrowing from the proverb about teaching a person to fish: AI plays the role of the person, and the tool plays the role of the fishing.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">In a real Docs case, take AI helping a user polish a PPT: the AI understands that the user wants the PPT polished and decides to use the PPT beautification tool to do it. The AI does not polish the PPT itself. For it to do so, someone would have to teach it the skill (train the model on massive amounts of high-quality PPT beautification data), and that process is slow, laborious, and yields poor results.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Task: "Change the font of the whole PPT to SimSun"</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">AI: understands from the conversation that the user intends to change the font, and which font is wanted</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Tool: the Docs PPT font-adjustment tool performs the actual change</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Task: "Create a PPT about the history of the Ming dynasty"</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">AI: understands from the conversation that the user intends to create a PPT, and that the topic is Ming dynasty history</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">AI: generates an outline and detailed text for the Ming dynasty history topic</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Tool (image search): searches for images based on the outline and adds them to the PPT</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Tool (PPT template): combines the outline, text, and images with a template to generate the complete PPT</span></p>
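    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The font task above can be sketched roughly as follows. This is a minimal illustration, not the Tencent Docs implementation: <code>Slide</code>, <code>Intent</code>, <code>recognize_intent</code>, <code>set_font</code>, and <code>TOOLS</code> are all hypothetical names, and the keyword rule only stands in for a real model call that would extract the intent and its arguments.</span></p>

```python
from dataclasses import dataclass, field


@dataclass
class Slide:
    text: str
    font: str = "Arial"


@dataclass
class Intent:
    tool: str
    args: dict = field(default_factory=dict)


def recognize_intent(utterance):
    """Stand-in for the AI step. A real system would ask an LLM for the
    tool name and arguments; a keyword rule fakes that here."""
    if "font" in utterance and "SimSun" in utterance:
        return Intent(tool="set_font", args={"font": "SimSun"})
    raise ValueError(f"unrecognized request: {utterance!r}")


def set_font(slides, font):
    """Deterministic tool: ordinary code applies the style change."""
    for slide in slides:
        slide.font = font


# Hypothetical tool registry keyed by tool name.
TOOLS = {"set_font": set_font}


def handle(utterance, slides):
    intent = recognize_intent(utterance)        # AI: understand what is wanted
    TOOLS[intent.tool](slides, **intent.args)   # tool: actually do it
    return intent


deck = [Slide("Title"), Slide("Agenda")]
handle("Change the font of the whole PPT to SimSun", deck)
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The point of the split is that the model only has to produce a small structured decision, while the change itself stays in code that can be tested.</span></p>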
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Tencent Docs itself covers many document types, for example Word, Excel, PPT, PDF, forms (Form), mind maps, flowcharts, smart sheets (SmartSheet), smart documents (SmartCanvas), and a whiteboard type that is currently in progress.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Each document type is a product form built around its output, with content and form layered together (Word requires formatting work; PPT requires users to learn visual polish). At its core is the expression of content.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">So when landing AI applications in Tencent Docs, our usual technical stance is to </span><strong style="color: blue;"><span style="color: black;">use AI for content problems and use engineering for form and style problems</span></strong><span style="color: black;">.</span></p>
    <h1 style="color: black; text-align: left; margin-bottom: 10px;">2.2 Docs AI Technical Architecture</h1>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">AICopilot</span></strong><span style="color: black;">: provides the entry service for the AI sidebar conversation. It is mainly responsible for intent recognition and tool dispatch, intent retention, graceful degradation, caching logic, and conversation archiving.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">AIServer</span></strong><span style="color: black;">: provides the floating-layer assistant capabilities specific to each document type.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">AIAgent</span></strong><span style="color: black;">: positioned as the AI agent layer. It currently provides the set of capability tools for each document type, that is, the interfaces that are actually invoked once an upper-layer service has recognized the intent.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">AIEngine</span></strong><span style="color: black;">: the AI engine service for Docs. It abstracts and wraps AI capabilities behind unified definitions (mainly text-to-text, text-to-image, TTS, ASR, OCR, and Embedding), hides the differences between individual AI providers, and lays the foundation for Docs to switch between them seamlessly.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">AIOperation</span></strong><span style="color: black;">: gradual-rollout strategies, privacy authorization (with graceful handling), and operations tasks related to Docs AI.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">AIExtension</span></strong><span style="color: black;">: AI extension services. It contains, and is planned to contain, the other supporting capabilities needed when landing AI applications, such as text search, image search, and a Python execution engine.</span></p>
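    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">To make the AIEngine idea concrete, here is a minimal sketch of a unified text-to-text abstraction with swappable backends. The class names (<code>TextGenerator</code>, <code>VendorA</code>, <code>VendorB</code>) and the <code>register</code>/<code>use</code> methods are assumptions made for illustration; the article does not describe the real interface.</span></p>

```python
from abc import ABC, abstractmethod


class TextGenerator(ABC):
    """Unified text-to-text interface that callers depend on."""

    @abstractmethod
    def generate(self, prompt):
        """Return generated text for the prompt."""


class VendorA(TextGenerator):
    # Hypothetical backend; a real one would call a model service.
    def generate(self, prompt):
        return f"[vendor-a] {prompt}"


class VendorB(TextGenerator):
    # A second hypothetical backend with the same interface.
    def generate(self, prompt):
        return f"[vendor-b] {prompt}"


class AIEngine:
    """Routes calls to whichever registered backend is active."""

    def __init__(self):
        self._backends = {}
        self._active = None

    def register(self, name, backend):
        self._backends[name] = backend

    def use(self, name):
        if name not in self._backends:
            raise KeyError(name)
        self._active = name

    def generate(self, prompt):
        return self._backends[self._active].generate(prompt)


engine = AIEngine()
engine.register("a", VendorA())
engine.register("b", VendorB())

engine.use("a")
first = engine.generate("Summarize this document")
engine.use("b")  # switching providers needs no change in calling code
second = engine.generate("Summarize this document")
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The same pattern would presumably repeat for the other capability types listed above (text-to-image, TTS, ASR, OCR, Embedding), each behind its own interface.</span></p>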
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/74a3190b3e9844e28f115af761774bed~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1727623940&amp;x-signature=7ZgzKhq2MrKN5YNN2TkvY1FBDl8%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <h1 style="color: black; text-align: left; margin-bottom: 10px;">2.3 Docs AI Middle-Platform Architecture</h1>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The idea of a Docs AI middle platform started from the fact that Tencent Docs itself has 10 document types. We wanted to enable each type through a middle-platform solution, and that is how it has been built and shipped. The need goes beyond Tencent Docs itself: given the department's overall product matrix, the foundational Docs x AI capabilities have to be delivered as a middle platform that enables other products as well.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The Docs AI middle platform is decoupled from specific models and from specific product applications. The result is a Docs x AI solution that covers the document AI domain as a whole and can enable different AI application products.</span></p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/acf0a039f44041c69769ab1c95f930c5~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1727623940&amp;x-signature=9PrypNY%2B6fAvFW5vII2B4rxQ9xA%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <h1 style="color: black; text-align: left; margin-bottom: 10px;">2.4 The Zhongshuge (中书阁) AI Application Framework</h1>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">While landing the Docs AI applications and middle platform, we also abstracted the AI technology and its surrounding capability ecosystem into an AI application framework. Its </span><strong style="color: blue;"><span style="color: black;">positioning</span></strong><span style="color: black;">: an application framework for landing AI applications. </span><strong style="color: blue;"><span style="color: black;">Vision</span></strong><span style="color: black;">: AI For Everyone, lowering the technical barrier to AI applications and improving the efficiency of developing them.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Principles</span></strong><span style="color: black;">:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">1. </span><strong style="color: blue;"><span style="color: black;">Standardization: </span></strong><span style="color: black;">mainly takes on the first two Oteam deliverables, the AI application standard and the AI application specification, which will ultimately reach business developers through the framework's standardization work.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">2. </span><strong style="color: blue;"><span style="color: black;">Visualization: </span></strong><span style="color: black;">LLM applications often interact with the model several times and call external tools along the way. Visualizing that process helps with development and debugging, troubleshooting, and operational analysis.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The framework will provide a UI platform with a visual view of the LLM application process (including latency analysis, token consumption, and so on).</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The framework will also provide LLM observability, reporting metrics, distributed traces, and logs based on the OpenTelemetry standard.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">3. </span><strong style="color: blue;"><span style="color: black;">Multi-language framework: </span></strong><span style="color: black;">implementations will be provided in multiple programming languages to fit different business scenarios and technology stacks.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The framework is friendly to people without an AI background: it abstracts modules and capabilities from the user's point of view, and its multi-language support lets AI application developers focus on shipping AI product capabilities and tuning their results.</span></p>
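    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">As a rough illustration of the visualization principle, the sketch below records one span per model or tool call, with its duration and token count. It uses only the standard library and is not the OpenTelemetry API; <code>Trace</code>, <code>span</code>, and <code>fake_llm</code> are hypothetical names, and counting whitespace-separated words is only a placeholder for real token accounting.</span></p>

```python
import time
from contextlib import contextmanager


class Trace:
    """Collects one record per step of an LLM application run."""

    def __init__(self):
        self.spans = []

    @contextmanager
    def span(self, name):
        record = {"name": name, "tokens": 0}
        start = time.perf_counter()
        try:
            yield record  # the caller may fill in record["tokens"]
        finally:
            record["seconds"] = time.perf_counter() - start
            self.spans.append(record)

    def total_tokens(self):
        return sum(s["tokens"] for s in self.spans)


def fake_llm(prompt):
    """Hypothetical model call returning (text, tokens_used)."""
    return prompt.upper(), len(prompt.split())


trace = Trace()
with trace.span("llm:outline") as s:
    _, s["tokens"] = fake_llm("outline for ming dynasty deck")
with trace.span("tool:image_search"):
    pass  # a tool call consumes time but no tokens

for s in trace.spans:
    print(f'{s["name"]}: {s["seconds"]:.6f}s, {s["tokens"]} tokens')
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Records of this shape are what a UI would render as a timeline, and what an exporter could map onto trace spans.</span></p>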
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/f963512a049440e8b382471d8e92daff~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1727623940&amp;x-signature=5JCOXzP7UiSCzJ2pyUOFjFmBVUI%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <h1 style="color: black; text-align: left; margin-bottom: 10px;">Chapter 3: Application-Side Practice in Docs AI</h1>
    <h1 style="color: black; text-align: left; margin-bottom: 10px;">3.1 Question Answering</h1>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">One of the core capabilities of a document product is </span><strong style="color: blue;"><span style="color: black;">conveying information</span></strong><span style="color: black;">. AI question answering over large volumes of information is therefore one of the key scenarios for AI. In Docs this covers questions about the content of Word, PPT, Sheet, mind maps, forms, knowledge bases, and more.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">A key part of the engineering work is building a base solution for document question answering. The crux of this kind of problem is how to make the large model understand domain knowledge (here, the content of a specific document).</span></p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/3e0b6652be2a4636b5d06916a7e265c6~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1727623940&amp;x-signature=b0z5izQ8HYzxQB3gISgV6uCF9Bs%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">There are generally two approaches:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Option 1</span></strong><span style="color: black;">: put the domain knowledge into the model weights through fine-tuning (FT), or overlay it onto the model weights dynamically through LoRA.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Option 2</span></strong><span style="color: black;">: pass the domain knowledge to the model at request time through the context.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">A user's documents are a collection of that user's own information and mainly serve that user. We cannot train a dedicated model for every user. For timeliness, documents change frequently and we cannot retrain on every change. For privacy, we also cannot use user data for training. Option 1 is clearly not viable.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">What we actually ship in Docs is therefore Option 2: </span><strong style="color: blue;"><span style="color: black;">pass the domain knowledge to the model at request time through the context.</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">This technique is called RAG (Retrieval-Augmented Generation): an approach that combines retrieval from a specific knowledge base with large-model generation, used for complex knowledge-intensive tasks such as knowledge question answering.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The overall solution chains together the following modules:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Document loading</span></strong><span style="color: black;">: defines a unified Document data model and ships default loaders for typical data sources; business teams can also implement the interface to load their own document sources.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Document splitting</span></strong><span style="color: black;">: the context size of a large language model is limited, so large amounts of data have to be split into chunks.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Document Embedding</span></strong><span style="color: black;">: the embedding step turns each piece of text into a vector that better captures its meaning.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Document vector storage</span></strong><span style="color: black;">: a vector database stores the document vectors.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Document recall</span></strong><span style="color: black;">: retrieves the document content most relevant to the user's question.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Answering</span></strong><span style="color: black;">: passes the recalled content together with the user's question to the large language model to answer.</span></p>
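    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The modules above can be sketched end to end in a few lines. This is a toy under stated assumptions, not the production pipeline: the bag-of-words <code>embed</code> function stands in for a real embedding model, the in-memory <code>VectorStore</code> stands in for a vector database, and every name here is hypothetical. The final step only builds the prompt and does not call a model.</span></p>

```python
import math
import re
from collections import Counter


def split_into_chunks(text, max_words=10):
    """Document splitting: fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: word counts instead of a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class VectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []

    def add(self, chunk):
        self.items.append((embed(chunk), chunk))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[0]),
                        reverse=True)
        return [chunk for _, chunk in ranked[:k]]


def build_prompt(question, chunks):
    """Answering: recalled context plus the question go to the model."""
    context = "\n".join(chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


document = (
    "The quarterly budget review is scheduled for Friday. "
    "The design team will present the new slide templates. "
    "Travel expenses must be submitted before the end of the month."
)

store = VectorStore()
for chunk in split_into_chunks(document):
    store.add(chunk)

question = "When are travel expenses due?"
top = store.search(question, k=1)
prompt = build_prompt(question, top)
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Note that fixed-size windows can cut a sentence in half, as they do here. Real splitters usually respect sentence or paragraph boundaries and overlap adjacent chunks.</span></p>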
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/9f467ae926844567b5a4aa57ebb06bf8~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1727623940&amp;x-signature=o9WQhCsT4T8hj7XcGwF%2F%2B7ft6Tc%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">To handle the following two scenarios, we plan a further upgrade on top of the existing architecture.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">1. Handle metadata questions, summarization questions, and non-summarization questions</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">2. Handle question answering over multimodal documents</span></p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/1fde6ecc4dc745d8af92cd0cc5e8e5ba~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1727623940&amp;x-signature=owREUtWaAF51Ih3%2BgnG1NlAgMA4%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <h1 style="color: black; text-align: left; margin-bottom: 10px;">3.2 Intent Recognition</h1>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">For Docs AI to deliver real value, user intent has to be turned into concrete actions.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Challenge 1: hundreds of instruction scenarios</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Challenge 2: intents and task flows all differ, and usually involve chaining several tools together</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Examples of real user requests:</span></p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/57375c0349044bceb88d2bedc82d0849~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1727623940&amp;x-signature=%2BEznD97HPu1OvIfw3BUKCn1WRyc%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Given the variety of user input, the keys to landing AI features are </span><strong style="color: blue;"><span style="color: black;">intent recognition</span></strong><span style="color: black;"> and </span><strong style="color: blue;"><span style="color: black;">task orchestration</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● Use a PromptID as the unique index for each task</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● Standardize capabilities and expose them as tools</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● Orchestrate tasks "as code" (following GitLab's approach, using yml files to orchestrate the hundreds of task scenarios)</span></p>
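    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The three points above might fit together as in the following sketch. The article says the real configuration is yml; JSON is used here only so the example runs with the standard library. The PromptID <code>create_ppt</code>, the step schema, and the three tool functions are all invented for illustration.</span></p>

```python
import json

# Declarative pipeline definitions, indexed by PromptID (hypothetical schema).
PIPELINES = json.loads("""
{
  "create_ppt": {
    "steps": [
      {"tool": "generate_outline", "output": "outline"},
      {"tool": "search_images",    "output": "images"},
      {"tool": "render_template",  "output": "deck"}
    ]
  }
}
""")


def generate_outline(ctx):
    # In a real system this step would call the model.
    return [f"{ctx['topic']}: overview", f"{ctx['topic']}: timeline"]


def search_images(ctx):
    return [f"image-for<{title}>" for title in ctx["outline"]]


def render_template(ctx):
    return list(zip(ctx["outline"], ctx["images"]))


# Standardized tools, looked up by name from the configuration.
TOOLS = {
    "generate_outline": generate_outline,
    "search_images": search_images,
    "render_template": render_template,
}


def run(prompt_id, **inputs):
    """Execute the steps configured for one PromptID, in order.
    Each tool reads from the shared context and writes one output key."""
    ctx = dict(inputs)
    for step in PIPELINES[prompt_id]["steps"]:
        ctx[step["output"]] = TOOLS[step["tool"]](ctx)
    return ctx


result = run("create_ppt", topic="Ming dynasty")
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Adding a new scenario then means adding a configuration entry and reusing existing tools, rather than writing a new code path.</span></p>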
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/1954ea9e914c49c980bbe8fe625dc551~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1727623940&amp;x-signature=FOjSxWtJp%2BceSjQxArsMgQyXOpU%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The bigger challenge in understanding user intent is multi-intent recognition: a user may ask to change the font and the font size in the same request. A single function call cannot solve this, because a function call has a limited set of parameters and cannot anticipate every user action.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">There are roughly two feasible approaches:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Approach 1: multi-turn function calls</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Approach 2: code generation</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">We ultimately chose code generation, mainly because multi-turn function calling cannot, in practice, resolve the order in which tasks must run, whereas generated code can.</span></p>
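    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">A minimal sketch of why generated code can carry both multiple intents and their order. The <code>Doc</code> class and its methods are hypothetical, and the "generated" script is hard-coded here in place of real model output.</span></p>

```python
class Doc:
    """Hypothetical editing API exposed to generated code."""

    def __init__(self) -> None:
        self.style = {"font": "Arial", "size": 11}
        self.log = []  # records the order in which operations ran

    def set_font(self, name: str) -> None:
        self.style["font"] = name
        self.log.append("set_font")

    def set_font_size(self, size: int) -> None:
        self.style["size"] = size
        self.log.append("set_font_size")

# Stand-in for model output for a request such as
# "change the font to SimSun, then set the size to 14".
generated_code = """
doc.set_font("SimSun")
doc.set_font_size(14)
"""

def run_generated(code: str, doc: Doc) -> None:
    # Expose only the doc object and no builtins. A production system
    # would need a real sandbox, which this sketch does not provide.
    exec(code, {"__builtins__": {}, "doc": doc})

doc = Doc()
run_generated(generated_code, doc)
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The statement order in the script fixes the execution order, so both intents are covered in one response without a separate call per intent.</span></p>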
    <h1 style="color: black; text-align: left; margin-bottom: 10px;">3.3 Spreadsheet scenarios</h1>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The biggest challenge in the spreadsheet scenario is the volume of table content: given the context capacity of current large models, only a limited number of cells can be supported. The core strategy for very large tables is to have the AI return the method for producing the result (that is, code) instead of the result itself.</span></p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p26-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/915311e86fd14b45b33ae8f8d2223161~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1727623940&amp;x-signature=k8sP97iqgcnosCnvU1B5JYU2oJM%3D" style="width: 50%; margin-bottom: 20px;"></div>
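    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">One way to picture this strategy: only the header and a few sample rows go into the prompt, and the code returned by the model then runs locally over the full table. The sketch below assumes that workflow; the helper names and the hard-coded "generated" code are hypothetical.</span></p>

```python
import csv
import io

# A table far larger than what would fit in a prompt (simulated here).
rows = [{"product": f"P{i}", "sales": str(i * 10)} for i in range(1, 10001)]

def table_preview(rows: list, sample: int = 3) -> str:
    """Build the small context sent to the model: header plus a few rows."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows[:sample])
    return buf.getvalue()

preview = table_preview(rows)

# Stand-in for the code the model would return after seeing only `preview`
# and a question such as "what is the total of the sales column?".
generated_code = "result = sum(float(r['sales']) for r in rows)"

def run_on_full_table(code: str, rows: list) -> float:
    # A single namespace so names stay visible inside the generator expression.
    scope = {"__builtins__": {}, "rows": rows, "sum": sum, "float": float}
    exec(code, scope)
    return scope["result"]

total = run_on_full_table(generated_code, rows)
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The prompt size stays constant no matter how many rows the table has; only the local execution cost grows with the table.</span></p>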
    <h1 style="color: black; text-align: left; margin-bottom: 10px;">Chapter 4: Model-side technical practice for the Docs AI</h1>
    <h1 style="color: black; text-align: left; margin-bottom: 10px;">4.1 Model for writing scenarios</h1>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Using data augmentation to strengthen weak capabilities</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">For writing ability, we use methods such as Self-Instruct and Evol-Instruct to construct similar seed instructions, then augment the data by evolving the instructions toward greater complexity and generalizing them. This can follow a fairly standardized pipeline:</span></p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/ed78983af78246d8abfee4c22c9e5142~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1727623940&amp;x-signature=DXBQ2ihsVZ6kB9XvGmIPMexIKSY%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Collect seed instructions</span></strong><span style="color: black;">: gather new requirements and manually write simple seed instructions;</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Diversify instructions</span></strong><span style="color: black;">: following the breadth transformations of Self-Instruct and Evol-Instruct, vary the seed instructions to cover more domains, topics, and formats;</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Make instructions more complex</span></strong><span style="color: black;">: following the depth transformations of Evol-Instruct (for example, adding constraints, adding reference examples, or adding concretizing operations), attach constraints to the seed instructions so that they become more complex, adding 3-10 constraints to each instruction;</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Generalize instructions</span></strong><span style="color: black;">: paraphrase the evolved instructions to further enrich their wording and form, rewriting each instruction in 3-5 ways;</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Collect results</span></strong><span style="color: black;">: annotate the evolved instructions above and collect results for them;</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Clean results</span></strong><span style="color: black;">: apply Self-Refine, manual review, and similar checks so that the accuracy of the collected results approaches 100%.</span></p>
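    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The diversify, complicate, and generalize steps above can be sketched as a small pipeline. This is a toy illustration, not the authors' implementation: the function names are hypothetical, and a rule-based stub stands in for the LLM that would normally do the rewriting.</span></p>

```python
from typing import Callable, List

def diversify(seed: str, topics: List[str]) -> List[str]:
    """Breadth transform: restate the seed across several topics."""
    return [seed.format(topic=t) for t in topics]

def complicate(instruction: str, constraints: List[str]) -> str:
    """Depth transform: append constraints to make the instruction harder."""
    return instruction + " " + " ".join(constraints)

def generalize(instruction: str, rewrite: Callable[[str, int], str], n: int) -> List[str]:
    """Paraphrase the evolved instruction into n surface forms."""
    return [rewrite(instruction, i) for i in range(n)]

# Stub paraphraser; in practice an LLM would do the rewriting.
def stub_rewrite(text: str, i: int) -> str:
    prefixes = ["Please", "Kindly", "I need you to"]
    return f"{prefixes[i % len(prefixes)]} {text[0].lower()}{text[1:]}"

seed = "Write a short article about {topic}."
constraints = [
    "Keep it under 300 words.",
    "Include a title.",
    "End with a call to action.",
]

augmented = []
for inst in diversify(seed, ["urban planning", "skincare"]):
    evolved = complicate(inst, constraints)
    augmented.extend(generalize(evolved, stub_rewrite, 3))
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Two topics and three paraphrases per evolved instruction yield six training instructions from a single seed; result collection and cleaning would follow as separate steps.</span></p>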
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Contrastive learning to improve the stability of understanding</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">For tasks whose differences are small and hard to tell apart, such as missed constraints, negative constraints, and numeric requirements, we construct dedicated contrastive samples. These samples can be added to the SFT stage directly, or built into pair data and used in the preference-learning (reinforcement learning) stage.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Local contrast</span></strong><span style="color: black;">: when an instruction carries many constraints, the model struggles to honor all of them and tends to miss some. By removing constraints from the instruction one at a time while leaving everything else unchanged, we add local-contrast samples, so that the model has seen which response corresponds to each constraint being present and being absent.</span></p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/7f38926959074de29db21703b04fa367~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1727623940&amp;x-signature=f8MztH1hZvnW9N4fuPNeDkodx8g%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Negation contrast</span></strong><span style="color: black;">: for negative constraints, construct contrastive samples by removing the negative requirement and by inverting it</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Write an email about booking our skincare specialist in advance to enjoy professional facial care and personalized skincare recommendations. The email must include the basic parts: subject, recipient, sender, and body. State in the email that the recipient needs to confirm the booking and arrange the specialist within 48 hours of booking, and remind the recipient to reply with the booking information by phone or email. Do not use the closing phrase "顺祝商祺" (a formal Chinese business sign-off).</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Number-variation contrast</span></strong><span style="color: black;">: change the numbers in an instruction's numeric requirements to construct contrastive samples</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Write a short essay on future urban planning that stresses the importance of sustainable development and green mobility. Also discuss how to use existing resources effectively to reduce environmental impact. Be sure to include at least three innovative planning strategies, and provide corresponding examples or cases in the text.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Write a short essay on future urban planning that stresses the importance of sustainable development and green mobility. Also discuss how to use existing resources effectively to reduce environmental impact. Be sure to include at least six innovative planning strategies, and provide corresponding examples or cases in the text.</span></p>
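    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The local-contrast and number-variation constructions above are mechanical enough to sketch in code. The helper names below are hypothetical and the example constraints are shortened; the sketch only builds the contrasting instructions, and each one would still need its own response before being used for SFT or preference learning.</span></p>

```python
import re
from typing import List, Tuple

def local_contrast(base: str, constraints: List[str]) -> List[Tuple[str, str]]:
    """Drop one constraint at a time, keeping everything else unchanged.

    Returns (full_instruction, instruction_without_one_constraint) pairs.
    """
    full = " ".join([base] + constraints)
    pairs = []
    for i in range(len(constraints)):
        kept = constraints[:i] + constraints[i + 1:]
        pairs.append((full, " ".join([base] + kept)))
    return pairs

def number_contrast(instruction: str, old: str, new: str) -> Tuple[str, str]:
    """Swap one numeric requirement to build a contrastive pair."""
    pattern = rf"\b{re.escape(old)}\b"
    return instruction, re.sub(pattern, new, instruction, count=1)

base = "Write an email inviting customers to book a facial."
constraints = [
    "Include a subject line.",
    "Mention the 48-hour confirmation window.",
    'Do not use the phrase "Best regards".',
]
local_pairs = local_contrast(base, constraints)

original, varied = number_contrast(
    "Include at least three innovative planning strategies.", "three", "six"
)
```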
    <h1 style="color: black; text-align: left; margin-bottom: 10px;">4.2 Model for spreadsheet scenarios</h1>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Formula generation</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Beyond recognizing basic formula requests (such as "sum column A"), formula generation also understands professional terminology from popular industries. For example, when a user asks which product has the highest working capital turnover, Hunyuan draws on its built-in knowledge that [working capital turnover = sales / average working capital] and then computes the working capital turnover of each product.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">In addition, the technical approach combines chain of thought (CoT) with code generation (program of thought, PoT) to address the unstable results caused by nested formulas.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Chain of thought (CoT) is regarded as one of the most pioneering and influential prompt-engineering techniques; it can improve the performance of large language models in decision-making.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">CoT forces the model to split its reasoning into intermediate steps. The approach resembles human cognition, which breaks a complex challenge into smaller, more manageable parts.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Program of thought (PoT) is a distinct LLM reasoning method. Rather than only generating a natural-language answer, it requires the model to produce an executable program that can run on an interpreter such as Python and yield a concrete result.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">PoT offers a clearer, more expressive, and better-grounded way of deriving an answer, which improves accuracy and understanding.</span></p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/846d2ed31d8f4ed7b41c9dde4714cd3c~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1727623940&amp;x-signature=VZ3nbpiHUodjuGZsE5T7GQG8pCU%3D" style="width: 50%; margin-bottom: 20px;"></div>
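    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">As a rough illustration of combining CoT with PoT on the working capital turnover question, the sketch below treats the reasoning steps as comments and backs each step with executable code. The table values are made up, and the program text is hard-coded here as a stand-in for model output.</span></p>

```python
# Table rows as the model would see them (illustrative numbers).
products = [
    {"product": "A", "sales": 1200.0, "avg_working_capital": 300.0},
    {"product": "B", "sales": 900.0, "avg_working_capital": 150.0},
    {"product": "C", "sales": 2000.0, "avg_working_capital": 800.0},
]

# Stand-in for a CoT + PoT style model output: the reasoning steps appear
# as comments, and each step is backed by executable code.
generated_program = '''
# Step 1: working capital turnover = sales / average working capital
turnover = {p["product"]: p["sales"] / p["avg_working_capital"] for p in products}
# Step 2: pick the product with the largest turnover
answer = max(turnover, key=turnover.get)
'''

# Run the program with only the names it needs; this is not a real sandbox.
scope = {"__builtins__": {}, "products": products, "max": max}
exec(generated_program, scope)
turnover, answer = scope["turnover"], scope["answer"]
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Because the interpreter does the arithmetic, the model only has to get the formula and the steps right, which is where nested formulas tend to go wrong when answered in free text.</span></p>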
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Chart generation</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The core of chart generation consists of six modules. Three of them (rejection, step-wise rewriting, and code generation) are inference modules built on large models, and the model behind each has been fine-tuned.</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Specifically:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● The rejection model judges how relevant the user's question is to the table, and declines questions that are unrelated to the table or that are not charting requests</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● The step-wise rewriting model breaks the charting task into multiple executable steps, tailored to the table and the question at hand</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● The code generation model generates Python table-visualization code from those steps.</span></p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/08707e7573034578b980f32e2a0457c9~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1727623940&amp;x-signature=WybGu%2F5k62FsOE%2ByHXFxQpkAY2g%3D" style="width: 50%; margin-bottom: 20px;"></div>
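    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The three model-backed modules can be pictured as a short pipeline. In the sketch below every stage is a rule-based stub standing in for a fine-tuned model, the function names are hypothetical, the other three modules are not shown, and the emitted matplotlib-style code is returned as text without being executed.</span></p>

```python
from typing import List, Optional

CHART_WORDS = ("chart", "plot", "graph", "draw")

def reject(question: str, columns: List[str]) -> Optional[str]:
    """Stub rejection step: return a refusal message, or None to continue."""
    q = question.lower()
    if not any(w in q for w in CHART_WORDS):
        return "Not a charting request."
    if not any(c.lower() in q for c in columns):
        return "Question is unrelated to this table."
    return None

def rewrite_steps(question: str, columns: List[str]) -> List[str]:
    """Stub step-wise rewriting: turn the request into executable steps."""
    used = [c for c in columns if c.lower() in question.lower()]
    return [
        f"select columns {used}",
        "choose a bar chart",
        f"plot {used[-1]} against {used[0]}",
    ]

def generate_code(steps: List[str], columns: List[str]) -> str:
    """Stub code generation: emit plotting code as text (not executed here)."""
    lines = [f"# {s}" for s in steps]
    lines.append(f"plt.bar(table[{columns[0]!r}], table[{columns[-1]!r}])")
    return "\n".join(lines)

def chart_pipeline(question: str, columns: List[str]) -> str:
    refusal = reject(question, columns)
    if refusal is not None:
        return refusal
    return generate_code(rewrite_steps(question, columns), columns)

columns = ["month", "sales"]
code = chart_pipeline("Draw a chart of sales by month", columns)
refused = chart_pipeline("What is the weather today?", columns)
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Putting rejection first keeps unrelated questions from ever reaching the more expensive rewriting and code-generation stages.</span></p>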
    <h1 style="color: black; text-align: left; margin-bottom: 10px;">Chapter 5: Summary</h1>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Drawing on the rollout of Tencent Docs AI, here are some lessons from developing an AI assistant:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">What is hard for people is hard for AI too</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">If a program can do it, do not hand it to the AI</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">● </span><strong style="color: blue;"><span style="color: black;">Use AI for content-related problems; use engineering for form or styling problems</span></strong></p>




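4lqedz posted on 2024-10-4 19:56:27

Looking forward to the original poster's next share!

nykek5i posted on 2024-10-27 08:48:01

Thanks to the original poster for sharing, and best wishes to the forum!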