Nature: Does using AI tools when writing papers constitute plagiarism? Where is the line?
<span style="color: black;">This article is adapted from a piece in Nature titled "AI is complicating plagiarism. How should scientists respond?"</span><span style="color: black;">Plagiarism scandals have come thick and fast in academia this year: in January, Harvard's president resigned amid plagiarism allegations, and in February, peer-review reports were found to contain plagiarized text.</span><span style="color: black;">Related reading:</span><span style="color: black;"><a style="color: black;">Plagiarism in peer-review reports, too?</a></span><span style="color: black;">Academic writing faces an even bigger question, raised by the rapid spread of generative artificial-intelligence (AI) tools: <strong style="color: blue;">does using AI tools constitute plagiarism, and under what circumstances should their use be permitted?</strong></span><span style="color: black;">Generative AI tools such as ChatGPT, which are based on large language models (LLMs), can save time, improve the clarity of writing and reduce language barriers. Many researchers think that, in some circumstances, <strong style="color: blue;">these tools should be allowed, provided their use is fully disclosed</strong>.</span><span style="color: black;">However, such tools complicate an already heated debate over how to use other people's work legitimately. Because LLMs are trained on huge numbers of published articles and generate text from them, their use can <strong style="color: blue;">lead to plagiarism-like behaviour</strong>. Researchers might, for example, pass off AI-generated text as their own work, or use AI to produce text that closely mirrors existing papers without citing them. The tools can also be used to <strong style="color: blue;">disguise deliberately plagiarized content</strong>, and such use is hard to detect.</span><span style="color: black;">In a 2023 survey of 1,600 researchers, <strong style="color: blue;">68% of respondents said that AI will make plagiarism easier and harder to detect</strong>. Debora Weber-Wulff, a plagiarism-detection expert at the HTW Berlin University of Applied Sciences in Germany, says: "Everybody is worried about other people using these tools, and worried about missing out if they don't use them themselves."</span><span style="color: black;">Related reading:</span><span style="color: black;"><a style="color: black;">Nature in-depth survey: how 1,600 researchers view and use AI tools such as ChatGPT</a></span><p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><strong style="color: blue;">When AI meets plagiarism</strong></span></p><span style="color: black;">The US Office of Research Integrity defines plagiarism as "<strong style="color: blue;">the appropriation of another person's ideas, processes, results, or words without giving appropriate credit</strong>".</span><span style="color: black;">A 2015 study found that 1.7% of researchers admitted to having plagiarized, and 30% knew of colleagues who had done so.</span><span style="color: black;">LLMs could make matters worse. <strong style="color: blue;">Deliberate plagiarism is easily disguised if someone first uses an LLM to paraphrase another person's text</strong>. Muhammad Abdul-Mageed, a computer scientist in Canada, says that AI tools can be prompted to paraphrase in sophisticated ways, such as in the style of an academic journal.</span><span style="color: black;">A central question is: <strong style="color: blue;">does using entirely AI-written, unattributed content count as plagiarism?</strong> Many researchers say there is no settled answer.</span><span style="color: black;">The European Network for Academic Integrity, for example, defines the prohibited or undeclared use of AI in writing as "<strong style="color: blue;">unauthorized content generation</strong>" rather than plagiarism. Weber-Wulff says: "For me, plagiarism means content that is attributable to a specific person. Even though there are cases in which AI-generated text is nearly identical to human-written content, that is usually not enough to count as plagiarism."</span><span style="color: black;">Others argue that <strong style="color: blue;">generative AI tools infringe copyright</strong>. Plagiarism and copyright infringement are both improper uses of other people's work, but plagiarism breaches academic ethics, whereas copyright infringement can be illegal. A computer scientist at the University of Michigan says: "These AI systems are built on the work of millions, even hundreds of millions, of people."</span><span style="color: black;">Some media organizations contend that AI infringes their copyright, and have pushed back. In December 2023, The New York Times filed a copyright lawsuit against the technology giants Microsoft and OpenAI (the company behind ChatGPT). The suit alleges that <strong style="color: blue;">the two firms copied and used millions of articles to train LLMs whose output now competes with the original publications</strong>, on the grounds that GPT-4 reproduced some passages from the newspaper's articles almost verbatim.</span><span style="color: black;">In February 2024, OpenAI asked a federal court to dismiss parts of the suit, arguing that "ChatGPT is not in any way a substitute for a subscription" to The New York Times. A Microsoft spokesperson likewise said that lawfully developed AI tools should be allowed to advance responsibly, and that they are not a substitute for the vital role that journalism plays.</span><span style="color: black;">A copyright and plagiarism consultant in Louisiana, United States, says: "<strong style="color: blue;">If the courts rule that training AI on text without permission does infringe copyright, it will be a huge blow to AI companies</strong>, because tools such as ChatGPT could not exist without broad training sets."</span>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><strong style="color: blue;">The explosion of AI</strong></span></p><span style="color: black;">Since ChatGPT's release in November 2022, the use of AI in academic writing has exploded.</span><span style="color: black;">In a preprint study updated in July, researchers estimated that <strong style="color: blue;">at least 10% of abstracts in biomedical papers published in the first half of 2024 were written with the help of LLMs, equivalent to 150,000 papers per year</strong>. The study, led by the German data scientist Dmitry Kobak, analysed 14 million abstracts indexed in PubMed between 2010 and June 2024.</span><span style="color: black;">Kobak and his colleagues found that, compared with countries where English is the native language, <strong style="color: blue;">papers from countries such as China and South Korea showed more signs of LLM use</strong>. Kobak predicts that LLM use will certainly keep rising and will become ever harder to detect.</span><span style="color: black;">Undisclosed software use in academic writing is nothing new. Since 2015, researchers including Guillaume Cabanac, a computer scientist at the University of Toulouse in France, have been exposing <strong style="color: blue;">gibberish papers written by the paper-generating software SCIgen</strong>, as well as papers containing "tortured phrases" created by software that translates or rewrites text. Cabanac says: "Even before generative AI appeared, people had tools for slipping things past the gatekeepers."</span><span style="color: black;">AI does have value for academic writing. Researchers say that <strong style="color: blue;">it can make text and concepts clearer, reduce language barriers, and free up time for doing and thinking about research</strong>.</span><span style="color: black;">But people remain confused about when using AI constitutes plagiarism or breaches academic ethics. Soheil Feizi, a computer scientist at the University of Maryland, says that <strong style="color: blue;">using an LLM to rewrite an existing paper is clearly plagiarism, but using one to help express ideas, whether by generating text from prompts or by editing a draft, should not be penalized as long as the process is disclosed</strong>. "We should allow people to use large language models to express their ideas more clearly."</span><span style="color: black;">Many journals now have policies that permit some degree of LLM use. After initially banning text generated by ChatGPT, Science updated its policy in November 2023 to <strong style="color: blue;">require full disclosure of AI use in manuscript preparation</strong>, including the version and prompts used. Authors are responsible for accuracy and for ensuring that no plagiarism has occurred. Nature likewise says that authors should document any LLM use in the methods section.</span><span style="color: black;">Related reading:</span><span style="color: black;"><a style="color: black;">Science updates its submission policy: relaxing restrictions on ChatGPT and other AI in papers</a></span><span style="color: black;">An analysis of the top 100 academic publishers and journals found that, as of October 2023, <strong style="color: blue;">24% of publishers and 87% of journals had issued guidelines on the use of generative AI</strong>. Almost all publishers said that AI tools cannot be listed as authors, but policies differ on which kinds of AI use are allowed and how much disclosure is required. Weber-Wulff says that <strong style="color: blue;">clearer guidelines on AI use in academic writing are urgently needed</strong>.</span><span style="color: black;">Related reading:</span><span style="color: black;"><a style="color: black;">BMJ: a summary of the top 100 journals' guidelines on ChatGPT and other AI</a></span><span style="color: black;">Abdul-Mageed says that, for now, the widespread use of LLMs to write scholarly papers is held back by their limitations: users must craft detailed prompts describing the audience, the language style and the research field.</span><span style="color: black;">However, he adds, developers are building applications that will make it easier for researchers to generate specialized academic content. <strong style="color: blue;">In future, users may not need to write detailed prompts at all; they could simply choose options from drop-down menus</strong>, press a button, and produce an entire paper from scratch.</span>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><strong style="color: blue;">Detecting AI tools</strong></span></p><span style="color: black;">As LLMs have spread rapidly into writing, a wave of tools designed to detect AI use has emerged.</span><span style="color: black;">Although many of these tools claim high accuracy, above 90% in some cases, research suggests that most fall short.</span><span style="color: black;">In a study published in December 2023, Weber-Wulff and colleagues evaluated 14 AI-detection tools widely used in academia. <strong style="color: blue;">Only 5 of them could correctly identify 70% or more of the texts, and none exceeded 80% accuracy</strong>.</span><span style="color: black;">When AI-generated text was lightly edited, for instance by <strong style="color: blue;">swapping in synonyms or reordering sentences</strong>, the tools' average accuracy dropped below 50%. Such text, the authors wrote, is "almost undetectable by current tools". Other research shows that <strong style="color: blue;">asking an AI to paraphrase text multiple times also drastically reduces detection accuracy</strong>.</span><span style="color: black;">AI detection has other problems. One study showed that <strong style="color: blue;">detection software is more likely to flag essays written by non-native English speakers as AI-generated</strong>. Feizi says that detectors cannot reliably distinguish text written entirely by AI from text merely polished with AI; making that distinction is difficult and unreliable, and could produce many false positives that seriously damage the reputations of the scholars or students involved.</span><span style="color: black;">Adapted from:</span><span style="color: black;">https://www.nature.com/articles/d41586-024-02371-z</span><span style="color: black;">References:</span><span style="color: black;">1. Sci Eng Ethics. 2015 Oct;21(5):1331-52.</span><span style="color: black;">2. Foltynek, T. et al. Int. J. Educ. Integr. 19, 12 (2023).</span><span style="color: black;">3. Kobak, D., González-Márquez, R., Horvát, E.-Á. & Lause, J. Preprint at arXiv https://doi.org/10.48550/arXiv.2406.07016 (2024).</span><span style="color: black;">4. BMJ. 2024 Jan 31:384:e077192.</span><span style="color: black;">5. Weber-Wulff, D. et al. Int. J. Educ. Integr. 19, 26 (2023).</span><span style="color: black;">6. Sadasivan, V. S., Kumar, A., Balasubramanian, S., Wang, W. & Feizi, S. Preprint at arXiv https://doi.org/10.48550/arXiv.2303.11156 (2023).</span><span style="color: black;">7. Patterns (N Y). 2023 Jul 10;4(7):100779.</span>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">------</span><span style="color: black;">Divider</span><span style="color: black;">------</span></p><span style="color: black;">医咖会 turns eight this year, and we plan to launch nearly ten research courses (including <strong style="color: blue;">R graphics, public-database mining and bibliometrics</strong>). Click "<strong style="color: blue;">Read the original</strong>" to vote for the courses that interest you most.</span>