u1jodi1q posted on 2024-8-25 22:14:57

What causes Baiduspider (Baidu's crawler) crawl errors on a website


    <div style="color: black; text-align: left; margin-bottom: 10px;">
      <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/pgc-image/59da9e8dbeeb41bc97a42b6919ca0317~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1725117973&amp;x-signature=nn%2FnPsz%2BWw9YzKBU9%2BFelcyYHZk%3D" style="width: 50%; margin-bottom: 20px;">
            <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">What causes Baiduspider (Baidu's crawler) crawl errors on a website</p>
      </div>
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Some pages have good content and load normally for users, yet Baiduspider cannot access and crawl them. The result is a gap in search result coverage, which is a loss for both Baidu and the site. Baidu calls this situation a "crawl error". When a large share of a site's content cannot be crawled normally, Baidu treats the site as having a user-experience defect and lowers its evaluation of the site. Crawling, indexing, and ranking are all negatively affected to some degree, which reduces the traffic the site gets from Baidu.</p>
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The common causes of crawl errors are described below for webmasters:</p>
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">1. Server connection errors</p>
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Server connection errors come in two forms: either the site is unstable and Baiduspider temporarily cannot connect to your server, or Baiduspider can never connect to your server at all.</p>
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The usual cause is that your server is overloaded. Your site may also not be running properly: check that the web server (for example Apache or IIS) is installed and running, and open the key pages in a browser to confirm that they load. Your site or host may also be blocking Baiduspider, so check the firewall on both.</p>
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">2. Network carrier errors: there are two carriers, China Telecom and China Unicom (which absorbed China Netcom), and a carrier error means Baiduspider cannot reach your site over one of them. If this happens, contact your network service provider, buy hosting with dual-line service, or buy a CDN service.</p>
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">3. DNS errors: a DNS error occurs when Baiduspider cannot resolve your site's IP address. The IP address configured for your site may be wrong, or your DNS provider may have blocked Baiduspider. Use WHOIS or host to check whether your site's IP address is correct and resolvable. If it is wrong or cannot be resolved, contact your domain registrar and update the IP address.</p>
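      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The resolvability check above can be sketched in Python with the standard socket.getaddrinfo call, as an alternative to running host by hand. The function names and the injectable resolver parameter are illustrative assumptions, not part of Baidu's guidance.</p>

```python
import socket


def resolve_ips(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses a hostname resolves to.

    An empty list means resolution failed, which is the situation the
    text calls a DNS error.
    """
    try:
        results = resolver(hostname, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address string for both IPv4 and IPv6.
    return sorted({entry[4][0] for entry in results})


def dns_matches(hostname, expected_ip, resolver=socket.getaddrinfo):
    """True if the hostname resolves and one of its addresses is expected_ip."""
    return expected_ip in resolve_ips(hostname, resolver)
```

      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Calling dns_matches with your domain and the IP address of your server shows whether the public DNS record points where you expect. A check run from your own machine cannot show what Baiduspider's resolver sees, so treat a passing result as necessary rather than sufficient.</p>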
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">4. IP blocking: IP blocking restricts outbound IP addresses on a network so that users in a given IP range cannot access content. Here it refers specifically to blocking Baiduspider's IPs. You only need this setting if you do not want Baiduspider to visit your site. If you do want Baiduspider to visit, check whether Baiduspider IPs were added to the relevant settings by mistake. Your hosting provider may also have blocked Baidu's IPs, in which case you need to ask the provider to change the setting.</p>
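      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Before removing an address from a block list, it helps to confirm that it really belongs to Baiduspider. The sketch below does a reverse DNS lookup followed by a forward confirmation. It assumes that genuine Baiduspider hostnames end in .baidu.com or .baidu.jp, which is how Baidu's webmaster help has described them but is not stated in this post, so check the current Baidu documentation before relying on it. The function name and the injectable lookup parameters are hypothetical.</p>

```python
import socket

# Assumption: genuine Baiduspider hostnames end in one of these suffixes.
BAIDU_SUFFIXES = (".baidu.com", ".baidu.jp")


def is_baiduspider_ip(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Check an IP with a reverse lookup followed by a forward confirmation."""
    try:
        hostname = reverse(ip)[0]  # (hostname, aliases, addresses)
    except OSError:
        return False  # no PTR record, so it cannot be confirmed
    if not hostname.lower().endswith(BAIDU_SUFFIXES):
        return False
    try:
        addresses = forward(hostname)[2]
    except OSError:
        return False
    # The forward lookup guards against a spoofed PTR record.
    return ip in addresses
```

      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Addresses taken from the firewall's block list can be passed through this function one at a time. Any that come back True are candidates for unblocking.</p>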
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">5. UA blocking: UA stands for User-Agent, which the server uses to identify the visitor. UA blocking is when a site returns an error page (such as 403 or 500) or redirects to another page for requests from a specific UA. You only need this setting if you do not want Baiduspider to visit your site. If you do want Baiduspider to visit, check whether the Baiduspider UA appears in your User-Agent settings and fix it promptly.</p>
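      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">One way to spot UA blocking from the outside is to request the same URL twice, once with a browser User-Agent and once with a Baiduspider one, and compare the results. The sketch below assumes the commonly published desktop Baiduspider UA string. The browser string, function names, and labels are placeholders chosen for illustration.</p>

```python
import urllib.error
import urllib.request

# Assumption: the commonly published desktop Baiduspider User-Agent.
SPIDER_UA = ("Mozilla/5.0 (compatible; Baiduspider/2.0; "
             "+http://www.baidu.com/search/spider.html)")
# Arbitrary placeholder standing in for a normal browser.
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"


def fetch_status(url, user_agent, timeout=10):
    """Return (status_code, final_url) for a GET sent with the given UA."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # urlopen follows redirects, so geturl() is the final location.
            return response.status, response.geturl()
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses raise HTTPError; the code is still useful.
        return err.code, err.url


def compare_results(browser, spider):
    """Label how the spider result differs from the browser result."""
    browser_status, browser_url = browser
    spider_status, spider_url = spider
    if spider_status != browser_status:
        return "status differs"
    if spider_url != browser_url:
        return "redirect differs"
    return "same"
```

      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">For example, compare_results(fetch_status(url, BROWSER_UA), fetch_status(url, SPIDER_UA)) returning anything other than "same" suggests a UA rule worth inspecting. A server that also filters by IP may still treat the real Baiduspider differently from this test request.</p>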
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">6. Dead links: a page that is no longer valid and cannot offer users any useful information is a dead link. There are two kinds, protocol dead links and content dead links:</p>
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Protocol dead link: a dead link that the page's TCP or HTTP status states explicitly, commonly 404, 403, or 503.</p>
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Content dead link: the server returns a normal status, but the content has been replaced by a page unrelated to the original, such as "does not exist", "deleted", or "permission required".</p>
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">For dead links, we recommend that sites use protocol dead links and submit them to Baidu through the dead link tool on the Baidu Webmaster Platform, so that Baidu discovers them faster and the negative impact on users and the search engine is reduced.</p>
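      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The distinction between the two kinds of dead link can be sketched as a small classifier over a fetched status code and page body. The status codes are the ones named above. The phrase list is a made-up placeholder, and a real site would match whatever wording its own "deleted" or "login required" pages use.</p>

```python
# Status codes the text names as typical protocol dead links.
PROTOCOL_DEAD_STATUSES = {403, 404, 503}

# Hypothetical marker phrases for pages that return 200 but hold no content.
DEAD_PHRASES = ("page not found", "has been deleted", "permission required")


def classify_link(status, body):
    """Classify a fetched page as a protocol dead link, content dead link, or live."""
    if status in PROTOCOL_DEAD_STATUSES:
        return "protocol dead link"
    text = body.lower()
    if status == 200 and any(phrase in text for phrase in DEAD_PHRASES):
        # Normal status but placeholder content: harder for a crawler to detect.
        return "content dead link"
    return "live"
```

      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Pages classified as content dead links are the ones to convert, by making the server return 404 for them so that they become protocol dead links.</p>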
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">7. Abnormal redirects: a redirect points a network request to another location. Abnormal redirects are the following cases:</p>
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">1) The current page is invalid (content deleted, dead link, and so on) and redirects straight to the parent directory or the home page. Baidu recommends that webmasters remove the links that lead to the invalid page.</p>
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">2) Redirecting to an error page or an invalid page.</p>
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Note: for long-term redirects to another domain, such as when a site changes its domain, Baidu recommends using a 301 redirect.</p>
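      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">A 301 for a domain change usually maps each old URL to the same path on the new domain rather than sending everything to the home page. The following is a minimal sketch using Python's standard http.server. The new domain is a hypothetical placeholder, and in practice this rule normally lives in the web server configuration instead.</p>

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical new domain, used only for illustration.
NEW_ORIGIN = "https://new.example.com"


def redirect_location(path, new_origin=NEW_ORIGIN):
    """Map a request path on the old domain to the same path on the new one."""
    return new_origin + path


class MovedHandler(BaseHTTPRequestHandler):
    """Answer every GET or HEAD on the old domain with a 301."""

    def _redirect(self):
        self.send_response(301)  # Moved Permanently
        # self.path includes the query string, so it is preserved.
        self.send_header("Location", redirect_location(self.path))
        self.send_header("Content-Length", "0")
        self.end_headers()

    do_GET = _redirect
    do_HEAD = _redirect
```

      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">To try it locally, pass MovedHandler to http.server.HTTPServer and call serve_forever(). Keeping the path in the Location header avoids the first abnormal case above, where every invalid page lands on the home page.</p>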
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">8. Other errors:</p>
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">1) Errors targeting the Baidu Referer: the page returns something different from its normal content for requests whose Referer is Baidu.</p>
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">2) Errors targeting the Baidu UA: the page returns something different from its original content to the Baidu UA.</p>
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">3) JS redirect errors: the page loads JS redirect code that Baidu cannot recognize, so users who arrive from search results get redirected.</p>
      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">4) Occasional blocking caused by excessive load: Baidu automatically sets a reasonable crawl rate based on the site's size, traffic, and other information. In abnormal cases, such as when rate control fails, the server may protect itself by occasionally blocking requests based on its own load. In that case, return status code 503 ("Service Unavailable"), so that Baiduspider comes back after a while and tries the link again. If the site is idle by then, the crawl will succeed.</p>
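      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The protective 503 described in the last item can be sketched as a load check in front of the normal response. The threshold, the retry delay, and the names below are hypothetical. Retry-After is a standard HTTP header, but the post does not say whether Baiduspider honors it, so treat it as a hint only. os.getloadavg() is available on Unix-like systems only.</p>

```python
import os
from http.server import BaseHTTPRequestHandler

# Hypothetical values; tune both for the actual server.
MAX_LOAD = 4.0
RETRY_AFTER_SECONDS = 600


def overload_response(load, max_load=MAX_LOAD):
    """Return (status, extra_headers) for the given 1-minute load average."""
    if load > max_load:
        # 503 marks the failure as temporary, unlike 403 or 404.
        return 503, {"Retry-After": str(RETRY_AFTER_SECONDS)}
    return 200, {}


class LoadAwareHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, headers = overload_response(os.getloadavg()[0])
        body = b"ok\n" if status == 200 else b"busy, retry later\n"
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

      <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The point of the design is that an overloaded server answers with a temporary status instead of dropping the connection or returning a dead-link code, so the URL stays eligible for a later crawl.</p>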
    </div>




听听海 posted on 2024-8-31 08:32:24

"BS" (an abbreviation of 鄙视, "contempt")

4lqedz posted on 2024-10-14 06:18:39

Looking forward to the update, waiting eagerly, can't wait.