python - Scrapy XPath cannot find elements when parsing a page
Problem description
Below is the HTML of the page's main container element:
<p class='house-item clearfix house-item-curr'><p class='item-photo fl'> <a href='https://www.haobala.com/zufang/shbs12427828.html' target='_blank'><img src='https://imgsh.centanet.com/ctpostimage/a4/d5/4e4da3a2496299b2b26243565af6_220x165.jpg' src='https://imgsh.centanet.com/ctpostimage/a4/d5/4e4da3a2496299b2b26243565af6_220x165.jpg' alt='采菊苑租房2800元/月' style='display: inline-block;'></a></p><p class='item-info fl'> <h4 class='house-title'><a href='https://www.haobala.com/zufang/shbs12427828.html' target='_blank' class='cBlueB'>采菊苑,77平两房出租,毛坯3千,好谈价!</a> </h4> <p class='f14 f000 mb_10'><a href='https://www.haobala.com/xiaoqu/xq-pedpwawows/' target='_blank'>采菊苑</a><span class='f000 mr_10'>2室2厅</span><span class='f000'>77平</span> </p> <p class='f7b mb_10'>南北<em class='mrl_6'>|</em>高层<em class='mrl_6'>|</em>毛坯<em class='mrl_6'>|</em>2008年<em class='mrl_6'>|</em>整租 </p> <p class='f7b mb_15'>宝山-顾村 菊联路68弄<a href='javascript:void(0)' value='/page/v1/common/maputil.aspx?x=121.372168&y=31.354713' class=' icons_map jsmapicon '></a> </p> </p> <p class='item-pricearea fr'> <p class='price-nub cRed'>2800元/月</p> <p class='f14 f000 mb_15 fsm'> </p> </p><p class='clear'></p> </p>
I am using XPath to extract the text. My code is:
class itcast(scrapy.Spider):
    name = 'SH'
    allowd_domains = ['http://sh.centanet.com/']
    start_urls = ['http://sh.centanet.com/zufang/p2/']

    def parse(self, response):
        list = response.xpath("//p[@class='house-item clearfix house-item-curr']")
        for i in list:
            name = i.xpath("//a[@class='cBlueB'.text()]")
            print name[0]
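I can locate the container element, but the error is raised on the name = i.xpath(...) line inside the for i in list: loop, and the text cannot be extracted. Running the code finds nothing. Could someone tell me what I am doing wrong? I have tried many similar variations without success, so I am asking here. Thanks!
Answers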
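Answer 1:
def parse(self, response):
    # Select every listing container on the page
    items = response.xpath("//p[@class='house-item clearfix house-item-curr']")
    for i in items:
        # The leading "." keeps the search inside the current container; text() is its own step
        name = i.xpath(".//a[@class='cBlueB']/text()").extract_first()
        print(name)
Maybe this is what you meant?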
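Answer 2: Writing name = i.xpath("//a[@class='cBlueB'.text()]") is wrong. text() cannot go inside the []. The brackets hold a predicate that filters nodes, while text() is a node test that has to be its own step after the brackets: //a[@class='cBlueB']/text()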