
python - scrapy xpath finds nothing when parsing the page

Views: 194  Date: 2022-06-28 15:02:38

Problem description

Below is the HTML of one listing block on the page:

<a href='https://www.haobala.com/zufang/shbs12427828.html' target='_blank'><img src='https://imgsh.centanet.com/ctpostimage/a4/d5/4e4da3a2496299b2b26243565af6_220x165.jpg' src='https://imgsh.centanet.com/ctpostimage/a4/d5/4e4da3a2496299b2b26243565af6_220x165.jpg' alt='采菊苑租房2800元/月' style='display: inline-block;'></a> <h4 class='house-title'><a href='https://www.haobala.com/zufang/shbs12427828.html' target='_blank' class='cBlueB'>采菊苑，77平两房出租，毛坯3千，好谈价！</a> </h4> <a href='https://www.haobala.com/xiaoqu/xq-pedpwawows/' target='_blank'>采菊苑</a>2室2厅77平 南北|高层|毛坯|2008年|整租 宝山-顾村菊联路68弄<a href='javascript:void(0)' value='/page/v1/common/maputil.aspx?x=121.372168&y=31.354713' class=' icons_map jsmapicon '></a> 2800元/月

I am using XPath to extract the text. My code is:

    class itcast(scrapy.Spider):
        name = 'SH'
        allowd_domains = ['http://sh.centanet.com/']
        start_urls = ['http://sh.centanet.com/zufang/p2/']

        def parse(self, response):
            list = response.xpath('//p[@class="house-item clearfix house-item-curr"]')
            for i in list:
                name = i.xpath('//a[@class="cBlueB".text()]')
                print name[0]

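I can find the listing blocks. The error is on the name = i.xpath() line inside "for i in list:", where the text cannot be extracted. When I run the code it finds nothing. Please help me see what I am doing wrong. I have tried many similar variations without success, so I am asking here. Thanks in advance!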

Answers

Answer 1:

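    def parse(self, response):
        list = response.xpath('//p[@class="house-item clearfix house-item-curr"]')
        for i in list:
            # text() is a path step after the predicate; the leading "." keeps the search inside i
            name = i.xpath('.//a[@class="cBlueB"]/text()')
            print(name.get())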

Maybe this is what you meant?

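Answer 2:

Writing name = i.xpath('//a[@class="cBlueB".text()]') is wrong. How could text() go inside the []? The brackets hold only the filter condition; text() has to be a separate path step after them, as in //a[@class="cBlueB"]/text().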
