python - I'm connected to a US VPN and the game rankings I see in my browser are the US ones, so why does my scraper always return the China-region rankings?
Problem description
# -*- coding: utf-8 -*-
import sys

import requests
import xlwt
from bs4 import BeautifulSoup

# Python 2 only: make utf-8 the default string encoding
reload(sys)
sys.setdefaultencoding('utf-8')

header = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36',
}


def main():
    url = 'https://play.google.com/store/apps/category/GAME/collection/topselling_free?hl=zh-TW'
    data = {
        'start': '0',
        'num': '100',
    }
    a = []  # app names
    b = []  # app detail-page links
    req = requests.post(url, headers=header, data=data).content
    soup = BeautifulSoup(req, 'html.parser')
    titles = soup.find_all('p', {'class': 'details'})
    for title in titles:
        name = title.find('a', {'class': 'title'}).get('title')
        host_url = title.find('a', {'class': 'title'}).get('href')
        print name, host_url
        a.append(name)
        b.append(host_url)
    # Column 0 holds the names, column 1 the links
    for i in range(len(a)):
        sheet.write(i, 0, a[i])
    for n in range(len(b)):
        sheet.write(n, 1, b[n])


if __name__ == '__main__':
    wb = xlwt.Workbook()
    sheet = wb.add_sheet('top_100')
    main()
    wb.save('gametop100.xls')
    # login()
Answers
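Answer 1: The server can detect where a request ultimately originates, so the proxy you have set may not help. Alternatively, you need a deeper-level proxy.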
Answer 2: Your crawler probably doesn't have a proxy configured.
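
As a rough illustration of Answer 2: if the VPN client only exposes a local proxy that the browser is set up to use, the script's traffic may not go through it, so the server still sees the original location. The sketch below passes a proxy explicitly through the `proxies` argument of `requests.post`. The proxy address `http://127.0.0.1:1080` and the helper names `build_proxies` and `fetch_top_free_games` are assumptions for illustration only; the real address depends on the VPN or proxy client in use.

```python
# Same URL and headers as in the question.
URL = ('https://play.google.com/store/apps/category/GAME/'
       'collection/topselling_free?hl=zh-TW')
HEADERS = {
    'user-agent': ('Mozilla/5.0 (Windows NT 10.0; WOW64) '
                   'AppleWebKit/537.36 (KHTML, like Gecko) '
                   'Chrome/56.0.2924.87 Safari/537.36'),
}


def build_proxies(proxy_url):
    """Return a requests-style mapping that routes both HTTP and HTTPS
    traffic through the given proxy (hypothetical helper)."""
    return {'http': proxy_url, 'https': proxy_url}


def fetch_top_free_games(proxy_url, start=0, num=100):
    """POST the ranking request through proxy_url and return the raw HTML
    (hypothetical helper)."""
    # Imported here so build_proxies can be used without requests installed.
    import requests

    data = {'start': str(start), 'num': str(num)}
    resp = requests.post(URL, headers=HEADERS, data=data,
                         proxies=build_proxies(proxy_url), timeout=30)
    resp.raise_for_status()  # fail loudly on HTTP errors
    return resp.content
```

Calling `fetch_top_free_games('http://127.0.0.1:1080')` would return the page content that `main()` in the question passes to `BeautifulSoup`. Whether the ranking actually changes still depends on how the server determines the region, as Answer 1 points out.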