赵乾舟 posted on 2021-7-27 19:19:01

Scraping JD.com clothing reviews

import requests
import pandas as pd
import json
import time
id = input('Enter product ID: ')  # product ID taken from the JD item page
url = f'https://club.jd.com/comment/productPageComments.action?callback=fetchJSON_comment98&productId={id}&score=0&sortType=5&page=0&pageSize=10&isShadowSku=0&fold=1'
UA = {'user-agent':'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36'}
respon = requests.get(url=url, headers=UA).text
respon = respon.replace('fetchJSON_comment98(', '')  # strip the JSONP wrapper prefix
respon = respon.replace(');', '')  # strip the JSONP wrapper suffix
zidian = json.loads(respon)
yeshu = zidian['maxPage']  # number of comment pages reported by the first response
print(yeshu)
for ye in range(1,yeshu+1):
    url = f'https://club.jd.com/comment/productPageComments.action?callback=fetchJSON_comment98&productId={id}&score=0&sortType=5&page={ye-1}&pageSize=10&isShadowSku=0&fold=1'
    respon = requests.get(url=url, headers=UA).text
    respon = respon.replace('fetchJSON_comment98(', '')
    respon = respon.replace(');', '')
    zidian = json.loads(respon)
    pinglun = zidian['comments']
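    # NOTE: the bracketed expressions on these three lines were lost in the original post.
    # The keys 'content', 'productColor' and 'productSize' are assumed field names.
    neirong = [p.get('content') for p in pinglun]
    yanse = [p.get('productColor') for p in pinglun]
    size = [p.get('productSize') for p in pinglun]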
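    shuju = pd.DataFrame({'review': neirong, 'color': yanse, 'size': size})
    # Append to the CSV without a header row; 'ANSI' is a Windows-only codec alias.
    shuju.to_csv('d:/j3d.csv', mode='a', header=False, encoding='ANSI')
    time.sleep(3)  # pause between requests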
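
The two replace calls in the script remove the JSONP wrapper by plain string substitution, which would also delete any `);` that happens to appear inside a review. A minimal sketch of a stricter alternative, assuming the response always has the shape `callback(...);`. The helper name `strip_jsonp` is hypothetical and not part of the original post:

```python
import json
import re

# Hypothetical helper pattern: `name( ... )` with an optional trailing semicolon.
_JSONP = re.compile(r'^\s*[\w.$]+\s*\((.*)\)\s*;?\s*$', re.DOTALL)

def strip_jsonp(text):
    """Return the parsed JSON payload of a JSONP response string."""
    match = _JSONP.match(text)
    if match is None:
        raise ValueError('response is not in callback(...) form')
    return json.loads(match.group(1))

# Made-up sample response, used only to exercise the helper.
sample = 'fetchJSON_comment98({"maxPage": 3, "comments": [{"content": "ok);"}]});'
data = strip_jsonp(sample)
print(data['maxPage'], data['comments'][0]['content'])
```

With this helper, `zidian = strip_jsonp(respon)` could stand in for the two replace calls plus `json.loads`.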


赵乾舟 posted on 2021-7-27 19:21:27

Line 14 of the script, the for loop over range(1,yeshu+1), controls how many pages are fetched.
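
One way to make the page count adjustable, as a rough sketch: cap the loop at a chosen limit instead of always running to `maxPage`. The function `pages_to_fetch` and its `limit` parameter are hypothetical names introduced here for illustration.

```python
def pages_to_fetch(max_page, limit=None):
    """Return the zero-based page indexes to request.

    max_page: page count reported by the API (the `maxPage` field).
    limit: optional cap chosen by the user; None means fetch every page.
    """
    if limit is None:
        return range(max_page)
    # Never exceed what the API reports, and treat negative limits as zero.
    return range(min(max_page, max(limit, 0)))

print(list(pages_to_fetch(5)))      # all five pages
print(list(pages_to_fetch(5, 2)))   # only the first two pages
print(list(pages_to_fetch(3, 10)))  # limit larger than the available pages
```

In the script, `for ye in pages_to_fetch(yeshu, 5):` with `page={ye}` in the URL would stop after at most five pages.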