
[python] Implementing a Baidu keyword ranking lookup

Date: 2019-10-14 14:12:12 Category: IT

Baidu ranking lookup

Python version: 3.7.1

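Install the dependency packages: requests, re, urllib, bs4 ...... (re and urllib are part of the standard library and need no installation; the script below also imports pymysql).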

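Installation: open the Python installation directory, find the Scripts directory, then hold Shift and right-click to open a command window there. First run pip list to see which packages are already installed, then run pip install for the packages you still need.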
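As a complement to pip list, the check can also be done from Python itself. The following is a minimal sketch, assuming the only third-party packages needed are the two the script imports (requests and pymysql); the helper name missing_packages is made up for this illustration.

```python
import importlib.util

# Third-party packages imported by the script (assumption: adjust as needed).
REQUIRED = ["requests", "pymysql"]


def missing_packages(names):
    """Return the names that cannot be imported by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        # Print the command to run rather than installing anything automatically.
        print("Install with: pip install " + " ".join(missing))
    else:
        print("All required packages are installed.")
```

Because the check runs inside the interpreter, it reports on the same Python installation that will run the script, which helps when several Python versions are installed side by side.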

Reference: https://blog.csdn.net/Ryuchong/article/details/80687447

# -*- coding:utf8 -*-
import requests
import re
import pymysql

# Keyword, company site, and search URL
keyword = input(u"Enter the keyword to look up: ")
site = input("Enter the site URL to look up: ")
site_baidu = u"http://www.baidu.com/s?wd=%s&pn=%d0"
site_360 = "https://hao.360.cn/"


# Look up the ranking
i = 0
#word = u"体检行业爆丑闻"
#site = "https://baijiahao.baidu.com"
def KeywordRank(searchTxt, webUrl):
    global i
    try:
        pattern = re.compile(b'class="c-showurl" style="text-decoration:none;">(.*?)&nbsp', re.S)
        result = pattern.findall(searchTxt)
        for item in result:
            item_str = str(item, encoding = "utf8")
            i = i+1
            print ("rank %d: %s"%(i,item_str))
            # Compare against the webUrl argument instead of the global site
            if webUrl in item_str:
                return i
    except Exception as e:
        print(e)
       
        return None
    return None
 
# content: the keyword to search for, page: the result page to fetch
def BaiduSearch(content, page):
    try:
        url = site_baidu % (content, page)
        data = requests.get(url)
        return data.content
    except Exception as e:
        return None
     
if __name__ == "__main__":
    loops = 10     # check at most 10 result pages
    page = 0
    while(loops):
        searchTxt = BaiduSearch(keyword, page)
        page = page+1
        rank = KeywordRank(searchTxt, site)
        if rank is not None:
            print (u"The site ranks at position %d for this keyword" % rank)
            print(rank)
            break
        loops = loops - 1


# Connect to the database and store the result
        
conn = pymysql.Connect(
    host = '127.0.0.1',
    port = 3306,
    user = 'root',
    password = 'root',
    db = 'test',
    charset = 'utf8'
    )

cursor = conn.cursor()

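# Parameterized query: the driver escapes the values, and a rank of None
# (site not found) is stored as NULL instead of raising a formatting error.
# rank is back-quoted because it is a reserved word in MySQL 8.0.
sql_insert = "insert into seo(id,site,word,`rank`) values('',%s,%s,%s)"
cursor.execute(sql_insert, (site, keyword, rank))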
conn.commit()
cursor.close()
conn.close()
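The two core steps of the script, building the paged search URL and locating the site among the displayed result URLs, can be tried offline without contacting Baidu or MySQL. The sketch below reuses the URL template and regular expression from the script; the helper names build_url and find_rank and the SAMPLE markup are invented for illustration, and the real result-page HTML may differ from this sample.

```python
import re
from urllib.parse import quote

# Same template as the script: "%d0" turns page 1 into pn=10, page 2 into pn=20.
BAIDU_URL = "http://www.baidu.com/s?wd=%s&pn=%d0"

# Same pattern as the script, written as a raw bytes literal.
PATTERN = re.compile(
    rb'class="c-showurl" style="text-decoration:none;">(.*?)&nbsp', re.S
)


def build_url(keyword, page):
    """Build the search URL for a keyword and a zero-based page index."""
    return BAIDU_URL % (quote(keyword), page)


def find_rank(html, target, start=0):
    """Return the 1-based rank of target among the display URLs, or None.

    start is the number of results already seen on earlier pages.
    """
    for offset, raw in enumerate(PATTERN.findall(html), start=1):
        if target in raw.decode("utf8"):
            return start + offset
    return None


# Hypothetical markup standing in for one downloaded result page.
SAMPLE = (
    b'<a class="c-showurl" style="text-decoration:none;">www.example.com/&nbsp;</a>'
    b'<a class="c-showurl" style="text-decoration:none;">baijiahao.baidu.com/s&nbsp;</a>'
)

if __name__ == "__main__":
    print(build_url("python", 1))
    print(find_rank(SAMPLE, "baijiahao.baidu.com"))
```

Passing the running count through start replaces the global counter i used in the script, so the function can be called once per page without shared state.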


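As for the approach, the referenced article explains it clearly. Here I only want to stress two points: remember to add the encoding declaration, and watch out for the incompatibilities between Python 2 and Python 3, that is, the syntax changes.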
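The encoding and Python 2 versus Python 3 points can be illustrated with a short sketch. It assumes a small made-up byte string in place of a real response body; in Python 3 the content attribute of a requests response is bytes, which is why the script uses a bytes pattern and converts each match with str(item, encoding="utf8").

```python
import re

# Stand-in for requests.get(url).content, which is bytes in Python 3.
raw = "结果 www.example.com&nbsp;".encode("utf8")

# A str pattern cannot search bytes in Python 3. In Python 2, str was itself
# a byte string, so this distinction did not arise.
try:
    re.search(r"www\.\S+?&nbsp", raw)
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Use a bytes pattern, then decode the match to get a str.
match = re.search(rb"(www\.\S+?)&nbsp", raw)
host = match.group(1).decode("utf8")  # same as str(match.group(1), encoding="utf8")

# print is a function in Python 3 (Python 2 accepted: print "text").
print("host: %s, mixed str/bytes allowed: %s" % (host, mixed_ok))
```

The `# -*- coding:utf8 -*-` line at the top of the script matters mainly for Python 2, which otherwise rejects non-ASCII characters in the source file; Python 3 reads source files as UTF-8 by default.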
