Getting Beijing Time with Python

Published: December 28, 2015, 17:36
Author: 杨仕航

Category tags: Python

Python's networking modules are mature: they can fetch a single page, and they can also crawl large amounts of data.

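This post is a simple demonstration of how to get Beijing time. The exercise ties together several topics nicely. (Note: the code in this post was developed under Python 2.7.)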

First, let's work out how to get Beijing time:

Open http://open.baidu.com/special/time/

This page shows Beijing time. Looking through its HTML source, we find this spot:

In fact, this spot is not what we want. Looking in the JavaScript part of the page instead, we find the following:

This is the data we are really after. It is a 13-digit number, which is in fact derived from a timestamp.
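To see where the 13 digits come from, compare a Unix timestamp in seconds with the same kind of value in milliseconds. This is a minimal sketch, assuming the page embeds milliseconds since the epoch; the variable names are illustrative only, and the snippet runs on Python 3 as well as 2.7:

```python
import time

now_seconds = int(time.time())        # Unix timestamp in whole seconds
now_millis = int(time.time() * 1000)  # the same kind of value in milliseconds

print(len(str(now_seconds)))  # digit count of the seconds value
print(len(str(now_millis)))   # digit count of the milliseconds value
```

For present-day dates the seconds value has 10 digits and the milliseconds value has 13, which is why a pattern that looks for 13 consecutive digits is a reasonable way to find it.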

So we need to import the following modules:

import re       #regular expression module
import urllib   #web page fetching module
import time     #time module
import string   #string module

First, write a function that fetches the content of a web page:

def getHtml(url):
    """get url html text"""
    page=urllib.urlopen(url)
    html_text=page.read()
    return html_text

This function opens a URL and returns the HTML source of that page.
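Python 2.7's urllib.urlopen does not exist in Python 3, where the equivalent call lives in urllib.request. Below is a sketch of a comparable helper for Python 3. The name get_html, the 10-second timeout, and the UTF-8 decoding are assumptions made for illustration and are not part of the original code:

```python
from urllib.request import urlopen


def get_html(url, timeout=10):
    """Return the body of `url` as text (a Python 3 counterpart of getHtml)."""
    # The timeout keeps the script from hanging if the server never answers.
    with urlopen(url, timeout=timeout) as page:
        raw = page.read()  # read() returns bytes in Python 3
    # Decode to str so a str regular expression can search the result;
    # undecodable bytes are replaced instead of raising an error.
    return raw.decode('utf-8', errors='replace')
```

It would be called the same way as the original, for example get_html('http://open.baidu.com/special/time/').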

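We can then use this function to open the page that provides Beijing time. Once we have the HTML, we use a regular expression to pull out the data we want.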

html=getHtml('http://open.baidu.com/special/time/')
expStr=r'\d{13}'
beijing_time=re.search(expStr,html).group()

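This code gives us that pseudo-timestamp string. Baidu produces it by multiplying the timestamp by 1000 and converting the result to a string, so we have to reverse the process.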
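Note that re.search returns None when nothing matches, so calling .group() directly raises AttributeError if the page layout ever changes. Here is a small sketch with a guard. The function name find_timestamp and the sample HTML string are made up for illustration and are not the real Baidu markup:

```python
import re


def find_timestamp(html):
    """Return the first run of 13 digits in `html`, or None if there is none."""
    match = re.search(r'\d{13}', html)
    if match is None:
        return None
    return match.group()


# Hypothetical fragment standing in for the JavaScript part of the page.
sample = '<script>var serverTime = 1451295360000;</script>'
print(find_timestamp(sample))
print(find_timestamp('<p>no timestamp here</p>'))
```

Returning None lets the caller decide what to do (retry, report an error) instead of crashing inside the lookup.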

seconds=string.atof(beijing_time)/1000
ttime=time.localtime(seconds)
print time.strftime('%Y-%m-%d %H:%M:%S',ttime)

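Here, string.atof converts a numeric string to a float (the built-in float() does the same). Dividing that number by 1000 gives the real timestamp in seconds. time.localtime then converts it to the local time of the machine running the script, which is Beijing time only if that machine's time zone is UTC+8, and the formatted output is the Beijing time.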
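Because time.localtime follows the machine's time zone, a variant that applies the UTC+8 offset explicitly gives Beijing time on any machine. This is a sketch of that idea, assuming the fixed UTC+8 offset of Beijing time; the helper name to_beijing_time and the sample value are illustrative:

```python
import time

BEIJING_OFFSET = 8 * 3600  # Beijing time is UTC+8, with no daylight saving


def to_beijing_time(millis_text):
    """Format a 13-digit millisecond timestamp string as Beijing time."""
    seconds = int(millis_text) // 1000  # drop the millisecond part
    # Shift by the offset, then format the shifted value as if it were UTC.
    shifted = time.gmtime(seconds + BEIJING_OFFSET)
    return time.strftime('%Y-%m-%d %H:%M:%S', shifted)


print(to_beijing_time('1451295360000'))  # 2015-12-28 17:36:00
```

The same function works under Python 2.7 and Python 3, since it only uses time.gmtime and time.strftime.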

The complete code is as follows:

# -*- coding: UTF-8 -*-
"""get beijing current time"""

import re       #regular expression module
import urllib   #web page fetching module
import time     #time module
import string   #string module

def getHtml(url):
    """get url html text"""
    page=urllib.urlopen(url)
    html_text=page.read()
    return html_text

def getBeijingTime():
    """Get Beijing Time"""
    #Fetch the Baidu time page
    url='http://open.baidu.com/special/time/'
    html=getHtml(url)

    #Extract the value with a regular expression
    expStr=r'\d{13}'
    beijing_time=re.search(expStr,html).group()

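    #Convert the result to a date string
    #(time.localtime uses the machine's time zone, so this is Beijing time only on a UTC+8 machine)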
    seconds=string.atof(beijing_time)/1000
    ttime=time.localtime(seconds)
    return time.strftime('%Y-%m-%d %H:%M:%S',ttime)

if __name__ == '__main__':
    print getBeijingTime()
    raw_input('(Please press Enter key to end)')

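To make testing easier, a raw_input() call is added at the end: the script pauses after printing and only continues (and exits) once we press Enter. Save the code as a .py file and run it directly.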

(Original post; if you repost it, please credit 杨仕航's blog. Link to this post: http://yshblog.com/blog/17)
