pd.date_range
Function signature:
pandas.date_range(start=None, end=None, periods=None, freq=None, tz=None, normalize=False, name=None, closed=None, **kwargs)
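Note
To generate a date range, exactly three of the four parameters start, end, periods, and freq must be specified. Passing all four raises a ValueError. If freq is left out and only two of the others are given, pandas falls back to a daily frequency ('D').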
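To make the "three out of four" rule concrete, here is a small sketch that builds the same five days from four different parameter combinations. The dates are arbitrary values picked for illustration.

```python
import pandas as pd

# start + end + freq: step through the interval, both endpoints included
a = pd.date_range(start="2020-08-01", end="2020-08-05", freq="D")

# start + periods + freq: count forward from start
b = pd.date_range(start="2020-08-01", periods=5, freq="D")

# end + periods + freq: count backward from end
c = pd.date_range(end="2020-08-05", periods=5, freq="D")

# start + end + periods: 5 evenly spaced points, no freq given
d = pd.date_range(start="2020-08-01", end="2020-08-05", periods=5)

for name, idx in [("a", a), ("b", b), ("c", c), ("d", d)]:
    print(name, list(idx.strftime("%Y-%m-%d")))
```

All four calls should describe the same dates; which combination to use mostly depends on which quantities you already know.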
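1. Generating a date range from start, end, and freq
For example, generate the timestamps for 2020-08-01 at 10-minute intervals. Both endpoints are included by default, so the result also contains 2020-08-02 00:00:00.
import pandas as pd

start_date = "2020-08-01"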
end_date = "2020-08-02"
dr1 = pd.date_range(start=start_date, end=end_date, freq='10min')
start and end accept either a str or a datetime object. For example:
from datetime import datetime

start_date_dt = datetime.strptime(start_date, "%Y-%m-%d")
end_date_dt = datetime.strptime(end_date, "%Y-%m-%d")
dr2 = pd.date_range(start=start_date_dt, end=end_date_dt, freq='10min')
print(dr2)
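As a quick sanity check on the two calls above, the following sketch compares a range built from strings with one built from datetime objects. The variable names from_str and from_dt are mine, not part of the original tests.

```python
from datetime import datetime

import pandas as pd

from_str = pd.date_range(start="2020-08-01", end="2020-08-02", freq="10min")
from_dt = pd.date_range(
    start=datetime(2020, 8, 1), end=datetime(2020, 8, 2), freq="10min"
)

print(from_str.equals(from_dt))  # both inputs should describe the same index
print(len(from_str))             # 24 * 6 steps plus the closing endpoint
print(from_str[-1])              # the end timestamp itself
```

The length is one more than the number of 10-minute steps because the end timestamp lands exactly on a step and is kept by default.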
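The versatile freq
freq stands for frequency: the step by which the time span is divided.
See the offset aliases section of the official pandas documentation for the full list of freq values.

Let's test some commonly used values.
# Testing freq
# freq = '10s'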
start_time = "2020-08-01 06:00:00"
end_time = "2020-08-01 08:00:00"
dr3 = pd.date_range(start=start_time, end=end_time, freq='10s')
print(dr3)
# freq = '1min'
dr4 = pd.date_range(start=start_time, end=end_time, freq='1min')
print(dr4)
# freq = '1min 30s'
dr5 = pd.date_range(start=start_time, end=end_time, freq='1min 30s')
print(dr5)
# freq = '90s', equivalent to '1min 30s'
dr6 = pd.date_range(start=start_time, end=end_time, freq='90s')
print(dr6)
# freq = 'D'
# Note: if the span between start and end is shorter than freq, only the start time is returned.
dr7 = pd.date_range(start=start_time, end=end_time, freq='D')
print(dr7)
# freq = 'H' (hourly; pandas 2.2 and later deprecate 'H' in favor of 'h')
dr8 = pd.date_range(start=start_time, end=end_time, freq='H')
print(dr8)
# Test: the span cannot be divided evenly by freq
# Conclusion: when the last step would overshoot, the end time is not included
dr9 = pd.date_range(start=start_time, end=end_time, freq='1H 20min')
print(dr9)
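The two observations above (compound aliases, and the end time being dropped when the span does not divide evenly) can be checked without reading the printed output. This is a minimal sketch using the same two-hour window; '80min' stands in for '1H 20min' so that the example does not depend on how a given pandas version spells the hourly alias.

```python
import pandas as pd

start_time = "2020-08-01 06:00:00"
end_time = "2020-08-01 08:00:00"

# '1min 30s' and '90s' describe the same step
compound = pd.date_range(start=start_time, end=end_time, freq="1min 30s")
seconds = pd.date_range(start=start_time, end=end_time, freq="90s")
print(compound.equals(seconds))

# 120 minutes is not a multiple of 80 minutes, so the range stops early
uneven = pd.date_range(start=start_time, end=end_time, freq="80min")
print(list(uneven))
```

The range never extends past end: the next step after 07:20 would be 08:40, so it is simply left out.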
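There are also some less familiar aliases, for example:
B (business day), C (custom business day), W (weekly frequency)
M (month end frequency), SM (semi-month end frequency: 15th and end of month)
BM (business month end frequency), CBM (custom business month end frequency)
Note: as far as I can tell, pandas 2.2 and later deprecate several of these spellings in favor of 'ME', 'SME', 'BME', 'CBME', 'QE', 'h', and 'bh'. The tests below use the older spellings.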
start_date = "2020-08-01"
end_date = "2020-08-30"
# 1. freq = 'B': business days (Monday to Friday)
dr1 = pd.date_range(start=start_date, end=end_date, freq='B')
print(dr1)
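# 2. freq = 'C' (custom business day): with the default calendar it matches 'B'; it only differs once holidays or a weekmask are supplied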
dr2 = pd.date_range(start=start_date, end=end_date, freq='C')
print(dr2)
# 3. freq = 'W': weekly, anchored on Sunday by default, so only Sundays are returned
dr3 = pd.date_range(start=start_date, end=end_date, freq='W')
print(dr3)
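The reason 'C' looks identical to 'B' is that the bare alias uses a default calendar with no holidays. A rough sketch of where the two diverge, assuming a made-up holiday on 2020-08-05 (a hypothetical date chosen only for illustration), is to pass a CustomBusinessDay offset object as freq:

```python
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay

start_date = "2020-08-01"
end_date = "2020-08-30"

# Plain business days: Monday to Friday
plain = pd.date_range(start=start_date, end=end_date, freq="B")

# Hypothetical calendar that treats 2020-08-05 as a holiday
custom_day = CustomBusinessDay(holidays=["2020-08-05"])
custom = pd.date_range(start=start_date, end=end_date, freq=custom_day)

# Dates that 'B' keeps but the custom calendar skips
print(plain.difference(custom))
```

CustomBusinessDay also accepts a weekmask argument for calendars whose working week is not Monday to Friday.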
# Other less common aliases, tested over a longer date range
start_date = "2020-08-01"
end_date = "2020-12-30"
# 4. freq = 'M': month end, only the last calendar day of each month
dr4 = pd.date_range(start=start_date, end=end_date, freq='M')
print(dr4)
# 5. freq = 'SM': semi-month end, only the 15th and the last day of each month
dr5 = pd.date_range(start=start_date, end=end_date, freq='SM')
print(dr5)
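# 6. freq = 'BM': last business day of each month; differs from 'M' when a month ends on a weekend (here 2020-10-30 instead of 2020-10-31)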
dr6 = pd.date_range(start=start_date, end=end_date, freq='BM')
print(dr6)
# 7. freq = 'CBM': custom business month end; with the default calendar the result matches 'BM'
dr7 = pd.date_range(start=start_date, end=end_date, freq='CBM')
print(dr7)
# 8. freq = 'MS': month start, only the first day of each month
dr8 = pd.date_range(start=start_date, end=end_date, freq='MS')
print(dr8)
# 9. freq = 'Q': quarter end frequency, the last day of each quarter
dr9 = pd.date_range(start=start_date, end=end_date, freq='Q')
print(dr9)
# 10. freq = 'BH': business hours; the default 09:00-17:00 window yields hourly stamps from 09:00 to 16:00 on business days
dr10 = pd.date_range(start=start_date, end=end_date, freq='BH')
print(dr10)
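The month-end and business-month-end results are easy to mistake for each other, because they only differ in months whose last day falls on a weekend. The sketch below prints both side by side with weekday names. It passes offset objects (pd.offsets.MonthEnd and pd.offsets.BMonthEnd) instead of the string aliases, which is my choice here so that the example does not depend on the alias spelling of a particular pandas version.

```python
import pandas as pd

start_date = "2020-08-01"
end_date = "2020-12-30"

month_end = pd.date_range(start=start_date, end=end_date, freq=pd.offsets.MonthEnd())
bmonth_end = pd.date_range(start=start_date, end=end_date, freq=pd.offsets.BMonthEnd())

# Print each pair with its weekday to spot where the two frequencies diverge
for m, b in zip(month_end, bmonth_end):
    print(m.date(), m.day_name(), "|", b.date(), b.day_name())
```

In this range only October differs, since 2020-10-31 is a Saturday. December is missing from both results because the range ends on 2020-12-30.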
A few other parameters were tested as well.
start_date = "2020-08-01"
end_date = "2020-08-02"
# 1. tz: time zone; the timestamps are localized to this zone (same wall-clock times, now tz-aware with a UTC offset)
dr1 = pd.date_range(start=start_date, end=end_date, freq='10min', tz='Asia/Hong_Kong')
print(dr1)
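To see what tz actually changes, the following sketch compares a naive range with a Hong Kong range and then converts the latter to UTC. The names naive, hk, and hk_utc are mine.

```python
import pandas as pd

naive = pd.date_range(start="2020-08-01", end="2020-08-02", freq="10min")
hk = pd.date_range(
    start="2020-08-01", end="2020-08-02", freq="10min", tz="Asia/Hong_Kong"
)

# Same wall-clock time; the second value carries a UTC offset
print(naive[0], "|", hk[0])

# The same instants expressed in UTC
hk_utc = hk.tz_convert("UTC")
print(hk_utc[0])
```

So tz does not shift the clock readings passed in start and end; it declares which zone they belong to. Shifting happens only when converting with tz_convert.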
# 2. name: names the returned DatetimeIndex
dr2 = pd.date_range(start=start_date, end=end_date, freq='10min', name='my_test_time')
print(dr2)
# The name you set can be read back later from the DatetimeIndex object
dt_index_name = dr2.name
print(dt_index_name)
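# 3. closed: one of {None, 'left', 'right'}
# The default None keeps both the start and the end timestamp.
# 'left' keeps the start timestamp and drops the end timestamp.
# 'right' keeps the end timestamp and drops the start timestamp.
# Note: pandas 1.4 deprecated closed in favor of inclusive, and pandas 2.0 removed it.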
dr3 = pd.date_range(start=start_date, end=end_date, freq='10min', closed='left')
print(dr3)
dr4 = pd.date_range(start=start_date, end=end_date, freq='10min', closed='right')
print(dr4)
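Because closed is no longer accepted by newer pandas, here is a sketch of a small wrapper that tries inclusive first and falls back to closed. The helper name bounded_range is hypothetical and not part of pandas.

```python
import pandas as pd


def bounded_range(start, end, freq, side):
    """Hypothetical helper: side is 'left' or 'right'."""
    try:
        # pandas 1.4 and later accept `inclusive`
        return pd.date_range(start=start, end=end, freq=freq, inclusive=side)
    except TypeError:
        # older pandas only knows `closed`
        return pd.date_range(start=start, end=end, freq=freq, closed=side)


left = bounded_range("2020-08-01", "2020-08-02", "10min", "left")
right = bounded_range("2020-08-01", "2020-08-02", "10min", "right")

print(left[0], left[-1])    # start kept, end dropped
print(right[0], right[-1])  # start dropped, end kept
```

If you only target one pandas version, calling date_range directly with the matching keyword is simpler; the wrapper is only useful for code that has to run on both sides of the change.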