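Python for Data: Moving Window Functions
This article covers moving window functions. The main operators are:
- the rolling operator
- the expanding operator
- the ewm operator
Moving window functions
Statistics and other functions evaluated over a sliding window, or with exponentially decaying weights, are called moving window functions.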
import pandas as pd
close_px_all = pd.read_csv("./examples/stock_px_2.csv", parse_dates=True, index_col=0)  # dates in the first column become the index
close_px = close_px_all[["AAPL","MSFT","XOM"]]
close_px = close_px.resample("B").ffill()  # business-day frequency (the "Freq: B" in the outputs below), gaps forward-filled
Date | AAPL | MSFT | XOM |
---|---|---|---|
2003-01-02 | 7.40 | 21.11 | 29.22 |
2003-01-03 | 7.45 | 21.14 | 29.24 |
2003-01-06 | 7.45 | 21.52 | 29.96 |
2003-01-07 | 7.43 | 21.93 | 28.95 |
2003-01-08 | 7.28 | 21.31 | 28.83 |
... | ... | ... | ... |
2011-10-10 | 388.81 | 26.94 | 76.28 |
2011-10-11 | 400.29 | 27.00 | 76.27 |
2011-10-12 | 402.19 | 26.96 | 77.16 |
2011-10-13 | 408.43 | 27.18 | 76.37 |
2011-10-14 | 422.00 | 27.27 | 78.11 |
2292 rows × 3 columns
close_px.AAPL.plot()
[Figure: AAPL closing price]
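The rolling operator
The rolling operator behaves much like resample and groupby.
rolling can be called on a Series or a DataFrame together with a window, expressed as a number of periods.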
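As a minimal sketch of the call pattern (the tiny series `s` is made up for illustration and is not the stock data used in this article), a window of 3 periods averages each value with the two before it:

```python
import pandas as pd

# Hypothetical toy series, not the article's stock data
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Each output is the mean of the current value and the two before it;
# the first two windows are incomplete, so they are NaN by default
rolling_mean = s.rolling(3).mean()
print(rolling_mean.tolist())  # [nan, nan, 2.0, 3.0, 4.0]
```

The result has the same index as the input, which is why a rolling statistic can be plotted directly against the original series.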
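# The curve is smoother: values are grouped over a 250-day sliding window instead of being grouped directly
[Figure: AAPL price smoothed over a 250-day rolling window]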
appl_std250 = close_px.AAPL.rolling(250,min_periods=10).std()
2003-01-09 NaN
2003-01-10 NaN
2003-01-13 NaN
2003-01-14 NaN
2003-01-15 0.077496
2003-01-16 0.074760
2003-01-17 0.112368
Freq: B, Name: AAPL, dtype: float64
# By default, rolling window functions require every value in the window to be non-NaN; min_periods relaxes this
[Figure: AAPL 250-day rolling standard deviation]
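The effect of `min_periods` can be seen on a small example. This is only a sketch on a made-up series (the values and the window of 5 are assumptions chosen for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical toy series, not the article's stock data
s = pd.Series(np.arange(10, dtype=float))

strict = s.rolling(5).std()                  # needs 5 observations per window
relaxed = s.rolling(5, min_periods=2).std()  # needs only 2 observations

print(int(strict.isna().sum()))   # 4 leading NaN values
print(int(relaxed.isna().sum()))  # 1 leading NaN value
```

This matches the output above: with `min_periods=10`, the first nine business days are NaN and the tenth (2003-01-15) is the first to carry a value.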
Calling a moving window function on a DataFrame applies the transformation to each column.
close_px.rolling(60).mean().plot(logy=True)
[Figure: 60-day moving averages of AAPL, MSFT, and XOM, log y-axis]
# rolling also accepts a fixed-size time offset string instead of a number of periods
Date | AAPL | MSFT | XOM |
---|---|---|---|
2003-01-02 | 7.400000 | 21.110000 | 29.220000 |
2003-01-03 | 7.425000 | 21.125000 | 29.230000 |
2003-01-06 | 7.433333 | 21.256667 | 29.473333 |
2003-01-07 | 7.432500 | 21.425000 | 29.342500 |
2003-01-08 | 7.402000 | 21.402000 | 29.240000 |
... | ... | ... | ... |
2011-10-10 | 389.351429 | 25.602143 | 72.527857 |
2011-10-11 | 388.505000 | 25.674286 | 72.835000 |
2011-10-12 | 388.531429 | 25.810000 | 73.400714 |
2011-10-13 | 388.826429 | 25.961429 | 73.905000 |
2011-10-14 | 391.038000 | 26.048667 | 74.185333 |
2292 rows × 3 columns
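A time-offset window differs from a fixed count of periods when the dates are irregularly spaced. The following sketch uses made-up dates and a hypothetical `"2D"` offset to show that the number of observations per window then varies:

```python
import pandas as pd

# Hypothetical observations with a weekend gap between the 3rd and the 6th
idx = pd.to_datetime(["2003-01-02", "2003-01-03", "2003-01-06", "2003-01-07"])
s = pd.Series([1.0, 2.0, 3.0, 4.0], index=idx)

# "2D" covers the interval (t - 2 days, t], so each window holds
# however many observations fall inside that time span
by_time = s.rolling("2D").mean()
print(by_time.tolist())  # [1.0, 1.5, 3.0, 3.5]
```

With an offset window pandas defaults `min_periods` to 1, which is why the table above has no leading NaN rows.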
The expanding mean operator: expanding
# Call the expanding mean operator
2003-01-02 NaN
2003-01-03 NaN
2003-01-06 NaN
2003-01-07 NaN
2003-01-08 NaN
...
2011-10-10 18.521201
2011-10-11 18.524272
2011-10-12 18.527385
2011-10-13 18.530554
2011-10-14 18.533823
Freq: B, Name: AAPL, Length: 2292, dtype: float64
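To make the idea concrete, here is a small sketch on a made-up series: the expanding window starts at the first observation and grows by one element at each step, so the expanding mean equals a running sum divided by a running count.

```python
import pandas as pd

# Hypothetical toy series, not the article's stock data
s = pd.Series([2.0, 4.0, 6.0, 8.0])

# The window starts at the first observation and grows by one each step
expanding_mean = s.expanding().mean()
print(expanding_mean.tolist())  # [2.0, 3.0, 4.0, 5.0]

# Same result from a running sum divided by a running count
manual = s.cumsum() / pd.Series(range(1, len(s) + 1), index=s.index)
```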
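Exponentially weighted functions
Specifying a constant decay factor gives more weight to more recent observations. A common way to specify the decay factor is a span.
The ewm operator
# Compare a simple moving average of Apple's stock price with an exponentially weighted moving average (the call below uses span=30)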
ewma60 = appl_px.ewm(span=30).mean()
Comparing rolling and ewm
ma60.plot(style="k--",label="Simple MA")
[Figure: simple moving average, dashed line]
ewma60.plot(style="k-",label="EWMA")
[Figure: EWMA, solid line]
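How a span turns into weights can be checked by hand. pandas converts a span to a smoothing factor `alpha = 2 / (span + 1)`. The sketch below uses a made-up series and passes `adjust=False`, which selects the simple recursive form; the calls in this article use the default `adjust=True`, which normalizes the weights differently near the start of the series.

```python
import pandas as pd

# Hypothetical toy series, not the article's stock data
s = pd.Series([1.0, 2.0, 3.0, 4.0])
span = 3
alpha = 2 / (span + 1)  # span-to-alpha conversion: 0.5 here

# adjust=False selects the recursive form
# y[0] = x[0]; y[t] = (1 - alpha) * y[t-1] + alpha * x[t]
ewma = s.ewm(span=span, adjust=False).mean()

manual = [s.iloc[0]]
for x in s.iloc[1:]:
    manual.append((1 - alpha) * manual[-1] + alpha * x)

print(ewma.tolist())  # [1.0, 1.5, 2.25, 3.125]
print(manual)         # [1.0, 1.5, 2.25, 3.125]
```

Because recent observations carry more weight, an exponentially weighted average reacts to changes faster than an equally weighted moving average.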
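Binary moving window functions: rolling + corr
Some statistical operators, such as correlation and covariance, need to operate on two time series at once.
For example, financial analysts are often interested in how a stock correlates with a benchmark index. To study this, first compute the percent change of each time series with pct_change().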
close_px_all[:5]
Date | AAPL | MSFT | XOM | SPX |
---|---|---|---|---|
2003-01-02 | 7.40 | 21.11 | 29.22 | 909.03 |
2003-01-03 | 7.45 | 21.14 | 29.24 | 908.59 |
2003-01-06 | 7.45 | 21.52 | 29.96 | 929.01 |
2003-01-07 | 7.43 | 21.93 | 28.95 | 922.93 |
2003-01-08 | 7.28 | 21.31 | 28.83 | 909.93 |
Correlation between Apple and the S&P 500
spx_px = close_px_all["SPX"]  # select a single column
spx_rets = spx_px.pct_change()  # percent change of the S&P 500 (assumed definition of spx_rets, which is used below)
returns = close_px.pct_change()  # percent change of every column
# After calling rolling, the corr aggregation computes the rolling correlation with spx_rets
[Figure: rolling correlation of AAPL returns with the S&P 500]
Correlation between all companies and the S&P 500
corr = returns.rolling(125,min_periods=100).corr(spx_rets)
[Figure: rolling correlations of AAPL, MSFT, and XOM returns with the S&P 500]
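The two cases above can be reproduced without the CSV file. The sketch below is an illustration only: the price paths are randomly generated, and the names `bench_px`, `stock_px`, `bench_rets`, and `stock_rets` are hypothetical stand-ins for the index and stock data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
idx = pd.bdate_range("2003-01-02", periods=300)

# Hypothetical price paths standing in for a benchmark and two stocks
bench_px = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 300))), index=idx)
stock_px = pd.DataFrame({
    "A": bench_px * np.exp(np.cumsum(rng.normal(0, 0.005, 300))),  # tracks the benchmark
    "B": 50 * np.exp(np.cumsum(rng.normal(0, 0.01, 300))),         # independent of it
}, index=idx)

bench_rets = bench_px.pct_change()  # p[t] / p[t-1] - 1
stock_rets = stock_px.pct_change()

# Series against Series: a single rolling correlation
corr_a = stock_rets["A"].rolling(125, min_periods=100).corr(bench_rets)

# DataFrame against Series: one rolling correlation per column
corr_all = stock_rets.rolling(125, min_periods=100).corr(bench_rets)
print(corr_all.tail())
```

Passing a Series to `corr` on a DataFrame rolling object avoids looping over the columns by hand.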
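User-defined moving window functions
The apply method on rolling and related objects provides a way to apply an array function of your own design over a moving window.
The only requirement is that the function reduce each array to a single value. For example, rolling(...).quantile(q) computes sample quantiles (the median for q=0.5).
# Percentile rank of a given value: scipy.stats.percentileofscore
from scipy.stats import percentileofscore
score_at_2percent = lambda x: percentileofscore(x,0.02)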
[Figure: rolling percentile rank of a 2% return]
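The pattern can be sketched end to end on made-up data. To keep the example dependency-free, the hypothetical helper `pct_at_or_below` is a NumPy stand-in for `scipy.stats.percentileofscore`; it counts values less than or equal to the score, which agrees with SciPy's result when the score does not appear in the window.

```python
import numpy as np
import pandas as pd

def pct_at_or_below(window, score=0.02):
    """Percentage of values in the window that are <= score."""
    return 100.0 * np.mean(window <= score)

# Hypothetical daily returns, not the article's data
returns = pd.Series([0.01, 0.03, -0.01, 0.05, 0.00, 0.01])

# raw=True passes each window to the function as a NumPy array
result = returns.rolling(4).apply(pct_at_or_below, raw=True)
print(result.tolist())  # [nan, nan, nan, 50.0, 50.0, 75.0]
```

Any function works here as long as it returns one number per window; returning an array would raise an error.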