Assignment 5: Stock Data Analysis

Author: 海墨星人 | Published: 2017-04-18 13:16

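Data is the oil of the future. Last week I joined the [解密大数据社群] (Decoding Big Data community) founded by Tiger and learned a great deal. I joined partway through, so I am still catching up on the earlier material.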

Definitions of the basic indicators

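Opening price: the price of the first trade in a security on the exchange on each business day.

Trading day: a working day on which an open-end fund's sales agency accepts investors' subscription, conversion, redemption, or other requests within the specified hours. Weekends and public holidays are not T days (trading days). A T day is bounded by the stock market close: orders submitted before 15:00 are executed at the net asset value published after that day's close (usually around 18:00), and orders submitted after 15:00 are executed at the next trading day's net asset value.

Closing price: the price of the last trade in a security on the exchange before the day's trading ends.

Volume: the total number of lots of shares traded in a given period (1 lot = 100 shares).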

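Turnover rate, also called the "turnover ratio": the frequency with which shares change hands in the market over a given period, and one indicator of a stock's liquidity. A higher turnover rate means the stock is traded more actively and investors are more willing to buy it; a lower one means few people are following it. A high turnover rate generally means good liquidity: it is easy to enter and exit the market, and the stock can be readily converted to cash. Combining the turnover rate with the price trend can support some predictions and judgments about future prices. Formula: turnover rate = (volume traded in a given period) / (total shares issued) x 100%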
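The turnover formula above can be sketched in a few lines of Python. The function name `turnover_rate` and the figures used are made up for illustration; the only thing taken from the text is the formula itself and the 1 lot = 100 shares convention.

```python
def turnover_rate(volume_shares, total_shares_issued):
    """Turnover rate in percent: shares traded / shares issued x 100."""
    if total_shares_issued <= 0:
        raise ValueError("total_shares_issued must be positive")
    return volume_shares / total_shares_issued * 100


# Hypothetical figures: 50,000 lots traded (1 lot = 100 shares)
# against 1 billion shares issued.
lots_traded = 50_000
rate = turnover_rate(lots_traded * 100, 1_000_000_000)
print(f"{rate:.2f}%")
```

Note that the volume must be converted from lots to shares before dividing, otherwise the result is off by a factor of 100.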

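Price change (%): the rise or fall in price, expressed as a percentage. Formula: (latest traded price (or closing price) - previous trading day's closing price) ÷ previous trading day's closing price × 100%.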
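As a small worked example of the percentage-change formula, the sketch below applies it to three consecutive closing prices. The helper name `pct_change` and the prices are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def pct_change(current_price, previous_close):
    """Price change in percent relative to the previous trading day's close."""
    return (current_price - previous_close) / previous_close * 100


# Hypothetical closing prices for three consecutive trading days.
closes = [10.00, 10.50, 10.29]

# Pair each close with the close of the day before it.
changes = [pct_change(cur, prev) for prev, cur in zip(closes, closes[1:])]
print([round(c, 2) for c in changes])
```

The first day has no previous close, so a series of n prices yields n - 1 changes.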

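Average price: shown by the average price line. It reflects the average cost of market participants on that day and acts to some degree as resistance and support. Formula: average traded price per share = total traded value for the day ÷ total shares traded that day. Extending the period gives the 5-day average price, the 10-day average price, and so on.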

Average volume: defined analogously to the average price.
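Multi-day averages of price and volume can be computed with a pandas rolling window. The sketch below assumes that an n-day average is the simple mean of the last n daily values; the column names `ma5` and `v_ma5` mirror those in the data file, but the numbers are invented and the data provider may compute its columns differently.

```python
import pandas as pd

# Hypothetical closing prices and volumes for seven trading days.
df = pd.DataFrame({
    "close":  [10.0, 10.2, 10.4, 10.6, 10.8, 11.0, 11.2],
    "volume": [100, 120, 90, 110, 130, 150, 140],
})

# 5-day simple moving averages; the first four rows are NaN because
# a full 5-day window is not yet available.
df["ma5"] = df["close"].rolling(window=5).mean()
df["v_ma5"] = df["volume"].rolling(window=5).mean()
print(df)
```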

Reading and displaying the data

import os

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

data_src='C://Users//ecaoyng//Downloads//stockData.txt'

# Check that the file exists before trying to read it
if os.path.exists(data_src):
    stock_data=pd.read_table(data_src)
    # Keep only the first 15 columns
    stock_data=stock_data.iloc[:,0:15]
    print(stock_data)

# Show the five rows with the largest index values
stock_data.sort_index(ascending = False)[:5]
[Screenshot: output of the cell above]

Analysis 1: Summary statistics

stock_data.describe()
[Screenshots: output of stock_data.describe()]

This gives the mean, standard deviation, minimum, maximum, lower and upper quartiles, and median of each indicator.

Analysis 2: Correlation coefficients and covariance

The code below computes the correlation coefficient between every pair of indicators

# Column names as they appear in the data file (note the leading spaces in '    open').
# 'turnover' is included so that the loop matches the output shown below.
cols = ['high' ,'low','    open','close','ma5','ma10','ma20','volume','v_ma5','v_ma10','v_ma20','turnover']
for i in cols:
    for j in cols:
        print('%s and %s corr is %s '%(i, j, (stock_data[i].corr(stock_data[j]))))
    print('-'*20)

The resulting correlation coefficients are listed below; the correlation between any two columns can be read off directly

high and high corr is 1.0 
high and low corr is 0.93037923331 
high and     open corr is 0.897211953333 
high and close corr is 0.966243822278 
high and ma5 corr is 0.812480947253 
high and ma10 corr is 0.660447391359 
high and ma20 corr is 0.179590482526 
high and volume corr is 0.896833634665 
high and v_ma5 corr is 0.868560964795 
high and v_ma10 corr is 0.628375689954 
high and v_ma20 corr is -0.354876794097 
high and turnover corr is 0.897045822307 
--------------------
low and high corr is 0.93037923331 
low and low corr is 1.0 
low and     open corr is 0.95643942191 
low and close corr is 0.910515603813 
low and ma5 corr is 0.894385404937 
low and ma10 corr is 0.759209638782 
low and ma20 corr is 0.350171538359 
low and volume corr is 0.730795712392 
low and v_ma5 corr is 0.907433960723 
low and v_ma10 corr is 0.708135592414 
low and v_ma20 corr is -0.284208967808 
low and turnover corr is 0.730853369056 
--------------------
    open and high corr is 0.897211953333 
    open and low corr is 0.95643942191 
    open and     open corr is 1.0 
    open and close corr is 0.836541327926 
    open and ma5 corr is 0.898542997094 
    open and ma10 corr is 0.757148165488 
    open and ma20 corr is 0.350463060696 
    open and volume corr is 0.697294144296 
    open and v_ma5 corr is 0.922297477575 
    open and v_ma10 corr is 0.725333330517 
    open and v_ma20 corr is -0.247214903269 
    open and turnover corr is 0.697646328153 
--------------------
close and high corr is 0.966243822278 
close and low corr is 0.910515603813 
close and     open corr is 0.836541327926 
close and close corr is 1.0 
close and ma5 corr is 0.756787587238 
close and ma10 corr is 0.593292197451 
close and ma20 corr is 0.107855192304 
close and volume corr is 0.910780093909 
close and v_ma5 corr is 0.824391839972 
close and v_ma10 corr is 0.576449470244 
close and v_ma20 corr is -0.376021929049 
close and turnover corr is 0.910879839054 
--------------------
ma5 and high corr is 0.812480947253 
ma5 and low corr is 0.894385404937 
ma5 and     open corr is 0.898542997094 
ma5 and close corr is 0.756787587238 
ma5 and ma5 corr is 1.0 
ma5 and ma10 corr is 0.921168092037 
ma5 and ma20 corr is 0.525347581453 
ma5 and volume corr is 0.535869431606 
ma5 and v_ma5 corr is 0.953928210144 
ma5 and v_ma10 corr is 0.853326052426 
ma5 and v_ma20 corr is -0.200960528265 
ma5 and turnover corr is 0.535991249199 
--------------------
ma10 and high corr is 0.660447391359 
ma10 and low corr is 0.759209638782 
ma10 and     open corr is 0.757148165488 
ma10 and close corr is 0.593292197451 
ma10 and ma5 corr is 0.921168092037 
ma10 and ma10 corr is 1.0 
ma10 and ma20 corr is 0.523157969292 
ma10 and volume corr is 0.35901429877 
ma10 and v_ma5 corr is 0.791054293397 
ma10 and v_ma10 corr is 0.843184862447 
ma10 and v_ma20 corr is -0.266123905738 
ma10 and turnover corr is 0.358978976146 
--------------------
ma20 and high corr is 0.179590482526 
ma20 and low corr is 0.350171538359 
ma20 and     open corr is 0.350463060696 
ma20 and close corr is 0.107855192304 
ma20 and ma5 corr is 0.525347581453 
ma20 and ma10 corr is 0.523157969292 
ma20 and ma20 corr is 1.0 
ma20 and volume corr is -0.0600220207727 
ma20 and v_ma5 corr is 0.374810467632 
ma20 and v_ma10 corr is 0.558744215458 
ma20 and v_ma20 corr is 0.51793654476 
ma20 and turnover corr is -0.0600833456034 
--------------------
volume and high corr is 0.896833634665 
volume and low corr is 0.730795712392 
volume and     open corr is 0.697294144296 
volume and close corr is 0.910780093909 
volume and ma5 corr is 0.535869431606 
volume and ma10 corr is 0.35901429877 
volume and ma20 corr is -0.0600220207727 
volume and volume corr is 1.0 
volume and v_ma5 corr is 0.643850457981 
volume and v_ma10 corr is 0.356907380189 
volume and v_ma20 corr is -0.376176237006 
volume and turnover corr is 0.99999710781 
--------------------
v_ma5 and high corr is 0.868560964795 
v_ma5 and low corr is 0.907433960723 
v_ma5 and     open corr is 0.922297477575 
v_ma5 and close corr is 0.824391839972 
v_ma5 and ma5 corr is 0.953928210144 
v_ma5 and ma10 corr is 0.791054293397 
v_ma5 and ma20 corr is 0.374810467632 
v_ma5 and volume corr is 0.643850457981 
v_ma5 and v_ma5 corr is 1.0 
v_ma5 and v_ma10 corr is 0.828185562361 
v_ma5 and v_ma20 corr is -0.171466705223 
v_ma5 and turnover corr is 0.644134172052 
--------------------
v_ma10 and high corr is 0.628375689954 
v_ma10 and low corr is 0.708135592414 
v_ma10 and     open corr is 0.725333330517 
v_ma10 and close corr is 0.576449470244 
v_ma10 and ma5 corr is 0.853326052426 
v_ma10 and ma10 corr is 0.843184862447 
v_ma10 and ma20 corr is 0.558744215458 
v_ma10 and volume corr is 0.356907380189 
v_ma10 and v_ma5 corr is 0.828185562361 
v_ma10 and v_ma10 corr is 1.0 
v_ma10 and v_ma20 corr is 0.150980899067 
v_ma10 and turnover corr is 0.357412751599 
--------------------
v_ma20 and high corr is -0.354876794097 
v_ma20 and low corr is -0.284208967808 
v_ma20 and     open corr is -0.247214903269 
v_ma20 and close corr is -0.376021929049 
v_ma20 and ma5 corr is -0.200960528265 
v_ma20 and ma10 corr is -0.266123905738 
v_ma20 and ma20 corr is 0.51793654476 
v_ma20 and volume corr is -0.376176237006 
v_ma20 and v_ma5 corr is -0.171466705223 
v_ma20 and v_ma10 corr is 0.150980899067 
v_ma20 and v_ma20 corr is 1.0 
v_ma20 and turnover corr is -0.375592133339 
--------------------
turnover and high corr is 0.897045822307 
turnover and low corr is 0.730853369056 
turnover and     open corr is 0.697646328153 
turnover and close corr is 0.910879839054 
turnover and ma5 corr is 0.535991249199 
turnover and ma10 corr is 0.358978976146 
turnover and ma20 corr is -0.0600833456034 
turnover and volume corr is 0.99999710781 
turnover and v_ma5 corr is 0.644134172052 
turnover and v_ma10 corr is 0.357412751599 
turnover and v_ma20 corr is -0.375592133339 
turnover and turnover corr is 1.0 
--------------------

The covariance between the indicators can be obtained in the same way:

for i in ['high' ,'low','    open','close','ma5','ma10','ma20']:
    for j in ['high' ,'low','    open','close','ma5','ma10','ma20']:
        print('%s and %s cov is %s '%(i, j, (stock_data[i].cov(stock_data[j]))))
    print('-'*20)
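As an alternative to the nested loops, pandas can produce the whole matrix in one call with `DataFrame.corr()` and `DataFrame.cov()`. The sketch below uses a small invented frame named `toy` rather than the real `stock_data`, so the numbers it prints are not the ones listed above.

```python
import pandas as pd

# Hypothetical prices for five trading days; not the article's data.
toy = pd.DataFrame({
    "high":  [11.0, 11.2, 11.5, 11.3, 11.8],
    "low":   [10.6, 10.8, 11.0, 10.9, 11.3],
    "close": [10.9, 11.1, 11.4, 11.0, 11.7],
})

corr_matrix = toy.corr()  # pairwise Pearson correlation (the default method)
cov_matrix = toy.cov()    # pairwise sample covariance

print(corr_matrix.round(3))
print(cov_matrix.round(4))
```

Both matrices are symmetric, which is why every pair appears twice in the loop output. On the real data, selecting the columns of interest first (for example `stock_data[cols].corr()`) should give the same values as the loop.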

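Conclusions:
Going by the pairwise correlation coefficients, if a coefficient of 0.95 or higher is defined as a strong correlation, then

  1. The high and the closing price are strongly correlated, with a coefficient of 0.97 (close vs. high is the same pair, since the coefficient is symmetric)
  2. The opening price and the low are strongly correlated, with a coefficient of 0.96
  3. The 5-day moving average and the 5-day average volume are strongly positively correlated, with a coefficient of 0.95
  4. The turnover rate and volume are strongly correlated, with a coefficient above 0.999

The definition of a strong correlation can of course be changed; the corresponding conclusions can still be read from the output above.
The correlation coefficients support many further analyses, which are not worked through one by one here.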
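Picking out the strongly correlated pairs for a chosen threshold can also be automated. The following is a minimal sketch, assuming a helper named `strong_pairs` (not part of the original analysis) and an invented frame; each pair is reported once rather than twice.

```python
from itertools import combinations

import pandas as pd


def strong_pairs(frame, threshold=0.95):
    """Return (col_a, col_b, r) for each column pair whose Pearson r >= threshold."""
    corr = frame.corr()
    pairs = []
    for a, b in combinations(corr.columns, 2):  # each unordered pair once
        r = corr.loc[a, b]
        if r >= threshold:
            pairs.append((a, b, round(float(r), 3)))
    return pairs


# Hypothetical data: turnover is exactly volume / 100, ma20 is unrelated.
toy = pd.DataFrame({
    "volume":   [100, 150, 120, 200, 170],
    "turnover": [1.0, 1.5, 1.2, 2.0, 1.7],
    "ma20":     [11.2, 11.1, 11.3, 11.0, 11.2],
})
print(strong_pairs(toy))
```

Changing `threshold` corresponds to changing the definition of a strong correlation; comparing `abs(r)` instead would also pick up strong negative correlations.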

Analysis 3: The low of 10.55 (on the 19th)

Plot the trend of each price indicator with matplotlib

from matplotlib.dates import DateFormatter, DayLocator

# Parse the date column and pull out each price series
date = stock_data['date']
date = pd.to_datetime(date)
high = stock_data['high'].values
low = stock_data['low'].values
open_price= stock_data['    open'].values
close = stock_data['close'].values
ma5 = stock_data['ma5'].values
ma10 = stock_data['ma10'].values
ma20 = stock_data['ma20'].values

# Draw all price series on one set of axes
fig = plt.figure(figsize = (15,5))
ax = fig.add_subplot(111)
ax.set_title("Stock price")
ax.plot(date,open_price,label='open')
ax.plot(date,high,label='high')
ax.plot(date,low,label = 'low')
ax.plot(date,close,label = 'close')
ax.plot(date,ma5,label = 'ma5')
ax.plot(date,ma10,label = 'ma10')
ax.plot(date,ma20,label = 'ma20')

ax.set_xlabel("date")
ax.set_ylabel("values")
# One tick per calendar day, labelled as YYYYMMDD
ax.xaxis.set_major_locator(DayLocator(bymonthday=range(1,32), interval=1))
ax.xaxis.set_major_formatter(DateFormatter('%Y%m%d'))
plt.xticks(rotation=60)

plt.legend(loc='upper left')
plt.grid(True)
plt.show()

20170417171433358.png

Figure 1: The low curve in relation to the 5/10/20-day moving averages


20170417213935413.png

Figure 2: Volume in relation to the 5/10/20-day average volume curves

20170417213935413.png

Figure 3: The turnover rate in relation to the other price indicators

20170417214350594.png

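Looking on the 19th at the data up to that point gives the following reasons to buy:

  1. Based on the data before the 19th, the lower quartile of the low curve is 10.89, and the low on the 19th (10.55) is well below that lower quartile. The lower quartile of the lows before the 19th can be obtained with the code below.
data_19=stock_data.sort_index(ascending = False)[:10]
data_19.describe()
  2. Figure 1 shows that the 5-day moving average has stopped falling and has held steady for some time.
  3. The low on the 19th (10.55) is far from the 5/10/20-day moving averages (11.014, 11.241, 11.227).
  4. From the correlation coefficients, v_ma5 and the low have a correlation of about 0.9, and Figure 2 shows that the v_ma5 curve has stabilized.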
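The first reason above compares the day's low with the lower quartile of the earlier lows. A minimal sketch of that comparison follows, using `Series.quantile(0.25)`, which uses the same linear interpolation that `describe()` applies for its 25% row. The lows here are invented, so the quartile differs from the 10.89 quoted above.

```python
import pandas as pd

# Hypothetical daily lows, oldest first; the last value is the day being judged.
lows = pd.Series([11.30, 11.10, 10.95, 11.00, 10.89, 10.92, 11.05, 10.55])

history = lows.iloc[:-1]   # lows before the day in question
today_low = lows.iloc[-1]

q1 = history.quantile(0.25)      # lower quartile of the earlier lows
below_q1 = bool(today_low < q1)  # is today's low unusually low?

print(f"lower quartile = {q1:.3f}, today's low = {today_low:.2f}, below = {below_q1}")
```

Excluding the day being judged from `history` matters: including it would pull the quartile down and weaken the comparison.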

Analysis 4: Where the low curve meets the 5/10/20-day moving averages (the 21st)

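The low line crosses upward through the point where the 5/10/20-day moving averages intersect, and the 5-day moving average crosses upward through the 10- and 20-day moving averages. With three moving averages acting as support, the probability that the stock turns bullish increases.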
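An upward crossing like the one described can be detected programmatically by comparing each day with the day before. The sketch below assumes a helper named `crosses_above` (hypothetical, not from the original analysis) and invented moving-average values; it also assumes the inputs contain no NaN, so the leading rows of a rolling average should be dropped first.

```python
import pandas as pd


def crosses_above(fast, slow):
    """True on days where `fast` goes from at-or-below `slow` to strictly above it."""
    above = fast > slow
    # Treat the day before the first row as "already above" so that the
    # first row is never reported as a crossing.
    prev_above = above.shift(1, fill_value=True)
    return above & ~prev_above


# Hypothetical 5-day and 10-day moving averages over five trading days.
ma5 = pd.Series([10.8, 10.9, 11.0, 11.2, 11.4])
ma10 = pd.Series([11.0, 11.0, 11.0, 11.1, 11.2])

cross = crosses_above(ma5, ma10)
print(cross.tolist())
```

Swapping the arguments gives the downward crossing, which is the pattern a sell-side reading would look for.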

Analysis 5: The best selling point (the 26th)

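The low crosses downward through the 5-day moving average
The lows are all above the three moving averages
The correlation coefficients can be interpreted in the same way as before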

Analysis 6: Reading the average prices

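The 5/10/20-day moving averages change smoothly, with no abrupt rises or falls, and the longer the window, the flatter the curve. The same holds for the 5/10/20-day average volumes.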
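The smoothing effect can be checked on a toy series: if the price jumps by 2 for a single day, an n-day simple moving average moves by only 2/n. The sketch below assumes simple (unweighted) averages and uses an invented price series.

```python
import pandas as pd

# Hypothetical prices: flat at 10.0 with a single one-day spike to 12.0.
prices = pd.Series([10.0] * 25 + [12.0] + [10.0] * 25)

# Largest one-day move of each n-day simple moving average.
max_jump = {
    n: prices.rolling(window=n).mean().diff().abs().max()
    for n in (5, 10, 20)
}
print({n: round(v, 3) for n, v in max_jump.items()})
```

The raw series moves by 2.0 in one day, while the averages move by 2/5, 2/10 and 2/20 respectively, which is why the longer windows look flatter on the chart.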

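PS:
I follow finance as a hobby. As I see it, the Chinese stock market is a place where people with money gain experience and still may not make any money. Institutions and listed companies do not lose money: they can artificially push the share price up or down by borrowing against its future value, and profit from doing so. This post only uses examples to illustrate a few lines of analysis and does not go further.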

Finally, thanks to 虎哥 (Tiger) for sharing his professional knowledge.


Article link: https://www.haomeiwen.com/subject/neytzttx.html