---
title: "Getting Into Day Trading: Analyzing The Moving Average"
date: 2017-11-04T14:11:54-04:00
draft: false
tags: ["day trading", "data analysis", "python"]
---

I have a bit of price data for the WLTW symbol, and it would be helpful to see that data plotted in all of its glory. Let's take a look at the closing price (y) plotted against the date (x).

![Image](/img/post/WLTW_CLOSING_COSTS.png)

This is a good start, but how well do simple moving averages (SMAs) track the closing price? Let's first write a little Python that computes the SMA for a given window and records the end of that window, which becomes its position on the x-axis.

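```python
import numpy as np


def moving_avs(col, window):
    """Average each consecutive, non-overlapping `window`-sized slice of `col`."""
    values = np.asarray(col, dtype=float)
    averages = {}
    for start in range(0, len(values), window):
        end = min(start + window, len(values)) - 1
        # Key each average by the position of the last value in its window.
        averages[end] = values[start:end + 1].mean()
    return averages
```
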
Using NumPy for analysis and a pandas Series to hold my values, I can use this function to create a dictionary that tracks exactly which day each SMA calculation ends on, as well as the SMA for that range. `window` doubles as the step size in the `range` call, so the windows don't overlap, and the mean of each slice of the data array gives the simple moving average for that window.

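To make the windowing concrete, here is a small sketch on made-up numbers. The `prices` list and the `block_averages` helper are illustrative only; they are not part of the WLTW data. It also shows pandas' `Series.rolling`, which slides the window forward one day at a time instead of stepping by a full window. The two agree wherever a full block ends.

```python
import numpy as np
import pandas as pd

# Made-up closing prices, purely for illustration.
prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0]
window = 3


def block_averages(values, window):
    """Mean of each consecutive, non-overlapping slice, keyed by its last position."""
    values = np.asarray(values, dtype=float)
    result = {}
    for start in range(0, len(values), window):
        end = min(start + window, len(values)) - 1
        result[end] = float(values[start:end + 1].mean())
    return result


blocks = block_averages(prices, window)
print(blocks)  # {2: 11.0, 5: 14.0, 6: 16.0}

# A rolling SMA yields one value per day once a full window is available.
rolling = pd.Series(prices).rolling(window).mean()
print(rolling.tolist())
```
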
Now I can plug my values into the function to generate some simple moving averages.

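```python
import pandas as pd

data = pd.read_csv("prices.csv")
# Keep only WLTW rows and renumber them 0..n-1 so row positions line up with the SMA keys.
wltw = data[data["symbol"] == "WLTW"].reset_index(drop=True)

# Average the closing price column, not the whole DataFrame.
threedaysma = moving_avs(wltw["close"], 3)
fivedaysma = moving_avs(wltw["close"], 5)
```
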
Back to the original question: how well do the SMAs track the closing price? Let's find out.

```python
import matplotlib.pyplot as plt

# Closing price first, then the three-day SMA plotted at each window's key.
plt.plot(wltw["close"], label="close")
plt.plot(list(threedaysma.keys()), list(threedaysma.values()), label="3-day SMA")
plt.legend()
plt.show()
```

Here is the resulting graph for the three-day SMA:

![Image](/img/post/threedaysma.png)

That looks very promising for this small time range. The three-day SMA follows the closing price very closely. Now, I don't know about you, but I'd like to see just how closely. In statistics I learned of a little measure called correlation. From the interwebs:

> Correlation is a statistical measure for how two or more variables fluctuate together.

Now, I won't go into too many details here, as I have mammoth libraries at my disposal. However, I can explain the basics. A correlation coefficient lies between -1 and 1, inclusive. A value of 1 means that the two datasets are perfectly positively correlated (they fluctuate together), while a value of -1 means that they are perfectly negatively correlated (they fluctuate inversely). Values in between indicate how strong the positive or negative relationship is, and 0 means there is no linear relationship between the datasets at all.

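As a rough illustration of those endpoints, the sketch below computes the Pearson coefficient by hand, as the covariance of the two datasets divided by the product of their standard deviations, and compares it with `np.corrcoef`. The `pearson` helper and the toy arrays are made up for this example.

```python
import numpy as np


def pearson(x, y):
    """Pearson r: covariance of x and y over the product of their standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # toy data, not WLTW prices

print(pearson(x, 2 * x + 1))                  # rises with x: r is 1 (up to rounding)
print(pearson(x, -3 * x))                     # falls as x rises: r is -1 (up to rounding)
print(pearson(x, [2.0, 1.0, 3.0, 1.0, 2.0]))  # no linear trend: r is 0 (up to rounding)

# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r.
print(np.corrcoef(x, 2 * x + 1)[0, 1])
```
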
To calculate correlation, I use NumPy's `corrcoef` function, which computes the Pearson product-moment correlation coefficients for two array-like inputs. First, I clean the data by dropping the closing prices that don't fall on the end of a three-day SMA window.

```python
# Keep only the rows at the positions recorded as SMA keys.
cleaned = wltw.iloc[list(threedaysma.keys()), :]
```

And now, to calculate our Pearson product-moment correlation coefficients:

```python
threedaysma_array = np.array(list(threedaysma.values()))
print(np.corrcoef(cleaned["close"], threedaysma_array))

# [[ 1.          0.96788571]
#  [ 0.96788571  1.        ]]
```

What this 2D output array tells us is that the data are very strongly correlated! For the points we kept, the correlation coefficient is almost 1. That is great news: the three-day SMA tracks the closing price closely enough that we can most likely use it moving forward for forecasting and short-term trading.

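The five-day SMA computed earlier can be checked the same way. The sketch below is a hypothetical end-to-end version: it uses a synthetic random walk instead of `prices.csv`, so the numbers it prints say nothing about WLTW, and a made-up helper, `sma_correlation`, that keeps full windows only.

```python
import numpy as np
import pandas as pd

# Synthetic random-walk "closing prices" (an assumption standing in for prices.csv).
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 300).cumsum())


def sma_correlation(close, window):
    """Correlate block averages with the close at the end of each block."""
    ends = list(range(window - 1, len(close), window))  # full windows only
    averages = [close.iloc[end - window + 1:end + 1].mean() for end in ends]
    return float(np.corrcoef(close.iloc[ends], averages)[0, 1])


for window in (3, 5):
    print(window, round(sma_correlation(close, window), 4))
```
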