Gathering Stock and Crypto data using Python and yfinance
In this post, I will share how you can gather any stock or crypto data using the yfinance (Yahoo Finance) library and how to save that data on a local server.
The yfinance library makes use of the free Yahoo Finance API to gather stock or crypto information for a specified period and interval. In this post, we will make use of this API to gather data and store a local copy.
Getting Started
First, make sure you have Python 3 installed. You can download it from the official website or use a Python distribution like Anaconda. I would recommend Anaconda, as it makes it simple to manage environments for different Python projects.
Once that's done, let's start by installing yfinance. If you are using pip, run the command below.
pip install yfinance
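Or, if you are using Anaconda, make sure to create and activate an environment before installing yfinance. The package is published on the conda-forge channel, so the install command names that channel.
conda create -n <env-name> python=3
conda activate <env-name>
conda install -c conda-forge yfinance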
Once you have yfinance installed, we can start coding the Python script to collect the data.
To get the data we need the ticker symbol of the stock or cryptocurrency. An easy way to find it is to head to Yahoo Finance and search for the name of the company or coin you want the ticker for.
Once you have made a note of the ticker, you can use the lines below to create a Ticker object for it.
import yfinance as yf
doge = yf.Ticker("DOGE-USD")
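You can specify the time period you want data for and the row interval by passing them to the yf.download function. Both are written as a number followed by a unit: “m” for minutes, “h” for hours, “d” for days, “wk” for weeks, “mo” for months, and “y” for years. Not every combination is accepted; at the time of writing, for example, Yahoo serves intraday intervals such as 15m only for roughly the most recent 60 days. The example below gets data for the Dogecoin-USD pair for a period of 1 month at an interval of 15m.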
df = yf.download(tickers="DOGE-USD", period="1mo", interval="15m")
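Since a mistyped period or interval only fails once the request is made, it can help to check the values first. The sketch below is one way to do that. The two sets list the values yfinance accepted at the time of writing, which is an assumption worth checking against your installed version, and check_request is a hypothetical helper, not part of yfinance.

```python
# Values yfinance accepted at the time of writing (assumption; check the
# library documentation for your installed version).
VALID_PERIODS = {"1d", "5d", "1mo", "3mo", "6mo", "1y", "2y", "5y", "10y", "ytd", "max"}
VALID_INTERVALS = {"1m", "2m", "5m", "15m", "30m", "60m", "90m", "1h", "1d", "5d", "1wk", "1mo", "3mo"}


def check_request(period, interval):
    """Hypothetical helper: raise ValueError for an unsupported period or interval."""
    if period not in VALID_PERIODS:
        raise ValueError(f"unsupported period: {period!r}")
    if interval not in VALID_INTERVALS:
        raise ValueError(f"unsupported interval: {interval!r}")
    return period, interval


# Validate first, then pass the values on to yf.download.
period, interval = check_request("1mo", "15m")
```

The returned values can then be passed straight to yf.download as period=period and interval=interval.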
The output is returned as a pandas DataFrame. If you don't know what that is, pandas is a data manipulation library that is widely used in the data science industry. It gives us a lot of useful functions to work with data, and we will use a few of them in this post.
In the output df we get the Open, High, Low, Close, Adjusted Close, and Volume of the stock or crypto, indexed by date and time from oldest to newest. Depending on the yfinance version and settings, the adjusted close may be folded into Close instead of appearing as its own column. In this case we can get the latest price by reading the Close value of the last row.
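As a minimal sketch of reading the latest price, the example below builds a small stand-in DataFrame with the same shape as a single-ticker download (a datetime index, oldest row first) so that it runs without a network call. The prices are made up for illustration.

```python
import pandas as pd

# Stand-in for the frame yf.download returns: a datetime index, oldest row first.
# The prices below are made up for illustration.
index = pd.date_range("2021-05-01 00:00", periods=3, freq="15min")
df = pd.DataFrame(
    {"Open": [0.30, 0.31, 0.32], "Close": [0.31, 0.32, 0.33], "Volume": [100, 120, 90]},
    index=index,
)

latest_close = df["Close"].iloc[-1]  # Close value of the newest row
latest_time = df.index[-1]           # timestamp of the newest row
print(latest_time, latest_close)
```

The same two lines work on a real download, as long as the frame has a Close column and is sorted from oldest to newest.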
Similarly, we can get the latest row of data by running:
df.tail(1)
The tail function is one of the useful functions provided by pandas; it returns the last n rows. Its counterpart is .head(n), which returns the first n rows.
You can also request multiple tickers at once by passing a list to the tickers argument.
df = yf.download(tickers=["DOGE-USD", "BTC-USD"], period="1mo", interval="15m")
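With more than one ticker, the returned DataFrame has two-level columns. The sketch below assumes the default layout, where the first level is the price field and the second level is the ticker, and uses a made-up stand-in frame so that it runs offline. If your version of yfinance groups the columns differently, the selections need to be adjusted.

```python
import pandas as pd

# Stand-in for a two-ticker download; the values are made up for illustration.
columns = pd.MultiIndex.from_product([["Close", "Volume"], ["BTC-USD", "DOGE-USD"]])
index = pd.date_range("2021-05-01", periods=2, freq="D")
df = pd.DataFrame(
    [[57000.0, 0.39, 10, 20], [56000.0, 0.38, 11, 21]],
    index=index,
    columns=columns,
)

closes = df["Close"]                     # one Close column per ticker
doge_close = df["Close"]["DOGE-USD"]     # a single series for one ticker
btc = df.xs("BTC-USD", axis=1, level=1)  # every field for one ticker
```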
You can also specify a start date and an end date to get data from that range. The end date is exclusive, so the example below returns data up to and including 2020-12-31.
df = yf.download(tickers=["DOGE-USD", "BTC-USD"], start="2020-01-01", end="2021-01-01")
Finally, to store the data you can use the pandas export functions, depending on the format you need: for example, df.to_csv(<file_name>) to export to CSV, df.to_json(<file_name>) to export the data as a JSON file, and so on. You can find the complete list of output formats in the pandas IO documentation.
df.to_csv("data.csv")
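If the script runs repeatedly, each new download will overlap with what is already on disk. One possible way to keep a single local copy up to date is sketched below. update_local_copy is a hypothetical helper, the rows are made up, and the choice to keep the most recently fetched row for a repeated timestamp is an assumption rather than anything yfinance prescribes.

```python
import os
import tempfile

import pandas as pd


def update_local_copy(new_rows, path):
    """Hypothetical helper: merge new_rows into the CSV at path, one row per timestamp."""
    if os.path.exists(path):
        old = pd.read_csv(path, index_col=0, parse_dates=True)
        combined = pd.concat([old, new_rows])
        # When a timestamp appears twice, keep the most recently fetched row.
        combined = combined[~combined.index.duplicated(keep="last")]
    else:
        combined = new_rows
    combined = combined.sort_index()
    combined.to_csv(path)
    return combined


# Made-up rows standing in for two consecutive downloads that overlap by one day.
first = pd.DataFrame({"Close": [0.30, 0.31]}, index=pd.to_datetime(["2021-05-01", "2021-05-02"]))
second = pd.DataFrame({"Close": [0.32, 0.33]}, index=pd.to_datetime(["2021-05-02", "2021-05-03"]))

path = os.path.join(tempfile.mkdtemp(), "data.csv")
update_local_copy(first, path)
result = update_local_copy(second, path)
```

In a real script, new_rows would be the frame returned by yf.download and path would point to a permanent file rather than a temporary directory.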
In the next few articles, I will write about working with stock/crypto data and how to set up indicators.