What Is yfinance?

yfinance is an open-source Python library that provides a convenient way to download financial data from Yahoo Finance. It was created by Ran Aroussi and is available on PyPI. The library wraps Yahoo Finance's data endpoints and returns data in pandas DataFrame format, making it straightforward to integrate into data analysis and trading system workflows.

yfinance is not an official Yahoo Finance API -- Yahoo discontinued its official API years ago. Instead, yfinance reverse-engineers the data endpoints that Yahoo Finance's own website uses. This means the library could break if Yahoo changes its backend, which has happened occasionally in the past. Despite this fragility, yfinance remains the most popular Python library for downloading free stock market data, with millions of monthly downloads on PyPI.

Installation

Install yfinance using pip:

pip install yfinance

yfinance depends on pandas, numpy, and requests, among other packages; pip installs any that are not already present. For the most up-to-date version, you can install directly from the GitHub repository, though the PyPI version is usually sufficient.

The Ticker Object

The core of yfinance is the Ticker object. You create one by passing a stock symbol:

import yfinance as yf

ticker = yf.Ticker("AAPL")

The Ticker object is your gateway to all data about that security -- price history, company information, financials, earnings, options, and more. The object itself does not immediately download data; data is fetched lazily when you access specific properties or call methods.

Downloading OHLCV Price Data

The most common use of yfinance is downloading historical price data. The .history() method returns a pandas DataFrame with Open, High, Low, Close, and Volume columns:

import yfinance as yf

ticker = yf.Ticker("AAPL")
hist = ticker.history(period="6mo")

print(hist.head())

This returns a DataFrame indexed by date with columns for Open, High, Low, Close, Volume, Dividends, and Stock Splits. The data is adjusted for splits and dividends by default.

Period Parameter

The period parameter accepts the following values: 1d, 5d, 1mo, 3mo, 6mo, 1y, 2y, 5y, 10y, ytd, and max.

Interval Parameter

The interval parameter controls the granularity of the data. Available intervals are: 1m, 2m, 5m, 15m, 30m, 60m, 90m, 1h, 1d, 5d, 1wk, 1mo, 3mo.

# Get 5-minute bars for the last 5 days
hist_intraday = ticker.history(period="5d", interval="5m")

# Get weekly bars for the last 2 years
hist_weekly = ticker.history(period="2y", interval="1wk")

Intraday data availability is limited. Yahoo Finance only retains intraday data (intervals shorter than 1 day) for a limited period. 1-minute data is available for roughly the last 30 days, and a single request can cover only about a week of it. 5-minute, 15-minute, and 30-minute data is available for approximately the last 60 days, while hourly data (60m or 1h) typically reaches back about 730 days. These limits are set by Yahoo and can change without notice. If you need longer intraday history, you must collect and store it yourself.
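Since longer intraday history has to be accumulated locally, one possible approach is to merge each fresh download into a stored DataFrame and drop the overlapping bars. The sketch below assumes both frames are indexed by timestamp, as .history() returns them. The function name merge_bars and the sample bars are illustrative and not part of yfinance.

```python
import pandas as pd

def merge_bars(stored: pd.DataFrame, fresh: pd.DataFrame) -> pd.DataFrame:
    """Combine previously saved bars with newly downloaded ones.

    When a timestamp appears in both frames, the fresh row is kept,
    on the assumption that the latest download is the most accurate.
    """
    combined = pd.concat([stored, fresh])
    combined = combined[~combined.index.duplicated(keep="last")]
    return combined.sort_index()

# Synthetic 5-minute bars standing in for two overlapping downloads
idx_old = pd.date_range("2024-03-01 09:30", periods=3, freq="5min", tz="America/New_York")
idx_new = pd.date_range("2024-03-01 09:40", periods=3, freq="5min", tz="America/New_York")
stored = pd.DataFrame({"Close": [100.0, 100.5, 101.0]}, index=idx_old)
fresh = pd.DataFrame({"Close": [101.2, 101.5, 102.0]}, index=idx_new)

merged = merge_bars(stored, fresh)
print(merged)
```

In practice the merged frame would be written back to disk (for example with to_parquet or to_csv) after each run.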

Custom Date Ranges

Instead of using period, you can specify exact start and end dates:

hist = ticker.history(start="2024-01-01", end="2024-12-31")

# Or using datetime objects
from datetime import datetime
hist = ticker.history(
    start=datetime(2024, 1, 1),
    end=datetime(2024, 12, 31)
)

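When using start and end, do not also pass period; the two are alternative ways of defining the range. How a conflict is resolved has differed between yfinance releases, so do not rely on either one taking precedence.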
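One detail worth knowing when choosing dates: yfinance documents end as exclusive, so end="2024-12-31" does not return the bar dated December 31. A small helper can shift the end date forward by one day. The name inclusive_end is made up for this example.

```python
from datetime import date, timedelta

def inclusive_end(last_day: str) -> str:
    """Return the `end` value that makes `last_day` the final date included."""
    return (date.fromisoformat(last_day) + timedelta(days=1)).isoformat()

end_value = inclusive_end("2024-12-31")
print(end_value)  # 2025-01-01
```

The result can then be passed as ticker.history(start="2024-01-01", end=end_value).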

Downloading Multiple Tickers

The yf.download() function allows you to download data for multiple tickers in a single call:

import yfinance as yf

data = yf.download(
    ["AAPL", "MSFT", "GOOGL"],
    period="1y",
    interval="1d"
)

print(data.head())

The result is a DataFrame with a MultiIndex on the columns -- the first level is the data field (Open, High, Low, Close, Volume) and the second level is the ticker symbol. To access closing prices for a specific ticker:

# Access AAPL closing prices
aapl_close = data["Close"]["AAPL"]

# Access all closing prices (DataFrame with one column per ticker)
all_close = data["Close"]

The download() function is usually faster than creating individual Ticker objects in a loop because it fetches the tickers concurrently (it is multithreaded by default) rather than one at a time. For downloading the same data type (e.g., daily OHLCV) for many tickers, download() is the preferred approach.
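To make the two-level column layout concrete, the sketch below builds a small synthetic frame shaped like a multi-ticker download (field on the first level, ticker on the second) and shows two common selections. The prices and volumes are invented, and no network call is made.

```python
import pandas as pd

# Synthetic stand-in for the frame returned by yf.download() for two tickers
dates = pd.date_range("2024-01-02", periods=3, freq="B")
columns = pd.MultiIndex.from_product([["Close", "Volume"], ["AAPL", "MSFT"]])
data = pd.DataFrame(
    [[185.0, 370.0, 1000, 2000],
     [186.85, 366.3, 1100, 2100],
     [184.0, 368.0, 1200, 2200]],
    index=dates,
    columns=columns,
)

# All fields for one ticker: take a cross-section on the second column level
aapl = data.xs("AAPL", axis=1, level=1)

# Daily percentage returns for every ticker at once
returns = data["Close"].pct_change().dropna()

print(aapl)
print(returns)
```

Selecting with xs() avoids chained indexing when you want every field for a single symbol.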

Company Information

The .info property returns a dictionary containing a wide range of company information:

ticker = yf.Ticker("AAPL")
info = ticker.info

print(info["marketCap"])        # Market capitalization
print(info["sector"])           # Sector (Yahoo's own classification, not GICS)
print(info["industry"])         # Industry classification
print(info["fullTimeEmployees"])  # Employee count
print(info["forwardPE"])        # Forward P/E ratio
print(info["dividendYield"])    # Dividend yield
print(info["fiftyTwoWeekHigh"]) # 52-week high
print(info["fiftyTwoWeekLow"])  # 52-week low

The .info dictionary contains dozens of fields. Not all fields are available for every ticker -- some may be None or missing entirely. Common fields include marketCap, sector, industry, trailingPE, forwardPE, dividendYield, beta, fiftyTwoWeekHigh, fiftyTwoWeekLow, averageVolume, shortRatio, and recommendationKey.

The .info call can be slow. The .info property makes a web request to Yahoo Finance and parses the response, which can take 1-3 seconds. If you need info for many tickers, consider batching your requests and adding delays to avoid rate limiting.
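Because some fields may be None or absent, indexing with info["key"] can raise a KeyError for certain tickers. A minimal sketch of a defensive accessor follows. The helper name pick_fields and the sample dictionary are hypothetical, and the dictionary only imitates what .info might return.

```python
def pick_fields(info: dict, fields: list[str], default=None) -> dict:
    """Extract selected keys from an .info-style dict without raising KeyError."""
    return {name: info.get(name, default) for name in fields}

# Hypothetical .info payload: forwardPE is missing, dividendYield is None
sample_info = {
    "marketCap": 3_000_000_000_000,
    "sector": "Technology",
    "dividendYield": None,
}

summary = pick_fields(sample_info, ["marketCap", "sector", "forwardPE", "dividendYield"])
print(summary)
```

With a real ticker, sample_info would simply be replaced by ticker.info.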

Financial Statements

yfinance provides access to the three primary financial statements:

ticker = yf.Ticker("AAPL")

# Income statement (annual)
income_stmt = ticker.income_stmt

# Balance sheet (annual)
balance_sheet = ticker.balance_sheet

# Cash flow statement (annual)
cashflow = ticker.cashflow

Each returns a pandas DataFrame with dates as columns and line items as rows. For quarterly statements, use the quarterly variants:

# Quarterly income statement
quarterly_income = ticker.quarterly_income_stmt

# Quarterly balance sheet
quarterly_bs = ticker.quarterly_balance_sheet

# Quarterly cash flow
quarterly_cf = ticker.quarterly_cashflow

The row labels in the financial statements are standardized by Yahoo Finance. For example, the income statement includes rows like "Total Revenue", "Cost Of Revenue", "Gross Profit", "Operating Income", "Net Income", and others. The exact set of rows can vary slightly between companies.
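As an illustration of working with this layout (line items as rows, period end dates as columns), the sketch below computes gross margin per period and fails loudly if an expected row is absent. The figures are made up, and gross_margin is a helper written for this example.

```python
import pandas as pd

def gross_margin(income_stmt: pd.DataFrame) -> pd.Series:
    """Gross profit as a fraction of revenue for each reporting period."""
    required = ["Total Revenue", "Gross Profit"]
    missing = [row for row in required if row not in income_stmt.index]
    if missing:
        raise KeyError(f"Income statement lacks rows: {missing}")
    return income_stmt.loc["Gross Profit"] / income_stmt.loc["Total Revenue"]

# Invented figures laid out like ticker.income_stmt
periods = pd.to_datetime(["2023-09-30", "2022-09-30"])
stmt = pd.DataFrame(
    {periods[0]: [400.0, 220.0, 180.0], periods[1]: [380.0, 215.0, 165.0]},
    index=["Total Revenue", "Cost Of Revenue", "Gross Profit"],
)

margins = gross_margin(stmt)
print(margins)
```

Checking for the rows first matters because labels are not guaranteed to be identical for every company.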

Earnings Data

yfinance provides earnings-related data through several properties:

ticker = yf.Ticker("AAPL")

# Upcoming and recent earnings dates with EPS estimates and actuals
earnings_dates = ticker.earnings_dates
print(earnings_dates)

The earnings_dates property returns a DataFrame with columns for EPS Estimate, Reported EPS, and Surprise (%). This includes both historical earnings and upcoming estimated dates. It is useful for identifying earnings surprises and tracking the pattern of beats vs. misses.
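A rough sketch of tallying beats and misses is shown below. It assumes the frame has "EPS Estimate" and "Reported EPS" columns as described above and skips rows where either is missing, such as upcoming dates. The numbers are invented, and beat_miss_counts is a name chosen for this example.

```python
import pandas as pd

def beat_miss_counts(earnings: pd.DataFrame) -> dict:
    """Count beats, misses, and in-line quarters among reported results."""
    reported = earnings.dropna(subset=["EPS Estimate", "Reported EPS"])
    diff = reported["Reported EPS"] - reported["EPS Estimate"]
    return {
        "beats": int((diff > 0).sum()),
        "misses": int((diff < 0).sum()),
        "in_line": int((diff == 0).sum()),
    }

# Invented values; the NaN row mimics an upcoming date with no reported EPS yet
earnings = pd.DataFrame(
    {
        "EPS Estimate": [1.60, 1.50, 1.40, 1.30],
        "Reported EPS": [float("nan"), 1.55, 1.38, 1.30],
    },
    index=pd.to_datetime(["2024-08-01", "2024-05-02", "2024-02-01", "2023-11-02"]),
)

counts = beat_miss_counts(earnings)
print(counts)
```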

Options Data

yfinance can retrieve options chain data for any optionable stock:

ticker = yf.Ticker("AAPL")

# Get available expiration dates
expirations = ticker.options
print(expirations)  # Tuple of date strings like ('2024-01-19', '2024-01-26', ...)

Once you have the expiration dates, you can retrieve the full options chain for a specific date:

# Get options chain for the first available expiration
chain = ticker.option_chain(expirations[0])

# chain.calls is a DataFrame of call options
print(chain.calls.head())

# chain.puts is a DataFrame of put options
print(chain.puts.head())

The calls and puts DataFrames include columns for contractSymbol, lastTradeDate, strike, lastPrice, bid, ask, change, percentChange, volume, openInterest, and impliedVolatility.
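Raw chains often contain stale or thinly traded contracts, so a common first step is to compute a mid price and filter on quote quality. The sketch below works on a synthetic frame that uses the bid, ask, strike, and openInterest column names listed above. The thresholds are arbitrary illustrations, not recommendations.

```python
import pandas as pd

def liquid_contracts(chain_side: pd.DataFrame, min_open_interest: int = 100,
                     max_spread_pct: float = 0.10) -> pd.DataFrame:
    """Add mid price and relative spread, then keep only tightly quoted contracts."""
    out = chain_side.copy()
    out["mid"] = (out["bid"] + out["ask"]) / 2
    out["spread_pct"] = (out["ask"] - out["bid"]) / out["mid"]
    keep = (
        (out["bid"] > 0)
        & (out["openInterest"] >= min_open_interest)
        & (out["spread_pct"] <= max_spread_pct)
    )
    return out[keep]

# Invented quotes shaped like a slice of chain.calls
calls = pd.DataFrame({
    "strike": [180.0, 190.0, 200.0],
    "bid": [10.0, 4.0, 0.0],
    "ask": [10.4, 5.0, 0.5],
    "openInterest": [1500, 800, 20],
})

filtered = liquid_contracts(calls)
print(filtered)
```

With live data the same function could be applied to chain.calls or chain.puts.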

Insider Transactions

yfinance can retrieve insider transaction data:

ticker = yf.Ticker("AAPL")

# Get insider transactions
insider_txns = ticker.insider_transactions
print(insider_txns)

This returns a DataFrame with columns including the insider's name, title (e.g., "Chief Executive Officer"), transaction type (e.g., "Sales", "Purchases"), date, number of shares, and value. This data comes from SEC Form 4 filings and provides a view of what corporate insiders are doing with their own company's stock.

For institutional holder data:

# Top institutional holders
inst_holders = ticker.institutional_holders
print(inst_holders)

# Top mutual fund holders
mf_holders = ticker.mutualfund_holders
print(mf_holders)

Rate Limiting and Best Practices

Yahoo Finance throttles aggressive requests. If you make too many requests too quickly, you will start getting errors, empty responses, or temporary IP blocks. There is no officially documented rate limit, but in practice, keeping your request rate under 2 requests per second is advisable.

import yfinance as yf
import time

tickers = ["AAPL", "MSFT", "GOOGL", "AMZN", "META",
           "TSLA", "NVDA", "JPM", "V", "JNJ"]

results = {}
for symbol in tickers:
    ticker = yf.Ticker(symbol)
    results[symbol] = ticker.history(period="6mo")
    time.sleep(0.5)  # Pause between requests

Do not hammer Yahoo Finance. If you are downloading data for hundreds of tickers, use yf.download() with a list of tickers rather than looping with individual Ticker objects. For large-scale data collection, consider using yf.download() in batches of 50-100 tickers with pauses between batches. If you are building a production system that needs reliable, high-volume data access, consider a paid data provider.
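The batching advice can be sketched as a small helper that splits the symbol list into chunks, pauses between chunks, and waits longer after each failed attempt. The function names, the default pause, and the retry count are assumptions made for illustration. The fetch argument is any callable that takes a list of symbols.

```python
import time

def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def download_in_batches(symbols, fetch, batch_size=50, pause=2.0,
                        retries=3, sleep=time.sleep):
    """Call fetch(batch) per batch, pausing between batches and backing off on errors."""
    results = []
    for batch in chunked(symbols, batch_size):
        for attempt in range(retries):
            try:
                results.append(fetch(batch))
                break
            except Exception:  # a real script would catch narrower error types
                if attempt == retries - 1:
                    raise
                sleep(pause * 2 ** attempt)  # wait longer after each failure
        sleep(pause)  # pause before moving on to the next batch
    return results

# Stand-in fetcher so the example runs offline
fake_fetch = lambda batch: {symbol: "ok" for symbol in batch}
frames = download_in_batches(["AAPL", "MSFT", "GOOGL"], fake_fetch,
                             batch_size=2, sleep=lambda seconds: None)
print(frames)
```

With yfinance, fetch could be something like lambda batch: yf.download(batch, period="6mo"), leaving the pacing logic unchanged.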

Adjusted vs. Unadjusted Prices

By default, yfinance returns prices adjusted for both stock splits and dividends. This means the historical prices are modified so that the price series is continuous -- a 2-for-1 stock split, for example, will halve all historical prices before the split date so that returns calculated from the series are accurate.

The auto_adjust parameter controls this behavior:

# Default: adjusted prices (auto_adjust=True)
hist = ticker.history(period="1y")

# Prices without the dividend adjustment
hist_raw = ticker.history(period="1y", auto_adjust=False)

When auto_adjust=False, the DataFrame includes both a "Close" column and an "Adj Close" column. Yahoo's "Close" is already adjusted for stock splits but not for dividends, while "Adj Close" is adjusted for both. When auto_adjust=True (the default), only the fully adjusted values are returned in the standard OHLC columns.

For most quantitative analysis (calculating returns, computing moving averages, building backtests), you want adjusted prices. The "Close" column from auto_adjust=False is useful when you need a price close to what actually traded on a given day (e.g., for verifying a trade execution or for display purposes); it matches the traded price for dates after the most recent split.
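To see how the two column sets relate, the sketch below rebuilds adjusted OHLC values from an auto_adjust=False style frame by scaling each bar by the ratio of "Adj Close" to "Close". It illustrates the general idea of back-adjustment and is not claimed to reproduce yfinance's internal code exactly. The prices are invented.

```python
import pandas as pd

def adjust_ohlc(raw: pd.DataFrame) -> pd.DataFrame:
    """Scale Open/High/Low by Adj Close / Close and replace Close with Adj Close."""
    factor = raw["Adj Close"] / raw["Close"]
    out = raw.copy()
    for col in ["Open", "High", "Low"]:
        out[col] = raw[col] * factor
    out["Close"] = raw["Adj Close"]
    return out.drop(columns=["Adj Close"])

# Invented bars; the first row precedes a dividend, so its factor is below 1
raw = pd.DataFrame(
    {
        "Open": [100.0, 102.0],
        "High": [104.0, 103.0],
        "Low": [99.0, 100.0],
        "Close": [102.0, 101.0],
        "Adj Close": [96.9, 101.0],
    },
    index=pd.to_datetime(["2024-02-08", "2024-02-09"]),
)

adjusted = adjust_ohlc(raw)
print(adjusted)
```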

Timezone Handling

The DataFrame index returned by .history() is timezone-aware. For U.S. stocks, the timezone is typically America/New_York. For daily data, the index contains dates with the timezone set to the exchange's timezone.

hist = ticker.history(period="5d")
print(hist.index.tz)  # America/New_York

If you need to work with timezone-naive timestamps (e.g., for compatibility with other data sources), you can remove the timezone:

# Remove timezone information
hist.index = hist.index.tz_localize(None)

For intraday data, timestamps include both date and time, and the timezone becomes more important. Be careful when comparing intraday data from different sources -- ensure the timezones match before merging or aligning data.
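One way to align intraday data from sources that use different timezones is to convert both indexes to UTC before joining. The sketch below uses synthetic hourly bars; other_bars stands in for a hypothetical second data source.

```python
import pandas as pd

# Synthetic hourly bars: one series stamped in New York time, one in UTC
ny_idx = pd.date_range("2024-03-01 09:30", periods=3, freq="60min", tz="America/New_York")
utc_idx = pd.date_range("2024-03-01 14:30", periods=3, freq="60min", tz="UTC")
yahoo_bars = pd.DataFrame({"Close": [100.0, 101.0, 102.0]}, index=ny_idx)
other_bars = pd.DataFrame({"OtherClose": [100.1, 100.9, 102.2]}, index=utc_idx)

# Convert (rather than localize) so each timestamp still refers to the same instant
yahoo_utc = yahoo_bars.copy()
yahoo_utc.index = yahoo_utc.index.tz_convert("UTC")

# Inner join keeps only the instants present in both sources
aligned = yahoo_utc.join(other_bars, how="inner")
print(aligned)
```

Note the difference from tz_localize(None), which drops the timezone label without shifting the clock time.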

Computing Technical Indicators

Since yfinance returns pandas DataFrames, you can compute technical indicators directly using pandas operations:

import yfinance as yf
import pandas as pd

ticker = yf.Ticker("AAPL")
df = ticker.history(period="1y")

# Simple Moving Averages
df["SMA_50"] = df["Close"].rolling(window=50).mean()
df["SMA_200"] = df["Close"].rolling(window=200).mean()

# RSI (14-period)
delta = df["Close"].diff()
gain = delta.where(delta > 0, 0.0)
loss = -delta.where(delta < 0, 0.0)
avg_gain = gain.ewm(alpha=1/14, min_periods=14).mean()
avg_loss = loss.ewm(alpha=1/14, min_periods=14).mean()
rs = avg_gain / avg_loss
df["RSI_14"] = 100 - (100 / (1 + rs))

# Average True Range (ATR, 14-period)
high_low = df["High"] - df["Low"]
high_close = (df["High"] - df["Close"].shift()).abs()
low_close = (df["Low"] - df["Close"].shift()).abs()
true_range = pd.concat([high_low, high_close, low_close], axis=1).max(axis=1)
df["ATR_14"] = true_range.rolling(window=14).mean()

print(df[["Close", "SMA_50", "SMA_200", "RSI_14", "ATR_14"]].tail())

Common Pitfalls

Missing Data for Delisted Stocks

yfinance cannot retrieve data for stocks that have been delisted from their exchange. If a company was acquired, went bankrupt, or was otherwise removed from trading, its ticker may return empty data or an error. This creates survivorship bias in any analysis that only uses currently available tickers. For rigorous backtesting, you need a data source that includes delisted securities.

Data Gaps and Errors

Yahoo Finance data occasionally has gaps (missing days), incorrect values, or duplicate entries. These are rare for large-cap stocks but more common for small-cap stocks, ADRs, and ETFs. Always validate your data before relying on it for trading decisions. Simple sanity checks, such as looking for NaN values, zero volumes on trading days, and extreme price jumps, can catch most issues.

# Basic data quality checks
print(f"Missing values:\n{df.isnull().sum()}")
print(f"Zero volume days: {(df['Volume'] == 0).sum()}")
print(f"Date range: {df.index.min()} to {df.index.max()}")
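The checks above can be extended to cover duplicate timestamps and extreme price jumps. The following is a sketch run on synthetic data. The 50% jump threshold is an arbitrary assumption, and quality_report is a helper name invented for this example.

```python
import pandas as pd

def quality_report(df: pd.DataFrame, max_daily_move: float = 0.5) -> dict:
    """Summarize common data problems in an OHLCV frame."""
    moves = df["Close"].pct_change().abs()
    return {
        "duplicate_timestamps": int(df.index.duplicated().sum()),
        "rows_with_nan": int(df.isna().any(axis=1).sum()),
        "zero_volume_rows": int((df["Volume"] == 0).sum()),
        "extreme_moves": int((moves > max_daily_move).sum()),
        "high_below_low": int((df["High"] < df["Low"]).sum()),
    }

# Synthetic bars with a duplicated date, zero volume, and a price spike
sample = pd.DataFrame(
    {
        "High": [101.0, 102.0, 102.0, 251.0, 103.0],
        "Low": [99.0, 100.0, 100.0, 249.0, 101.0],
        "Close": [100.0, 101.0, 101.0, 250.0, 102.0],
        "Volume": [1000, 0, 0, 1200, 1300],
    },
    index=pd.to_datetime(
        ["2024-01-02", "2024-01-03", "2024-01-03", "2024-01-04", "2024-01-05"]
    ),
)

report = quality_report(sample)
print(report)
```

A spike and its reversal both count as extreme moves, which is why a single bad print tends to show up twice.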

The Library Can Break

Because yfinance relies on scraping Yahoo Finance's web endpoints rather than a stable API, it can break when Yahoo changes its website or data format. This has happened multiple times in the library's history. The maintainers typically release a fix within days, but if you are running a production system that depends on yfinance, you should pin your version and test before upgrading.

Not Suitable for Production Trading Systems

yfinance is excellent for research, prototyping, and personal projects. It is not suitable as the sole data source for a production trading system that manages real capital. The reasons: no SLA (service level agreement), no guaranteed uptime, potential for data errors, rate limiting, and the possibility that the library could stop working if Yahoo Finance makes breaking changes. For production systems, paid data providers (such as Polygon.io, Alpha Vantage, or Interactive Brokers' market data) provide more reliable, higher-quality data with contractual guarantees.

yfinance for prototyping, paid data for production. A common and sensible workflow is to use yfinance during development and backtesting, then switch to a paid data provider when moving to live trading. The pandas DataFrame interface means your analysis code barely needs to change -- you just swap the data source.
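One way to keep the data source swappable is to hide it behind a small interface, so the analysis code depends only on receiving a DataFrame with the expected columns. The sketch below is illustrative: the class and method names are invented, and StaticSource merely stands in for a paid provider's client.

```python
from typing import Protocol

import pandas as pd

class PriceSource(Protocol):
    """Anything that can return daily bars with a 'Close' column."""
    def daily_bars(self, symbol: str) -> pd.DataFrame: ...

class YFinanceSource:
    """Thin adapter over yfinance; the import is deferred until first use."""
    def daily_bars(self, symbol: str) -> pd.DataFrame:
        import yfinance as yf
        return yf.Ticker(symbol).history(period="6mo")

class StaticSource:
    """Stand-in for another provider, serving canned frames."""
    def __init__(self, frames: dict):
        self._frames = frames

    def daily_bars(self, symbol: str) -> pd.DataFrame:
        return self._frames[symbol]

def latest_close(source: PriceSource, symbol: str) -> float:
    """Analysis code that does not care where the bars come from."""
    return float(source.daily_bars(symbol)["Close"].iloc[-1])

# Offline demonstration with canned data
demo_source = StaticSource({"AAPL": pd.DataFrame({"Close": [189.5, 190.25]})})
print(latest_close(demo_source, "AAPL"))
```

Switching to live Yahoo data would then mean passing YFinanceSource() instead of demo_source.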

A Complete Working Example

Here is a complete script that downloads data for a list of tickers, computes basic technical indicators, and summarizes the latest close, 50-day SMA, RSI, and volume for each one, including whether the stock is trading above its 50-day SMA:

import yfinance as yf
import pandas as pd
import time

def analyze_ticker(symbol):
    """Download data and compute basic signals for a ticker."""
    ticker = yf.Ticker(symbol)
    df = ticker.history(period="6mo")

    if df.empty or len(df) < 50:
        return None

    # Compute indicators (six months is roughly 125 trading days:
    # enough for a 50-day SMA, but not for a 200-day one)
    df["SMA_50"] = df["Close"].rolling(50).mean()

    delta = df["Close"].diff()
    gain = delta.where(delta > 0, 0.0)
    loss = -delta.where(delta < 0, 0.0)
    avg_gain = gain.ewm(alpha=1/14, min_periods=14).mean()
    avg_loss = loss.ewm(alpha=1/14, min_periods=14).mean()
    rs = avg_gain / avg_loss
    df["RSI"] = 100 - (100 / (1 + rs))

    latest = df.iloc[-1]

    return {
        "symbol": symbol,
        "close": round(latest["Close"], 2),
        "sma_50": round(latest["SMA_50"], 2) if pd.notna(latest["SMA_50"]) else None,
        "rsi": round(latest["RSI"], 1) if pd.notna(latest["RSI"]) else None,
        "above_50sma": latest["Close"] > latest["SMA_50"] if pd.notna(latest["SMA_50"]) else None,
        "volume": int(latest["Volume"]),
        "avg_volume": int(df["Volume"].rolling(20).mean().iloc[-1]),
    }

# Analyze a list of tickers
symbols = ["AAPL", "MSFT", "GOOGL", "AMZN", "NVDA",
           "JPM", "V", "JNJ", "XOM", "PG"]

results = []
for sym in symbols:
    result = analyze_ticker(sym)
    if result:
        results.append(result)
    time.sleep(0.5)

# Display results
df_results = pd.DataFrame(results)
print(df_results.to_string(index=False))

This script demonstrates the typical yfinance workflow: download data, compute indicators, extract signals, and present results. The time.sleep(0.5) between requests reduces the risk of being rate limited.

How Alpha Suite Uses yfinance

Alpha Suite uses yfinance as one of its market data sources to fetch 6-month OHLCV histories for securities identified through SEC EDGAR Form 4 filings. The data is used to compute technical overlays -- moving averages, ATR, volatility, RSI-14, and relative strength versus the S&P 500 -- that feed into the signal scoring pipeline alongside insider conviction scores.

The combination is straightforward: SEC EDGAR provides the insider trading signal (who is buying or selling, how much, and when), and yfinance provides the market data needed to assess the technical context of each signal. A strong insider buying cluster in a stock that is above its 50-day moving average with an RSI in the 40-60 range (not overbought) and rising relative strength scores higher than the same insider buying in a stock that is in a technical downtrend.
