YfinanceTickers
Scrapes ticker lists from public sources (S&P 500, Russell 3000, NASDAQ).
Component Separation
This class handles ticker list scraping only. For data scraping, see YfinancePipeline; for validation, see YfinanceValidation.
Sources
| Source | Method | Count | Cleaning |
|---|---|---|---|
| S&P 500 | Wikipedia table scrape | ~500 | Replace dots with dashes (BRK.B → BRK-B) |
| Russell 3000 | iShares IWV ETF CSV | ~3000 | Remove CASH, USD, symbols >5 chars |
| NASDAQ | NASDAQ Trader official list | ~3000+ | Remove test issues |
How It Works
S&P 500: Pulls the Symbol column from the table on Wikipedia's "List of S&P 500 companies" page and converts dots to dashes for yfinance compatibility.
Example: BRK.B → BRK-B. yfinance uses dashes for class shares, not dots.
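The dot-to-dash conversion can be sketched as a small pure function. This is a minimal illustration, not the code in yfinance_tickers.py: the name `normalize_for_yfinance` is hypothetical, and stripping whitespace and skipping empty entries are assumptions added for robustness.

```python
from typing import Iterable, List


def normalize_for_yfinance(symbols: Iterable[str]) -> List[str]:
    """Convert dot-style share-class symbols (BRK.B) to dash style (BRK-B)."""
    cleaned = []
    for symbol in symbols:
        # Assumption: surrounding whitespace is noise from the scraped table.
        symbol = str(symbol).strip()
        if symbol:
            cleaned.append(symbol.replace(".", "-"))
    return cleaned


print(normalize_for_yfinance(["AAPL", "BRK.B", "BF.B", " MSFT "]))
# ['AAPL', 'BRK-B', 'BF-B', 'MSFT']
```

Keeping the cleaning step separate from the download makes it testable without hitting Wikipedia.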
Russell 3000: Downloads the iShares IWV ETF holdings CSV. The file starts with metadata rows, so the parser scans for the "Ticker" header before reading data, then filters out CASH and USD positions and symbols longer than 5 characters.
CSV structure: the first ~10 rows are metadata; scan for "Ticker" before parsing.
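The header scan and filtering might look like the sketch below. It works on CSV text already in memory, so no download is involved. The function name `parse_holdings_csv`, the sample rows, and the choice to match CASH and USD as exact symbols are all assumptions made for illustration; the real file layout may differ.

```python
import csv
import io
from typing import List

EXCLUDED = {"CASH", "USD"}  # assumption: matched as exact symbols


def parse_holdings_csv(text: str) -> List[str]:
    """Find the row containing a "Ticker" header, then collect clean symbols."""
    rows = list(csv.reader(io.StringIO(text)))
    header_idx, col = None, None
    for i, row in enumerate(rows):
        cells = [cell.strip() for cell in row]
        if "Ticker" in cells:
            header_idx, col = i, cells.index("Ticker")
            break
    if header_idx is None:
        raise ValueError("Ticker column not found in CSV")

    tickers = []
    for row in rows[header_idx + 1:]:
        if len(row) <= col:
            continue  # blank or short rows (e.g. footer text)
        symbol = row[col].strip()
        if not symbol or symbol in EXCLUDED or len(symbol) > 5:
            continue
        tickers.append(symbol)
    return tickers


# Made-up sample: two metadata rows, a blank line, then the real header.
SAMPLE_CSV = "\n".join([
    "Example Fund Holdings",
    "Some metadata,value",
    "",
    "Ticker,Name,Sector",
    "AAPL,Example Company A,Technology",
    "USD,USD CASH,Cash",
    "CASH,Cash Collateral,Cash",
    "TOOLONG,Symbol Longer Than Five Chars,Other",
    "MSFT,Example Company B,Technology",
])

print(parse_holdings_csv(SAMPLE_CSV))
# ['AAPL', 'MSFT']
```

Scanning for the header by name rather than skipping a fixed number of rows keeps the parser working if the number of metadata rows changes.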
NASDAQ: Uses the official NASDAQ Trader pipe-delimited file and filters out test symbols using the "Test Issue" flag.
Official source: direct from NASDAQ, updated daily.
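A rough sketch of the "Test Issue" filter on pipe-delimited text follows. The function name `parse_nasdaq_listing` and the sample rows are hypothetical. The code also assumes that the symbol column is named "Symbol", that the flag uses "N" for real issues, and that the file may end with a trailer line; check these against the actual file before relying on them.

```python
import csv
import io
from typing import List


def parse_nasdaq_listing(text: str) -> List[str]:
    """Keep symbols whose "Test Issue" flag is "N" (assumed non-test marker)."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    tickers = []
    for row in reader:
        symbol = (row.get("Symbol") or "").strip()
        flag = (row.get("Test Issue") or "").strip()
        # Requiring flag == "N" also drops trailer rows with an empty flag.
        if symbol and flag == "N":
            tickers.append(symbol)
    return tickers


# Made-up sample in the assumed layout, including a trailer line.
SAMPLE_LISTING = "\n".join([
    "Symbol|Security Name|Test Issue",
    "AAPL|Example Company A - Common Stock|N",
    "ZZZZT|Hypothetical Test Security|Y",
    "MSFT|Example Company B - Common Stock|N",
    "File Creation Time: (timestamp)||",
])

print(parse_nasdaq_listing(SAMPLE_LISTING))
# ['AAPL', 'MSFT']
```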
API Reference
Ticker list scraping from public sources.
- Scrapes S&P 500, Russell 3000, and NASDAQ ticker lists
- Validates and cleans ticker symbols
Source code in data_pipeline/sec_data_pipeline/yfinance/yfinance_tickers.py
scrape_nasdaq_tickers()
Scrapes NASDAQ-listed tickers from the official NASDAQ Trader symbol directory.
The file is pipe-delimited with a "Test Issue" flag that we use to filter out test symbols.
Returns:
| Type | Description |
|---|---|
| `List[str]` | List of ~3000+ NASDAQ-listed tickers |
Source code in data_pipeline/sec_data_pipeline/yfinance/yfinance_tickers.py
scrape_russell3000_tickers()
Scrapes Russell 3000 tickers from iShares IWV ETF holdings CSV.
The CSV has metadata rows at the top, so we scan for the "Ticker" header before parsing. Filters out CASH, USD positions, and symbols >5 characters.
Returns:
| Type | Description |
|---|---|
| `List[str]` | List of ~3000 US equity tickers |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If Ticker column not found in CSV |
Source code in data_pipeline/sec_data_pipeline/yfinance/yfinance_tickers.py
scrape_sp500_tickers()
Scrapes S&P 500 constituents from Wikipedia's List of S&P 500 companies.
Pulls the Symbol column from the first table and converts dots to dashes for yfinance compatibility (BRK.B → BRK-B).
Returns:
| Type | Description |
|---|---|
| `List[str]` | List of ~500 ticker symbols |
Source code in data_pipeline/sec_data_pipeline/yfinance/yfinance_tickers.py