Scrape US Margin Debt report with Python
Goal here is to scrape margin debt from FINRA statistics report.
Margin debt is the debt incurred by brokerage customers who use margin account for trading. It is a form of leverage for the stock trading, and is leading indicator of overprice and low confidence in current market condition.
The latest data can be extracted from link below, while it is difficult to read due to poor formatting and multiple data tables.
Below code scrape multiple table data using python – BeautifulSoup – and plot historical margin dept.
https://www.finra.org/investors/learn-to-invest/advanced-investing/margin-statistics
import pandas as pd import numpy as np from bs4 import BeautifulSoup import pandas as pd import requests import datetime MG_DEBT= pd.DataFrame() url="https://www.finra.org/investors/learn-to-invest/advanced-investing/margin-statistics" html_content = requests.get(url).text soup = BeautifulSoup(html_content, "lxml") tables = soup.find_all('table') for table in tables: rows = table.findAll('tr') temp = pd.read_html(str(table))[0] MG_DEBT = MG_DEBT.append(pd.read_html(str(table))[0]) MG_DEBT.columns = ['Date', 'Debt', '2', '3', '4', '5', '6', '7', '8', 'Debt', '10'] MG_DEBTA = MG_DEBT.iloc[:,[0, 1]] MG_DEBTB = MG_DEBT.iloc[:,[0, 9]] MG_DEBTA = MG_DEBTA.dropna() MG_DEBTB = MG_DEBTB.dropna() MG_DEBTA["Date"] = MG_DEBTA["Date"].str.replace('June','Jun') MG_DEBTA["Date"] = MG_DEBTA["Date"].str.replace('July','Jul') MG_DEBTA["Date"] = MG_DEBTA["Date"].str.replace('Sept','Sep') MG_DEBTB["Date"] = MG_DEBTB["Date"].str.replace('June','Jun') MG_DEBTB["Date"] = MG_DEBTB["Date"].str.replace('July','Jul') MG_DEBTB["Date"] = MG_DEBTB["Date"].str.replace('Sept','Sep') conv = lambda x: datetime.datetime.strptime(x, "%b-%y") MG_DEBTA["Date"] = MG_DEBTA["Date"].apply(conv) MG_DEBTB["Date"] = MG_DEBTB["Date"].apply(conv) MG_DEBTA.set_index('Date', inplace=True) MG_DEBTB.set_index('Date', inplace=True) MG_DEBT = pd.concat([MG_DEBTA,MG_DEBTB]) MG_DEBT = MG_DEBT.sort_values(by=['Date'], ascending=False) MG_DEBT.plot() # %%