Scraping TSA passenger numbers from a website table with Python

The goal here is to read table data published on a website and analyze it.

The daily security checkpoint throughput data is available on the TSA website in table format. The data set includes numbers going back to 2019, which gives a pre-pandemic baseline for estimating the impact of COVID-19.

Here the table data is extracted from the link below using the Python BeautifulSoup library and stored as time series data in a pandas DataFrame.

https://www.tsa.gov/coronavirus/passenger-throughput

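from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup

url = 'https://www.tsa.gov/coronavirus/passenger-throughput'
req = requests.get(url)
# Stop early if the request failed (for example a 403 or 404 response)
req.raise_for_status()
soup = BeautifulSoup(req.content, 'lxml')
# The throughput numbers are in the first table on the page
table = soup.find_all('table')[0]
# Wrap the HTML in StringIO; newer pandas versions deprecate passing a raw HTML string
df = pd.read_html(StringIO(str(table)))[0]
# Use the date as the index so the x-axis is a time axis
df.set_index('Date', inplace=True)
df.index = pd.to_datetime(df.index)
# Simplify the column names (assumes the table lists 2021, 2020, 2019 in that order)
df.columns = ['2021', '2020', '2019']
df.plot();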
[Figure: TSA passenger numbers scraped from the website table]
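
To put a number on the COVID-19 impact, one option is to express each later year as a percentage of the 2019 baseline. The snippet below is only a minimal sketch of that idea: the `sample` frame holds made-up values (not real TSA figures) laid out like the scraped `df`, and `percent_of_baseline` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Made-up sample values for illustration only; these are NOT real TSA figures.
# The layout mirrors the scraped frame: a date index and one column per year.
sample = pd.DataFrame(
    {
        "2021": [1_000_000, 1_200_000, 900_000],
        "2020": [500_000, 300_000, 450_000],
        "2019": [2_000_000, 2_400_000, 1_800_000],
    },
    index=pd.to_datetime(["2021-03-01", "2021-03-02", "2021-03-03"]),
)
sample.index.name = "Date"


def percent_of_baseline(frame, baseline="2019"):
    """Return every non-baseline column as a percentage of the baseline column."""
    others = frame.drop(columns=[baseline])
    # Divide row by row by the baseline, then scale to percent
    return others.div(frame[baseline], axis=0).mul(100).round(1)


impact = percent_of_baseline(sample)
print(impact)
```

With the real data, calling `percent_of_baseline(df)` should work the same way, assuming the columns were renamed to '2021', '2020', and '2019' as above and contain numeric values.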
