Scrape TSA passenger numbers from website table data using a Python scraper
The goal is to read table data published on a website and analyze it.
Daily security checkpoint throughput is published on the TSA website in table format. The dataset includes 2019 figures, which serve as a pre-pandemic baseline for estimating the impact of COVID-19.
The table is extracted from the link below using the Python BeautifulSoup library and stored as time series data in a pandas DataFrame.
https://www.tsa.gov/coronavirus/passenger-throughput
from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup

url = 'https://www.tsa.gov/coronavirus/passenger-throughput'
req = requests.get(url)
print(req.ok)  # True if the request succeeded

soup = BeautifulSoup(req.content, 'lxml')
table = soup.find_all('table')[0]
# Wrap the HTML in StringIO; recent pandas versions deprecate passing a literal HTML string
df = pd.read_html(StringIO(str(table)))[0]

# Set x-axis to date
df.set_index('Date', inplace=True)
df.index = pd.to_datetime(df.index)

# Simplify names
df.columns = ['2021', '2020', '2019']
df.plot();
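Since the stated aim is to estimate the impact of COVID-19 against the 2019 figures, one possible follow-up step is to express each later year as a percentage of the 2019 baseline. The sketch below is only an illustration: it assumes a DataFrame with the same three column names as above, the helper name `percent_of_baseline` is hypothetical, and the numbers are made-up placeholders, not real TSA data.

```python
import pandas as pd

# Illustrative, made-up values (NOT real TSA figures), shaped like the scraped table:
# one row per date, one column per year.
df = pd.DataFrame(
    {
        "2021": [1000.0, 1200.0, None],  # None marks a date with no 2021 value yet
        "2020": [500.0, 300.0, 400.0],
        "2019": [2000.0, 2400.0, 1600.0],
    },
    index=pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-03"]),
)
df.index.name = "Date"


def percent_of_baseline(frame, baseline="2019"):
    """Return every non-baseline column as a percentage of the baseline column."""
    others = frame.drop(columns=baseline)
    # Divide row by row (axis=0 aligns on the date index), then scale to percent.
    return others.div(frame[baseline], axis=0).mul(100).round(1)


recovery = percent_of_baseline(df)
print(recovery)
```

Missing values stay missing in the result, so dates without data for a given year do not distort the comparison. Calling `recovery.plot()` would chart the recovery relative to 2019 in the same way as the raw counts.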