Air travel is one of the most common modes of transportation in the US. Compared with countries like Japan and China, which have high population densities, the US population is sparsely distributed, which makes a high-speed rail network hard to justify. At the same time, the US covers a huge land area, spanning from the Pacific Ocean to the Atlantic Ocean, which makes long-range transportation a necessity. Airplanes have therefore long played an essential role in long-distance travel in the US.
Because air travel is so widely used, the US Department of Transportation publishes large-scale data sets for data scientists to retrieve and analyze. These data sets, which usually contain millions of records, cover a variety of commercial aviation topics, including departure records, airfare records, flight delay records, and airline financial reviews. Valuable information can be discovered from them and turned into useful suggestions for stakeholders.
With this motivation, this project series uses several data sets to answer questions about the US aviation industry. It is divided into two parts.
I have been hearing from famous aviation YouTubers like Sam-Chui that the aviation industry has been going through its toughest years since the COVID pandemic began, but I have never looked at the details myself. Rather than simply taking the claim on trust, I would like to check it against the data. As a crazy aviation fan, I also want to gain some insight into the current state of the industry.
In this project, I analyze the state of the US commercial aviation industry after 1990. I ask myself the following questions:
Did the COVID pandemic cause any significant structural change in the US commercial aviation industry?
a. Is there a significant change in the top international destinations for flights departing from the US?
b. Is there a significant change in the ranking of US airports for flights departing from the US?
c. Is there a significant change in the top airline carriers for flights departing from the US?
I hope that answering these questions will help me understand the aviation industry better and, if possible, let me offer some naive suggestions to airline carriers to help them get through this tough period.
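The ranking questions above all have the same shape: rank some entity (destination, airport, or carrier) by departures before and after the pandemic, then look at how the ranks moved. The following is a minimal sketch of that comparison with Pandas, not the notebook's actual code. The column names (`year`, `dest`, `flights`), the 2020 cutoff, the helper `rank_by_period`, and all the numbers are assumptions made up for illustration; the real International Departures schema may differ.

```python
import pandas as pd

# Toy records: every value here is invented purely for illustration.
# Column names are hypothetical, not the real DOT schema.
records = pd.DataFrame({
    "year":    [2019, 2019, 2019, 2021, 2021, 2021],
    "dest":    ["YYZ", "LHR", "CUN", "YYZ", "LHR", "CUN"],
    "flights": [900, 800, 700, 300, 200, 650],
})


def rank_by_period(df, key, cutoff_year=2020):
    """Rank `key` by total flights before and from `cutoff_year` onward."""
    # Label each record as pre- or post-cutoff.
    period = (df["year"] < cutoff_year).map({True: "pre", False: "post"})
    # Total flights per (period, key), one column per period.
    totals = (
        df.assign(period=period)
        .groupby(["period", key])["flights"]
        .sum()
        .unstack("period", fill_value=0)
    )
    # Rank 1 is the busiest entity in each period.
    ranks = totals.rank(ascending=False, method="min").astype(int)
    # Positive shift means the entity moved up the ranking after the cutoff.
    ranks["shift"] = ranks["pre"] - ranks["post"]
    return ranks[["pre", "post", "shift"]].sort_values("post")


print(rank_by_period(records, "dest"))
```

Passing a different `key` column (an airport or carrier code, if the data has one) would reuse the same helper for questions b and c. Whether a shift counts as "significant" still needs a separate statistical judgment that this sketch does not attempt.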
The major data set used here is the International Departures data set published by the Department of Transportation. Some relatively minor data sets are also used.
NumPy, Pandas, PySparkSQL, Seaborn
The major analysis code is this Jupyter Notebook.
The report is here.
As mentioned above, air travel is one of the most common modes of transportation in the US. As a consequence, airfare is a topic widely discussed by US travelers. Knowing how to buy tickets at a reasonable price saves money, but it is not an easy task, because airfare is influenced both by intrinsic factors such as distance and by external factors such as market conditions. In this project, we explore which factors influence airfare significantly and try to use them to predict airfare accurately.
In this project, US domestic consumer airfares are analyzed closely from an inter-city perspective. The prices of flights between major US cities are collected and inspected. In particular, this project covers three overarching questions.
The major data set used here is the Consumer Airfare Report published by the Department of Transportation.
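To make the idea of predicting airfare from factors such as distance concrete, here is a minimal sketch that fits an ordinary least-squares model with NumPy. It is only an illustration under stated assumptions, not the model used in the notebook: the two features (route distance and daily passengers), the helper `predict_fare`, and every number are invented, and the toy fares were constructed to be exactly linear so that the fit is easy to check.

```python
import numpy as np

# Toy city-pair data: all values are invented for illustration only.
# The feature choice (distance, passengers) is an assumption, not the
# actual column set of the Consumer Airfare Report.
distance = np.array([200.0, 500.0, 1000.0, 1500.0, 2500.0])   # miles
passengers = np.array([1000.0, 300.0, 800.0, 200.0, 600.0])   # per day
# Fares built as 50 + 0.1 * distance - 0.01 * passengers (no noise).
fare = np.array([60.0, 97.0, 142.0, 198.0, 294.0])

# Design matrix with an intercept column.
X = np.column_stack([np.ones_like(distance), distance, passengers])

# Ordinary least squares: minimize ||X @ coef - fare||^2.
coef, *_ = np.linalg.lstsq(X, fare, rcond=None)
intercept, per_mile, per_passenger = coef


def predict_fare(dist, pax):
    """Predict a fare from the fitted linear model."""
    return intercept + per_mile * dist + per_passenger * pax


print(f"intercept={intercept:.2f}, per mile={per_mile:.3f}, "
      f"per passenger={per_passenger:.3f}")
print(f"predicted fare for 800 miles, 500 passengers: "
      f"{predict_fare(800, 500):.2f}")
```

On real data the relationship will be noisy, so the size and sign of each fitted coefficient, together with held-out prediction error, are what indicate whether a factor matters.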
NumPy, Pandas, Plotnine, Seaborn, scikit-learn
The major analysis code is this Jupyter Notebook.
The report is here.