Data Wrangling

In this section, we clean and prepare the dataset for predictive modeling. The goal of data wrangling is to transform the raw data into a structured table that machine learning algorithms can use directly.

Based on the results of our EDA, we perform the following steps:

  • Selection of relevant columns: FlightNumber, PayloadMass, Orbit, LaunchSite, GridFins, Legs, ReusedCount, Block, and Class.
  • Renaming variables for clarity (FlightNumber → flight, PayloadMass → payload, ReusedCount → reused_count, etc.).
  • Handling missing values in payload (median imputation or row removal).
  • Encoding categorical features (orbit, site) using one-hot encoding.
  • Scaling numerical features (flight, payload, reused_count, block).
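
The steps above can be sketched in pandas as follows. This is a minimal illustration under several assumptions, not the exact code used in the project: the short names grid_fins, legs, and class are guesses for the renames hidden behind "etc.", missing payloads are filled with the median (row removal is the other option listed), scaling is done as z-score standardization because the scaling method is not specified, and the four sample rows are invented purely so the example runs.

```python
import pandas as pd

# Original column -> short name. flight, payload, orbit, site, reused_count
# and block appear in the steps above; grid_fins, legs and class are assumed.
RENAME = {
    "FlightNumber": "flight",
    "PayloadMass": "payload",
    "Orbit": "orbit",
    "LaunchSite": "site",
    "GridFins": "grid_fins",
    "Legs": "legs",
    "ReusedCount": "reused_count",
    "Block": "block",
    "Class": "class",
}
NUMERIC = ["flight", "payload", "reused_count", "block"]
CATEGORICAL = ["orbit", "site"]


def wrangle(raw: pd.DataFrame) -> pd.DataFrame:
    """Select, rename, impute, one-hot encode and scale the raw table."""
    # Steps 1-2: keep only the relevant columns and rename them.
    df = raw[list(RENAME)].rename(columns=RENAME)

    # Step 3: median imputation for missing payload values.
    df["payload"] = df["payload"].fillna(df["payload"].median())

    # Step 4: one-hot encode the categorical features.
    df = pd.get_dummies(df, columns=CATEGORICAL, dtype=float)

    # Step 5: z-score standardization (assumed scaling method).
    # A constant column would have zero standard deviation and yield NaN.
    num = df[NUMERIC].astype(float)
    df[NUMERIC] = (num - num.mean()) / num.std(ddof=0)
    return df


# Invented sample rows, for illustration only.
raw = pd.DataFrame({
    "FlightNumber": [1, 2, 3, 4],
    "PayloadMass": [500.0, None, 3000.0, 5000.0],
    "Orbit": ["LEO", "GTO", "LEO", "ISS"],
    "LaunchSite": ["SITE_A", "SITE_B", "SITE_A", "SITE_B"],
    "GridFins": [False, True, True, True],
    "Legs": [False, True, True, True],
    "ReusedCount": [0, 0, 1, 2],
    "Block": [1, 1, 3, 5],
    "Class": [0, 0, 1, 1],
})

clean = wrangle(raw)
print(clean.columns.tolist())
```

The target column (class here) is deliberately left out of the scaling step. When the data is later split into training and test sets, the median and the scaling statistics would normally be computed on the training portion only and then applied to the test portion, so that no information leaks from the test set.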