Peeking into the Future of Weather: Forecasting with Neural Networks in Python
Predicting the future has been a human fascination for millennia. While we may not have crystal balls, we have something far more powerful for forecasting: neural networks. These complex, brain-inspired algorithms are masters at finding patterns in sequential data, making them perfect for tasks like predicting stock prices, analyzing language, and, as we'll explore today, forecasting the weather.
In this post, we'll break down a Python project that uses three different types of neural networks to predict the mean temperature in Delhi, India. We'll see how data preparation is the secret ingredient and compare the performance of a standard Artificial Neural Network (ANN), a 1D Convolutional Neural Network (CNN), and a Recurrent Neural Network (RNN).
Step 1: Data - The Fuel for Our Models
Our journey begins with the "Daily Delhi Climate" dataset, a daily record of climate measurements for the city. Before feeding this data to our hungry models, we must preprocess it in three steps:
- Normalization: Neural networks perform best when input values are on a small, consistent scale. We used a MinMaxScaler to transform all our feature values into a range between 0 and 1. This prevents any single feature from having an outsized influence on the learning process.
- Creating Sequences: This is the most critical step in time-series forecasting. We can't just feed the network one day of data at a time. It needs context! We used a helper function to create "sequences" or "windows" of data. Specifically, we created input samples (X) containing 10 consecutive days of all four features, and the corresponding target (y) was the mean temperature of the 11th day. This sliding window moves across our entire dataset, generating hundreds of training examples.
- Train-Test Split: Finally, we split our sequenced data into a training set (80%) and a testing set (20%). The models learn from the training data and are evaluated on the unseen test data.
The Contenders: A Trio of Neural Architectures
With our data perfectly structured, we trained three different neural network architectures to see which approach is best for this task.
1. The Classic: Artificial Neural Network (ANN)
The ANN is the foundational neural network: a stack of fully connected layers. To use it for forecasting, each 10-day window of four features is flattened into a single vector of 40 input values.
The Catch: While simple, this approach loses the crucial temporal order of the data. The model sees 40 numbers but doesn't inherently know which came from day 1 versus day 10.
Performance: The ANN achieved a Test MSE of 0.0034 in one run and 0.00323 in another.
2. The Pattern-Spotter: 1D Convolutional Neural Network (CNN)
You might associate CNNs with image recognition, but 1D CNNs are incredibly effective at finding patterns in data sequences.
The Architecture: Our model used a Conv1D layer to extract features, a MaxPooling1D layer to condense them, and Dense layers to make the final prediction.
Performance: The 1D-CNN achieved a Test MSE of 0.0048 in the first run and 0.00310 in the second.
3. The Memory Keeper: Recurrent Neural Network (RNN with LSTM)
RNNs are the quintessential choice for time-series data because they have a concept of "memory": each timestep's output feeds into the next, so information is carried forward through the sequence. The LSTM (Long Short-Term Memory) variant adds gates that help the network retain relevant context across longer windows.
The Advantage: Unlike the ANN, the LSTM processes the input in its original shape (samples, 10 timesteps, 4 features), fully preserving and leveraging the temporal sequence.
Performance: The RNN (LSTM) achieved a Test MSE of 0.0028 in the first run and 0.00318 in the second.
The Final Showdown: Which Model Won?
Let's compare the scores from our runs:

| Model | Test MSE (run 1) | Test MSE (run 2) |
| --- | --- | --- |
| ANN | 0.0034 | 0.00323 |
| 1D-CNN | 0.0048 | 0.00310 |
| RNN (LSTM) | 0.0028 | 0.00318 |

Across the two executions, the results were very close, showcasing the power of all three approaches. In the first run, the RNN (LSTM) was the clear winner with an MSE of 0.0028; in the second, the 1D-CNN edged ahead at 0.00310.
The key takeaway is that the models designed to understand sequences delivered the best individual scores: the RNN (LSTM)'s 0.0028 in the first run and the 1D-CNN's 0.00310 in the second both beat the ANN's best of 0.00323. For time-series data, choosing an architecture that can "see" the patterns over time is what unlocks the strongest performance.
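To keep such a comparison fair, every architecture can be trained and scored with the same routine. A sketch under stated assumptions: the epoch count, batch size, and validation split below are illustrative, not the project's exact training settings.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def train_and_score(model, X_train, y_train, X_test, y_test, epochs=50):
    """Fit on the training windows; return the MSE on the held-out test set."""
    model.fit(X_train, y_train, epochs=epochs, batch_size=16,
              validation_split=0.1, verbose=0)
    return model.evaluate(X_test, y_test, verbose=0)  # loss is MSE

# Demo with a tiny stand-in model and random data shaped like our windows
rng = np.random.default_rng(0)
X, y = rng.random((90, 10, 4)), rng.random(90)
model = keras.Sequential([layers.Input(shape=(10, 4)),
                          layers.Flatten(), layers.Dense(1)])
model.compile(optimizer="adam", loss="mse")
mse = train_and_score(model, X[:72], y[:72], X[72:], y[72:], epochs=2)
```

Because each model sees identical data and an identical evaluation, any difference in test MSE can be attributed to the architecture rather than the training procedure.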
Conclusion
This project is a perfect illustration of how to tackle a time-series forecasting problem with modern deep learning. We saw that the heavy lifting is often in the data preparation—thoughtfully creating sequences to provide context for our models. By comparing an ANN, 1D-CNN, and an RNN, we learned that while all are powerful, architectures that respect the sequential nature of data are ultimately the most effective for peeking into the future.
This blog presents key insights from our project for the ‘Machine Learning’ course (MBA 2024–26, 4th trimester) at Amrita School of Business, Coimbatore, under the guidance of Dr. Prashobhan Palakkel.