Peeking into the Future of Weather: Forecasting with Neural Networks in Python

Predicting the future has been a human fascination for millennia. While we may not have crystal balls, we have something far more powerful for forecasting: neural networks. These complex, brain-inspired algorithms are masters at finding patterns in sequential data, making them perfect for tasks like predicting stock prices, analyzing language, and, as we'll explore today, forecasting the weather.

In this post, we'll break down a Python project that uses three different types of neural networks to predict the mean temperature in Delhi, India. We'll see how data preparation is the secret ingredient and compare the performance of a standard Artificial Neural Network (ANN), a 1D Convolutional Neural Network (CNN), and a Recurrent Neural Network (RNN).

Step 1: Data - The Fuel for Our Models

Our journey begins with the "Daily Delhi Climate" dataset. This dataset contains four key climate features that we'll use for our prediction: 'meantemp', 'humidity', 'wind_speed', and 'meanpressure'. Our specific goal is to predict the next day's 'meantemp' using the data from the past 10 days.

Before feeding this data to our hungry models, we must preprocess it.

  1. Normalization: Neural networks perform best when input values are on a small, consistent scale. We used a MinMaxScaler to transform all our feature values into a range between 0 and 1. This prevents any single feature from having an outsized influence on the learning process.

  2. Creating Sequences: This is the most critical step in time-series forecasting. We can't just feed the network one day of data at a time. It needs context! We used a helper function to create "sequences" or "windows" of data. Specifically, we created input samples (X) containing 10 consecutive days of all four features, and the corresponding target (y) was the mean temperature of the 11th day. This sliding window moves across our entire dataset, generating hundreds of training examples.

  3. Train-Test Split: Finally, we split our sequenced data into a training set (80%) and a testing set (20%). The models will learn from the training data and be evaluated on the unseen test data.
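The three preparation steps above can be sketched in a few lines. This is a minimal, illustrative version: the window size (10), the four feature columns, and the 80/20 chronological split come from the description, but the synthetic stand-in data and helper names are assumptions, not the project's exact code (which loads the dataset from CSV).

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in with the same columns as the Daily Delhi Climate data;
# the real project reads these from the dataset's CSV file.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "meantemp": rng.uniform(5, 40, 200),
    "humidity": rng.uniform(20, 100, 200),
    "wind_speed": rng.uniform(0, 15, 200),
    "meanpressure": rng.uniform(990, 1030, 200),
})

# 1. Normalize every feature into [0, 1]
scaler = MinMaxScaler()
scaled = scaler.fit_transform(df)

# 2. Sliding windows: 10 days of all 4 features -> day 11's meantemp
def make_sequences(data, window=10):
    X, y = [], []
    for i in range(len(data) - window):
        X.append(data[i:i + window])   # shape (window, n_features)
        y.append(data[i + window, 0])  # 'meantemp' is column 0
    return np.array(X), np.array(y)

X, y = make_sequences(scaled, window=10)

# 3. Chronological 80/20 split (no shuffling for time series)
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
print(X_train.shape, X_test.shape)  # (152, 10, 4) (38, 10, 4)
```

Note that the split is chronological rather than random: shuffling time-series data would let the model "see the future" during training.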

The Contenders: A Trio of Neural Architectures

With our data perfectly structured, we trained three different neural network architectures to see which approach is best for this task. The evaluation metric for all models is the Mean Squared Error (MSE), which measures the average squared difference between the predicted and actual temperatures. For MSE, a lower score is better.
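For reference, MSE is simply the average of the squared prediction errors, computed here on the scaled (0-to-1) values, which is why the scores below are so small. A minimal NumPy version:

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean of squared differences between predictions and actuals
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

print(mse([0.5, 0.6, 0.7], [0.5, 0.5, 0.8]))  # ≈ 0.0067
```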

1. The Classic: Artificial Neural Network (ANN)

The ANN is the foundational neural network. Our ANN consisted of a few densely connected layers. However, a standard ANN can't directly process sequential data. To make it work, we had to flatten our input. This means our 10 days of 4 features were converted into a single, flat vector of 40 inputs.

  • The Catch: While simple, this approach loses the crucial temporal order of the data. The model sees 40 numbers but doesn't inherently know which came from day 1 versus day 10.

  • Performance: The ANN achieved a Test MSE of 0.0034 in one run and 0.00323 in another.
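A Keras sketch of this kind of ANN is below. The Flatten step (10 days × 4 features → 40 inputs) matches the description; the layer widths are illustrative guesses, not the original project's exact architecture.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Dense network on flattened windows; widths (64, 32) are assumptions.
ann = keras.Sequential([
    layers.Input(shape=(10, 4)),
    layers.Flatten(),                       # 10 days x 4 features -> 40 inputs
    layers.Dense(64, activation="relu"),
    layers.Dense(32, activation="relu"),
    layers.Dense(1),                        # predicted (scaled) meantemp
])
ann.compile(optimizer="adam", loss="mse")
```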

2. The Pattern-Spotter: 1D Convolutional Neural Network (CNN)

You might associate CNNs with image recognition, but 1D CNNs are incredibly effective at finding patterns in data sequences. Instead of looking for patterns in 2D pixels, a 1D CNN slides a filter (or kernel) across our 10-day time sequence, learning to recognize significant shapes and patterns (like a sudden drop in pressure followed by a rise in humidity).

  • The Architecture: Our model used a Conv1D layer to extract features, a MaxPooling1D layer to condense them, and Dense layers to make the final prediction.

  • Performance: The 1D-CNN achieved a Test MSE of 0.0048 in the first run and 0.00310 in the second.
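The Conv1D → MaxPooling1D → Dense stack described above might look like this in Keras; the filter count, kernel size, and dense width are illustrative assumptions rather than the project's exact values.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Filters slide along the 10-day time axis, reading all 4 features at once.
cnn = keras.Sequential([
    layers.Input(shape=(10, 4)),
    layers.Conv1D(filters=32, kernel_size=3, activation="relu"),
    layers.MaxPooling1D(pool_size=2),       # condense the feature maps
    layers.Flatten(),
    layers.Dense(32, activation="relu"),
    layers.Dense(1),                        # predicted (scaled) meantemp
])
cnn.compile(optimizer="adam", loss="mse")
```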

3. The Memory Keeper: Recurrent Neural Network (RNN with LSTM)

RNNs are the quintessential choice for time-series data because they have a concept of "memory". They process data sequentially, passing information from one step to the next. We used a special, powerful type of RNN layer called Long Short-Term Memory (LSTM). LSTMs are designed to remember patterns over long sequences, making them ideal for our 10-day lookback window.

  • The Advantage: Unlike the ANN, the LSTM processes the input in its original shape (samples, 10 timesteps, 4 features), fully preserving and leveraging the temporal sequence.

  • Performance: The RNN (LSTM) achieved a Test MSE of 0.0028 in the first run and 0.00318 in the second.
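An LSTM model in this spirit is sketched below. Unlike the ANN, no flattening is needed: the layer consumes the window in its native (timesteps, features) shape. The unit counts are assumptions, not the original configuration.

```python
from tensorflow import keras
from tensorflow.keras import layers

# LSTM processes the 10 timesteps in order, carrying state between steps.
rnn = keras.Sequential([
    layers.Input(shape=(10, 4)),            # (timesteps, features), unflattened
    layers.LSTM(64),
    layers.Dense(32, activation="relu"),
    layers.Dense(1),                        # predicted (scaled) meantemp
])
rnn.compile(optimizer="adam", loss="mse")
```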

The Final Showdown: Which Model Won?

Let's compare the Test MSE scores from our two runs (lower is better):

Model        Run 1 MSE   Run 2 MSE
ANN          0.0034      0.00323
1D-CNN       0.0048      0.00310
RNN (LSTM)   0.0028      0.00318

Across the two executions, the results were very close, showcasing the power of all three approaches. In the first run, the RNN (LSTM) was the clear winner with an MSE of 0.0028. In the second run, the 1D-CNN edged out the others with a slightly better MSE of 0.00310.

The key takeaway is that the models designed to understand sequences—the 1D-CNN and the RNN (LSTM)—consistently outperformed the standard ANN that ignores the temporal order. This confirms that for time-series data, choosing an architecture that can "see" the patterns over time is crucial for achieving the best performance.

Conclusion

This project is a perfect illustration of how to tackle a time-series forecasting problem with modern deep learning. We saw that the heavy lifting is often in the data preparation—thoughtfully creating sequences to provide context for our models. By comparing an ANN, 1D-CNN, and an RNN, we learned that while all are powerful, architectures that respect the sequential nature of data are ultimately the most effective for peeking into the future.

This blog presents key insights from our project for the ‘Machine Learning’ course (MBA 2024–26, 4th trimester) at Amrita School of Business, Coimbatore, under the guidance of Dr. Prashobhan Palakkel.
