PJM Hourly Energy Consumption Prediction using LSTM
Context
PJM Interconnection LLC (PJM) is a regional transmission organization (RTO) in the United States. It is part of the Eastern Interconnection grid, operating an electric transmission system serving all or parts of Delaware, Illinois, Indiana, Kentucky, Maryland, Michigan, New Jersey, North Carolina, Ohio, Pennsylvania, Tennessee, Virginia, West Virginia, and the District of Columbia.
The hourly power consumption data comes from PJM’s website and is in megawatts (MW).
The regions have changed over the years, so data may only appear for certain dates per region.
Importing necessary libraries
import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import sklearn.preprocessing
Function to plot series
def plot_series(time, series, format="-", start=0, end=None):
    plt.plot(time[start:end], series[start:end], format)
    plt.xlabel("Time")
    plt.ylabel("Value")
    plt.grid(True)
Reading the dataset
path = '/kaggle/input/hourly-energy-consumption/FE_hourly.csv'
df = pd.read_csv(path)
Plotting the dataset
df.plot()
plt.show()
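Since the regions have changed over the years (as noted above), it is worth checking which dates this region's file actually covers. A minimal sketch, assuming the file has a Datetime column as the Kaggle dataset's region files do:
# Check the date span covered by this region's file
dates = pd.to_datetime(df['Datetime'])
print(dates.min(), "to", dates.max(), "-", len(df), "hourly readings")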
Scaling the dataset
scaler = sklearn.preprocessing.MinMaxScaler()
df_norm = scaler.fit_transform(df['FE_MW'].values.reshape(-1,1))
df_norm.shape  # (62874, 1)
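The fitted scaler is worth keeping around, since its inverse_transform maps scaled values, and later the model's predictions, back to megawatts:
# Round-trip check: scaled values map back to the original MW readings
print(scaler.inverse_transform(df_norm[:3]))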
Power and Time plot
power = df_norm[:, 0]  # flatten to a 1-D series for the windowing helpers below
time = np.array(df.index)
plt.figure(figsize=(10, 6))
plot_series(time, power)
Preprocessing the dataset
split_time = 50000
time_train = time[:split_time]
x_train = power[:split_time]
time_valid = time[split_time:]
x_valid = power[split_time:]
window_size = 30
batch_size = 32
shuffle_buffer_size = 1000

def windowed_dataset(series, window_size, batch_size, shuffle_buffer):
    series = tf.expand_dims(series, axis=-1)  # add a feature axis
    ds = tf.data.Dataset.from_tensor_slices(series)
    # slide windows of window_size + 1 steps across the series, one step at a time
    ds = ds.window(window_size + 1, shift=1, drop_remainder=True)
    ds = ds.flat_map(lambda w: w.batch(window_size + 1))
    ds = ds.shuffle(shuffle_buffer)
    # inputs are the first window_size steps; targets are the same window shifted by one
    ds = ds.map(lambda w: (w[:-1], w[1:]))
    return ds.batch(batch_size).prefetch(1)

def model_forecast(model, series, window_size):
    ds = tf.data.Dataset.from_tensor_slices(series)
    ds = ds.window(window_size, shift=1, drop_remainder=True)
    ds = ds.flat_map(lambda w: w.batch(window_size))
    ds = ds.batch(32).prefetch(1)
    forecast = model.predict(ds)
    return forecast
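To see what windowed_dataset produces, here is a quick check on a toy series (a shuffle buffer of 1 effectively disables shuffling): each element is an (input, target) pair where the target is the input window shifted forward by one step.
toy = np.arange(10, dtype=np.float32)
for x, y in windowed_dataset(toy, window_size=4, batch_size=2, shuffle_buffer=1).take(1):
    print(x.shape, y.shape)  # (2, 4, 1) (2, 4, 1)
    print(x[0, :, 0].numpy(), "->", y[0, :, 0].numpy())  # [0. 1. 2. 3.] -> [1. 2. 3. 4.]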
Building the model
tf.keras.backend.clear_session()
tf.random.set_seed(51)
np.random.seed(51)
# note: training uses 60-step windows here, while forecasting below uses
# window_size = 30; the Conv1D + LSTM stack accepts sequences of any length
train_set = windowed_dataset(x_train, window_size=60, batch_size=100, shuffle_buffer=shuffle_buffer_size)
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv1D(filters=60, kernel_size=5,
                           strides=1, padding="causal",
                           activation="relu",
                           input_shape=[None, 1]),
    tf.keras.layers.LSTM(60, return_sequences=True),
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(1),
    tf.keras.layers.Lambda(lambda x: x * 400)  # output rescaling carried over from the Udacity example below
])
To understand the above code in more detail, I would recommend watching this tutorial from Udacity: https://classroom.udacity.com/courses/ud187/lessons/6d543d5c-6b18-4ecf-9f0f-3fd034acd2cc/concepts/c10fb954-25ea-43e3-b22c-21b3e423eb05#
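As a quick optional sanity check: because the causal Conv1D and the return_sequences LSTM both preserve the time dimension, the model emits one prediction per input timestep, matching the sequence-to-sequence targets from windowed_dataset.
model.summary()
print(model(tf.zeros([1, 60, 1])).shape)  # expected: (1, 60, 1)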
Compiling the model
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-5, momentum=0.9)
model.compile(loss=tf.keras.losses.Huber(),
              optimizer=optimizer,
              metrics=["mae"])
history = model.fit(train_set, epochs=1000)
At the end of the 1000th epoch, we get a loss of 3.3108e-04 and a mean absolute error of 0.0186.
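Training for 1000 epochs is a long run; as a sketch (not part of the original notebook), an EarlyStopping callback could replace the plain fit call above and cut training short once the loss plateaus:
early_stop = tf.keras.callbacks.EarlyStopping(monitor="loss", patience=20,
                                              restore_best_weights=True)
history = model.fit(train_set, epochs=1000, callbacks=[early_stop])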
Forecasting the model
rnn_forecast = model_forecast(model, power[..., np.newaxis], window_size)
# keep, for each window ending in the validation period, the prediction for its last step
rnn_forecast = rnn_forecast[split_time - window_size:-1, -1, 0]

plt.figure(figsize=(10, 6))
plot_series(time_valid, x_valid)
plot_series(time_valid, rnn_forecast)
Actual vs Prediction Plot
plt.plot(time_valid[:300], x_valid[:300])
plt.plot(time_valid[:300], rnn_forecast[:300])
plt.xlabel("Time")
plt.ylabel("Value")
plt.legend(["Actual", "Prediction"])
plt.grid(True)
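To put a number on the comparison above, a short sketch (not in the original notebook) computes the validation MAE and uses the fitted scaler's range to express it in megawatts:
mae_scaled = tf.keras.metrics.mean_absolute_error(x_valid, rnn_forecast).numpy()
print(f"Validation MAE: {mae_scaled:.4f} (scaled), "
      f"{mae_scaled * scaler.data_range_[0]:.1f} MW")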
Results
We get a training loss of 3.3108e-04 and a mean absolute error of 0.0186. This should be treated as a baseline: the model can likely be improved with more complex layers and further hyperparameter tuning, as sketched below.
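As one possible direction (a sketch, not a tested configuration), a second recurrent layer could be stacked on the baseline:
deeper_model = tf.keras.models.Sequential([
    tf.keras.layers.Conv1D(filters=60, kernel_size=5, strides=1,
                           padding="causal", activation="relu",
                           input_shape=[None, 1]),
    tf.keras.layers.LSTM(60, return_sequences=True),
    tf.keras.layers.LSTM(60, return_sequences=True),  # added second recurrent layer
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(1),
    tf.keras.layers.Lambda(lambda x: x * 400)
])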