Improving LSTM Performance

Created: January 17, 2025

Answered with OpenAI o1

Question

import pandas as pd
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.preprocessing import MinMaxScaler
from torch.utils.data import TensorDataset, DataLoader
import matplotlib.pyplot as plt

# 1. Load data
df = pd.read_csv("realDataForTrain.csv")

# 2. Normalize
scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(df[['col1', 'col2']])

# 3. Create sequences
SEQ_LENGTH = 10
def create_sequences(data, seq_length=10):
    X, y = [], []
    for i in range(len(data) - seq_length):
        X.append(data[i : i + seq_length])
        y.append(data[i + seq_length])
    return np.array(X), np.array(y)

X, y = create_sequences(scaled_data, SEQ_LENGTH)
train_size = int(len(X)*0.8)
X_train = X[:train_size]; y_train = y[:train_size]
X_test = X[train_size:]; y_test = y[train_size:]

X_train_tensors = torch.tensor(X_train, dtype=torch.float32)
y_train_tensors = torch.tensor(y_train, dtype=torch.float32)
X_test_tensors = torch.tensor(X_test, dtype=torch.float32)
y_test_tensors = torch.tensor(y_test, dtype=torch.float32)

# 4. Define LSTM
class LSTMModel(nn.Module):
    def __init__(self, input_size=2, hidden_size=32, num_layers=1, output_size=2):
        super(LSTMModel, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)
        out, _ = self.lstm(x, (h0, c0))
        out = out[:, -1, :]
        out = self.fc(out)
        return out

# 5. Train
model = LSTMModel()
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)
train_dataset = TensorDataset(X_train_tensors, y_train_tensors)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)

for epoch in range(30):
    for X_batch, y_batch in train_loader:
        optimizer.zero_grad()
        outputs = model(X_batch)
        loss = criterion(outputs, y_batch)
        loss.backward()
        optimizer.step()
    if (epoch+1) % 5 == 0:
        print(f"Epoch [{epoch+1}/30], Loss: {loss.item():.6f}")

# 6. Evaluate
model.eval()
with torch.no_grad():
    predictions = model(X_test_tensors).numpy()

y_test_np = y_test_tensors.numpy()
predictions_inv = scaler.inverse_transform(predictions)
y_test_inv = scaler.inverse_transform(y_test_np)

mse = np.mean((predictions - y_test_np)**2)
rmse = np.sqrt(mse)
mae = np.mean(np.abs(predictions - y_test_np))
ss_res = np.sum((y_test_np - predictions)**2)
ss_tot = np.sum((y_test_np - np.mean(y_test_np, axis=0))**2)
r2 = 1 - (ss_res / ss_tot)

print(f"MSE : {mse:.6f}")
print(f"RMSE: {rmse:.6f}")
print(f"MAE : {mae:.6f}")
print(f"R^2 : {r2:.6f}")

# Approximate accuracy
epsilon = 0.1
diff = np.linalg.norm(predictions - y_test_np, axis=1)
accuracy = np.mean(diff < epsilon)
print(f"Accuracy (with epsilon={epsilon} in scaled space): {accuracy*100:.2f}%")

# 7. Predict the next 10 values
def predict_future(model, data, scaler, seq_length=10, steps=10):
    model.eval()
    window = data[-seq_length:].copy()
    preds = []
    for _ in range(steps):
        x_t = torch.tensor(window[np.newaxis, :, :], dtype=torch.float32)
        with torch.no_grad():
            pred = model(x_t)
        pred_np = pred.numpy()[0]
        preds.append(pred_np)
        window = np.vstack([window[1:], pred_np])
    preds = np.array(preds)
    return scaler.inverse_transform(preds)

future_preds = predict_future(model, scaled_data, scaler, seq_length=SEQ_LENGTH, steps=10)
print("Next 10 predictions (col1, col2):")
for i, (p1, p2) in enumerate(future_preds, start=1):
    print(f"Step {i}: col1={p1:.4f}, col2={p2:.4f}")

# 8. (Optional) Plot
plt.figure(figsize=(12, 4))
test_index = range(train_size+SEQ_LENGTH, train_size+SEQ_LENGTH+len(y_test_inv))
plt.plot(test_index, y_test_inv[:, 0], label="Actual col1", color="blue")
plt.plot(test_index, predictions_inv[:, 0], label="Predicted col1", color="red")
plt.title("col1 Actual vs Predicted")
plt.legend()
plt.show()

I have this code, but my accuracy is 17.16%. How can I improve the accuracy of the model?

Thinking

Thought about improving model accuracy for a couple of seconds

Answer

Improving a time-series model’s accuracy typically involves multiple dimensions, including data preprocessing, model architecture, hyperparameter tuning, and training strategy. Below are several suggestions you can try. You might not need to do all of them—start with small changes and see which ones most effectively improve performance.


1. Data-Related Improvements

1.1. Check data quality

  • Outliers and anomalies: Outliers can harm the training process if they dominate the loss function. Consider removing or clipping extreme values.
  • Missing values: Ensure there are no missing values or that they’re properly imputed before training.
  • Seasonality or trends: If your data has strong trends or seasonal patterns, consider using techniques such as detrending or adding time-based features (e.g., day of week, month, etc.).
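For instance, here is a minimal sketch of percentile-based outlier clipping on col1 (one common choice, not something your script already does); the timestamp-based features are only illustrative, since the CSV in this thread has no time column:

    # Clip col1 to its 1st/99th percentiles to reduce the impact of outliers
    lower, upper = df['col1'].quantile([0.01, 0.99])
    df['col1'] = df['col1'].clip(lower, upper)

    # If you had a timestamp column (hypothetical name 'timestamp'),
    # time-based features could be added like this:
    # df['timestamp'] = pd.to_datetime(df['timestamp'])
    # df['day_of_week'] = df['timestamp'].dt.dayofweek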

1.2. Try different scalers

  • The MinMaxScaler forces data into [0, 1]. However, if your data has outliers or long tails, try StandardScaler or RobustScaler and see if it affects performance.

1.3. Increase the number of features (if available)

  • If there are other relevant features (e.g., external factors, weather, holiday flags, etc.), include them in your model. More context sometimes helps the model learn better.
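If no external data is available, one option (a sketch, not part of your current script) is to derive extra features such as lags or rolling statistics from the existing columns; the LSTM's input_size must then match the new feature count:

    # Hypothetical derived features from col2
    df['col2_lag1'] = df['col2'].shift(1)
    df['col2_roll_mean5'] = df['col2'].rolling(window=5).mean()
    df = df.dropna().reset_index(drop=True)

    # Scale all four features; remember to update input_size in the LSTM
    # (and how you build y) accordingly
    scaled_data = scaler.fit_transform(df[['col1', 'col2', 'col2_lag1', 'col2_roll_mean5']])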

2. Model Architecture and Hyperparameters

2.1. Increase model complexity

  • Hidden size: Increase hidden_size from 32 to 64 or 128.
  • Number of layers: Use a stacked LSTM (e.g., num_layers=2 or 3).

However, be mindful: too large a model may overfit if you have limited data. Use a validation set or cross-validation to monitor overfitting.

2.2. Add regularization

  • Dropout: Apply dropout on the LSTM. For instance:

    self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True, dropout=0.2)

    (Note that if num_layers=1, PyTorch’s LSTM ignores the dropout parameter, so consider num_layers > 1 if you want dropout to be active within the LSTM layers.)

  • Weight decay: Use a small weight decay in your optimizer to reduce overfitting, for example:

    optimizer = optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)

2.3. Tune the learning rate

  • The default lr=1e-3 may be too high or too low. Try different learning rates (e.g., 1e-2, 5e-4, 1e-4) and see what yields the best validation loss.
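Besides trying a few fixed values, you could let the learning rate decay when the validation loss plateaus, for example with PyTorch's ReduceLROnPlateau scheduler (a sketch; it assumes you compute val_loss each epoch as discussed in section 3.1):

    optimizer = optim.Adam(model.parameters(), lr=1e-3)
    # Halve the learning rate if validation loss has not improved for 5 epochs
    scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min',
                                                     factor=0.5, patience=5)

    # inside the epoch loop, after computing val_loss:
    # scheduler.step(val_loss)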

2.4. Increase number of epochs

  • If your model is still improving at the end of 30 epochs, increase to 50 or 100. Plot training vs. validation loss to see if there’s a plateau or signs of overfitting.

2.5. Consider alternative architectures

  • GRU: Replace the LSTM with a GRU layer. GRUs often train faster and can perform comparably or better in some time-series tasks.
  • Temporal Convolution: In some cases, a 1D Conv (e.g., a Temporal Convolutional Network) can outperform RNN-based models on time-series data.
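For example, a GRU version with the same interface as your LSTMModel could look like this (a sketch; the rest of the training loop stays unchanged):

    class GRUModel(nn.Module):
        def __init__(self, input_size=2, hidden_size=64, num_layers=2, output_size=2):
            super().__init__()
            self.gru = nn.GRU(input_size, hidden_size, num_layers, batch_first=True)
            self.fc = nn.Linear(hidden_size, output_size)

        def forward(self, x):
            out, _ = self.gru(x)               # hidden state defaults to zeros
            return self.fc(out[:, -1, :])      # last time step -> linear layer

    # model = GRUModel()  # drop-in replacement for LSTMModel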

3. Training Strategy

3.1. Use a validation set

  • Split your data into three parts: train, validation, and test. Use the validation set to tune hyperparameters (learning rate, hidden size, etc.). Only check test set performance at the end.

3.2. Early stopping

  • Monitor validation loss and stop training once it stops improving (or starts deteriorating). This prevents overfitting and might save you time.

3.3. Batch size

  • Experiment with different batch sizes. A batch size of 32 might work, but 16 or 64 could yield better results.

4. Accuracy Metric and Threshold

4.1. Rethink the “accuracy” definition

Right now, you’re defining accuracy in the scaled space as:

epsilon = 0.1
diff = np.linalg.norm(predictions - y_test_np, axis=1)
accuracy = np.mean(diff < epsilon)

A norm difference less than 0.1 on a 2D scaled vector might be quite strict or too lenient, depending on the scale of your data:

  1. Tune epsilon: Depending on the variance of your data, epsilon=0.1 might be too strict (or too loose). Experiment with different thresholds, e.g., 0.05, 0.2, etc.
  2. Evaluate dimension-wise: Instead of using the vector norm, you might also consider checking each feature’s absolute difference. For example:
    accuracy_feature1 = np.mean(np.abs(predictions[:, 0] - y_test_np[:, 0]) < epsilon_feature1)
    accuracy_feature2 = np.mean(np.abs(predictions[:, 1] - y_test_np[:, 1]) < epsilon_feature2)
    Then average these or interpret them individually.

4.2. Stick to continuous error metrics

  • For time-series, it’s often more typical to rely on RMSE, MAE, or MAPE (mean absolute percentage error). The notion of “accuracy” might be less intuitive unless you have a clear classification-like criterion.
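For example, MAPE can be computed from the inverse-transformed arrays already in your script (keep in mind that MAPE is unstable when true values are near zero, which is often the case for col1 here):

    eps = 1e-8  # guard against division by zero
    mape = np.mean(np.abs((y_test_inv - predictions_inv) / (np.abs(y_test_inv) + eps))) * 100
    print(f"MAPE: {mape:.2f}%")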

5. Putting it All Together (Example Changes)

Below is an example of how you might modify your code in small steps.

  1. Increase hidden size and layers, and add dropout:

    class LSTMModel(nn.Module):
        def __init__(self, input_size=2, hidden_size=64, num_layers=2, output_size=2, dropout=0.2):
            super(LSTMModel, self).__init__()
            self.hidden_size = hidden_size
            self.num_layers = num_layers
            self.lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers,
                                batch_first=True, dropout=dropout)
            self.fc = nn.Linear(hidden_size, output_size)

        def forward(self, x):
            # Initialize hidden states
            h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)
            c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)
            out, _ = self.lstm(x, (h0, c0))  # out shape: (batch, seq_len, hidden_size)
            out = out[:, -1, :]              # only last time step
            out = self.fc(out)
            return out
  2. Use weight decay and possibly tweak lr:

    optimizer = optim.Adam(model.parameters(), lr=5e-4, weight_decay=1e-5)
  3. Train for more epochs or use early stopping:

    best_val_loss = float('inf')
    for epoch in range(100):
        model.train()
        for X_batch, y_batch in train_loader:
            optimizer.zero_grad()
            outputs = model(X_batch)
            loss = criterion(outputs, y_batch)
            loss.backward()
            optimizer.step()

        # Validation step (if you have a validation set)
        model.eval()
        with torch.no_grad():
            val_outputs = model(X_val_tensors)
            val_loss = criterion(val_outputs, y_val_tensors).item()

        # Early stopping check
        if val_loss < best_val_loss:
            best_val_loss = val_loss
            # Save best model if you like
        else:
            # If val_loss didn't improve, consider stopping or adjusting
            pass

        # Print progress
        if (epoch + 1) % 10 == 0:
            print(f"Epoch [{epoch+1}/100], Train Loss: {loss.item():.6f}, Val Loss: {val_loss:.6f}")
  4. Consider standard scaling:

    from sklearn.preprocessing import StandardScaler

    scaler = StandardScaler()
    scaled_data = scaler.fit_transform(df[['col1', 'col2']])
    # Keep the rest of your code the same
  5. Adjust your “accuracy” threshold or remove it altogether and rely on RMSE/MAE if that is more relevant for your problem.


Summary

  • Data: Explore new features, remove outliers, and carefully choose a scaler.
  • Architecture: Use stacked LSTM or GRU with a larger hidden size, dropout, and possibly weight decay.
  • Hyperparameters: Tune the learning rate, batch size, and number of epochs. Consider early stopping.
  • Metrics: Choose thresholds carefully if using accuracy in a regression setting. RMSE, MAE, or MAPE are often more informative for time-series.

Incrementally applying these tips should help you improve your model’s performance and push that 17% accuracy closer to a more acceptable range.

Question

How do I add grid search and early stopping to my code to tune my hyperparameters?
Use this accuracy formula:

# Define accuracy function with a threshold tolerance (10%)
def calculate_accuracy(y_true, y_pred, threshold=0.10):
    """
    Calculate accuracy based on how many predictions are within a given percentage threshold of the true values.
    """
    within_tolerance = np.abs(y_true - y_pred) <= threshold * np.abs(y_true)
    accuracy = np.mean(within_tolerance) * 100  # Convert to percentage
    return accuracy

and this code:

import pandas as pd
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.preprocessing import MinMaxScaler
from torch.utils.data import TensorDataset, DataLoader
import matplotlib.pyplot as plt

# 1. Load data
df = pd.read_csv("realDataForTrain.csv")

# 2. Normalize
scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(df[['col1', 'col2']])

# 3. Create sequences
SEQ_LENGTH = 10
def create_sequences(data, seq_length=10):
    X, y = [], []
    for i in range(len(data) - seq_length):
        X.append(data[i : i + seq_length])
        y.append(data[i + seq_length])
    return np.array(X), np.array(y)

X, y = create_sequences(scaled_data, SEQ_LENGTH)
train_size = int(len(X)*0.8)
X_train = X[:train_size]; y_train = y[:train_size]
X_test = X[train_size:]; y_test = y[train_size:]

X_train_tensors = torch.tensor(X_train, dtype=torch.float32)
y_train_tensors = torch.tensor(y_train, dtype=torch.float32)
X_test_tensors = torch.tensor(X_test, dtype=torch.float32)
y_test_tensors = torch.tensor(y_test, dtype=torch.float32)

# 4. Define LSTM
class LSTMModel(nn.Module):
    def __init__(self, input_size=2, hidden_size=32, num_layers=1, output_size=2):
        super(LSTMModel, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)
        out, _ = self.lstm(x, (h0, c0))
        out = out[:, -1, :]
        out = self.fc(out)
        return out

# 5. Train
model = LSTMModel()
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)
train_dataset = TensorDataset(X_train_tensors, y_train_tensors)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)

for epoch in range(30):
    for X_batch, y_batch in train_loader:
        optimizer.zero_grad()
        outputs = model(X_batch)
        loss = criterion(outputs, y_batch)
        loss.backward()
        optimizer.step()
    if (epoch+1) % 5 == 0:
        print(f"Epoch [{epoch+1}/30], Loss: {loss.item():.6f}")

# 6. Evaluate
model.eval()
with torch.no_grad():
    predictions = model(X_test_tensors).numpy()

y_test_np = y_test_tensors.numpy()
predictions_inv = scaler.inverse_transform(predictions)
y_test_inv = scaler.inverse_transform(y_test_np)

mse = np.mean((predictions - y_test_np)**2)
rmse = np.sqrt(mse)
mae = np.mean(np.abs(predictions - y_test_np))
ss_res = np.sum((y_test_np - predictions)**2)
ss_tot = np.sum((y_test_np - np.mean(y_test_np, axis=0))**2)
r2 = 1 - (ss_res / ss_tot)

print(f"MSE : {mse:.6f}")
print(f"RMSE: {rmse:.6f}")
print(f"MAE : {mae:.6f}")
print(f"R^2 : {r2:.6f}")

# Approximate accuracy
epsilon = 0.1
diff = np.linalg.norm(predictions - y_test_np, axis=1)
accuracy = np.mean(diff < epsilon)
print(f"Accuracy (with epsilon={epsilon} in scaled space): {accuracy*100:.2f}%")

# 7. Predict the next 10 values
def predict_future(model, data, scaler, seq_length=10, steps=10):
    model.eval()
    window = data[-seq_length:].copy()
    preds = []
    for _ in range(steps):
        x_t = torch.tensor(window[np.newaxis, :, :], dtype=torch.float32)
        with torch.no_grad():
            pred = model(x_t)
        pred_np = pred.numpy()[0]
        preds.append(pred_np)
        window = np.vstack([window[1:], pred_np])
    preds = np.array(preds)
    return scaler.inverse_transform(preds)

future_preds = predict_future(model, scaled_data, scaler, seq_length=SEQ_LENGTH, steps=10)
print("Next 10 predictions (col1, col2):")
for i, (p1, p2) in enumerate(future_preds, start=1):
    print(f"Step {i}: col1={p1:.4f}, col2={p2:.4f}")

# 8. (Optional) Plot
plt.figure(figsize=(12, 4))
test_index = range(train_size+SEQ_LENGTH, train_size+SEQ_LENGTH+len(y_test_inv))
plt.plot(test_index, y_test_inv[:, 0], label="Actual col1", color="blue")
plt.plot(test_index, predictions_inv[:, 0], label="Predicted col1", color="red")
plt.title("col1 Actual vs Predicted")
plt.legend()
plt.show()

This is my code; please give me the whole answer with all the code together.

Thinking

Integrating components

I’m pulling together components like an LSTM model, GridSearch, early stopping, and a custom accuracy metric into a unified PyTorch training and evaluation script.

Crafting the function

Working on the train_and_evaluate function to manually handle hyperparameter optimization, integrating early stopping. Addresses model creation, data loading, and defines hyperparameters like hidden_size, num_layers, and lr.

Strategizing data division

I’m piecing together a data split—70% training, 10% validation, and 20% test. This clear segmentation guides hyperparameter tuning and model training, aiming for a well-rounded evaluation.

Pulling together

I’m integrating functions for calculating accuracy and staging data. This includes the LSTM model, hidden_size, num_layers, and the dataset.

Adjusting data handling

OK, I’m tweaking the user's data approach to a 70/10/20 train-validation-test split, balancing validation set use for early stopping with the original 80/20 train-test split.

Thought about LSTM tuning with grid search and early stopping for 15 seconds

Answer

Below is a complete code example showing how you could incorporate grid search for hyperparameter tuning and early stopping into your current workflow. In this example, we do the following:

  1. Split the data into train, validation, and test sets (70/10/20).
  2. Define a grid of hyperparameters (e.g., hidden sizes, number of layers, learning rates).
  3. Implement early stopping during training by monitoring validation loss.
  4. Use your custom accuracy function (calculate_accuracy) with a 10% threshold.
  5. At the end of the grid search, we pick the best hyperparameters based on validation performance and evaluate on the final test set.

Feel free to modify the splits or the hyperparameter search space to fit your exact needs.


import pandas as pd
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.preprocessing import MinMaxScaler
from torch.utils.data import TensorDataset, DataLoader
import matplotlib.pyplot as plt

############################################
# 1. Load data
############################################
df = pd.read_csv("realDataForTrain.csv")

############################################
# 2. Normalize
############################################
scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(df[['col1', 'col2']])

############################################
# 3. Create sequences
############################################
SEQ_LENGTH = 10

def create_sequences(data, seq_length=10):
    X, y = [], []
    for i in range(len(data) - seq_length):
        X.append(data[i : i + seq_length])
        y.append(data[i + seq_length])
    return np.array(X), np.array(y)

X, y = create_sequences(scaled_data, SEQ_LENGTH)

# Let's do 70% train, 10% validation, 20% test
train_size = int(len(X) * 0.7)
val_size = int(len(X) * 0.1)
test_size = len(X) - train_size - val_size

# Indices
X_train = X[:train_size]
y_train = y[:train_size]

X_val = X[train_size : train_size + val_size]
y_val = y[train_size : train_size + val_size]

X_test = X[train_size + val_size:]
y_test = y[train_size + val_size:]

# Convert to tensors
X_train_tensors = torch.tensor(X_train, dtype=torch.float32)
y_train_tensors = torch.tensor(y_train, dtype=torch.float32)

X_val_tensors = torch.tensor(X_val, dtype=torch.float32)
y_val_tensors = torch.tensor(y_val, dtype=torch.float32)

X_test_tensors = torch.tensor(X_test, dtype=torch.float32)
y_test_tensors = torch.tensor(y_test, dtype=torch.float32)

############################################
# 4. Define LSTM model
############################################
class LSTMModel(nn.Module):
    def __init__(self, input_size=2, hidden_size=32, num_layers=1, output_size=2):
        super(LSTMModel, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # Initialize hidden state and cell state
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)

        out, _ = self.lstm(x, (h0, c0))  # out: (batch, seq_length, hidden_size)
        out = out[:, -1, :]              # Take the last time step
        out = self.fc(out)               # Linear layer
        return out

############################################
# 5. Accuracy calculation (10% threshold)
############################################
def calculate_accuracy(y_true, y_pred, threshold=0.10):
    """
    Calculate accuracy based on how many predictions are within
    a given percentage threshold of the true values.
    """
    # y_true and y_pred shape: (batch, 2)
    within_tolerance = np.abs(y_true - y_pred) <= threshold * np.abs(y_true)
    # within_tolerance is a boolean array of shape (batch, 2)
    accuracy = np.mean(within_tolerance) * 100  # Convert to percentage
    return accuracy

############################################
# 6. Training routine with early stopping
############################################
def train_model(model, train_loader, val_data, criterion, optimizer,
                max_epochs=100, patience=10):
    """
    Train the model with early stopping on validation loss.

    Args:
        model: The PyTorch model
        train_loader: DataLoader for training
        val_data: (X_val_tensors, y_val_tensors)
        criterion: Loss function
        optimizer: Optimizer
        max_epochs: Maximum number of epochs
        patience: How many epochs to wait for improvement before stopping

    Returns:
        best_model: The best model (state) found
        best_val_loss: The validation loss of the best model
        history: A dict with training/validation loss for plotting
    """
    X_val_tensors, y_val_tensors = val_data

    best_val_loss = float('inf')
    epochs_no_improve = 0
    history = {'train_loss': [], 'val_loss': []}

    for epoch in range(max_epochs):
        # Train mode
        model.train()
        train_loss_sum = 0.0
        for X_batch, y_batch in train_loader:
            optimizer.zero_grad()
            outputs = model(X_batch)
            loss = criterion(outputs, y_batch)
            loss.backward()
            optimizer.step()
            train_loss_sum += loss.item()

        train_loss = train_loss_sum / len(train_loader)

        # Validation
        model.eval()
        with torch.no_grad():
            val_outputs = model(X_val_tensors)
            val_loss = criterion(val_outputs, y_val_tensors).item()

        history['train_loss'].append(train_loss)
        history['val_loss'].append(val_loss)

        # Check for improvement
        if val_loss < best_val_loss:
            best_val_loss = val_loss
            best_model_state = model.state_dict()  # save model state
            epochs_no_improve = 0
        else:
            epochs_no_improve += 1

        # Early stopping
        if epochs_no_improve >= patience:
            print(f"Early stopping on epoch {epoch+1}")
            break

        if (epoch + 1) % 5 == 0:
            print(f"Epoch [{epoch+1}/{max_epochs}] "
                  f"Train Loss: {train_loss:.6f}, Val Loss: {val_loss:.6f}")

    # Load the best model state
    model.load_state_dict(best_model_state)
    return model, best_val_loss, history

############################################
# 7. Grid search
############################################
from torch.utils.data import TensorDataset, DataLoader

# Prepare train DataLoader
train_dataset = TensorDataset(X_train_tensors, y_train_tensors)

# Hyperparameter grid
hidden_sizes = [32, 64]
num_layers_list = [1, 2]
learning_rates = [1e-3, 1e-4]
batch_sizes = [16, 32]

best_hparams = None
best_val_loss = float('inf')
best_model = None

# We will do a simple manual grid search
for hidden_size in hidden_sizes:
    for num_layers in num_layers_list:
        for lr in learning_rates:
            for batch_size in batch_sizes:
                print(f"\nTrying: hidden_size={hidden_size}, num_layers={num_layers}, lr={lr}, batch_size={batch_size}")

                # Create model
                model = LSTMModel(input_size=2, hidden_size=hidden_size,
                                  num_layers=num_layers, output_size=2)
                criterion = nn.MSELoss()
                optimizer = optim.Adam(model.parameters(), lr=lr)

                # Create DataLoader with the chosen batch_size
                train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

                # Train with early stopping
                model, val_loss, history = train_model(
                    model=model,
                    train_loader=train_loader,
                    val_data=(X_val_tensors, y_val_tensors),
                    criterion=criterion,
                    optimizer=optimizer,
                    max_epochs=50,  # you can increase if you want
                    patience=5      # early stopping patience
                )

                print(f"Finished. Val Loss = {val_loss:.6f}")

                # Check if this is the best so far
                if val_loss < best_val_loss:
                    best_val_loss = val_loss
                    best_hparams = (hidden_size, num_layers, lr, batch_size)
                    # Clone the model to keep it
                    best_model = LSTMModel(input_size=2, hidden_size=hidden_size,
                                           num_layers=num_layers, output_size=2)
                    best_model.load_state_dict(model.state_dict())
                    print("==> New best hyperparameters found.")

print(f"\nBest val loss = {best_val_loss:.6f} with hparams: hidden_size={best_hparams[0]}, "
      f"num_layers={best_hparams[1]}, lr={best_hparams[2]}, batch_size={best_hparams[3]}")

############################################
# 8. Final Evaluation on Test Set
############################################
best_model.eval()
with torch.no_grad():
    # Predictions in scaled space
    predictions = best_model(X_test_tensors).numpy()
    y_test_np = y_test_tensors.numpy()

# Inverse transform predictions and ground truth to original scale
predictions_inv = scaler.inverse_transform(predictions)
y_test_inv = scaler.inverse_transform(y_test_np)

# Calculate standard regression metrics in scaled space
mse = np.mean((predictions - y_test_np) ** 2)
rmse = np.sqrt(mse)
mae = np.mean(np.abs(predictions - y_test_np))

# R^2 in scaled space
ss_res = np.sum((y_test_np - predictions) ** 2)
ss_tot = np.sum((y_test_np - np.mean(y_test_np, axis=0)) ** 2)
r2 = 1 - (ss_res / ss_tot)

print("\n=== Test Set Performance (Scaled Space) ===")
print(f"MSE : {mse:.6f}")
print(f"RMSE: {rmse:.6f}")
print(f"MAE : {mae:.6f}")
print(f"R^2 : {r2:.6f}")

# Calculate custom accuracy in scaled space (element-wise within 10% of y_true)
accuracy_10pct = calculate_accuracy(y_test_np, predictions, threshold=0.10)
print(f"Accuracy (10% threshold in scaled space): {accuracy_10pct:.2f}%")

############################################
# 9. Predict the next 10 values
############################################
def predict_future(model, data, scaler, seq_length=10, steps=10):
    model.eval()
    window = data[-seq_length:].copy()
    preds = []
    for _ in range(steps):
        x_t = torch.tensor(window[np.newaxis, :, :], dtype=torch.float32)
        with torch.no_grad():
            pred = model(x_t)
        pred_np = pred.numpy()[0]
        preds.append(pred_np)
        # Update the "window"
        window = np.vstack([window[1:], pred_np])
    preds = np.array(preds)
    return scaler.inverse_transform(preds)

future_preds = predict_future(best_model, scaled_data, scaler, seq_length=SEQ_LENGTH, steps=10)
print("\nNext 10 predictions (col1, col2):")
for i, (p1, p2) in enumerate(future_preds, start=1):
    print(f"Step {i}: col1={p1:.4f}, col2={p2:.4f}")

############################################
# 10. (Optional) Plot
############################################
plt.figure(figsize=(12, 4))
test_index = range(train_size + val_size + SEQ_LENGTH,
                   train_size + val_size + SEQ_LENGTH + len(y_test_inv))
plt.plot(test_index, y_test_inv[:, 0], label="Actual col1", color="blue")
plt.plot(test_index, predictions_inv[:, 0], label="Predicted col1", color="red")
plt.title("col1 Actual vs Predicted (Test Set)")
plt.legend()
plt.show()

Key Points to Note

  1. Train/Val/Test Split
    We changed the split to 70/10/20, so we’d have a dedicated validation set for early stopping and hyperparameter tuning. If you want to keep your 80/20 train/test split exactly, you can carve the validation set out of the training split (see the sketch at the end of this answer).

  2. Grid Search

    • We manually iterate through various hidden sizes, number of layers, learning rates, and batch sizes.
    • For each combination, we train a new model from scratch and keep track of the best validation loss.
    • You can easily expand or change these ranges (e.g., add dropout or different optimizers).
  3. Early Stopping

    • We set a patience parameter that stops training if the validation loss does not improve for patience consecutive epochs.
    • The best model parameters (lowest validation loss) are stored and then reloaded at the end of training.
  4. Accuracy Function (calculate_accuracy)

    • This uses your 10% threshold formula on the scaled predictions. It checks, element by element, whether each prediction falls within 10% of the corresponding true value for both col1 and col2.
    • We convert the final mean to a percentage.
  5. Final Model on Test Set

    • After grid search picks the best hyperparameters, we evaluate that best model on the test set and print out MSE, RMSE, MAE, R^2, and your accuracy metric.
  6. Predict Future

    • The predict_future function is unchanged from your original code, but here we apply it to the best model found from grid search.

Feel free to adjust any details (e.g., patience, max_epochs, or hyperparameter grid) to suit your data and to find the best performance.
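Regarding point 1 above: if you would rather keep your original 80/20 train/test split, a minimal sketch of carving a chronological validation set out of the training portion (reusing the variable names from the script) could look like this:

    train_size = int(len(X) * 0.8)
    X_train_full, y_train_full = X[:train_size], y[:train_size]
    X_test, y_test = X[train_size:], y[train_size:]

    # Last 12.5% of the training portion becomes validation (0.8 * 0.125 = 0.1 of all data)
    val_size = int(len(X_train_full) * 0.125)
    X_train, y_train = X_train_full[:-val_size], y_train_full[:-val_size]
    X_val, y_val = X_train_full[-val_size:], y_train_full[-val_size:]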

Question

This is my dataset CSV file:

col1,col2
-0.09049732849448675,97.07421138751958
0.01706581776673801,8.931705375213935
-0.12610981210709962,35.23775348541896
0.031167010198950122,86.54867751356406
-0.053238021550445415,22.287907143775364
0.020438444672281317,31.388636228767187
-0.01565222813575494,93.97151854250646
0.22906818845030516,69.4499608684432
-0.2611698787716817,93.64285187572544
-0.2957446866807184,18.761178121102905
-0.5205892443687473,76.16039005338293
0.14079975792471874,53.70918678438088
0.23705597759198002,89.83426087115495
0.11900448613994938,73.96817267917855
-0.05261774608285257,52.67666231222007
0.08120536389089739,57.48031277482477
-0.10964451038664037,45.09610060773563
-0.4845673186393317,23.04112803211897
-0.06478251924575947,3.261716174240877
-0.14918256000337174,90.65409689896671
-0.08759536291182543,53.377850095060076
0.10029882758615853,13.138297971520993
-0.009918622549426565,70.80288049499147
0.16062090982715485,44.509847873616366
-0.09603715876384693,25.86202209027607
-0.0033966884949180945,11.996735169037102
-0.01824051735346668,72.6810220405896
-0.11917608111555861,93.65919081671609
-0.05494612123888718,18.465966437002336
0.13053277322131715,86.15064251373265
-0.3724950348201942,42.84877876773256
-0.022801163586592003,42.38081841212634
0.07915904849967931,66.80035640910071
-0.14649942687149403,86.58205861858345
0.3857647890239809,71.59557944671099
-0.013467690339096457,79.71689726229093
0.1913032220056984,47.22437212823499
-0.013326698369027213,35.44254286993759
-0.3695457981611071,76.36907185309816
-0.25990761547265084,15.76792251773578
-0.05358681331584539,23.144395166234585
0.29815473347436816,61.60255880915928
0.08123156499807074,7.131215857920453
-0.5339889229893859,30.078056940022673
-0.11232669235627182,63.08231913648018
0.30521270148277824,40.44822778161537
0.2730498228331732,76.58850437211771
0.18008782019471117,9.990175149410833
-0.2063665691917882,26.85580835288186
0.1260833214511237,82.223791787763
0.003971171006524425,73.27447173225598
-0.10226426751991102,43.13169158317856
0.03934472756074335,49.44003858672774
-0.14570623247968645,46.567002091004085
0.39324599311514147,29.868614748776345
-0.007693037981765757,45.14379306710379
-0.04674785897043179,61.696823339628395
-0.010660980810234541,24.996840345260136
-0.19698262074848064,30.511926208713614
-0.09224909927595494,18.606086056651275
-0.03504104517909163,53.237143752269134
0.10963123620583148,44.535185465119554
0.264049197434485,54.680548640951876
0.08107784944201617,87.93867352739419
-0.03918569568772377,98.35097852632039
-0.054210374602316376,45.91341297946296
-0.00908081961431891,34.4718237470684
-0.22448289788874776,36.93548135152515
0.001987073127817178,78.80710581959404
-0.3524848122792666,3.698648736069274
0.002251360786790367,9.376254504770017
0.6822327250007452,18.89362351528824
-0.0026087308364731343,20.104202427204637
-0.15149135449371368,21.879374173367548
0.0023354848231902856,86.57727171074418
0.07058544758201635,92.60011135668613
0.021728691655691582,52.70445159444371
-0.3085251405159542,75.63686772308004
-0.12457001383030422,64.47285783148064
0.16269086670125238,11.826800770956522
-0.005715485279426678,83.5521631593243
0.3082905982905958,55.229057540525574
-0.007040261348135048,14.912121143778762
-0.016744432769026596,16.757787604662138
-0.10610844816074032,10.812031650378708
-0.015016003027944744,66.5884355058072
0.2841736038113078,45.22015516702302
-0.10256440801256125,25.72589547384651
0.08015115434486056,11.035534446148597
-0.09022549671897946,50.38791032291661
-0.09218105137934823,17.499789049393886
-0.24919415039086815,20.918043388826817
0.29368765467964236,57.7002093159159
0.10192025994087224,55.912307638876946
0.10474316692380457,24.055116692569637
-0.30637372810728913,28.659328256127438
-0.20517543299322133,56.10909226193551
-0.036567247991874816,28.71403155177681
-0.21701439461471075,54.57974687955062
-0.1907572202454172,86.61490916432493
-0.05187530349145775,41.9054415613359
-0.35779568134372997,63.7622534786716
0.036180721247635036,78.73720100226723
0.13952066732034885,83.89588941745498
-0.45948623138984174,67.75301436452634
-0.09892973654081999,36.49280611149989
-0.35126542973457836,57.78776214033613
-0.15404323872542586,43.29347122002223
-0.4123980164443776,49.79282084118664
0.17422470225876022,26.505296705548215
0.4202744117117361,71.41403264777908
-0.0012482631503645625,56.33993739532509
-0.1279539973467072,48.14764362927211
-0.0763947171243433,29.493453721468825
0.044497044715346164,15.43564711298304
0.20270799177208543,31.311804741115246
-0.01524422496136366,39.66002348249342
0.022224151402027003,20.895341037792875
0.46381522988207474,62.954108828829625
0.6711015843522097,53.66455311880157
0.5268960152188265,75.8753058941458
-0.06210476903801643,56.641708656241285
0.401038242619858,27.758815056014143
0.5355312134425159,85.17793575898924
-0.17102920909779873,53.93216163512601
0.08595780861840979,47.97703429590883
-0.04553587498543695,9.382182718041756
-0.31504081294787506,87.19149396098133
0.24483448330382074,23.478552405966028
-0.011659585058686932,53.656562300364875
0.13364502685092267,59.49388230776486
0.4560225412233261,49.50459122364533
-0.36247692035308393,13.127879185055752
-0.15027788398839464,63.545823122341396
-0.0053388263924969035,30.489612095358993
-0.2777397295464552,61.03080194854748
0.03082788595807224,76.82088156819918
-0.3004420877194081,55.35753019302706
-1.5469368564637738,46.29571132880528
-0.4273425211055649,52.418673652087314
0.2872948130904874,40.337057285807056
0.3996368749796185,75.8591504566996
0.1287288498197679,67.33731934935882
-0.14610892438096026,93.22250921215912
-0.22782177964557,14.09559130241361
0.38310135115326815,35.36449059590018
-0.16692812273322247,91.00203209272715
0.15123133720562468,62.484766836538775
-0.07651464012090288,55.474645926021346
-0.021648467715622226,10.865094555668161
-0.31310225198210134,19.42000390286952
0.31619608832884427,41.616968063104274
-0.17534695025985272,43.71811254941259
-0.10375238887173491,43.040078491271586
-0.12392895576520954,42.1240244133818
0.012294535030424662,95.41352284714758
0.24577382634388775,75.59556261698903
-0.03903594926774875,9.4335340728314
0.13393921589354194,67.25949406468675
-0.07287073566952323,19.006296242548558
-0.1857393045432316,52.606680899816816
0.16364003132331273,23.012759289902288
0.10454296217664019,76.22527063060774
-0.07191425204913536,90.14401399844772
0.01580163088382864,17.260792137840095
0.049978396330625494,7.622227231715298
0.273642905722403,75.6247396335515
-0.057492044386316284,91.16838617306122
-0.009930379637768579,51.89160383948463
0.021855467820121367,38.65693471332024
0.18219320427316513,69.17521334562771
0.31045942211184885,92.17866693850404
0.8030609290250404,50.024916439696774
0.23344535535275082,98.09471762489625
-0.2841508401758265,63.02232842914487
0.07912439218128837,44.11689098711809
-0.015835748639277406,18.7901327755423
-0.08319868157672666,20.723092117013515
-0.009670445268163958,55.88964022087603
-0.19641733336950856,51.28202117214552
0.018304115686289,61.24337968340985
0.18225090040939643,12.740867261087233
0.27546966663191014,35.04375894869364
-0.010461414145530135,24.26049789794694
-0.1906702011411857,67.40246490438582
0.17084554297785703,57.548777546308486
0.3762592431704198,14.999007833297744
0.3417434658122848,55.627532541091526
-0.22172813487881796,16.164346837064727
-0.1899681041990097,53.947276061734804
0.846008399137864,4.662312196013473
0.33925349960749995,65.08717924524116
-0.05918483864289564,20.524843058277106
-0.0131204486481895,37.83989299052527
0.24645403169787855,46.156459437563
-0.06733892156262496,64.34522126627317
0.001901091633966776,54.93461473043162
-0.12900019417923966,40.41591131607494
-0.49863621802061714,39.988984771662395
0.24964253816351176,43.433800599555255
0.1242797660944277,91.15028036418796
-0.0070895946741583275,73.96278525953409
-0.050772218422330734,49.06683555572056
-0.07211054099565192,38.81123826776437
-0.1531331424726558,46.68234148790708
-0.536724988303644,0.7909071186906234
0.2751109772299607,26.894117861185872
-0.17124097215340456,42.41082406963904
-0.02899519875792673,17.644772467568238
-0.043584542142324975,38.924799801142775
-0.8390753641545308,83.8255342406184
-0.6050451189237662,31.836754710498745
0.9616643989691226,43.801243432213106
-0.3858832843876683,50.81369878511273
0.36370672051348885,35.697580658165265
-0.2793244818513399,86.31241954121856
-0.07160403246672419,59.603553289869936
-0.20383974671701593,68.3813825125783
0.08680217521812897,45.65455311466455
-0.2510773439526098,61.85915365384279
0.05255610428250286,0.06556803532583806
-0.018136109902685486,2.7792420576749906
-0.1779648170425712,49.32604709446584
0.04943183885584645,68.80683656623576
0.012140596883076926,91.28356041149134
0.037850534559922475,18.134857365687328
0.15177250204283094,49.29842802686298
0.15780873555630728,39.20726459526161
0.6826360529209166,74.88648384642048
-0.5007145503625682,6.6819166456188555
-0.04592944080347078,4.107371259384554
-0.24783704065287085,84.59870249695629
0.0939007313479802,67.75066684853597
-0.0599015031236969,85.79895996379233
-0.007104166711115879,60.773462002253055
-0.5045490165790615,68.62161165476017
-0.24570011398315578,78.10556623997313
0.171917134543899,68.84948086488669
0.0011051625468769563,61.90594005845838
0.3300214946375013,86.89694807410879
-0.007913797781301269,67.24232474759535
-0.1068875061938415,33.97732067934985
-0.00788006624823149,64.55515179491272
0.027903629045769415,97.90788544566912
0.13831243838909996,73.2096955062701
0.07252950192275957,97.84725229194645
0.2350239526388489,63.25608897564309
-0.0683639468472649,25.39043692757653
0.24702506700837482,59.7055171955507
-0.012552724100701476,77.14087682757759
0.17299399840624846,49.7886432525421
0.210781663715828,76.97062417168925
-0.4445868886082357,7.272645421618462
0.2647517471975878,47.779925694477896
0.05703851809428192,66.73851805405211
0.033575389536596396,67.14249200475011
-0.08470591030848738,81.33595175716293
0.610877369263951,74.98367383631094
-0.029842918401038837,46.239033905698165
0.03771590455310656,51.127996337993764
0.1357387282724645,58.21885522213014
-0.11564934234249613,39.19105697927231
-0.15423699685586204,29.123442046670355
-0.005589426115984024,50.167054462336935
-0.10515893771721425,24.017511961792778
-0.03469497720390678,21.0453171404436
0.7016743385402983,99.85667011285531
-0.16445567598522068,55.98950160029834
0.338483595596715,34.98963822557395
-0.1722897459332015,55.66975509105584
-0.25301215706654434,10.908451202260482
-0.06415351335294675,41.97103561399792
0.253785141721134,47.53246343540003
-0.11873818519309445,49.46005742862882
0.04508635568766193,62.71875316187816
0.6871334307186314,49.737948701163965
0.28591926020480557,67.5524044735248
-0.26799367894214815,54.28435298259649
-0.1545851501920869,63.90781492769264
-0.003987031369473303,79.81482902974612
0.044926530523833684,65.47822669497285
0.19302515278934979,65.74513989613759
0.07727846986542049,54.48129118393938
0.023211540381455845,13.538888895203408
0.37727469849208595,48.497203368782365
0.2411434488687691,98.40833029436988
0.2418694035826364,97.71129328753645
0.023121413582838393,82.66558747407689
-0.11146576655976036,29.90216344394708
0.1629245189188494,48.17097697363423
0.03957476246100088,57.889693321798205
-0.16643745452122835,36.2778512008452
0.1247516838321815,70.45438897521701
-0.04014400590059153,51.763008368524055
-0.06951785311503038,49.39744110970328
0.04981871923892073,35.25043763030457
-0.3030852891350318,85.53217961885453
0.9438062603972819,73.15327580503302
-1.0491538654210486,35.51520345531151
0.6265004249624915,47.56193505622899
0.11977161262137917,58.820795688334094
-0.6457642058386424,68.14752453791691
0.08374887949310586,63.786868847909616
-0.23975002072796797,79.39628787876445
0.09788412584811126,68.33039287519487
-0.12127547550417628,53.8727415049567
0.7287026503303918,61.904626273948814
0.6146780378252262,35.35345347648614
-0.11395334864331476,88.15920063848752
-0.2504516526380674,12.797447317091496
0.08470197531264376,16.404652977694255
-0.016327651968494474,64.11848125355081
0.26009219842694015,44.28718365397753
-0.6362969478752636,13.000556447137152
-0.14822193548386736,1.3888167645044125
-0.18661983734136003,39.845907915320986
0.29566791150242233,19.087304443463072
0.025444987812428533,95.99113865741093
-0.06155212108071912,67.85208799803789
-0.1512478437807431,53.708264541742764
0.3340645597427627,6.207202888384383
0.3085858978533809,62.71209123505513
0.21607341993677684,87.44610887069997
0.26945045462841016,56.12833889784864
-0.528552483739388,48.98740634577518
0.30299281120990157,43.283628547952866
-0.18962771078729063,51.87127943464297
0.09904153354632947,44.44879903575918
-0.1256370497317238,4.5274760132744865
-0.17498486132792243,37.76542597143299
0.034411874000874525,71.59165744511978
-0.048497568381524754,44.11713976768926
-0.20220728279781688,8.026652716382175
0.08335345817302106,36.205119596150894
-0.08186161264490649,95.95381896163944
0.0516206394666544,48.196218589632906
0.030385199160186054,37.727152440028114
-0.3709846926553836,73.72706624153533
0.046380231406629495,60.47213671943452
0.1768829084782733,35.361160667892015
0.18225841881943872,65.7982106781941
0.08630602782070557,60.361300773421
-0.18197586503394367,34.06659938455088
-0.006724630506311122,57.23002605913258
-0.04345517422914266,5.677775879024911
0.14938857177940756,66.0187538608035
-0.0666954344177662,30.62127354981672
0.010640448820737189,69.67565275861818
-0.13931113104494341,92.21974803184537
-0.060466482861780434,82.82387035647437
-0.12087184330645155,21.736126397626336
-0.03309614429918915,30.442512451318393
-0.1200132428405893,59.91950311960037
-0.145017609281127,62.37184629978144
-0.05804979253112697,14.609630258610565
0.19539350851386345,24.00222654762329
-0.019330319419531995,63.26473500558896
-0.06140065398017805,38.146903423818834
0.2546592756761613,30.409941042501583
0.14892450879006697,38.604443654947666
-0.06759323576129474,28.134477106535634
-0.14453616013052958,75.90472329363979
-0.09500337701253304,68.63730114330909
0.059546214138922006,34.83023744802343
0.282668430562384,26.290940150221164
0.10555607281861426,58.68766055952312
-0.286039302253827,74.34612618426394
0.6015680292129575,54.58103111134529
-0.30012126350466145,1.6382703460328298
0.28450788332734483,82.65452190340523
-0.27550687538625035,73.30387757725747
0.22269069869169017,24.39349305445896
0.3382981529175851,81.10177063694047
0.12960185514569478,28.063415911942
0.49215061289228784,71.5766015264821
-0.011573884223054201,67.25006417285674
-0.23620181851251007,1.0517483180696114
0.1282315263872015,12.891649925532345
-0.00879283280081137,33.639061562463304
-0.03972424802172156,30.638250327950946
0.033626968032333615,60.022264252625355
0.1830038794210767,37.885416139814325
0.05349332927083982,70.02058734956199
-0.041098529704684625,77.3561936058411
0.0652572499032917,91.96299938217061
0.3420388011648891,83.62990624506695
-0.21084753366593634,88.27833777959427
0.09380752456057782,30.902029864203918
0.4673188019038027,3.0935439718938302
-0.24638168272797575,75.96322853008596
-0.1378651377268537,35.49808254163785
-0.19014708603445768,49.59803806387744
0.08266178047763299,71.17041036708041
-0.19909574682536263,2.121646772454115
-0.050979830303312504,34.51015296839772
-0.010190985793088539,1.0737232742443157
0.0,39.95292152666086
0.1611582049615089,60.729990564182344
-0.26580645187523355,38.90220975873099
0.020517608399060965,59.901330283577714
-0.21906435663844506,30.090200360996565
0.0012153272993888624,41.48245779174456
0.06412560816690852,53.7804462546482
0.2907940692788377,82.59650316621988
0.00991198180509267,19.553213047068887
-0.06374990905588998,80.00599016646238
-0.28507717997800264,57.73162413500251
0.19925045965557547,74.21397644371143
0.04086534664463996,95.35362865468839
-0.006743593204449899,28.94453731149589
0.0430722581955707,53.63050072490133
0.12257274249815543,59.45821247001454
-0.11074691568466737,97.3903704751975
-0.012717077405872234,65.72427627105918
0.06610870054450352,77.56846420291585
-0.48326586735992266,92.76668223567498
0.012679982898393286,93.03187179594457
0.03891443252052099,42.614439578529364
-0.08423751709253857,72.78583682476875
0.05772301280369973,27.138980667049033
0.11608482831100195,62.233110162429185
-0.08860415021145633,38.38840320643382
-0.06182341988474893,51.80846971910703
0.15380022783173525,15.025414838238682
0.0688633503612899,19.981974850928253
0.001969377305337783,3.071417205612517
-0.36085832548063257,39.685777310487836
0.015125633904363467,16.735887341021005
0.16425776776335152,28.832082654934087
-0.18000696966052096,66.68494627975285
0.23461846081687965,21.972742916481277
0.29646392921728115,68.26915477413642
0.015208750542577084,53.719051667092955
0.21656697096516583,34.22795649323835
0.24854989693045315,21.062421732557794
-0.29208442193774975,79.62038711991482
0.05213968963058033,14.024755735305272
-0.21285036979767744,61.88756523484814
0.07893637235482374,32.21403821369982
-0.0680163288034001,53.01549891872453
0.12526007172033607,71.37552777322274
-0.12815442073765454,59.988007477829655
-0.00237269232757573,44.456518684237075
-0.0403570935764243,100.0
0.1666790275993825,21.601289611149273
0.15265097969097813,15.485416931391875
-0.14259830315940064,34.83419266782582
0.1708396046048135,80.27800295327539
-0.3231269363580634,63.46132278883738
-0.3404096343402464,54.45706484340816
0.040174765852448203,56.936212461038124
-0.0007968387974669065,84.97705145398173
-0.03189423780842368,84.91056077548843
-0.008952039143720404,80.58311156874653
-0.2003555790300697,99.05574894402896
0.07001536098309816,12.546856875505414
0.09546823181762872,68.38469299302251
-0.062497661321773555,43.26044554099576
-0.381307395468849,22.60067781349826
0.014233223588950029,46.71415943885419
0.4672884721340526,85.54289392078877
0.2472130911641831,97.99069438216458
0.0019879975919860674,48.13467905855218
-0.05205391750077196,45.2175387837814
0.0856185167739489,26.084229523755
-0.09699003745076944,90.33153864678683
0.008630160001744805,26.061106153901914
-0.08769158755434431,5.568145733746224
0.0,7.02196416474303
-0.1284318249570222,22.106782948094846
-0.14300025555839452,4.073879416954189
0.024323051896030786,98.35146428040954
0.8251860859715181,52.48419350678803
-0.2807385463047993,30.47808539357872
0.14071864638109105,74.06552380026233
0.04776531707366703,17.516910197617374
-0.02315496626273012,90.25162939206018
0.27875727304453035,4.524479217986685
0.10609542487421147,15.52528709917794
-0.2799655517345953,35.329810527064524
-0.047348590597730694,48.95488194925303
-0.07202142015149736,34.42981391304632
-0.036478800673795074,8.718013339272364
0.05076765617332158,32.82784371578425
-0.14284560103942906,62.63600887420615
-0.17746904392547588,59.85428666137598
0.08224498320385375,40.39601774615055
0.5385588934951682,92.5814930107478
0.09374395862134784,49.3529603029114
0.1580473034475733,22.34635721047292
0.03563667610274775,75.34501887691003
-0.002455084333164812,76.83776982854833
0.1439947364932736,98.08656766143524
0.10691258410651787,86.2063041970973
0.02594121824758629,85.9174883821512
-0.0075868964122779835,81.46311890954419
0.02395142244445182,38.47793530355844
-0.05481215127267598,94.42239324370443
0.41766159099867045,76.98310254719513
0.16862018421445535,46.23857273057227
-0.1445574124203083,80.47441843592019
-0.14249874560963371,29.131031254653454
-0.018038750653210887,9.967699263798918
-0.14092866573090113,27.71814464318303
-0.06770015935606415,3.442836602079369
0.07296345441466401,0.2816077829366313
-0.35731460775712115,34.623880019901385
0.1606262464013934,41.12089931562524
-0.07601929950303533,48.61213765279089
0.015552515475563824,7.202656826036508
-0.2581154654746408,69.49207273104932
0.04002295755694613,19.159383605789746
-0.1719509848170798,88.21748735514369
0.5596133429504618,42.307450089195186
0.21213877450114413,80.44618952871558
0.20240100631521607,34.411358440033
-0.4681214236405087,48.838401820313955
-0.2025793232411346,75.2578298277554
0.20852536028347005,56.67739367063033
0.16983147879374294,4.939646388973619
-0.28955219375255103,21.083889222072397
-0.2335596744715967,58.50039067445918
0.24445386691870752,34.86786606861709
0.5617365319246868,18.862271968467574
1.320751952968235,86.2294333287139
0.019805904120212332,34.502889916665445
0.5059208421703764,81.29098106511802
0.1261144485172431,80.92637508777932
0.27702951396899206,77.84975361133996
0.23847003980355183,37.74424416525359
0.03263392904772698,46.06019930659094
0.04320089787416232,58.844316253735855
0.15542673151989653,59.01301495031382
-0.22777341525649333,33.11005890781622
-0.24154624835452743,56.82859890126433
-0.003914835548948973,68.37764751432069
-0.4136151340319416,48.976904087258625
0.11927744627637574,11.697902753487485
0.13671139999086582,69.76209773231487
0.16284301697301573,29.814414079844433
0.07835546536729697,23.21300264799898
-0.2324215686274578,98.97631041035685
-0.06353983714893915,56.331215614107734
0.14801628385615492,59.04275337961729
0.26171147819120394,84.49081666798804
-0.04085575772888323,19.717784218786928
0.11204656873670049,32.19978595278053
-0.15586957372685523,75.47210395399078
0.025071372060731276,30.745607240285196
0.14275733604561722,78.54310472306823
-0.07011742369055368,45.0015028564009
0.014785319607195445,32.358413451393595
-0.229510600926553,37.17361322419444
-0.09517314466828762,6.063968438388722
0.0814540445670482,65.10923501736524
-0.13171409689235442,54.04133786269283
0.17030145125728058,64.15298632343394
0.06382559386693303,22.82062870252218
-0.09136356020431391,39.84355875972261
-0.05632647373703994,11.11323854516522
-0.07811602527282942,49.193208433758464
-0.08357166734762386,66.31564553603192
0.02534324758819128,17.07654100587007
0.1225547307895771,96.21974000527955
-0.14480384852609288,72.68273511326639
-0.21965475378576538,26.146997649923254
-0.059942743936766796,53.40368819051575
-0.039377661357430746,13.697011846401608
-0.09788606740504563,2.68784549569407
-0.4169023553595591,18.239185213504392
-0.041005004395524076,15.580114800753918
-0.04760912698413218,17.711308227182254
0.15822016851078555,40.81400229146316
-0.23985392337160016,74.53414457968212
-0.0546044884750469,47.57533053584591
-0.29055480223955643,33.77398235994845
0.27291103542819,20.88259502860493
-0.32120606156157455,38.20069500904029
-1.105724247686864,38.817668517663975
-0.12480335605662818,45.99102744859204
-0.507652243169038,47.13340598417876
-0.6321453790264218,83.83266685354663
-0.372988450338736,56.439785351896546
0.4858873995572456,47.80839968589003
-0.03957221204381462,77.85208149407912
-0.1507176540299041,52.522259634938386
0.1515481997275915,47.13887949365961
-0.08470673050084232,77.22986970543548
0.4104711511383874,54.01668222474777
-0.5217794313333645,58.910165875003294
-0.04782645775595854,35.40114665418685
-0.3569824897188296,16.802229400156797
-0.28235431338823536,69.20397849377078
-0.12396988643781198,43.065236012705434
-0.08783508776705849,35.818694228573655
0.507177647656558,71.37913692333423
-0.09139040199558933,31.434007537806863
-0.16161571322335577,46.12465244845976
-0.1526451371598661,72.04653896713351
-0.1158829534752911,13.950744136266351
-0.18942353120136146,24.301975910319744
0.1522400032011249,57.81896142057173
-0.2751413208042856,65.24912555123205
-0.031243387437474474,28.376394121467584
-0.20667674195935698,65.82263685861362
-0.13742301799749979,43.41805907058416
0.1202898136578139,24.45953263667025
0.15132328476146803,36.519870457365364
0.30917404600261555,34.7164197773184
0.08790820519629654,24.47084169921859
-0.32415613499859014,54.31558110910588
-0.15417562434499188,9.48776747443379
0.6663809612314595,3.947630872936704
0.06117939616408905,86.57606076129429
-0.24579327046834873,67.45020744205735
-0.03337135651725845,79.93978025372407
0.10893453148238935,7.562367719789045
-0.005724443248904063,16.515538460454064
0.12710023021831302,45.98538613248749
0.011754364078090074,75.88582679736095
0.1465363105358379,98.66813633869255
-0.2147103595819962,66.1271107941579
-0.015719178170442275,28.24648957742628
-0.31932989690721647,5.587746754820964
0.19904799087809413,20.711030798356706
0.15149245928400384,56.38932307237554
0.12577620739594622,78.1607595201559
-0.4964616096588164,86.0746769682886
-0.019923615794890992,44.54287624405314
-0.0036316628856772316,35.7573968130874
-0.2686183028177927,13.722622559295175
-0.08510499672051987,32.911066719183154
-0.021411188373906785,64.10255720436294
0.18733089769584693,51.35065083266001
0.09155695968695543,33.505801365013546
0.037368131819193906,15.832545993678975
-0.13733456439216407,70.73538253173504
0.16884331752755635,27.87610922934911
-0.47885952215476774,50.60757819642118
-0.5115723870091636,23.863919945053304
-0.21863893811471002,45.01370793240242
0.6826642932024485,30.789077347831057
-0.03019549554377548,61.35104754372938
-0.009690040480636073,68.2100378192381
-0.35141511201565484,65.85974271506636
-0.08268993405695968,35.92213647025302
0.039047814597830785,60.17702201218521
0.021778817299441817,79.61391160059313
0.00935115032210444,86.04840644724729
0.003614743556920041,86.3629126931857
-0.2582463318991856,26.16003285333659
-0.7168132175673437,38.59837200325927
0.32921908814531403,12.135675009646736
0.23232693378392918,48.36267777834043
0.015714825227640387,76.62903669631855
-0.33118667726464035,51.87787001615859
0.07982258654016057,54.0967330063156
0.06046325579595512,45.64846705575889
0.08978885117561217,62.5172695559333
0.13479344516211447,71.905512776365
-0.02595144824809375,36.382220349262425
0.10053945827628276,70.9580078743358
-0.0720441241971075,73.9480710713986
-0.09356970413457223,32.53845908074655
0.3067267358223992,43.217340311267975
-0.441461261572894,27.9325740086401
0.16864269481925392,28.633291779815668
0.6428205796170555,48.61984557368261
-0.9455313993420514,65.22908938582917
0.37255318591382713,37.07060883951256
-0.4555854274315348,61.15305225016272
-0.1478310182444556,59.86701093520149
0.2737472379757016,19.19887869513479
0.17676457395140394,43.87876550915834
-0.23803205551553808,62.31219719708064
0.042105263157894736,31.331101799246042
-0.1443076599326538,64.62313421280112
-0.19964626984331993,75.54477333055463
-0.4660165226919414,74.48152871065071
-0.35809147997284635,14.33342774046431
-0.7309567674980005,46.9772741427907
-0.5160014499001622,46.855935767041274
0.8983078399735096,59.58189146676761
0.37717960807038037,45.33945290447263
0.49069210236053545,84.94464182463462
0.005793630459372578,39.093428126356415
0.3778139049666604,60.751756060698646
-0.25849308418365097,71.9510684896268
-0.38186318089132276,75.45277548436637
-0.21507851841971512,79.95094086002399
0.5706959794278333,43.66058083568116
-0.058314998128651055,39.106079537662104
-1.0587738053975044e-05,27.6587655590401
-0.2603101551135557,17.598081623003644
0.1465876727293651,79.52305857215688
-0.4174850621839267,36.363488305631684
0.24527596553128436,51.2484840104406
-0.11960372347287783,90.50282834826898
-0.030861628386233263,41.68799033123265
-0.19558466970096372,73.42927548204469
0.3940136435974499,31.237454988749374
0.2494113097097309,31.977854539303134
0.07737923138889852,42.19692034651341
0.20102621433458642,5.540049917824348
-0.0031461151970249126,95.52404811902181
0.27325609201558093,4.54060795135653
0.09121268011030244,4.296673968983473
-0.014306383403086582,71.52697806070903
0.14757673053627796,36.893489289728514
0.09905542786239731,65.72442654801918
-0.2595939566400863,49.05717791801049
-0.0602823748919838,49.18359293217408
-1.3646732459608286,21.08751536353352
1.1340309297465276,30.730877053010154
-0.0470631596257102,48.821327798305774
0.12624018708257495,83.55635394182848
-0.47359216907659935,49.28948078725633
-0.1302731816114565,90.80774334521587
-0.04334310401134461,23.618822691009093
-0.17786339526530281,18.592664296418576
0.21570706884345733,32.79330331132695
0.06677460107667818,44.42325047237126
-0.025926418808701352,4.952375247478368
0.14450434817657154,1.201781484699487
-0.07373661180585998,27.810618514684887
0.1358064313720065,24.594094963758693
-0.08909173951171362,76.40569725051006
0.035268013596151,30.636466807451047
-0.32701165170960844,48.98015742486558
0.00273067307690482,55.47380006287814
-0.13525493802710423,21.706482567154417
-0.18071893067504424,30.07294752184786
-0.6161600658271956,77.08608759501352
-0.2609302479716807,60.17470557055883
-0.014602148967512744,20.70063847602191
0.16521765595259938,22.957573188226583
0.4518678643754313,89.62892747994286
0.19192840430295743,48.01615437957656
-0.6546690064593237,71.06110143567926
0.19795588402524023,31.105872668791683
0.11247314897810409,57.172870517608324
0.10049658458268562,24.427487863990624
0.08256871805888531,65.74558645640629
-0.17315402106271255,76.45986055252615
-0.043746250340508254,43.246663003716755
0.042790927980604875,51.30465967703562
0.2291375343272714,30.36960067084587
-0.2950394982566474,94.46486381500262
0.0017030476197939328,57.396283486793166
-0.0858899766429415,28.222112362367437
-0.12850083573246798,32.494537666509885
-0.2829002795835456,32.86022035674456
-1.1524110707845958,50.519162991478375
0.6416244380069014,55.125109559042684
-0.11441417324776187,30.076034102940895
0.11361361491483105,1.5575426200077547
-0.10803118361665173,50.94449114435399
0.6160884040714899,46.43403949420318
0.5626210345767615,59.38853304744157
0.3026787060129314,47.23969501996157
-0.009659308194998014,27.5476395317754
0.5193046846512962,24.180838219085203
0.3323370443028895,42.28299339808759
-0.27020899761035216,64.1431266048707
-0.26439309617017914,49.996502519535326
0.11363299002244417,21.33077313034374
0.13535664323178076,27.53326541906192
0.14074620752125153,64.58137979218021
-0.24531311923835317,68.38950356500865
-0.07940430092285153,28.742748714917333
-0.34686834847322456,54.3261408687406
-0.3028415565629045,25.254411749917434
-0.2497094839405151,14.95938830789903
-0.4279079828836894,44.90230118701618
-0.07185617385486283,71.21857880700291
0.005521418141735821,59.52302815700583
-0.3264897408656891,59.06736562054646
0.728004271241125,48.321989383598165
-0.09797534097502453,53.795845889552886
0.03746754246465453,44.550232376316295
-0.057292585204582316,48.97351515227243
-0.0728208218515814,40.586707105991465
-0.1817327585780215,47.1116924884137
-0.15064881189684173,23.163262810053237
-0.6038607180553736,51.22045163677388
-0.08665024617164839,8.506788361996634
-0.0744401118591254,30.84350025549767
-0.47953732889384587,40.62513032948327
-0.1375740181889781,55.613048751531856
0.5478903479252061,47.41874691122586
0.12322883053793726,80.01429612731451
-0.052240916751976625,97.75417183863806
0.18680122941299798,33.98600484767798
0.3941043124286036,23.084324214943667
-0.33840618087790525,99.95113839285563
-0.04262349743217477,9.243555579557562
-0.15752552772462583,18.632435066772487
-0.060958407551058684,44.06324117545826
-0.05552176584898196,59.09237535421915
0.23739975459727117,57.27621739156567
0.543239165360444,64.90713927649588
-0.07039834051794368,36.78192039393963
0.011222368970738654,19.31599649437173
0.12015668041954645,38.918476383714804
-0.08887293761283648,77.23202376426973
-0.36645281156809956,17.157354135489978
0.27591826233321404,58.8406468504219
0.13068919023432266,46.75002289974641
0.18157233126320757,68.40792956857904
-0.16849755626783708,53.662227773700465
0.05389875394505306,43.15274039119096
-0.05978567192866727,82.69429176511704
-0.008749614233819796,62.04606830610807
0.3143114839229023,84.3909767870436
-0.0307780581091479,80.93253771624838
-0.04838964392676386,23.743634597481318
0.34869913044002393,51.84603298980063
-0.20385915771294286,37.07080778396184
0.34235345155366714,18.43229373923043
-0.16537479340217137,74.3331139512441
0.4657265903121965,20.593033000660036
-0.1758094495467231,13.623032475870794
0.15044196881018157,77.11612729252161
-0.05884992360313425,30.13700150417069
0.11246076635783667,42.2544269440919
-0.023636413589559375,59.000682314089204
0.0731501283148027,83.0997048002411
0.11507835211646086,88.13751002036845
-0.02216740908454372,11.202961570815926
0.04026673401986176,47.72233980220803
0.024478978703717846,22.506412309840083
0.08925066143211093,78.88985882951015
-0.07318295547010456,70.85306693572416
-0.033855638873190944,26.65648408716757
0.4416914943630708,34.70592237887347
-0.1393025904460335,52.53620286442998
-0.0979552792874389,20.2113403041711
0.09679480353529356,14.847180608202915
-0.16132824260192635,25.19825349589155
0.30184577926834666,21.10450560954573
-0.02025867147094892,95.4417881128135
0.06713173652312114,40.13769814635179
0.13098789678629497,68.53350383133247
0.6515058324496239,80.67820633866592
-0.3955143989260996,31.731672063817502
0.12936464190375982,27.561217685578498
-0.17035478184479177,79.50564207527682
0.021153436914985765,8.30597603595089
0.28523063901820334,59.20602667623488
0.12490821777915949,20.189029403523957
0.0442639187423555,25.261199814498887
-0.03948393475681644,48.46806702750204
0.007396284494384431,12.537519390063173
And with the previous code you gave me, the best result is: Best val loss = 0.030709 with hparams: hidden_size=32, num_layers=1, lr=0.001, batch_size=32

=== Test Set Performance (Scaled Space) ===
MSE : 0.034912
RMSE: 0.186847
MAE : 0.139514
R^2 : -0.040570
Accuracy (10% threshold in scaled space): 34.02%. How do I improve my accuracy? My goal is to achieve accuracy higher than 80%.

Thinking

Advancing strategies

I’m working through sophisticated data cleaning, scaling, feature engineering, and model architecture. Considering an accuracy metric and exploring ensemble methods for enhanced performance.

Refining the approach

I'm addressing the need for refining scaling, considering separate models for columns, and exploring dropout and architecture changes. Negative R^2 highlights the model's underperformance.

Thought about improving accuracy for 6 seconds

Answer

Below is a comprehensive list of suggestions and code modifications you can try in order to significantly boost your accuracy (with the 10% threshold definition). The main takeaways will be:

  1. Data preparation (potentially the biggest factor).
  2. Model architecture (stacked LSTM, larger hidden dimension, dropout).
  3. Hyperparameter tuning (including learning rate, batch size, epochs).
  4. Rethinking the threshold-based accuracy or ensuring it’s appropriate for your data distribution.

Important: Achieving an 80%+ accuracy with the threshold-based metric can be challenging depending on the data’s volatility, magnitude differences, etc. If your data varies greatly (e.g., col2 ranges up to 100, while col1 is around ±1 or so), you may need to treat each column differently (e.g., scale or transform each column separately, or compute accuracy per column). Nonetheless, let’s walk through the main steps.


1. Data-Related Enhancements

  1. Separate Scaling for Each Column

    • If col2 is on a much different scale than col1, using the same MinMaxScaler for both might distort one relative to the other. Instead, consider:
      scaler_col1 = MinMaxScaler()
      scaler_col2 = MinMaxScaler()

      data_col1 = df[['col1']].values
      data_col2 = df[['col2']].values

      data_col1_scaled = scaler_col1.fit_transform(data_col1)
      data_col2_scaled = scaler_col2.fit_transform(data_col2)

      # Combine them back or keep them separate
      scaled_data = np.hstack([data_col1_scaled, data_col2_scaled])
    • Then, when inverse-transforming, you must invert each column with its own scaler.
  2. Check Outliers

    • If you have large negative or positive spikes in col1 or col2, the LSTM might “overfit” to those points. Consider removing or clipping extreme outliers if they are not valid data points (a short clipping/smoothing sketch follows this list).
  3. Temporal/Seasonal Features

    • If your data is time-indexed (e.g., daily, hourly), consider adding time-based features (like day of week, month, etc.) or additional relevant exogenous variables that may help the model learn patterns.
  4. Smoothing

    • For some tasks, smoothing the output (e.g., a rolling average) can improve predictive performance if you only need approximate trends. But be careful if your application requires precise short-term predictions.

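As a rough illustration of points 2 and 4 above, here is a minimal sketch (not part of the original pipeline) that clips extreme col1 values to quantile bounds and adds a rolling-mean-smoothed copy of col2. The 1%/99% quantiles and the 5-point window are arbitrary choices you would tune against your data.

import pandas as pd

# Minimal sketch, assuming the same 'realDataForTrain.csv' with columns col1/col2.
df = pd.read_csv("realDataForTrain.csv")

# Clip col1 outliers to its 1st/99th percentiles (bounds are a guess, tune them).
low, high = df['col1'].quantile([0.01, 0.99])
df['col1'] = df['col1'].clip(lower=low, upper=high)

# Optional smoothing: 5-point rolling mean of col2 (window size is a guess).
df['col2_smooth'] = df['col2'].rolling(window=5, min_periods=1).mean()
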
2. Model Architecture

To get higher accuracy, especially if your dataset is large or has complex patterns:

  1. Use a Larger Hidden Size

    • For example, try hidden_size=64 or 128.
  2. Use Multiple LSTM Layers (Stacked LSTM)

    • num_layers=2 or 3 can capture more complex temporal dependencies.
  3. Add Dropout

    • E.g.:
      self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True, dropout=0.2)
    • Note that dropout only works if num_layers > 1 in PyTorch’s LSTM.
  4. Try GRU

    • A GRU can sometimes train faster and yield better performance. Simply replace nn.LSTM with nn.GRU (with minimal code changes); see the GRU sketch after this list.

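To make the GRU swap concrete, here is a minimal sketch of a drop-in replacement for the LSTM model used earlier in this thread; only the recurrent layer changes, while the last-time-step slicing and linear head stay the same.

import torch.nn as nn

class GRUModel(nn.Module):
    # Sketch of a GRU variant with the same interface as the LSTM model above.
    def __init__(self, input_size=2, hidden_size=64, num_layers=2, output_size=2, dropout=0.2):
        super(GRUModel, self).__init__()
        self.gru = nn.GRU(input_size, hidden_size, num_layers,
                          batch_first=True, dropout=dropout)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.gru(x)           # no cell state needed, unlike the LSTM
        return self.fc(out[:, -1, :])  # last time step -> linear head

# Usage mirrors the LSTM version, e.g.: model = GRUModel(hidden_size=64, num_layers=2)
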
3. Training Strategy

  1. Increase Number of Epochs
    • If your model is still improving after 50 epochs, train longer (e.g., 100, 200).
  2. Tune Learning Rate
    • If lr=1e-3 is not working, try 5e-4, 1e-4, or even a larger value such as 5e-3.
    • You can also try advanced optimizers or learning-rate schedules (e.g., ReduceLROnPlateau; a short sketch follows this list).
  3. Add Weight Decay
    • A small weight decay (e.g. weight_decay=1e-5) can improve generalization.
  4. Don’t Shuffle Time Series
    • Make sure your training DataLoader is not incorrectly shuffling time steps that should remain sequential.
    • Shuffling complete (window, target) training samples is generally safe, because each sample already carries its full temporal context; what you must avoid is shuffling individual time steps inside a window, or letting rows from the future leak into the training split.

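As a minimal sketch of points 2 and 3 (assuming the model, train_loader, and validation tensors defined in the example code of Section 5 below), you could combine a small weight decay with a ReduceLROnPlateau scheduler driven by the validation loss; the factor and patience values are illustrative, not tuned.

import torch
import torch.nn as nn
import torch.optim as optim

criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min',
                                                 factor=0.5, patience=5)

for epoch in range(100):
    model.train()
    for X_batch, y_batch in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(X_batch), y_batch)
        loss.backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        val_loss = criterion(model(X_val_tensors), y_val_tensors).item()
    scheduler.step(val_loss)  # lowers the learning rate when validation loss plateaus
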
4. Revisit the Accuracy Definition

You’re using:

def calculate_accuracy(y_true, y_pred, threshold=0.10):
    within_tolerance = np.abs(y_true - y_pred) <= threshold * np.abs(y_true)
    accuracy = np.mean(within_tolerance) * 100
    return accuracy

Tips:

  • If y_true values are near zero, the term threshold * np.abs(y_true) becomes very small, making it almost impossible to stay within 10%.
  • If your data has zeros or very small values, you might need a small offset or a different approach (e.g., (pred - true) / (mean_of_true + 1)).
  • Consider measuring accuracy per column and then combining them. For example, you might get 90% accuracy for col1 and 30% for col2 or vice versa. Then you can see which column is the bottleneck (a per-column sketch follows this list).

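A minimal sketch of both ideas (assuming the y_test_np and predictions arrays of shape (N, 2) from the evaluation code in this thread): an epsilon term keeps the tolerance band from collapsing when the true value is near zero, and the same function is applied per column. The eps value is an arbitrary placeholder.

import numpy as np

def tolerant_accuracy(y_true, y_pred, threshold=0.10, eps=1e-3):
    # Fraction of predictions within threshold * (|y_true| + eps), as a percentage.
    # eps (a guess) prevents the band from vanishing for near-zero targets.
    ok = np.abs(y_true - y_pred) <= threshold * (np.abs(y_true) + eps)
    return np.mean(ok) * 100

print("overall:", tolerant_accuracy(y_test_np, predictions))
print("col1   :", tolerant_accuracy(y_test_np[:, 0], predictions[:, 0]))
print("col2   :", tolerant_accuracy(y_test_np[:, 1], predictions[:, 1]))
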
5. Example Code with Enhancements

Below is a refined version of the code you already have with the following changes:

  • Separate scaling for col1 and col2.
  • Bigger hidden size and stacked LSTM with dropout.
  • Early stopping and grid search are retained from the previous example.
  • More epochs (up to 100, with patience=10 for early stopping).
  • Weight decay added to Adam.
  • Accuracy calculation remains the same, but we print accuracy for each column separately, too (to diagnose differences).

You can incrementally tweak these parameters (hidden_size, num_layers, dropout, lr, etc.) and see if you approach or exceed that 80% threshold accuracy.

import pandas as pd import numpy as np import torch import torch.nn as nn import torch.optim as optim from sklearn.preprocessing import MinMaxScaler from torch.utils.data import TensorDataset, DataLoader import matplotlib.pyplot as plt ############################################ # 1. Load data ############################################ df = pd.read_csv("realDataForTrain.csv") ############################################ # 2. Separate scaling for col1 and col2 ############################################ col1_data = df[['col1']].values # shape: (N, 1) col2_data = df[['col2']].values # shape: (N, 1) scaler_col1 = MinMaxScaler() scaler_col2 = MinMaxScaler() col1_scaled = scaler_col1.fit_transform(col1_data) # shape (N, 1) col2_scaled = scaler_col2.fit_transform(col2_data) # shape (N, 1) # Combine scaled columns scaled_data = np.hstack([col1_scaled, col2_scaled]) # shape (N, 2) ############################################ # 3. Create sequences ############################################ SEQ_LENGTH = 10 def create_sequences(data, seq_length=10): X, y = [], [] for i in range(len(data) - seq_length): X.append(data[i : i + seq_length]) y.append(data[i + seq_length]) # next point return np.array(X), np.array(y) X, y = create_sequences(scaled_data, SEQ_LENGTH) # We do 70% train, 10% val, 20% test train_size = int(0.7 * len(X)) val_size = int(0.1 * len(X)) test_size = len(X) - train_size - val_size X_train = X[:train_size] y_train = y[:train_size] X_val = X[train_size : train_size + val_size] y_val = y[train_size : train_size + val_size] X_test = X[train_size + val_size:] y_test = y[train_size + val_size:] # Convert to tensors X_train_tensors = torch.tensor(X_train, dtype=torch.float32) y_train_tensors = torch.tensor(y_train, dtype=torch.float32) X_val_tensors = torch.tensor(X_val, dtype=torch.float32) y_val_tensors = torch.tensor(y_val, dtype=torch.float32) X_test_tensors = torch.tensor(X_test, dtype=torch.float32) y_test_tensors = torch.tensor(y_test, dtype=torch.float32) ############################################ # 4. Define LSTM Model (Stacked + Dropout) ############################################ class LSTMModel(nn.Module): def __init__(self, input_size=2, hidden_size=64, num_layers=2, output_size=2, dropout=0.2): super(LSTMModel, self).__init__() self.hidden_size = hidden_size self.num_layers = num_layers self.lstm = nn.LSTM( input_size, hidden_size, num_layers=num_layers, batch_first=True, dropout=dropout ) self.fc = nn.Linear(hidden_size, output_size) def forward(self, x): # Initialize hidden state and cell state with zeros h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size) c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size) out, _ = self.lstm(x, (h0, c0)) # out: (batch, seq_len, hidden_size) out = out[:, -1, :] # last time step out = self.fc(out) # linear return out ############################################ # 5. Accuracy function (10% threshold) ############################################ def calculate_accuracy(y_true, y_pred, threshold=0.10): """ y_true, y_pred: shape (batch, 2) We'll calculate fraction of points where |y_pred - y_true| <= threshold * |y_true| Then convert to percentage """ within_tolerance = np.abs(y_true - y_pred) <= threshold * np.abs(y_true) accuracy = np.mean(within_tolerance) * 100 return accuracy ############################################ # 6. 
Early Stopping Training Routine ############################################ def train_model(model, train_loader, val_data, criterion, optimizer, max_epochs=100, patience=10): X_val_tensors, y_val_tensors = val_data best_val_loss = float('inf') epochs_no_improve = 0 for epoch in range(max_epochs): # Training model.train() train_loss_sum = 0 for X_batch, y_batch in train_loader: optimizer.zero_grad() outputs = model(X_batch) loss = criterion(outputs, y_batch) loss.backward() optimizer.step() train_loss_sum += loss.item() train_loss = train_loss_sum / len(train_loader) # Validation model.eval() with torch.no_grad(): val_outputs = model(X_val_tensors) val_loss = criterion(val_outputs, y_val_tensors).item() if val_loss < best_val_loss: best_val_loss = val_loss best_model_state = model.state_dict() epochs_no_improve = 0 else: epochs_no_improve += 1 if epochs_no_improve >= patience: print(f"Early stopping at epoch {epoch+1}") break if (epoch + 1) % 5 == 0: print(f"Epoch [{epoch+1}/{max_epochs}] - Train Loss: {train_loss:.6f}, Val Loss: {val_loss:.6f}") # Load best model model.load_state_dict(best_model_state) return model, best_val_loss ############################################ # 7. Grid Search (Optional) ############################################ from torch.utils.data import DataLoader train_dataset = TensorDataset(X_train_tensors, y_train_tensors) # Let’s do a small grid for demonstration hidden_sizes = [64, 128] # bigger hidden sizes num_layers_list = [1, 2] # stacked or not learning_rates = [1e-3, 1e-4] batch_sizes = [32, 64] dropouts = [0.0, 0.2] best_val_loss = float('inf') best_model = None best_hparams = None for hs in hidden_sizes: for nl in num_layers_list: for lr in learning_rates: for bs in batch_sizes: for dp in dropouts: print(f"\nTesting hyperparams: hs={hs}, nl={nl}, lr={lr}, bs={bs}, dropout={dp}") # Create model model = LSTMModel(input_size=2, hidden_size=hs, num_layers=nl, dropout=dp) criterion = nn.MSELoss() optimizer = optim.Adam(model.parameters(), lr=lr, weight_decay=1e-5) # Create DataLoader train_loader = DataLoader(train_dataset, batch_size=bs, shuffle=True) # Train model, val_loss = train_model( model=model, train_loader=train_loader, val_data=(X_val_tensors, y_val_tensors), criterion=criterion, optimizer=optimizer, max_epochs=100, # can increase patience=10 ) print(f"Val Loss: {val_loss:.6f}") if val_loss < best_val_loss: best_val_loss = val_loss best_hparams = (hs, nl, lr, bs, dp) # Save best model best_model = LSTMModel(input_size=2, hidden_size=hs, num_layers=nl, dropout=dp) best_model.load_state_dict(model.state_dict()) print(" ==> New best found!") print(f"\nBest val loss = {best_val_loss:.6f} with hparams: {best_hparams}") ############################################ # 8. 
Evaluate on Test ############################################ best_model.eval() with torch.no_grad(): test_preds_scaled = best_model(X_test_tensors).numpy() # shape (test_samples, 2) y_test_np = y_test_tensors.numpy() # shape (test_samples, 2) # Inverse transform each column separately # test_preds_scaled[:, 0] -> col1, test_preds_scaled[:, 1] -> col2 test_preds_col1 = scaler_col1.inverse_transform(test_preds_scaled[:, [0]]) test_preds_col2 = scaler_col2.inverse_transform(test_preds_scaled[:, [1]]) test_preds_inv = np.hstack([test_preds_col1, test_preds_col2]) # Same for y_test_np y_test_col1 = scaler_col1.inverse_transform(y_test_np[:, [0]]) y_test_col2 = scaler_col2.inverse_transform(y_test_np[:, [1]]) y_test_inv = np.hstack([y_test_col1, y_test_col2]) # 8.1 - MSE, RMSE, MAE in *scaled* space mse_scaled = np.mean((test_preds_scaled - y_test_np)**2) rmse_scaled = np.sqrt(mse_scaled) mae_scaled = np.mean(np.abs(test_preds_scaled - y_test_np)) # 8.2 - R^2 in scaled space ss_res = np.sum((y_test_np - test_preds_scaled)**2) ss_tot = np.sum((y_test_np - np.mean(y_test_np, axis=0))**2) r2_scaled = 1 - (ss_res / ss_tot) print("\n=== Test Set Performance (Scaled Space) ===") print(f"MSE : {mse_scaled:.6f}") print(f"RMSE: {rmse_scaled:.6f}") print(f"MAE : {mae_scaled:.6f}") print(f"R^2 : {r2_scaled:.6f}") # 8.3 - Accuracy in scaled space (10% threshold) acc_10pct = calculate_accuracy(y_test_np, test_preds_scaled, threshold=0.10) print(f"Accuracy (10% threshold, scaled space, both columns): {acc_10pct:.2f}%") # 8.4 - (Optional) Accuracy for each column separately col1_acc_10pct = calculate_accuracy(y_test_np[:, [0]], test_preds_scaled[:, [0]], threshold=0.10) col2_acc_10pct = calculate_accuracy(y_test_np[:, [1]], test_preds_scaled[:, [1]], threshold=0.10) print(f"Accuracy col1 (10% threshold): {col1_acc_10pct:.2f}%") print(f"Accuracy col2 (10% threshold): {col2_acc_10pct:.2f}%") ############################################ # 9. Predict the next 10 values ############################################ def predict_future(model, data_scaled_col1, data_scaled_col2, scaler_col1, scaler_col2, seq_length=10, steps=10): """ data_scaled_col1, data_scaled_col2 are 1D arrays of the scaled col1, col2 We'll stack them last 10 steps to feed the model. """ model.eval() # Combine them into shape (N, 2) combined_scaled = np.hstack([ data_scaled_col1.reshape(-1,1), data_scaled_col2.reshape(-1,1) ]) window = combined_scaled[-seq_length:].copy() # last seq_length rows preds = [] for _ in range(steps): x_t = torch.tensor(window[np.newaxis, :, :], dtype=torch.float32) with torch.no_grad(): pred = model(x_t).numpy()[0] # shape (2,) preds.append(pred) # Shift the window window = np.vstack([window[1:], pred]) preds = np.array(preds) # shape (steps, 2) # Inverse transform each column pred_col1_inv = scaler_col1.inverse_transform(preds[:, [0]]) pred_col2_inv = scaler_col2.inverse_transform(preds[:, [1]]) preds_inv = np.hstack([pred_col1_inv, pred_col2_inv]) return preds_inv future_preds = predict_future(best_model, col1_scaled, col2_scaled, scaler_col1, scaler_col2, seq_length=SEQ_LENGTH, steps=10) print("\nNext 10 predictions (col1, col2):") for i, (p1, p2) in enumerate(future_preds, start=1): print(f"Step {i}: col1={p1:.4f}, col2={p2:.4f}") ############################################ # 10. 
Plot ############################################ plt.figure(figsize=(12, 4)) test_index = range(train_size + val_size + SEQ_LENGTH, train_size + val_size + SEQ_LENGTH + len(y_test_inv)) # Plot col1 plt.plot(test_index, y_test_inv[:, 0], label="Actual col1", color="blue") plt.plot(test_index, test_preds_inv[:, 0], label="Predicted col1", color="red") plt.title("col1 Actual vs Predicted (Test Set)") plt.legend() plt.show()

6. Further Tips for Getting to 80%+

  1. Look at Per-Column Accuracy

    • If col2 has large swings (e.g., from near 0 up to 100), the ±10% band might be too narrow for higher accuracy. Possibly adjust the threshold for each column separately or transform col2 (e.g., log-scale if it’s always positive).
  2. Consider a Log or Other Transform

    • If col2 is always positive, you can transform col2 with log1p(col2) (log of col2 + 1), train the model, and then exponentiate predictions. This compresses outliers and might help the model make more consistent predictions (a short sketch follows this list).
  3. Feature Engineering

    • Add external signals (if relevant) that can help your LSTM see more context about what drives col2.
  4. Try More Advanced Architectures

    • E.g., GRU, TCN (Temporal Convolutional Network), or Transformers.
    • Sometimes a simpler architecture (like GRU) might converge faster and generalize better.
  5. Check Data Integrity

    • Missing or corrupted data can reduce performance drastically.
  6. Hyperparameter Tuning

    • Expand or refine your grid search (or use libraries like Optuna or Ray Tune) to systematically search larger spaces for hidden_size, LR, number of layers, dropout, batch size, and so on.
  7. Longer Training

    • If you see your training loss still improving at the end of 100 epochs, you can push further or reduce the learning rate and keep going.

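As a rough sketch of the log-transform idea from point 2 (assuming col2 is non-negative), you could train on log1p(col2) and map predictions back with expm1; the placeholder array below just stands in for whatever the model predicts in log space.

import numpy as np
import pandas as pd

df = pd.read_csv("realDataForTrain.csv")
col2_log = np.log1p(df['col2'].values)    # use this for scaling/training instead of raw col2

# After the model produces predictions in log space (placeholder values here):
pred_log = col2_log[-10:]                 # stand-in for model output
pred_original_units = np.expm1(pred_log)  # back to the original col2 scale
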
Conclusion

Hitting 80% accuracy (with your threshold-based metric) may require careful data scaling, transformations, a sufficiently powerful (yet well-regularized) model, and thorough hyperparameter tuning. Start by focusing on data (scaling and outliers), and systematically tune the LSTM architecture and training scheme. Good luck!

Question

refactor code with this architecture :class LSTMModel(nn.Module):
def __init__(self,
input_size,
hidden_size=50, # 50 units in each LSTM layer
num_layers=3, # Keep this argument for compatibility
output_size=20, # 10 time steps * 2 features
dropout=0.2):
super(LSTMModel, self).__init__()

# First LSTM layer (returns the full sequence so the next LSTM can see all timesteps)
self.lstm1 = nn.LSTM(input_size, hidden_size, batch_first=True)
self.dropout1 = nn.Dropout(dropout)

# Second LSTM layer (again returns the full sequence)
self.lstm2 = nn.LSTM(hidden_size, hidden_size, batch_first=True)
self.dropout2 = nn.Dropout(dropout)

# Third LSTM layer (by default returns the full sequence; we’ll just slice the last timestep later)
self.lstm3 = nn.LSTM(hidden_size, hidden_size, batch_first=True)
self.dropout3 = nn.Dropout(dropout)

# Final dense layer to map the last hidden state to your output size
self.fc = nn.Linear(hidden_size, output_size)

def forward(self, x):
# First layer
out, _ = self.lstm1(x)
out = self.dropout1(out)

# Second layer
out, _ = self.lstm2(out)
out = self.dropout2(out)

# Third layer
out, _ = self.lstm3(out)
out = self.dropout3(out)

# We only need the last timestep’s output for final prediction
out = out[:, -1, :]

# Fully connected layer
out = self.fc(out)
return out
# Hyperparameters
input_size = 2 # (coin1, coin2)
hidden_size = 64 # 64 hidden units in each LSTM layer
num_layers = 3
output_size = 20 # 10 time steps * 2 features
dropout = 0.5 #0.2
learning_rate = 0.005 # Reduced learning rate 0.001
num_epochs = 9 # Increase epochs 22
batch_size = 64

# Create the model
model = LSTMModel(
input_size=input_size,
hidden_size=hidden_size,
num_layers=num_layers,
output_size=output_size,
dropout=dropout
)

Thinking

Mapping out the refactor

I'm integrating the LSTM model for a 10-step, multi-output forecast. This involves enhancing data loading, preparation, and training components to support the refined objectives.

Refining predictions

I'm working on a multi-step forecast for 10 future timesteps across two columns. The output will either be a single 20-dimensional vector or a (batch_size, 10, 2) shape.

Shaping the trainer

I'm refining the training process for multi-step predictions by determining data splitting, accuracy metrics, and adjusting the "predict_future" function to manage sequence prediction dynamically.

Figuring out the solution

I’m moving away from the iterative "predict_future" approach, focusing instead on a simple, efficient code snippet that seamlessly integrates with the provided architecture.

Refining sequences

Crafting "create_sequences_multistep" to produce X and flattened y, training model with specified parameters, and incorporating hyperparameters for consistency.

Thought about refactoring LSTM architecture for 25 seconds

Answer

Below is one possible refactoring of your code that uses the 3-layer LSTM architecture you provided and predicts the next 10 time steps for 2 features (thus output_size = 20). This example demonstrates:

  • How to create multi-step training sequences.
  • How to handle a (forecast_horizon=10)-step-ahead prediction (for both col1 and col2).
  • How to shape the targets (y) to match the 20-dimensional output from the final dense layer.
  • A simple training loop (no advanced hyperparameter search or early stopping here, but you can add them back if desired).

Adjust any parameters (epochs, batch size, learning rate, etc.) as needed to see better performance.


1) Complete Refactored Code

import pandas as pd import numpy as np import torch import torch.nn as nn import torch.optim as optim from sklearn.preprocessing import MinMaxScaler from torch.utils.data import TensorDataset, DataLoader import matplotlib.pyplot as plt ############################################################################## # 1. Hyperparameters (from your snippet, feel free to tweak as needed) ############################################################################## input_size = 2 # Two features: (col1, col2) hidden_size = 64 # 64 hidden units in each LSTM layer num_layers = 3 # 3-layer LSTM (stacked) output_size = 20 # Predict 10 time steps * 2 features => 20 dropout = 0.5 # High dropout to reduce overfitting learning_rate = 0.005 num_epochs = 9 batch_size = 64 ############################################################################## # 2. Load Data ############################################################################## df = pd.read_csv("realDataForTrain.csv") # We'll assume your CSV has columns named exactly 'col1' and 'col2'. # If not, adjust accordingly. data = df[['col1', 'col2']].values # shape (N, 2) ############################################################################## # 3. Scaling ############################################################################## scaler = MinMaxScaler() data_scaled = scaler.fit_transform(data) # shape (N, 2) ############################################################################## # 4. Create Multi-Step Sequences ############################################################################## # We want to predict the next 10 time steps (each has 2 features). # That means our target y will be shape (10 * 2) = 20 for each sample. SEQ_LENGTH = 10 # how many past time steps we feed into LSTM FORECAST_HORIZON = 10 # how many future steps to predict def create_sequences_multistep(dataset, seq_length=10, forecast_horizon=10): X, y = [], [] # We stop at len(dataset) - seq_length - forecast_horizon + 1 for i in range(len(dataset) - seq_length - forecast_horizon + 1): # Past window past_window = dataset[i : i + seq_length] # shape (seq_length, 2) # Next horizon steps (the future we want to predict) future = dataset[i + seq_length : i + seq_length + forecast_horizon] # shape (forecast_horizon, 2) => e.g. (10, 2) # Flatten the future to match output_size=20 future_flat = future.flatten() # shape (forecast_horizon*2,) X.append(past_window) y.append(future_flat) return np.array(X), np.array(y) X, Y = create_sequences_multistep( data_scaled, seq_length=SEQ_LENGTH, forecast_horizon=FORECAST_HORIZON ) # For example, X.shape = (samples, 10, 2), Y.shape = (samples, 20) ############################################################################## # 5. 
Train/Val/Test Split ############################################################################## train_size = int(0.7 * len(X)) val_size = int(0.1 * len(X)) test_size = len(X) - train_size - val_size X_train = X[:train_size] y_train = Y[:train_size] X_val = X[train_size : train_size + val_size] y_val = Y[train_size : train_size + val_size] X_test = X[train_size + val_size:] y_test = Y[train_size + val_size:] # Convert to tensors X_train_tensors = torch.tensor(X_train, dtype=torch.float32) # (train_size, 10, 2) y_train_tensors = torch.tensor(y_train, dtype=torch.float32) # (train_size, 20) X_val_tensors = torch.tensor(X_val, dtype=torch.float32) y_val_tensors = torch.tensor(y_val, dtype=torch.float32) X_test_tensors = torch.tensor(X_test, dtype=torch.float32) y_test_tensors = torch.tensor(y_test, dtype=torch.float32) ############################################################################## # 6. Define the 3-Layer LSTM Model (Multi-step Output) ############################################################################## class LSTMModel(nn.Module): def __init__(self, input_size, # 2 hidden_size=50, # 50 units in each LSTM layer num_layers=3, output_size=20, # 10 time steps * 2 features dropout=0.2): super(LSTMModel, self).__init__() # First LSTM layer (returns the full sequence so the next LSTM can see all timesteps) self.lstm1 = nn.LSTM(input_size, hidden_size, batch_first=True) self.dropout1 = nn.Dropout(dropout) # Second LSTM layer (again returns the full sequence) self.lstm2 = nn.LSTM(hidden_size, hidden_size, batch_first=True) self.dropout2 = nn.Dropout(dropout) # Third LSTM layer (returns full sequence, we slice the last timestep) self.lstm3 = nn.LSTM(hidden_size, hidden_size, batch_first=True) self.dropout3 = nn.Dropout(dropout) # Final dense layer to map the last hidden state to your output size self.fc = nn.Linear(hidden_size, output_size) def forward(self, x): # x shape: (batch, seq_length=10, input_size=2) # First layer out, _ = self.lstm1(x) out = self.dropout1(out) # Second layer out, _ = self.lstm2(out) out = self.dropout2(out) # Third layer out, _ = self.lstm3(out) out = self.dropout3(out) # We only need the last timestep’s output for final prediction out = out[:, -1, :] # shape (batch, hidden_size) # Fully connected layer => shape (batch, output_size=20) out = self.fc(out) return out ############################################################################## # 7. Instantiate Model ############################################################################## model = LSTMModel( input_size=input_size, # 2 hidden_size=hidden_size, # 64 num_layers=num_layers, # 3 output_size=output_size, # 20 dropout=dropout # 0.5 ) model.to(torch.device("cpu")) # or "cuda" if you have a GPU ############################################################################## # 8. Prepare for Training ############################################################################## criterion = nn.MSELoss() # Common for regression optimizer = optim.Adam(model.parameters(), lr=learning_rate) # 0.005 train_dataset = TensorDataset(X_train_tensors, y_train_tensors) train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True) ############################################################################## # 9. 
Training Loop ############################################################################## for epoch in range(num_epochs): model.train() total_loss = 0 for X_batch, y_batch in train_loader: optimizer.zero_grad() outputs = model(X_batch) # shape (batch_size, 20) loss = criterion(outputs, y_batch) loss.backward() optimizer.step() total_loss += loss.item() avg_loss = total_loss / len(train_loader) # Optional: Validation model.eval() with torch.no_grad(): val_outputs = model(X_val_tensors) val_loss = criterion(val_outputs, y_val_tensors).item() print(f"Epoch [{epoch+1}/{num_epochs}], Train Loss: {avg_loss:.6f}, Val Loss: {val_loss:.6f}") ############################################################################## # 10. Evaluate on Test Set ############################################################################## model.eval() with torch.no_grad(): test_preds = model(X_test_tensors) # shape (test_size, 20) test_loss = criterion(test_preds, y_test_tensors).item() print(f"\nTest MSE Loss: {test_loss:.6f}") # Convert predictions back to shape (test_size, FORECAST_HORIZON=10, 2) test_preds_np = test_preds.numpy().reshape(-1, FORECAST_HORIZON, 2) y_test_np = y_test_tensors.numpy().reshape(-1, FORECAST_HORIZON, 2) # (Optional) Inverse transform each 2D time step. # Because we flattened for training, we need to invert the scaling carefully. # If you used a single scaler for both columns, we can do: test_preds_2d = test_preds_np.reshape(-1, 2) # shape (test_size * 10, 2) y_test_2d = y_test_np.reshape(-1, 2) test_preds_inv = scaler.inverse_transform(test_preds_2d) # shape (total_predictions, 2) y_test_inv = scaler.inverse_transform(y_test_2d) # Reshape back to (test_size, 10, 2) test_preds_inv = test_preds_inv.reshape(-1, FORECAST_HORIZON, 2) y_test_inv = y_test_inv.reshape(-1, FORECAST_HORIZON, 2) # Example: Plot the first test sample's 10 predicted points vs actual plt.figure(figsize=(10, 4)) plt.plot(range(1, FORECAST_HORIZON+1), y_test_inv[0, :, 0], label="Actual col1") plt.plot(range(1, FORECAST_HORIZON+1), test_preds_inv[0, :, 0], label="Pred col1") plt.title("First Test Sample, col1 (10-step forecast)") plt.legend() plt.show()

How the Multi-step Forecast Works

  1. create_sequences_multistep collects a 10-step input window (seq_length=10) and then the next 10 steps (forecast_horizon=10) as the target.
    • The target is flattened into a shape (10 * 2) = 20, matching the output_size=20.
  2. During training, the LSTM sees a sequence of length 10 (with 2 features each time step) and outputs a single vector of length 20, which represents the 10 future time steps (2 features each).
  3. You can reshape the final predictions from (N, 20) back to (N, 10, 2) to interpret them as 10 future steps of 2D data.

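For instance, a minimal sketch of that reshape (the zeros array is just a placeholder for a real (N, 20) model output):

import numpy as np

preds = np.zeros((5, 20))               # placeholder: 5 samples, 10 steps * 2 features
preds_steps = preds.reshape(-1, 10, 2)  # (N, 10 future steps, 2 features)
col1_step3 = preds_steps[:, 2, 0]       # col1 forecast at horizon step 3, for each sample
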
2) Key Points and Tuning Tips

  1. Check data splitting: Ensure your time-series is not randomly shuffled in a way that breaks temporal order. If your data is strictly sequential, you may prefer slicing (the code above does a simple slice-based split: first 70%, next 10%, final 20%).
  2. Hyperparameter choices:
    • Learning Rate (lr=0.005) might be high for some datasets; experiment with smaller values like 0.001 or 0.0005.
    • Epochs (num_epochs=9) might be too few for a deep LSTM. Try 20–50+ epochs, watch for overfitting.
    • Batch Size (64) can be changed to 32 or 128, depending on dataset size.
  3. Dropout (0.5) is fairly high; sometimes 0.2–0.3 is enough. Keep an eye on validation loss.
  4. Loss and metrics: We used MSE. If your target scale differs greatly across columns, consider scaling each column separately or using MAE or MAPE.

That’s it! This refactoring uses the 3-layer LSTM with dropout and a multi-step forecast output. Adjust the parameters (epochs, LR, hidden size, etc.) and keep an eye on validation metrics to refine performance.

Question

# Evaluate on the test set
model.eval()
all_predictions = []
all_actuals = []

with torch.no_grad():
for inputs, targets in test_loader:
inputs, targets = inputs.to(device), targets.to(device)

# Generate predictions
outputs = model(inputs)
all_predictions.append(outputs.cpu().numpy())
all_actuals.append(targets.cpu().numpy())

# Convert to numpy arrays
all_predictions = np.concatenate(all_predictions, axis=0)
all_actuals = np.concatenate(all_actuals, axis=0)

# Reshape predictions and actuals to (num_samples, 10, 2)
all_predictions = all_predictions.reshape(-1, 10, 2)
all_actuals = all_actuals.reshape(-1, 10, 2)

# Calculate metrics
mse = mean_squared_error(all_actuals.flatten(), all_predictions.flatten())
rmse = np.sqrt(mse)
mae = mean_absolute_error(all_actuals.flatten(), all_predictions.flatten())
r2 = r2_score(all_actuals.flatten(), all_predictions.flatten())

# Calculate accuracy (within 5% tolerance)
accuracy = calculate_accuracy(all_actuals.flatten(), all_predictions.flatten(), threshold=0.10)

# Print all evaluation metrics
print(f"MSE: {mse:.4f}, RMSE: {rmse:.4f}, MAE: {mae:.4f}, R²: {r2:.4f}, Accuracy: {accuracy:.2f}%")
This is my code sample. Refactor my code to calculate the metrics with the same formulas as this code sample.

Thinking

Pulling together evaluation metrics

I’m gathering predictions and actual values, converting them to numpy arrays, reshaping them, and calculating MSE, RMSE, MAE, R^2, and accuracy with a 10% tolerance.

Thought about model evaluation metrics for 7 seconds

Answer

Below is a refactored evaluation code snippet that calculates the same metrics—MSE, RMSE, MAE, R², and accuracy (with a threshold)—using the same formulas you showed in your sample. This snippet assumes:

  1. You already have:
    • a trained model called model,
    • a test_loader that yields (inputs, targets) batches,
    • a function calculate_accuracy(...) that calculates accuracy based on a threshold (like 5% or 10%).
  2. You are working on either CPU or GPU (specified by device).

Feel free to insert this code in your main script wherever you evaluate your model.


import numpy as np
import torch
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Example accuracy function (threshold in fraction, e.g., 0.10 for 10%)
def calculate_accuracy(y_true, y_pred, threshold=0.10):
    """
    Calculate accuracy based on how many predictions are within
    a given percentage threshold of the true values.
    Note: y_true and y_pred must be the same shape.
    """
    within_tolerance = np.abs(y_true - y_pred) <= threshold * np.abs(y_true)
    # Convert boolean array to percentage
    accuracy = np.mean(within_tolerance) * 100
    return accuracy

##############################################
# Evaluation on the Test Set
##############################################
model.eval()  # put model in evaluation mode

all_predictions = []
all_actuals = []

with torch.no_grad():
    for inputs, targets in test_loader:
        # Move data to device (CPU or GPU)
        inputs = inputs.to(device)
        targets = targets.to(device)

        # Generate predictions
        outputs = model(inputs)  # shape (batch, 20) if you're predicting 10 steps * 2 features

        # Collect for later metric computation
        all_predictions.append(outputs.cpu().numpy())
        all_actuals.append(targets.cpu().numpy())

# Concatenate all batches along axis=0
all_predictions = np.concatenate(all_predictions, axis=0)  # shape (num_samples, 20)
all_actuals = np.concatenate(all_actuals, axis=0)          # shape (num_samples, 20)

# If each sample is 10 time steps * 2 features => reshape to (num_samples, 10, 2)
all_predictions = all_predictions.reshape(-1, 10, 2)
all_actuals = all_actuals.reshape(-1, 10, 2)

# Flatten them again for standard regression metrics => shape (num_samples * 10 * 2,)
preds_flat = all_predictions.flatten()
actuals_flat = all_actuals.flatten()

# Calculate standard metrics
mse = mean_squared_error(actuals_flat, preds_flat)
rmse = np.sqrt(mse)
mae = mean_absolute_error(actuals_flat, preds_flat)
r2 = r2_score(actuals_flat, preds_flat)

# Calculate accuracy with threshold (e.g., 0.10 for 10%)
accuracy = calculate_accuracy(actuals_flat, preds_flat, threshold=0.10)

print(f"MSE: {mse:.4f}")
print(f"RMSE: {rmse:.4f}")
print(f"MAE: {mae:.4f}")
print(f"R^2: {r2:.4f}")
print(f"Accuracy (±10%): {accuracy:.2f}%")

How It Works

  1. Gather Predictions
    We loop through the test_loader and collect the model outputs (all_predictions) and actual values (all_actuals) in separate lists.

  2. Reshape

    • If each prediction is for 10 time steps × 2 features = 20 values, we first get them as (num_samples, 20).
    • Reshape them into (num_samples, 10, 2) so it’s easier to interpret.
    • For metric calculations (MSE, MAE, etc.), we flatten them back to 1D arrays so that sklearn.metrics can directly handle them.
  3. Compute Metrics

    • MSE measures the average squared error; RMSE is its square root, expressed in the same units as the target.
    • MAE measures mean absolute error.
    • R² measures how well predictions approximate the true data variance (1 is perfect, 0 is baseline, negative indicates worse than baseline).
    • Accuracy: We use your custom tolerance-based approach, which checks whether each predicted value is within threshold * abs(true_value) of the actual value. We then convert that fraction to a percentage.
  4. Print
    We print each metric to get a final overview of the model’s performance on the test set.


Note:

  • Make sure your calculate_accuracy threshold matches how you want to interpret being “within 5% or 10%.” If your data can be zero or very small in magnitude, you may need to handle that carefully (e.g., add a small epsilon or use a different metric).
  • If you want to inverse transform predictions to the original scale (e.g., after MinMax scaling), do that before flattening for MSE/MAE calculations in original units. Then flatten the “inverse-transformed” arrays. This will give you metrics in the original scale.

Question

class LSTMModel(nn.Module):
def __init__(self, input_size, hidden_size, num_layers, output_size, dropout=0.5):
super(LSTMModel, self).__init__()

# LSTM Layer
self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True, dropout=dropout)

# Dropout layer
self.dropout = nn.Dropout(0.2)

# Dense Layer
self.dense = nn.Linear(hidden_size, 256) # Dense layer with 256 units
self.output = nn.Linear(256, output_size) # Output layer to produce the final prediction

def forward(self, x):
out, _ = self.lstm(x) # Pass through LSTM
out = self.dropout(out) # Apply dropout
out = self.dense(out[:, -1, :]) # Take the last time step output and pass through the dense layer
out = self.output(out) # Output layer

return out

input_size = 2 # (coin1, coin2)
hidden_size = 64 # 64 hidden units in each LSTM layer
num_layers = 3
output_size = 20 # 10 time steps * 2 features
dropout = 0.5 #0.2
learning_rate = 0.005 # Reduced learning rate 0.001
num_epochs = 22 # Increase epochs 22
batch_size = 64

# Create the model
model = LSTMModel(
input_size=input_size,
hidden_size=hidden_size,
num_layers=num_layers,
output_size=output_size,
dropout=dropout
)

I want to use this LSTM in my code. Tell me, step by step, how to refactor my code to use it.

Thinking

Integrating LSTM architecture

I think through the steps of incorporating the given LSTM architecture into multi-step forecasting, covering data loading, scaling, sequence creation, data splitting, model instance creation, criterion and optimizer setup, training loop, evaluation, and performance metrics.

Clarifying forecast strategy

I’m noting that the user aims for a multi-step forecast with 10 steps and suggests detailed instructions for integrating the LSTM model, including data handling, model setup, and performance evaluation.

Thought for 6 seconds

Answer

Below is a step-by-step guide on how to integrate and use your new LSTMModel in a typical multi-step forecasting scenario. I will assume:

  • You want to predict the next 10 time steps of 2 features (hence output_size = 10 × 2 = 20).
  • You have a dataset containing 2 columns (like col1, col2) and you want to train an LSTM to forecast the next 10 steps for both columns.
  • You’re doing a sequence-to-multi-sequence forecast, i.e., you feed the past SEQ_LENGTH steps in order to predict the next 10 steps.

Feel free to copy & paste and then edit to match your file names, variable names, or paths.


1. Imports and Device Setup

import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.preprocessing import MinMaxScaler
from torch.utils.data import TensorDataset, DataLoader
import matplotlib.pyplot as plt

# If you have a GPU and want to use it:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using device:", device)

2. Load Your CSV Data

# Load your dataset
df = pd.read_csv("realDataForTrain.csv")  # Replace with your CSV path

# Let's assume your CSV has 'col1' and 'col2' columns
data = df[['col1', 'col2']].values  # shape (N, 2)

3. Scaling

You can use a single MinMaxScaler for both columns. Alternatively, if columns differ widely, you can scale them separately. For now, let’s do one:

scaler = MinMaxScaler()
data_scaled = scaler.fit_transform(data)  # shape (N, 2)

4. Create Multi-step Sequences

We’ll define a function to create sequences of length SEQ_LENGTH (past) and then to forecast the next 10 steps. Because you want 10 steps * 2 features = 20 outputs, we flatten the future window into a shape (20,).

Note: If you only want a single-step forecast, you’d change forecast_horizon. But your code suggests you want 10 steps.

SEQ_LENGTH = 10
FORECAST_HORIZON = 10  # the next 10 steps to predict => output_size=20

def create_sequences_multistep(dataset, seq_length=10, forecast_horizon=10):
    X, y = [], []
    for i in range(len(dataset) - seq_length - forecast_horizon + 1):
        # Past window (shape: (seq_length, 2))
        past_window = dataset[i : i + seq_length]
        # Future window => next 10 steps (shape: (forecast_horizon, 2))
        future = dataset[i + seq_length : i + seq_length + forecast_horizon]
        # Flatten future from shape (10, 2) to (10*2=20,)
        future_flat = future.flatten()
        X.append(past_window)
        y.append(future_flat)
    return np.array(X), np.array(y)

X, Y = create_sequences_multistep(
    data_scaled,
    seq_length=SEQ_LENGTH,
    forecast_horizon=FORECAST_HORIZON
)

print("X shape:", X.shape)  # e.g. (num_samples, 10, 2)
print("Y shape:", Y.shape)  # e.g. (num_samples, 20)

5. Train/Validation/Test Split

We can do a simple 70%/10%/20% split:

train_size = int(len(X)*0.7)
val_size = int(len(X)*0.1)
test_size = len(X) - train_size - val_size

X_train = X[:train_size]
Y_train = Y[:train_size]
X_val = X[train_size : train_size + val_size]
Y_val = Y[train_size : train_size + val_size]
X_test = X[train_size + val_size:]
Y_test = Y[train_size + val_size:]

6. Convert to Tensors

X_train_tensors = torch.tensor(X_train, dtype=torch.float32).to(device)
Y_train_tensors = torch.tensor(Y_train, dtype=torch.float32).to(device)
X_val_tensors = torch.tensor(X_val, dtype=torch.float32).to(device)
Y_val_tensors = torch.tensor(Y_val, dtype=torch.float32).to(device)
X_test_tensors = torch.tensor(X_test, dtype=torch.float32).to(device)
Y_test_tensors = torch.tensor(Y_test, dtype=torch.float32).to(device)

7. Build a DataLoader for Training

Since you want to train in batches, define a TensorDataset for train and then create a DataLoader:

from torch.utils.data import TensorDataset, DataLoader

train_dataset = TensorDataset(X_train_tensors, Y_train_tensors)
batch_size = 64  # as in your code
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

(Optionally, you can also define a validation or test loader for iterating in mini-batches, but it’s often okay to just do an entire pass in evaluation mode for val/test.)


8. Define Your New LSTM Model

You already have the architecture:

class LSTMModel(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size, dropout=0.5):
        super(LSTMModel, self).__init__()
        # LSTM Layer
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True, dropout=dropout)
        # Dropout layer
        self.dropout = nn.Dropout(0.2)
        # Dense Layer
        self.dense = nn.Linear(hidden_size, 256)   # Dense layer with 256 units
        self.output = nn.Linear(256, output_size)  # Output layer => final prediction

    def forward(self, x):
        out, _ = self.lstm(x)
        out = self.dropout(out)
        out = self.dense(out[:, -1, :])  # take only the last time step
        out = self.output(out)
        return out

# Hyperparameters
input_size = 2         # (col1, col2)
hidden_size = 64
num_layers = 3
output_size = 20       # 10 time steps * 2 features
dropout = 0.5
learning_rate = 0.005  # or 0.001
num_epochs = 22

Create an instance of the model and move it to device:

model = LSTMModel(
    input_size=input_size,
    hidden_size=hidden_size,
    num_layers=num_layers,
    output_size=output_size,
    dropout=dropout
).to(device)

9. Define Loss Function and Optimizer

criterion = nn.MSELoss()  # or nn.SmoothL1Loss(), etc.
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

10. Train Loop

We’ll do a simple loop for num_epochs epochs. You can incorporate early stopping or validation checks if you like.

for epoch in range(num_epochs):
    model.train()
    total_loss = 0.0
    for X_batch, Y_batch in train_loader:
        # (X_batch, Y_batch) are already on device if you used .to(device) earlier
        optimizer.zero_grad()

        # Forward pass
        outputs = model(X_batch)  # shape (batch_size, 20)

        # Compute loss
        loss = criterion(outputs, Y_batch)

        # Backprop
        loss.backward()
        optimizer.step()

        total_loss += loss.item()

    avg_train_loss = total_loss / len(train_loader)

    # Optionally, check validation loss
    model.eval()
    with torch.no_grad():
        val_outputs = model(X_val_tensors)
        val_loss = criterion(val_outputs, Y_val_tensors).item()

    print(f"Epoch [{epoch+1}/{num_epochs}], "
          f"Train Loss: {avg_train_loss:.6f}, "
          f"Val Loss: {val_loss:.6f}")

11. Evaluate on Test Set

  1. Gather predictions from the test set.
  2. Reshape them to (num_samples, 10, 2) so you can interpret them as 10-step forecasts.
  3. Optionally invert scale with your scaler.
  4. Compute MSE, RMSE, MAE, R², and an accuracy measure (within some tolerance).

11.1 Building a DataLoader (Optional)

test_dataset = TensorDataset(X_test_tensors, Y_test_tensors)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

11.2 Forward Pass Over Test Loader

import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

def calculate_accuracy(y_true, y_pred, threshold=0.10):
    """
    e.g., threshold=0.10 for 10% tolerance
    y_true, y_pred: shape (n,) flattened arrays
    """
    within_tolerance = np.abs(y_true - y_pred) <= threshold * np.abs(y_true)
    return np.mean(within_tolerance) * 100

model.eval()
all_predictions = []
all_actuals = []

with torch.no_grad():
    for inputs, targets in test_loader:
        inputs, targets = inputs.to(device), targets.to(device)

        # Generate predictions
        outputs = model(inputs)  # shape (batch, 20)

        all_predictions.append(outputs.cpu().numpy())
        all_actuals.append(targets.cpu().numpy())

# Concatenate to get full arrays
all_predictions = np.concatenate(all_predictions, axis=0)  # shape (N, 20)
all_actuals = np.concatenate(all_actuals, axis=0)          # shape (N, 20)

# Reshape to (N, 10, 2)
all_predictions = all_predictions.reshape(-1, 10, 2)
all_actuals = all_actuals.reshape(-1, 10, 2)

# Flatten to compute metrics => shape (N * 10 * 2,)
preds_flat = all_predictions.flatten()
actuals_flat = all_actuals.flatten()

mse = mean_squared_error(actuals_flat, preds_flat)
rmse = np.sqrt(mse)
mae = mean_absolute_error(actuals_flat, preds_flat)
r2 = r2_score(actuals_flat, preds_flat)
acc = calculate_accuracy(actuals_flat, preds_flat, threshold=0.10)  # 10% threshold

print(f"MSE : {mse:.4f}")
print(f"RMSE: {rmse:.4f}")
print(f"MAE : {mae:.4f}")
print(f"R^2 : {r2:.4f}")
print(f"Accuracy (±10%): {acc:.2f}%")

11.3 (Optional) Inverse-Transform Predictions

If you want metrics in original scale, you must:

  1. Flatten all_predictions to (N * 10, 2).
  2. Use scaler.inverse_transform(...).
  3. Compare to all_actuals also inverse-transformed.

Example:

# Flatten from (N, 10, 2) to (N*10, 2)
preds_2d = all_predictions.reshape(-1, 2)
acts_2d = all_actuals.reshape(-1, 2)

preds_inv = scaler.inverse_transform(preds_2d)
acts_inv = scaler.inverse_transform(acts_2d)

# Now flatten for metrics
preds_inv_flat = preds_inv.flatten()
acts_inv_flat = acts_inv.flatten()

mse_inv = mean_squared_error(acts_inv_flat, preds_inv_flat)
rmse_inv = np.sqrt(mse_inv)
mae_inv = mean_absolute_error(acts_inv_flat, preds_inv_flat)
r2_inv = r2_score(acts_inv_flat, preds_inv_flat)
acc_inv = calculate_accuracy(acts_inv_flat, preds_inv_flat, threshold=0.10)

print("\n=== Metrics on Original Scale ===")
print(f"MSE : {mse_inv:.4f}")
print(f"RMSE: {rmse_inv:.4f}")
print(f"MAE : {mae_inv:.4f}")
print(f"R^2 : {r2_inv:.4f}")
print(f"Accuracy (±10%): {acc_inv:.2f}%")

12. (Optional) Predict Future (Out-of-sample)

If you want to predict the next 10 steps after your dataset ends, you can take the last SEQ_LENGTH rows of your scaled data and run them through the model:

def predict_future(model, data_scaled, seq_length=10, forecast_horizon=10):
    """
    data_scaled: shape (N, 2)
    Take the last seq_length steps as input and produce the next 10-step forecast.
    """
    model.eval()
    window = data_scaled[-seq_length:]  # shape (10, 2)
    # Convert to (1, seq_length, 2)
    x_t = torch.tensor(window[np.newaxis, :, :], dtype=torch.float32).to(device)
    with torch.no_grad():
        preds = model(x_t)  # shape (1, 20)
    # Reshape to (10, 2)
    preds = preds.cpu().numpy().reshape(-1, 2)
    return preds

future_preds_scaled = predict_future(model, data_scaled, seq_length=10, forecast_horizon=10)

# Inverse transform to original scale
future_preds = scaler.inverse_transform(future_preds_scaled)
print("Next 10 predictions (col1, col2):\n", future_preds)

Putting It All Together

Below is a condensed version of the code that includes all the key steps. You can copy, paste, and then customize:

import numpy as np import pandas as pd import torch import torch.nn as nn import torch.optim as optim from sklearn.preprocessing import MinMaxScaler from torch.utils.data import TensorDataset, DataLoader from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score import matplotlib.pyplot as plt device = torch.device("cuda" if torch.cuda.is_available() else "cpu") ############################################################################## # 1. Load Data ############################################################################## df = pd.read_csv("realDataForTrain.csv") data = df[['col1', 'col2']].values # shape (N, 2) ############################################################################## # 2. Scaling ############################################################################## scaler = MinMaxScaler() data_scaled = scaler.fit_transform(data) ############################################################################## # 3. Sequence Creation ############################################################################## SEQ_LENGTH = 10 FORECAST_HORIZON = 10 # Next 10 steps => output_size=20 def create_sequences_multistep(dataset, seq_length=10, forecast_horizon=10): X, y = [], [] for i in range(len(dataset) - seq_length - forecast_horizon + 1): past_window = dataset[i : i + seq_length] future = dataset[i + seq_length : i + seq_length + forecast_horizon] future_flat = future.flatten() X.append(past_window) y.append(future_flat) return np.array(X), np.array(y) X, Y = create_sequences_multistep(data_scaled, SEQ_LENGTH, FORECAST_HORIZON) ############################################################################## # 4. Train/Val/Test Split ############################################################################## train_size = int(0.7 * len(X)) val_size = int(0.1 * len(X)) test_size = len(X) - train_size - val_size X_train, Y_train = X[:train_size], Y[:train_size] X_val, Y_val = X[train_size: train_size+val_size], Y[train_size: train_size+val_size] X_test, Y_test = X[train_size+val_size:], Y[train_size+val_size:] ############################################################################## # 5. Convert to Tensors ############################################################################## X_train_tensors = torch.tensor(X_train, dtype=torch.float32).to(device) Y_train_tensors = torch.tensor(Y_train, dtype=torch.float32).to(device) X_val_tensors = torch.tensor(X_val, dtype=torch.float32).to(device) Y_val_tensors = torch.tensor(Y_val, dtype=torch.float32).to(device) X_test_tensors = torch.tensor(X_test, dtype=torch.float32).to(device) Y_test_tensors = torch.tensor(Y_test, dtype=torch.float32).to(device) ############################################################################## # 6. DataLoader for Training ############################################################################## train_dataset = TensorDataset(X_train_tensors, Y_train_tensors) batch_size = 64 train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True) ############################################################################## # 7. 
Define the Model ############################################################################## class LSTMModel(nn.Module): def __init__(self, input_size, hidden_size, num_layers, output_size, dropout=0.5): super(LSTMModel, self).__init__() self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True, dropout=dropout) self.dropout = nn.Dropout(0.2) self.dense = nn.Linear(hidden_size, 256) self.output = nn.Linear(256, output_size) def forward(self, x): out, _ = self.lstm(x) out = self.dropout(out) out = self.dense(out[:, -1, :]) out = self.output(out) return out input_size = 2 hidden_size = 64 num_layers = 3 output_size = 20 # 10 steps * 2 features dropout = 0.5 learning_rate = 0.005 num_epochs = 22 model = LSTMModel(input_size, hidden_size, num_layers, output_size, dropout=dropout).to(device) ############################################################################## # 8. Define Loss & Optimizer ############################################################################## criterion = nn.MSELoss() optimizer = optim.Adam(model.parameters(), lr=learning_rate) ############################################################################## # 9. Training Loop ############################################################################## for epoch in range(num_epochs): model.train() total_loss = 0.0 for X_batch, Y_batch in train_loader: optimizer.zero_grad() outputs = model(X_batch) # (batch_size, 20) loss = criterion(outputs, Y_batch) loss.backward() optimizer.step() total_loss += loss.item() avg_train_loss = total_loss / len(train_loader) # Validation model.eval() with torch.no_grad(): val_outputs = model(X_val_tensors) val_loss = criterion(val_outputs, Y_val_tensors).item() print(f"Epoch [{epoch+1}/{num_epochs}], " f"Train Loss: {avg_train_loss:.6f}, " f"Val Loss: {val_loss:.6f}") ############################################################################## # 10. Evaluate on Test ############################################################################## test_dataset = TensorDataset(X_test_tensors, Y_test_tensors) test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False) def calculate_accuracy(y_true, y_pred, threshold=0.10): within_tolerance = np.abs(y_true - y_pred) <= threshold * np.abs(y_true) return np.mean(within_tolerance) * 100 model.eval() all_predictions = [] all_actuals = [] with torch.no_grad(): for inputs, targets in test_loader: inputs, targets = inputs.to(device), targets.to(device) outputs = model(inputs) all_predictions.append(outputs.cpu().numpy()) all_actuals.append(targets.cpu().numpy()) all_predictions = np.concatenate(all_predictions, axis=0) # (N, 20) all_actuals = np.concatenate(all_actuals, axis=0) # (N, 20) # Reshape to (N, 10, 2) all_predictions = all_predictions.reshape(-1, 10, 2) all_actuals = all_actuals.reshape(-1, 10, 2) preds_flat = all_predictions.flatten() actuals_flat = all_actuals.flatten() mse = mean_squared_error(actuals_flat, preds_flat) rmse = np.sqrt(mse) mae = mean_absolute_error(actuals_flat, preds_flat) r2 = r2_score(actuals_flat, preds_flat) acc = calculate_accuracy(actuals_flat, preds_flat, threshold=0.10) print(f"\n=== Test Set Metrics (Scaled Space) ===") print(f"MSE: {mse:.6f}") print(f"RMSE: {rmse:.6f}") print(f"MAE: {mae:.6f}") print(f"R^2: {r2:.6f}") print(f"Accuracy (&plusmn;10%): {acc:.2f}%")

Summary of Refactoring Steps

  1. Architecture: You replaced your older LSTM code with the new LSTMModel that uses a single nn.LSTM module configured with num_layers=3 (three stacked recurrent layers), plus dropout, a dense layer, and an output layer of size output_size=20.

  2. Data: Created a multi-step (10-step) forecast problem by flattening the future window.

  3. Training: Standard PyTorch loop with nn.MSELoss() and Adam optimizer. The final layer of your LSTM code returns shape (batch_size, 20), matching the flattened 10-step * 2 features.

  4. Evaluation: Collected predictions from the test loader, reshaped them to (N, 10, 2), and computed error metrics.

  5. Optional: Inverse scale predictions if you want metrics in the original units.

With these steps, you’re now using the new LSTM architecture in your existing pipeline, and you have a clear, step-by-step blueprint for how it all connects.

Question

Write the accuracy formula for evaluating an LSTM with a 10% threshold, as Python code.

Thinking

Thought for a second

Answer

Below is a simple Python function you can use to calculate accuracy for an LSTM (or any regression model) based on a 10% threshold. It assumes your targets and predictions are NumPy arrays of the same shape:

import numpy as np

def calculate_accuracy(y_true, y_pred, threshold=0.10):
    """
    Calculate accuracy as the percentage of predictions where the absolute error
    is within 'threshold' times the absolute value of the true target.

    Parameters
    ----------
    y_true : np.ndarray
        Ground-truth values.
    y_pred : np.ndarray
        Model predictions, same shape as y_true.
    threshold : float, optional (default=0.10)
        The percentage threshold (e.g., 0.10 for 10%).

    Returns
    -------
    accuracy : float
        Accuracy percentage (0 to 100).
    """
    # Ensure both arrays are flattened or the same shape
    y_true_flat = y_true.flatten()
    y_pred_flat = y_pred.flatten()

    # Avoid division by zero by also checking if y_true is zero
    # (you can decide how to handle points where y_true is close to 0)
    within_tolerance = np.abs(y_true_flat - y_pred_flat) <= threshold * np.abs(y_true_flat)

    # Convert the fraction to a percentage
    accuracy = np.mean(within_tolerance) * 100
    return accuracy

Usage Example

# Suppose you have these arrays (e.g., after model evaluation)
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.05, 2.2, 2.7, 4.1])

acc_10pct = calculate_accuracy(y_true, y_pred, threshold=0.10)
print(f"Accuracy with ±10% tolerance: {acc_10pct:.2f}%")
