
BTC Predictions Using Deep Learning Models

2023-10-20

Overview

This project leverages deep learning models to forecast Bitcoin prices. It includes extensive data preprocessing, feature engineering, and predictive modeling. The goal was to capture Bitcoin's highly volatile and non-linear price dynamics while providing interpretable and accurate predictions.


Key Contributions

  1. Data Preprocessing:

    • Historical Bitcoin price data was collected and cleaned.
    • Features like moving averages, price differences, and lagged returns were engineered.
    • Data was scaled to improve model performance.
  2. Baseline Models:

    • Naïve forecast is a simple and intuitive time-series forecasting method that assumes the value at the next time step will be equal to the value at the current time step (or a recent average, depending on the context). It serves as a baseline model to compare more sophisticated forecasting methods.
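As a minimal sketch of the naïve forecast described above: each prediction is simply the previously observed value, so the forecast is the series shifted by one step. This is plain Python, with made-up prices standing in for the test set, and `naive_forecast_1step` is a hypothetical helper name.

```python
def naive_forecast_1step(series):
    """Predict each value as the one observed just before it."""
    # The forecast for position t is series[t - 1], so the first value
    # has no forecast and the last value is never used as a prediction.
    return list(series[:-1])


# Toy prices (illustrative values only, not real Bitcoin data)
toy_prices = [100.0, 102.0, 101.0, 105.0]
toy_forecast = naive_forecast_1step(toy_prices)

# Targets are toy_prices[1:], the same alignment as y_test[1:] in the evaluation call
toy_targets = toy_prices[1:]
toy_mae = sum(abs(t - p) for t, p in zip(toy_targets, toy_forecast)) / len(toy_forecast)
print(toy_forecast, toy_mae)
```

Any model that cannot beat this shifted copy of the series is not adding predictive value.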

Mapping the forecast to our data:

[Figure: naïve forecast plotted against the data]

Evaluating the naïve forecast predictions:

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Implement MASE (assuming no seasonality of data).

def mean_absolute_scaled_error(y_true, y_pred):
  
  mae = tf.reduce_mean(tf.abs(y_true - y_pred))

  # Finding the MAE of naive forecast (no seasonality)
  mae_naive_no_season = tf.reduce_mean(tf.abs(y_true[1:] - y_true[:-1])) # our seasonality is 1 day (hence the shifting of 1 day)

  return mae / mae_naive_no_season

def evaluate_preds(y_true, y_pred):
  # Make sure float32 (for metric calculations)
  y_true = tf.cast(y_true, dtype=tf.float32)
  y_pred = tf.cast(y_pred, dtype=tf.float32)

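  # Calculate various metrics
  mae = tf.keras.metrics.mean_absolute_error(y_true, y_pred)
  mse = tf.keras.metrics.mean_squared_error(y_true, y_pred) # puts an emphasis on outliers (all errors get squared)
  rmse = tf.sqrt(mse)
  mape = tf.keras.metrics.mean_absolute_percentage_error(y_true, y_pred)
  mase = mean_absolute_scaled_error(y_true, y_pred)

  # With a multi-step horizon these metrics come back per sample, so average them to scalars
  if mae.ndim > 0:
    mae = tf.reduce_mean(mae)
    mse = tf.reduce_mean(mse)
    rmse = tf.reduce_mean(rmse)
    mape = tf.reduce_mean(mape)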
  
  return {"mae": mae.numpy(),
          "mse": mse.numpy(),
          "rmse": rmse.numpy(),
          "mape": mape.numpy(),
          "mase": mase.numpy()}

naive_results = evaluate_preds(y_true=y_test[1:],
                               y_pred=naive_forecast)
naive_results

This outputs the following evaluation metrics:

Mean Absolute Error (MAE)

Mean Squared Error (MSE)

Root Mean Squared Error (RMSE)

Mean Absolute Percentage Error (MAPE)

Mean Absolute Scaled Error (MASE)
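To make these metrics concrete, here is a small worked example in plain Python. The numbers are toy values, not project results, and `toy_metrics` is a hypothetical name. It mirrors the TensorFlow functions above, including the MASE denominator, which is the MAE of a one-step naïve forecast on the same series.

```python
import math


def toy_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE, MAPE and MASE for two equal-length lists."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mape = 100 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    # MAE of the one-step naive forecast (each value predicted by the previous one)
    naive_mae = sum(abs(b - a) for a, b in zip(y_true[:-1], y_true[1:])) / (n - 1)
    mase = mae / naive_mae

    return {"mae": mae, "mse": mse, "rmse": rmse, "mape": mape, "mase": mase}


toy = toy_metrics(y_true=[100.0, 110.0, 120.0], y_pred=[90.0, 110.0, 130.0])
print(toy)
```

A MASE below 1 means the predictions beat the naïve forecast on that series; a value above 1 means they do worse.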

I then created 3 new models to evaluate against our baseline model.

Model 1:

def make_preds(model, input_data):
  forecast = model.predict(input_data)
  return tf.squeeze(forecast) 

model_1_preds = make_preds(model_1, test_windows)
len(model_1_preds), model_1_preds[:10]

# Evaluate preds
model_1_results = evaluate_preds(y_true=tf.squeeze(test_labels), # reduce to right shape
                                 y_pred=model_1_preds)
model_1_results

Model 2:

HORIZON = 1 # predict one step at a time
WINDOW_SIZE = 30 # use 30 timesteps in the past

full_windows, full_labels = make_windows(prices, window_size=WINDOW_SIZE, horizon=HORIZON)
len(full_windows), len(full_labels)

train_windows, test_windows, train_labels, test_labels = make_train_test_splits(windows=full_windows, labels=full_labels)
len(train_windows), len(test_windows), len(train_labels), len(test_labels)
     
tf.random.set_seed(42)

# Create model (same model as model 1 but data input will be different)
model_2 = tf.keras.Sequential([
  layers.Dense(128, activation="relu"),
  layers.Dense(HORIZON) 
], name="model_2_dense")

model_2.compile(loss="mae",
                optimizer=tf.keras.optimizers.Adam())

model_2.fit(train_windows,
            train_labels,
            epochs=100,
            batch_size=128,
            verbose=0,
            validation_data=(test_windows, test_labels),
            callbacks=[create_model_checkpoint(model_name=model_2.name)])

model_2.evaluate(test_windows, test_labels)

# Loading in best performing model
model_2 = tf.keras.models.load_model("model_experiments/model_2_dense/")
model_2.evaluate(test_windows, test_labels)

model_2_preds = make_preds(model_2,
                           input_data=test_windows)

# Evaluate results for model 2 predictions
model_2_results = evaluate_preds(y_true=tf.squeeze(test_labels), # remove 1 dimension of test labels
                                 y_pred=model_2_preds)
model_2_results
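The helpers `make_windows` and `make_train_test_splits` are not shown in this post. The following plain-Python sketch shows what such helpers might look like, assuming the first slides a fixed-size window over the series and the second keeps chronological order with the last 20% as the test set. The `_sketch` names are hypothetical and the actual implementations may differ.

```python
def make_windows_sketch(series, window_size=7, horizon=1):
    """Turn a 1D series into (window, label) pairs."""
    windows, labels = [], []
    for start in range(len(series) - window_size - horizon + 1):
        end = start + window_size
        windows.append(list(series[start:end]))         # model inputs
        labels.append(list(series[end:end + horizon]))  # values to predict
    return windows, labels


def make_train_test_splits_sketch(windows, labels, test_split=0.2):
    """Chronological split: no shuffling, the most recent windows form the test set."""
    split = int(len(windows) * (1 - test_split))
    return windows[:split], windows[split:], labels[:split], labels[split:]


toy_series = list(range(10))
toy_windows, toy_labels = make_windows_sketch(toy_series, window_size=3, horizon=1)
toy_train_w, toy_test_w, toy_train_l, toy_test_l = make_train_test_splits_sketch(
    toy_windows, toy_labels
)
print(toy_windows[0], toy_labels[0], len(toy_train_w), len(toy_test_w))
```

Splitting without shuffling matters for time series: a random split would let the model train on data that comes after the points it is tested on.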

Model 3:

HORIZON = 7
WINDOW_SIZE = 30

full_windows, full_labels = make_windows(prices, window_size=WINDOW_SIZE, horizon=HORIZON)
len(full_windows), len(full_labels)
     
train_windows, test_windows, train_labels, test_labels = make_train_test_splits(windows=full_windows, labels=full_labels, test_split=0.2)
len(train_windows), len(test_windows), len(train_labels), len(test_labels)

tf.random.set_seed(42)

# Create model (same as model_1 except with different data input size)
model_3 = tf.keras.Sequential([
  layers.Dense(128, activation="relu"),
  layers.Dense(HORIZON)
], name="model_3_dense")

model_3.compile(loss="mae",
                optimizer=tf.keras.optimizers.Adam())

model_3.fit(train_windows,
            train_labels,
            batch_size=128,
            epochs=100,
            verbose=0,
            validation_data=(test_windows, test_labels),
            callbacks=[create_model_checkpoint(model_name=model_3.name)])


model_3.evaluate(test_windows, test_labels)

model_3 = tf.keras.models.load_model("model_experiments/model_3_dense/")
model_3.evaluate(test_windows, test_labels)

model_3_preds = make_preds(model_3,
                           input_data=test_windows)
model_3_preds[:5]

model_3_results = evaluate_preds(y_true=tf.squeeze(test_labels),
                                 y_pred=model_3_preds)
model_3_results

1. Model Architecture

All models share a similar architecture:

The primary difference lies in the input structure and the forecasting horizon.


2. Input and Forecasting Horizon


3. Performance Metrics


The comparison between the two models was then plotted.

[Figure: model comparison plot]

Model 4: Convolutional Neural Network (Conv1D)

Incorporating sequence-to-sequence models or attention mechanisms for multistep forecasting could improve predictions, as could external features like trading volume and sentiment data. Taking the previous models into account, I built a Conv1D model. Conv1D layers are well suited to time-series forecasting because they are often used to find patterns or trends in sequential data.

Input Data


Architecture

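  1. Lambda Layer: Adds an extra dimension to the input so Conv1D receives the 3D tensor it requires (a single unbatched window of shape (7,) becomes (7, 1)).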
  2. Conv1D Layer:
    • Filters: 128 filters extract 128 different patterns from the input sequence.
    • Kernel Size: A kernel size of 5 enables the model to analyze local patterns over 5 consecutive timesteps.
    • Padding: "Causal" padding ensures predictions are based only on past data, crucial for time-series forecasting.
    • Activation Function: ReLU introduces non-linearity to learn complex patterns.
  3. Dense Layer: A single output neuron predicts the next day's price.
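To illustrate what causal padding means, here is a dependency-free sketch of a single-filter 1D convolution in plain Python. The function name and toy kernel are illustrative assumptions, not Keras internals. The input is left-padded with zeros so that the output at step t depends only on inputs at steps up to and including t.

```python
def causal_conv1d(sequence, kernel):
    """Single-filter 1D convolution with causal (left-only) zero padding."""
    k = len(kernel)
    # Pad k - 1 zeros on the left so the output has the same length as the input
    padded = [0.0] * (k - 1) + list(sequence)
    # The window for step t ends at sequence[t], so it never sees future values
    return [
        sum(w * x for w, x in zip(kernel, padded[t:t + k]))
        for t in range(len(sequence))
    ]


kernel = [0.5, 0.5]  # averages the previous and the current step
out = causal_conv1d([1.0, 2.0, 3.0, 4.0], kernel)

# Changing only the last (future) value must leave earlier outputs untouched
out_changed = causal_conv1d([1.0, 2.0, 3.0, 100.0], kernel)
print(out, out_changed)
```

The first three outputs are identical in both runs, which is the property that keeps future prices from leaking into earlier predictions.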

Training


Evaluation Results


HORIZON = 1 # predict next day
WINDOW_SIZE = 7 # use previous week worth of data

# Create windowed dataset
full_windows, full_labels = make_windows(prices, window_size=WINDOW_SIZE, horizon=HORIZON)
len(full_windows), len(full_labels)

train_windows, test_windows, train_labels, test_labels = make_train_test_splits(full_windows, full_labels)
len(train_windows), len(test_windows), len(train_labels), len(test_labels)

train_windows[0].shape 

# Conv1D expects 3D input, so we add an extra dimension to our data before feeding it into the Conv1D layer.
x = tf.constant(train_windows[0])
expand_dims_layer = layers.Lambda(lambda x: tf.expand_dims(x, axis=1)) 
print(f"Original shape: {x.shape}") # (WINDOW_SIZE)
print(f"Expanded shape: {expand_dims_layer(x).shape}") # (WINDOW_SIZE, input_dim) 
print(f"Original values with expanded shape:\n {expand_dims_layer(x)}")

tf.random.set_seed(42)

# Create model
model_4 = tf.keras.Sequential([
  layers.Lambda(lambda x: tf.expand_dims(x, axis=1)), 
  layers.Conv1D(filters=128, kernel_size=5, padding="causal", activation="relu"),
  layers.Dense(HORIZON)
], name="model_4_conv1D")

# Compile model
model_4.compile(loss="mae",
                optimizer=tf.keras.optimizers.Adam())

# Fit model
model_4.fit(train_windows,
            train_labels,
            batch_size=128, 
            epochs=100,
            verbose=0,
            validation_data=(test_windows, test_labels),
            callbacks=[create_model_checkpoint(model_name=model_4.name)])
     
model_4 = tf.keras.models.load_model("model_experiments/model_4_conv1D")
model_4.evaluate(test_windows, test_labels)

# Making predictions
model_4_preds = make_preds(model_4, test_windows)
model_4_preds[:10]

model_4_results = evaluate_preds(y_true=tf.squeeze(test_labels),
                                 y_pred=model_4_preds)
model_4_results

Recurrent Neural Networks (RNNs)

Another model we can use for sequential data is a recurrent neural network, in our case an LSTM. Recurrent neural networks introduce a mechanism where the output from one step is fed back as input to the next, allowing them to retain information from previous inputs. This design makes RNNs well suited to tasks where context from earlier steps is essential, such as predicting the next word in a sentence.

Definition adapted from "Introduction to Recurrent Neural Networks".
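The feedback mechanism can be sketched with a single-unit recurrent cell in plain Python. This is a simplified vanilla RNN step, not an LSTM (which adds gates and a separate cell state), and the weights are arbitrary illustrative values rather than trained parameters.

```python
import math


def simple_rnn(inputs, w_x=0.5, w_h=0.8, bias=0.0):
    """Run a one-unit vanilla RNN over a sequence and return every hidden state."""
    h = 0.0  # initial hidden state
    states = []
    for x in inputs:
        # The new state mixes the current input with the previous state
        h = math.tanh(w_x * x + w_h * h + bias)
        states.append(h)
    return states


# Same inputs at steps 2 and 3, different input at step 1
states_a = simple_rnn([1.0, 0.0, 0.0])
states_b = simple_rnn([0.0, 0.0, 0.0])
print(states_a, states_b)
```

Although both sequences end with the same two inputs, the hidden states differ because the first input is carried forward through the recurrence.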

Model 5: Recurrent Neural Network (LSTM)

Input Data

Architecture

Evaluation


tf.random.set_seed(42)
inputs = layers.Input(shape=(WINDOW_SIZE,))  # trailing comma makes the shape a tuple
x = layers.Lambda(lambda x: tf.expand_dims(x, axis=1))(inputs) 
x = layers.LSTM(128, activation="relu")(x) 
output = layers.Dense(HORIZON)(x)
model_5 = tf.keras.Model(inputs=inputs, outputs=output, name="model_5_lstm")

# Compile model
model_5.compile(loss="mae",
                optimizer=tf.keras.optimizers.Adam())

model_5.fit(train_windows,
            train_labels,
            epochs=100,
            verbose=0,
            batch_size=128,
            validation_data=(test_windows, test_labels),
            callbacks=[create_model_checkpoint(model_name=model_5.name)])

model_5 = tf.keras.models.load_model("model_experiments/model_5_lstm/")
model_5.evaluate(test_windows, test_labels)

# Make predictions with our LSTM model
model_5_preds = make_preds(model_5, test_windows)
model_5_preds[:10]

# Evaluate model 5 preds
model_5_results = evaluate_preds(y_true=tf.squeeze(test_labels),
                                 y_pred=model_5_preds)
model_5_results

Model 6: Multivariate Dense Model

Input Data

Architecture

Evaluation

X = bitcoin_prices_windowed.dropna().drop("Price", axis=1).astype(np.float32) 
y = bitcoin_prices_windowed.dropna()["Price"].astype(np.float32)
X.head()
y.head()

# training and test sets
split_size = int(len(X) * 0.8)
X_train, y_train = X[:split_size], y[:split_size]
X_test, y_test = X[split_size:], y[split_size:]
len(X_train), len(y_train), len(X_test), len(y_test)

tf.random.set_seed(42)
model_6 = tf.keras.Sequential([
  layers.Dense(128, activation="relu"),
  layers.Dense(HORIZON)
], name="model_6_dense_multivariate")

# Compile
model_6.compile(loss="mae",
                optimizer=tf.keras.optimizers.Adam())

# Fit
model_6.fit(X_train, y_train,
            epochs=100,
            batch_size=128,
            verbose=0, # silent: no per-epoch output
            validation_data=(X_test, y_test),
            callbacks=[create_model_checkpoint(model_name=model_6.name)])

model_6 = tf.keras.models.load_model("model_experiments/model_6_dense_multivariate")
model_6.evaluate(X_test, y_test)

# Making predictions on multivariate data
model_6_preds = tf.squeeze(model_6.predict(X_test))
model_6_preds[:10]

# Evaluate preds
model_6_results = evaluate_preds(y_true=y_test,
                                 y_pred=model_6_preds)
model_6_results

Model 7: N-BEATS

The Neural Basis Expansion Analysis (NBEATS) is an MLP-based deep neural architecture with backward and forward residual links. The network has two variants: (1) in its interpretable configuration, NBEATS sequentially projects the signal into polynomials and harmonic basis to learn trend and seasonality components; (2) in its generic configuration, it substitutes the polynomial and harmonic basis for identity basis and larger network’s depth. The Neural Basis Expansion Analysis with Exogenous (NBEATSx), incorporates projections to exogenous temporal variables available at the time of the prediction.

Definition adapted from "NBEATS".


1. Custom N-BEATS Block

The N-BEATS block decomposes time-series data into backcast (residuals) and forecast components, forming the building block of the architecture.

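class NBeatsBlock(tf.keras.layers.Layer):
    def __init__(self, input_size, theta_size, horizon, n_neurons, n_layers, **kwargs):
        super().__init__(**kwargs)
        # Store the sizes; call() needs input_size and horizon to slice theta
        self.input_size = input_size
        self.theta_size = theta_size
        self.horizon = horizon
        self.n_neurons = n_neurons
        self.n_layers = n_layers
        # Stack of fully connected layers followed by a linear theta layer
        self.hidden = [tf.keras.layers.Dense(n_neurons, activation="relu") for _ in range(n_layers)]
        self.theta_layer = tf.keras.layers.Dense(theta_size, activation="linear", name="theta")

    def call(self, inputs):
        x = inputs
        for layer in self.hidden:
            x = layer(x)
        theta = self.theta_layer(x)
        # The first input_size values are the backcast, the last horizon values are the forecast
        backcast, forecast = theta[:, :self.input_size], theta[:, -self.horizon:]
        return backcast, forecast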

# Test the block with dummy inputs
dummy_inputs = tf.expand_dims(tf.range(7) + 1, axis=0)  # Example input for WINDOW_SIZE = 7
dummy_block = NBeatsBlock(input_size=7, theta_size=8, horizon=1, n_neurons=128, n_layers=4)
backcast, forecast = dummy_block(dummy_inputs)
print(f"Backcast: {tf.squeeze(backcast.numpy())}, Forecast: {tf.squeeze(forecast.numpy())}")

2. Preparing the Dataset

The dataset is structured into rolling windows for forecasting, aligning with the N-BEATS architecture's input format.

# Add windowed columns for Bitcoin prices
WINDOW_SIZE = 7
bitcoin_prices_nbeats = bitcoin_prices.copy()
for i in range(WINDOW_SIZE):
    bitcoin_prices_nbeats[f"Price+{i+1}"] = bitcoin_prices_nbeats["Price"].shift(periods=i+1)
bitcoin_prices_nbeats = bitcoin_prices_nbeats.dropna()

# Split the dataset into training and testing sets
X = bitcoin_prices_nbeats.drop("Price", axis=1)
y = bitcoin_prices_nbeats["Price"]
split_size = int(len(X) * 0.8)
X_train, X_test = X[:split_size], X[split_size:]
y_train, y_test = y[:split_size], y[split_size:]

# Convert to TensorFlow datasets
train_features_dataset = tf.data.Dataset.from_tensor_slices(X_train)
train_labels_dataset = tf.data.Dataset.from_tensor_slices(y_train)
test_features_dataset = tf.data.Dataset.from_tensor_slices(X_test)
test_labels_dataset = tf.data.Dataset.from_tensor_slices(y_test)

train_dataset = tf.data.Dataset.zip((train_features_dataset, train_labels_dataset)).batch(1024).prefetch(tf.data.AUTOTUNE)
test_dataset = tf.data.Dataset.zip((test_features_dataset, test_labels_dataset)).batch(1024).prefetch(tf.data.AUTOTUNE)

3. Building the N-BEATS Model

The model consists of multiple stacked N-BEATS blocks with residual stacking to improve predictions.

# Define input parameters
N_EPOCHS = 5000
N_NEURONS = 512
N_LAYERS = 4
N_STACKS = 30
INPUT_SIZE = WINDOW_SIZE
THETA_SIZE = WINDOW_SIZE + 1

# Initialize N-BEATS block
stack_input = layers.Input(shape=(INPUT_SIZE,), name="stack_input")
nbeats_block = NBeatsBlock(input_size=INPUT_SIZE, theta_size=THETA_SIZE, horizon=1, n_neurons=N_NEURONS, n_layers=N_LAYERS)
backcast, forecast = nbeats_block(stack_input)

# Residual stacking
residuals = layers.subtract([stack_input, backcast], name="subtract_00")
for i in range(N_STACKS - 1):
    block = NBeatsBlock(input_size=INPUT_SIZE, theta_size=THETA_SIZE, horizon=1, n_neurons=N_NEURONS, n_layers=N_LAYERS)
    backcast, block_forecast = block(residuals)
    residuals = layers.subtract([residuals, backcast], name=f"subtract_{i}")
    forecast = layers.add([forecast, block_forecast], name=f"add_{i}")

# Final model
model_7 = tf.keras.Model(inputs=stack_input, outputs=forecast, name="model_7_N-BEATS")
model_7.compile(loss="mae", optimizer=tf.keras.optimizers.Adam(0.001), metrics=["mae", "mse"])
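The residual stacking pattern can be illustrated without TensorFlow. In this plain-Python sketch each "block" is a hypothetical stand-in function that returns a (backcast, forecast) pair, whereas real N-BEATS blocks are learned networks. The bookkeeping is the same as in the model code: each block's backcast is subtracted from the running residual, and the block forecasts are summed.

```python
def toy_block(residual, fraction=0.5):
    """Stand-in for a trained block: it 'explains' a fixed fraction of its input."""
    backcast = [fraction * r for r in residual]
    forecast = fraction * residual[-1]  # toy one-step forecast
    return backcast, forecast


def stack_blocks(window, n_blocks=3):
    """Apply blocks in sequence, passing on what earlier blocks left unexplained."""
    residual = list(window)
    total_forecast = 0.0
    for _ in range(n_blocks):
        backcast, forecast = toy_block(residual)
        # Remove the part of the signal this block accounted for
        residual = [r - b for r, b in zip(residual, backcast)]
        # The final prediction is the sum of all block forecasts
        total_forecast += forecast
    return residual, total_forecast


final_residual, toy_forecast_sum = stack_blocks([8.0, 8.0, 8.0], n_blocks=3)
print(final_residual, toy_forecast_sum)
```

Each later block only sees what the earlier blocks failed to reconstruct, which is why the residual shrinks from 8 to 1 over three toy blocks.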

4. Training and Evaluation

The model is trained with callbacks to handle early stopping and learning rate reduction.

# Train the model
model_7.fit(
    train_dataset,
    validation_data=test_dataset,
    epochs=N_EPOCHS,
    callbacks=[
        tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=200, restore_best_weights=True),
        tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", patience=100, verbose=1),
    ]
)

# Evaluate the model
results = model_7.evaluate(test_dataset)
print(f"MAE: {results[1]}, MSE: {results[2]}")

# Make predictions
model_7_preds = make_preds(model_7, test_dataset)
print(f"Predictions: {model_7_preds[:10]}")

5. Results

The N-BEATS model achieved the following metrics:


Constructing and Fitting Ensemble Models

This process involves building and training multiple models with varying loss functions, aggregating their predictions to create a robust ensemble model for time-series forecasting.


1. Ensemble Model Construction

The ensemble consists of multiple individual models trained with different configurations to optimize performance.

Key Components:

  1. Input Parameters:

    • horizon: Defines the forecasting horizon (number of steps to predict).
    • num_iter: Number of iterations per loss function to train models.
    • num_epochs: Number of training epochs per model.
    • loss_fns: List of loss functions to optimize (mae, mse, mape).
  2. Model Architecture:

    • Each model is a simple feedforward neural network with:
      • Two hidden layers (128 neurons each, ReLU activation).
      • One output layer matching the horizon.
  3. Compilation:

    • Models are compiled with the specified loss function and the Adam optimizer.
    • Metrics include mae and mse.
  4. Training:

    • Each model is trained on the training dataset and validated on the test dataset.
    • Early stopping and learning rate reduction callbacks are used to prevent overfitting and optimize training.

Code Sample: Model Construction

def get_ensemble_models(horizon=HORIZON, 
                        train_data=train_dataset,
                        test_data=test_dataset,
                        num_iter=10, 
                        num_epochs=100, 
                        loss_fns=["mae", "mse", "mape"]):
    # List to store trained ensemble models
    ensemble_models = []

    for i in range(num_iter):
        for loss_function in loss_fns:
            print(f"Optimizing model with loss: {loss_function}, model number: {i}")

            # Define the model architecture
            model = tf.keras.Sequential([
                layers.Dense(128, kernel_initializer="he_normal", activation="relu"),
                layers.Dense(128, kernel_initializer="he_normal", activation="relu"),
                layers.Dense(horizon)
            ])

            # Compile the model
            model.compile(
                loss=loss_function,
                optimizer=tf.keras.optimizers.Adam(),
                metrics=["mae", "mse"]
            )

            # Train the model
            model.fit(
                train_data,
                epochs=num_epochs,
                verbose=0,
                validation_data=test_data,
                callbacks=[
                    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=200, restore_best_weights=True),
                    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", patience=100, verbose=1)
                ]
            )

            # Add the trained model to the ensemble list
            ensemble_models.append(model)

    return ensemble_models

# Train ensemble models
ensemble_models = get_ensemble_models(num_iter=5, num_epochs=1000)

2. Making Predictions with the Ensemble

The ensemble's predictions are computed by aggregating the outputs of all individual models, typically using the median to reduce the influence of outliers.

Steps:

  1. Generate Predictions:
    • Each model in the ensemble predicts the test dataset.
  2. Aggregate Predictions:
    • Use the median of predictions across all models as the final ensemble output.

Code Sample: Prediction Aggregation

def make_ensemble_preds(ensemble_models, data):
    ensemble_preds = []
    for model in ensemble_models:
        preds = model.predict(data)
        ensemble_preds.append(preds)
    return tf.constant(tf.squeeze(ensemble_preds))

# Generate ensemble predictions
ensemble_preds = make_ensemble_preds(ensemble_models=ensemble_models, data=test_dataset)

3. Evaluating the Ensemble Model

The aggregated predictions are evaluated against the true labels using metrics like mae, mse, rmse, and mape.

Code Sample: Evaluation

# Evaluate the ensemble model
ensemble_results = evaluate_preds(
    y_true=y_test,
    y_pred=np.median(ensemble_preds, axis=0)
)
ensemble_results

Advantages of Ensemble Models

  1. Robustness:
    • Reduces variance and overfitting compared to individual models.
  2. Performance:
    • Combines strengths of multiple loss functions for better generalization.
  3. Flexibility:
    • Can adapt to various forecasting horizons and data patterns.

Bootstrapping the Models for Prediction Intervals

Bootstrapping is a statistical method used to estimate the uncertainty of model predictions. In this context, the predictions from the ensemble models are bootstrapped to calculate prediction intervals.


1. Calculating Prediction Intervals

The prediction intervals are derived from the standard deviation of the ensemble models' predictions. Assuming the predictions are roughly normally distributed, a 95% prediction interval is computed as:

$$\text{Interval} = \mu \pm 1.96 \cdot \sigma$$

Where μ is the mean of the ensemble predictions at each timestep and σ is their standard deviation.

Code Implementation:

def get_upper_lower(preds): 
    # Calculate standard deviation across ensemble predictions
    std = tf.math.reduce_std(preds, axis=0)
    
    # Compute 95% prediction interval
    interval = 1.96 * std 
    
    # Calculate mean predictions
    preds_mean = tf.reduce_mean(preds, axis=0)
    
    # Compute lower and upper bounds
    lower, upper = preds_mean - interval, preds_mean + interval
    return lower, upper

# Compute prediction intervals for ensemble predictions
lower, upper = get_upper_lower(preds=ensemble_preds)
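As a worked example of the interval calculation, the sketch below computes the bounds for a single timestep in plain Python. It uses `statistics.pstdev` (population standard deviation), which matches what `tf.math.reduce_std` computes. The five predictions are made-up values and `interval_95` is a hypothetical name.

```python
from statistics import mean, pstdev


def interval_95(member_preds):
    """Return (lower, upper) bounds of mean +/- 1.96 standard deviations."""
    mu = mean(member_preds)
    sigma = pstdev(member_preds)  # population standard deviation
    half_width = 1.96 * sigma
    return mu - half_width, mu + half_width


# Predictions of five ensemble members for the same day (illustrative values)
toy_member_preds = [98.0, 99.0, 100.0, 101.0, 102.0]
toy_lower, toy_upper = interval_95(toy_member_preds)
print(toy_lower, toy_upper)
```

Note that an interval built this way reflects only the disagreement between ensemble members, so it can be too narrow when all models share the same blind spot.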

2. Visualization

The results are visualized by plotting the test data, the ensemble median, and the prediction intervals.

Code Implementation:

import matplotlib.pyplot as plt

# Plot ensemble median with prediction intervals
ensemble_median = np.median(ensemble_preds, axis=0)
offset = 500  # Start plotting from this offset to reduce clutter

plt.figure(figsize=(10, 7))
plt.plot(X_test.index[offset:], y_test[offset:], "g", label="Test Data")
plt.plot(X_test.index[offset:], ensemble_median[offset:], "k-", label="Ensemble Median")
plt.fill_between(X_test.index[offset:], 
                 lower[offset:], 
                 upper[offset:], 
                 color="gray", alpha=0.3, label="Prediction Intervals")
plt.xlabel("Date")
plt.ylabel("BTC Price")
plt.legend(loc="upper left", fontsize=14)
plt.title("Ensemble Predictions with 95% Prediction Intervals")
plt.show()

[Figure: Ensemble Predictions with 95% Prediction Intervals]

3. Insights from the Visualization

  1. Ensemble Median:

    • Represents the central tendency of predictions from the ensemble models.
    • Smooths out fluctuations in individual predictions.
  2. Prediction Intervals:

    • Provide a range within which the true Bitcoin price is expected to fall with 95% confidence.
    • Wider intervals indicate higher uncertainty in predictions.
  3. Comparison to Test Data:

    • Allows for evaluating how closely the predictions align with actual prices.

Advantages of Bootstrapping

Training the Model for Next-Day Price Prediction

A 7-day windowed dataset is used to train a feedforward neural network for forecasting Bitcoin prices.


1. Data Preparation

Create a windowed dataset with 7-day inputs and the next day's price as the target.

X_all = bitcoin_prices_windowed.drop(["Price", "block_reward"], axis=1).dropna().to_numpy()
y_all = bitcoin_prices_windowed.dropna()["Price"].to_numpy()
dataset_all = tf.data.Dataset.from_tensor_slices((X_all, y_all)).batch(1024).prefetch(tf.data.AUTOTUNE)

2. Model Training

Train a simple feedforward neural network with two hidden layers.

model_9 = tf.keras.Sequential([
  layers.Dense(128, activation="relu"),
  layers.Dense(128, activation="relu"),
  layers.Dense(1)  # Predict 1 day ahead
])

model_9.compile(loss="mae", optimizer="adam")
model_9.fit(dataset_all, epochs=100, verbose=0)

3. Forecasting

Predict future prices iteratively using the most recent 7-day window.

def make_future_forecast(values, model, into_future, window_size=7):
    future_forecast = []
    last_window = values[-window_size:]
    for _ in range(into_future):
        future_pred = model.predict(tf.expand_dims(last_window, axis=0))
        future_forecast.append(tf.squeeze(future_pred).numpy())
        last_window = np.append(last_window, future_pred)[-window_size:]
    return future_forecast

future_forecast = make_future_forecast(y_all, model_9, into_future=14)
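To see the feedback loop in isolation, the following plain-Python sketch replaces the trained network with a stand-in predictor (the mean of the window). The names `recursive_forecast` and `window_mean` are hypothetical. Only the windowing logic is the point here, not the quality of the forecast.

```python
def recursive_forecast(values, predict_fn, into_future, window_size=7):
    """Forecast step by step, feeding each prediction back in as an input."""
    window = list(values[-window_size:])
    forecast = []
    for _ in range(into_future):
        next_value = predict_fn(window)
        forecast.append(next_value)
        # Drop the oldest value and append the prediction to form the next window
        window = window[1:] + [next_value]
    return forecast


def window_mean(window):
    """Stand-in 'model': predicts the average of the window."""
    return sum(window) / len(window)


toy_future = recursive_forecast(
    [1.0, 2.0, 3.0, 4.0], window_mean, into_future=2, window_size=2
)
print(toy_future)
```

Because each prediction becomes an input to the next one, errors can compound as the forecast extends further into the future.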

4. Visualization

Plot actual and predicted prices.

plot_time_series(bitcoin_prices.index, btc_price, label="Actual BTC Price")
plot_time_series(get_future_dates(bitcoin_prices.index[-1], 14), future_forecast, label="Predicted BTC Price")
plt.show()
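`get_future_dates` is not defined in this post. A minimal sketch of such a helper, assuming daily data and using only the standard library, might look like this. The name `get_future_dates_sketch`, its signature, and the example date are assumptions, not the author's code.

```python
from datetime import date, timedelta


def get_future_dates_sketch(start_date, into_future, offset=1):
    """Return `into_future` consecutive daily dates starting `offset` days after start_date."""
    return [start_date + timedelta(days=offset + i) for i in range(into_future)]


# Illustrative start date, standing in for the last date in the price index
future_dates = get_future_dates_sketch(date(2021, 5, 18), into_future=3)
print(future_dates)
```

The returned dates line up one-to-one with the forecast values, so they can be used directly as the x-axis of the plot.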

Summary


Conclusion

This project showcases the application of deep learning models for Bitcoin price prediction. It highlights how sequential data and neural networks can effectively forecast cryptocurrency prices. Below are the key takeaways and potential avenues for future exploration:


Key Achievements

  1. Baseline Comparisons:

    • Developed baseline models, including a naïve forecast, for robust comparison.
    • Showed how advanced models improve upon simple benchmarks.
  2. Advanced Model Architectures:

    • Explored diverse architectures: dense networks, Conv1D, LSTM, and N-BEATS.
    • Evaluated model performance using metrics such as MAE, MSE, RMSE, and MAPE.
  3. Ensemble Learning:

    • Enhanced prediction accuracy and robustness through ensemble modeling.
    • Used bootstrapping to provide 95% prediction intervals, quantifying model uncertainty.
  4. Short- and Long-Term Forecasting:

    • Models were capable of both single-step (next-day) and multistep forecasting, adapting to various horizons.
  5. Visualization and Interpretability:

    • Provided clear visual comparisons between actual and predicted prices.
    • Demonstrated the effectiveness of prediction intervals in assessing uncertainty.

Challenges and Insights


Future Directions

  1. Incorporate External Data:

    • Add features like trading volume, social sentiment, or macroeconomic indicators to improve predictive power.
  2. Test Advanced Architectures:

    • Experiment with attention mechanisms or transformer models for time-series forecasting.
  3. Real-Time Implementation:

    • Deploy models in real-time trading environments to assess performance under dynamic market conditions.
  4. Explore Financial Metrics:

    • Evaluate models based on financial performance metrics like Sharpe Ratio or drawdown in a simulated trading strategy.
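As a rough sketch of the financial metrics mentioned above, the following plain-Python functions compute a per-period Sharpe ratio (risk-free rate assumed to be zero, no annualization) and the maximum drawdown of an equity curve. The function names and toy numbers are illustrative assumptions. A real backtest would also need to account for fees and slippage.

```python
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation (not annualized)."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)


def max_drawdown(equity_curve):
    """Largest peak-to-trough loss, as a fraction of the running peak."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


# Toy account values from a simulated strategy (illustrative only)
toy_equity = [100.0, 120.0, 90.0, 110.0, 130.0]
toy_returns = [b / a - 1 for a, b in zip(toy_equity[:-1], toy_equity[1:])]
print(sharpe_ratio(toy_returns), max_drawdown(toy_equity))
```

Metrics like these judge a model by the outcomes of trading on its forecasts rather than by forecast error alone.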

Final Thoughts

By employing innovative deep learning techniques, this project not only predicts Bitcoin prices but also offers a methodological framework for handling other financial time-series data. The findings emphasize the balance between complexity and interpretability, providing a strong foundation for further research in cryptocurrency forecasting and quantitative finance.
