Continuous Time Diffusion Across Networks

This post describes how to define a continuous-time differential equation on a network, mainly for diffusion. Diffusion on networks is often estimated with simulation, which is problematic because discrete simulation does not allow for hill climbing. I describe how to model continuous-time diffusion using the graph Laplacian.

Currently, I have been reading Geometric Deep Learning: Going Beyond Euclidean Data for a multitude of reasons. The first reason I sought out this topic is that I knew one could use convolutional neural networks to forecast a variety of things on graphs. For example, a conv-net on a graph could likely forecast node characteristics in the same way as Monte Carlo methods, as shown here. The second reason is not related to economics, but to knot theory. From what little I know, a central question in knot theory is to define invariants: knots that can be untied into one another should have the same invariant (under some definition of an equivalence class of the invariants), and knots that cannot should have different invariants. Designing invariants that capture all of this is an open question (one proved to be undecidable), and current invariants don't even come close. I later figured out that an easier approach would come down to computing the similarity of two knot groups (in the algebraic sense) via the Wirtinger presentation, and wouldn't require defining a conv-net on a graph. One could then run an LSTM through each "word" (i.e., the word representation of that group) and use some comparison tool to match the state at the end of the first word against the state at the end of the second.

While reading this paper, I discovered that you can solve differential equations on graphs. Most notably, you can solve diffusion equations analytically on a graph, which is something I want to understand more concretely and is why I am writing this blog post. A huge benefit is that the solution to this differential equation is differentiable in its parameters, unlike the simulated diffusion on graphs that network econometricians use so often. This means estimation is just an application of hill climbing, as opposed to a time-consuming and inaccurate brute-force grid search. Additionally, the continuous-time analog gives the expected value of the diffusion directly, so you don't have to simulate diffusion hundreds of times to approximate that expectation.
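To make that concrete, here is a minimal sketch (my own illustration, not code from the post; the adjacency matrix, rate parameter theta, and initial state are all made up). The state evolves as dx/dt = -θLx, so x(t) = exp(-θtL)x(0), which is smooth in θ and therefore amenable to hill climbing:

```python
# A minimal sketch: closed-form continuous-time diffusion on a small
# example network via the graph Laplacian.
import numpy as np
from scipy.linalg import expm

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)   # toy adjacency matrix
L = np.diag(A.sum(axis=1)) - A              # combinatorial graph Laplacian

x0 = np.array([1.0, 0.0, 0.0, 0.0])         # all mass starts on node 0
theta = 0.7                                  # diffusion rate (the parameter to estimate)

def diffuse(theta, t):
    """Expected diffusion state at time t: solves dx/dt = -theta * L x."""
    return expm(-theta * t * L) @ x0

print(diffuse(theta, 2.0))                  # smooth in theta, so gradients exist
```

Because exp(-θtL)x(0) is the expected state itself, a single matrix-exponential evaluation replaces averaging over many simulated diffusion runs.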

Continue reading “Continuous Time Diffusion Across Networks”

Embedding Layers, Autoencoders and High Dimensional Forecasting

Introduction

This post will illustrate how to use neural networks to do dimensionality reduction and generate usable factors. I will first discuss methods typically used in machine learning for dimensionality reduction: PCA regression, LASSO, and ridge regression. I will then discuss how PCA regression lags the other methods in performance, yet is the only one of the three that generates factors. One reason PCA regression performs poorly is that the PCA step is not optimized to pick the best factors for the regression step; instead, PCA is performed separately from the regression. As a result, LASSO and ridge, which optimize and perform dimensionality reduction simultaneously, typically outperform PCA regression. Knowing this, I will show how one can use neural networks, which do dimensionality reduction and regression with end-to-end optimization, to extract factors. This produces estimates on par with LASSO and ridge. I will combat neural network overfitting by seeding the initial weights with the PCA regression weights. Finally, I will demonstrate some extensions: extracting factors for multiple outputs, nonlinear dimensionality reduction, and regularized dimensionality reduction.
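As a preview, here is a minimal sketch of the core idea (my own illustration on synthetic data; the post's actual code may differ): a linear bottleneck network whose encoder is seeded with the PCA loadings, then trained end-to-end so the factors are optimized for the regression:

```python
# A minimal sketch: seed a bottleneck regression network with PCA weights.
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 50)).astype(np.float32)   # 50 predictors
beta = rng.standard_normal(50).astype(np.float32)
y = X @ beta + 0.1 * rng.standard_normal(500).astype(np.float32)

k = 5                                                   # number of factors
Xc = X - X.mean(axis=0)                                 # center, then PCA via SVD
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt[:k]                                     # PCA loadings, (k, 50)

encoder = nn.Linear(50, k, bias=False)                  # factor-extraction layer
head = nn.Linear(k, 1)                                  # regression layer
with torch.no_grad():
    # Seed the encoder so training starts from the PCA regression solution.
    encoder.weight.copy_(torch.from_numpy(components.copy()))

model = nn.Sequential(encoder, head)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
X_t, y_t = torch.from_numpy(X), torch.from_numpy(y).unsqueeze(1)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(X_t), y_t)
    loss.backward()                                     # end-to-end optimization
    opt.step()

factors = encoder(X_t).detach()                         # learned factors, (500, k)
```

Starting from the PCA solution means gradient descent only moves the factors away from PCA when doing so lowers the forecasting loss, which is one way to keep the network competitive with LASSO and ridge while still yielding interpretable factors.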

Continue reading “Embedding Layers, Autoencoders and High Dimensional Forecasting”

Macroeconomic Time-Series Forecasting with Long Short-Term Memory Networks (LSTMs)

First, I should mention that this is my opinion and mine alone. I do not represent the opinion of any organization I work for or am affiliated with. All errors are my own.

I have a working paper discussing the use of LSTM neural networks, mainly to forecast GDP. It's still very much a work in progress, so any comments are helpful. Here is the abstract:

This paper improves upon state-of-the-art macroeconomic forecasting using a Long Short-Term Memory network (LSTM) (Schmidhuber et al., 2005), an architecture that has had much success in machine learning time series forecasting. In GDP forecasting, this application outperforms the Survey of Professional Forecasters (SPF) and traditional economic models with near consistency over one- to five-period-ahead forecasts on recent data. Additionally, the model outperforms the SPF on the critical one-horizon-ahead forecast during the Great Recession period (2007-2011), at better than a 5% Diebold-Mariano p-value threshold (Diebold et al., 1991). Given this forecasting performance, the paper discusses implementation and techniques for using this model and suggests avenues for future improvement tailored to economics.
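For readers who want a concrete picture of the model class, here is a generic sketch of one-step-ahead LSTM forecasting (illustrative only: random data stands in for a macro series, and neither the architecture nor the hyperparameters are the paper's):

```python
# A generic sketch of one-step-ahead time-series forecasting with an LSTM.
import torch
import torch.nn as nn

torch.manual_seed(0)
T, window = 200, 8                      # series length, lookback window
series = torch.randn(T)                 # placeholder for a macro series

# Build (window -> next value) training pairs.
X = torch.stack([series[i:i + window] for i in range(T - window)]).unsqueeze(-1)
y = series[window:].unsqueeze(-1)       # shapes: (N, window, 1), (N, 1)

class LSTMForecaster(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        out, _ = self.lstm(x)           # out: (N, window, hidden)
        return self.head(out[:, -1, :]) # predict from the last hidden state

model = LSTMForecaster()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(X), y)
    loss.backward()
    opt.step()

next_value = model(X[-1:])              # one-period-ahead forecast
```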

Here is the link to my paper.

You are welcome to use, but please cite:

Fen, Cameron. "Macroeconomic Time Series Forecasting with Long Short Term Memory Networks." <https://quantonomics.wordpress.com/2017/10/22/macroeconomic-time-series-forecasting-with-long-short-term-memory-networks-lstms/>