Physics-constrained Deep Learning of Building Thermal Dynamics

November 22, 2020

Ján Drgoňa, Aaron Tuor, Vikas Chandan, and Draguna L. Vrabie

This article summarizes work presented in NeurIPS 2020 Workshop Tackling Climate Change with Machine Learning: Physics-constrained Deep Recurrent Neural Models of Building Thermal Dynamics by Drgoňa, Tuor, Chandan, and Vrabie.

We address the challenge of developing physics-constrained and control-oriented predictive models of the building’s thermal dynamics using a dataset obtained from a real-world commercial building. The proposed method is based on the systematic encoding of the physics-based priors into structured neural architecture with a robust inductive bias towards predicting building dynamics. Specifically, our model mimics the structure of the building thermal dynamics model and leverages penalty methods to model inequality constraints. Additionally, we use constrained matrix parameterization based on the Perron-Frobenius theorem to bind the learned network weights’ eigenvalues. We interpret the stable eigenvalues as dissipativeness of the learned building thermal model. We demonstrate the proposed approach’s effectiveness on a dataset obtained from an office building with 20 thermal zones.

Graphical abstract.
Graphical abstract.


Physics-based Dynamical Model of Building Thermal Dynamics

In general, the building thermal behavior is determined by high-dimensional, nonlinear and often discontinuous dynamical processes. Thus, obtaining accurate and reliable dynamical models of buildings remains a challenging task and typically involves computationally demanding physics-based modeling. As a consequence, high computational demands and non-differentiability can easily cast the physics-based model not suitable for efficient gradient-based optimization required in various applications. Therefore, a sound trade-off between the model accuracy and simplicity is required.

Fortunately, the nature of the building’s dynamics allows us to use several assumptions to decrease the modeling complexity. First, the building envelope model can be accurately linearized around a working point. Second, the HVAC dynamics and weather disturbances can be decoupled from the linearized building envelope dynamics to form a Hammerstein-Wiener model structure illustrated in the following figure 1(a).

Generic model structure.
Generic model structure.

These simplifications, allow us to represent the building thermal response model as a following differential equation with non-linear input dynamics:

xt+1=Axt+Bqt+fd(dt),yt=Cxt,qt=m˙tcpΔTt,\begin{aligned} \bm{x}_{t+1} & = A \bm{x}_{t} + B \bm{q}_{t} + f_d(\bm{d}_{t}), \\ \bm{y}_{t} & = C \bm{x}_{t}, \\ \bm{q}_{t} & = \dot{\bm{m}}_{t} cp \Delta \bm{T}_{t}, \end{aligned}

Where xt\bm{x}_{t} and yt\bm{y}_{t} represent the values of the states (envelope temperatures), and measurements (zone temperatures) at time tt, respectively. Disturbances dt\bm{d}_{t} represent the influence of weather conditions and occupancy behavior. Heat flows delivered to the building qt\bm{q}_{t} are defined by mass flows m˙t\dot{\bm{m}}_{t} multiplied by a difference of the supply and return temperatures ΔTt\Delta \bm{T}_{t}, and by specific heat capacity cpcp.

If the model is built from physical principles, it is also physically interpretable. For instance, the AA matrix represents heat transfer between spatially discretized system states, representing temperatures of the building envelope. BB matrix defines the temperature increments caused by the convective heat flow generated by the HVAC system. While, fdf_d captures the thermal dynamics caused by the varying weather conditions and internal heat gains generated by the occupancy or appliances. However, every building represents a unique system with different operational conditions. Therefore, obtaining the parameters of the above mentioned differential equations from physical principles is a time-consuming, impractical task.

Structured Recurrent Neural Dynamical Model

To incorporate constraints and physics-based structural assumptions, we introduce a generic block nonlinear state space model (SSM) for partially observable systems. Each block of this structured graph model corresponds to a different part of the generic building model structure. Thus the overall physically interpretable architecture is shown in figure 1(b). The block-structured recurrent neural dynamics model is then defined as:

xt+1=fx(xt)+fu(ut)+fd(dt)yt=fy(xt)x0=fo([y1N;;y0])\begin{aligned} \bm{x}_{t+1} &= f_{x}(\bm{x}_t) + f_{u}(\bm{u}_t) + f_{d}(\bm{d}_t) \\ \bm{y}_t &= f_{y}(\bm{x}_t) \\ \bm{x}_{0} &= {f}_o([\bm{y}_{1-N}; \ldots; \bm{y}_{0}]) \end{aligned}

Here fxf_{x}, fuf_{u}, and fdf_{d} represent decoupled neural components of the overall system model. The advantage of the block nonlinear over black-box state space model lies in its structure. The decoupling allows us to leverage prior knowledge for imposing structural assumptions and constraints onto individual blocks ff_* of the model.

Constraining Eigenvalues of the System Dynamics

One key physics insight is that building thermal dynamics represents a dissipative system with stable eigenvalues. This inspired us to enforce physically reasonable constraints on the eigenvalues of a model’s weight matrices. We leverage the method based on the Perron-Frobenius theorem, which states that the row-wise minimum and maximum of any positive square matrix define its dominant eigenvalue’s lower and upper bound, respectively. Guided by this theorem, we can construct a state transition matrix A~\bm{\tilde{A}} with bounded eigenvalues:

M=1ϵσ(M)A~i,j=exp(Aij)k=1nxexp(Aik)Mi,j\begin{aligned} \bm{M} &= 1 - \epsilon\sigma(\bm{M'}) \\ \bm{\tilde{A}}_{i,j} &= \frac{\text{exp}(\bm{A'}_{ij})}{\sum_{k=1}^{n_x} \text{exp}(\bm{A'}_{ik})}\bm{M}_{i,j} \end{aligned}

We introduce a matrix M\bm{M} which models damping parameterized by the matrix MRnx×nx\bm{M'} \in \mathbb{R}^{n_x \times n_x}. We apply a row-wise softmax to another parameter matrix ARnx×nx\bm{A'} \in \mathbb{R}^{n_x \times n_x}, then elementwise multiply by M\bm{M} to obtain our state transition matrix A~\bm{\tilde{A}} with constrained eigenvalues in the interval (1ϵ,1)(1 - \epsilon, 1).

Inequality Constraints via Penalty Methods

Using an optimization strategy known as the penalty method, we can add further constraints to our model such that its variables remain within physically realistic bounds. We use a simple method for enforcing this property by applying inequality constraints via penalty functions p(x)p(\bm{x}) for each time step tt:

p(xt,xt): xtxt+stxstx=max(0,xt+xt)p(xt,xt): xtstxxtstx=max(0,xtxt)\begin{aligned} p(\bm{x}_t, \bm{\underline{x}}_t): \ \bm{\underline{x}}_t \leq \bm{x}_t + \bm{s}^{\underline{x}}_t \:\:\: &\cong \:\: \bm{s}^{\underline{x}}_{t} = \text{max}(0,\:-\bm{x}_t + \bm{\underline{x}}_t) \\ p(\bm{x}_t, \bm{\overline{x}}_t): \ \bm{x}_t - \bm{s}^{\overline{x}}_t \leq \bm{\overline{x}}_t \:\:\: &\cong \:\: \bm{s}^{\overline{x}}_{t} = \text{max}(0,\:\bm{x}_t - \bm{\overline{x}}_t) \end{aligned} \\

The constraints lower and upper bounds are given as yk\bm{\underline{y}}_k and yk\mathbf{\overline{y}}_k, respectively. The slack variables sky\bm{s}^{\underline{y}}_k and sky\bm{s}^{\overline{y}}_k indicate the magnitude to which each constraint is violated, and we penalize them heavily in the optimization objective by a large weight on these additional terms in the loss function.

Multi-term Loss Function

Now everything comes together in a weighted multi-term loss function. We optimize the following loss function augmented with regularization and penalty terms to train the recurrent neural model unrolled over NN steps:

LMSE(Yref,YΘ)=1Nt=1Nytrefyt22+Qdxxtxt122+Qineqysty22+Qinequstfu22+Qineqdstfd22\mathcal{L}_{\text{MSE}}(\mathcal{Y}^{\text{ref}}, \mathcal{Y} | \Theta) = \frac{1}{N} \sum_{t=1}^{N} ||\bm{y}^{\text{ref}}_{t} - \bm{y}_{t}||^2_2 + Q_{\text{dx}}||\bm{x}_{t} - \bm{x}_{t-1}||^2_2 + Q_{\text{ineq}}^{\bm{y}}||\bm{s}^{\bm{y}}_{t}||^2_2 + Q_{\text{ineq}}^{\bm{u}}||\bm{s}^{f_u}_{t}||^2_2 + Q_{\text{ineq}}^{\bm{d}}||\bm{s}^{f_d}_{t}||^2_2

The main objective of loss function computes the mean squared error between predicted y\bm{y} and observed outputs yref\bm{y}^{\text{ref}} over NN time steps. The term xtxt1\bm{x}_{t} - \bm{x}_{t-1} represents state difference penalty promoting learning of smoother and physically more plausible state trajectories. The violations of inequality constraint defining the boundary conditions of outputs y\bm{y} is penalized by incorporating weighted slack variables sy\bm{s}^{\bm{y}}. Thanks to the block-structured dynamics, we can constrain the dynamical contribution of inputs fuf_u and disturbances fdf_d towards the overall dynamics via two additional terms in the loss function. This allows us to limit the effect of the external factors to be bounded within physically plausible ranges. For instance, it is not physically realistic that 1K1K change in the ambient temperature would result in 2K2K change in indoor temperature in a single time step.

Experimental Case Study


The objective is to develop a control-oriented thermal dynamics model of a commercial office building, given a limited amount of measurement data. The building used in this study is a commercial office building with 2020 thermal zones. Heating and cooling are provided by a variable air volume (VAV) system served by 44 air handling units (AHUs) serving 2424 VAV boxes (zones). Each VAV box is equipped with a hot water reheat coil. A boiler, fed by natural gas, supplies hot water to the reheat coils and AHU coils. Chilled water is supplied by a central chiller plant.

External facade of the office building.
External facade of the office building.
Floor plan of the office building.
Floor plan of the office building.

The time series dataset DD is given in the form of tuples with input, disturbance, and output variables, respectively.

D={(ut(i),dt(i),yt(i)),(ut+Δ(i),dt+Δ(i),yt+Δ(i)),,(ut+NΔ(i),dt+NΔ(i),yt+NΔ(i))}, D = \{(\bm{u}^{(i)}_t, \bm{d}^{(i)}_t, \bm{y}^{(i)}_t), (\bm{u}^{(i)}_{t+\Delta}, \bm{d}^{(i)}_{t+\Delta}, \bm{y}^{(i)}_{t+\Delta}), \ldots, (\bm{u}^{(i)}_{t+N\Delta}, \bm{d}^{(i)}_{t+N\Delta}, \bm{y}^{(i)}_{t+N\Delta})\},

where i=N1ni = \bm{N}_1^n represents index of nn batches of time series trajectories with NN-step horizon. The data is sampled with sampling time Δ=15\Delta = 15 min. We have in total ny=20n_y = 20 output variables corresponding to zone temperatures, nu=40n_u = 40 input variables representing HVAC temperatures and mass flows, and nd=1n_d = 1 disturbance variable for ambient temperature forecast. We use \texttt{min-max} normalization to scale all variables between [0,1][0, 1]. The dataset consists of 3030 days, which corresponds to only 28802880 datapoints. We group the dataset into evenly split training, development, and test sets, 960960 data points each.

Experimental Setup

We implement the models using Pytorch, and train with randomly initialized weights using the Adam optimizer with a learning rate of 0.0030.003, and 5,0005,000 gradient descent updates. We select the best performing model on the development set, and report results on the test set. The state estimator fof_o is a fully connected neural network, while neural blocks ff_* are represented by recurrent neural networks with 22 layers and 8080 nodes. We range the prediction horizon as powers of two 2n2^n with n=3,,6n = 3, \ldots, 6, which corresponds to 22 up to 1616 hour window. The relative weights of the multi-term loss function are Qdx=0.2Q_{\text{dx}}=0.2, Qineqy=1.0Q_{\text{ineq}}^{\mathbf{y}}=1.0, Qinequ=0.2Q_{\text{ineq}}^{\mathbf{u}}=0.2, and Qineqd=0.2Q_{\text{ineq}}^{\mathbf{d}}=0.2. We set λmin=0.8\lambda_{\text{min}} =0.8 and λmax=1.0\lambda_{\text{max}}=1.0 for stability and dissipativity of learned dynamics.

Results and Analysis

We assess the simulated open-loop and NN-step MSE performance of the recurrent model with and without physics-constraints and structure. The MSE metrics are shown in figure 4 and in following table.

metric bars2




NN-step [K][K]

Open-loop [K][K]











The open-loop MSE of the best-performing constrained and structured model corresponds to 0.4880.488K. In comparison, the state of the art gray-box system identification methods trained on a similar amount of data reports open-loop MSE roughly equal to 1.01.0K. Hence our preliminary results show more than 50%50\% reduction in error against state of the art in literature. We observe that imposing physics-inspired structure and constraints not only yields 15%15\% reduction of error but allows us to train models with a larger prediction horizon NN. The performance of the best-performing models is compared in the following figures and table.

Open-loop trajectories of the learned (blue) and ground truth (red) multi-zone building thermal dynamics.
Open-loop trajectories of the learned (blue) and ground truth (red) multi-zone building thermal dynamics.

We perform dynamical simulation of the learned dynamical model by unrolling its dynamics into the future. By comparing the trajectories against measured data, we demonstrate the capability to generalize complex dynamics over 3030-days using only 1010-days of training data.

Eigenvalue Analysis and Physical Interpretability

Now we show how using eigenvalue constraints promotes physical coherence and interpretability of the learned model. Following figure 7 shows concatenated eigenvalues in the complex plane for weights of the state transition maps fxf_x with and without eigenvalue constraints. We also compare the eigenvalues of the classical unstructured recurrent neural model which lumps the building dynamics into a single block ff. Please note that we plot only eigenvalues of the neural network’s weights. Hence the dynamic effects of the activation functions are omitted in this analysis. However, all our neural network blocks are designed with GELU activation functions, which represent contractive maps with strictly stable eigenvalues. Therefore, based on the argument of the composition of stable functions, the global stability of the learned dynamics is not compromised.


Figure 7(a) shows the effect of proposed Perron-Frobenius factorization, and verifies that the dominant eigenvalue remains within prescribed bounds λmin=0.8\lambda_{\text{min}} =0.8 and λmax=1.0\lambda_{\text{max}}=1.0. Hence the disipativeness of the learned dynamics is hard constrained within physically realistic values when using eigenvalue constraints. Another interesting observation is that there are only two dominant dynamical modes with eigenvalues larger than 0.80.8, one per each layer of fxf_x. While the rest of the eigenvalues fall within 0.050.05 radius, hence representing less significant dynamic modes. This indicates a possibility to obtain lower-order representations of the underlying higher-order nonlinear system, a property useful for real-time optimal control applications. On the other hand, as shown in figure 7(c), using a standard recurrent model with lumped dynamics results in losing physical interpretability and generates physically implausible unstable dynamical modes violating the disipativeness property. Moreover, the imaginary parts of unconstrained dynamics shown in figure 7(b) and 7(c) indicate oscillatory modes of the autonomous state dynamics. However, in the case of building thermal dynamics, the periodicity of the dynamics is caused by external factors such as weather and occupancy schedules. From this perspective, the structured models using eigenvalue constrained weights are closer to the physically realistic system dynamics.


Reliable data-driven methods that are cost-effective in terms of computational demands, data collection, and domain expertise have the potential to revolutionize the field of energy-efficient building operations through the wide-scale acquisition of building specific, scalable, and accurate prediction models. We presented a constrained deep learning method for sample-efficient and physics-consistent data-driven modeling of building thermal dynamics. Our approach does not require the large time investments by domain experts and extensive computational resources demanded by physics-based emulator models. Based on only 10 days’ measurements, we significantly improve on prior state-of-the-art results for a modeling task using a real-world large scale office building dataset. A potential limitation of the presented approach is the restrictiveness of the used constraints, where wrong initial guess of the eigenvalue and penalty constraints bounds may lead to decreased accuracy of the learned model. Future work includes a systematic comparison against physics-based emulator models and other standard data-driven methods. Authors also plan to use the method as part of advanced predictive control strategies for energy-efficient operations in real-world buildings.


This research was supported by the Laboratory Directed Research and Development (LDRD) at Pacific Northwest National Laboratory (PNNL). PNNL is a multi-program national laboratory operated for the U.S. Department of Energy (DOE) by Battelle Memorial Institute under Contract No. DE-AC05-76RL0-1830.