Modeling with Numpy

Modeling with Numpy#

We will skip this notebook during the workshop. It demonstrates how to use numpy to prototype our mathematical model. For dynamic models, we strongly recommend prototyping in numpy, MATLAB, or a similar environment, especially if you are new to Pyomo. Once you are happy the model equations are reasonable, you can move to Pyomo.

The code below sets new defaults for plots.

import matplotlib.pyplot as plt

SMALL_SIZE = 14
MEDIUM_SIZE = 16
BIGGER_SIZE = 18

plt.rc('font', size=SMALL_SIZE)  # controls default text sizes
plt.rc('axes', titlesize=SMALL_SIZE)  # fontsize of the axes title
plt.rc('axes', labelsize=MEDIUM_SIZE)  # fontsize of the x and y labels
plt.rc('xtick', labelsize=SMALL_SIZE)  # fontsize of the tick labels
plt.rc('ytick', labelsize=SMALL_SIZE)  # fontsize of the tick labels
plt.rc('legend', fontsize=SMALL_SIZE)  # legend fontsize
plt.rc('figure', titlesize=BIGGER_SIZE)  # fontsize of the figure title
plt.rc('lines', linewidth=3)

Numpy simulation#

import numpy as np
from scipy.integrate import solve_ivp
import pandas as pd
import matplotlib.pyplot as plt

# known parameters
T_amb = 21  # deg C
alpha = 0.00016  # watts / (units P1 * percent U1)
P1 = 200  # P1 units

# adjustable parameters
CpH = 7  # joules/deg C
CpS = 0.01  # joules/deg C
Ua = 0.05  # watts/deg C
Ub = 0.001  # watts/deg C

# initial conditions
TH1 = T_amb
TS1 = T_amb
IC = [TH1, TS1]

# input values
U1 = lambda t: 50  # steady state value of u1 (percent)

# extract data from experiment
t_expt = np.linspace(0, 600, 601)


def tclab_ode(theta, U1, T_amb, t_expt, return_data_frame=False):
    '''ODE system for TCLab

    Arguments:
        theta: list fitted parameters CpH, CpS, Ua, Ub
        U1: function that returns the value of U1 at time t
        T_amb: ambient temperature
        t_expt: time values for the experiment

    '''

    # unpack the adjustable parameters
    Ua, Ub, inv_CpH, inv_CpS = theta

    # right hand side of the ODEs
    def deriv(t, y):
        TH1, TS1 = y
        dTH1 = (-Ua * (TH1 - T_amb) + Ub * (TS1 - TH1) + alpha * P1 * U1(t)) * inv_CpH
        dTS1 = Ub * (TH1 - TS1) * inv_CpS
        return [dTH1, dTS1]

    # define the initial conditions
    IC = [T_amb, T_amb]

    # numerically integrate the ODEs
    soln = solve_ivp(deriv, [min(t_expt), max(t_expt)], IC, t_eval=t_expt)

    if return_data_frame:
        # create dataframe with predictions
        pred = pd.DataFrame(columns=["Time"])
        pred["Time"] = t_expt
        pred = pred.set_index("Time")

        # report the model temperatures
        pred["TH1"] = soln.y[0]
        pred["TS1"] = soln.y[1]

        # create dataframe with predictions
        pred = pd.DataFrame(columns=["Time"])
        pred["Time"] = t_expt
        pred["TH1"] = soln.y[0]
        pred["TS1"] = soln.y[1]
        pred["Q1"] = U1(t_expt)

        return pred
    else:
        return soln.y[1]


pred = tclab_ode(
    theta=[Ua, Ub, 1 / CpH, 1 / CpS],
    U1=U1,
    T_amb=T_amb,
    t_expt=t_expt,
    return_data_frame=True,
)
pred

	Time	TH1	TS1	Q1
0	0.0	21.000000	21.000000	50
1	1.0	21.227741	21.011029	50
2	2.0	21.453832	21.042602	50
3	3.0	21.678288	21.092596	50
4	4.0	21.901125	21.157645	50
...	...	...	...	...
596	596.0	52.543808	52.551269	50
597	597.0	52.547056	52.550723	50
598	598.0	52.550280	52.550540	50
599	599.0	52.553480	52.550681	50
600	600.0	52.556658	52.551106	50

601 rows × 4 columns

pred[["TS1", "TH1"]].plot(xlabel="Time / seconds", ylabel="Temperature / deg C")
plt.show()

../_images/cb5867e26bca1bb2b846580a71b567b05efff88aa0d1cd2445616646f6fde2bc.png

Parameter estimation: step test#

Load data (step test)#

import sys

# If running on Google Colab, install Pyomo and Ipopt via IDAES
on_colab = "google.colab" in sys.modules
if on_colab:
    file = "https://raw.githubusercontent.com/dowlinglab/pyomo-doe/main/data/tclab_step_test.csv'"
else:
    file = '../data/tclab_step_test.csv'

df = pd.read_csv(file)
df.head()

	Time	T1	T2	Q1
0	0.00	22.84	22.84	50.0
1	1.00	22.84	22.84	50.0
2	2.01	23.16	22.84	50.0
3	3.02	22.84	22.84	50.0
4	4.01	22.84	22.84	50.0

Nonlinear regression and uncertainty analysis#

from scipy.optimize import least_squares


def covariance_to_correlation(cov):
    '''Convert covariance matrix into correlation matrix

    Argument:
        cov: covariance matrix

    Returns:
        cor: correlation matrix

    '''

    # Copy matrix
    cor = cov.copy()

    # Get number of rows
    n = cor.shape[0]

    # Loop over rows
    for r in range(n):
        # Loop over columns
        for c in range(n):
            # Scale element
            cor[r, c] = cor[r, c] / np.sqrt(cov[r, r] * cov[c, c])

    return cor


def perform_regression(data, theta_initial=[Ua, Ub, 1 / CpH, 1 / CpS]):
    '''Regress the data to find the parameters Ua, Ub, inv_CpH, inv_CpS

    Arguments:
        data: pandas DataFrame with columns "Time", "Q1", "T1", "T2"
        theta_initial: initial guess for the parameters
    '''

    # Interpolate the control signal
    U1 = lambda t: np.interp(t, data["Time"], data["Q1"])

    # Calculate the initial conditions
    T_amb = data.T1.values[0]

    # Assemble y data
    y_data = data["T1"].values

    # Define residual function
    def residuals(p):
        pred = tclab_ode(p, U1, T_amb, data["Time"].to_numpy(), return_data_frame=False)
        return pred - y_data

    # Set bounds for Ua, Ub, inv_CpH, inv_CpS
    # These are based on physical intuition
    bnds = ([1e-5, 1e-5, 1e-2, 1e-2], [2.0, 2.0, 100, 100])

    # Perform least squares nonlinear regression
    nl_results = least_squares(
        residuals, theta_initial, bounds=bnds, method='trf', verbose=2, loss="arctan"
    )
    theta_hat = nl_results.x

    # extract and print values
    Ua, Ub, inv_CpH, inv_CpS = nl_results.x
    CpH = 1 / inv_CpH
    CpS = 1 / inv_CpS

    print('CpH = ', round(CpH, 3), "J/degC")
    print('CpS =', round(CpS, 3), "J/degC")
    print('Ua = ', round(Ua, 3), "W/degC")
    print('Ub = ', round(Ub, 3), "W/degC")

    # plot the results
    pred = tclab_ode(
        [Ua, Ub, inv_CpH, inv_CpS],
        U1,
        T_amb,
        data["Time"].to_numpy(),
        return_data_frame=True,
    )
    ax = data["T1"].plot(marker='o', color='red', alpha=0.1)
    pred[["TS1", "TH1"]].plot(
        ax=ax, linewidth=3, xlabel="Time / seconds", ylabel="Temperature / deg C"
    )
    plt.show()

    # plot the residuals
    r = residuals(nl_results.x)

    # define font size
    fs = 20

    # plot data
    plt.hist(r)
    plt.xlabel("$T_1$ residuals [$^\circ{}$C]", fontsize=fs, fontweight='bold')
    plt.ylabel("Count", fontsize=fs, fontweight='bold')

    # define tick size
    plt.xticks(fontsize=fs)
    plt.yticks(fontsize=fs)
    plt.tick_params(direction="in", top=True, right=True)

    # finish plot
    plt.show()

    # Estimate covariance and correlation of fitted parameters

    # Estimate covariance of residuals
    var_residuals = r.T @ r / (len(r) - len(nl_results.x))
    print(
        "Estimated standard deviation of residuals:",
        round(np.sqrt(var_residuals), 3),
        "deg C",
    )

    # Estimate covariance of fitted parameters
    cov_p = var_residuals * np.linalg.inv(nl_results.jac.T @ nl_results.jac)

    print("\nCovariance of p = [Ua, Ub, inv_CpH, inv_CpS]")
    print(cov_p)

    print("\nCorrelation matrix")
    print(covariance_to_correlation(cov_p))


perform_regression(data=df)

   Iteration     Total nfev        Cost      Cost reduction    Step norm     Optimality   
       0              1         6.3133e+02                                    1.50e+05    
       1              2         5.8392e+02      4.74e+01       4.76e-02       5.82e+04    
       2              3         2.9743e+02      2.86e+02       9.82e+00       7.26e+05    
       3              6         2.6396e+02      3.35e+01       6.83e-03       2.92e+06    
       4              7         1.2592e+02      1.38e+02       1.94e-02       1.31e+06    
       5              8         5.2531e+01      7.34e+01       1.02e-02       5.77e+05    
       6              9         3.6753e+01      1.58e+01       3.20e-03       5.16e+05    
       7             10         2.5627e+01      1.11e+01       3.81e-03       3.35e+05    
       8             11         2.2790e+01      2.84e+00       9.23e-04       2.77e+05    
       9             12         2.0221e+01      2.57e+00       6.84e-04       3.79e+05    
      10             13         1.7188e+01      3.03e+00       2.96e-03       1.40e+05    
      11             14         1.6297e+01      8.91e-01       1.38e-03       1.72e+05    
      12             15         1.5026e+01      1.27e+00       4.06e-03       1.66e+05    
      13             16         1.3977e+01      1.05e+00       4.49e-03       1.10e+05    
      14             17         1.3596e+01      3.81e-01       3.11e-03       2.74e+04    
      15             18         1.3521e+01      7.53e-02       1.89e-03       1.41e+05    
      16             19         1.2610e+01      9.11e-01       6.17e-03       1.25e+05    
      17             20         1.2169e+01      4.41e-01       7.53e-03       1.00e+05    
      18             21         1.1888e+01      2.81e-01       9.23e-03       7.16e+04    
      19             22         1.1752e+01      1.36e-01       1.00e-02       4.45e+04    
      20             23         1.1703e+01      4.99e-02       1.49e-02       2.61e+04    
      21             24         1.1686e+01      1.65e-02       2.56e-02       1.49e+04    
      22             25         1.1681e+01      5.12e-03       4.75e-02       8.22e+03    
      23             26         1.1679e+01      1.50e-03       9.13e-02       4.41e+03    
      24             27         1.1679e+01      4.18e-04       9.93e-02       2.37e+03    
      25             28         1.1679e+01      1.16e-04       2.00e-01       1.31e+03    
      26             29         1.1679e+01      2.90e-05       4.03e-01       8.65e+02    
      27             30         1.1679e+01      9.37e-06       4.11e-01       6.12e+02    
      28             31         1.1679e+01      5.92e-06       1.05e-01       3.13e+02    
      29             32         1.1679e+01      1.49e-06       2.63e-02       1.66e+02    
      30             33         1.1679e+01      2.75e-07       6.58e-03       8.83e+01    
      31             40         1.1679e+01      0.00e+00       0.00e+00       8.83e+01    
`xtol` termination condition is satisfied.
Function evaluations 40, initial cost 6.3133e+02, final cost 1.1679e+01, first-order optimality 8.83e+01.
CpH =  6.078 J/degC
CpS = 0.011 J/degC
Ua =  0.041 W/degC
Ub =  0.001 W/degC

../_images/8679ee55df0144494ee0be811dd5f2f2390b525479843a9df7e36bf6fe0f6036.png

../_images/10c96c4b4f16ac9a24291d1d3c6af6aab7f197920f08021fb839aa1806013983.png

Estimated standard deviation of residuals: 0.162 deg C

Covariance of p = [Ua, Ub, inv_CpH, inv_CpS]
[[ 8.69913569e-11  2.00596177e-10  8.45105896e-10 -2.50867485e-05]
 [ 2.00596177e-10  4.21111809e-06  2.02824939e-06 -5.20161161e-01]
 [ 8.45105896e-10  2.02824939e-06  9.98483655e-07 -2.50544262e-01]
 [-2.50867485e-05 -5.20161161e-01 -2.50544262e-01  6.42508301e+04]]

Correlation matrix
[[ 1.          0.01048059  0.09067814 -0.01061126]
 [ 0.01048059  1.          0.98912725 -0.99999963]
 [ 0.09067814  0.98912725  1.         -0.98917836]
 [-0.01061126 -0.99999963 -0.98917836  1.        ]]

Parameter estimation: sine test (poor initial guess)#

import sys

# If running on Google Colab, install Pyomo and Ipopt via IDAES
on_colab = "google.colab" in sys.modules
if on_colab:
    file = "https://raw.githubusercontent.com/dowlinglab/pyomo-doe/main/data/tclab_sine_test_5min_period.csv"
else:
    file = '../data/tclab_sine_test_5min_period.csv'

df2 = pd.read_csv(file)
df2.head()

	Time	T1	T2	Q1
0	0.00	22.2	22.2	50
1	1.01	22.2	22.2	51
2	2.01	22.2	22.2	52
3	3.00	22.2	22.2	53
4	4.01	22.2	22.2	54

# Numpy is very sensitive to the initial guess
# With no initial theta, the values are quite far off
perform_regression(data=df2)

   Iteration     Total nfev        Cost      Cost reduction    Step norm     Optimality   
       0              1         6.3026e+02                                    5.84e+06    
       1              5         6.1654e+02      1.37e+01       4.24e-01       3.95e+05    
       2              6         6.0985e+02      6.69e+00       4.45e-02       2.64e+05    
       3              7         6.0831e+02      1.54e+00       3.60e-01       6.50e+04    
       4              8         6.0814e+02      1.71e-01       5.31e-02       8.62e+03    
       5              9         6.0800e+02      1.39e-01       5.61e-04       7.78e+02    
       6             14         6.0798e+02      1.47e-02       4.76e-05       2.80e+03    
       7             19         6.0798e+02      0.00e+00       0.00e+00       2.80e+03    
`xtol` termination condition is satisfied.
Function evaluations 19, initial cost 6.3026e+02, final cost 6.0798e+02, first-order optimality 2.80e+03.
CpH =  8.354 J/degC
CpS = 0.01 J/degC
Ua =  0.026 W/degC
Ub =  0.029 W/degC

../_images/eb611d9e188146ad7e64d34d4e7f6980d8ab7d88b5555d06decd14ace279389d.png

../_images/ddfe136affb3564344019d3b7b3b5f18fb093275906470c07195c9baed3ed81b.png

Estimated standard deviation of residuals: 12.569 deg C

Covariance of p = [Ua, Ub, inv_CpH, inv_CpS]
[[ 1.63875173e-04  2.10590923e-04  2.29570504e-04 -7.26559848e-01]
 [ 2.10590923e-04  6.23920219e+01  3.16135832e-01 -2.13932132e+05]
 [ 2.29570504e-04  3.16135832e-01  1.96281871e-03 -1.08398612e+03]
 [-7.26559848e-01 -2.13932132e+05 -1.08398612e+03  7.33538653e+08]]

Correlation matrix
[[ 1.          0.00208266  0.40478034 -0.00209558]
 [ 0.00208266  1.          0.90337715 -0.99999997]
 [ 0.40478034  0.90337715  1.         -0.90338344]
 [-0.00209558 -0.99999997 -0.90338344  1.        ]]

Parameter estimation: sine test (better initial guess)#

# Numpy is very sensitive to the initial guess
# With the step test optimal values, numpy gets fairly close
theta_initial = [0.041, 0.001, 1 / 6.078, 1 / 0.011]  # Ua, Ub, inv_CpH, inv_CpS
perform_regression(data=df2, theta_initial=theta_initial)

   Iteration     Total nfev        Cost      Cost reduction    Step norm     Optimality   
       0              1         3.3475e+02                                    1.57e+08    
       1              2         3.3448e+02      2.70e-01       1.85e-05       2.09e+08    
       2              4         3.3352e+02      9.62e-01       9.38e-07       1.50e+08    
       3              5         3.3352e+02      0.00e+00       0.00e+00       1.50e+08    
`xtol` termination condition is satisfied.
Function evaluations 5, initial cost 3.3475e+02, final cost 3.3352e+02, first-order optimality 1.50e+08.
CpH =  6.078 J/degC
CpS = 0.011 J/degC
Ua =  0.041 W/degC
Ub =  0.001 W/degC

../_images/f22206661fe875fe78c3088e62b7a7729f1a9598dae03f5c46afe04c3606a1af.png

../_images/f399b42d03a3ec73ed01181fe3842662260155306e45693b6f04bdfb316f646e.png

Estimated standard deviation of residuals: 1.437 deg C

Covariance of p = [Ua, Ub, inv_CpH, inv_CpS]
[[ 1.99483520e-14 -5.04688394e-16 -5.75632848e-15 -1.14001099e-12]
 [-5.04688394e-16  3.98626401e-16 -1.56356528e-16  1.91168746e-14]
 [-5.75632848e-15 -1.56356528e-16  5.02900388e-15  4.17152863e-14]
 [-1.14001099e-12  1.91168746e-14  4.17152863e-14  1.34636028e-10]]

Correlation matrix
[[ 1.         -0.17897269 -0.57471307 -0.69562452]
 [-0.17897269  1.         -0.11043125  0.08251886]
 [-0.57471307 -0.11043125  1.          0.05069599]
 [-0.69562452  0.08251886  0.05069599  1.        ]]

Take Away Messasge#

Numpy is an excellent tool for rapidly prototyping models and performing basic parameter estimation. However, as we saw in this example, numpy was a little slow. Moreover, for the sine wave test, the least squares algorithm got stuck in local solutions. We will see soon how Pyomo is faster for optimization and often leads to more reliable parameter estimation solution.