Note

1. This page was generated from docs/source/data-structure/supy-io.ipynb.

  1. Need help? Please let us know in the UMEP Community.
  2. A good understanding of SUEWS is a prerequisite to the proper use of SuPy.

Key IO Data Structures in SuPy

Introduction

The cell below demonstrates a minimal case of SuPy simulation with all key IO data structures included:

[1]:
import supy as sp
df_state_init, df_forcing = sp.load_SampleData()
df_output, df_state_final = sp.run_supy(df_forcing, df_state_init)
  • Input: SuPy requires two DataFrames to perform a simulation, which are:

    • df_state_init: model initial states;
    • df_forcing: forcing data.

    These inputs can be loaded either by calling load_SampleData() as shown above or by using init_supy. Alternatively, you can modify the loaded sample DataFrames to create new DataFrames for your specific needs.

  • Output: SuPy returns two DataFrames:

    • df_output: model output results; this is usually the basis for scientific analysis.
    • df_state_final: model final states; any of its entries can be used as a df_state_init to start another SuPy simulation.

Input

df_state_init: model initial states

[2]:
df_state_init.head()
[2]:
var ah_min ah_slope_cooling ah_slope_heating ahprof_24hr ... tair24hr numcapita gridiv
ind_dim (0,) (1,) (0,) (1,) (0,) (1,) (0, 0) (0, 1) (1, 0) (1, 1) (2, 0) (2, 1) (3, 0) (3, 1) (4, 0) ... (275,) (276,) (277,) (278,) (279,) (280,) (281,) (282,) (283,) (284,) (285,) (286,) (287,) 0 0
grid
98 15.0 15.0 2.7 2.7 2.7 2.7 0.57 0.65 0.45 0.49 0.43 0.46 0.4 0.47 0.4 ... 273.15 273.15 273.15 273.15 273.15 273.15 273.15 273.15 273.15 273.15 273.15 273.15 273.15 204.58 98

1 rows × 1200 columns

df_state_init is organised with *grids* in rows and *their states* in columns. The details of all state variables can be found in the description page.

Please note that the properties are stored as flattened values to fit the tabular format of a DataFrame, although they may actually be of higher dimension (e.g. ahprof_24hr with the dimension {24, 2}). To indicate the dimensionality of these properties, SuPy uses the ind_dim level in columns to hold the indices of values:

  • 0 for scalars;
  • (ind_dim1, ind_dim2, ...) for arrays (vectors are treated as 1D arrays).
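The flattening convention above can be illustrated with a small pandas sketch. Everything in it is a toy stand-in: toy_prof and toy_scalar are hypothetical variable names, and the sketch assumes the ind_dim labels are the string form of the index tuples, which is how they appear in the display above but is not confirmed here.

```python
import ast

import numpy as np
import pandas as pd

# Toy 3x2 property (hypothetical values), standing in for a higher-dimensional state.
arr = np.array([[0.57, 0.65], [0.45, 0.49], [0.43, 0.46]])

# Build (var, ind_dim) column labels: one label per array element, "0" for a scalar.
# ASSUMPTION: ind_dim labels are strings such as "(0, 1)".
cols = [("toy_prof", str(idx)) for idx in np.ndindex(arr.shape)]
cols.append(("toy_scalar", "0"))
columns = pd.MultiIndex.from_tuples(cols, names=["var", "ind_dim"])

# One row per grid, flattened values in columns.
values = list(arr.ravel()) + [49.6]
df_toy = pd.DataFrame([values], columns=columns, index=pd.Index([98], name="grid"))

# Recover the array for grid 98 from the flattened columns.
flat = df_toy["toy_prof"].loc[98]
idx_tuples = [ast.literal_eval(label) for label in flat.index]
shape = tuple(int(m) + 1 for m in np.max(idx_tuples, axis=0))
restored = np.empty(shape)
for idx, val in zip(idx_tuples, flat.to_numpy()):
    restored[idx] = val
```

Here restored has the original 3x2 shape, which shows that the ind_dim labels carry enough information to rebuild the array layout.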

Take ohm_coef below as an example: it has a dimension of {8, 4, 3} according to the description, which means the values used by SuPy in simulations are passed in as an array of dimension {8, 4, 3}. To get the proper values passed in, users should therefore follow this dimensionality requirement when preparing or modifying df_state_init.

[3]:
df_state_init.loc[:,'ohm_coef']
[3]:
ind_dim (0, 0, 0) (0, 0, 1) (0, 0, 2) (0, 1, 0) (0, 1, 1) (0, 1, 2) (0, 2, 0) (0, 2, 1) (0, 2, 2) (0, 3, 0) (0, 3, 1) (0, 3, 2) (1, 0, 0) (1, 0, 1) (1, 0, 2) ... (6, 3, 0) (6, 3, 1) (6, 3, 2) (7, 0, 0) (7, 0, 1) (7, 0, 2) (7, 1, 0) (7, 1, 1) (7, 1, 2) (7, 2, 0) (7, 2, 1) (7, 2, 2) (7, 3, 0) (7, 3, 1) (7, 3, 2)
grid
98 0.719 0.194 -36.6 0.719 0.194 -36.6 0.719 0.194 -36.6 0.719 0.194 -36.6 0.238 0.427 -16.7 ... 0.5 0.21 -39.1 0.25 0.6 -30.0 0.25 0.6 -30.0 0.25 0.6 -30.0 0.25 0.6 -30.0

1 rows × 96 columns
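The column order shown above has the last index varying fastest, which corresponds to row-major (C-order) flattening. Under that assumption, a flattened row can be viewed as an {8, 4, 3} array, edited, and flattened again with NumPy alone. The values below are a toy stand-in, not real OHM coefficients.

```python
import itertools

import numpy as np

shape = (8, 4, 3)  # dimension of ohm_coef as stated in the text

# Index labels in row-major order (last index varies fastest),
# formatted like the ind_dim labels displayed above.
labels = [str(idx) for idx in itertools.product(*(range(n) for n in shape))]

# Toy stand-in for one grid's 96 flattened values (not real coefficients).
flat = np.arange(len(labels), dtype=float)

# View the flat row as an (8, 4, 3) array, edit one element, flatten back.
arr = flat.reshape(shape)
arr_new = arr.copy()
arr_new[1, 0, 2] = -16.7
flat_new = arr_new.ravel()

# Flat position of element (1, 0, 2): 1*12 + 0*3 + 2
pos = int(np.ravel_multi_index((1, 0, 2), shape))
```

In the row displayed above, this position is the column labelled (1, 0, 2), whose value is -16.7.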

df_forcing: forcing data

df_forcing is organised with *temporal records* in rows and *forcing variables* in columns. The details of all forcing variables can be found in the description page.

Missing values can be specified as -999, the default missing-value marker accepted by SuPy and its backend SUEWS.

[4]:
df_forcing.head()
[4]:
iy id it imin qn qh qe qs qf U RH Tair pres rain kdown snow ldown fcld Wuh xsmd lai kdiff kdir wdir isec
2012-01-01 00:05:00 2012 1 0 5 -999.0 -999.0 -999.0 -999.0 -999.0 4.515 85.463333 11.77375 1001.5125 0.0 0.153333 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 0.0
2012-01-01 00:10:00 2012 1 0 10 -999.0 -999.0 -999.0 -999.0 -999.0 4.515 85.463333 11.77375 1001.5125 0.0 0.153333 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 0.0
2012-01-01 00:15:00 2012 1 0 15 -999.0 -999.0 -999.0 -999.0 -999.0 4.515 85.463333 11.77375 1001.5125 0.0 0.153333 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 0.0
2012-01-01 00:20:00 2012 1 0 20 -999.0 -999.0 -999.0 -999.0 -999.0 4.515 85.463333 11.77375 1001.5125 0.0 0.153333 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 0.0
2012-01-01 00:25:00 2012 1 0 25 -999.0 -999.0 -999.0 -999.0 -999.0 4.515 85.463333 11.77375 1001.5125 0.0 0.153333 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 0.0
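Because -999 is an ordinary number to pandas, it can distort statistics computed on the forcing data. A minimal sketch, using a toy DataFrame rather than the real df_forcing, converts the marker to NaN for analysis and restores it afterwards:

```python
import numpy as np
import pandas as pd

# Toy forcing-like frame (hypothetical values): Tair observed, qn missing.
idx = pd.date_range("2012-01-01 00:05", periods=3, freq="5min")
df_toy = pd.DataFrame(
    {"Tair": [11.77, 11.75, 11.74], "qn": [-999.0, -999.0, -999.0]},
    index=idx,
)

# Replace the -999 marker with NaN so pandas skips it in statistics.
df_nan = df_toy.replace(-999.0, np.nan)
mean_tair = df_nan["Tair"].mean()
n_missing = int(df_nan["qn"].isna().sum())

# Restore the -999 marker before handing the data back to the model.
df_back = df_nan.fillna(-999.0)
```

The round trip leaves the values unchanged, so the same frame can serve both analysis and simulation input.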

Note:

The index of df_forcing must be strictly of DatetimeIndex type if you want to create a df_forcing for a SuPy simulation. SuPy determines its runtime time-step size from this index.

The information below indicates that SuPy will run at a 5 min (i.e. 300 s) time-step if driven by this specific df_forcing:

[5]:
freq_forcing = df_forcing.index.freq
freq_forcing
[5]:
<300 * Seconds>
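A DatetimeIndex built from a plain list of timestamps (for instance after reading a file) carries no freq attribute. The sketch below, which uses a toy frame and plain pandas only, shows one way to attach a regular 5 min frequency with asfreq. Whether SuPy itself requires index.freq to be set is not stated here, so treat this as an illustration of the pandas side only.

```python
import pandas as pd

# Toy data with a DatetimeIndex that has no freq attached.
idx = pd.date_range("2012-01-01 00:05", periods=4, freq="5min")
df_raw = pd.DataFrame(
    {"Tair": [11.77, 11.75, 11.74, 11.72]},
    index=pd.DatetimeIndex(list(idx)),
)
freq_before = df_raw.index.freq  # None: no frequency information yet

# Conform the frame to a regular 5 min grid; the new index carries a freq.
df_toy_forcing = df_raw.asfreq("5min")
freq_after = df_toy_forcing.index.freq

# Time-step size in seconds, derived from consecutive index entries.
step_s = (df_toy_forcing.index[1] - df_toy_forcing.index[0]).total_seconds()
```

Note that asfreq inserts NaN rows for any timestamps missing from the regular grid, so gaps in the source data become visible rather than silently shortening the series.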

Output

df_output: model output results

df_output is organised with *temporal records of grids* in rows and *output variables of different groups* in columns. The details of all output variables can be found in the description page.

[6]:
df_output.head()
[6]:
group SUEWS ... DailyState
var Kdown Kup Ldown Lup Tsurf QN QF QS QH QE QHlumps QElumps QHresis Rain Irr ... WU_Grass2 WU_Grass3 deltaLAI LAIlumps AlbSnow DensSnow_Paved DensSnow_Bldgs DensSnow_EveTr DensSnow_DecTr DensSnow_Grass DensSnow_BSoil DensSnow_Water a1 a2 a3
grid datetime
98 2012-01-01 00:05:00 0.153333 0.018279 344.310184 371.986259 11.775615 -27.541021 40.574001 -46.53243 62.420064 3.576493 49.732605 9.832804 0.042327 0.0 0.0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2012-01-01 00:10:00 0.153333 0.018279 344.310184 371.986259 11.775615 -27.541021 39.724283 -46.53243 61.654096 3.492744 48.980360 9.735333 0.042294 0.0 0.0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2012-01-01 00:15:00 0.153333 0.018279 344.310184 371.986259 11.775615 -27.541021 38.874566 -46.53243 60.885968 3.411154 48.228114 9.637861 0.042260 0.0 0.0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2012-01-01 00:20:00 0.153333 0.018279 344.310184 371.986259 11.775615 -27.541021 38.024849 -46.53243 60.115745 3.331660 47.475869 9.540389 0.042226 0.0 0.0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2012-01-01 00:25:00 0.153333 0.018279 344.310184 371.986259 11.775615 -27.541021 37.175131 -46.53243 59.343488 3.254200 46.723623 9.442917 0.042192 0.0 0.0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 218 columns
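With (grid, datetime) rows and (group, var) columns, a common first step is to pick one group and one grid. The following sketch builds a toy frame with the same index layout (values are illustrative, rounded from the display above) and selects from it using standard pandas indexing:

```python
import numpy as np
import pandas as pd

# Toy frame mimicking the df_output layout: rows (grid, datetime), columns (group, var).
times = pd.date_range("2012-01-01 00:05", periods=3, freq="5min")
rows = pd.MultiIndex.from_product([[98], times], names=["grid", "datetime"])
cols = pd.MultiIndex.from_tuples(
    [("SUEWS", "QH"), ("SUEWS", "QE"), ("DailyState", "a1")],
    names=["group", "var"],
)
data = [
    [62.42, 3.58, np.nan],
    [61.65, 3.49, np.nan],
    [60.89, 3.41, np.nan],
]
df_toy_out = pd.DataFrame(data, index=rows, columns=cols)

# Select one output group: the remaining columns are plain variable names.
df_suews = df_toy_out["SUEWS"]

# Select one grid: the remaining index is the datetime level.
df_suews_98 = df_suews.loc[98]

# A single variable as a time series.
qh = df_suews_98["QH"]
```

The result is a plain time series indexed by datetime, which is usually the most convenient form for plotting or further analysis.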

df_output is recorded at the same temporal resolution as df_forcing:

[7]:
freq_out = df_output.index.levels[1].freq
(freq_out, freq_out == freq_forcing)
[7]:
(<300 * Seconds>, True)

df_state_final: model final states

df_state_final has the same data structure as df_state_init, except for an extra datetime level in the index that stores the time associated with each set of model states. This structure makes it easy to reuse df_state_final as initial model states for other simulations, e.g. to diagnose runtime model states with save_state=True set in run_supy, or simply to provide the initial conditions for a later simulation that starts when a previous run ends.

The meanings of state variables in df_state_final can be found in the description page.

[8]:
df_state_final.head()
[8]:
var aerodynamicresistancemethod ah_min ah_slope_cooling ah_slope_heating ahprof_24hr ... wuprofm_24hr z z0m_in zdm_in
ind_dim 0 (0,) (1,) (0,) (1,) (0,) (1,) (0, 0) (0, 1) (1, 0) (1, 1) (2, 0) (2, 1) (3, 0) (3, 1) ... (18, 0) (18, 1) (19, 0) (19, 1) (20, 0) (20, 1) (21, 0) (21, 1) (22, 0) (22, 1) (23, 0) (23, 1) 0 0 0
datetime grid
2012-01-01 00:05:00 98 2 15.0 15.0 2.7 2.7 2.7 2.7 0.57 0.65 0.45 0.49 0.43 0.46 0.4 0.47 ... -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 49.6 1.9 14.2
2013-01-01 00:05:00 98 2 15.0 15.0 2.7 2.7 2.7 2.7 0.57 0.65 0.45 0.49 0.43 0.46 0.4 0.47 ... -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 -999.0 49.6 1.9 14.2

2 rows × 1200 columns
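To look at the states for a single time, for example the last one, the datetime level can be sliced out with pandas. This is a pandas-only sketch on a toy frame with hypothetical state names (toy_state_a, toy_state_b); whether run_supy needs this step before accepting the states as input is not covered here.

```python
import pandas as pd

# Toy frame mimicking the df_state_final row layout: (datetime, grid).
idx = pd.MultiIndex.from_product(
    [pd.to_datetime(["2012-01-01 00:05", "2013-01-01 00:05"]), [98]],
    names=["datetime", "grid"],
)
df_toy_final = pd.DataFrame(
    {"toy_state_a": [2.0, 2.5], "toy_state_b": [49.6, 49.6]},
    index=idx,
)

# Pick the latest time and drop the datetime level,
# leaving a grid-indexed frame shaped like df_state_init.
t_last = df_toy_final.index.get_level_values("datetime").max()
df_toy_init = df_toy_final.xs(t_last, level="datetime")
```

The resulting frame is indexed by grid only, matching the row layout of df_state_init shown earlier.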