Key IO Data Structures in SuPy
Introduction
The cell below demonstrates a minimal case of SuPy simulation with all key IO data structures included:
In [1]:
import supy as sp
df_state_init, df_forcing = sp.load_SampleData()
df_output, df_state_final = sp.run_supy(df_forcing, df_state_init)
Input: SuPy requires two DataFrames to perform a simulation:
df_state_init: model initial states;
df_forcing: forcing data.
These input data can be loaded either by calling load_SampleData() as shown above or by using init_supy. Alternatively, you can modify the loaded sample DataFrames to create new DataFrames for your specific needs.
Output: The output of SuPy consists of two DataFrames:
df_output: model output results; this is usually the basis for scientific analysis.
df_state_final: model final states; any of its entries can be used as a df_state_init to start another SuPy simulation.
Input
df_state_init: model initial states
In [2]:
df_state_init.head()
Out[2]:
var | aerodynamicresistancemethod | ah_min | ah_slope_cooling | ah_slope_heating | ahprof_24hr | ... | wuprofm_24hr | z | z0m_in | zdm_in | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ind_dim | 0 | (0,) | (1,) | (0,) | (1,) | (0,) | (1,) | (0, 0) | (0, 1) | (1, 0) | ... | (20, 1) | (21, 0) | (21, 1) | (22, 0) | (22, 1) | (23, 0) | (23, 1) | 0 | 0 | 0 |
grid | |||||||||||||||||||||
1 | 2.0 | 15.0 | 15.0 | 2.7 | 2.7 | 2.7 | 2.7 | 0.57 | 0.65 | 0.45 | ... | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 10.0 | 0.01 | 0.2 |
1 rows × 1200 columns
df_state_init is organised with *grids* in rows and *their states* in columns. The details of all state variables can be found in the description page.
Please note that the properties are stored as flattened values to fit the tabular format of DataFrame, though they may actually be of higher dimension (e.g. ahprof_24hr has the dimension {24, 2}). To indicate the dimensionality of these properties, SuPy uses the ind_dim level in columns to hold the indices of values:
0 for scalars;
(ind_dim1, ind_dim2, ...) for arrays (in a generic sense, vectors are 1D arrays).
Take ohm_coef below as an example: it has a dimension of {8, 4, 3} according to the description, which implies that the actual values used by SuPy in simulations are passed in as an array of the dimension {8, 4, 3}. As such, to get proper values passed in, users should follow the dimensionality requirement when preparing or modifying df_state_init.
In [3]:
df_state_init.loc[:,'ohm_coef']
Out[3]:
ind_dim | (0, 0, 0) | (0, 0, 1) | (0, 0, 2) | (0, 1, 0) | (0, 1, 1) | (0, 1, 2) | (0, 2, 0) | (0, 2, 1) | (0, 2, 2) | (0, 3, 0) | ... | (7, 0, 2) | (7, 1, 0) | (7, 1, 1) | (7, 1, 2) | (7, 2, 0) | (7, 2, 1) | (7, 2, 2) | (7, 3, 0) | (7, 3, 1) | (7, 3, 2) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
grid | |||||||||||||||||||||
1 | 0.719 | 0.194 | -36.6 | 0.719 | 0.194 | -36.6 | 0.719 | 0.194 | -36.6 | 0.719 | ... | -30.0 | 0.25 | 0.6 | -30.0 | 0.25 | 0.6 | -30.0 | 0.25 | 0.6 | -30.0 |
1 rows × 96 columns
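The flattened layout described above can be illustrated with a small synthetic frame. This is only a sketch: df_demo is a made-up stand-in rather than real SuPy data, and it assumes the ind_dim labels are stored as strings such as '(0, 1, 0)'. It rebuilds the {8, 4, 3} array by parsing each label instead of relying on column order.

```python
import ast
import itertools

import numpy as np
import pandas as pd

shape = (8, 4, 3)  # dimension of ohm_coef quoted in the text

# Assumption: ind_dim labels are strings like '(0, 1, 0)'.
labels = [str(idx) for idx in itertools.product(*(range(n) for n in shape))]
columns = pd.MultiIndex.from_product(
    [["ohm_coef"], labels], names=["var", "ind_dim"]
)

# Synthetic single-grid frame: 96 placeholder values, one per flattened column.
values = np.arange(len(labels), dtype=float).reshape(1, -1)
df_demo = pd.DataFrame(values, index=pd.Index([1], name="grid"), columns=columns)

# Select the variable, then the grid: a Series indexed by ind_dim.
flat = df_demo["ohm_coef"].loc[1]

# Rebuild the array by writing each value at the index its label names.
arr = np.empty(shape)
for label, value in flat.items():
    arr[ast.literal_eval(label)] = value

print(arr.shape)
```

Going the other way, a modified array can be written back one element at a time by using the same labels as column keys, which keeps the column set unchanged.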
df_forcing: forcing data
df_forcing is organised with *temporal records* in rows and *forcing variables* in columns. The details of all forcing variables can be found in the description page.
Missing values can be specified as -999, which is the default placeholder for missing values (NaN) accepted by SuPy and its backend SUEWS.
In [4]:
df_forcing.head()
Out[4]:
iy | id | it | imin | qn | qh | qe | qs | qf | U | ... | snow | ldown | fcld | Wuh | xsmd | lai | kdiff | kdir | wdir | isec | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2012-01-01 00:05:00 | 2012 | 1 | 0 | 5 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 4.515 | ... | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 |
2012-01-01 00:10:00 | 2012 | 1 | 0 | 10 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 4.515 | ... | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 |
2012-01-01 00:15:00 | 2012 | 1 | 0 | 15 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 4.515 | ... | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 |
2012-01-01 00:20:00 | 2012 | 1 | 0 | 20 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 4.515 | ... | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 |
2012-01-01 00:25:00 | 2012 | 1 | 0 | 25 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 4.515 | ... | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 |
5 rows × 25 columns
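Because -999 is an ordinary number to pandas, it can distort statistics if left in place during analysis. The sketch below uses a small synthetic frame (df_demo, not the sample data) to show one way of converting the placeholder to NaN for inspection and back to -999 before handing the data to the model.

```python
import numpy as np
import pandas as pd

# Synthetic forcing-like frame; column names mirror the table above.
idx = pd.date_range("2012-01-01 00:05", periods=4, freq="5min")
df_demo = pd.DataFrame(
    {"U": [4.515, 4.515, -999.0, 4.515], "qn": [-999.0] * 4},
    index=idx,
)

# Treat -999 as missing so that pandas statistics ignore it.
df_nan = df_demo.replace(-999.0, np.nan)
frac_missing = df_nan.isna().mean()  # fraction of missing records per variable

# Restore the placeholder before passing the frame on.
df_back = df_nan.fillna(-999.0)

print(frac_missing)
```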
Note:
The index of df_forcing SHOULD BE strictly of DatetimeIndex type if you want to create a df_forcing for SuPy simulation. The SuPy runtime time-step size is determined by the index information of df_forcing.
The information below indicates SuPy will run at a 5 min (i.e. 300 s) time-step if driven by this specific df_forcing:
In [5]:
freq_forcing=df_forcing.index.freq
freq_forcing
Out[5]:
<300 * Seconds>
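When a forcing frame is assembled by hand (for instance from a CSV file), its DatetimeIndex often carries no frequency at all. The following sketch, built on a synthetic frame and plain pandas calls, shows how such an index can be given an explicit 5 min frequency. Whether SuPy strictly needs index.freq to be set is an assumption drawn from the cell above, not something verified here.

```python
import pandas as pd

idx = pd.date_range("2012-01-01 00:05", periods=6, freq="5min")

# Rebuilding the index from plain timestamps drops the frequency,
# much like reading timestamps from a file would.
df_raw = pd.DataFrame({"U": [4.515] * 6}, index=pd.DatetimeIndex(list(idx)))
has_freq_before = df_raw.index.freq is not None

# asfreq conforms the frame to a regular grid and sets index.freq.
df_fixed = df_raw.asfreq("5min")
has_freq_after = df_fixed.index.freq is not None

# Time-step size in seconds, taken from two consecutive records.
step_seconds = (df_fixed.index[1] - df_fixed.index[0]).total_seconds()

print(has_freq_before, has_freq_after, step_seconds)
```

Note that asfreq inserts NaN rows for any timestamps missing from the regular grid, so gaps in the source data become visible rather than silently changing the step.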
Output
df_output: model output results
df_output is organised with *temporal records of grids* in rows and *output variables of different groups* in columns. The details of all output variables can be found in the description page.
In [6]:
df_output.head()
Out[6]:
group | SUEWS | ... | DailyState | |||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
var | Kdown | Kup | Ldown | Lup | Tsurf | QN | QF | QS | QH | QE | ... | DensSnow_Paved | DensSnow_Bldgs | DensSnow_EveTr | DensSnow_DecTr | DensSnow_Grass | DensSnow_BSoil | DensSnow_Water | a1 | a2 | a3 | |
grid | datetime | |||||||||||||||||||||
1 | 2012-01-01 00:05:00 | 0.153333 | 0.0184 | 344.310184 | 372.270369 | 11.775916 | -27.825251 | 0.0 | -59.305405 | 31.480154 | 0.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2012-01-01 00:10:00 | 0.153333 | 0.0184 | 344.310184 | 372.270369 | 11.775916 | -27.825251 | 0.0 | -59.305405 | 31.480154 | 0.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | |
2012-01-01 00:15:00 | 0.153333 | 0.0184 | 344.310184 | 372.270369 | 11.775916 | -27.825251 | 0.0 | -59.305405 | 31.480154 | 0.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | |
2012-01-01 00:20:00 | 0.153333 | 0.0184 | 344.310184 | 372.270369 | 11.775916 | -27.825251 | 0.0 | -59.305405 | 31.480154 | 0.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | |
2012-01-01 00:25:00 | 0.153333 | 0.0184 | 344.310184 | 372.270369 | 11.775916 | -27.825251 | 0.0 | -59.305405 | 31.480154 | 0.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
5 rows × 218 columns
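The two-level rows (grid, datetime) and two-level columns (group, var) can be navigated with ordinary pandas indexing. The sketch below uses a tiny synthetic frame with the same level names; the values are placeholders, and the hourly aggregation assumes that each timestamp marks the end of its period.

```python
import numpy as np
import pandas as pd

# Synthetic output-like frame with the level names shown above.
times = pd.date_range("2012-01-01 00:05", periods=24, freq="5min")
rows = pd.MultiIndex.from_product([[1], times], names=["grid", "datetime"])
cols = pd.MultiIndex.from_tuples(
    [("SUEWS", "QN"), ("SUEWS", "QH"), ("DailyState", "a1")],
    names=["group", "var"],
)
data = np.column_stack([np.arange(24.0), np.ones(24), np.full(24, np.nan)])
df_demo = pd.DataFrame(data, index=rows, columns=cols)

# Pick one output group: the columns collapse to the var level.
df_suews = df_demo["SUEWS"]

# Pick one grid: the rows collapse to the datetime level.
df_grid1 = df_suews.loc[1]

# Hourly means, assuming timestamps label the end of each period.
df_hourly = df_grid1.resample("60min", closed="right", label="right").mean()

print(df_hourly)
```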
df_output is recorded at the same temporal resolution as df_forcing:
In [7]:
freq_out = df_output.index.levels[1].freq
(freq_out, freq_out == freq_forcing)
Out[7]:
(<300 * Seconds>, True)
df_state_final: model final states
df_state_final has the same column layout as df_state_init (its rows carry an additional datetime level, as shown below), which facilitates its use as initial model states for other simulations (e.g., diagnostics of runtime model states with save_state=True set in run_supy; or simply using it as the initial conditions for future simulations starting at the ending times of previous runs).
The meanings of state variables in df_state_final can be found in the description page.
In [8]:
df_state_final.head()
Out[8]:
var | aerodynamicresistancemethod | ah_min | ah_slope_cooling | ah_slope_heating | ahprof_24hr | ... | wuprofm_24hr | z | z0m_in | zdm_in | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ind_dim | 0 | (0,) | (1,) | (0,) | (1,) | (0,) | (1,) | (0, 0) | (0, 1) | (1, 0) | ... | (20, 1) | (21, 0) | (21, 1) | (22, 0) | (22, 1) | (23, 0) | (23, 1) | 0 | 0 | 0 | |
grid | datetime | |||||||||||||||||||||
1 | 2012-01-01 00:05:00 | 2 | 15.0 | 15.0 | 2.7 | 2.7 | 2.7 | 2.7 | 0.57 | 0.65 | 0.45 | ... | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 10.0 | 0.01 | 0.2 |
2013-01-01 00:05:00 | 2 | 15.0 | 15.0 | 2.7 | 2.7 | 2.7 | 2.7 | 0.57 | 0.65 | 0.45 | ... | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 10.0 | 0.01 | 0.2 |
2 rows × 1200 columns
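To reuse a final state as the starting point of another run, one entry has to be picked out of the (grid, datetime) rows. The sketch below does this on a synthetic two-column frame; the column state_x is hypothetical, and the step of passing the result to run_supy is left out because it depends on a SuPy installation.

```python
import pandas as pd

# Synthetic final-state-like frame: one grid, two recorded times.
rows = pd.MultiIndex.from_tuples(
    [
        (1, pd.Timestamp("2012-01-01 00:05")),
        (1, pd.Timestamp("2013-01-01 00:05")),
    ],
    names=["grid", "datetime"],
)
# "state_x" is a hypothetical state variable used only for illustration.
df_final_demo = pd.DataFrame({"z": [10.0, 10.0], "state_x": [0.0, 3.2]}, index=rows)

# Latest recorded time across all grids.
t_last = df_final_demo.index.get_level_values("datetime").max()

# Cross-section at that time; the datetime level is dropped,
# leaving a grid-indexed frame like df_state_init.
df_init_next = df_final_demo.xs(t_last, level="datetime")

print(df_init_next)
```

The resulting frame has one row per grid and the same columns as the frame it was taken from, which matches the grid-indexed layout shown for df_state_init earlier.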