The Common Land Model (CoLM)

Technical & User Guide

Yongjiu Dai Duoying Ji

School of Geography

Beijing Normal University

Beijing 100875

China

E-mail:

July 7, 2008


Contents

1. Introduction

2. Creating and Running the Executable

2.1 Specification of script environment variables and header file

2.2 Surface data making

2.3 Initial data making

2.4 Time-loop calculation

3. CoLM Surface Dataset

4. CoLM Atmospheric Forcing Dataset

4.1 GSWP2 forcing dataset

4.2 PRINCETON forcing dataset

4.3 Temporal interpolation of the forcing data

5. CoLM Model Structure and Parallel Implementation

5.1 CoLM Model Structure

5.2 CoLM MPI Parallel Design

5.3 CoLM MPI Parallel Implementation

5.4 CoLM Source Code and Subroutines Outline

6. CoLM Parameter and Variables

6.1 Model Parameters

6.2 Time invariant model variables

6.3 TUNABLE constants

6.4 Time-varying state variables

6.5 Forcing

6.6 Fluxes

7. Examples

7.1 Single Point Offline Experiment

7.2 Global Offline Experiment with GSWP2 Dataset

Table 1: Model directory structure

Table 2: define.h CPP tokens

Table 3: Namelist variables for initial data making

Table 4: Namelist variables for Time-loop calculation

Table 5: The list of raw data available

Table 6: Description of 24-category (USGS) vegetation categories

Table 7: Description of 17-category soil categories

Table 8: The relative amounts of sand, silt, and clay

Table 9: netCDF File Information of the Processed Atmospheric Forcing Data

Table 10: Source code and Subroutines Outline

Table 11: Dimension of model array

Table 12: Control variables to determine updating on time steps

Table 13: Model time invariant variables

Table 14: Model TUNABLE constants

Table 15: Run calendar

Table 16: Time-varying Variables for restart run

Table 17: Atmospheric Forcing

Table 18: Model output in xy Grid Form

Figure 1: Flow chart of the surface data making

Figure 2: Flow chart of the initial data making

Figure 3: Flow chart of the time-looping calculation

Figure 4: Diagram of the domain partition at surface data making

Figure 5: Diagram of the domain partition at time-looping calculation

Figure 6: Diagram of the patches and grids mapping relationship


1. Introduction

This user’s guide describes the coding implementation of, and gives operating instructions for, the Common Land Model (CoLM), a land surface parameterization that can be run in offline mode or coupled to global and regional climate models.

The development of the Common Land Model (hereafter the CLM initial version) was a community effort. Initial software specifications and development focused on evaluating the best features of existing land models. Model performance has been validated against extensive field data, including sites adopted by the Project for Intercomparison of Land-surface Parameterization Schemes (Cabauw, Valdai, Red-Arkansas river basin) and others [FIFE, BOREAS, HAPEX-MOBILHY, ABRACOS, Sonoran Desert, GSWP, LDAS]. The model has been coupled with the NCAR Community Climate Model (CCM3). Documentation for the CLM initial version is provided by Dai et al. (2001), while the coupling with CCM3 is described in Zeng et al. (2002). The model was introduced to the modeling community in Dai et al. (2003).

The CLM initial version was adopted as the Community Land Model (CLM2.0) for use with the Community Atmosphere Model (CAM2.0) and version 2 of the Community Climate System Model (CCSM2.0). The current version of the Community Land Model, CLM3.0, was released in June 2004 as part of the CCSM3.0 release (http://www.ccsm.ucar.edu/models/ccsm3.0/clm3/). CLM3.0 is radically different from the CLM initial version, particularly from a software engineering perspective and in its major advances in carbon cycling, vegetation dynamics, and river routing. The major differences between CLM2.0 and the CLM initial version are: 1) the biome-type land cover classification scheme was replaced with a plant functional type (PFT) representation, with the specification of PFTs and leaf area index taken from satellite data; 2) the parameterizations for vegetation albedo and vertical burying of vegetation by snow were changed; 3) canopy scaling, leaf physiology, and soil water limitations on photosynthesis were changed to resolve deficiencies indicated by the coupling to a dynamic vegetation model; 4) vertical heterogeneity in soil texture was implemented to improve coupling with a dust emission model; 5) a river routing model was incorporated to improve the fresh water balance over oceans; 6) numerous modest changes were made to the parameterizations to conform to the strict energy and water balance requirements of CCSM; 7) further substantial software development was required to meet coding standards.

Besides the software engineering changes, the differences between CLM3.0 and CLM2.0 are: 1) several improvements were made to biogeophysical parameterizations to correct deficiencies; 2) stability terms were added to the formulation for 2-m air temperature to correct a deficiency in that formulation; 3) the equation that relates the bulk density of newly fallen snow to atmospheric temperature was modified to correct a discontinuity; 4) a new formulation was implemented that provides for variable aerodynamic resistance with canopy density; 5) the vertical distribution of lake layers was modified to allow for more accurate computation of ground heat flux; 6) a fix was implemented for negative round-off level soil ice caused by sublimation; 7) a fix was implemented to correct roughness lengths for non-vegetated areas. Documentation for CLM3.0 is provided by Oleson et al. (2004). The simulations of CLM2.0 coupled with the Community Climate Model are described in Bonan et al. (2002). The simulations of CLM3.0 with the Community Climate System Model (CCSM3.0) are summarized in the special issue of Journal of Climate by Dickinson et al. (2005) and Bonan and Levis (2005).

Concurrent with the development of the Community Land Model, the CLM initial version underwent further development at the Georgia Institute of Technology and Beijing Normal University in the calculation of leaf temperature, photosynthesis, and stomatal conductance. The big-leaf treatments in the CLM initial version and CLM3.0, which treat the canopy as a single leaf, tend to overestimate fluxes of CO2 and water vapor. Models that differentiate between sunlit and shaded leaves largely overcome these problems. A one-layered, two-big-leaf submodel for photosynthesis, stomatal conductance, leaf temperature, and energy fluxes was therefore added to the CLM initial version; it is not in CLM3.0. It includes 1) an improved two-stream approximation model of radiation transfer in the canopy, with attention to singularities in its solution and with separate integrations of radiation absorption by the sunlit and shaded fractions of the canopy; 2) a photosynthesis-stomatal conductance model for sunlit and shaded leaves separately, and for the simultaneous transfers of CO2 and water vapor into and out of the leaf, in which leaf physiological properties (i.e., leaf nitrogen concentration, maximum potential electron transport rate, and hence photosynthetic capacity) vary throughout the plant canopy in response to the radiation-weighted time-mean profile of photosynthetically active radiation (PAR), the soil water limitation is applied to both the maximum rate of leaf carbon uptake by Rubisco and that of electron transport, and the model scales up from leaf to canopy separately for all sunlit and shaded leaves; 3) a well-built quasi-Newton-Raphson method for the simultaneous solution of the temperatures of the sunlit and shaded leaves. To avoid confusion with the Community Land Model (CLM2.0 and CLM3.0), we name this improved version of the Common Land Model CoLM.

The CLM initial version was originally the same as the model now supported at NCAR. NCAR has since made extensive modifications, mostly to make the model more compatible with the NCAR CCM, but some for better backward compatibility with previous work with the NCAR LSM. For use in a variety of other GCMs and mesoscale models, this adds a layer of complexity that may be unnecessary. We have therefore continued to test further developments with the CLM initial version. Some changes suggested by the Land Model working groups of CCSM have also been implemented, such as stability terms in the formulation for 2-m air temperature and a new formulation for variable aerodynamic resistance with canopy density. CoLM is radically different from the CLM initial version, CLM2.0, and CLM3.0; the differences can be summarized as follows:

1)  A two-big-leaf model for leaf temperatures and photosynthesis-stomatal resistance;

2)  A two-stream approximation for the calculation of canopy albedos, with a solution for the singularity point, and radiation calculations for the separated (sunlit and shaded) canopy;

3)  A new iterative numerical scheme for the calculation of leaf temperatures;

4)  A new treatment of canopy interception that considers the fractions of convective and large-scale precipitation;

5)  Soil thermal and hydrological processes that consider the depth to bedrock;

6)  Surface runoff and sub-surface runoff;

7)  Rooting fraction and the water stress on transpiration;

8)  Use of the 2-m air temperature over a grass tile, in place of an area average, to match routine meteorological observations;

9)  Perfect energy and water balance within every time step;

10) A slab ocean-sea ice model;

11) A coding structure entirely specific to CoLM.

The development of CoLM aims to provide a version for public use and further development, and to share the improvements contributed by many groups.

The source code and datasets required to run the CoLM in offline mode can be obtained via the web from:

http://globalchange.bnu.edu.cn/modles.do

The CoLM distribution consists of three tar files:

CoLM_src.tar.gz

CoLM_src_mpi.tar.gz

CoLM_dat.tar.gz.

The files CoLM_src.tar.gz and CoLM_src_mpi.tar.gz contain the code and scripts: CoLM_src.tar.gz is the serial version of CoLM, and CoLM_src_mpi.tar.gz is the parallel version. The file CoLM_dat.tar.gz contains the raw data used to make the model surface data. Table 1 lists the directory structure of the parallel version of the model.

Table 1: Model Directory Structure

Directory Name / Description
colm/rawdata/ / "Raw" (highest provided resolution) datasets used by CoLM to generate surface datasets at model resolution. Five surface datasets are currently provided at a resolution of 30 arc seconds:
DEM-USGS.30s
LWMASK-USGS.30s (not used)
SOILCAT.30s
SOILCATB.30s
VEG-USGS.30s
BEDROCKDEPTH (not available)
LAI (not available)
colm/data/ / Atmospheric forcing variables suitable for running the model in offline mode
colm/mksrfdata/ / Routines for generating surface datasets
colm/mkinidata/ / Routines for generating initial datasets
colm/main/ / Routines for executing the time-loop calculation of soil temperatures, water contents and surface fluxes
colm/run/ / Script to build and execute the model
colm/graph/ / GrADS and NCL files for displaying the history files
colm/interp/ / Temporal interpolation routines used for the GSWP2 and PRINCETON atmospheric forcing datasets
colm/tools/ / Useful programs related to running the model
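
As an illustration of how the distribution might be unpacked, the commands below are a minimal sketch in csh. They assume that the archives have been downloaded into the current directory, that /people/$LOGNAME (the example root directory used in the sample scripts) is writable, and that the archives unpack into the colm/ tree of Table 1. None of these assumptions is confirmed by the distribution itself, so adjust the paths as needed.

```csh
# Sketch only: unpack the parallel source and the raw data (all paths are assumptions)
mkdir -p /people/$LOGNAME
tar -xzf CoLM_src_mpi.tar.gz -C /people/$LOGNAME
tar -xzf CoLM_dat.tar.gz -C /people/$LOGNAME
# If the archives unpack as assumed, the directories of Table 1 should be listed here
ls /people/$LOGNAME/colm
```

For the serial version, CoLM_src.tar.gz would take the place of CoLM_src_mpi.tar.gz.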

The scientific description of CoLM is given in:

[1]. Dai, Y., R.E. Dickinson, and Y.-P. Wang, 2004: A two-big-leaf model for canopy temperature, photosynthesis and stomatal conductance. Journal of Climate, 17: 2281-2299.

[2]. Oleson K. W., Y. Dai, G. Bonan, M. Bosilovich, R. E. Dickinson, P. Dirmeyer, F. Hoffman, P. Houser, S. Levis, G. Niu, P. Thornton, M. Vertenstein, Z.-L. Yang, X. Zeng, 2004: Technical Description of the Community Land Model (CLM). NCAR/TN-461+STR.

[3]. Dai, Y., X. Zeng, R. E. Dickinson, I. Baker, G. Bonan, M. Bosilovich, S. Denning, P. Dirmeyer, P. Houser, G. Niu, K. Oleson, A. Schlosser, and Z.-L. Yang, 2003: The Common Land Model (CLM). Bull. Amer. Meteor. Soc., 84: 1013-1023.

[4]. Dai, Y., X. Zeng, and R.E. Dickinson, 2002: The Common Land Model: Documentation and User’s Guide (http://climate.eas.gatech.edu/dickinson/).

We value the responses and experiences of our collaborators in using CoLM and encourage their feedback on problems in the current model formulation and the coding, as well as insight and suggestions for future model refinement and enhancement. It would be particularly helpful if users would communicate such feedback informally and where possible share with us documented model applications including manuscripts, papers, procedures, or individual model development.


2. Creating and Running the Executable

CoLM can be run as a stand-alone executable, in which atmospheric forcing data are read in periodically. It can also be run as part of an atmospheric model, in which case the atmospheric and land models communicate via subroutine calls or a dedicated coupler. This User’s Guide focuses on the parallel version of CoLM; most of the scripts and settings of the serial version are similar to those of the parallel version, and simpler.

offline mode

To build and run CoLM in offline mode, two sample scripts, jobclm.csh and jobclm_single.csh, are provided in the run directory, and the corresponding Makefiles are provided in the source code directories.

The scripts jobclm.csh and jobclm_single.csh create a model executable, determine the necessary input datasets, and construct the input model namelist. Users must edit these scripts appropriately in order to build and run the executable for their particular requirements and in their particular environment. The scripts are provided only as examples to help the novice user get CoLM up and running as quickly as possible. The script jobclm_single.csh, which performs a single-point offline simulation experiment, can be run with minimal modification, provided the user resets several environment variables at the top of the script. In particular, the user must set ROOTDIR to the full disk pathname of the model root directory. The script jobclm.csh performs a global or regional offline simulation experiment and usually has to be modified heavily to meet different requirements. The rest of this section explains jobclm.csh in detail.

The script jobclm.csh can be divided into five sections:

1)  Specification of script environment variables and creation of the header file define.h;

2)  Compilation of the surface data making, initial data making, and time-loop calculation programs;

3)  Surface data making, including creation of the input namelist;

4)  Initial data making, including creation of the input namelist;

5)  Time-loop calculation, including creation of the input namelist.
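
Once a script has been edited, it could be launched from the run directory roughly as follows. This is a hedged sketch in csh: it assumes that the model was unpacked under /people/$LOGNAME and that the scripts sit in colm/run as listed in Table 1, and the log file name jobclm_single.log is purely illustrative.

```csh
# Hypothetical session for the single-point case (paths and log name are assumptions)
cd /people/$LOGNAME/colm/run
# ROOTDIR at the top of the script must already point to the model root directory
csh jobclm_single.csh >&! jobclm_single.log
# Inspect the end of the log to see whether the build and the run completed
tail jobclm_single.log
```

The same pattern would apply to jobclm.csh after its domain, path, and define.h settings have been adapted.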

2.1 Specification of script environment variables and header file

The user will generally not need to modify this section of jobclm.csh, except to:

1) set the model domain edges and the basic computer architecture,

2) set the model path directory,

3) create the subdirectory for output, and

4) create the header file $CLM_INCDIR/define.h.

Box 1: Example for specification of script environment variables
# set the basic computer architecture for the model running
#setenv ARCH ibm
setenv ARCH intel
# set the model domain for north, east, south, west edges
setenv EDGE_N 90.
setenv EDGE_E 180.
setenv EDGE_S -90.
setenv EDGE_W -180.
# set the number of grids of the CoLM and the forcing dataset at longitude and latitude directions
setenv NLON_CLM 360
setenv NLAT_CLM 180
setenv NLON_MET 360
setenv NLAT_MET 180
# set the number of processes used to parallel computing, MPI related.
setenv TASKS 24
# The user has to modify the ROOTDIR to his/her root directory, for example, /people.
setenv ROOTDIR /people/$LOGNAME
# 1) set clm include directory root
setenv CLM_INCDIR $ROOTDIR/colm/include
# 2) set clm raw land data directory root
setenv CLM_RAWDIR $ROOTDIR/colm/rawdata
# 3) set clm surface data directory root
setenv CLM_SRFDIR $ROOTDIR/colm/mksrfdata
# 4) set clm input data directory root
setenv CLM_DATADIR $ROOTDIR/colm/data
# 5) set clm initial directory root
setenv CLM_INIDIR $ROOTDIR/colm/mkinidata
# 6) set clm source directory root
setenv CLM_SRCDIR $ROOTDIR/colm/main
# 7) set executable directory
setenv CLM_EXEDIR $ROOTDIR/colm/run
# 8) create output directory
setenv CLM_OUTDIR $ROOTDIR/colm/output
mkdir -p $CLM_OUTDIR >/dev/null
#------
# build define.h in ./include directory
#------
\cat >! .tmp << EOF
#undef coup_atmosmodel
#undef RDGRID
#undef SOILINI
#define offline
#undef BATS
#undef SIB2
#undef IGBP
#define USGS
#define EcoDynamics
#define LANDONLY
#undef LAND_SEA
#undef SINGLE_POINT
#undef MAPMASK
#define NCDATA
#define PRINCETON
#undef GSWP2
#undef DOWNSCALING
#define WR_MONTHLY
EOF
if ($TASKS > 1) then
\cat >> .tmp << EOF
#define MPI
EOF
endif
\cmp -s .tmp $CLM_INCDIR/define.h || mv -f .tmp $CLM_INCDIR/define.h

The ARCH variable sets the architecture on which the model runs; in the following section of jobclm.csh, the make command uses ARCH to select the Makefile with which the model is compiled. The four variables EDGE_N, EDGE_E, EDGE_S, and EDGE_W locate the edges of the model domain and are used mainly in making the model surface data. NLON_CLM and NLAT_CLM set the number of model grid cells in the longitude and latitude directions; these are also used for surface data making. NLON_MET and NLAT_MET set the number of forcing dataset grid cells in the longitude and latitude directions; these allow some simple downscaling of the forcing data when the model grid does not exactly match the forcing dataset grid. The TASKS environment variable sets the number of processes used in the parallel computation. If TASKS is greater than one, the MPI cpp token is specified in define.h automatically and the MPI parallel functionality is built into the model; users can modify this logic according to their own requirements.
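
As a worked check of how the edge and grid-count variables relate, the arithmetic below assumes that the model grid is a regular longitude-latitude grid that exactly spans the domain, so that the grid spacing equals the domain extent divided by the number of grid cells. This assumption, and the symbols \Delta\lambda and \Delta\varphi, are introduced here for illustration only and are not taken from the model code. With the values of Box 1:

```latex
\[
\Delta\lambda
  = \frac{\mathrm{EDGE\_E} - \mathrm{EDGE\_W}}{\mathrm{NLON\_CLM}}
  = \frac{180 - (-180)}{360} = 1^{\circ},
\qquad
\Delta\varphi
  = \frac{\mathrm{EDGE\_N} - \mathrm{EDGE\_S}}{\mathrm{NLAT\_CLM}}
  = \frac{90 - (-90)}{180} = 1^{\circ}.
\]
```

Under the same assumption, and if the forcing data cover the same domain, NLON_MET = 360 and NLAT_MET = 180 also correspond to a 1-degree grid, so in this example the model grid and the forcing grid would coincide.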