Urban Growth Modeling Workshop

Minutes

March 13 - 15, 2000

Annexed is the meeting agenda.

Thanks to Dave, William, Jeannette, Noah and Keith for additional notes and/or review of the minutes. Please understand that I was not there for the entire time and notes were pooled from various sources.

Participants:

Dave Hester / USGS-RMMC
William Acevedo / NASA-Ames (USGS)
Ron Matheny / EPA
George Xian / USGS-
Keith Clarke / UCSB-Geog
Jeannette Candau / UCSB-Geog
Noah Goldstein / UCSB-Geog
Xiaohang Liu / UCSB-Geog
Timothy Robinson / UCSB-Bren
Will Orr / NASA
Craig Martinsen / NASA
Elizabete A. Silva / UMASS-NCGIA
John Vogel / USGS-Santa Barbara

Monday, 3/13/00: (4:00-6:00)

(Sess #1) Review of urban growth and land use change models: (Ron and George)

EPA modeling efforts with CGM (Community Growth Model).

Ron:

Four posters were presented:
Land use change due to urbanization for the Neuse River Basin.
CGM: Projected Urban Growth from 1992 to 2050: MAIA (Mid Atlantic Integrated Assessment) study area.
Alternative Future of land use change for the MA (Mid Atlantic) Region.
CGM: Neuse River Basin, NC.

EPA is evaluating CGM, UGM and REM (an econometric model developed by the USFS).
CGM with MRLC (Multi-Resolution Land Characteristics) data: not pure Landsat, 30 meter resolution, classification classes of water, developed, barren, gravel, …
Assumptions: 1) open water, federal lands and land within 50 meters of streams would not be developed, 2) land would not convert from low density residential to urban and 3) any land use could convert to urban. Driver: Woods and Poole population projection data were used, aggregated to the county level.
CGM uses a 7x7 neighborhood calculation in Arc Grid (FOCALSUM function).
William ran the UGM calibration for MAIA at 1 km full resolution, which worked fine for that project.
Used Nitrogen coefficient from Tustin (ppm) done by loadings (Ron in agreement with his finding).
Neuse River Basin – less area and can easily see urban growth.
Conclusion: visual comparison of the modeled results looks pretty good. Ron wants to move toward an econometric model and test sensitivities. Possible enhancements to CGM: 1) create an integrated transition matrix with REM, 2) port CGM to Avenue for connection with ArcView and to AML for Arc/Info and 3) include N:P data.
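
The 7x7 neighborhood calculation noted above can be sketched in Python as follows. This is a minimal illustration and not CGM code: the function name focal_sum is made up here, and the choice to ignore cells beyond the grid edge is an assumption, since the minutes do not say how CGM handles edges in Arc Grid.

```python
def focal_sum(grid, size=7):
    """Sum the values in a size x size window centred on each cell.

    Cells that fall outside the grid are ignored (an assumption;
    the minutes do not describe CGM's edge handling).
    """
    half = size // 2
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0
            # Clip the window to the grid extent.
            for rr in range(max(0, r - half), min(rows, r + half + 1)):
                for cc in range(max(0, c - half), min(cols, c + half + 1)):
                    total += grid[rr][cc]
            out[r][c] = total
    return out


# Made-up 9x9 grid in which every cell is "developed" (1).
developed = [[1] * 9 for _ in range(9)]
neighbors = focal_sum(developed)
```

On this grid the centre cell sees the full 7x7 window, while a corner cell sees only the 4x4 part of the window that lies inside the grid.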

Neuse River Application: model projection of land use and population out to 2050.
Apparently EPA has run the UGM for the Neuse River Basin using a 30 meter cell size and out to 2050. Ron calibrated the area at 1 km full resolution and then ran future projections at 30 m.
REM information presented by county (24). Socioeconomic, price, revenue and demand variables. Sensitivity testing of the variables showed little elasticity; the model could be tested with global change.
Future plans: Integrate REM with CGM and UGM. Couple CGM and UGM with effects and exposure models. Further research on the causes of land use change demand. Make the models available to the public.
MAIA deliverables due in 2002.
CGM and UGM would be coupled with water quality models.
SAIC only reviewed the Land Use Change Model and it did not actually execute.
EPA and SAIC are finalizing review of all 25 LCE models but the documentation is not complete.
Ron plans to post the Land Use Change Model on the web for evaluation, currently it is on an internal web.
SAIC is building a Land Use Change Model database that a user can query to find which models are most appropriate for an application.

George:

George has been reviewing CUF-2, DUEM (Dynamic Urban Evolution Model) and UIUC (University of Illinois, Champaign, Hannon Urban Model).
Lora Richards plans to use DUEM for Detroit.
Janis Buchanan plans to use UPLAN for Fresno County.
Will Orr has a plan submitted to USGS/DDI to enhance UGROW land use model.
USGS Urban Dynamics Research Program doesn’t plan to execute multiple land use models for the same geographical area.
William does not want to put all of NMD’s (National Mapping Division) into urban growth and land use modeling except into the LCM UCSB model. The Urban Dynamics program is not singularly committed to UGM (SLEUTH) and is also considering other models.
Hannon Urban Model: diffusion with roads, spread of growth, open space, population projection and self-modification based upon population projection and economic prediction.
Plan to link transportation and economic model to the urban model.
Written in STELLA, object oriented language.
U. of Illinois at Champaign has a Cray, but Hannon Model is not exactly on the supercomputer.
CUF-2 (John Landis, UC Berkeley)
written in C programming language.
Activity projector?
Spatial database
Landuse change urban model
Model uses 7 urban categories
URISA 2000 conference: there will be a land use model panel session where CGM, UGM and CUF-2 will be presented. John Landis, Jeannette and Ron will be there.
EDC has allocated a portion of George’s FTE to do theoretical literature review of all the land use models.
DUEM is based on cellular automata; regional constraint of space, growth field space.
Written in C++.
Is primarily an urban land use change model.
Model actually grows new roads.
What other land use models should George be researching?

Tuesday, 3/14/00: (8:30-5:00)

William: Debriefing from yesterday:

Bruce Hannon (Jeanne) – Need to get people together at Banff.
Dave needs to provide George with a copy of LAM documentation.
USGS – Comparative models. George is doing a literature review of 7 different models. $0-10 million might be coming next FY, depending on congressional budgetary decisions.
U-Grow – will be added to George’s list (Honolulu application), UPlan (Johnston).

Idea (Keith) – distribute one data set to be applied to individual urban growth models and bring to Banff the results on a future prediction. Each model developer is responsible to create additional required data, apply their model, bring results to Banff conference. How many modelers will go without funding? Maybe use the Sioux Falls data set or Albuquerque. Get USGS to get the data together. Shooting for 90% overlap so that comparisons can be done. Data would be put on the web. Model City (public domain as done by USGS). Web hosting for this would be done by UCSB from the UCIME site. Excellent publication potential.

List of Models of interest:

CGM (Community Growth Model)

UPlan (Bob Johnston)

CUF2 (John Landis – California Urban Futures, UCB)

LAM (Steve Burstin – landuse model)

DUEM (Dynamic Urban Evolution Model)

UGROW (Wil Orr, Prescott University)

(Ses #2) Discuss issues related to UGM software development:

Ron – Code status and difference between V2.1 and V3.0:

Will be available as HTML.
Improved greatly.
Needs to be more transparent and open to users.
Put on C90 Cray (only 20% utilized at the moment) and stayed in the queue because of memory allocation problems. Memory use needs to be minimized, so attempting to eliminate things (sequential and real numbers, few pointers). Had to rearrange things for the Cray, as it needs contiguous memory for parallelization. V3.0 is now primarily written to run on the Cray.
So Tommy eliminated 100s of memory reallocations. He is paid through the EPA via a grant sent in for last year. Part of Cray time allocation comes Tommy’s time.
Memory is limiting factor.
Maximum array size for UGM on the Cray is 1500x1500.
Random number generator is now a defined parameter.
Cray computer family:

C90: Parallel Vector Processor (PVP), 4 x 240 MHz processors, 128 MB main memory.

NESC Cray T3E (Hickory): Type MPP, 64 x 600 MHz processors and 256 MB/processor, 16 GB total.

Structures: modular programming; heading for worldwide coverage.
GIF – 8 Bit read for V3.0, can run with 7 bit but doesn’t like the mix (problem with all non-binary data input, so best to run 8 bit GIF image).
Need a flow chart on how the model works together. Tommy can help with this but it is a task for UCIME (Noah).
AOK to put V3.0 on UCSB web site, just need to work out the most recent bugs and work out model credits (EPA should be credited).
Final debugging of V3.0 should be done by Dave (EDC) and UCSB people who independently sent comments to Tommy to be implemented. At present V3.0 compiles but results are unreliable.
V3.0 can run with or without Deltatron (identical output to V2.1).
Each node in a Beowulf PC cluster should ideally have the same memory and CPU processing speed; otherwise load balancing might be a problem.
Landuse probability image does not visualize well.

User interface:

Inputs all in one file (one file runs it).
Hillshade image not essential for calibration but very important for visualization.
Have a source file for editing.
Can tar it up for compression and transport.
Validation – Can be turned on or off. Improved for easier debugging. Limited # of images for memory availability.
Tommy available for help but best to send questions to Ron to access Tommy. Tommy can’t work on private contracts. So best to write Ron all questions.
2000: George and Jeannette are accounted for on Cray time. Will add Dave.
2001: George is writing a proposal to get time and Keith will be added. EPA is looking for users, so it is OK to request time now.
Tommy should put together a simple manual to know how to use the Cray. (rlogin, transfer files and off you go.)

(Ses#3) Discuss status and plans for UGM web documentation including UGM implementation process flow.

Jeannette:

Process flow – no difference with V3.0.

Series of nested loops

Starting at the most basic:

Growth Cycle ( one cycle == one year )

Start with coefficient values
Apply growth rules (affected by coefficient values)
Spontaneous
New spreading center
Organic
Road influenced

Self-Modification

If the growth rate exceeds CRITICAL_HIGH or falls below CRITICAL_LOW, the coefficient values are modified.
These values are the starting point for the next cycle.
Run: links several cycles together from start_date to stop_date, starting from a random number seed. When the model reaches a year for which there is control data, a log file is created.
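
A minimal Python sketch of the cycle, self-modification and run structure described above. It is not UGM source code. The constant values, the BUST multiplier, the MAX_COEFF cap, the definition of growth rate as new cells divided by existing urban cells, and the grow callback (a stand-in for the four growth rules) are all assumptions made for illustration.

```python
import random

# Hypothetical values; the real ones were set by trial and error.
CRITICAL_HIGH = 0.05
CRITICAL_LOW = 0.01
BOOM = 1.1         # assumed multiplier when growth is above CRITICAL_HIGH
BUST = 0.9         # assumed multiplier when growth is below CRITICAL_LOW
MAX_COEFF = 100.0  # assumed upper bound on any coefficient


def self_modify(coeffs, growth_rate):
    """Return new coefficients, scaled if growth leaves the critical band."""
    if growth_rate > CRITICAL_HIGH:
        factor = BOOM
    elif growth_rate < CRITICAL_LOW:
        factor = BUST
    else:
        return dict(coeffs)
    return {name: min(MAX_COEFF, value * factor) for name, value in coeffs.items()}


def run(coeffs, urban, start_date, stop_date, control_years, seed, grow):
    """Link one growth cycle per year and log urban counts at control years."""
    rng = random.Random(seed)  # the seed fixes where the run starts
    log = {}
    for year in range(start_date + 1, stop_date + 1):
        new_cells = grow(coeffs, urban, rng)  # stand-in for the growth rules
        growth_rate = new_cells / urban if urban else 0.0
        urban += new_cells
        # Modified coefficients become the starting point of the next cycle.
        coeffs = self_modify(coeffs, growth_rate)
        if year in control_years:
            log[year] = urban
    return log
```

For example, run({"spread": 50.0}, 100, 1990, 1993, {1992}, 1, grow) with a grow callback that always adds 10 cells logs a single entry for the 1992 control year.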

Monte Carlo Runs:

Coefficients are the same, period is the same, but seeds are changed.
At the end of n monte carlo runs the values in the log files for each control year are averaged, the result is compared with known data using an R2, and the result is written to control.stats.
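
The averaging and comparison step can be sketched as below. This is an illustration under stated assumptions, not the UGM implementation: the function names are made up, simulate stands for one run that returns a logged metric per control year, and R2 is taken here to be the squared Pearson correlation, which the minutes do not specify.

```python
def monte_carlo_set(simulate, seeds, control_years):
    """Average a logged metric per control year over runs differing only by seed."""
    totals = {year: 0.0 for year in control_years}
    for seed in seeds:
        log = simulate(seed)  # same coefficients and period, new seed
        for year in control_years:
            totals[year] += log[year]
    return {year: totals[year] / len(seeds) for year in control_years}


def r_squared(modeled, observed):
    """Squared Pearson correlation between two equal-length series."""
    n = len(modeled)
    mean_m = sum(modeled) / n
    mean_o = sum(observed) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(modeled, observed))
    var_m = sum((m - mean_m) ** 2 for m in modeled)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_m == 0 or var_o == 0:
        return 0.0  # correlation is undefined for a constant series
    return cov * cov / (var_m * var_o)
```

The averaged values for the control years would then be passed to r_squared together with the known data for the same years.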

N monte carlo runs == a monte carlo set?
Calibration run

Consists of many monte carlo sets. The input coefficient values are iterated over a specified range.
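
As a rough sketch of that outermost loop, the Python below scores one Monte Carlo set for every combination of coefficient values in the specified ranges. The function name, the (start, stop, step) range format and the score_set callback are assumptions for illustration; the real model reads its ranges from its own input file.

```python
from itertools import product


def calibration_run(ranges, score_set):
    """Score one Monte Carlo set per coefficient combination.

    ranges maps a coefficient name to (start, stop, step), stop inclusive.
    score_set is a caller-supplied function that returns a fit score for
    one coefficient dictionary (higher is better).
    """
    names = list(ranges)
    axes = [range(start, stop + 1, step) for start, stop, step in ranges.values()]
    results = []
    for values in product(*axes):
        coeffs = dict(zip(names, values))
        results.append((score_set(coeffs), coeffs))
    # Best-scoring combinations first; sort on the score only.
    results.sort(key=lambda item: item[0], reverse=True)
    return results
```

With two coefficients stepped at 0, 25, 50, 75, 100 and 0, 50, 100, this produces 15 Monte Carlo sets, which shows how quickly the number of sets grows as more coefficients or finer steps are added.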

This is an initial suggested vocabulary to be used as a starting point for further development and critique.

Task List:

Write library/glossary of terms (possible paper).
Series of web pages for the project, user oriented. (Jeannette, Noah and Tim): web documentation, model download site, publication list, etc. (Gigalopolis, UGM and UCIME pages)
Flow chart of project. (Jeannette and Tommy)
Systems documentation – comments in code (already done by Tommy, who wrote self-documenting code; Keith feels this is sufficient).
Keith suggested that RMMC update the Albq report to reflect the UGM V3.0 process flow.

(Ses#4) Introduction of each model application and a brief description of the weighting and bracketing rationale that were used during calibration.

Calibration – little control.
The reweight routine assigns a success factor to the different metric values.
Santa Barbara

full resolution dataset: used 30m data.
Jeannette minimized the importance of xmu/ymu and standard distance during calibration for sorting top coefficient values.
Jeannette is using LeeSallee, number of urbans, edges and clusters on the prior calibration return for identifying the top record coefficient value.
Goal: Decrease uncertainty, increase resolution and get closer to what we want, but could miss the optimal solution.

Scaling Issue->Fractal quality of cities across scales
Scaling Issue->Road Gravity: determines how far out from a newly urbanized pixel a road will be searched for. UGM V2.0 and V2.1 do a linear search. V3.0 does this differently and has been correctly modified to do an outward search.
Need to improve metrics.
R2 values might be causing massive over extension (estimation?). Could use a ratio (absolute values) or slope of line.
Doing Neuse River at the moment.???
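
The concern about R2 noted above can be illustrated with a small made-up example: a model that doubles every observed value still scores a perfect R2, while the slope of the line and the ratio of modeled to observed values expose the overestimation. The numbers and the function name are hypothetical, and R2 is assumed here to be the squared Pearson correlation.

```python
def fit_summary(modeled, observed):
    """Return (r2, slope, mean_ratio) for modeled versus observed values."""
    n = len(observed)
    mean_m = sum(modeled) / n
    mean_o = sum(observed) / n
    sxy = sum((o - mean_o) * (m - mean_m) for o, m in zip(observed, modeled))
    sxx = sum((o - mean_o) ** 2 for o in observed)
    syy = sum((m - mean_m) ** 2 for m in modeled)
    r2 = sxy * sxy / (sxx * syy)
    slope = sxy / sxx  # least-squares slope of modeled on observed
    mean_ratio = sum(m / o for m, o in zip(modeled, observed)) / n
    return r2, slope, mean_ratio


observed = [100, 150, 225, 300]      # made-up urban pixel counts
modeled = [2 * v for v in observed]  # a model that doubles every count
r2, slope, ratio = fit_summary(modeled, observed)
```

Here r2 is 1 even though the model predicts twice the observed urban extent; slope and ratio both come out at 2, which is why either could complement R2 as a calibration metric.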
MAIA

1992 DLG road data used
scanned census maps for urban
4 km coarse resolution transportation pixel.
Loveland – original 1992 landuse data source -> used MRLC instead
Experimental data set intended for model testing.

William has been weighting most heavily the LeeSallee and the number of urban pixels during calibration for identifying the best fit coefficient values.

Larger the unit the greater the degrees of freedom.
Parameter averaging. V2.1 added avg.log and dev.log files for tracking purposes. Set to link calibration and prediction. (Graph – start date, stop date (predicting the present) and future data).
Keith feels that RMMC should be exhibiting the Albq Urban Growth Model results at a national conference and not just at local TEMs.

Neuse River

Used clipped MAIA data
Calibrated at 4, 2 and 1 km, predicted at 30 m
Ron did not weight any of the calibration metrics to identify the best fit coefficient values. Used a composite score.

Present Forecast: part of the calibration process, not a prediction process; uses start year, stop year and MC = 100. According to Jeannette, the present forecast process is presented on the UCSB Gigalopolis web site under parameter averaging on the calibration page.

Chicago
1876 seed year
Landuse: 1972: GIRAS, 1992: MRLC
Included wetlands in the excluded area

Albuquerque

(Ses#5) Discuss issues/questions related to the weighting and interpretation of model metrics. Develop an approach/methodology for answering these questions.

Data Graphing Issues

File line number can be used to link control.stats and avg.log.
The stats for “known data” can be generated with the model using TEST mode

Graphing variation in metrics through time
Keith and Jeannette suggested graphing a random sample from each metric across scales
Ron suggested graphing all samples using SAS

Alternative calibration methods for Land Use
Deltatron model is driven by number of newly urban pixels
Coefficients do not directly affect landcover
Therefore…
Calibrate coefficients using urban data only
After best values are found for urban modeling, calibrate land cover change at full resolution
Final details of how to do land cover calibration have not been worked out.

(Ses#6) Discuss issues/questions related to bracketing rationale. Develop a methodology for answering these questions.

William:

MAIA (Mid Atlantic Integrated Assessment):

Greatly overpredicting urban growth.
1950 census, 1 km sq. resolution, digital chart of the world.
Aggregated to 1 km by selecting thresholds
Used 5 cpus 625 minutes per machine – 4 iterations/runs, 30 secs per run with 2 min per Monte Carlo set for ½ resolution.
Got coefficients from Jeannette and Keith.
LeeSallee for # of pixels.
Bracketing ratio all were different (60-100).
2 min x 4 runs = 8 min. 1656 runs x 2 parsed out for parallel running. Ran the mean 90.6 to do the prediction run with only 4 Monte Carlo iterations.
¼ resolution with 25 step resolution, different coefficients, hard time seeing scaling, problem in road distancing.

Keith:

Calibration methodology: May or may not be doing the scaling correctly.
Monte Carlo justification: Average or range more stable – looking at mean but want to minimize variance.
Sensitivity testing.
Metrics – scaling issue.
13 metrics good/best.
Weighting – no theory, empirically based, little understood.
3 phases – optimizing or not?

George:

1972, 1992 land cover: not comparable as used different data sources (ex. Urban areas reduced over time period). Problems.
Converted to 30 m resolution for all runs (full and half).
1970’s classic period of overestimation of urban growth. Lots of misclassifications.
Road predictions are OK, as it is true that some roads do drop out.
Varied weighting.
Showed forest area increase.
Wetland problem didn’t show up – stayed with same part of exclusion.
Good job and 1-2 papers are ready to be published.

(Ses#7) Discuss issues dealing with self-modification within the model. Analyze the graphs that have been generated for the different studies. Discuss what the coefficients were doing. Develop a plan for further study:

Only Santa Barbara coefficients were graphed, and proved to give little insight into the self-modification qualities of the model since the coefficients always tended towards a maximum value. Data for other areas might have proved more interesting.
Self-mod values (CRITICAL_HIGH, BOOM, etc) were decided upon through trial and error.
Due to modifications in V3.0 the self-modification values should probably be revisited.

(Ses#8) Discuss model metrics. Analyze the graphs showing what the model metrics were doing. How do model results compare with reality?:

Not present.

(Ses#9) Discuss how model metrics may relate to coefficients. Develop a methodology for further study:

Not present.

(Ses#10) Discuss the scalability of the calibration process including the model, data and utilities. Identify issues/questions that have arisen. Develop an approach/methodology for answering these questions:

Scalability – Tests A and B.
Coefficients were set the same for both tests.

The utility HALFGIF is not able to maintain the continuity of the transportation data. William recommends going directly from vector to the desired resolution instead of resampling grids.
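
The continuity problem can be illustrated with a toy example. The minutes do not describe how HALFGIF resamples, so the sketch below assumes a simple resampler that keeps every second row and column; both function names and the road geometry are made up. A one-pixel-wide road that lies on a dropped row disappears when the grid is halved, whereas burning the same vector line directly into the coarser grid keeps it continuous.

```python
def halve_by_sampling(grid):
    """Keep every second row and column (assumed resampling method)."""
    return [row[::2] for row in grid[::2]]


def rasterize_segment(x0, y0, x1, y1, cell, rows, cols):
    """Burn a line segment into a grid of the given cell size by dense sampling."""
    grid = [[0] * cols for _ in range(rows)]
    # Sample about four points per cell along the longer axis.
    steps = int(4 * max(abs(x1 - x0), abs(y1 - y0)) / cell) + 1
    for i in range(steps + 1):
        t = i / steps
        c = min(cols - 1, int((x0 + t * (x1 - x0)) / cell))
        r = min(rows - 1, int((y0 + t * (y1 - y0)) / cell))
        grid[r][c] = 1
    return grid


# A horizontal road at y = 1.5 crossing an 8 x 8 unit area.
fine = rasterize_segment(0, 1.5, 8, 1.5, 1, 8, 8)    # 1-unit cells, road on row 1
resampled = halve_by_sampling(fine)                  # row 1 is dropped
direct = rasterize_segment(0, 1.5, 8, 1.5, 2, 4, 4)  # 2-unit cells from the vector
```

In this case resampled contains no road cells at all, while direct has an unbroken road along its first row.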

(Ses#11) Discuss the Monte Carlo sensitivity of the UGM. Evaluate results of tests using data for Santa Barbara: