Windows BEOPEST
John Doherty
Watermark Numerical Computing
March 2010
1. Introduction
BEOPEST was introduced to the PEST family of programs by Willem Schreuder of Principia Mathematica. Information on BEOPEST, including download instructions for Unix source code, as well as Willem’s documentation, can be found at the following site and at links accessible through it:
The present document is meant to act as a supplement to Willem’s documentation, and not a replacement.
BEOPEST was originally written for use on Unix platforms but, with some help from Doug Rumbaugh of Environmental Simulations, has also been ported to Windows. Since then, I have added some refinements to the Windows version of BEOPEST. Because source code is shared between the Unix and Windows versions, these refinements will eventually make their way back to the Unix version. At the present stage of development, however, the two versions differ a little: run management and run management reporting are not the same, and the Unix version does not offer restart capabilities. Hopefully, when the present developmental phase is complete, these differences will no longer exist.
Despite being the focus of recent development and refinement, the Windows version of BEOPEST, at the time of writing, lacks one significant feature: the option to use MPI for communication between the master and slaves. This will be rectified in the not-too-distant future. In the meantime, its absence is not expected to present a problem to many (if any) Windows users.
At the time of writing, it has been well over a year since Willem developed his original version of BEOPEST. I have to admit that, while I was very interested in what he had done, my interest did not extend to studying its intimate details, nor to gaining the knowledge necessary to understand TCP/IP communications in the Windows setting. It was then my opinion that the traditional Parallel PEST did a good enough job in the Windows environment, and that there was no reason to replace something that worked satisfactorily with anything new. Three things changed my mind about that:
- the need to remove all impediments to the use of large numbers of parameters when quantifying model predictive uncertainty;
- the ubiquitous use of multicore processors in modern machines; and
- the explosion in availability of cloud computing resources.
All of these require a more efficient and more flexible parallelization paradigm than that provided by Parallel PEST. BEOPEST provides such a paradigm. Hence it is my intention to build on the wonderful work that Willem did in adding BEOPEST enhancements to PEST code by continuing to support and improve these enhancements.
Unfortunately, some problems have been encountered in developments so far. Whether these can be attributed to certain versions of the Windows operating system, or to the Intel FORTRAN compiler with which the Windows version of BEOPEST is presently compiled, is as yet unknown. In particular, though I have not encountered this myself, some users have reported that slaves “freeze” when undertaking the Nth model run, where N appears to be random. Furthermore, they “freeze” in such a way that the BEOPEST master cannot distinguish their inactivity from that which would occur during an unusually long model run. Fortunately this does not happen often; furthermore, BEOPEST’s restart capabilities can mitigate the cost of such an occurrence. Nevertheless, at the time of writing, the matter is being investigated, and the BEOPEST run manager has been altered to accommodate this situation. One way or another, these (and any other problems encountered in using BEOPEST) will be overcome over time, as I am committed to making BEOPEST the software of choice for PEST parallelization.
The present version of BEOPEST should be considered as a beta version. I ask you, the user, to report back to me any problems that you encounter in using BEOPEST, supplying as many details as you can. Through this process BEOPEST will be able to reach maturity as soon as possible and, I hope, provide some significant improvements in what we can do with models in modern computing environments.
2. Using BEOPEST
2.1 General
BEOPEST shares source code with PEST. Hence it supports all functionality that is offered by PEST. BEOPEST and PEST use exactly the same inversion algorithms; they only differ in parallel run management.
The present version of BEOPEST is compiled using the Intel FORTRAN compiler. However parts of it are written in C++; these are compiled using the Microsoft C++ compiler.
Two versions of BEOPEST are available, named BEOPEST32 and BEOPEST64. As the names suggest, the latter is compiled specifically for use on a 64-bit operating system and will therefore not work on a 32-bit operating system. The reverse is not true, however.
2.2 Some BEOPEST Concepts
As is described in Willem’s original BEOPEST documentation, the same BEOPEST executable program serves as both the master and the slave. Its role in any particular circumstance depends on the command used to run it. In setting up a parallel BEOPEST run, there should be only one master; however there is effectively no limit to the number of slaves that can be initialized. As for the normal Parallel PEST, the user must ensure that all slaves operate in different working directories so that input and output files for different model instances used by different slaves are not confused.
In contrast to the normal Parallel PEST, a BEOPEST slave is “smart” in that it does more than simply run the model when given the command to do so by the PEST master program. In fact it is the slave (and not the master, as with the traditional Parallel PEST) that writes the input files and reads the output files pertaining to the model which it controls. This brings with it the advantage that the master does not need to write model input files to the slave’s working directory across what may be a busy network; nor does it need to read model output files from that directory. In fact, the PEST master does not even need to know where the slave’s working directory is.
Prior to running the model, the slave receives from PEST the set of parameters that it must use for a particular model run. When the model run is complete, it sends PEST the outcomes of the model run. Model input/output communications are handled by the slaves. Communication between PEST and its slaves is reduced to the minimum possible; the need to read/write model input/output files from afar is eliminated.
In order to write model input files and read model output files, the slave version of BEOPEST must have access to template and instruction files respectively. It obtains the names of these by reading the PEST control file, just as the master does. In most cases the directory from which each slave operates should thus be a copy of the directory from which PEST would operate if it were calibrating the model itself in serial fashion.
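As a concrete illustration, the directory setup described above can be scripted. The sketch below is not part of BEOPEST; the function name, directory names and file names are all assumptions, and a Windows batch file would serve equally well:

```shell
#!/bin/sh
# Illustrative only: replicate a master directory (containing the PEST
# control file plus all template and instruction files) into N slave
# working directories, so that no two slaves share model files.
make_slave_dirs() {
  src=$1       # directory to copy, e.g. master
  nslaves=$2   # number of slave copies to make
  i=1
  while [ "$i" -le "$nslaves" ]; do
    cp -r "$src" "${src}_slave$i"
    i=$((i + 1))
  done
}
```

For example, `make_slave_dirs master 4` creates master_slave1 through master_slave4, each a complete copy of the master directory.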
Use of BEOPEST does not require that the user prepare a run management file. As stated above, the master does not even need to know where the slaves are (this comprising the bulk of the information recorded in a run management file). However if a run management file is supplied, BEOPEST will read the first two lines of this file, looking for the value for the optional PARLAM variable. This matter is further discussed below.
For those interested, BEOPEST is featured in a paper that has recently appeared in the Ground Water journal. See:
Hunt, R.J., Luchette, J., Schreuder, W.A., Rumbaugh, J., Doherty, J., Tonkin, M.J. and Rumbaugh, D., 2010. Using the cloud to replenish parched groundwater modeling efforts. Rapid Communication for Ground Water, doi: 10.1111/j.1745-6584.2010.00699
2.3 Running BEOPEST as the Master
To run BEOPEST as the master, use a command such as the following while situated in the master directory:
beopest64 case /H :4004
If desired, the master directory can coincide with a slave directory. It will be to this directory that the run record file and all other files produced by PEST to record the status and progress of the parameter estimation process are written.
In the above command it is important to note that:
- “beopest64” can be replaced by “beopest32” on a machine that does not possess 64-bit architecture.
- “case” is the filename base of a PEST control file, for which an extension of “.pst” is expected. (The extension can be included in the above command if desired.)
- “4004” is the port number. This number can be replaced by the number of any unused port.
- A space must separate “/H” from the colon that precedes the port number; a lower case “h” can be used if desired.
As for the traditional Parallel PEST, BEOPEST can be restarted using the “/r”, “/j” or “/s” switches. For the last of these cases the above command becomes:
beopest64 case /s /H :4004
A similar protocol is followed for the other restart switches. Similarly, BEOPEST can be directed to read an existing Jacobian matrix file instead of calculating the Jacobian matrix during its first iteration. This is accomplished through use of the “/i” command-line option. The command then becomes:
beopest32 case /i /H :4004
As for the normal version of PEST, BEOPEST will, in this case, prompt for the name of the JCO file that it must read. See the addendum to the PEST manual for further details.
In principle BEOPEST can restart a previously interrupted Parallel PEST run. In practice it has been found that this cannot be guaranteed due to differences in the way that programs compiled by different compilers read and write binary files. The present version of Parallel PEST is compiled using the Lahey compiler, while BEOPEST is compiled using the Intel compiler. (Theoretically the use of binary rather than unformatted file storage should eradicate such incompatibilities; however this does not appear to be the case.) BEOPEST will restart a previously interrupted BEOPEST run without difficulties however.
2.4 Running BEOPEST as the Slave
In contrast to Parallel PEST, slaves must be started after, and not before, execution of the BEOPEST master has been initiated. Once execution of the master has commenced, slaves can be started at any time thereafter, and in any order.
While positioned in a slave working directory, type a command such as the following to run BEOPEST as a slave.
beopest64 case /H masterhost:4004
where “masterhost” should be replaced by the hostname of the machine on which the master resides. Once again, make sure that there is a space between “/H” and the host name. However there should be no space between the host name and the following colon. If you are unsure of the host name of the master, type the command:
hostname
in a command-line window of the master machine. Alternatively, instead of the host name, use the IPv4 address of the master. Thus the above command becomes, for example:
beopest64 case /H 192.168.1.104:4004
If you do not know the IP address of the host machine, type the command:
ipconfig
while situated in a command-line window on the host machine.
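Putting the master and slave commands together, start-up of BEOPEST on a single multicore machine can be scripted. The following is a minimal sketch, shown as portable shell for illustration (a Windows batch file would follow the same pattern); the function name, the directory names master/ and slave1/ ... slaveN/, and the use of localhost as the host name, are assumptions, while the case name and port number are the illustrative values from the examples above:

```shell
#!/bin/sh
# Illustrative sketch only; launch_beopest and the directory layout are
# assumptions, not part of BEOPEST itself.
launch_beopest() {
  exe=$1      # e.g. beopest64
  pstcase=$2  # filename base of the PEST control file
  port=$3     # any unused port, e.g. 4004
  nslaves=$4  # number of slave directories slave1 ... slaveN

  # The master must be started first, from the master directory.
  ( cd master && "$exe" "$pstcase" /H :"$port" ) &
  sleep 1     # give the master a moment to begin listening

  # Slaves may then be started at any time thereafter, in any order.
  i=1
  while [ "$i" -le "$nslaves" ]; do
    ( cd "slave$i" && "$exe" "$pstcase" /H localhost:"$port" ) &
    i=$((i + 1))
  done
  wait
}
```

With a call such as `launch_beopest beopest64 case 4004 4`, four slaves on the same machine can each occupy one core while the master coordinates them.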
2.5 Terminating BEOPEST Execution
Execution of BEOPEST can be brought to a halt using the PSTOP and PSTOPST commands in the usual manner. These commands should be issued from a command-line window which is open in the directory from which BEOPEST is running.
2.6 SVDA
BEOPEST’s tasks when undertaking SVD-assisted inversion are much more complicated than when undertaking normal inversion. This is because PEST writes its own parcalc.tpl template file at the start of every iteration of the parameter estimation process, this file containing the information required to calculate base parameter values from current super parameter values. When model input files are written locally by smart slaves rather than by a PEST master which is aware of the directories in which all of its slaves are operating (the latter being the modus operandi of the traditional Parallel PEST), the PEST master must communicate to each slave the means through which base parameters are re-constructed from super parameters. The BEOPEST master transfers this information to its slaves using the TCP/IP protocol in a manner that is transparent to the user.
While the user need have no involvement in this procedure, it is important that, when preparing for a BEOPEST run, he/she transfers files from the master directory to the slave working directories after, and not before, SVDAPREP has been run to create a super parameter PEST control file. In particular, this new PEST control file must be transferred to the working directory of every slave, along with the picalc.ins, picalc.tpl and svdabatch.bat files written by SVDAPREP. The base parameter PEST control file must also be transferred to all slave working directories (for the slaves need to obtain details of base parameter names, bounds, scales and offsets from this file). Naturally, the name of the super parameter PEST control file written by SVDAPREP must be supplied to both the master and slave versions of BEOPEST through their respective command lines as execution of each of these is initiated.
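The file transfers described above can be sketched as follows. This is illustrative only; super.pst and base.pst are assumed stand-ins for the super and base parameter PEST control files, and the slave directory names are whatever the user has chosen, while picalc.ins, picalc.tpl and svdabatch.bat are the files written by SVDAPREP:

```shell
#!/bin/sh
# Illustrative only: after SVDAPREP has been run in the master directory,
# copy the super parameter PEST control file, the base parameter PEST
# control file, and the SVDAPREP-written support files to each slave
# working directory named on the command line.
copy_svda_files() {
  superpst=$1
  basepst=$2
  shift 2
  for dir in "$@"; do
    cp "$superpst" "$basepst" picalc.ins picalc.tpl svdabatch.bat "$dir"/
  done
}
```

A typical invocation from the master directory would then be `copy_svda_files super.pst base.pst slave1 slave2 slave3`.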
3. Run Management
3.1 The Run Management File
Unlike the traditional Parallel PEST, BEOPEST does not need to read a run management file. As the slaves, and not the master, write model input files and read model output files, the master does not need to know the slave working directories. Nor does it need to know in advance of a BEOPEST run how many slaves there are. It will just add slaves to its register as they open communications with the BEOPEST master through the TCP/IP protocol, and allocate them runs as long as they are still prepared to implement these runs.
Nevertheless, if a run management file is present within the directory from which the master is launched, the BEOPEST master will read the first two lines of this file. Actually it will only read one variable from this file, this being the optional PARLAM variable. This is the fourth variable on the second line of the file. Recall that its settings are as follows.
PARLAM setting    PEST action
0                 Do not parallelize model runs when testing different parameter upgrades calculated on the basis of different Marquardt lambdas.
1                 Parallelize the lambda search procedure. Use all available slaves in this process.
-N                Parallelize the lambda search procedure. Use a maximum of N slaves in this process.
-9999             Parallelize the lambda search procedure. Use a maximum of NUMLAM slaves, and undertake only one round of lambda testing.
Table 1. PARLAM settings.
As is explained in the addendum to the PEST manual, a setting of -9999 is the best to use where model run times are long and where a user has access to a moderate to high number of slaves whose run times are similar. In that case it may be wise to set the NUMLAM variable in the PEST control file to a higher-than-normal value if it would otherwise be smaller than the number of available slaves. (Recall that the NUMLAM variable is situated in the “control data” section of the PEST control file; it governs the maximum number of model runs that PEST will commit to the testing of different Marquardt lambdas.)
It is important to note that for all PARLAM settings other than -9999, PEST abandons parallelization of the lambda search procedure if any parameter encounters its bounds. Traditional lambda-based upgrading then becomes a serial procedure, as the parameter upgrade direction is re-calculated in a manner that depends on the number of parameters that have not yet encountered their bounds. However, with PARLAM set to -9999, PEST will under no circumstances undertake a second set of model runs during any one lambda search procedure; nor will it serialize the lambda search procedure. This ensures that no processors are idle during the lambda search. Where a user has many processors at his/her disposal, any loss of efficiency incurred through failure to serialize the search as parameters encounter their bounds is more than compensated by the efficiency gained through keeping all processors busy.
If BEOPEST finds a run management file in its current directory and encounters an error condition while reading the first two lines of this file, it will cease execution with an appropriate error message. If it does not find a run management file it sets PARLAM to 1, and proceeds with its execution. The same occurs if it finds a run management file and the optional PARLAM variable is not cited within this file.
Recall that the run management file must possess a filename base which is the same as that of the PEST control file; however its extension must be “.rmf”.
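By way of illustration, a minimal run management file (case.rmf, accompanying case.pst) might look like the sketch below. The first line carries the header expected of a traditional Parallel PEST run management file; the first three values on the second line are the traditional Parallel PEST run management variables, which BEOPEST ignores, and are shown with illustrative values only so that PARLAM can appear in fourth position (here set to -9999):

```
prf
1 0 0.2 -9999
```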
3.2 Run Management Record File
As for the normal Parallel PEST, the BEOPEST master records all communications between itself and its slaves to a run management record file. The filename base of this file is the same as that of the PEST control file; its extension is “.rmr”.