Analyzing a simple model to gain insight into the fundamental principles of statistical mechanics.

In this chapter we will investigate a simple model to clearly understand the principles underlying statistical mechanics.

1. Description of the model.

Consider a set of M distinguishable, non-interacting harmonic oscillators.

Each oscillator is in one of the available energy eigenstates, with energies

$\epsilon_n = n\hbar\omega$, $n = 0, 1, 2, \ldots$ (neglecting the zero-point energy of $\tfrac{1}{2}\hbar\omega$).

Quantum states of the complete system are specified by listing the quantum number for each oscillator.

$|n_1, n_2, \ldots, n_M\rangle$, $n_i = 0, 1, 2, \ldots$.

The total energy of the system is given by

$E = \hbar\omega \sum_{i=1}^{M} n_i = N\hbar\omega$.

The key feature of the system is its large degree of degeneracy: any set of quantum numbers that adds up to N has the same energy $N\hbar\omega$, but different (ordered) sets correspond to different states. The number of such states grows very fast with M and N.

The simplest representation of the states is to list the possible configurations, specifying the occupied quantum numbers ordered from low to high. To count the actual number of states, one has to keep track of the number of distinct permutations (= distinct states) of the quantum numbers in a particular configuration.

Example 1: 3 oscillators, $N = 5$ quanta ($E = 5\hbar\omega$).


(n1, n2, n3) arranged low to high     # of distinct permutations (= states)
0    0    5                           3
0    1    4                           6
0    2    3                           6
1    1    3                           3
1    2    2                           3
                                      21 states in total

Another example:

Example 2: 4 oscillators, $N = 6$ quanta ($E = 6\hbar\omega$).


(n1, ..., n4) arranged low to high    # of distinct permutations = # of states per configuration
0    0    0    6                      4  = 4!/(3! 1!)
0    0    1    5                      12 = 4!/(2! 1! 1!)
0    0    2    4                      12
0    1    1    4                      12
0    0    3    3                      6  = 4!/(2! 2!)
0    1    2    3                      24 = 4!
1    1    1    3                      4
0    2    2    2                      4
1    1    2    2                      6
                                      84 states in total
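The counting in these tables can be verified by brute force. Below is a minimal sketch (not part of the original notes; names are illustrative) that enumerates all states with a fixed number of quanta and groups them by configuration:

```python
# Brute-force check of Examples 1 and 2: enumerate all states
# |n_1, ..., n_M> with sum(n_i) = N and group them by the sorted
# tuple of quantum numbers (= the configuration).
from collections import Counter
from itertools import product

def states_by_configuration(M, N):
    """Map each configuration (sorted quantum numbers) to its state count."""
    counts = Counter()
    for state in product(range(N + 1), repeat=M):  # each n_i in 0..N
        if sum(state) == N:                        # fixed total energy
            counts[tuple(sorted(state))] += 1
    return counts

for M, N in [(3, 5), (4, 6)]:
    counts = states_by_configuration(M, N)
    for config, n_states in sorted(counts.items()):
        print(config, n_states)
    print(f"M={M}, N={N}: {sum(counts.values())} states in total")
    # 21 states for Example 1, 84 states for Example 2
```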

As the number of oscillators grows, this is a tedious way of keeping track of things.

It is more convenient to enumerate the configurations by listing how many oscillators have zero quanta, how many have one quantum, and so forth. This avoids writing long lists of 0's and 1's. In general we will have the situation that M, the number of oscillators, is much larger than the highest occupied level, and this alternative representation is more economical. Consider the following example:

Example 3: 6 oscillators, $N = 5$ quanta.

# of oscillators in level             # of permutations
0    1    2    3    4    5
5                        1            6!/5! = 6
4    1              1                 6!/4! = 30
4         1    1                      6!/4! = 30
3    2         1                      6!/(3! 2!) = 60
3    1    2                           6!/(3! 2!) = 60
2    3    1                           6!/(2! 3!) = 60
1    5                                6!/5! = 6
                                      252 states in total

A configuration can hence be specified by the number of oscillators in each level. An entry left empty simply means there are no oscillators in that energy level. Let us denote the configuration by the vector $\mathbf{m} = (m_0, m_1, m_2, \ldots)$, where $m_k$ is the number of oscillators in level k. In principle the configuration vector can be thought of as infinite, as there is an infinite number of energy levels for each oscillator. If the total energy is $E = N\hbar\omega$, then N is the highest level that can be occupied.
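As a small illustration (hypothetical code, not from the notes), converting a state to its configuration vector is just a matter of counting level occupations:

```python
# Count how many oscillators occupy each level to obtain the
# configuration vector m = (m_0, m_1, ..., m_N).
state = (0, 0, 1, 1, 1, 2)   # 6 oscillators, N = 5 quanta in total
N = sum(state)               # highest level that could be occupied
m = [state.count(k) for k in range(N + 1)]
print(m)  # [2, 3, 1, 0, 0, 0] -> the configuration (2, 3, 1) of Example 3
```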

We have the following constraints on the configuration vector $\mathbf{m}$ that yield a particular total energy:

$\sum_k m_k = M$, the total number of oscillators,

$\sum_k m_k\,\epsilon_k = E$, where $E = N\hbar\omega$ is the total energy, or equivalently $\sum_k k\, m_k = N$.

The number of permutations, $W(\mathbf{m})$, or the number of states corresponding to a particular configuration, is given by

$W(\mathbf{m}) = \frac{M!}{m_0!\, m_1!\, m_2! \cdots} = \frac{M!}{\prod_k m_k!}$
This formula is well known in combinatorics, and it is easily checked against the examples given. As the number of oscillators becomes large, the distribution of states over the configurations becomes highly peaked. This can be seen by taking the previous example, but now with 100 oscillators at the same total energy $E = 5\hbar\omega$.

Example 4: 100 oscillators, $N = 5$ quanta.

# of oscillators in level             # of permutations, W
0     1    2    3    4    5
99                        1           100!/99! = 100
98    1              1                100!/98! ≈ 100^2
98         1    1                     100!/98! ≈ 100^2
97    2         1                     100!/(97! 2!) ≈ 100^3/2
97    1    2                          100!/(97! 2!) ≈ 100^3/2
96    3    1                          100!/(96! 3!) ≈ 100^4/6
95    5                               100!/(95! 5!) ≈ 100^5/120

As can be seen the last two configurations are far more likely than the other configurations. This effect grows as you increase the number of oscillators.
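The dominance of the last two configurations is easy to check numerically. A short sketch (illustrative, using exact integer arithmetic) evaluating $W(\mathbf{m}) = M!/\prod_k m_k!$ for the configurations of Example 4:

```python
# Evaluate the multinomial weight W(m) = M!/(m_0! m_1! ...) exactly
# for each configuration of Example 4.
from math import factorial

def W(m):
    """Number of states of the configuration m = (m_0, m_1, ...)."""
    w = factorial(sum(m))            # M!, with M the total number of oscillators
    for m_k in m:
        w //= factorial(m_k)
    return w

configurations = [                   # each carries N = 5 quanta, M = 100
    (99, 0, 0, 0, 0, 1), (98, 1, 0, 0, 1, 0), (98, 0, 1, 1, 0, 0),
    (97, 2, 0, 1, 0, 0), (97, 1, 2, 0, 0, 0), (96, 3, 1, 0, 0, 0),
    (95, 5, 0, 0, 0, 0),
]
for m in configurations:
    print(m, W(m))  # the last two weights dwarf all the others
```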

The basic principle of statistical mechanics is that macroscopic properties are evaluated as averages over all possible states, and that each state in an isolated system (of specific total energy, total volume and total number of particles) is equally likely. This is a fundamental postulate of statistical mechanics, and Gibbs called this the principle of equal a priori probability.

In Example 4 above each quantum state is equally likely. But this means that the last two configurations in the table are far more likely than any of the other configurations: they comprise far more states. Since any property is the same for each state in a configuration (the states only differ by a permutation of quantum numbers over equivalent oscillators), it follows that the average value is dominated by the contributions from the most likely configurations (the configurations that comprise many states).

If one deals with a very large number of particles (on the order of $10^{23}$, say), then the most likely configuration contains overwhelmingly more states than the other configurations.

Plotting the number of states as a function of $\mathbf{m}$ yields a very peaked distribution.

Configurations that are still somewhat likely compared to the most likely configuration differ only little from it, and hence have similar properties.

E.g. compare the most likely configuration $\mathbf{m} = (95, 5, 0, 0, 0, 0)$ of Example 4 with another similar configuration that also corresponds to a large number of states, $\mathbf{m} = (96, 3, 1, 0, 0, 0)$.

In statistical mechanics one proceeds by calculating the most likely configuration, and one obtains properties for this most likely configuration (incurring only very small errors).

2. Determination of the most likely configuration corresponding to a particular total energy.

The number of states corresponding to a particular configuration is given by

$W(\mathbf{m}) = \frac{M!}{\prod_k m_k!}$.

We wish to find the particular configuration $\mathbf{m}^*$ for which this number is a maximum. However, $W$ is a gigantically large number, and therefore we typically calculate $\ln W$ (employing Stirling's formula for the factorials), which is a monotonically increasing function of $W$. Hence, rather than $W$, we maximize $\ln W$, which will lead to the same most likely configuration.
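Since Stirling's formula (used below) carries much of the derivation, here is a quick numerical look at its accuracy (an illustrative aside, not part of the original notes):

```python
# Compare ln(n!) with the Stirling approximation n*ln(n) - n.
import math

for n in (10, 100, 1000, 10_000):
    exact = math.lgamma(n + 1)          # ln(n!)
    stirling = n * math.log(n) - n
    print(n, round(exact, 2), round(stirling, 2))
# The absolute error grows only logarithmically (~ ln(2*pi*n)/2), so the
# relative error becomes negligible for the huge n of statistical mechanics.
```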

In addition we need to impose constraints on the configuration vector $\mathbf{m}$ such that it yields a particular total energy E and corresponds to a particular total number of oscillators:

$\sum_k m_k = M$, the total number of oscillators,

$\sum_k k\, m_k = N$, with $E = N\hbar\omega$ the total energy.

Use Lagrange's method of undetermined multipliers. Define:

$F(\mathbf{m}; \gamma, \alpha) = \ln W(\mathbf{m}) - \gamma\left(\sum_k m_k - M\right) - \alpha\left(\sum_k k\, m_k - N\right)$
The function F is the original function to be optimized plus an undetermined multiplier times each constraint. (The signs are chosen with hindsight, such that the multipliers will be positive numbers.) This function can then be used in an unconstrained optimization, provided that it is also made stationary with respect to changes in the multipliers. The function is hence required to be stationary in the variables $m_k$, $\gamma$, and $\alpha$.

The stationarity conditions w.r.t. the Lagrange multipliers yield

$\frac{\partial F}{\partial\gamma} = -\left(\sum_k m_k - M\right) = 0, \qquad \frac{\partial F}{\partial\alpha} = -\left(\sum_k k\, m_k - N\right) = 0$

These are precisely the constraints. If they are satisfied, the function F reduces to $\ln W$. The other stationarity conditions are

$\frac{\partial F}{\partial m_k} = \frac{\partial \ln W}{\partial m_k} - \gamma - \alpha k = 0$
Hence, to carry out the optimization we need to take the partial derivatives of

$\ln W = \ln M! - \sum_k \ln m_k!$
To evaluate the logarithms of the factorials we use Stirling's approximation

Stirling: $\ln n! \approx n \ln n - n$ (for large n).

And hence

$\frac{\partial \ln m_k!}{\partial m_k} \approx \frac{\partial}{\partial m_k}\left(m_k \ln m_k - m_k\right) = \ln m_k$

or

$\frac{\partial \ln W}{\partial m_k} = -\ln m_k$
Combining this with the expression for $\partial F/\partial m_k$, we find

$-\ln m_k - \gamma - \alpha k = 0 \quad\Rightarrow\quad m_k = e^{-\gamma}\, e^{-\alpha k}$
Now using the constraint on the total number of oscillators we can eliminate $e^{-\gamma}$:

$\sum_k m_k = e^{-\gamma} \sum_k e^{-\alpha k} = M \quad\Rightarrow\quad m_k = M\,\frac{e^{-\alpha k}}{\sum_j e^{-\alpha j}} = M\,(1 - e^{-\alpha})\, e^{-\alpha k}$
Note that the $m_k$ obtained in this way need no longer be integers, but they are always 'close' to the (integer) most likely configuration.

Imposing the constraint on the total energy in principle determines the parameter $\alpha$:

$\sum_k k\, m_k = M\,(1 - e^{-\alpha}) \sum_k k\, e^{-\alpha k} = M\,\frac{e^{-\alpha}}{1 - e^{-\alpha}} = N$
This type of equation is not easily solved for $\alpha$ in general, however; instead the most likely configuration can be thought of as an explicit function of $\alpha$, which then determines the total energy in the system, E.
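For this particular model the geometric sums do give a closed form, $\alpha = \ln(1 + M/N)$, which makes a quick numerical check possible (a sketch with illustrative values; the truncation K of the infinite level sum is my choice):

```python
# Verify that alpha = ln(1 + M/N) reproduces both constraints for the
# most likely configuration m_k = M (1 - e^{-alpha}) e^{-alpha k}.
import math

M, N = 100, 5                      # as in Example 4
alpha = math.log(1 + M / N)        # solves M e^{-alpha}/(1 - e^{-alpha}) = N

K = 200                            # truncation of the infinite level sum
m = [M * (1 - math.exp(-alpha)) * math.exp(-alpha * k) for k in range(K)]

print(round(sum(m), 6))                                 # -> 100.0 (= M)
print(round(sum(k * mk for k, mk in enumerate(m)), 6))  # -> 5.0   (= N)
print([round(mk, 2) for mk in m[:4]])                   # [95.24, 4.54, 0.22, 0.01]
# Note how close this lies to the dominant integer configurations
# (95, 5, ...) and (96, 3, 1, ...) of Example 4.
```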

Interpretation: For the most likely configuration, $\mathbf{m}^*$, we have obtained

$\frac{m_k^*}{M} = (1 - e^{-\alpha})\, e^{-\alpha k}$
For any configuration the quantity $m_k/M$ is the fraction of oscillators in energy level k, which in Metiu would be denoted as $f_k$, the frequency of the energy level.

The most likely configuration can be used to determine the average values.

The fractions in the most likely configuration are therefore denoted the probabilities, $p_k = m_k^*/M$, to find an oscillator in energy level k.

We have derived the Boltzmann distribution, if we identify $\alpha = \hbar\omega/(k_B T)$ (anticipating the temperature identification of section 3). The average energy per oscillator in the ensemble is then given by

$\langle\epsilon\rangle = \sum_k \epsilon_k\, p_k = \hbar\omega\,(1 - e^{-\alpha}) \sum_k k\, e^{-\alpha k} = \frac{\hbar\omega}{e^{\alpha} - 1}$

and the average energy depends on the parameter $\alpha$.

Of course we can also define a partition function for future convenience

$q = \sum_{k=0}^{\infty} e^{-\alpha k} = \frac{1}{1 - e^{-\alpha}}, \qquad p_k = \frac{e^{-\alpha k}}{q}$.
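For completeness, the geometric sums used in these expressions can be spelled out (a standard evaluation, added here for reference):

```latex
\begin{align*}
  q &= \sum_{k=0}^{\infty} e^{-\alpha k} = \frac{1}{1-e^{-\alpha}},\\
  \sum_{k=0}^{\infty} k\, e^{-\alpha k}
    &= -\frac{d}{d\alpha}\sum_{k=0}^{\infty} e^{-\alpha k}
     = -\frac{d}{d\alpha}\,\frac{1}{1-e^{-\alpha}}
     = \frac{e^{-\alpha}}{\left(1-e^{-\alpha}\right)^{2}},\\
  \langle\epsilon\rangle
    &= \hbar\omega\,(1-e^{-\alpha})\sum_{k=0}^{\infty} k\, e^{-\alpha k}
     = \hbar\omega\,\frac{e^{-\alpha}}{1-e^{-\alpha}}
     = \frac{\hbar\omega}{e^{\alpha}-1}.
\end{align*}
```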

The quantity $\ln W$ can be written as

$\ln W = \ln M! - \sum_k \ln m_k! \approx M\ln M - \sum_k m_k \ln m_k = -\sum_k m_k \ln\frac{m_k}{M} = -M\sum_k f_k \ln f_k$

where we used $\sum_k m_k = M$ (so the $-M$ and $+\sum_k m_k$ terms from Stirling cancel) and $f_k = m_k/M$.

For the most likely distribution we would write $\ln W^* = -M\sum_k p_k \ln p_k$. The quantity $\ln W$ scales linearly with the number of oscillators, i.e. with the size of the system, and is extensive.

The average quantity

$\frac{\ln W}{M} = -\sum_k f_k \ln f_k$

is independent of the size of the system, and this is the fundamental quantity that is optimized (under constraints) to reach the most likely distribution.

It is of interest to note that the sum runs over the levels of an individual oscillator; there is no longer any reference to the total number of oscillators M.

As discussed before (in Metiu), entropy is defined as $S = -k_B \sum_k p_k \ln p_k$, and optimizing the probabilities to find the most likely configuration is precisely equivalent to finding the state of maximum 'entropy'.

This is also the rationale behind the "Maximum Entropy Principle" we discussed in the notes on thermodynamics. Upon lifting constraints the system quickly equilibrates and reaches the most likely configuration. Randomly sampling all possible states amounts in practice to residing in the overwhelmingly most likely configuration, with minor fluctuations.
Alternative derivation of the most likely configuration, using the probabilities as the fundamental variables.

Maximize $-\sum_k p_k \ln p_k$ under the constraints

$\sum_k p_k = 1$: normalization condition,

$\sum_k p_k\,\epsilon_k = \langle\epsilon\rangle = E/M$: conservation of energy for the complete system.

This maximizing probability distribution is overwhelmingly likely, if one assumes that every quantum state of a given total energy is equally likely in an isolated system at equilibrium. All averages can be obtained from this maximizing probability distribution.

Again we will use Lagrange multipliers and define the function

$F(\{p_k\}; \gamma, \beta) = -\sum_k p_k \ln p_k - \gamma\left(\sum_k p_k - 1\right) - \beta\left(\sum_k p_k\,\epsilon_k - \langle\epsilon\rangle\right)$
Impose stationarity of the functional with respect to all parameters to find (besides the constraints, which derive from stationarity with respect to the multipliers):

$\frac{\partial F}{\partial p_k} = -\ln p_k - 1 - \gamma - \beta\,\epsilon_k = 0 \quad\Rightarrow\quad p_k = e^{-(1+\gamma)}\, e^{-\beta\epsilon_k}$

and from the normalization condition

$e^{1+\gamma} = \sum_k e^{-\beta\epsilon_k} \equiv q$

Or, the probabilities in the most likely distribution are given by

$p_k = \frac{e^{-\beta\epsilon_k}}{q}, \qquad q = \sum_k e^{-\beta\epsilon_k}$
The parameter $\beta$ is in one-to-one correspondence with the average energy of an oscillator (the energy constraint):

$\langle\epsilon\rangle = \sum_k \epsilon_k\,\frac{e^{-\beta\epsilon_k}}{q} = -\frac{\partial \ln q}{\partial \beta}$
Since the Lagrange multiplier corresponding to overall normalization can always easily be eliminated (it carries no dependence on k), we can use a shortcut, by minimizing

$\tilde F(\{\tilde p_k\}) = \sum_k \tilde p_k\left(\ln \tilde p_k - 1\right) + \beta \sum_k \tilde p_k\,\epsilon_k \quad\Rightarrow\quad \tilde p_k = e^{-\beta\epsilon_k}$

where the $\tilde p_k$ are relative probabilities, which are not yet properly normalized. Then we can define a partition function and normalized probabilities accordingly:

$q = \sum_k \tilde p_k = \sum_k e^{-\beta\epsilon_k}, \qquad p_k = \frac{\tilde p_k}{q}$.
In the sequel we will follow this procedure for many different types of "ensembles". We can impose different constraints, but we always optimize the basic quantity $-\sum_k p_k \ln p_k$ to find the appropriate formulas for the probabilities and partition functions.
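The recurring recipe can be captured in a few lines. A generic sketch (function name and parameter values are illustrative):

```python
# Generic recipe: relative probabilities e^{-beta * eps}, a partition
# function q as their sum, and normalized probabilities p_k.
import math

def boltzmann(levels, beta):
    """Return (q, [p_k]) for the given energy levels at a given beta."""
    rel = [math.exp(-beta * eps) for eps in levels]  # relative probabilities
    q = sum(rel)                                     # partition function
    return q, [r / q for r in rel]

# Oscillator levels eps_k = k (in units of hbar*omega), truncated at 50
q, p = boltzmann(range(50), beta=0.5)
avg = sum(k * pk for k, pk in enumerate(p))
print(round(q, 4))    # ~ 1/(1 - e^{-0.5}) = 2.5415
print(round(avg, 4))  # ~ 1/(e^{0.5} - 1)  = 1.5415
```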

Numerical simulation (see problem set):

Equilibrate a system like the one introduced above by starting from an arbitrary state of a certain total energy, defined by the quantum numbers of the individual oscillators, and then randomly changing the state: raise one (randomly chosen) oscillator by one level and lower another (randomly chosen) oscillator by one level, such that the total energy is conserved. (This is one possible procedure; see also further notes 2.)

Repeat the random perturbation of the state and collect statistics:

count how many oscillators are in level 0 ($m_0$), how many in level 1 ($m_1$), and so forth. The quantity $\ln W(\mathbf{m})$ gradually increases (while fluctuating) to reach its maximum value. Once the system has reached equilibrium, the fractions $m_k/M$ will keep fluctuating around the most probable 'Boltzmann' values.
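A minimal sketch of such a simulation (one possible implementation with illustrative parameter values, not the official problem-set code):

```python
# Equilibrate M oscillators sharing N quanta by random energy-conserving
# moves, then compare the observed fractions m_k/M with the Boltzmann values.
import math
import random
from collections import Counter

M, N = 1000, 5000                 # illustrative sizes
n = [N // M] * M                  # arbitrary starting state, total energy N

for step in range(200_000):
    i = random.randrange(M)       # oscillator to be raised
    j = random.randrange(M)       # oscillator to be lowered
    if i != j and n[j] > 0:       # reject moves that would go below level 0
        n[i] += 1
        n[j] -= 1                 # total number of quanta is conserved

m = Counter(n)                    # configuration (m_0, m_1, ...)
alpha = math.log(1 + M / N)       # closed-form alpha for this model
for k in sorted(m)[:6]:
    p_k = (1 - math.exp(-alpha)) * math.exp(-alpha * k)
    print(k, m[k] / M, round(p_k, 4))   # fractions fluctuate around p_k
```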

It is interesting to reflect on this numerical experiment. There is no real dynamics in the system, just random moves. Because the most likely configuration (particular values of the $m_k$) has by far the most states, and because we randomly sample the various accessible states (all of the same energy), the system spends most of the time in the most likely configuration. It can move away from it, lowering the function $\ln W$, but it is unlikely to deviate much once equilibrium is reached. The fluctuations are very small for simulations (or experiments) involving a very large number of molecules, as are encountered in real systems.

3. Towards a thermodynamic interpretation of the basic quantities.

We will assume the most likely configuration and define average quantities accordingly. Up to this point we have obtained and/or defined the following:

$p_k = \frac{e^{-\beta\epsilon_k}}{q}, \qquad q = \sum_k e^{-\beta\epsilon_k}, \qquad \langle\epsilon\rangle = \sum_k p_k\,\epsilon_k = -\frac{\partial \ln q}{\partial\beta}, \qquad \sigma \equiv \frac{\ln W^*}{M} = -\sum_k p_k \ln p_k$
We would like to determine from first-principles considerations how to interpret $\beta$, and also $\sigma$, and in this way provide the connection to thermodynamics. Both $\langle\epsilon\rangle$ and $\sigma$ are functions of $\beta$. Let us first investigate their derivatives:

$\frac{\partial\langle\epsilon\rangle}{\partial\beta} = \frac{\partial}{\partial\beta}\,\frac{\sum_k \epsilon_k\, e^{-\beta\epsilon_k}}{q} = -\frac{\sum_k \epsilon_k^{2}\, e^{-\beta\epsilon_k}}{q} + \left(\frac{\sum_k \epsilon_k\, e^{-\beta\epsilon_k}}{q}\right)^{2} = -\left(\langle\epsilon^2\rangle - \langle\epsilon\rangle^2\right)$

Hence we see that $\partial\langle\epsilon\rangle/\partial\beta$ is always negative. This derivative is directly related to fluctuations around the mean energy, i.e. to the variance of the energy. We will return to this issue at a later time.
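A quick numerical sanity check of this fluctuation relation (an illustrative finite-difference test; the truncation K is my choice):

```python
# Check that -d<eps>/d(beta) equals the variance <eps^2> - <eps>^2
# for oscillator levels eps_k = k (units of hbar*omega).
import math

def moments(beta, K=200):
    w = [math.exp(-beta * k) for k in range(K)]
    q = sum(w)
    e1 = sum(k * wk for k, wk in enumerate(w)) / q      # <eps>
    e2 = sum(k * k * wk for k, wk in enumerate(w)) / q  # <eps^2>
    return e1, e2

beta, h = 0.5, 1e-6
e1, e2 = moments(beta)
deriv = (moments(beta + h)[0] - moments(beta - h)[0]) / (2 * h)
print(round(deriv, 6), round(-(e2 - e1 ** 2), 6))  # the two numbers agree
```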

Next consider

$\frac{\partial\sigma}{\partial\beta} = \frac{\partial}{\partial\beta}\left(\beta\,\langle\epsilon\rangle + \ln q\right) = \langle\epsilon\rangle + \beta\,\frac{\partial\langle\epsilon\rangle}{\partial\beta} - \langle\epsilon\rangle = \beta\,\frac{\partial\langle\epsilon\rangle}{\partial\beta}$

where we used $\sigma = -\sum_k p_k \ln p_k = \sum_k p_k\left(\beta\epsilon_k + \ln q\right) = \beta\,\langle\epsilon\rangle + \ln q$ and the fact that $\partial\ln q/\partial\beta = -\langle\epsilon\rangle$.

I am using partial derivatives here out of habit. The quantities at this moment only depend on $\beta$, and it would be more appropriate to use total derivatives.

It then follows

$\frac{d\sigma}{d\langle\epsilon\rangle} = \frac{\partial\sigma/\partial\beta}{\partial\langle\epsilon\rangle/\partial\beta} = \beta$

(note: $d\sigma = \beta\, d\langle\epsilon\rangle$)

which we will use in a moment.

To illuminate the meaning of the parameter $\beta$, consider two systems, the first (A) consisting of $M_A$ oscillators, $W_A$ states (in the most likely configuration) and a total average energy $E_A$. The other system (B) consists analogously of $M_B$ oscillators, $W_B$ states in the most likely configuration, and average energy per oscillator $E_B/M_B$.

The total number of states is given by $W = W_A\, W_B$, and hence we can maximize the function

$F(E_A, E_B) = \ln W_A + \ln W_B - \lambda\,(E_A + E_B - E)$

which includes a constraint to preserve the total energy (and an associated Lagrange multiplier $\lambda$).

The maximum of the function is reached if this function is stationary with respect to changes in $E_A$ and $E_B$, hence

$\frac{\partial \ln W_A}{\partial E_A} = \lambda = \frac{\partial \ln W_B}{\partial E_B}$

and using the relation $\partial \ln W/\partial E = d\sigma/d\langle\epsilon\rangle = \beta$, derived previously, we obtain

$\beta_A = \lambda = \beta_B$

and hence we obtain the important result that the most likely (equilibrium) configuration is reached if $\beta_A = \beta_B$ (as for temperature in thermodynamics).

The Lagrange multiplier $\beta$ plays the same role as temperature in thermodynamics: $\beta = f(T)$, where f indicates some monotonic and universal function of temperature. (Universal, as we did not make any assumptions about the second system. Monotonic, as otherwise one value of $\beta$ might correspond to two different temperatures, or conversely.)

Let us return once more to the relation $d\sigma = \beta\, d\langle\epsilon\rangle$, i.e. $d\ln W = \beta\, dE$, and substitute $\beta = f(T)$. Hence

$d\ln W = f(T)\, dE, \qquad \text{to be compared with} \qquad dS = \frac{1}{T}\, dE$

(the latter expression from thermodynamics).

It is now easy to see that the identifications

$S = k\,\ln W \qquad \text{and} \qquad \beta = \frac{1}{kT}$

are suggested by this relation, where k could still be an arbitrary constant, but it turns out to be the Boltzmann constant $k_B$. The above is only a plausibility argument, not a first-principles derivation of the connection between thermodynamics and statistical mechanics. I am not even sure the connection necessarily has to be the way it is; I have not yet encountered fully convincing arguments in the literature.

4. Generalization of the model, and abstraction of the defining features.

Define an ensemble of M 'similar' non-interacting systems, and assume the Schrödinger equation for the individual systems to be solved: $\hat H\,\psi_i = E_i\,\psi_i$.

This is analogous to the use of the harmonic oscillator model.

The wave function for the complete ensemble is then a product (or (anti)symmetrized product) of the individual system wave functions, $\Psi = \psi_{i_1}\,\psi_{i_2}\cdots\psi_{i_M}$.

This ensemble wave function, analogous to the harmonic oscillator states $|n_1, n_2, \ldots, n_M\rangle$, has a large degeneracy, and the basic assumption is that each state is equally likely to occur (in an isolated ensemble of a particular total energy).

Take the average over all possible, accessible, ensemble wave functions.

Provides connection to thermodynamic quantities.

Averaging procedure: Define configurations, and count how many individual systems are in quantum state $\psi_1$, how many in $\psi_2$, and so forth. These configurations are analogous to the vectors $\mathbf{m}$.

We only need the frequencies of occurrence, $f_i = m_i/M$, and statistical averages can be obtained by considering exclusively the most likely configuration, in which the frequencies become the probabilities $p_i$.

These probabilities corresponding to the most likely configuration are obtained by maximizing $-\sum_i p_i \ln p_i$, subject to constraints that define the particular characteristics of the ensemble. Different types of ensembles (constraints) then lead to different thermodynamic identifications. This will be the subject of the next handout.
