Non parametrizedfunctionals with empirical dispersion corrections: a happy match?

Diane Bousquet1, Eric Brémond1,Juan C. Sancho-García2, Ilaria Ciofini1 and Carlo Adamo1,3*

1Laboratoire d’Electrochimie, Chimie des Interfaces et Modélisation pour l’Energie, CNRS UMR-7575, ChimieParisTech, 11 rue P. et M. Curie, F-75231 Paris Cedex 05, France.

2Departamento de Química-Física, Universidad de Alicante, E-03080 Alicante, Spain

3Institut Universitaire de France, 103 Boulevard Saint Michel, F-75005 Paris, France.

Abstract

The performances of two parametrized functionals (namely B3LYP and B2PYLP) have been compared to those of two non parametrizedfunctionals (PBE0 and PBE0-DH) on a relatively large benchmark set when three different types of dispersion corrections are applied (namely the D2, D3 and D3(BJ) models). Globally the MAD computed using non parametrized functionals decreases when adding dispersion terms although the accuracy not necessarily increases with the complexity of the model of dispersion correction used. In particular, the D2 correction is found to improve the performances of both PBE0 and PBE0-DH, while no systematic improvement is observed going from D2 to D3 or D3(BJ) corrections. Indeed when including dispersion, the number of sets for which PBE0-DH is the best performing functional decreases at the benefit of B2PLYP.

Overall our results clearly show that inclusion of dispersion corrections is more beneficial to parametrized DH functionals than for non parametrized ones. The same conclusions globally hold for the corresponding global hybrids, showing that the marriage between non parametrized functionals and empirical corrections may be a difficult deal.

*corresponding author:

1.Introduction

In the field of Density Functional Theory (DFT), the development of more accurate approximate forms of exchange correlation functionals for the description of a wider variety of properties, ranging from thermochemistry to photophysics, still remains a major research topic. In the last few years, many functionals belonging to the higher rung of the Perdew ladder1 have indeed been developed. Among these, global or range-separated hybrid functionals, which are characterized by a dependence on occupied Kohn-Sham (KS) orbitals due to the inclusion of a fraction of Hartree-Fock(HF) like exchange, such as for instance PBE02, B3LYP3, CAM-B3LYP4 or LC-wPBE5-7, normally present a good performances to computational time ratio, and for this reason they have been intensively applied to model the reactivity and properties of many chemical systems during the last years8-11. More recently, following an approach firstly proposed by Truhlar12, different double-hybrid (DH) functionals have been developed including also a dependency on the virtual KS orbitals. This latter dependence is related to the presence of a second-order perturbative Moller-Plesset (MP2) contribution to the correlation energy13,14. Since that pioneering work, a significant number of DHs have been developed, the most famous family of parametrized DHs being the one introduced by Grimme and including functionals such as B2PLYP15,16 andmPW2-PLYP17.Following the same parametrization philosophy,other double-hybrids have been developed by Martin, such as B2GP-PLYP18, B2T-PLYP and B2K-PLYP19,demonstrating to be particularly accurate on an impressing number of properties. Furthermore, the inclusion of a spin component or spin-opposite scaling of the MP2 contribution was recently proven to ameliorate the performances such in the case of DSD-BLYP20, DSD-PBEB8621 or the latest Grimme’s functional (PWPB9522).

In this overall context a DH, the so-called PBE0-DH23-25, was recently developed by Adamo and coworkers, following the same parameter free approach used for the development of the parent and well performing global hybrid PBE026-32. This functional actually includes the Perdew-Burke-Ernzerhof (PBE) exchange-correlation functional in a double hybrid formulation without the inclusion of any empirical parameters since both the fraction of HF-like exchange and MP2 correlation contribution were fixed making use of theoretical constrains and physical considerations33 without resorting on fitting over different experimental sets.

In a recent paper34, the performances of this double hybrid were tested on the extended benchmark set GMTKN30, developed by Goerigk and Grimme35,36 and compared to thoseof B2PLYPand of their parent Global Hybrids (GHs), PBE0 and B3LYP, in order to compare two DHs functionals possessing the same level of complexity and representatives of the parametrized (B2PYLP) and non parametrized (PBE0-DH) development philosophies. In particular, many recent papers of Grimme and collaborators clearly showed that explicit inclusion of dispersion corrections strongly ameliorate the overall performances of functionals not only for properties directly related to dispersion, such as for instance structures of systems of biological relevance, but also for properties that could appear, at first sight, hardly affected by London interactions37-39.

The development of corrections able to correctly deal with London forces, has indeed become a flourishing field in DFT. Clearly, the inclusion of a fraction of MP2 correlation in global hybrids represents a good starting point to correctly estimate non covalent interactions, such as van der Waals ones. Indeed, a systematic improvement of functionals’ performances going from global to corresponding double hybrid functionals was observed for benchmark sets dominated by dispersion and non covalent interactions36. The better performances of DH with respect to GH is indeed related to the different long range behavior of the correlation energy terms in the two families of functionals: an exponential decrease exponentially with the interatomic distance in the case of GH versus the presence of a long range term in DH, due the inclusion of an MP2 correlation fraction. However, according to Grimme works22,40, the contribution of the MP2 correlation term is not sufficient to compensate the DFT repulsive part, so that even at DH level, dispersion forces are not correctly described making the addition of an empirical correction term necessary for the correct description dispersion interactions41.

To this end, different empirical corrections, containing parameters fitted on relevant benchmark sets, were developed and tested in conjunction of a wide variety of parametrized DHs showing excellent results. Nonetheless, up to now none of these corrections were tested when combined to a non-parametrized DH such as PBE0-DH. In order to test if the systematic and sizable improvement of DH performances by inclusion of empirical dispersion corrections observed for parametrized DHs also holds in the case of PBE0-DH, the inclusion of three different empirical dispersion corrections, namely the so-called –D238, D339 and D3(BJ)42-45 ones, in the PBE0-DH functional form will be considered in this work. The performances of the so-obtained dispersion corrected DH will be compared to those of an empirical DH of analogous complexity (B2PLYP), also dispersion corrected, in order to basically show if non-empirical DH and empirical dispersion corrections really make a happy match.

The paper is structured as follows: after a brief recall of the dispersion corrections that will be applied and a description of the fitted parameters that will be used (Section 2), the computational details will be given in Section 3. Next, the performances of dispersion corrected PBE0-DH and B2PYLP will be compared over the GMTKN30 benchmark set (Section 4) and some general conclusions will be drawn in Section 5.

  1. Dispersions interactions: the models and the parameters

In order to deal with inter- and intra-molecular dispersion contributions in the DFT framework, a set of corrections to Kohn-Sham energies where recently introduced by Grimme, namely the DFT-D137, DFT-D238 and, more recently, the DFT-D339 models, these can be viewed as a kind of hierarchy of increasing complexity arising from the approximate yet accurate modeling of pair-wise long-range interactions between weakly interacting densities or fragments.

Basically, the total energy of a system can be written as the sum of the Kohn-Sham energy and of an energy correction for dispersion interactions as follows:

/ (1)

where is an atom pair wise correction term expressed as:

/ (2)

andwhere (for ) represents respectively the two body () and three-body () dispersion correction terms. Truncated expression of (2) to the first order correction () leads to the so-called DFT-D2 correction, while the DFT-D3 model is obtained truncating for. The leading term is indeed the two-body one, expressed as:

/ (3)

where represent the number of atoms in the system, n represent the order (here limited to 3) is the internuclear distance between atoms and and is the dispersion coefficient for a pair of atoms ij, calculated as the geometrical average of the dispersion coefficient of the atoms pair:

/ (4)

In equation (3), is a damping function introduced in order to avoid as much as possible the double counting of electronic correlation at intermediate distances. actually defines the range of in which the correction plays a role, and, in the case of the DFT-D2 model is expressed as:

/ (5)

Starting from the DFT-D2 model, the DFT-D3 one was developed allowing to obtain a better accuracy for the description of all systems, including particularly metallic and heavy atoms containing systems39,while reducing concomitantly the empiricism contained in the model.In this latter model, the dispersion coefficients specific to each atoms pair and the cutoff radii are explicitly calculated, taking into account the chemical environment too.On the other hand the three body termshave been proven to yield negligible effects or even a deterioration of the results when included for small sized systems, most probably related to an overestimation of the three body terms in the strong density regions39.Therefore these terms are often neglected, although they may play a role,as in the case of the Grimme S12L subset46 for which three-body corrections are proved to play an important role. In this line it is worth mentioning the recent work of Tkatchenko and collaborators dealing with the effect of many body corrections47.

Overall, the dispersion correction term at DFT-D3 level () is thusvery similar to DFT-D2 one, except that all the terms in should be taken into account. The scaling factors (eq. 3) are adjusted in the case of , to allow a correct asymptotic behavior, which is only the case when is exact (). For functionals that already take into account a part of the long range correlation, a correct asymptotic behavior will only be reproduced when using , which will be the case for the studied double hybrids PBE0-DH and B2PLYP. Indeed, after intensive testing, Grimme found out that the inclusion of terms higher in order than 8 can generate instability while not upgrading the performances significantly39. Thus, as in the case of DFT-D2, only and terms will be included. On the other hand, the DFT-D3 zero damping function used is slightly different from the one of the DFT-D2 model, as proposed by Chai and Head Gordon48:

/ (6)

where is the scaling factor dependent on the cutoff radii (). Grimme38 proposed to fix to a value of unity, needing thus to only optimize the value. Regarding the parameters, they were adjusted manually to fit some physical constrains39, being 14 and 16.

At last, the DFT-D3 model was also tested using the damping function presented by Becke and Johnson, leading to the so-called DFT-D3(BJ). In this variant of the DFT-D3 dispersion model, the dispersion energy is expressed as:

/ (7)

with the corresponding damping function defined as. Because of the form of the dumping funtion, there is no need to define cutoff radii. The parameters and are fitted parameters, and is defined as:

/ (8)
In order to use the three above mentioned models in conjunction with the PBE0-DH functional, some parameters were needed to be defined. All coefficients used in this work are listed in Table 1, along with the corresponding values obtained from literature when available35,40,49-51. In the case of the DFT-D2 model, being d fixed to 20, the only parameter needed, that is the coefficient51,was obtained via a fitting procedure on the S2249 subset. For the DFT-D3 model, for which the d value is fixed to 6, three parameters need to be fitted, and the procedure described by Grimme39 was followed. This latter consists in fixing the parameter to one in order to ensure a correct asymptotic behavior for global hybrid functionals (ie PBE0 and B3LYP). The determination of for double hybrids is done based on a system of three rare gas dimers (Ne2, Ar2 et Kr2). By applying this method to PBE0-DH, a value of 0.84 is obtained for Then, the other parameters are determined by minimizing the Mean Absolute Deviation (MAD) with a non linear least square method on the S130 set. This set is composed of the following subsets: S2239,49,50 S22+, PCONF52, SCONF53, ACONF54, CYCONF55, ADIM656, RG657-60. Note that S22+ contains the same complexes as the S22 set, only considering systems at larger internuclear distances. The total MAD is then calculated, using the MAD of RG6 weighted by 20. The mapping of the MAD evolution and its dependence on the and parameters, can be found in Figure 1. This procedure allowsus to find and values of 0.784 and 1.394, respectively as reported in Table 1. Regarding the optimization of the DFT-D3(BJ) parameters, the procedure followed is strictly the same applied for the DFT-D3 model, and allow obtaining = 0.095, = 0.0 and = 6.102.

Inspection of the parameters obtained for the different GHs (B3LYP and PBE0) and DHs (B2PYLP and PBE0-DH) allows to drawn some general comments. First, in the case of the DFT-D2 model, the values are substantially different for B3LYP and PBE0; the latter being significantly smaller while PBE0-DH and B2PYLP shows significantly similar parameters, the latter being only slightly larger. On the other hand, in the case of DFT-D3 models, PBE0-DH always possesses a larger and a smaller . Interestingly, for PBE0-DH in the case of D3(BJ) model for the same value two sets of parameters providing very similar MAEs on the fitting subset (i.e. the S130 one) have been found. The one (hereafter referred to as PBE0-DH-D3(BJ)) providing the smallest MAE (=0.840; =0.095 and=6.102, Table 1) will be used throughout in the paper, and the results obtained using the second set parameters, referred to as PBE0-DH-D3(BJ)2 (=0.840; =0.747 and=5.276, Table 1) and providing a MAE of only 0.003kcal/mol larger on the S130, will be discussed in section 4.

Interestingly, these two sets of parameters closely resemble to the D3(BJ) parameters obtained for DSD-BLYP and PWPB95 functionals (both having a null value) and for the B2PLYP one, clearly pointing out how the search of global minima over the MAE surface can be quite tricky and that parameter of the same range can be obtained independently of the functional used.

3. Computational details

The PBE0-DH functional has been implemented in a local version of the Gaussian program61, so that all the standard features, including analytical derivatives for geometry optimization, harmonic frequencies and properties, are also available. In analogy with previous work (see for instance references 2, 36, 62,63), the performances of the PBE0, B3LYP, B2PLYP and PBE0-DH functionals have been tested on the GMTKN30 database using both Gaussian and Orca64 Version 2.9.1-RELEASE program packages.

To easily compare the obtained results with the values presented by Grimme, we used in particular the large Ahlrichs’ type quadruple-ζ basis set def2-QZVP,65 which is expected to provide converged results even for DHs. Indeed, the use of a quadruple-ζ base is sufficient to reach the convergence limit,51,66even if DH approaches could have a significantly larger dependency on basis set then corresponding non-DH forms, due to the presence of the MP2 component. Regarding the calculations of electron affinities, and of binding energies of water (subsets G21EA67 and WATER2768), diffuse s and p functions were taken from the Dunning aug-cc-pVQZ,69 obtaining the aug-def2-QZVP basis, which was recommended by Grimme for those particular subsets.

The PCONF52, WATER2768, IDISP70, ISOL2271, ADIM656, BHPERI72 and BSR3673-77werecomputedmaking use of the RI-MP278-80approach as implemented in Orca. Available data for B3LYP,B2PYLP, PBE0, B3LYP-D3, B2PYLP-D3 and PBE0 wereretrivedfrom the Grimme website81.

4. Results and discussion

Being mostly interested in the effect of dispersion correction on non parametrizedfunctionals, firstly the global behavior of PBE0-Dx and PBE0-DH-Dx functionals will be compared to that computed in absence of dispersion correction on the whole GMTKN3035 set. Next, the performances of the different functionals will be analyzed over different groups of subsets, defined in a previous paper36 and based on the dominant properties they are aimed to benchmark, in order to see which type of interaction are mostly affected by the inclusion of dispersion. In particular, the GMTKN30 datasets will be classed in subset probing thermochemistry (decomposition and atomization energies), adiabatic processes, reaction energies and barriers heights, conformers and isomers relative energies. On the other hand the DC982-89 subset, which includes nine reactions known to be difficult to be treated at DFT level, and SIE1133subset, dealing with test cases for self interaction error, will be discussed separately. The MADs computed on each GMTKN30 subset are reported in Supporting Information.

4.1 Dispersion corrected global (PBE0) and double hybrid (PBE0-DH) benchmarked overthe GMTKN30 set

The overall performances of PBE0 and PBE0-DH over the GMTKN30 set with and without inclusion of dispersion corrections are summarized in Figure 2 by their associated Mean Absolute Deviation (MAD), together with the results obtained at B3LYP and B2PLYP level.

Globally speaking, the MAD computed with both PBE0 and PBE0-DH decreases when adding dispersion terms, although not necessarily after increasing the complexity of the correction used. In particular, if the D2 correction sizably ameliorates the performances of both PBE0 and PBE0-DH, the systematic improvement observed going from -D2 to -D3 (or -D3(BJ)) corrections in the case of B3LYP is here not observed, in analogy to what is computed in the case of B2PLYP.

On the other hand, both in the case of the non-parametrized and parametrized families of hybrids, the effects of dispersion correction terms are quantitatively larger on global hybrids with respect to double hybrids so that the computed MAD difference between DH and corresponding GH decreases by addition of dispersion. For instance, the difference of MADs between PBE0 and its corresponding double hybrid (PBE0-DH) is smaller when adding dispersion corrections, and even if PBE0-DH still performs better than PBE0, the difference in MAD decreases from 0.29 kcal/mol when dispersion is not included to 0.08, 0.19 or 0.15 kcal/mol when dispersion is included at the D2, D3 and D3(BJ) level, respectively.

Globally speaking, this is also what isobserved in the case of the parameterized family of functionals (BLYP and B2PLYP) although the overall MAD difference between GH and DH is always larger than 1kcal/mol. Furthermore, while the largest decrease in MAD is computed when applying the D3 type corrections in the case of B3LYP, for PBE0 the largest effect is found when using the D2 corrections. Indeed, both for PBE0-DH and PBE0, the more complex D3 corrections seem to add a negligible accuracy.

The difference inperformancesof the two families of functionals(i.e. the PBE-based and BLYP-based) could be related to the dissimilarbehavior of the exchange component in the limit of large reduced density gradient. Indeed, it has been shown that the Becke's exchange functional (B88)90 is repulsive in dispersion-bound complexes so that simple van der Waals systems are precited to be unbounded91,92. This is not the case for the PBE exchange, which is more actractive93. It is therefore likely thatBLYP based functionals benefit more from the always attractive, DFT-D type corrections. More insigths on this point could be obtained by analyzing the performances of “cross” functionals, such as BPBE-DH or PBE-LYP-DH. However, these unusual exchange and correlation functioals’ combinations areof scarse theoretical interest, since the PBE contributions were defined to match each-other94. Furthermore, a correct definition of these functionals requires the full optimizations of the coefficients ruling the different contributions to the functionals, so that the differences between the two orginal DH functionals (PBE0-DH and B2PLYP) and their derivatives could be not clearly ascribed to a single effect, making the obtained results difficult to interpret. In this intricate context, we believe that this point does not deserve further explorations and prefer to rest on the above mentioned qualitative interpretation of the exchange behavior.