Journal Club Theme of February 2009: Finite Element Methods in Quantum Mechanics

N. Sukumar's picture

Welcome to the February 2009 issue. In this issue, we will discuss the use of finite elements (FEs) in quantum mechanics, with specific focus on the quantum-mechanical problem that arises in crystalline solids. We will consider the electronic structure theory based on the Kohn-Sham equations of density functional theory (KS-DFT): in real-space, Schrödinger and Poisson equations are solved in a parallelepiped unit cell with Bloch-periodic and periodic boundary conditions, respectively. The planewave pseudopotential approach is the method of choice in such quantum-mechanical simulations, but there has been growing interest in recent years on the use of various real-space mesh approaches, of which finite elements are gaining increasing prominence. Most of us are very well-aware of the use of finite elements in solid and structural mechanics applications, but might be less familiar with its place or potential in quantum-mechanical calculations. To bridge this gap, I provide a brief introduction to the topic and sketch the main ingredients, which is supplemented by links to three review articles for further details. Parts of these journal articles would be very accessible to readers of iMechanica who are familiar with FE and other grid-based methods. In the September 15, 2008 Journal Club issue, the computational challenges in electronic-structure calculations have been outlined by Vikram Gavini; please take a look at the nice overview of DFT that he has written.

The solution of the equations of DFT provide a means to determine material properties completely from quantum-mechanical first principles (ab initio), without any experimental input or tunable parameters. This facilitates fundamental understanding and robust predictions for a wide range of properties across the gamut of material systems. The quantum-mechanical problem in a crystalline solid consists of determining the electronic charge density (corresponding wavefunction and effective potential) for a system consisting of positively charged nuclei, surrounded by negatively charged electrons. In the all-electron problem, the Coulomb potential Z/r diverges (solution has cusps and oscillates rapidly near nuclear positions), and hence is not readily amenable to accurate numerical calculations for even moderate system sizes. For first principles computations in crystalline solids, the pseudopotential approach is widely adopted in most quantum molecular dynamics codes. The electrons in the inner shell (core electrons) are tightly bound and do not contribute to any valence bonding. In the pseudopotential approach, the core electrons are frozen in their atomic state, and the divergent Coulomb potential is replaced by a modified potential (pseudopotential) such that the valence states close to the core are less oscillatory, but do not change outside the core region. Only pseudoatomic wavefunctions for the valence (outermost shells) states are solved for in the Schrödinger equation, which makes it numerically tractable via planewaves or with real-space mesh techniques.

On using the pseudopotential approach, the KS-DFT equations consist of the solution of single-particle like Schrödinger equations, which are coupled to a Poisson equation. The single-particle Schrödinger equation (atomic units used) for the ith state is:

subject to boundary conditions consistent with Bloch's theorem. In the above equation, and are the pseudoatomic wavefunction and energy eigenvalue, respectively. The total effective potential now consists of a local ionic part, a non-local ionic part, the electronic (Hartree) potential and the exchange-correlation potential (due to many-body interactions and Pauli's exclusion principle). Typically, to enable predictive capabilities, energy eigenvalues need to be computed within 1 mHa (chemical) accuracy (1 Hartree ~ 27.21 eV). The single-particle pseudowavefunctions are squared sum to form the charge density, which is used in the Poisson equation to solve for the Coulomb potential (local ionic and electronic contributions). Once again is formed and the process is repeated until self-consistency is attained (charge density and effective potential do not change). The total energy, forces, etc., can now be computed to enable quantum molecular dynamics simulations.

The solution of the Poisson equation scales linearly with the number of degrees of freedom N, but the Schrödinger solution varies as the cube of N (see this plot). For self-consistency, since repeated Poisson and Schrödinger equations are solved for and in excess of 1000 eigenfunctions may be required for large systems (~100 atoms or more), the solution of the Schrödinger equation is the limiting step in the solution of the equations of DFT. Unlike linear equations that are relatively easy to solve, accurate eigensolutions for thousands of eigenpairs are computationally demanding and algorithmically challenging, since the eigenfunctions also have to meet the orthogonality constraint.

From Planewaves to Real-Space Mesh Techniques

As the preceding discussion indicates, the efficient solution of the Schrödinger equation in DFT computations is of paramount importance. This need is especially pronounced for metallic systems with heavier atoms and under extreme conditions (high-temperature and/or -pressure) whose pseudopotentials are deep and sharply localized. In such instances, planewaves (Fourier bases) are less than ideal since they have the same resolution everywhere, and hence real-space approaches with their ability to have variable resolution in space (adaptive) become more attractive. More importantly, for large-scale electronic-structure calculations, basis-sets that are compactly-supported such as finite elements or wavelets lead to structured sparse system matrices and hence are more amenable to iterative solution schemes and to implementation on massively parallel computational platforms. As a variational method, finite elements are systematically improvable and convergence of the energy eigenvalues is from above (min-max theorem). Furthermore, boundary conditions (periodic, Bloch, Dirichlet or a combination of these) are easily incorporated within the weak formulation, which make FEs attractive for modeling crystalline solids, molecules, clusters, or surfaces. These attributes make finite elements particularly promising for density functional calculations, and hence the optimism that in the upcoming years there will emerge wider interest in exploring finite element methods in quantum-mechanical computations. I close with a reading list, and a short summary of each review article.


  1. T. L. Beck (2000), "Real-Space Mesh Techniques in Density-Functional Theory," Reviews of Modern Physics, Vol. 72, Number 4, pp. 1041–1080. [arXiv] [Journal]

    Beck's review discusses finite-difference and finite element formulations in DFT. Emphasis is placed on Poisson and nonlinear Poisson-Boltzmann equations in electrostatics and on solutions of Hartree-Fock and KS-equations (eigenvalue problems) of DFT. First, Beck provides a context for real-space mesh techniques by describing some of the prominent developments so far on electronic structure methods (planewaves, Gaussian, LCAO, etc.). Then, the theory behind KS-DFT and classical electrostatics is described. Second-order FD is briefly discussed, and then a higher-order finite-difference method is used to solve the Poisson equation with a singular charge density. The finite element formulation for the Poisson equation is presented. Multigrid solvers are attractive for real-space methods and the main feature of a multigrid method are discussed. Beck presents energy minimization techniques to solve the Schrödinger eigenproblem, and provides a historical perspective on the developments in finite-difference and finite element methods for self-consistent calculations. Lastly, attention is given to time-dependent DFT calculations. 

  2. J. E. Pask and P. A. Sterne (2005), "Finite Element Methods in Ab Initio Electronic Structure Calculations," Modelling and Simulation in Materials Science and Engineering, Vol. 13, pp. R71–R96. [Journal]

    A PDF of this paper is uploaded (courtesy of Pask). Pask and Sterne review finite element bases and their use in the self-consistent solution of the Kohn-Sham equations of DFT. The solution of the Schrödinger and Poisson equations is discussed, with particular attention to the imposition of the required Bloch-periodic and periodic boundary conditions, respectively. The use of these solutions in the self-consistent solution of the Kohn-Sham equations and computation of the DFT total energy is then discussed, and applications are given. To impose Bloch-periodic boundary conditions, the wavefunction is rewritten in terms of a periodic function u(x), and the variational (weak) form for the Schrödinger equation in terms of u(x)is derived. The weak form within the unit cell and expressions for the contributions of the non-local part of the pseudopotential are presented. In general, the trial and test functions can now be complex-valued functions, and the overlap matrix S and the local part of the Hamiltonian H are Hermitian. Band structure for a Si pseudopotential is presented and the optimal sextic convergence in energy for cubic finite elements is demonstrated. In the context of crystalline calculations, particular attention is given to the handling of the long-range Coulomb interaction via use of neutral charge densities in the Poisson solution. Self-consistent finite element solutions for the band structure of GaAs are shown, and uniform convergence with mesh refinement to the exact self-consistent solution is demonstrated. 

  3. T. Torsti et al. (2006), "Three Real-Space Discretization Techniques in Electronic Structure Calculations,"Physica Status Solidi. B, Basic Research, Vol. 243, Number 5, pp. 1016–1053. [arXiv] [Journal]

    In this paper, Torsti and co-workers compare and contrast the performance of finite-differences, finite elements, and wavelets in electronic-structure calculations. The computational problems under consideration to be solved are the single-particle Schrödinger equation for the pseudowavefunction, and the Poisson equation for the Coulomb potential. Basic theory on FD, FE, and wavelets is first presented; hierarchical finite element bases are also touched upon. Solution approaches for linear equation solvers and for generalized eigenproblem solvers are discussed in significant detail. Applications of finite-differences to quantum dots, surface nanostructures and positron calculations are presented, whereas finite element solutions for all-electron calculations for molecules are performed. Finally, the similarities and differences between the different methods of discretization are nicely summarized.

pask_review2005.pdf1.36 MB
John E. Dolbow's picture

advantages over wavelets?


Thanks for introducing a very interesting theme this month.  Naively, it would seem to me that wavelets might be better suited to this class of problems than Lagrange-interpolant based finite elements.  Can you comment further on the differences/relative advantages?

N. Sukumar's picture

FEM and wavelets


Yes, good point, and I do not have a definitive answer.  Wavelets and FE in DFT calculations are contemporaries (similar starting times), and in recent years, use of wavelets (Daubechies/interpolatory) has also been on the rise.  Goedecker in Basil and co-workers have over the past 2-3 years published quite a few nice papers showing all-electon calculations for the Poisson equation and also a recent one on DFT calculations using wavelets where the promise of the approach is revealed (have posted the links to two of the refs at the bottom). Unlike FD, FE and wavelets are both basis-set approaches and the variational arguments also holds for both. On the plus side, wavelets are localized and also orthogonal in real and Fourier space (simpler matrix structures), and they do have the adaptive nature built-in (via scaling) in the construction. For FEM, typically upto cubic FE (serendipity elements) have been used so far in DFT calculations, but would be interesting to see what spectral FE can provide for these problems. Unlike most structural/solid applications where the mesh can become complicated, here the FE meshes are either regular or parallelepiped and hence the basis functions are piecewise-polynomial. The bottom-line is the need to reduce the number of degrees of freedom to solve the Schrodinger equation (at an acceptable accuracy), and hence codes with similar capabilities (e.g., wavelets and Lagrange/spectral FE) need to be available so that head-to-head comparisons can be made.

Genovese et al. (2007), Efficient and accurate three-dimensional Poisson solver for surface problems, J. Chem. Phys., 127, 054704.

Genovese et al. (2008), Daubechies wavelets as a basis set for density functional pseudopotential calculations, J. Chem. Phys., 129, 014109.

FE vs. wavelets

The main advantages of wavelets relative to FE are orthogonality (for some types) and ease of local refinement (hierarchical).

The main disadvantages (the price paid) are lesser locality (thus potentially less efficient massively parallel implementation), more rapid oscillations and thus more difficult matrix element and other integrals, and lesser simplicity (being non-polynomial, in some cases defined only by recursion).

Goedecker's recent outstanding paper

Genovese et al. (2008), Daubechies wavelets as a basis set for density functional pseudopotential calculations, J. Chem. Phys., 129, 014109.

is the culmination of a decade or so of work on wavelets in DFT. Other major work in the area has been by Arias. See, e.g.,

Title: Multiresolution analysis for efficient, high precision all-electron density-functional calculations
Author(s): Engeness TD, Arias TA
Source: PHYSICAL REVIEW B   Volume: 65   Issue: 16 Article Number: 165106   Published: APR 15 2002

A close reading of these should make clearer the above enumerated issues.

Which basis's (FE, wavelet, or other) mix of advantages and disadvantages will prevail in the large-scale DFT context in the coming decade or so will become clearer as the parallel implementations of each are brought to fruition. As of now, it is very much an open question.

My own sense, however, as evidenced by my continued work in the area, is that the ultimate simplicity and locality of FE will prevail in the large-scale, parallel context. Looking forward, especially interesting in this regard are hierarchical, spectral, and partition-of-unity FE. The latter of which I am pursuing presently.

Vikram Gavini's picture

FE vs. Max-ent

Suku, Thanks for the interesting post.

John, Thanks for clarifying the pros and cons of wavelets vs finite-elements.

I am curious if any one tried max-ent functions as a basis for solving the Kohn Sham equations, or if you have any thoughts in that direction? They are smooth and can capture the wave functions with lesser number of basis functions than standard FE. At the same time they offer the desirable coarse-graining features of FE.

N. Sukumar's picture

Max-ent for Schrodinger


Thanks for the remarks and questions. I haven't tried max-ent for the three-dimensional Schrödinger operator, but here are a few related points. Accuracy considerations when solving the Schrödinger eigenproblem are a lot more restrictive and demanding than most typical FE applications. With it, robustness and systematic improvability (e.g., with mesh refinement/more basis functions) are especially germane. Trilinear finite elements provide very slow convergence for the Schrödinger eigenproblem, and hence Pask chose cubic FE. A general higher-order max-ent basis might open some doors in this regard (Arroyo/Ortiz have come-up with quadratically smooth max-ent). With max-ent and meshfree in general, numerical integration of the weak form integral also needs to be carefully addressed since numerical integration errors can limit convergence in the energy eigenvalues when very high precision is required.  Of course, smooth bases are very desirable for solving KS-equations; a compactly-supported, higher-order-precision smooth basis set that can be integrated (very accurately) and which facilitates easy imposition of periodic/Bloch-periodic boundary conditions would be a preferred choice. On the face of it, given that the mesh for electronic-structure calculations is simple enough, element-centric basis functions might prove to be useful since they render the numerical integration issue to be less difficult. Possibly smooth bases like NURBS or higher-order max-ent on elements might be able to fit within this slot. Setting aside the smoothness property, piecewise-polynomial FE basis have all the other desired attributes, and one also has a good understanding of what is required to obtain variational convergence (avoid `variational crimes') with them, which makes them attractive.

Henry Tan's picture


Thanks for the interesting topic. I am trying to understand the method. What does particle here represent?

N. Sukumar's picture



I presume you point to the reference to `single-particle' just before the Schrödinger equation. This is a consequence of the Kohn-Sham ansatz to get to the electronic charge density: the interacting system of electrons is replaced by an equivalent one of non-interacting `electrons' in a mean (potential) field. Each single-electron (single-particle) obeys the Schrödinger equation: the wavefunctions are pseudowavefunctions and their sole purpose is to construct the charge density, when squared-summed.

Pradeep Sharma's picture

non-equilibrium problems

Dear Suku, thanks for the informative discussion on this interesting topic. Recently, non-equilibrium problems have acquired increased prominence in the context of many applications e.g. nanocapacitors for energy storage, quantum dots among others. For example, a nanocapacitor (metal-dielectric-metal thin film supercell) under electric field conditions cannot be handled by DFT. Time dependent DFT is a possible solution but convergence is often a question mark and certainly the problem becomes computaitonally intensive. Are finite element based methods also likely to offer advantages in such classes of problems?

N. Sukumar's picture

Time-dependent DFT


Sorry, not aware of the FE literature in time-dependent DFT. Here is a ref. that Pask suggested that might be of interest:

Title: Finite-element implementation for electron transport in nanostructures
Author(s): Havu P, Havu V, Puska MJ, et al.
Source: JOURNAL OF CHEMICAL PHYSICS Volume: 124 Issue: 5 Article Number: 054707 Published: FEB 7 2006

DFT and TDDFT with finite-elements

Dear all,

We just submitted an article to J. Chem. Phys. about DFT and TDDFT with finite elements. Here is the preprint:

Our implementation uses hiearchical finite-elements on unstructured meshes. It is still in quite early stage of development, but it works and the results are promising. We are working on parallel implementation and we have some ideas how to reduce the computational cost significantly.

N. Sukumar's picture

Re: DFT and TDDFT with FEs


Thanks for sharing your work, which is very interesting. Nice to read about how higher-order tets, advanced mesh generation algorithms for molecules, and all-electron (not pseudopotentials) computations for DFT/TDDFT can all come-together. In the all-electron case using finite elements, how is the 1/r Coulomb singularity handled (sufficiently large value used) in the numerical computations?  I'd welcome your comments and perspectives on the pros/cons of FEs vis-a-vis existing all-electron methods. For those interested, here is a short description on TDDFT, and an article on the basics of TDDFT.

Re: DFT and TDDFT with FEs

Coulomb singularity

We haven't done any detailed analysis, but the Coulomb singularity does not seem to be a big problem. Our mesh generation scheme produces highly nonuniform meshes. The mesh has a vertex at each nucleus (the source of the singularity), and then we create (~spherically) symmetric layers of tetras around each nucleus. The radius of the layers increases geometrically (r_i = q r_i-1, where e.g. q = sqrt(2)), until the radius is order of the inter-atomic distance. Finally, these atomic meshes are merged together. The largest edge of the mesh to the smallest edge ratio is order of 103-104.

The Coulomb singularity could be a problem only when assembling the matrix element (of the Hamiltonian) int chi_i(r) 1/|r| chi_i(r) dr which is associated to the vertex basis function at the nucleus, because other basis functions are zero at this point. As we use the Gauss-Legendre quadrature, the 1/r is never evaluated at the nucleus (i.e., r=0). So, in practice, no special treatment is required. However, maybe some kind of special quadrature scheme would be more efficient and reduce the number of elements required near the nuclei. I would appreciate any comments about this.


FEM vs other all-electron methods

At first, I have to say that it is way too early to draw any conclusions, but a few things come into my mind...

At the moment, I can remember only three other types of all-electron methods. (1) The linear combination of atomic orbitals (LCAO) either with numerical atomic orbitals (FHIaims, Dmol) or with Gaussians (Gaussian, TurboMole, Dalton, ...) is the most used one. (2) The linearly augmented planewave (LAPW) and (3) the projector augmented wave (PAW) method are the two other ones. The PAW can be used together with planewaves or finite difference. LAPW and PAW use "atomic spheres" - nonoverlapping spheres around atoms, inside which a special treatment is applied for all-electron wavefunctions. So, one has finite-difference or plane-wave part plus all-electron atomic spheres.


Pros vs. LCAO:

1) Real-space & local basis functions => easy and efficient parallelization via domain decomposition

2) Different boundary conditions (cluster/molecule, periodic, mixed) are easy to implement

3) Strong external (especially, time-dependent) potentials (e.g. electric or magnetic fields) are not a problem.

Cons vs. LCAO:

1) Huge number of degrees of freedom (DOF), LCAO: 10-100/atom vs. FEM 1000-10000/atom

2) LCAO has tens of years history behind => strong experience


Pros vs. LAPW & PAW:

1) All (core and valence) electrons are treated on equal footing (i.e. no atomic spheres) => simplicity, adding new features is easy

2) Nonuniform discretization => sparse material can be handled without wasting DOFs

Cons vs. LAPW & PAW:

1) Performance??? DOF in FEM is more expensive than finite-difference or planewave DOF


To summarize, I would say that, for large scale adoption of FEM, what matters is the speed. A FEM based method should be able to perform a basic DFT calculation for a relatively large and dense system (e.g. carbon dioxide on metal surface/slab, ~100 atoms) roughly in the same time as PAW or LAPW based method. At the moment, my FEM implementation is one order of magnitude solver than FD+PAW method, but we have some ideas how to reduce the computational cost significantly. There exist also a few special systems, for example, time-depedent systems beyond linear response regime, where FEM has significant advantages over above methods.


But again, it's quite early to draw any conclusions. Within one year, I should have much more insight to this question.

N. Sukumar's picture

FEs and Coulomb singularity


Thanks for your detailed response, and for providing us with the place of FEs within the context of widely used all-electron methods. Clearly, fewer number of DOFs and improved speed are two issues that are important for finite elements, and as you allude to, it will take time for further developments and maturity in FE codes are needed for the gap to potentially close.

For the Coulomb singularity, the need for a geometric mesh in 3D with element edge ratios of , renders the mesh generation (element quality) an important research problem unto itself, which you have solved. Insofar as the quadrature goes, tailored quadrature rules would seem to be beneficial, and there is quite a bit of experience in the boundary element community in treating singular and hypersingular integrands. We also have some recent experience on developing Gauss-like quadratures for polynomial and non-polynomial (singular) integrands for finite elements. First impressions are that some of these approaches or variants of the same might have a place to improve the efficiency of evaluating the Hamiltonian with the 1/r singularity.



Pradeep Sharma's picture


Lauri, thanks...this is a very useful summary.

Jarson Zhang's picture

It's really a interesting

It's really a interesting and useful topic, but it's too hard! Thank you too!