Derivatives of the invariants of a tensor

When you first start learning finite deformation plasticity, you will run into a plastic flow rate $ \ensuremath{\boldsymbol{d}}_p$ that can be derived from a flow potential $ \phi$ such that 

$\displaystyle \ensuremath{\boldsymbol{d}}_p = \ensuremath{\frac{\partial \phi}{\partial \ensuremath{\boldsymbol{\sigma}}}}$


where $ \ensuremath{\boldsymbol{\sigma}}$ is the Cauchy stress.  For an isotropic material with scalar internal variables, the plastic
flow potential can be assumed to have the form 

$\displaystyle \phi \equiv \phi(p, J_2, J_3;  T, q_j)$


where $ p$ is the pressure, $ J_2, J_3$ are invariants of the deviatoric stress $ \ensuremath{\boldsymbol{s}}$, $ T$ is the temperature, and $ q_j$ are the internal variables. The quantities $ p$, $ J_2$, and $ J_3$ are defined as 

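\begin{equation*}\begin{aligned} p & = -\frac{1}{3} \text{tr}\left(\boldsymbol{\sigma}\right) \\
 J_2 & = \frac{1}{2} \text{tr}\left(\boldsymbol{s}^2\right) \\
 J_3 & = \det(\boldsymbol{s}) = \frac{1}{3} \text{tr}\left(\boldsymbol{s}^3\right) . \end{aligned}\end{equation*}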

Using the chain rule you can write 

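$\displaystyle \boldsymbol{d}_p = \frac{\partial \phi}{\partial p} \frac{\partial p}{\partial \boldsymbol{\sigma}} + \frac{\partial \phi}{\partial J_2} \frac{\partial J_2}{\partial \boldsymbol{\sigma}} + \frac{\partial \phi}{\partial J_3} \frac{\partial J_3}{\partial \boldsymbol{\sigma}} .$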


The first problem that you run into is how to find the derivatives of the
invariants. My first attempt was to express everything in terms of components
and do the differentiations. That works but can be tedious.

An experienced mechanician would just have gone and read Truesdell and Noll
[1] and picked out the formulas from page 26 of that book.
However, that book scared me with all its old German and Hebrew notation.
For those of you who find Truesdell and Noll difficult to read, here's the
way that book deals with the problem of finding the derivatives of invariants.
Hope you find it useful.

The derivative of a scalar valued function $ \phi(\ensuremath{\boldsymbol{A}})$ of a second order tensor $ \ensuremath{\boldsymbol{A}}$ can be defined via the directional derivative using 

$\displaystyle \frac{\partial \phi}{\partial \boldsymbol{A}} : \boldsymbol{B} = \left. \frac{d}{ds} \phi(\boldsymbol{A} + s \boldsymbol{B}) \right\vert _{s=0}$
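As a rough numerical illustration of this definition, the sketch below estimates the directional derivative with a central difference in $s$. The function name `directional_derivative`, the step size `h`, and the sample tensors are assumptions made for the example; tensors are stored as 3x3 nested lists. For the trace, whose derivative is the identity, the estimate should come out close to tr(B).

```python
def trace(a):
    """Trace of a 3x3 tensor stored as nested lists."""
    return a[0][0] + a[1][1] + a[2][2]


def directional_derivative(phi, a, b, h=1e-6):
    """Central-difference estimate of d/ds phi(A + s B) at s = 0."""
    plus = [[a[i][j] + h * b[i][j] for j in range(3)] for i in range(3)]
    minus = [[a[i][j] - h * b[i][j] for j in range(3)] for i in range(3)]
    return (phi(plus) - phi(minus)) / (2.0 * h)


# Sample tensors (arbitrary values chosen for illustration).
A = [[2.0, 1.0, 0.0], [0.5, 3.0, 1.0], [0.0, 1.0, 4.0]]
B = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]

# For phi = tr, (d phi / dA) : B = 1 : B = tr(B).
dd_trace = directional_derivative(trace, A, B)
print(dd_trace, trace(B))
```

The same helper works for any scalar function of a 3x3 tensor, so it can be reused to spot-check the invariant derivatives derived later.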


where $ \ensuremath{\boldsymbol{B}}$ is an arbitrary second order tensor.  The invariant $ I_3$ is given by 

$\displaystyle I_3(\ensuremath{\boldsymbol{A}}) = \det(\ensuremath{\boldsymbol{A}}) .$


Therefore, from the definition of the derivative, 

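\begin{equation*}\begin{aligned}\frac{\partial I_3}{\partial \boldsymbol{A}} : \boldsymbol{B} & = \left. \frac{d}{ds} \det(\boldsymbol{A} + s \boldsymbol{B}) \right\vert _{s=0} \\
 & = \left. \frac{d}{ds} \det\left[ s \boldsymbol{A} \cdot \left(\frac{1}{s} \boldsymbol{\mathit{1}} + \boldsymbol{A}^{-1}\cdot\boldsymbol{B}\right)\right] \right\vert _{s=0} \\
 & = \left. \frac{d}{ds} \left[ s^3 \det(\boldsymbol{A}) \det\left(\frac{1}{s} \boldsymbol{\mathit{1}} + \boldsymbol{A}^{-1}\cdot\boldsymbol{B}\right)\right] \right\vert _{s=0} . \end{aligned}\end{equation*}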

Recall that we can expand the determinant of a tensor in the form of
a characteristic equation in terms of the invariants $ I_1,I_2,I_3$ using 

$\displaystyle \det(\lambda \boldsymbol{\mathit{1}} + \boldsymbol{A}) = \lambda^3 + I_1(\boldsymbol{A}) \lambda^2 + I_2(\boldsymbol{A}) \lambda + I_3(\boldsymbol{A}) .$
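The expansion of $\det(\lambda \boldsymbol{\mathit{1}} + \boldsymbol{A})$ in powers of $\lambda$ can be spot-checked numerically. The sketch below is only an illustration: it assumes the standard definitions $I_1 = \text{tr}(\boldsymbol{A})$, $I_2 = \frac{1}{2}[(\text{tr}\,\boldsymbol{A})^2 - \text{tr}(\boldsymbol{A}^2)]$, $I_3 = \det(\boldsymbol{A})$, and the helper names and sample values are made up for the example.

```python
def trace(a):
    return a[0][0] + a[1][1] + a[2][2]


def matmul(a, b):
    """Product of two 3x3 tensors stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def det3(a):
    """Determinant of a 3x3 tensor by expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def invariants(a):
    """Principal invariants I1, I2, I3 (standard definitions, assumed here)."""
    i1 = trace(a)
    i2 = 0.5 * (i1 * i1 - trace(matmul(a, a)))
    i3 = det3(a)
    return i1, i2, i3


A = [[2.0, 1.0, 0.0], [0.5, 3.0, 1.0], [0.0, 1.0, 4.0]]
lam = 0.7  # arbitrary sample value of lambda

shifted = [[A[i][j] + (lam if i == j else 0.0) for j in range(3)]
           for i in range(3)]
I1, I2, I3 = invariants(A)

lhs = det3(shifted)                               # det(lambda 1 + A)
rhs = lam**3 + I1 * lam**2 + I2 * lam + I3        # polynomial in lambda
print(abs(lhs - rhs))
```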


Using this expansion we can write 

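\begin{equation*}\begin{aligned}\frac{\partial I_3}{\partial \boldsymbol{A}} : \boldsymbol{B} & = \left. \frac{d}{ds} \left[ s^3 \det(\boldsymbol{A}) \left(\frac{1}{s^3} + I_1(\boldsymbol{A}^{-1}\cdot\boldsymbol{B}) \frac{1}{s^2} + I_2(\boldsymbol{A}^{-1}\cdot\boldsymbol{B}) \frac{1}{s} + I_3(\boldsymbol{A}^{-1}\cdot\boldsymbol{B})\right)\right] \right\vert _{s=0} \\
 & = \left. \det(\boldsymbol{A}) \frac{d}{ds} \left[ 1 + I_1(\boldsymbol{A}^{-1}\cdot\boldsymbol{B}) s + I_2(\boldsymbol{A}^{-1}\cdot\boldsymbol{B}) s^2 + I_3(\boldsymbol{A}^{-1}\cdot\boldsymbol{B}) s^3 \right] \right\vert _{s=0} \\
 & = \left. \det(\boldsymbol{A}) \left[ I_1(\boldsymbol{A}^{-1}\cdot\boldsymbol{B}) + 2 I_2(\boldsymbol{A}^{-1}\cdot\boldsymbol{B}) s + 3 I_3(\boldsymbol{A}^{-1}\cdot\boldsymbol{B}) s^2 \right] \right\vert _{s=0} \\
 & = \det(\boldsymbol{A}) I_1(\boldsymbol{A}^{-1}\cdot\boldsymbol{B}) . \end{aligned}\end{equation*}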

Recall that the invariant $ I_1$ is given by 

$\displaystyle I_1(\ensuremath{\boldsymbol{A}}) = \ensuremath{\text{tr}\left(\ensuremath{\boldsymbol{A}}\right)} .$



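Hence, 

$\displaystyle \frac{\partial I_3}{\partial \boldsymbol{A}} : \boldsymbol{B} = \det(\boldsymbol{A}) \text{tr}\left(\boldsymbol{A}^{-1}\cdot\boldsymbol{B}\right) = \det(\boldsymbol{A}) [\boldsymbol{A}^{-1}]^T : \boldsymbol{B} .$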


Invoking the arbitrariness of $ \ensuremath{\boldsymbol{B}}$ we then have 

$\displaystyle \boxed{ \frac{\partial I_3}{\partial \boldsymbol{A}} = \det(\boldsymbol{A}) [\boldsymbol{A}^{-1}]^T . }$
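A quick way to gain confidence in the result that the derivative of the determinant is $\det(\boldsymbol{A}) [\boldsymbol{A}^{-1}]^T$ is to compare it with finite differences. The sketch below is an illustration under stated assumptions: it uses the known fact that $\det(\boldsymbol{A}) [\boldsymbol{A}^{-1}]^T$ equals the cofactor matrix, perturbs one component at a time, and the helper names, step size, and sample tensor are made up for the example.

```python
def det3(a):
    """Determinant of a 3x3 tensor by expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def cofactor_matrix(a):
    """Cofactor matrix of a 3x3 tensor; equals det(A) * inverse(A)^T."""
    return [[a[(i + 1) % 3][(j + 1) % 3] * a[(i + 2) % 3][(j + 2) % 3]
             - a[(i + 1) % 3][(j + 2) % 3] * a[(i + 2) % 3][(j + 1) % 3]
             for j in range(3)] for i in range(3)]


def numerical_gradient(phi, a, h=1e-6):
    """Central-difference estimate of d phi / d A_ij for every component."""
    grad = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            plus = [row[:] for row in a]
            minus = [row[:] for row in a]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (phi(plus) - phi(minus)) / (2.0 * h)
    return grad


A = [[2.0, 1.0, 0.0], [0.5, 3.0, 1.0], [0.0, 1.0, 4.0]]

grad_fd = numerical_gradient(det3, A)
grad_exact = cofactor_matrix(A)
max_err = max(abs(grad_fd[i][j] - grad_exact[i][j])
              for i in range(3) for j in range(3))
print(max_err)
```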



In an orthonormal basis the components of $ \ensuremath{\boldsymbol{A}}$ can be written as a matrix $ \ensuremath{\mathbf{A}}$. In that case, the right hand side corresponds to the cofactors of the matrix.


For the derivatives of the other two invariants, let us go back to the
characteristic equation 

$\displaystyle \det(\lambda \boldsymbol{\mathit{1}} + \boldsymbol{A}) = \lambda^3 + I_1(\boldsymbol{A}) \lambda^2 + I_2(\boldsymbol{A}) \lambda + I_3(\boldsymbol{A}) .$


Using the same approach as before, we can show that 

$\displaystyle \frac{\partial }{\partial \boldsymbol{A}} \det(\lambda \boldsymbol{\mathit{1}} + \boldsymbol{A}) = \det(\lambda \boldsymbol{\mathit{1}} + \boldsymbol{A}) [(\lambda \boldsymbol{\mathit{1}} + \boldsymbol{A})^{-1}]^T .$


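Now the left hand side can be expanded as 

\begin{equation*}\begin{aligned}\frac{\partial }{\partial \boldsymbol{A}} \det(\lambda \boldsymbol{\mathit{1}} + \boldsymbol{A}) & = \frac{\partial }{\partial \boldsymbol{A}} \left[\lambda^3 + I_1 \lambda^2 + I_2 \lambda + I_3\right] \\
 & = \frac{\partial I_1}{\partial \boldsymbol{A}} \lambda^2 + \frac{\partial I_2}{\partial \boldsymbol{A}} \lambda + \frac{\partial I_3}{\partial \boldsymbol{A}} . \end{aligned}\end{equation*}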


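Hence, 

$\displaystyle \frac{\partial I_1}{\partial \boldsymbol{A}} \lambda^2 + \frac{\partial I_2}{\partial \boldsymbol{A}} \lambda + \frac{\partial I_3}{\partial \boldsymbol{A}} = \det(\lambda \boldsymbol{\mathit{1}} + \boldsymbol{A}) [(\lambda \boldsymbol{\mathit{1}} + \boldsymbol{A})^{-1}]^T$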



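or, 

$\displaystyle (\lambda \boldsymbol{\mathit{1}} + \boldsymbol{A})^T \cdot \left[\frac{\partial I_1}{\partial \boldsymbol{A}} \lambda^2 + \frac{\partial I_2}{\partial \boldsymbol{A}} \lambda + \frac{\partial I_3}{\partial \boldsymbol{A}}\right] = \det(\lambda \boldsymbol{\mathit{1}} + \boldsymbol{A}) \boldsymbol{\mathit{1}} .$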


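Expanding the right hand side and separating terms on the left hand side gives 

$\displaystyle (\lambda \boldsymbol{\mathit{1}} + \boldsymbol{A}^T) \cdot \left[\frac{\partial I_1}{\partial \boldsymbol{A}} \lambda^2 + \frac{\partial I_2}{\partial \boldsymbol{A}} \lambda + \frac{\partial I_3}{\partial \boldsymbol{A}}\right] = \left[\lambda^3 + I_1 \lambda^2 + I_2 \lambda + I_3\right] \boldsymbol{\mathit{1}}$

or, 

$\displaystyle \left[\frac{\partial I_1}{\partial \boldsymbol{A}} \lambda^3 + \frac{\partial I_2}{\partial \boldsymbol{A}} \lambda^2 + \frac{\partial I_3}{\partial \boldsymbol{A}} \lambda\right] + \boldsymbol{A}^T \cdot \frac{\partial I_1}{\partial \boldsymbol{A}} \lambda^2 + \boldsymbol{A}^T \cdot \frac{\partial I_2}{\partial \boldsymbol{A}} \lambda + \boldsymbol{A}^T \cdot \frac{\partial I_3}{\partial \boldsymbol{A}} = \left[\lambda^3 + I_1 \lambda^2 + I_2 \lambda + I_3\right] \boldsymbol{\mathit{1}} .$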


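If we define $ I_0 := 1$ and $ I_4 := 0$, we can write the above as 

$\displaystyle \left[\frac{\partial I_1}{\partial \boldsymbol{A}} \lambda^3 + \frac{\partial I_2}{\partial \boldsymbol{A}} \lambda^2 + \frac{\partial I_3}{\partial \boldsymbol{A}} \lambda + \frac{\partial I_4}{\partial \boldsymbol{A}}\right] + \boldsymbol{A}^T \cdot \frac{\partial I_0}{\partial \boldsymbol{A}} \lambda^3 + \boldsymbol{A}^T \cdot \frac{\partial I_1}{\partial \boldsymbol{A}} \lambda^2 + \boldsymbol{A}^T \cdot \frac{\partial I_2}{\partial \boldsymbol{A}} \lambda + \boldsymbol{A}^T \cdot \frac{\partial I_3}{\partial \boldsymbol{A}} = \left[I_0 \lambda^3 + I_1 \lambda^2 + I_2 \lambda + I_3\right] \boldsymbol{\mathit{1}} .$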


Collecting terms containing various powers of $ \lambda$, we get 

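\begin{equation*}\begin{aligned}\lambda^3&\left(I_0 \boldsymbol{\mathit{1}} - \frac{\partial I_1}{\partial \boldsymbol{A}} - \boldsymbol{A}^T \cdot \frac{\partial I_0}{\partial \boldsymbol{A}}\right) + \lambda^2\left(I_1 \boldsymbol{\mathit{1}} - \frac{\partial I_2}{\partial \boldsymbol{A}} - \boldsymbol{A}^T \cdot \frac{\partial I_1}{\partial \boldsymbol{A}}\right) + \\
 &\lambda\left(I_2 \boldsymbol{\mathit{1}} - \frac{\partial I_3}{\partial \boldsymbol{A}} - \boldsymbol{A}^T \cdot \frac{\partial I_2}{\partial \boldsymbol{A}}\right) + \left(I_3 \boldsymbol{\mathit{1}} - \frac{\partial I_4}{\partial \boldsymbol{A}} - \boldsymbol{A}^T \cdot \frac{\partial I_3}{\partial \boldsymbol{A}}\right) = 0 . \end{aligned}\end{equation*}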

Then, invoking the arbitrariness of $ \lambda$, we have 

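\begin{equation*}\begin{aligned}I_0 \boldsymbol{\mathit{1}} - \frac{\partial I_1}{\partial \boldsymbol{A}} - \boldsymbol{A}^T \cdot \frac{\partial I_0}{\partial \boldsymbol{A}} & = 0 \\
 I_1 \boldsymbol{\mathit{1}} - \frac{\partial I_2}{\partial \boldsymbol{A}} - \boldsymbol{A}^T \cdot \frac{\partial I_1}{\partial \boldsymbol{A}} & = 0 \\
 I_2 \boldsymbol{\mathit{1}} - \frac{\partial I_3}{\partial \boldsymbol{A}} - \boldsymbol{A}^T \cdot \frac{\partial I_2}{\partial \boldsymbol{A}} & = 0 \\
 I_3 \boldsymbol{\mathit{1}} - \frac{\partial I_4}{\partial \boldsymbol{A}} - \boldsymbol{A}^T \cdot \frac{\partial I_3}{\partial \boldsymbol{A}} & = 0 . \end{aligned}\end{equation*}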

This implies that 

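$\displaystyle \boxed{ \frac{\partial I_1}{\partial \boldsymbol{A}} = \boldsymbol{\mathit{1}} ~;~~ \frac{\partial I_2}{\partial \boldsymbol{A}} = I_1 \boldsymbol{\mathit{1}} - \boldsymbol{A}^T ~;~~ \frac{\partial I_3}{\partial \boldsymbol{A}} = \det(\boldsymbol{A}) [\boldsymbol{A}^{-1}]^T = I_2 \boldsymbol{\mathit{1}} - \boldsymbol{A}^T \cdot (I_1 \boldsymbol{\mathit{1}} - \boldsymbol{A}^T) = (\boldsymbol{A}^2 - I_1 \boldsymbol{A} + I_2 \boldsymbol{\mathit{1}})^T . }$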
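The three derivatives, namely the identity for $I_1$, $I_1 \boldsymbol{\mathit{1}} - \boldsymbol{A}^T$ for $I_2$, and $(\boldsymbol{A}^2 - I_1 \boldsymbol{A} + I_2 \boldsymbol{\mathit{1}})^T$ for $I_3$, can be compared against finite differences. This is a minimal sketch, not a proof: the helper names, the step size, and the (deliberately non-symmetric) sample tensor are assumptions, and $I_2$ is taken as $\frac{1}{2}[(\text{tr}\,\boldsymbol{A})^2 - \text{tr}(\boldsymbol{A}^2)]$.

```python
def trace(a):
    return a[0][0] + a[1][1] + a[2][2]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def second_invariant(a):
    t = trace(a)
    return 0.5 * (t * t - trace(matmul(a, a)))


def numerical_gradient(phi, a, h=1e-5):
    """Central-difference estimate of d phi / d A_ij for every component."""
    grad = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            plus = [row[:] for row in a]
            minus = [row[:] for row in a]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (phi(plus) - phi(minus)) / (2.0 * h)
    return grad


def max_abs_diff(x, y):
    return max(abs(x[i][j] - y[i][j]) for i in range(3) for j in range(3))


def delta(i, j):
    return 1.0 if i == j else 0.0


A = [[2.0, 1.0, 0.0], [0.5, 3.0, 1.0], [0.0, 1.0, 4.0]]
I1 = trace(A)
I2 = second_invariant(A)
AA = matmul(A, A)

# Closed-form derivatives; note the transposes (index order j, i).
dI1 = [[delta(i, j) for j in range(3)] for i in range(3)]
dI2 = [[I1 * delta(i, j) - A[j][i] for j in range(3)] for i in range(3)]
dI3 = [[AA[j][i] - I1 * A[j][i] + I2 * delta(i, j) for j in range(3)]
       for i in range(3)]

err1 = max_abs_diff(numerical_gradient(trace, A), dI1)
err2 = max_abs_diff(numerical_gradient(second_invariant, A), dI2)
err3 = max_abs_diff(numerical_gradient(det3, A), dI3)
print(err1, err2, err3)
```

Using a non-symmetric sample tensor matters here: with a symmetric one, a missing transpose would go unnoticed.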


Other interesting relations that can be inferred based on the above are 

$\displaystyle \boldsymbol{A}^{-1} = \cfrac{1}{\det(\boldsymbol{A})} \left[\boldsymbol{A}^2 - I_1 \boldsymbol{A} + I_2 \boldsymbol{\mathit{1}}\right]$



$\displaystyle \ensuremath{\frac{\partial I_3}{\partial \ensuremath{\boldsymbol{A}}}} = I_3 [\ensuremath{\boldsymbol{A}}^T]^{-1} .$
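The expression for the inverse in terms of $\boldsymbol{A}^2$, $\boldsymbol{A}$, and the invariants is easy to test: build the candidate inverse and multiply it by $\boldsymbol{A}$. The sketch below is only an illustration; the helper names and the sample tensor are assumptions, and the tensor is chosen so that $\det(\boldsymbol{A}) \ne 0$.

```python
def trace(a):
    return a[0][0] + a[1][1] + a[2][2]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


A = [[2.0, 1.0, 0.0], [0.5, 3.0, 1.0], [0.0, 1.0, 4.0]]

AA = matmul(A, A)
I1 = trace(A)
I2 = 0.5 * (I1 * I1 - trace(AA))
I3 = det3(A)  # must be non-zero for the inverse to exist

# Candidate inverse: (A^2 - I1 A + I2 1) / det(A)
A_inv = [[(AA[i][j] - I1 * A[i][j] + (I2 if i == j else 0.0)) / I3
          for j in range(3)] for i in range(3)]

# A . A_inv should be the identity up to rounding.
product = matmul(A, A_inv)
max_err = max(abs(product[i][j] - (1.0 if i == j else 0.0))
              for i in range(3) for j in range(3))
print(max_err)
```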


Recall that 

$\displaystyle p = - \frac{1}{3} \text{tr}\left(\boldsymbol{\sigma}\right) = -\frac{1}{3} I_1(\boldsymbol{\sigma}) .$



Therefore, 

$\displaystyle \frac{\partial p}{\partial \boldsymbol{\sigma}} = -\frac{1}{3} \frac{\partial I_1}{\partial \boldsymbol{\sigma}} = -\frac{1}{3} \boldsymbol{\mathit{1}} .$


Also recall that 

$\displaystyle J_2 = \frac{1}{2} \text{tr}\left(\boldsymbol{s}^2\right) = -\frac{1}{2} \left[\left(\text{tr}\,\boldsymbol{s}\right)^2 - \text{tr}\left(\boldsymbol{s}^2\right)\right] = -I_2(\boldsymbol{s}) .$



Therefore, 

$\displaystyle \frac{\partial J_2}{\partial \boldsymbol{\sigma}} = -\frac{\partial I_2}{\partial \boldsymbol{\sigma}} = -I_1(\boldsymbol{s}) \boldsymbol{\mathit{1}} + \boldsymbol{s} = \boldsymbol{s} .$
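The pressure and $J_2$ derivatives can be checked directly with respect to $\boldsymbol{\sigma}$ by finite differences, treating $p = -\frac{1}{3}\text{tr}(\boldsymbol{\sigma})$ and $J_2 = \frac{1}{2}\text{tr}(\boldsymbol{s}^2)$ with $\boldsymbol{s} = \boldsymbol{\sigma} - \frac{1}{3}\text{tr}(\boldsymbol{\sigma})\boldsymbol{\mathit{1}}$ as functions of all nine components. The sketch below is illustrative only; the helper names, step size, and sample stress values are assumptions.

```python
def trace(a):
    return a[0][0] + a[1][1] + a[2][2]


def deviator(sig):
    """Deviatoric part s = sigma - (tr sigma / 3) 1."""
    mean = trace(sig) / 3.0
    return [[sig[i][j] - (mean if i == j else 0.0) for j in range(3)]
            for i in range(3)]


def pressure(sig):
    return -trace(sig) / 3.0


def j2(sig):
    """J2 = (1/2) tr(s^2) = (1/2) s_ij s_ji."""
    s = deviator(sig)
    return 0.5 * sum(s[i][j] * s[j][i] for i in range(3) for j in range(3))


def numerical_gradient(phi, a, h=1e-5):
    """Central-difference estimate of d phi / d sigma_ij."""
    grad = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            plus = [row[:] for row in a]
            minus = [row[:] for row in a]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (phi(plus) - phi(minus)) / (2.0 * h)
    return grad


# Symmetric sample stress (arbitrary values, any consistent unit).
sigma = [[10.0, 2.0, -1.0], [2.0, -4.0, 3.0], [-1.0, 3.0, 6.0]]
s = deviator(sigma)

dp = numerical_gradient(pressure, sigma)
dJ2 = numerical_gradient(j2, sigma)

err_p = max(abs(dp[i][j] - (-1.0 / 3.0 if i == j else 0.0))
            for i in range(3) for j in range(3))
err_J2 = max(abs(dJ2[i][j] - s[i][j]) for i in range(3) for j in range(3))
print(err_p, err_J2)
```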



$\displaystyle J_3 = \det(\ensuremath{\boldsymbol{s}}) = I_3(\ensuremath{\boldsymbol{s}}) .$



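Therefore, 

$\displaystyle \frac{\partial J_3}{\partial \boldsymbol{\sigma}} = \det(\boldsymbol{s}) [\boldsymbol{s}^{-1}]^T = (\boldsymbol{s}^2 - I_1(\boldsymbol{s}) \boldsymbol{s} + I_2(\boldsymbol{s}) \boldsymbol{\mathit{1}})^T = \boldsymbol{s}^2 - \frac{1}{2} \text{tr}\left(\boldsymbol{s}^2\right) \boldsymbol{\mathit{1}} .$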


Collecting these results together, we get 

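$\displaystyle \boxed{ \frac{\partial p}{\partial \boldsymbol{\sigma}} = -\frac{1}{3} \boldsymbol{\mathit{1}} ~;~~ \frac{\partial J_2}{\partial \boldsymbol{\sigma}} = \boldsymbol{s} ~;~~ \frac{\partial J_3}{\partial \boldsymbol{\sigma}} = \boldsymbol{s}^2 - \frac{1}{2} \text{tr}\left(\boldsymbol{s}^2\right) \boldsymbol{\mathit{1}} . }$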




[1] C. Truesdell and W. Noll. The Non-linear Field Theories of Mechanics. Springer-Verlag, New York, 1992.


Amit Acharya:


Indeed, algebra and calculus of the type that never loses its charm.

Another way: The derivatives of the first and second invariants are easily done by realizing that the trace of a second order tensor is its inner product with the identity. For the second invariant, couple this observation with the chain rule.

The third invariant, as well as the derivative of the inverse of an (invertible) tensor, is more interesting. I learnt the following from my advisor Don Carlson.

 Once you know how to do the derivatives of the first and second invariants as above, and then realize that a tensor satisfies its own characteristic equation (Cayley-Hamilton theorem), then taking a derivative of the tensor characteristic equation gives the derivative of the third invariant in terms of the derivatives of the first two invariants, and the derivatives of the tensor itself and its square.

Multiply the tensor characteristic equation by the inverse of the tensor and then take a derivative; then, if one knows the derivatives of the invariants, then the derivative of the inverse falls out.

The next interesting question is how to do all this if the domain of these functions were a nontrivial manifold, e.g. suppose your s was incompressible (i.e. det s = 1). Of course, even with the inverse there is a bit of an issue to wade through, as the set of all invertible second order tensors is not a vector space. One gets away with it because the set is at least open, so differentiation works out: sufficiently small excursions from a point in the domain remain in the domain (a requirement for, say, your definition of the directional derivative to make sense).




Thanks for the pointer.  I'll try to write out the derivation when I get the time and post it here.

Could you elaborate on why "the set of all invertible second order tensors is not a vector space" and how that affects differentiation? An example of a situation where an excursion leaves the manifold would be really helpful.



Amit Acharya:


Consider the invertible tensors I and -I. Their sum is not invertible. Hence, the set of invertible tensors is not a vector space.

Differentiation of a function at x requires the value of the function to be defined at x + h for h sufficiently small. If the domain of the function involved is a vector space this is not a problem, and it is not a problem even if it is not the entire space but only an open set of a vector space.

For geometric understanding, consider a scalar function φ defined on a two dimensional shell surface in 3-d space. Let h be a tangent vector to the shell at x and let the shell not be flat at x. Then φ(x + h) is not strictly defined for every non-zero tangent vector h, however small. For the same reason the definition of the directional derivative also changes to

d/ds φ(f(s))|s=0, where f is a curve on the manifold with f(0) = x. 

If (y^i) is a local parametrization of the shell at x, the derivative at x can be identified with the gradient vector

grad φ = φ_,i e^i, where (e^i) is the dual basis corresponding to the parametrization. Thus the gradient vector is tangent to the shell, whereas if φ were defined everywhere then the gradient vector would in general have a component normal to the shell.
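To make the tangent-versus-normal remark concrete, here is a small sketch on a specific shell chosen purely for illustration: the unit sphere, with φ taken to be the z coordinate extended to all of 3-d space. The names `tangential_gradient`, `ambient_grad`, and the sample point are assumptions of the example. The gradient on the shell is obtained by removing the normal component of the ambient gradient.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def tangential_gradient(grad, normal):
    """Remove the component of grad along the unit normal."""
    gn = dot(grad, normal)
    return [g - gn * n for g, n in zip(grad, normal)]


# Point on the unit sphere; on a unit sphere the outward normal equals x.
x = [0.6, 0.0, 0.8]
normal = x

# phi(x, y, z) = z extended to all of space has the constant gradient (0, 0, 1).
ambient_grad = [0.0, 0.0, 1.0]

normal_part = dot(ambient_grad, normal)           # component leaving the shell
surface_grad = tangential_gradient(ambient_grad, normal)

print(normal_part, surface_grad, dot(surface_grad, normal))
```

The ambient gradient has a non-zero normal component, while the projected vector is orthogonal to the normal, i.e. tangent to the shell.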

Practically, this comes up all the time when dealing with finite rotations in FE implementations of shells, while deriving Jacobians.

Hope this helps.

You wrote: 

For geometric understanding, consider a scalar function φ defined on a
two dimensional shell surface in 3-d space. Let h be a tangent vector
to the shell at x and let the shell not be flat at x. Then φ(x + h) is
not strictly defined for every non-zero tangent vector h, however small.

I think this is because the point (x+h) leaves the surface of the shell. Am I correct?


Amit Acharya:


Yes, you are correct.

Amit Acharya:

Of course, the easiest way to get the derivative of the inverse is to work from S S^-1 = I.
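Differentiating S S^-1 = I along a direction H gives H S^-1 + S d(S^-1) = 0, so that d(S^-1) = -S^-1 H S^-1. The sketch below compares this with a central difference of the inverse. It is an illustration only; the helper names, step size, and sample tensors are assumptions.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def cofactor(a, i, j):
    """Signed cofactor of entry (i, j) of a 3x3 tensor (cyclic index form)."""
    return (a[(i + 1) % 3][(j + 1) % 3] * a[(i + 2) % 3][(j + 2) % 3]
            - a[(i + 1) % 3][(j + 2) % 3] * a[(i + 2) % 3][(j + 1) % 3])


def inverse3(a):
    """Inverse of a 3x3 tensor as the transposed cofactor matrix over det."""
    det = sum(a[0][j] * cofactor(a, 0, j) for j in range(3))
    return [[cofactor(a, j, i) / det for j in range(3)] for i in range(3)]


def perturbed(s, h_dir, t):
    return [[s[i][j] + t * h_dir[i][j] for j in range(3)] for i in range(3)]


S = [[2.0, 1.0, 0.0], [0.5, 3.0, 1.0], [0.0, 1.0, 4.0]]
H = [[0.1, 0.2, 0.0], [0.0, 0.3, 0.1], [0.2, 0.0, 0.1]]  # arbitrary direction

# Central difference of the inverse along H.
t = 1e-5
inv_plus = inverse3(perturbed(S, H, t))
inv_minus = inverse3(perturbed(S, H, -t))
fd = [[(inv_plus[i][j] - inv_minus[i][j]) / (2.0 * t) for j in range(3)]
      for i in range(3)]

# Closed form: -S^-1 . H . S^-1
S_inv = inverse3(S)
prod = matmul(matmul(S_inv, H), S_inv)
exact = [[-prod[i][j] for j in range(3)] for i in range(3)]

max_err = max(abs(fd[i][j] - exact[i][j]) for i in range(3) for j in range(3))
print(max_err)
```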

Attila Kossa:

Dear All!


There is a tricky question: why do we get a different result for the derivative of J3 with respect to s if we start from the definition J3=1/3*Tr[s^3] ?

Solution 1:

given above in this blog entry in eq (32) as:  s^2 - 1/2 Tr[s^2] * 1

Solution 2:

derivative of  1/3*Tr[s^3] with respect to s is simply  s^2

[See: Jirásek, M. and Bazant (2000): Inelastic analysis of structures, p. 653 eq. (D.55); or see
Zienkiewicz & Taylor (2000): The finite element method Volume 2, p. 435 eq. (A.25)]

[See: Itskov (2009): Tensor algebra and tensor analysis for engineers 2nd ed., p. 124 eq. (6.52)]


Best Regards,


s is the deviatoric part of sigma and Tr(s) = 0.

P.S. This comment is irrelevant. Please disregard.

Amit Acharya:

I admit that I have not checked the solution you provide, nor the references you give. But here are a couple of thoughts you may want to comment on:

1) Biswajit's formula above is valid only when the symmetric deviatoric tensor where the derivative is being done is invertible. A general traceless tensor need not satisfy this condition.

2) The other question that just occurred to me (and I have not thought about it much at all) is, if we think of J3 in your case as defined on the space of symmetric deviatoric tensors, which is a vector space (as far as I can think of), then calculating the gradient of a differentiable real valued function on this space should yield an element of the space itself - i.e. a symmetric deviatoric tensor. So as you say, s^2 need not be deviatoric, so what are you thinking of as the domain of the function J3 you consider - symmetric-deviatoric tensors, or only symmetric tensors, or invertible-symmetric-deviatoric tensors?

I note here that if we were to consider the function J3 to be defined on the domain of invertible, symmetric deviatoric tensors, the formulae that both you and Biswajit write will have to be corrected to remove their hydrostatic parts to obtain the gradient, and then they would yield identical formulae.


Agree with your caveat (1).  It's been a long time since I thought about these things and I can't recall where I first saw that result.  But it did look strange and hence the contortions to get to it.

(2) is quite a bit deeper and I hadn't thought in those terms.   That's worth more examination from others in this forum.


-- Biswajit



Attila Kossa:

I have written out detailed derivation steps for my question.

I think method 2 is the correct solution for the derivative of J3.

But what about Method 1?




Attila Kossa:

Dear Biswajit,


when you compute d(J3)/d(sigma11), I think you forgot to compute d(p)/d(sigma11). p itself obviously depends on sigma11. Therefore a term 1/3 is missing in that line, I think.


The point of the above was to show one way in which brute force can be used to get solution 1. 

We have assumed dJ3/dp = 0.  But, as Amit had pointed out somewhere else, dJ3/dp derivative is ill defined because J3 can take an infinite number of values for a given p. 

Option 4, dJ3/ds = Dev[s^2], gives you the tracefree solution that Amit points to, i.e, the solution that is tangent to the DevSymm manifold.

-- Biswajit

Gayan Aravinda:

I agree with Attila. 'p' itself is a function of sigma11 and Biswajit has not taken that into account. I have derived the derivatives of the third invariant of the deviatoric stress tensor with respect to the direct terms and also with respect to the indirect terms of the deviatoric stress tensor, but those expressions are not similar to the above expressions. Since p is also a function of the direct stress terms, the derivatives of the above expression should be treated as derivatives of a product.


Another possibility :)

[Image: J3 derivative, option 4]

Amit Acharya:

What's going on here is interesting indeed.

Without loss of generality, let's consider your (1) to be valid for symmetric second order tensors and (2) to be valid for symmetric, invertible second order tensors.

Consider your eqn. 11.

First equality:

The first equality there holds if you view J_3 and tr(s^3) to be defined as functions *ONLY* on the space of symmetric deviatoric tensors. If you go off the manifold defined by tr(s) = 0, then that relation does not hold.

Second equality:

However, the second equality is written considering tr(.) as a function on the space of symmetric tensors, and in that space that is the correct definition of the derivative of tr(s^3) following (1). It can be shown, by being careful about the meanings of the derivative and the gradient of a real valued function on a vector space, that the gradient of tr(s^3) on the manifold defined by tr(s) = 0 can be written down by subtracting from the gradient of the same function on the whole space of symmetric second order tensors its projection on the normal to the constraint manifold. All this amounts to simply subtracting off the projection of s^2 on I/(sqrt 3), so that the correct result 12 should be

dJ_3/ds = s^2 - (1/3) tr(s^2) I ---- (**)

This result holds for all symmetric deviatoric tensors, whether invertible or not.

Now let's go to your Method 1: The definition 5 and formulae 2 and 7 require J_3 to be defined only on the space of invertible symmetric tensors (no deviatoric requirement). The gradient of J_3 on the space of invertible symmetric tensors is indeed given by (7): the domain is not a vector space but an open set of the space of symmetric tensors, and this suffices. If you now ask what the gradient of J_3 should be on the space of invertible, deviatoric symmetric tensors, then again you take the formula in 7 and subtract off its component along I/(sqrt 3), and you get the eqn (**) above, by using your (8) and (9), which are valid for symmetric deviatoric tensors.

So d J_3/ds  on the space of symmetric, invertible deviatoric tensors is the same no matter what formula you use. Of course, your corrected method 2 is more general because it gives the formula on the whole space of symmetric deviatoric tensors.

So I would summarize by saying that what you are looking for when you write dJ_3/ds is the gradient of the function det(s) on the (linear) manifold/subspace of the vector space of second order tensors defined by tr(s) = 0. This requires a little care, but it all works out consistently.
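The reconciliation described above can be checked numerically. In the sketch below, which is only an illustration with assumed helper names and an arbitrary symmetric, traceless, invertible sample tensor, the gradient of det(s) on the full space (the cofactor matrix, equal to s^2 - 1/2 tr(s^2) I when tr(s) = 0) and the gradient of 1/3 tr(s^3) on the full space (s^2) differ, but their deviatoric parts coincide.

```python
def trace(a):
    return a[0][0] + a[1][1] + a[2][2]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def cofactor_matrix(a):
    """Cofactor matrix, i.e. det(a) * inverse(a)^T, the gradient of det."""
    return [[a[(i + 1) % 3][(j + 1) % 3] * a[(i + 2) % 3][(j + 2) % 3]
             - a[(i + 1) % 3][(j + 2) % 3] * a[(i + 2) % 3][(j + 1) % 3]
             for j in range(3)] for i in range(3)]


def deviator(a):
    """Projection onto traceless tensors: a - (tr a / 3) I."""
    mean = trace(a) / 3.0
    return [[a[i][j] - (mean if i == j else 0.0) for j in range(3)]
            for i in range(3)]


def max_abs_diff(x, y):
    return max(abs(x[i][j] - y[i][j]) for i in range(3) for j in range(3))


# Symmetric, traceless, invertible sample tensor (det = -5.34).
s = [[1.0, 0.5, 0.0], [0.5, 2.0, 0.3], [0.0, 0.3, -3.0]]
ss = matmul(s, s)

grad_det = cofactor_matrix(s)   # Method 1: gradient of det on the full space
grad_tr3 = ss                   # Method 2: gradient of (1/3) tr(s^3); s symmetric
method1 = [[ss[i][j] - (0.5 * trace(ss) if i == j else 0.0)
            for j in range(3)] for i in range(3)]

gap_formula = max_abs_diff(grad_det, method1)      # cofactor vs s^2 - 1/2 tr(s^2) I
gap_full = max_abs_diff(grad_det, grad_tr3)        # the two full-space gradients
gap_projected = max_abs_diff(deviator(grad_det), deviator(grad_tr3))
print(gap_formula, gap_full, gap_projected)
```

The full-space gradients differ by a multiple of the identity, which is exactly the part removed by projecting onto the subspace tr(s) = 0.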

I hope this helps.


Dear Prof. Acharya,

The expression 

dJ_3/ds = s^2 - (1/3)(s^2:I)I

actually means that we are computing the tensor tangent to the constraint manifold. This is the key difference between the derivations in previous posts. But the question is how we can compute the normal to this manifold (or what the equation for this manifold is), and why the derivative of J_3 should be tangent to the constraint manifold (in some sense a covariant derivative on the constraint manifold).



Amit Acharya:

Your remark about the derivative being tangent to the constraint manifold is correct - see my comment entitled "Differentiation" earlier on in this post (from a long time ago).

The equation of the manifold is tr(s) = 0 (it is actually a linear subspace of the space of symmetric second order tensors) - for the invertible part, one has to do a little song and dance but things work out as if you did not worry about the invertible part.

The way you calculate the normal is as one learns in advanced calculus: you calculate the gradient of the function that defines the level set. The Frobenius norm on the space of second order tensors (i.e. the trace inner product metric) is assumed in this calculation, as it is in forming the unit normal.

Look at my comment on "differentiation" above and let me know if that clarifies your question about why the 'derivative' of J_3 on a manifold should be tangent to the constraint manifold. 

Dear Prof. Acharya,

If I'm correct, this means that we want to keep the derivative on the constraint manifold; if not, the constraint would be meaningless (there would be no difference between derivatives of deviatoric symmetric invertible tensors and derivatives of symmetric invertible tensors). But as a consequence of this constraint, all the derivatives of deviatoric symmetric invertible tensors should be traceless (such as dJ_3/ds in our discussions). Should that be the case?


Amit Acharya:

In everything you say above, I suspect you mean derivatives of real valued functions of (deviatoric) symmetric tensors and not derivatives of the tensors themselves. For the answer to the question you ask, see my comment on Feb. 26 in this post entitled "Re:J3 derivative", item 2).

The reason that the gradient of a differentiable real valued function on a finite dimensional vector space with an inner product has to be an element of the space is not because one 'wants' to have it that way; it is a theorem. At this point, it may also be a good idea to realize (and if you are not familiar with these notions, to look up a good book on advanced calculus, e.g. Fleming - Functions of Several Variables) that the derivative of a real valued function on a vector space is a linear transformation from the space to the reals (often called a linear functional). Thus, it is not a member of the space itself. However, there is a very important representation theorem for linear functionals (the proof of which is easy in the finite dimensional case) that shows that every linear functional L on the vector space can be represented by a *unique* element l *in the space itself* such that

L(h) = l.h for all h in the vector space. If L was the derivative of the real valued fn. we are thinking of, l is the gradient of the function.

In our discussion, det(s) and (1/3)tr(s^3) are equal real-valued functions on the set of deviatoric symmetric tensors. Thus it only makes sense to ask that their derivatives, or their gradients in the above sense, be equal in this space. But the gradients are then members of the space of deviatoric symmetric tensors.

From the above definition you can now see why, if one has the gradient of the 'same' function on the whole space, subtracting off the projection of the gradient on the normal to the subspace will give the right answer for the unique gradient of the restriction of the function on the subspace.

Dear Prof. Acharya,

In my previous post I actually meant derivatives of real valued functions of (deviatoric) symmetric tensors (such as invariants of that tensor, as in our discussion) and not, of course, tensor valued functions. By the way, your comments concerning the gradient and derivatives of real valued functions of (deviatoric) symmetric tensors are very helpful.



Attila Kossa:

My question concerns a real-valued, symmetric, invertible second order tensor s.

Only one solution should exist for the derivative of J3 with respect to s.


In this general case, J2 is not zero. However, Method1 and Method2 are the same only if J2=0.

But I am interested in the case when J2 ≠ 0 and J3 ≠ 0.


Consequently, I am still surprised that two alternatives for this derivative exist in textbooks.

Where is the "tricky" step in the derivations causing two solutions?





Amit Acharya:

I gave you the unique solution for the gradient of J_3 as a function of symmetric deviatoric invertible tensors, as well as symmetric deviatoric tensors. J2 does not have to be zero for that answer to be valid.

I haven't checked what the textbooks exactly state for the gradient of J2 on the space of symmetric deviatoric invertible tensors. However, where your derivation goes wrong is in interpreting your first equality in (11) as valid on the whole space of symmetric tensors. This is not true. The second equality in (11) is true in the whole space. If you restrict attention to the subset where J_3 is indeed equal to 1/3tr(s^3), i.e. deviatoric tensors, on that set the two 'derivatives' match indeed, as I showed.

Maybe it is easier to see what is going on in a simple setting: take two real valued functions defined on the x-y plane. Let these two functions agree on a straight line in the x-y plane, so think of two arbitrary surfaces that intersect on a straight line. Would it be reasonable to expect that the gradients of these two functions should be equal everywhere just because their values match on the line? Even on the line, would it be reasonable to expect the 'derivative' of the functions to be equal in directions not along the line on which they have common values? Of course, if you restricted attention to the functions only along the line, then it is eminently reasonable to expect that the 'derivatives' of the restrictions of the two functions on the line should be exactly equal, and indeed this is what happens.
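A tiny numerical version of this picture, with two functions invented purely for illustration: f(x, y) = x^2 + y and g(x, y) = x^2 + 3y agree on the line y = 0. The sketch estimates both gradients at a point on that line by central differences; the names `f`, `g`, `gradient`, and the sample point are assumptions of the example.

```python
def f(x, y):
    return x * x + y


def g(x, y):
    return x * x + 3.0 * y


def gradient(func, x, y, h=1e-6):
    """Central-difference gradient (d/dx, d/dy) of func at (x, y)."""
    dfdx = (func(x + h, y) - func(x - h, y)) / (2.0 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2.0 * h)
    return dfdx, dfdy


# A point on the line y = 0, where the two functions take the same value.
x0, y0 = 1.0, 0.0
grad_f = gradient(f, x0, y0)
grad_g = gradient(g, x0, y0)

# Component along the line (x direction) and across it (y direction).
along_gap = abs(grad_f[0] - grad_g[0])
across_gap = abs(grad_f[1] - grad_g[1])
print(grad_f, grad_g, along_gap, across_gap)
```

The derivatives along the line agree, while the components across the line do not, which mirrors the two full-space gradients of J_3 differing only in the direction normal to tr(s) = 0.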

Makes sense?

Attila Kossa:

It is obvious.

I didn't write Tr[s].

Tr[s^2] and Tr[s^3] are not 0.

