Chain Rule Summary

\begin{align*}
\Big[f\big(g(x)\big) \Big]' &= f'\big(g(x)\big)\cdot g'(x) \\[8px]
&= [\text{derivative of the outer function, evaluated at the inner function}] \\[8px]
&\qquad \cdot [\text{derivative of the inner function}]
\end{align*}

\[\dfrac{dy}{dx} = \dfrac{dy}{du} \cdot \dfrac{du}{dx}\]

\[\dfrac{df}{dx} = \left[\dfrac{df}{d\text{(stuff)}}\text{, with the same stuff inside} \right] \times \dfrac{d}{dx}\text{(stuff)}\]

Many functions require using the Chain Rule more than once, since there are multiple embedded functions — deeper layers, or nests, in the compound function. To find the derivative of such a function, we simply apply the Chain Rule to *every* function nested inside, linking the chain of derivatives together.

An example to start

As an example, we’ll use one of the compound functions from a few screens back: $s(x) = \dfrac{1}{1+e^{-x}}.$ The table shows how we broke the function into its *three* constituent parts.

| | Further Inside | Inside | Outside |
|---|---|---|---|
| description | make the input negative | raise $e$ to the power of the input, and add 1 | take the reciprocal of the input |
| boxes | $-\Box$ | $1+e^{\Box}$ | $\dfrac{1}{\Box} = \Box^{-1}$ |
| function notation | $w= h(x) = -x$ | $u = g(w) = 1+e^w$ | $f(u) = \dfrac{1}{u} =(u)^{-1}$ |
| derivative of function | $h'(x) = -1$ | $g'(w) = e^w$ | $f'(u) = -\dfrac{1}{u^2} = -(u)^{-2}$ |

The figure below shows the same decomposition, now as links in the chain. Notice that below each written-out component function near the bottom is the derivative of that function. You probably wouldn’t write out each of these as we have here, but we think it’s helpful to be able to *see* how we’re stringing the pieces of the chain together when we use the Chain Rule.

Applied to a function with two nested inner functions, the Chain Rule states

\begin{align*}
\Bigg[ f\Bigg(g\Big(h(x)\Big)\Bigg) \Bigg]' &= f'\Bigg(g\Big(h(x)\Big)\Bigg) \cdot \Bigg[g\Big(h(x)\Big) \Bigg]'\\[8px]
&= f'\Bigg(g\Big(h(x)\Big)\Bigg) \cdot g'\Big(h(x)\Big) \cdot h'(x) \\[5px]
&=\text{[derivative of the outside function, evaluated at the inside function] } \\[5px]
&\qquad \times \text{ [derivative of the inside function, evaluated at the further inside function]} \\[5px]
&\qquad \quad \times \text{ [derivative of the further inside function]}
\end{align*}

You can see how the Chain Rule is chaining together the pieces here: the first line applies the Chain Rule to the outer function $f,$ saying its derivative equals $f'$ evaluated at its inner function (which happens to be the composite function $g\big(h(x)\big)$ itself), multiplied by the derivative of that inner function. The second line then writes out that latter derivative: $g'$ evaluated at its inner function $h(x),$ multiplied by the derivative of $h.$
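To see the chaining numerically, here is a minimal Python sketch. It is our own illustration, not part of the lesson's method, and the helper name `chain3` is hypothetical. It multiplies the three links together, evaluating each derivative at the right place, using the pieces of $s(x)$ from the table above.

```python
import math

def chain3(f_prime, g, g_prime, h, h_prime, x):
    """Derivative of f(g(h(x))) at x, built link by link."""
    w = h(x)  # value of the further inside function
    u = g(w)  # value of the inside function
    # f' at the inside function, times g' at the further inside function, times h'
    return f_prime(u) * g_prime(w) * h_prime(x)

# Pieces of s(x) = 1 / (1 + e^{-x}) from the table above
slope = chain3(
    f_prime=lambda u: -1 / u**2,   # f(u) = 1/u
    g=lambda w: 1 + math.exp(w),
    g_prime=lambda w: math.exp(w),
    h=lambda x: -x,
    h_prime=lambda x: -1.0,
    x=0.0,
)
print(slope)  # 0.25
```

At $x = 0$ the three factors are $-\tfrac{1}{4},$ $1,$ and $-1,$ whose product is $0.25.$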

If that seems abstract, let’s make it concrete by finding the derivative of this function.

Differentiate $f(x) = \dfrac{1}{1+e^{-x}}.$

*Solution.*

We’ll once again solve this two ways: first formally, and then informally, the way you would probably reason through it quickly, essentially just writing down the answer as you go.

*Solution 1* (more formal)

We have the outside function $f(u) = \dfrac{1}{u},$ the inside function $u = g(w) = 1 + e^w,$ and the further inside function $w = h(x) = -x.$

Their derivatives are $f'(u) = -\dfrac{1}{u^2},$ $g'(w) = e^w,$ and $h'(x) = -1.$

Then applying the Chain Rule gives

\begin{align*}
\Bigg[ f\Bigg(g\Big(h(x)\Big)\Bigg) \Bigg]' &= f'\Bigg(g\Big(h(x)\Big)\Bigg) \cdot g'\Big(h(x)\Big) \cdot h'(x) \\[8px]
&= -\frac{1}{u^2} \cdot e^w \cdot (-1) \\[8px]
&= -\frac{1}{\left( 1+e^{-x}\right)^2} \cdot e^{-x} \cdot (-1) \quad \cmark \\[8px]
&= \frac{e^{-x}}{\left( 1+e^{-x}\right)^2} \quad \cmark
\end{align*}

While the final line shows the answer in its simplified form, we suggest focusing on the preceding line (also with a green checkmark) since — again with practice — you can *see* how the Chain Rule has been applied to reach this result, starting with the outer function and then working your way inward. The following ‘informal reasoning’ solution makes this even more apparent.

*Solution 2* (less formal, the way you’re likely to come to reason quickly)

The overall function is $\Big[\text{some stuff}\Big]^{-1},$ where you’re holding in your head that this “stuff” = $1 + e^{-x}.$ So its derivative is

\begin{align*}
f'(x) &= -\frac{1}{(\text{that same stuff})^2} \times \text{the derivative of that stuff} \\[8px]
&= -\frac{1}{\left( 1+e^{-x}\right)^2} \times \text{the derivative of that stuff}
\end{align*}

Now “that stuff” is $1 + e^{-x},$ or “one plus *e* to some new stuff,” where “new stuff” = $-x.$

So its derivative is $e^{\text{this new stuff}},$ times the derivative of this new stuff. Hence

\[f'(x) = -\frac{1}{\left( 1+e^{-x}\right)^2} \times e^{-x} \times \text{the derivative of }-x\]

And that last derivative, as you probably immediately thought, is $-1$:

\[f'(x) = -\frac{1}{\left( 1+e^{-x}\right)^2} \times e^{-x} \times (-1) \quad \cmark\]
We’re not going to bother simplifying this time, because we really want the focus on the chaining in the Chain Rule here: can you see in that last line where each term comes from? Can you imagine that, with practice, you’ll simply write down this result, without having to go through all of the formal reasoning? (If you can’t imagine that yet, no worries: your abilities will develop with practice, of course!)
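If you would like reassurance that the unsimplified product really is the derivative, one option is to compare it against a numerical slope. The sketch below is our own check, not part of the solution; the helper `central_difference` and its step size are assumptions chosen for illustration.

```python
import math

def f(x):
    return 1 / (1 + math.exp(-x))

def f_prime(x):
    # The unsimplified Chain Rule product, one factor per link
    return -1 / (1 + math.exp(-x))**2 * math.exp(-x) * (-1)

def central_difference(func, x, step=1e-6):
    # Slope of the secant line through two nearby points
    return (func(x + step) - func(x - step)) / (2 * step)

for x in (-2.0, 0.0, 1.5):
    print(x, f_prime(x), central_difference(f, x))
```

The two printed columns should agree to several decimal places at each $x.$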

Students often request “harder examples,” so let’s find the derivative of a function composed of 6(!) functions. You can step through finding the derivative of each component piece, but really we hope that, with practice at least, you’ll be able to start with the outside function, work your way inward, and write this answer down in one line. Indeed, we’re putting the answer right at the top, so you can imagine doing just that: writing down the answer immediately, going term-by-term as you work your way further and further inside the nests of the function.

Show that the derivative of

\begin{align*}
y(x) &= \frac{1}{\sin(3\sqrt{e^{-x}}-1)} \\[8px]
&= \big[\sin(3\sqrt{e^{-x}}-1) \big]^{-1}
\end{align*}

is

\[\dfrac{dy}{dx} = -\left[ \sin(3(e^{-x})^{1/2}-1)\right]^{-2} \cdot \cos(3(e^{-x})^{1/2}-1) \cdot 3 \cdot \frac{1}{2} (e^{-x})^{-1/2} \cdot e^{-x} \cdot (-1)\]

*Solution.*

Let’s first show a picture of the chain that is this compound function. You probably wouldn’t ever create this for yourself, but it’s helpful for us to visualize together as we go step-by-step.

When presented with a more complicated function like *y,* we find it easier to rewrite reciprocals and roots in terms of negative and rational powers:

\[y(x)= \frac{1}{\sin(3\sqrt{e^{-x}}-1)} = \left[ \sin(3(e^{-x})^{1/2}-1)\right]^{-1}\]

This makes it easier to see where we need to use the Power Rule, and gives us more of a visual cue for when we hit a new interior function.
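One way to write out the six links, working from the innermost function outward, is sketched here. The names $f_1$ through $f_6$ are labels we are assuming, chosen to match the derivative labels used in the summary of the answer.

\begin{align*}
f_1(x) &= -x & \dfrac{df_1}{dx} &= -1 \\[8px]
f_2(f_1) &= e^{f_1} & \dfrac{df_2}{df_1} &= e^{f_1} = e^{-x} \\[8px]
f_3(f_2) &= (f_2)^{1/2} & \dfrac{df_3}{df_2} &= \frac{1}{2}(f_2)^{-1/2} = \frac{1}{2}(e^{-x})^{-1/2} \\[8px]
f_4(f_3) &= 3f_3 - 1 & \dfrac{df_4}{df_3} &= 3 \\[8px]
f_5(f_4) &= \sin(f_4) & \dfrac{df_5}{df_4} &= \cos(f_4) = \cos(3(e^{-x})^{1/2}-1) \\[8px]
f_6(f_5) &= (f_5)^{-1} & \dfrac{df_6}{df_5} &= -(f_5)^{-2} = -\left[\sin(3(e^{-x})^{1/2}-1)\right]^{-2}
\end{align*}

Multiplying the right-hand column from the bottom up gives the six factors of $\dfrac{dy}{dx},$ in the same outside-to-inside order as the stated answer.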

Let’s summarize our answer, showing how each term arose:

\[\dfrac{dy}{dx} = \overbrace{-\left[ \sin(3(e^{-x})^{1/2}-1)\right]^{-2}}^{\dfrac{df_6}{df_5}} \cdot \overbrace{\cos(3(e^{-x})^{1/2}-1)}^{\dfrac{df_5}{df_4} } \cdot \overbrace{3}^{\dfrac{df_4}{df_3} } \cdot \overbrace{\frac{1}{2} (e^{-x})^{-1/2}}^{\dfrac{df_3}{df_2} } \cdot \overbrace{e^{-x}}^{\dfrac{df_2}{df_1} } \cdot \overbrace{(-1)}^{\dfrac{df_1}{dx} }\]

If you wrote your answer as we did at the top of this Example (repeated here, with added labels like $\dfrac{df_6}{df_5}$), you have completely taken the derivative, so celebrate! Furthermore, your grader is probably scanning for a line like this (without the labels), which clearly shows how you applied the Chain Rule to every inside function. If you got to this point, your Calculus is perfect; most of us are likely to introduce an algebraic error if we then go on to simplify, especially when rushing on an exam. So check with your teacher about what an acceptable “final form” is for your answer. If you can box the result above and receive full credit, we recommend doing so. (And again, celebrate your growing Calculus skills!)

For completeness, here is the same answer in simplified form:

\[\dfrac{dy}{dx} = \frac{3\cos(3\sqrt{e^{-x}}-1)}{2\sqrt{e^{x}}\sin^2(3\sqrt{e^{-x}}-1)} \]
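As a sanity check on the algebra, the short Python sketch below (our own illustration; the names `links`, `chain_derivative`, and `simplified` are hypothetical) walks the chain from the innermost function outward, multiplying one derivative per link, and compares the product with the simplified form.

```python
import math

# Each link is (function, derivative), listed from innermost to outermost.
links = [
    (lambda x: -x,          lambda x: -1.0),             # negate the input
    (lambda v: math.exp(v), lambda v: math.exp(v)),      # e to the input
    (lambda v: v ** 0.5,    lambda v: 0.5 * v ** -0.5),  # square root
    (lambda v: 3 * v - 1,   lambda v: 3.0),              # triple, subtract 1
    (lambda v: math.sin(v), lambda v: math.cos(v)),      # sine
    (lambda v: v ** -1,     lambda v: -(v ** -2)),       # reciprocal
]

def chain_derivative(links, x):
    """Multiply each link's derivative, evaluated at that link's input."""
    value, product = x, 1.0
    for func, deriv in links:
        product *= deriv(value)  # derivative at the current input
        value = func(value)      # pass the output on to the next link
    return product

def simplified(x):
    inner = 3 * math.sqrt(math.exp(-x)) - 1
    return 3 * math.cos(inner) / (2 * math.sqrt(math.exp(x)) * math.sin(inner) ** 2)

x = 0.3
print(chain_derivative(links, x), simplified(x))
```

The two printed values should match up to floating-point rounding, which suggests the simplification introduced no sign or exponent slip.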

FAQ: When do I STOP applying the Chain Rule?

A student, Kiran, says:

“Now that I’m thinking about taking derivatives more layers down in the function, earlier why didn’t we do

\[\dfrac{d}{dx}\sin(2x) \overbrace{=}^{?} \cos(2x) \cdot (2) \cdot \cancelto{0}{\dfrac{d}{dx}(2)} \overbrace{=}^{?} 0\,?!?\]

I mean, if I apply the Chain Rule to every layer of the function it seems like I should take the derivative of that last “2,” but that makes the whole thing zero! That doesn’t seem right, or match what we did before. How do I know where to stop?!?”

Great question! Kiran is certainly right that he’s gone a step too far in taking that last derivative. If, when using the Chain Rule, you suddenly find yourself taking the derivative of a constant so that the whole thing goes to zero, as in Kiran’s example, then you too have gone a step too far.

The reason is that the innermost *function* in $\sin(2x)$ is $2x.$ Let’s think about the chain for this compound function:

Creating this image (in our heads, if nothing else) shows where the calculation above went wrong: there is no inner function that is $f_3 = 2,$ so there is no third term in the Chain Rule statement. Instead,

\[\dfrac{d}{dx}\sin(2x) = \overbrace{\cos(2x)}^{\dfrac{df_2}{df_1}} \cdot \overbrace{2}^{\dfrac{df_1}{dx}} \quad \cmark\]
That is, we stop when we take the derivative of the innermost function, which is $2x.$
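Written out as links (a short sketch using the same $f_1,$ $f_2$ labels as the equation above), the chain has only two pieces:

\begin{align*}
f_1(x) &= 2x & \dfrac{df_1}{dx} &= 2 \\[8px]
f_2(f_1) &= \sin(f_1) & \dfrac{df_2}{df_1} &= \cos(f_1) = \cos(2x)
\end{align*}

The constant 2 is a coefficient inside $f_1,$ not a function of its own, so there is nothing further to differentiate.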

(We love that Kiran is thinking all of this through!)

And now, time to practice applying the Chain Rule to more complex functions!

On our final screen focused on the Chain Rule we’ll use the rule to develop some further derivatives and facts that are useful to have in mind as we proceed. Of course this gives you yet more practice at using it, too.

What questions do you have about what’s on this screen, or other derivative problems you’re working on? Please post on the Forum and let us know!