
Q: Is there an intuitive proof for the chain rule?



Physicist: The chain rule is a tool from calculus that says that if you have one function “nested” inside of another, $f(g(x))$, then the derivative of the whole mess is given by $\frac{d}{dx} f(g(x)) = f'(g(x))\, g'(x)$. There are a number of ways to prove this, but one of the more enlightening ways to look at the chain rule (without rigorously proving it) is to look at what happens to any function, $f(x)$, when you muck about with the argument (the “x” part).

When you multiply the argument by some amount, the graph of the function gets squished horizontally by the same amount. If you, for example, plug in “x=3” to $f(2x)$, that’s exactly the same as plugging in “x=6” to $f(x)$. For $f(2x)$, everything happens at half the original x value.

However, while $f(2x)$ when x=3 is the same as $f(x)$ when x=6, the same is not true of their slopes. The slope (derivative) is “rise over run” and the run just became half as long, so the slope just got twice as big. Scrunching a graph makes the slope steeper (see picture above).

So, the slope of $f(2x)$ at x=3 is actually double the slope of $f(x)$ at x=6. You can write this in general as $\frac{d}{dx} f(2x) = 2 f'(2x)$.
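If you'd like to see the “twice as steep” claim with actual numbers, here is a minimal sketch. It assumes $f$ is the sine function (chosen only for illustration; the original post doesn't pick a particular $f$), and the helper names `slope` and `squished` are made up for this example. The slopes are estimated with a small finite difference, so they agree only approximately.

```python
import math

def slope(func, x, h=1e-6):
    """Central-difference estimate of the slope of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Assumption: f is sin, purely as an example of a smooth function.
f = math.sin

def squished(x):
    """The scrunched version of f, i.e. f(2x)."""
    return f(2 * x)

# Same height: f(2x) at x=3 matches f(x) at x=6.
value_match = math.isclose(squished(3), f(6))

# Different slopes: the scrunched graph should be about twice as steep.
slope_squished = slope(squished, 3)
slope_original = slope(f, 6)

print("values match:", value_match)
print("slope of f(2x) at x=3:", slope_squished)
print("2 * slope of f(x) at x=6:", 2 * slope_original)
```

The last two printed numbers should agree to several decimal places, which is the $\frac{d}{dx} f(2x) = 2 f'(2x)$ statement in numerical form.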

Here’s the calculus leap: replacing the x in $f(x)$ with 2x clearly means that you’re running through the function twice as fast, so when you take the derivative you just multiply by two to deal with the scrunching. But, if you instead replace x with a more complicated function, $g(x)$, then the amount of speed up and slow down depends on $g$. If $g$ has a slope of 2 at some point, then it’s acting like 2x and you get the same “times two” slope. If it’s got a slope of 3 or 1/5, then the slope of $f(g(x))$ at the corresponding point will be multiplied by 3 or 1/5 respectively.

So, to find the slope of $f(g(x))$, which is just the derivative, you first find what the slope of $f$ would be at the appropriate x value, $f'(g(x))$, and then multiply by how much $g$ is speeding things up or slowing things down (scrunching or expanding). The slope of $g$ is just the derivative, so you’re multiplying by $g'(x)$.
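The recipe above (slope of the outer function at the right spot, times the slope of the inner function) can be checked numerically. This is only a sketch under assumptions: the outer function is taken to be sine and the inner one $x^2$, neither of which comes from the original post, and the names `slope`, `f`, `g`, and `composed` are illustrative.

```python
import math

def slope(func, x, h=1e-6):
    """Central-difference estimate of the slope of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Assumed example functions (not from the original post).
def f(u):
    return math.sin(u)   # outer function

def g(x):
    return x ** 2        # inner function

def composed(x):
    return f(g(x))       # the nested function f(g(x))

x0 = 1.5

# Slope of the whole nested function, measured directly.
direct = slope(composed, x0)

# Slope of f at the point g(x0), times how fast g is moving at x0.
recipe = slope(f, g(x0)) * slope(g, x0)

print("direct slope of f(g(x)):", direct)
print("f'(g(x)) * g'(x):       ", recipe)
```

The two numbers should match closely. Swapping in other smooth functions for `f` and `g` is a quick way to convince yourself the multiplication isn't a coincidence.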

Boom! Chain rule: $\frac{d}{dx} f(g(x)) = f'(g(x))\, g'(x)$

It’s worth pointing out that, like all calc rules, it doesn’t matter that this rule only talks about two functions. If you have something like $f(g(h(x)))$, then you can treat $g(h(x))$ as one function, and you’ll find that after running through the chain rule once you’ll be faced with another, simpler, chain rule problem:

$\frac{d}{dx} f(g(h(x))) = f'(g(h(x))) \cdot \frac{d}{dx} g(h(x)) = f'(g(h(x)))\, g'(h(x))\, h'(x)$
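As a rough numerical check of the three-function case, here is a sketch that applies the chain rule twice and compares the result with a direct slope estimate. The specific functions (exponential, sine, and $x^2$) are assumptions picked for illustration, and all the names in the code are hypothetical.

```python
import math

def slope(func, x, h=1e-6):
    """Central-difference estimate of the slope of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Assumed example functions (not from the original post).
def f(u):
    return math.exp(u)   # outermost function

def g(v):
    return math.sin(v)   # middle function

def h_inner(x):
    return x ** 2        # innermost function

def nested(x):
    return f(g(h_inner(x)))

x0 = 0.7

# Direct estimate of the slope of f(g(h(x))).
direct = slope(nested, x0)

# Chain rule applied twice: f'(g(h(x))) * g'(h(x)) * h'(x).
inner_value = h_inner(x0)
middle_value = g(inner_value)
chained = slope(f, middle_value) * slope(g, inner_value) * slope(h_inner, x0)

print("direct slope:      ", direct)
print("product of slopes: ", chained)
```

Each extra layer of nesting just contributes one more slope to the product.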