A dive into the paradoxical nature of the Chain rule

Richard Xu

The chain rule, as many of us know, is just the formula “dy/dx = dy/du × du/dx”, a simple way for us to differentiate a composite function. On the surface it appears quite logical: the “du” parts of the two derivatives simply cross cancel and we are left with the fraction “dy/dx”. For a while, that is what I personally believed as well: that this cancellation was a simple and concise proof of the chain rule. However, if we take a moment to think about this reasoning more deeply, we realise that derivatives aren’t fractions. For example, “dy/dx” isn’t really a fraction but instead “d/dx (y)”, the operator “d/dx” applied to “y”. Deconstructed, it means “y” differentiated with respect to “x” or, more intuitively, the instantaneous rate of change of “y” with respect to “x”. In other words, “d/dx” is a piece of notation, not a quotient. Why does the chain rule work then?

Brief run-through of how the chain rule works:

“dy/dx = dy/du × du/dx”, where “y” is a function of “u” and “u” is a function of “x”

OR

if “y = f(g(x))”, then “dy/dx = f′(g(x)) × g′(x)”

A more intuitive way to understand the chain rule is to imagine these functions in a real-life scenario. Let’s say that in a race Adam is 3 times faster than Belial and Belial is 5 times faster than Charlie. Then it can be concluded that Adam is 15 times faster than Charlie. This example can be distilled down into this:

“dA/dC = dA/dB × dB/dC = 3 × 5 = 15”, where “A”, “B” and “C” are the distances covered by Adam, Belial and Charlie.

Delving into the origin of differential calculus and the chain rule:

There is a major debate about whether calculus was first discovered by Newton or by Leibniz. However, since our focus is the chain rule, I will explore Leibniz’ notation, but feel free to research this controversy further on your own. When Leibniz first came up with his notation for differentiation, he believed that “dy/dx” was in fact a quotient, more commonly known as a fraction.
This was because he defined the derivative as the infinitesimal (infinitely small) change in the value of “y” caused by an infinitesimal change in the value of “x”, divided by that infinitesimal change in the value of “x”. Based on this idea, it would be natural for the chain rule to work: both derivatives, “dy/du” and “du/dx”, really are fractions and can therefore cross cancel.

Leibniz’ definition makes intuitive sense. However, the actual definition of the derivative, the instantaneous rate of change of “y” with respect to “x”, says that differentiation lets us find the rate of change at any single point. This is where Leibniz’ definition falls apart: no matter how small the distance between two points is, there is still a distance. However, there is a way to make Leibniz’ definition of differentiation work, which I will go through later.
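
The idea of a distance that shrinks but never disappears can be tried out numerically. Below is a minimal sketch in Python. The functions “y = u²” and “u = 3x + 1”, the point “x = 2” and the helper names are all my own assumptions, chosen only for illustration. It measures the slope between two points that are a step “h” apart, and compares the slope of the whole composite function with the product of the two smaller slopes.

```python
def difference_quotient(f, x, h):
    """Slope of the straight line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


# Hypothetical example, not from the essay: y = u**2 with u = 3*x + 1.
def u_of_x(x):
    return 3 * x + 1


def y_of_u(u):
    return u ** 2


def y_of_x(x):
    return y_of_u(u_of_x(x))


def compare(x, h):
    """Return (change in y / change in x, product of the two smaller slopes)."""
    delta_u = u_of_x(x + h) - u_of_x(x)  # the change in u caused by the step h
    dy_dx = difference_quotient(y_of_x, x, h)
    dy_du = difference_quotient(y_of_u, u_of_x(x), delta_u)
    du_dx = difference_quotient(u_of_x, x, h)
    return dy_dx, dy_du * du_dx


x0 = 2.0
exact = 2 * u_of_x(x0) * 3  # chain rule by hand: dy/du = 2u, du/dx = 3, so 42 at x = 2

for h in (1.0, 0.1, 0.001, 1e-6):
    whole, product = compare(x0, h)
    print(f"h={h:g}  slope={whole:.6f}  product={product:.6f}  gap={whole - exact:.6f}")
```

For every step size the product of the two smaller slopes should agree with the slope of the whole function up to rounding, because for a finite step these really are fractions and the change in “u” does cancel. The gap to the exact value shrinks as “h” shrinks, but “h” is never actually zero, which is exactly the distance that Leibniz’ definition leaves behind.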