Part 5 - Directional derivatives

Posted on Sep 6, 2023

(Last updated: May 26, 2024)

Introduction

In this part we’ll cover directional derivatives and the gradient vector.

Directional derivatives

To understand this part, we’ll first need to recall vectors from linear algebra. Warning: In this series we’ll use notation that is not that common.

Vectors

Recall from physics and linear algebra that a vector has a direction and a magnitude (or length). Let’s quickly define these for the vector that points from $(x_1, y_1)$ to $(x_2, y_2)$: $$ \vec{a} = \langle a, b \rangle $$

$$ a = x_2 - x_1 \newline b = y_2 - y_1 $$

Length of $\vec{a}$: $$ |\vec{a}| = \sqrt{a^2 + b^2} $$

Important note: we call a vector a unit vector if it has length one, meaning: $$ |\vec{a}| = \sqrt{a^2 + b^2} = 1 $$

We will also use the dot product (also called the scalar product), which is the operation denoted by $\cdot$: $$ \vec{a} = \langle a_1, a_2, a_3 \rangle \newline \vec{b} = \langle b_1, b_2, b_3 \rangle \newline \vec{a} \cdot \vec{b} = a_1b_1 + a_2b_2 + a_3b_3 $$

We can also define the dot product geometrically: $$ \vec{a} \cdot \vec{b} = |\vec{a}| |\vec{b}| \cos(\alpha) $$

Where $\alpha$ is the angle between the vectors.
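To make these definitions concrete, here is a minimal Python sketch of the length, the dot product, and the angle between two vectors. The helper names (`length`, `dot`, `angle_between`) are made up for this post, and vectors are represented as plain tuples.

```python
import math

def length(v):
    """Length |v| = sqrt(v1^2 + v2^2 + ...)."""
    return math.sqrt(sum(c * c for c in v))

def dot(a, b):
    """Dot product a . b = a1*b1 + a2*b2 + ..."""
    return sum(x * y for x, y in zip(a, b))

def angle_between(a, b):
    """Angle alpha recovered from a . b = |a| |b| cos(alpha)."""
    return math.acos(dot(a, b) / (length(a) * length(b)))

a = (3.0, 4.0)
b = (4.0, -3.0)
print(length(a))            # 5.0
print(dot(a, b))            # 0.0
print(angle_between(a, b))  # approximately 1.5708 (pi/2): the vectors are perpendicular
```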

Let’s quickly also recall the standard basis vectors: $$ \vec{i} = \langle 1, 0, 0 \rangle \newline \vec{j} = \langle 0, 1, 0 \rangle \newline \vec{k} = \langle 0, 0, 1 \rangle $$

Okay that’s all the linear algebra that we need. Let’s look at the definition of partial derivatives: $$ f_x(x_0, y_0) = \lim_{h \to 0} \dfrac{f(x_0 + h, y_0) - f(x_0, y_0)}{h} $$

For $y$: $$ f_y(x_0, y_0) = \lim_{h \to 0} \dfrac{f(x_0, y_0 + h) - f(x_0, y_0)}{h} $$

As we defined earlier, we can interpret these as:

The rate of change of $f$ in the $\vec{i}$ direction (for $f_x$) or the $\vec{j}$ direction (for $f_y$) near the point $(x_0, y_0)$.

This raises a natural question: what is the rate of change of $f$ in an arbitrary direction?

That is what we’ll define now.

Definition of directional derivatives

We want the directional derivative of a function $f$ of two variables at the point $(x_0, y_0)$ in the direction of a unit vector $\vec{u}$: $$ \vec{u} = \langle a, b \rangle $$

We denote the directional derivative by $D_{\vec{u}} f$ and define it as: $$ D_{\vec{u}} f(x_0, y_0) = \lim_{h \to 0} \dfrac{f(x_0 + ha, y_0 + hb) - f(x_0, y_0)}{h} $$

Reminder

Since the definition is a limit, and a limit may fail to exist, the directional derivative may not exist in a certain direction.

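This also means that $f_x$ and $f_y$ are special cases of directional derivatives. In two dimensions $\vec{i} = \langle 1, 0 \rangle$, so $a = 1$ and $b = 0$: $$ D_{\vec{i}} f(x_0, y_0) = \lim_{h \to 0} \dfrac{f(x_0 + h, y_0) - f(x_0, y_0)}{h} = f_x(x_0, y_0) $$

Likewise $\vec{j} = \langle 0, 1 \rangle$, so $a = 0$ and $b = 1$: $$ D_{\vec{j}} f(x_0, y_0) = \lim_{h \to 0} \dfrac{f(x_0, y_0 + h) - f(x_0, y_0)}{h} = f_y(x_0, y_0) $$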
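The limit in the definition can be approximated numerically by plugging in smaller and smaller values of $h$. The sketch below does this for an illustrative function $f(x, y) = x^2 + 3xy$ (chosen here only for demonstration) at the point $(1, 2)$ in the direction of the unit vector $\langle 0.6, 0.8 \rangle$. The function name `difference_quotient` is made up for this sketch.

```python
def f(x, y):
    # Illustrative function, assumed only for this sketch.
    return x**2 + 3 * x * y

def difference_quotient(f, x0, y0, a, b, h):
    """The quotient inside the limit that defines D_u f(x0, y0), for u = <a, b>."""
    return (f(x0 + h * a, y0 + h * b) - f(x0, y0)) / h

# <0.6, 0.8> is a unit vector since 0.36 + 0.64 = 1.
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(h, difference_quotient(f, 1.0, 2.0, 0.6, 0.8, h))
```

For this particular $f$ the quotient works out to $7.2 + 1.8h$, so the printed values approach $7.2$ as $h$ shrinks.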

How to compute the directional derivative

How do we efficiently compute the directional derivative? We can use the limit definition, but this is tedious.

Let’s denote $f(x_0 + ha, y_0 + hb)$ with $g(h)$. This means $g(0) = f(x_0, y_0)$.

Which means we can write the equation as: $$ D_{\vec{u}} f(x_0, y_0) = \lim_{h \to 0} \dfrac{g(h) - g(0)}{h} $$

Which is just: $$ D_{\vec{u}} f(x_0, y_0) = \lim_{h \to 0} \dfrac{g(h) - g(0)}{h} = g'(0) $$

Let $x = x_0 + ha$ and $y = y_0 + hb$, so that $g(h) = f(x, y)$. By the chain rule: $$ \dfrac{dg}{dh} = \dfrac{\partial f}{\partial x} \dfrac{dx}{dh} + \dfrac{\partial f}{\partial y} \dfrac{dy}{dh} $$

Since $\dfrac{dx}{dh} = a$ and $\dfrac{dy}{dh} = b$, this is just: $$ \dfrac{dg}{dh} = f_x a + f_y b $$

At $h = 0$ we have $(x, y) = (x_0, y_0)$, which means: $$ g'(0) = f_x(x_0, y_0) a + f_y(x_0, y_0) b $$

$$ D_{\vec{u}} f = f_x a + f_y b $$

Let’s rewrite this properly.

Computation of directional derivative

Let $f$ be a differentiable function of two variables. The directional derivative in the direction of a unit vector $\vec{u} = \langle a, b \rangle$ is: $$ \boxed{D_{\vec{u}} f = f_x a + f_y b} $$

We can also express this as: $$ D_{\vec{u}} f = \langle f_x, f_y \rangle \cdot \langle a, b \rangle $$

The vector $\langle f_x, f_y \rangle$ has a special name: it’s called the gradient and is denoted by the nabla symbol, $\nabla$.

Definition of gradient

The gradient of $f$ is the vector function: $$ \nabla f = \langle f_x, f_y \rangle $$

Therefore, we usually write: $$ \boxed{D_{\vec{u}} f = \nabla f \cdot \vec{u}} $$

Let’s try an example now.

Example

Find the directional derivative of the function $f(x, y) = x^2y^3 - 4y$ at $(2, -1)$ in the direction: $$ a)\ \vec{v} = \langle \dfrac{1}{\sqrt{2}}, \dfrac{1}{\sqrt{2}} \rangle $$

$$ b)\ \vec{v} = 2\vec{i} + 5\vec{j} $$

Solution: $$ a) D_{\vec{v}} f = \nabla f \cdot \vec{v} $$

Let’s first compute the gradient: $$ \nabla f = \langle f_x, f_y \rangle = \langle 2xy^3, 3x^2y^2 - 4 \rangle $$

At point $(2, -1)$: $$ \nabla f(2, -1) = \langle -4, 8 \rangle $$

Now we need to check if our vector is a unit vector: $$ |\vec{v}| = \sqrt{\left( \dfrac{1}{\sqrt{2}}\right)^2 + \left(\dfrac{1}{\sqrt{2}}\right)^2} = 1 \quad \Rightarrow \quad \text{unit vector} $$

$$ \begin{align*} D_{\vec{v}} f(2, -1) & = \langle -4, 8 \rangle \cdot \langle \dfrac{1}{\sqrt{2}}, \dfrac{1}{\sqrt{2}} \rangle \newline & = \dfrac{-4}{\sqrt{2}} + \dfrac{8}{\sqrt{2}} \newline & = \boxed{\dfrac{4}{\sqrt{2}}} \end{align*} $$

For b): $$ \vec{v} = 2\vec{i} + 5\vec{j} = \ldots = \langle 2, 5 \rangle $$

Let’s check if this is a unit vector: $$ |\vec{v}| = \sqrt{2^2 + 5^2} = \sqrt{29} \neq 1 $$

This means we need to normalize: $$ \vec{u} = \dfrac{\vec{v}}{|\vec{v}|} = \langle \dfrac{2}{\sqrt{29}}, \dfrac{5}{\sqrt{29}} \rangle $$

$$ \begin{align*} D_{\vec{u}} f(2, -1) & = \langle -4, 8 \rangle \cdot \langle \dfrac{2}{\sqrt{29}}, \dfrac{5}{\sqrt{29}} \rangle \newline & = \dfrac{-8}{\sqrt{29}} + \dfrac{40}{\sqrt{29}} \newline & = \boxed{\dfrac{32}{\sqrt{29}}} \end{align*} $$
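The recipe used in this example (compute the gradient at the point, normalize the direction, take the dot product) is easy to script. Below is a minimal Python sketch; the function names are hypothetical, and the partial derivatives are typed in by hand from the gradient computed above.

```python
import math

def grad_f(x, y):
    # Gradient of f(x, y) = x^2 y^3 - 4y, taken from the worked example.
    return (2 * x * y**3, 3 * x**2 * y**2 - 4)

def directional_derivative(grad, v):
    """Return grad . u, where u = v / |v| is the unit vector along v."""
    norm = math.hypot(v[0], v[1])
    if norm == 0:
        raise ValueError("the zero vector has no direction")
    u = (v[0] / norm, v[1] / norm)
    return grad[0] * u[0] + grad[1] * u[1]

g = grad_f(2, -1)  # (-4, 8)
print(directional_derivative(g, (1 / math.sqrt(2), 1 / math.sqrt(2))))  # part a)
print(directional_derivative(g, (2, 5)))                                # part b)
```

Normalizing inside the function means it does not matter whether the direction is passed in as a unit vector or not.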

Let’s take one more example:

Example

Find the rate of change of the function $f(x, y) = xy^2 + x$ at the point $P(2, 0)$ in the direction from $P$ to $Q\left(\dfrac{1}{2}, 2\right)$.

The direction vector is: $$ \vec{v} = \langle \dfrac{1}{2} - 2, 2 - 0 \rangle = \langle -\dfrac{3}{2}, 2 \rangle $$

$$ \nabla f = \langle f_x, f_y \rangle = \langle y^2 + 1, 2xy \rangle $$

$$ \nabla f(2, 0) = \langle 1, 0 \rangle $$

$$ |\vec{v}| = \sqrt{\left(-\dfrac{3}{2}\right)^2 + 2^2} = \ldots = \dfrac{5}{2} $$

We need to normalize: $$ \vec{u} = \dfrac{\vec{v}}{|\vec{v}|} = \langle -\dfrac{3}{5}, \dfrac{4}{5} \rangle $$

$$ D_{\vec{u}} f(2, 0) = \langle 1, 0 \rangle \cdot \langle -\dfrac{3}{5}, \dfrac{4}{5} \rangle = -\dfrac{3}{5} + 0 = \boxed{-\dfrac{3}{5}} $$
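When the direction is given by two points, as in this example, the only extra step is building the vector from $P$ to $Q$. A small sketch, with helper names invented for illustration:

```python
import math

def unit_direction(p, q):
    """Unit vector pointing from point p to point q."""
    vx, vy = q[0] - p[0], q[1] - p[1]
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm)

def grad_f(x, y):
    # Gradient of f(x, y) = x y^2 + x from the example above.
    return (y**2 + 1, 2 * x * y)

P, Q = (2.0, 0.0), (0.5, 2.0)
u = unit_direction(P, Q)
gx, gy = grad_f(*P)
print(u)                      # approximately (-0.6, 0.8)
print(gx * u[0] + gy * u[1])  # approximately -0.6, i.e. -3/5
```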

Maximum rate of change

We can now ask: in which direction is $D_{\vec{u}} f(x_0, y_0)$ the largest?

Recall the formula: $$ D_{\vec{u}} f(x_0, y_0) = \nabla f(x_0, y_0) \cdot \vec{u} $$

Using the geometric definition of the dot product, and the fact that $|\vec{u}| = 1$: $$ D_{\vec{u}} f(x_0, y_0) = |\nabla f(x_0, y_0)| |\vec{u}| \cos(\alpha) = |\nabla f(x_0, y_0)| \cos(\alpha) $$

This expression is the largest when $\cos(\alpha) = 1$, which means $\alpha = 0$. In other words, the directional derivative is the largest when $\vec{u}$ points in the same direction as $\nabla f(x_0, y_0)$, and the maximum rate of change is $|\nabla f(x_0, y_0)|$.

Geometrical interpretation

Suppose the graph of $f$ looks like a mountain, where $(x_0, y_0, f(x_0, y_0))$ is our position. The steepest way up from where we stand is in the direction of the gradient.

To go down the mountain, in the fastest way possible, we go in the direction of $-\nabla f$.

Example

Given $f(x, y) = xe^y$ and $P(2, 0)$.

a) In what direction is the rate of change the largest?

b) What is the maximum rate of change?

a):

In the direction of the gradient at the point: $$ \nabla f(2, 0) = \langle f_x(2, 0), f_y(2, 0) \rangle \newline \nabla f(2, 0) = \langle e^0, 2e^0 \rangle = \langle 1, 2 \rangle $$

b): $$ |\nabla f(2, 0)| = \sqrt{1^2 + 2^2} = \boxed{\sqrt{5}} $$
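One way to convince yourself of this result is a brute-force check: sample many unit vectors $\langle \cos\theta, \sin\theta \rangle$ around the circle, compute $\nabla f \cdot \vec{u}$ for each, and see which one wins. The sketch below does that for the gradient $\langle 1, 2 \rangle$ from the example; the function name and the number of samples are arbitrary choices.

```python
import math

def best_direction(grad, samples=3600):
    """Scan unit vectors and return (largest directional derivative, its angle)."""
    best_value, best_theta = -math.inf, 0.0
    for i in range(samples):
        theta = 2 * math.pi * i / samples
        value = grad[0] * math.cos(theta) + grad[1] * math.sin(theta)
        if value > best_value:
            best_value, best_theta = value, theta
    return best_value, best_theta

value, theta = best_direction((1.0, 2.0))
print(value, math.sqrt(5))          # the two numbers nearly agree
print(theta, math.atan2(2.0, 1.0))  # so do the angle found and the gradient's angle
```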

Gradient and level curves

Recall that level curves are curves with the equation: $$ f(x, y) = k $$

Here $k$ is any constant. For each point $(x_0, y_0)$, $\nabla f(x_0, y_0)$ is orthogonal to the level curve that contains $(x_0, y_0)$.

Explanation

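We already know that the steepest way up the mountain is in the direction of the gradient. On the other hand, the steepest way up is the one that goes from level curve to level curve as quickly as possible, and the shortest way between two nearby level curves is the perpendicular one! We can also see this from the formula: $f$ does not change along a level curve, so the directional derivative in the direction $\vec{u}$ tangent to the curve is zero, which means $\nabla f \cdot \vec{u} = 0$.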
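The orthogonality can also be checked numerically. As an assumption for this sketch, take $f(x, y) = xe^y$ from the previous example: its level curve $xe^y = k$ (with $k > 0$ and $x > 0$) can be written as $y = \ln(k/x)$, so a tangent vector at $x$ is $\langle 1, -1/x \rangle$. The function name `tangent_dot_gradient` is made up for illustration.

```python
import math

def grad_f(x, y):
    # Gradient of f(x, y) = x e^y.
    return (math.exp(y), x * math.exp(y))

def tangent_dot_gradient(x, k):
    """Dot product of the gradient with a tangent of the level curve x e^y = k."""
    y = math.log(k / x)        # point (x, y) on the level curve
    tangent = (1.0, -1.0 / x)  # derivative of (x, ln(k/x)) with respect to x
    gx, gy = grad_f(x, y)
    return gx * tangent[0] + gy * tangent[1]

for x in (0.5, 1.0, 2.0, 4.0):
    print(x, tangent_dot_gradient(x, k=2.0))  # each value is zero up to rounding
```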