In the previous post on the Rayleigh quotient we defined the directional derivative of a function \(f: \mathbb{R}^n \to \overline{\mathbb{R}}\) as
$$\begin{aligned}f'(x; h) = \lim_{t \downarrow 0} \frac{f(x+th)-f(x)}{t},\end{aligned}$$
provided that the limit exists.
It turns out that all convex functions are directionally differentiable on the interior (actually, the core) of their domains and \(f'(x; \cdot)\) is sublinear. However, the sublinearity property may fail when working with nonconvex functions. This motivates the definition of generalised directional derivatives which will hopefully be accompanied by some good calculus rules.
1. Generalised Directional Derivative
1.1. Dini Directional Derivative
The Dini directional derivative of a function \(f: \mathbb{R}^n \to \overline{\mathbb{R}}\) at a point \(x\) along a direction \(h\) is a generalised derivative given by
$$\begin{aligned}f^-(x; h) = \liminf_{t \downarrow 0} \frac{f(x+th)-f(x)}{t}.\end{aligned}$$
However, as we will see later, the Dini directional derivative is not always sublinear.
1.2. Clarke Directional Derivative
The Clarke directional derivative of \(f\) at \(x\) along the direction \(h\) is defined as
$$\begin{aligned}f^\circ(x; h) {}={}& \limsup_{y \to x, t \downarrow 0} \frac{f(y+th)-f(y)}{t} \\ {}={}& \inf_{\delta>0}\, \sup_{\substack{\|y-x\|\leq \delta \\ t \in (0,\delta)}} \frac{f(y+th)-f(y)}{t}.\end{aligned}$$
We can show that
Proposition 1. The Clarke directional derivative is sublinear in its second argument, i.e., it is positively homogeneous and satisfies
$$\begin{aligned}f^\circ(x; u + v) \leq f^\circ(x; u) + f^\circ(x; v),\end{aligned}$$
for all \(x, u, v\).
Proof. To show that \(f^\circ(x; {}\cdot{})\) is positively homogeneous, take \(\lambda > 0\). We have
$$\begin{aligned}f^\circ(x; \lambda h) {}={}& \limsup_{y\to x, t\downarrow 0}\frac{f(y+\lambda t h) - f(y)}{t}\\{}\overset{\eta = \lambda t}{=}{}& \limsup_{y\to x, \eta\downarrow 0}\frac{f(y+\eta h) - f(y)}{\tfrac{1}{\lambda}\eta} = \lambda f^\circ(x; h).\end{aligned}$$
It remains to show subadditivity, that is, \(f^\circ(x; h+h') \leq f^\circ(x; h) + f^\circ(x; h')\). We have
$$\begin{aligned}f^\circ(x; h+h') {}={}& \limsup_{y\to x, t\downarrow 0}\frac{f(y+th+th') - f(y)}{t}\\{}={}& \limsup_{y\to x, t\downarrow 0}\frac{f(y+th+th') - f(y+th') + f(y+th') - f(y)}{t} \\ {}\leq{}& \limsup_{y\to x, t\downarrow 0}\frac{f(y+th+th') - f(y+th') }{t} \\ &\qquad+ \limsup_{y\to x, t\downarrow 0}\frac{f(y+th') - f(y)}{t}\end{aligned}$$
The second limsup is \(f^\circ(x; h')\) by definition. In the first limsup let us define \(\tilde{y} = y+th'\) and see that \(\tilde{y} \to x\) as \(y \to x\) and \(t \downarrow 0\), therefore the first limsup is equal to \(f^\circ(x; h)\). \(\blacksquare\)
Observe that the Clarke directional derivative majorises the Dini derivative \(f^-(x; h) \leq f^\circ(x; h)\). This is easy to prove - if not, please write a comment here below and I will elaborate.
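For completeness, here is a short sketch of the argument: fixing \(y = x\) in the definition of the Clarke derivative can only decrease the limit superior, so
$$\begin{aligned}f^-(x; h) {}={}& \liminf_{t \downarrow 0} \frac{f(x+th)-f(x)}{t} \leq \limsup_{t \downarrow 0} \frac{f(x+th)-f(x)}{t} \\ {}\leq{}& \limsup_{y \to x, t \downarrow 0} \frac{f(y+th)-f(y)}{t} = f^\circ(x; h).\end{aligned}$$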
Moreover, if \(f\) is locally Lipschitz, we can show the following
Proposition 2. Let \(f\) be locally Lipschitz with modulus \(K\) around \(x\). Then,
$$\begin{aligned}f^\circ(x; u) \leq K\|u\|,\end{aligned}$$
for all \(u\).
Proof. We have
$$\begin{aligned}f^\circ(x; h) {}={}& \inf_{\delta>0}\, \sup_{\substack{\|y-x\|\leq \delta \\ t \in (0,\delta)}} \frac{f(y+th)-f(y)}{t} \\ {}\leq{}& \inf_{\delta>0}\, \sup_{\substack{\|y-x\|\leq \delta \\ t \in (0,\delta)}} \frac{|f(y+th)-f(y)|}{t} \\ {}\leq{}& \inf_{\delta>0}\, \sup_{\substack{\|y-x\|\leq \delta \\ t \in (0,\delta)}} \frac{K\|y+th -y\|}{t} = K\|h\|, \end{aligned}$$
which completes the proof. \(\blacksquare\)
1.3. Michel-Penot Directional Derivative
The Michel-Penot directional derivative is defined as
$$\begin{aligned}f^\#(x; h) {}={}& \sup_{u\in\mathbb{R}^n} \limsup_{t \downarrow 0} \frac{f(x+t(h+u))-f(x+tu)}{t}.\end{aligned}$$
Proposition 3. The Michel-Penot directional derivative is sublinear, i.e.,
$$\begin{aligned}f^\#(x; h_1 + h_2) \leq f^\#(x; h_1) + f^\#(x; h_2),\end{aligned}$$
for all \(x, h_1, h_2\).
Proof. Let us define the function
$$\begin{aligned}\hat{f}(x, h, u){}={}\limsup_{t \downarrow 0} \frac{f(x+th+tu)-f(x+tu)}{t}.\end{aligned}$$
Note that \(f^\#(x; h) = \sup_u \hat{f}(x, h, u)\). We shall bound \(\hat{f}(x, h_1+h_2, u)\) by \(f^\#(x; h_1) + f^\#(x; h_2)\) uniformly in \(u\); for fixed \(x\) and \(u\) we have
$$\begin{aligned}\hat{f}(x, h_1+h_2, u) {}={}& \limsup_{t \downarrow 0} \frac{1}{t}\left[f(x+t(h_1+h_2+u))-f(x+tu)\right] \\ {}={}& \limsup_{t \downarrow 0} \frac{1}{t}\big[ f(x+t(h_1+h_2+u)) -f(x+th_2+tu) \\ &\qquad\quad+f(x+th_2+tu)-f(x+tu)\big] \\ {}\leq{}& \limsup_{t \downarrow 0} \frac{1}{t}\left[ f(x+t(h_1+h_2+u)) -f(x+th_2+tu)\right] \\ &\qquad\quad+\limsup_{t \downarrow 0} \frac{1}{t}\left[ f(x+th_2+tu)-f(x+tu)\right] \\ {}\leq{}& \sup_u\limsup_{t \downarrow 0} \frac{1}{t}\left[ f(x+th_1+t(h_2+u)) -f(x+t(h_2+u))\right] \\ &\quad\qquad{}+{} f^{\#}(x; h_2) \\ {}={}& \sup_{v}\limsup_{t \downarrow 0} \frac{1}{t}\left[ f(x+th_1+t v) - f(x+tv)\right] + f^{\#}(x; h_2) \\ {}={}& f^{\#}(x; h_1) + f^{\#}(x; h_2). \end{aligned}$$
Now taking a supremum with respect to \(u\), the assertion follows. \(\blacksquare\)
A notable property is stated below.
Proposition 4. We have
$$\begin{aligned}f^{-}(x; h) \leq f^\#(x; h) \leq f^{\circ}(x; h),\end{aligned}$$
for all \(x, h\).
Proof. This is Proposition 6.1.1 in [1] and the proof can be found there. \(\blacksquare\)
The first inequality follows by taking \(u = 0\) in the supremum that defines \(f^\#(x; h)\) and using the fact that a limit inferior never exceeds the corresponding limit superior. To prove the second inequality, \(f^\#(x; h) \leq f^\circ(x; h)\), fix \(u\) and observe that
$$\begin{aligned}f^\circ(x; h) = \limsup_{y\to x, t\downarrow 0} \frac{f(y+th) - f(y)}{t} \geq \limsup_{\substack{y=x+tu,\\ t\downarrow 0}} \frac{f(x+tu+th) - f(x+tu)}{t},\end{aligned}$$
and take the supremum with respect to \(u\) on both sides.
2. Examples
2.1. Absolute value
Let us first see how the above three derivatives behave on a nonsmooth convex function. Consider the function \(f(x) = |x|\), with \(x \in \mathbb{R}\). The Dini derivative of \(f\) at \(x=0\) for \(h\) is
$$\begin{aligned}f^-(0; h) = \liminf_{t\downarrow 0} \frac{|th|}{t} = |h|.\end{aligned}$$
For the Clarke derivative we have that
$$\begin{aligned}f^\circ(0; h) = \limsup_{y\to 0, t\downarrow 0} \frac{|y+th|-|y|}{t} \leq \limsup_{y\to 0, t\downarrow 0} \frac{|y| + |th| - |y|}{t} = |h|,\end{aligned}$$
using the triangle inequality. By choosing the sequences \(y_\nu = \tfrac{1}{\nu^2}\) and \(t_\nu = \tfrac{1}{\nu}\) we have that
$$\begin{aligned}f^\circ(0; h) {}={}& \limsup_{y\to 0, t\downarrow 0} \frac{|y+th|-|y|}{t} \\ {}\geq{}& \lim_{\nu} \frac{|y_\nu + t_\nu h|-|y_\nu|}{t_\nu} = |h|,\end{aligned}$$
therefore, \(f^\circ(0; h) = |h|\). Since the Michel-Penot directional derivative is between the Dini and Clarke derivatives, we conclude that \(f^\#(0; h) = |h|\) (however, it can be easily determined using the definition).
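As a rough numerical sanity check of these values, the sketch below replaces the liminf and the limsup by a minimum and a maximum over a finite grid. The helper names `dini_estimate` and `clarke_estimate`, the radius `delta` and the grid size `n` are illustrative assumptions, and a finite grid can only suggest the limiting value, not prove it.

```python
import numpy as np

def dini_estimate(f, x, h, delta=1e-3, n=200):
    """Proxy for the liminf: smallest difference quotient over sampled t in (0, delta)."""
    ts = delta * np.arange(1, n) / n
    return float(np.min((f(x + ts * h) - f(x)) / ts))

def clarke_estimate(f, x, h, delta=1e-3, n=200):
    """Proxy for the sup over |y - x| <= delta and t in (0, delta), taken on a grid."""
    ys = np.linspace(x - delta, x + delta, 2 * n + 1)
    ts = delta * np.arange(1, n) / n
    Y, T = np.meshgrid(ys, ts)
    return float(np.max((f(Y + T * h) - f(Y)) / T))

# Both estimates should be close to |h| for f(x) = |x| at x = 0.
for h in (1.0, -2.0):
    print(h, dini_estimate(np.abs, 0.0, h), clarke_estimate(np.abs, 0.0, h))
```

For a convex function such as \(|x|\) the quotients do not oscillate, which is why even a coarse grid reproduces \(|h|\) here; for oscillating functions the grid would need far more care.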
2.2. Negative Absolute value
Consider the function \(f(x) = -|x|\) with \(x \in \mathbb{R}\). This is a nonsmooth concave function. The Dini derivative of \(f\) at \(x=0\) along the direction \(h\) is
$$\begin{aligned}f^-(0; h) {}={}& \liminf_{t\downarrow 0} \frac{-|th|}{t} = -|h|.\end{aligned}$$
The Michel-Penot derivative is
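$$\begin{aligned}f^\#(0; h) {}={}& \sup_{u\in\mathbb{R}}\limsup_{t\downarrow 0} \frac{-|th + tu|+|tu|}{t} = \sup_u \left(|u| - |h+u|\right) = |h|.\end{aligned}$$
The last equality holds because (i) taking \(u = -h\) gives \(\sup_u \left(|u| - |h+u|\right) \geq |{-h}| - |0| = |h|\) and (ii) by the triangle inequality \(|u| = |(h+u) - h| \leq |h+u| + |h|\), so \(\sup_u \left(|u| - |h+u|\right) \leq |h|\). This agrees with Proposition 3: \(|h|\) is sublinear in \(h\), whereas \(-|h|\) is not.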
Lastly, the Clarke derivative of \(f(x) = -|x|\) at 0 is
$$\begin{aligned}f^\circ(0; h) {}={} \limsup_{y \to 0, t\downarrow 0} \frac{|y|-|y+th|}{t} {}\leq{} |h|,\end{aligned}$$
by using the triangle inequality. By following a similar procedure as for \(|x|\) in Section 2.1 (for instance, with the sequences \(y_\nu = -t_\nu h\) and \(t_\nu = \tfrac{1}{\nu}\)), we conclude that \(f^\circ(0; h) = |h|\). We can arrive at the same conclusion using Proposition 5 below.
We can state the following result that gives the Clarke derivative of \(-f\) in terms of that of \(f\).
Proposition 5. For all \(x\) and \(h\) we have \((-f)^\circ(x; h) = f^\circ(x; -h)\).
Proof. We have
$$\begin{aligned}f^\circ(x; -h) {}={} \limsup_{y \to x, t\downarrow 0} \frac{f(y-th)-f(y)}{t}.\end{aligned}$$
Define \(z = y - th\) and observe that as \(t \downarrow 0\) and \(y \to x\), \(z \to x\). Additionally, \(y = z + th\). Therefore we have
$$\begin{aligned}f^\circ(x; -h) {}={}& \limsup_{y \to x, t\downarrow 0} \frac{f(y-th)-f(y)}{t} = \limsup_{z\to x, t \downarrow 0} \frac{f(z) - f(z+th)}{t} \\ {}={}& \limsup_{z\to x, t \downarrow 0} \frac{(-f)(z+th) - (-f)(z)}{t} = (-f)^\circ(x; h),\end{aligned}$$
which completes the proof. \(\blacksquare\)
Note that the property of Proposition 5 does not hold for the Dini derivative, but the Michel-Penot derivative satisfies \(f^\#(x; -h) = (-f)^\#(x; h)\).
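As a quick illustration of the failure for the Dini derivative, take \(f(x) = |x|\) at \(x = 0\) and use the Dini derivatives computed in Sections 2.1 and 2.2:
$$\begin{aligned}(-f)^-(0; h) = -|h|, \qquad \text{whereas} \qquad f^-(0; -h) = |{-h}| = |h|,\end{aligned}$$
so the two sides differ for every \(h \neq 0\).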
It is also easy to prove that
Proposition 6. For \(\lambda \geq 0\) we have \((\lambda f)^\circ(x; h) = \lambda f^\circ(x; h)\).
Proposition 6 holds for the Michel-Penot derivative too.
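A sketch of the argument for \(\lambda > 0\) (the case \(\lambda = 0\) is immediate whenever \(f^\circ(x; h)\) is finite): a positive factor can be pulled out of a limit superior, so
$$\begin{aligned}(\lambda f)^\circ(x; h) {}={}& \limsup_{y \to x, t\downarrow 0} \frac{\lambda f(y+th)-\lambda f(y)}{t} \\ {}={}& \lambda \limsup_{y \to x, t\downarrow 0} \frac{f(y+th)-f(y)}{t} = \lambda f^\circ(x; h).\end{aligned}$$
The same manipulation, applied inside the supremum over \(u\), gives the Michel-Penot version.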
2.3. An interesting case
Consider the following function
$$\begin{aligned}f(x) = \begin{cases}x^2 \sin\tfrac{1}{x}, & \text{ for } x \neq 0 \\ 0, &\text{ for } x=0\end{cases}\end{aligned}$$
This is what the graph of this function looks like:
and if we zoom in even more, this is what we see:
This function is differentiable at zero, with \(f'(0) = 0\), but its derivative is not continuous there; this is its derivative:
We are interested in the (generalised) derivatives at \(x=0\). The Dini derivative is
$$\begin{aligned}f^-(0; h) {}={} \liminf_{t \downarrow 0}\frac{t^2h^2 \sin\tfrac{1}{th}}{t}=\liminf_{t \downarrow 0} th^2 \sin\tfrac{1}{th}=0.\end{aligned}$$
In order to determine the Clarke derivative we use the fact that the function is continuously differentiable at all \(y \neq 0\), therefore, by Taylor's approximation theorem,
$$\begin{aligned}f(y + th) {}={} f(y) + f'(y)th + o(|th|),\end{aligned}$$
for adequately small \(t\), and for \(y \neq 0\),
$$\begin{aligned}f'(y) {}={} 2y\sin\tfrac{1}{y} - \cos\tfrac{1}{y}.\end{aligned}$$
Then, for any fixed \(h\),
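$$\begin{aligned}f^\circ(0; h) {}={}& \limsup_{y\to 0, t \downarrow 0}\frac{f'(y)th + o(|th|)}{t} \\ {}={}& \limsup_{y\to 0, t \downarrow 0} \left[f'(y)h + \frac{o(|th|)}{t}\right] = |h|,\end{aligned}$$
since \(2y\sin\tfrac{1}{y} \to 0\) while \(\cos\tfrac{1}{y}\) attains both \(1\) and \(-1\) at points \(y\) arbitrarily close to \(0\).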
The Michel-Penot derivative is
$$\begin{aligned}f^\#(0; h) {}={}& \sup_u\limsup_{t \downarrow 0}\frac{(th+tu)^2\sin\frac{1}{th+tu} - t^2u^2\sin\frac{1}{tu}}{t} \\ {}={}& \sup_u \limsup_{t \downarrow 0} t \left[ (h+u)^2\sin\frac{1}{th+tu} - u^2\sin\frac{1}{tu}\right] = 0.\end{aligned}$$
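The gap between the quotients based at \(0\) and those based at nearby points can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper `clarke_witness`, the base points \(y_k = 1/(m\pi)\) and the step \(t = 10^{-3} y_k^2/|h|\) are choices made here, not part of the derivation above. The base points are picked so that \(f'(y_k)h \approx |h|\).

```python
import math

def f(x):
    """f(x) = x^2 sin(1/x) for x != 0, and f(0) = 0."""
    return x * x * math.sin(1.0 / x) if x != 0.0 else 0.0

def quotient(y, t, h):
    """Difference quotient (f(y + t h) - f(y)) / t."""
    return (f(y + t * h) - f(y)) / t

def clarke_witness(k, h):
    """Quotient at a base point y_k -> 0 where f'(y_k) h is close to |h| (h != 0, k >= 1)."""
    # cos(1/y) = -1 at odd multiples of pi (f' close to +1),
    # cos(1/y) = +1 at even multiples of pi (f' close to -1).
    m = 2 * k + 1 if h > 0 else 2 * k
    y = 1.0 / (m * math.pi)
    t = 1e-3 * y * y / abs(h)  # step well below the local oscillation period
    return quotient(y, t, h)

for k in (10, 100, 1000):
    # Quotient based at 0 shrinks with t = 1/k; the witnesses stay near |h|.
    print(k, quotient(0.0, 1.0 / k, 1.0), clarke_witness(k, 1.0), clarke_witness(k, -2.0))
```

The quotients based at \(0\) are bounded by \(t h^2\) and vanish in the limit, while the quotients based at \(y_k\) remain close to \(|h|\), in line with the Dini and Clarke values found above.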
2.4. An even more interesting example
Consider the function
$$\begin{aligned}f(x) {}={} \begin{cases}3^n, & \text{ if } 3^n \leq x \leq 2(3^n), n \in \mathbb{Z} \\ 2x - 3^{n+1}, &\text{ if } 2(3^n) \leq x \leq 3^{n+1} \\ 0, & \text{ if } x \leq 0\end{cases}\end{aligned}$$
The Dini derivative of \(f\) at 0 for a negative direction \( h < 0\) is clearly equal to 0. For \( h>0\), by definition
$$\begin{aligned}f^-(0; h) = \liminf_{t \downarrow 0}\frac{f(th)}{t} = \lim_{\epsilon \downarrow 0} \inf_{0 < t < \epsilon}\frac{f(th)}{t} = \frac{h}{2},\end{aligned}$$
so, overall \( f^-(0; h) = \max\{0, \tfrac{h}{2}\}.\)
The liminf of the Dini derivative with \( h>0\) is illustrated below:
For the Clarke derivative with \( h>0\) it is convenient to use the following expression:
$$\begin{aligned}f^\circ(0; h) ={}& \limsup_{y\to 0, t \downarrow 0}\frac{f(y+th) - f(y)}{t} \\ {}={}& \lim_{\epsilon \downarrow 0} \underbrace{\sup_{\substack{|y| < \epsilon \\ 0 < t < \epsilon}} \frac{f(y+th) - f(y)}{t}}_{2h} = 2h,\end{aligned}$$
whereas for \( h < 0\) we have \( f^\circ(0; h) = 0\), so overall \( f^\circ(0; h) = \max\{0, 2h\}\).
Lastly, the Michel-Penot derivative at 0 along a direction \( h<0\) is equal to zero because it is bounded between the Dini and Clarke derivatives, which vanish for \( h<0\). For \( h > 0\) the Michel-Penot derivative is \( f^\#(0; h) = 2h\) and the supremum is attained at \( u = 2h\) (that is, \( u = 2\) for \( h = 1\)): along \( t = 3^n/h\) the points \( tu = 2(3^n)\) and \( t(h+u) = 3^{n+1}\) are the endpoints of a segment of slope 2. Overall \( f^\#(0; h) = \max\{0, 2h\}\).
Below you can see an animation showing the convergence of the limsup in the definition of the Michel-Penot derivative for \( h=1\) and for different values of \( u\) (\( u=1,\tfrac{1}{2},2\)).
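The same behaviour can be reproduced with a few lines of code. This is a minimal sketch, assuming a logarithmic grid of step sizes as a stand-in for \(t \downarrow 0\); the helper names `sampled_ts`, `dini_quotients` and `michel_penot_quotients` are hypothetical, and the minimum and maximum over the grid are only proxies for the liminf and the limsup.

```python
import math

def f(x):
    """Function of Example 2.4: flat on [3^n, 2*3^n], slope 2 on [2*3^n, 3^(n+1)]."""
    if x <= 0.0:
        return 0.0
    n = math.floor(math.log(x) / math.log(3.0))
    # Guard against rounding in the logarithm.
    while 3.0 ** n > x:
        n -= 1
    while 3.0 ** (n + 1) <= x:
        n += 1
    p = 3.0 ** n
    return p if x <= 2.0 * p else 2.0 * x - 3.0 * p

def sampled_ts(t_min=1e-9, t_max=1e-3, count=20001):
    """Logarithmically spaced step sizes, since f repeats itself at every scale 3^n."""
    ratio = (t_max / t_min) ** (1.0 / (count - 1))
    return [t_min * ratio ** i for i in range(count)]

def dini_quotients(h, ts):
    """Quotients f(t h) / t whose liminf is the Dini derivative at 0."""
    return [f(t * h) / t for t in ts]

def michel_penot_quotients(h, u, ts):
    """Quotients (f(t (h + u)) - f(t u)) / t for a fixed u."""
    return [(f(t * (h + u)) - f(t * u)) / t for t in ts]

ts = sampled_ts()
print("Dini proxy, h = 1:", min(dini_quotients(1.0, ts)))
for u in (0.5, 1.0, 2.0):
    print("Michel-Penot proxy, h = 1, u =", u, max(michel_penot_quotients(1.0, u, ts)))
```

With \(h = 1\), the Dini proxy should be close to \(\tfrac{1}{2}\) and the Michel-Penot proxy for \(u = 2\) close to \(2\), while the other two values of \(u\) give strictly smaller maxima.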
2.5. Summary
Function | Dini \(f^-(0; h)\) | Michel-Penot \(f^\#(0; h)\) | Clarke \(f^\circ(0; h)\) |
---|---|---|---|
\(f(x) = |x|\) | \(|h|\) | \(|h|\) | \(|h|\) |
\(f(x) = -|x|\) | \(-|h|\) | \(|h|\) | \(|h|\) |
Example 2.3 | \(0\) | \(0\) | \(|h|\) |
Example 2.4 | \(\max\{0, \tfrac{h}{2}\}\) | \(\max\{0, 2h\}\) | \(\max\{0, 2h\}\) |
Bibliographic notes
[1] The examples given here are the solutions of Exercise 1 in Section 6.1 (Generalized derivatives) of the book: JM Borwein and AS Lewis, Convex Analysis and Nonlinear Optimization, Canadian Mathematical Society, Second Edition, Springer, 2010 (p. 127-128).
[2] FH Clarke, Optimization and Nonsmooth Analysis, SIAM Classics in Applied Mathematics, Wiley, 1983: A great book to understand the concept of Clarke’s generalised derivative.