Nowhere coexpanding functions

We define a family of $C^1$ functions, which we call "nowhere coexpanding functions", that is closed under composition and includes all $C^3$ functions with non-positive Schwarzian derivative. We establish results on the number and nature of the fixed points of these functions, including a generalisation of a classic result of Singer.


Lead paragraph
The set of functions with negative Schwarzian derivative is closed under composition. As a result, it is a popular tool for studying the dynamics of functions in dimension one. A comprehensive treatment involving one-dimensional dynamics, with many results involving negative Schwarzian derivatives, is given in the book of de Melo and van Strien [1993]. More recent results appear in Kozlovski [2000], Webb [2009], and Mora [2018].

In this paper, we study the dynamics of functions belonging to a new class F. The class F includes all $C^3$ functions whose Schwarzian derivative is non-positive, but also includes many $C^1$ functions. Our main result states that for a function in F with no critical points, the set of fixed points is either an interval or has cardinality at most three. Further, we adapt a classic result of Singer [1978] to apply to functions in F, showing that for an attracting periodic point of a function in F, either its immediate basin is unbounded, or its orbit attracts the orbit of a critical point. For some applications, functions in F can be glued together to make new functions in F which are in general not in $C^3$; we give examples of this in Section 4.

Statement of main results
To state our main results, we first introduce our function class F, a subset of scalar $C^1$ functions. Consider a $C^1$ function f : I → R defined on an interval I ⊂ R. For now, assume that f has no critical points. If f has two fixed points x and y such that
$$f'(x)\, f'(y) > 1,$$
then x and y are said to be coexpanding fixed points of f.
We may generalise this definition to any two distinct points x, y ∈ I. Let ∆ denote the diagonal of I × I, and define the function $\chi_f : (I \times I) \setminus \Delta \to \mathbb{R}$ by
$$\chi_f(x, y) := f'(x)\, f'(y) \left( \frac{x - y}{f(x) - f(y)} \right)^2 .$$
We say x and y (which are not necessarily fixed points) are coexpanding if $\chi_f(x, y) > 1$. In other words, x and y are coexpanding if there is a scalar affine function A : R → R such that x and y are coexpanding fixed points of A ∘ f. We say f is nowhere coexpanding if there are no such points in I.
Now suppose f : R → R and the set Crit(f) := {x ∈ R : f′(x) = 0} of critical points is non-empty. We say f is nowhere coexpanding if its restriction to each connected component of R \ Crit(f) is nowhere coexpanding. Define F to be the set of all nowhere coexpanding functions. Observe that any affine function A belongs to F, since $\chi_A \equiv 1$ when A is non-constant.
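As a rough numerical illustration of this definition, the sketch below evaluates $\chi_f$ at pairs of points. It assumes the form $\chi_f(x, y) = f'(x) f'(y) \left( (x - y) / (f(x) - f(y)) \right)^2$, which is consistent with $\chi_A \equiv 1$ for non-constant affine A; the helper names `chi` and `max_chi_on_grid` are hypothetical and introduced only for this example. A grid search of this kind can suggest, but not prove, that a function is nowhere coexpanding.

```python
import math
from itertools import combinations

def chi(f, df, x, y):
    """Assumed form of chi_f(x, y) for distinct x, y with f(x) != f(y)."""
    ratio = (x - y) / (f(x) - f(y))
    return df(x) * df(y) * ratio * ratio

def max_chi_on_grid(f, df, lo, hi, n=200):
    """Largest chi_f over all pairs of distinct points on a uniform grid."""
    pts = [lo + (hi - lo) * k / n for k in range(n + 1)]
    return max(chi(f, df, x, y) for x, y in combinations(pts, 2))

# Affine map A(x) = 3x - 2: chi_A should be identically 1.
affine = lambda x: 3.0 * x - 2.0
d_affine = lambda x: 3.0

# tanh has no critical points; its derivative is sech^2 = 1 - tanh^2.
d_tanh = lambda x: 1.0 - math.tanh(x) ** 2

print(chi(affine, d_affine, -1.0, 4.0))            # 1 up to rounding
print(max_chi_on_grid(math.tanh, d_tanh, -3, 3))   # expected to stay below 1
```

A value above 1 for any sampled pair would certify that the function is not in F on that interval, whereas values at or below 1 are only evidence in favour.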
Section 2 shows that F is closed under composition, and classifies all possible sets of fixed points for functions in F with no critical points. Let Fix(f) := {x ∈ R : f(x) = x}. Our main result is as follows:

Theorem 1. If f ∈ F has no critical points, then either Fix(f) is a closed interval, or f has at most three fixed points.
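To see the three-fixed-point case of Theorem 1 concretely, the following sketch scans for fixed points of x ↦ 2 tanh(x), which lies in F because tanh does and F is closed under composition with affine functions. The function name `fixed_points` and the scan parameters are illustrative assumptions; the scan only detects isolated fixed points at which f(x) − x changes sign between grid points.

```python
import math

def fixed_points(f, lo, hi, n=999, tol=1e-12):
    """Locate fixed points of f in [lo, hi] via sign changes of f(x) - x.

    Heuristic scan: tangential fixed points and fixed points lying
    exactly on a grid node are not detected.
    """
    g = lambda x: f(x) - x
    roots = []
    step = (hi - lo) / n
    for k in range(n):
        a, b = lo + k * step, lo + (k + 1) * step
        ga, gb = g(a), g(b)
        if ga * gb < 0:                  # sign change: refine by bisection
            while b - a > tol:
                m = 0.5 * (a + b)
                if ga * g(m) <= 0:
                    b = m
                else:
                    a, ga = m, g(m)
            roots.append(0.5 * (a + b))
    return roots

h = lambda x: 2.0 * math.tanh(x)
print(fixed_points(h, -5.0, 5.0))   # three values, symmetric about zero
```

The odd number of grid intervals is chosen so that the fixed point at zero does not fall on a grid node, where the sign-change test would miss it.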
Section 3 provides a brief background on the Schwarzian derivative and shows that the class F, when restricted to $C^3$, is the class of functions with non-positive Schwarzian derivative. This fact is made clear in the following result:

Theorem 2. Let f : R → R be a $C^3$ function. Then f ∈ F if and only if its Schwarzian derivative satisfies S f(x) ⩽ 0 for all x ∈ R \ Crit(f).
In fact, Singer's result for functions with negative Schwarzian derivative may be generalised to nowhere coexpanding functions. To state the result, we need the following definitions. Let p be an attracting period n point of a function f ∈ $C^1$. The basin of p is
$$\mathrm{Basin}(p) := \{ x \in \mathbb{R} : f^{kn}(x) \to p \text{ as } k \to \infty \}.$$
The point p is topologically attracting if Basin(p) is a neighbourhood of p. The immediate basin of p is the connected component of the basin containing p.
Theorem 3. If f ∈ F and p is a topologically attracting periodic point of f, then either the immediate basin of p is unbounded, or there is a critical point of f whose orbit is attracted to the orbit of p.
Section 2 proves this theorem.

Fixed points of nowhere coexpanding functions
In this section, we build a better understanding of the properties of nowhere coexpanding functions. We first show that F is closed under composition. Then we prove a list of necessary conditions for the nowhere coexpanding property and use these conditions to prove Theorems 1 and 3.

Proposition 4. The set F of nowhere coexpanding functions is closed under composition.
To show this, we need the following lemmas.
Lemma 5. For any non-constant affine functions A and B, a function f is nowhere coexpanding if and only if A ∘ f ∘ B is nowhere coexpanding.
Proof. By the definition of a nowhere coexpanding function, f ∈ F if and only if A ∘ f ∈ F. Therefore, it suffices to show that a function g is in F if and only if h := g ∘ B is in F. But these are equivalent, since a quick calculation shows that $\chi_h(x, y) = \chi_g(B(x), B(y))$ for all distinct x and y. □

Lemma 6. Consider f, g ∈ F. If their composition h := g ∘ f has no critical points on an interval I, then $\chi_h(x_1, x_2) \le 1$ for any distinct $x_1, x_2 \in I$.
Proof. Let A be the affine function such that $x_1$ and $x_2$ are fixed points of ĥ : , and let B be the affine function taking $x_1$ to $y_1$, and

We now establish properties of the derivative of a nowhere coexpanding function.
Lemma 7. If f ∈ F is regular on an interval I, then f′ has no local minima in I.
Proof. Suppose f′ has a local minimum at p ∈ I. Then there exist points a and b in I with a < p < b and such that f′(a) = f′(b) > f′(x) for all x in (a, b). By the mean value theorem, and therefore ] and [b, c] are invariant, f′ must average to unity on these intervals. Since [a, b] and [b, c] cannot be subsets of Fix(f), there exist points p ∈ (a, b) and q ∈ (b, c) with f′(p) > 1 and f′(q) > 1, yielding a local minimum of f′ in (p, q), which contradicts Lemma 7. □

We are now ready to prove Theorem 3.
Proof of Theorem 3. Let g := $f^n$, where n is double the period of p. Thus, g ∈ F and 0 < g′(p) ⩽ 1. Observe that replacing f by g does not change the immediate basin of p.
) is a critical point for f. Further, if $x_c$ is attracted to the orbit of p by f, then so is $f^k(x_c)$. Thus, it suffices to show that either the immediate basin of p under g is unbounded, or there is a critical point of g whose orbit is attracted to p.
Let I be the immediate basin of p. Arguing by contradiction, assume there are no critical points in I and that I is bounded. Then both endpoints of I are fixed points of g, contradicting Lemma 8. □

For the remainder of this section, we narrow our focus to the set of nowhere coexpanding functions with no critical points, in order to prove Theorem 1.

Proof of Lemma 10. As Fix(f) has interior, there is an interval [a, b] ⊂ Fix(f). By Lemma 5, we can conjugate by a translation and reduce to the case where a < 0 < b. The result is immediate for Observe that the solution u(x) = x to the corresponding initial value problem is unique. It is a standard result (see Chapter 3 of Hartman [2002] for instance) that since f satisfies the differential inequality (2), it is majorised by u. That is, f(x) ⩽ x for all x ∈ [b, ∞) and so f′(x) ⩽ 1 follows from (2). A similar argument may be used for the case x ∈ (−∞, a]. □

It is now easy to prove Theorem 1.

Proof of Theorem 1. If Fix(f) has non-empty interior, then by Lemma 10, f′(x) ⩽ 1 for all x ∈ R, and so Fix(f) is connected. Otherwise, Lemma 9 applies. □

The Schwarzian derivative
In this section, we use Schwarzian derivatives to characterise $C^3$ functions in F. Recall the Schwarzian derivative of a $C^3$ function f is defined by
$$S f(x) := \frac{f'''(x)}{f'(x)} - \frac{3}{2} \left( \frac{f''(x)}{f'(x)} \right)^2 .$$
In complex analysis, the Schwarzian derivative measures how much a function differs from a Möbius transformation. In the realm of functions on R, the Schwarzian derivative vanishes for affine functions, and is therefore a measure of how much a function varies from being affine. The following are standard results, which we will use in the proof of Theorem 2. For details on Lemma 11, see for instance Hawley and Schiffer [1966].
Lemma 11. Given a $C^3$ function f, let

The next lemma states the chain rule for Schwarzian derivatives as well as some properties that follow from it. The proof is omitted.
Lemma 12. For $C^3$ functions f and g,

Properties (2) and (3) still hold if all the inequalities are changed to be strict.
The following lemma is adapted from Chapter 9.4 of Robinson [2012].
Proof. Assume S f(x) ⩽ 0 for all x ∈ (a, b). Since a and b are fixed points, f cannot be expanding for all x in (a, b). Thus, there exists c ∈ (a, b) . By the mean value theorem applied to log(f′(x)) on [a, c], there exists r ∈ (a, c) such that g(r) < 0. Similarly, there exists t ∈ (c, b) such that g(t) > 0, and hence there exists s ∈ (r, t) such that g(s) = 0. A simple calculation shows $g'(x) = S f(x) + \tfrac{1}{2} (g(x))^2$, and by our assumption, g satisfies the differential inequality
$$g'(x) \le \tfrac{1}{2} (g(x))^2 .$$
Similar to the proof of Lemma 10, it follows that g(x) ⩽ 0 for x ∈ [s, b]. This contradicts g(t) > 0, so our initial assumption was wrong. □

We are now ready to prove Theorem 2.
Proof of Theorem 2. Suppose f ∈ F. Evaluating $U_f$ in Lemma 11 yields By Lemma 13, there exists p ∈ (a, b) such that S g(p) > 0. Therefore, by items (2) and (3) of Lemma 12, S f(q) > 0 for some q. □
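The criterion in Theorem 2 can be probed numerically. The sketch below estimates S f(x) by central finite differences, assuming the standard formula $S f = f'''/f' - \tfrac{3}{2} (f''/f')^2$; the function name `schwarzian` and the step size are illustrative choices, and the estimate is only reliable away from critical points and for moderate step sizes.

```python
import math

def schwarzian(f, x, h=1e-3):
    """Central finite-difference estimate of S f(x); assumes f'(x) != 0."""
    d1 = (f(x + h) - f(x - h)) / (2 * h)
    d2 = (f(x + h) - 2 * f(x) + f(x - h)) / h**2
    d3 = (f(x + 2 * h) - 2 * f(x + h) + 2 * f(x - h) - f(x - 2 * h)) / (2 * h**3)
    return d3 / d1 - 1.5 * (d2 / d1) ** 2

logistic = lambda x: 1.0 / (1.0 + math.exp(-x))

for name, f in [("tanh", math.tanh), ("logistic", logistic), ("exp", math.exp)]:
    print(name, [round(schwarzian(f, x), 3) for x in (-1.0, 0.0, 1.5)])
```

For these three functions the Schwarzian derivative is constant (−2 for tanh, and −1/2 for both the logistic sigmoid and exp), so the printed estimates should be close to those values at every sample point.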

Examples
For functions f ∈ F with no critical points, Theorem 1 allows for zero, one, two, three, or infinitely many fixed points, where in the last case, Fix(f) is some interval. In this section, we provide examples of functions in F for each of these cases. Further, we introduce a way of gluing some functions in F together to produce new functions in F.
Example 1. The following functions in F have no critical points and finitely many fixed points. In each case the number of fixed points is robust under $C^1$ small perturbations. Theorem 2 shows that the above functions are in F.
We now introduce another way of constructing new functions in F from other known functions in F. We say a function f ∈ F is glueable if f′(0) = 1 and |f(x)| ⩽ |x| for all x in the connected component of R \ Crit(f) containing zero. For glueable functions f and g, define
$$(f \star g)(x) := \begin{cases} f(x), & x \le 0, \\ g(x), & x > 0. \end{cases}$$
Lemma 14. If f and g are glueable, then f ⋆ g ∈ F.
Proof. By the definition of f ⋆ g, we may assume without loss of generality that f is odd. Take x, y > 0, where −x and y both lie in the connected component of R \ Crit(f ⋆ g) containing zero. It suffices to show $\chi_{f \star g}(-x, y) \le 1$. Since f and g are glueable, x f(x) g(y) ⩽ x f(x) y and y f(x) g(y) ⩽ y x g(y). Adding these together yields (x + y) f(x) g(y) ⩽ x y (f(x) + g(y)), and thus
$$\frac{f(x)\, g(y)}{f(x) + g(y)} \cdot \frac{x + y}{x y} \le 1,$$
and and the same for g. Therefore

Since F is closed under composition with affine functions, without loss of generality, we only need to find examples of h where Fix(h) is R, (−∞, 0], and [0, 1]. We saw in Example 1 that tanh ∈ F. Observe that tanh is also glueable. Furthermore, the identity function, id, is the only glueable affine function. We treat each case of Fix(h) separately.
(1) Observe h = id ∈ F is the only choice of h satisfying Fix(h) = R.

Conclusions
Part of our original motivation in studying this topic was to look at activation functions used in machine learning, the properties of these functions under composition, and of their fixed points.If an activation function lies in F then its composition with itself and with affine functions (in dimension one) will still lie in F. This suggests that such a function might not be well suited to approximating general functions, but more research is needed to see what relevance this one-dimensional work has to high-dimensional neural networks.
Early in the development of machine learning, the logistic sigmoid function $1/(1 + e^{-x})$ was a popular choice of activation function, and it lies in F. More recently, the ReLU function has become popular. Since the ReLU function is not $C^1$, our theory here does not apply. However, there are more regular variants of the ReLU function. One of these is the ELU function [Clevert et al., 2016], defined as gluing together the identity function x ↦ x for x ⩾ 0 and x ↦ $e^x - 1$ for x ⩽ 0. Lemma 14 shows that the ELU function lies in F.
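As a quick sanity check of the ELU claim, the sketch below builds ELU from its two pieces exactly as described in the text and evaluates $\chi$ at pairs (−x, y) that straddle the gluing point. It assumes the form $\chi_f(x, y) = f'(x) f'(y) \left( (x - y) / (f(x) - f(y)) \right)^2$; the names `elu`, `d_elu` and `chi` are hypothetical helpers for this example, and a finite sample is evidence rather than proof.

```python
import math

def elu(x):
    """ELU as described in the text: identity for x >= 0, e^x - 1 for x <= 0."""
    return x if x >= 0 else math.expm1(x)

def d_elu(x):
    """Derivative of ELU; both branches give 1 at x = 0, so ELU is C^1."""
    return 1.0 if x >= 0 else math.exp(x)

def chi(f, df, x, y):
    """Assumed form of chi_f(x, y) for distinct x and y."""
    ratio = (x - y) / (f(x) - f(y))
    return df(x) * df(y) * ratio * ratio

# Pairs (-x, y) with x, y > 0 lie on opposite sides of the gluing point.
samples = [0.01, 0.1, 0.5, 1.0, 2.0, 5.0]
worst = max(chi(elu, d_elu, -x, y) for x in samples for y in samples)
print(worst)   # expected to be below 1
```

Pairs on the same side of zero are not sampled here because they reduce to the identity (where the assumed $\chi$ equals 1) or to the exponential branch.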
We conclude by demonstrating with an example that F is not closed under addition. Consider the function f(x) := tanh(4x) + tanh(x/4), shown in Figure 2. Evaluating S f at x = 1 yields S f(1) > 1, so by Theorem 2, f ∉ F. Figure 3 shows how a composition of two copies of f together with three affine functions yields a function with five fixed points. The affine functions are included to show how extra fixed points can be obtained.
A bounded $C^2$ function f is sigmoidal if f′(x) > 0 for all x, and f has exactly one inflection point [Han and Moraga, 1995]. There are many examples of sigmoidal functions in F, such as tanh, the logistic sigmoid, erf, and arctan. The function f in the above example is also sigmoidal, but is not in F.
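The claim S f(1) > 1 for f(x) = tanh(4x) + tanh(x/4) can be checked with closed-form derivatives. The sketch below is one way to do so, assuming the standard formula $S f = f'''/f' - \tfrac{3}{2} (f''/f')^2$; the helper names `tanh_scaled_derivs` and `schwarzian_of_sum` are illustrative only.

```python
import math

def tanh_scaled_derivs(a, x):
    """First three derivatives of x -> tanh(a*x), in closed form."""
    t = math.tanh(a * x)
    s = 1.0 - t * t                      # sech^2(a*x)
    return a * s, -2 * a**2 * s * t, -2 * a**3 * s * (1 - 3 * t * t)

def schwarzian_of_sum(scales, x):
    """S f(x) for f(x) = sum of tanh(a*x) over the given scales a."""
    d1 = d2 = d3 = 0.0
    for a in scales:
        p, q, r = tanh_scaled_derivs(a, x)
        d1, d2, d3 = d1 + p, d2 + q, d3 + r
    return d3 / d1 - 1.5 * (d2 / d1) ** 2

print(schwarzian_of_sum([4.0], 1.0))         # summand tanh(4x) alone
print(schwarzian_of_sum([0.25], 1.0))        # summand tanh(x/4) alone
print(schwarzian_of_sum([4.0, 0.25], 1.0))   # the sum f from the text
```

Each summand tanh(ax) has constant Schwarzian derivative $-2a^2$, so both lie in F by Theorem 2, while the sum evaluates to roughly 1.19 at x = 1.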

Lemma 9. If f ∈ F has no critical points and Fix(f) has no interior, then f has at most three fixed points.
Proof. If a and b are fixed points, then by Lemma 8, for any fixed point p ∈ (a, b), f′(p) > 1. Since expanding fixed points cannot be adjacent, there is at most one fixed point in (a, b). As a and b were arbitrarily chosen, f cannot have more than three fixed points. □

Lemma 10. If f ∈ F has no critical points and Fix(f) has non-empty interior, then f′(x) ⩽ 1 for all x ∈ R.
has coexpanding fixed points a < b for some affine function A and the interval (a, b) contains no critical point of f .Let B be the affine function interchanging a and b.Then g

Figure 1. Functions in F plotted with the diagonal in grey.

Example 2. Here, we show how to construct a function h ∈ F such that Fix(h) = [a, b] for any prescribed, possibly unbounded, interval [a, b].

Figure 3. A plot of the composition x ↦ 4f(f(x + s) − 2s) + s + 4 with parameter s = 0.94, plotted with the diagonal to show where the five fixed points are.