We have encountered
the notion of the slope of a tangent line in several places in the previous
chapter. When we come to discuss Newton's Method of Approximation later, we
will supply more detail about derivatives. Here, we will develop some of the
geometric aspects of derivatives, and will list some of the basic algebraic
rules for calculating them.
Suppose we are given a function of time
$$y = f(t)$$
and suppose we know that at a particular time $t_0$ the value is a definite number: $y_0 = f(t_0)$.
We might then like to know how f changes as t changes from $t_0$ to $t_0 + \Delta t$.
This would give an estimate of the rate at which f is changing near time $t_0$.
We would say that the average rate of change of the function from time $t_0$ to $t_0 + \Delta t$ is the quotient
$$\frac{\Delta y}{\Delta t} = \frac{f(t_0 + \Delta t) - f(t_0)}{\Delta t}$$
where the symbol $\Delta$ is read "delta" and means "change in," and $\Delta y = f(t_0 + \Delta t) - f(t_0)$.
Now this average rate of change $\Delta y / \Delta t$ is called a "difference quotient," and it can be interpreted as the
slope of a line, the line connecting the points $(t_0, f(t_0))$ and $(t_0 + \Delta t, f(t_0 + \Delta t))$.
That line is called a "secant line." In the picture below, the function
is
and its graph is the blue graph. Suppose we are interested in the average rate
of change as t varies from 0 to 3. We see that
and
,
so we draw the line through points
.
That is the red line in the picture. In this case,
.
The slope of
the red line gives the average rate of change of the function as t changes
from 0 to 3. It is easy to see that this slope is 1. Now, as we choose smaller
and smaller values for $\Delta t$, we see that the secant lines, all of which pass through $(0, f(0))$,
tend to approach the line through $(0, f(0))$ that only "touches" the curve once and does not cross through it.

That line is called the "tangent line" at
,
from the Latin "tangere" meaning to "touch" as in the word
"tangible."
If we asked:
"How quickly is the function f growing precisely when
?" we might conclude that the answer ought to be the slope of the green
tangent line. Why? Because if we measure the rate of change more and more precisely,
we find that the slopes of these secant lines become closer and closer to the
slope of the tangent line.
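The way secant slopes settle toward a tangent slope can be watched numerically. The following is a minimal sketch, assuming a hypothetical function f(t) = t^2 and the point t0 = 1 (both chosen only for illustration; this is not the function drawn in the picture). For this choice the secant slope works out to 2 + dt, so the printed values should drift toward 2.

```python
def secant_slope(f, t0, dt):
    """Average rate of change of f from t0 to t0 + dt (the difference quotient)."""
    return (f(t0 + dt) - f(t0)) / dt


def f(t):
    # Hypothetical example function, not the one shown in the picture.
    return t * t


t0 = 1.0
steps = (1.0, 0.1, 0.01, 0.001)
slopes = [secant_slope(f, t0, dt) for dt in steps]

for dt, slope in zip(steps, slopes):
    print(f"dt = {dt:<6} secant slope = {slope:.6f}")
```

No single value of dt gives the tangent slope itself; the sketch only shows the trend that the limit concept is meant to capture.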
This is precisely
what Newton and Leibniz did, and they discovered in that process a very powerful
tool for describing changing quantities. Now the idea behind this construction
is very rich and deep. In Newton's view, for example, it was a matter of passing
from ratios of real changes $\Delta y / \Delta t$
to a new kind of ratio, a ratio of infinitely small changes (infinitesimal
changes) $dy/dt$,
where dy really is an infinitely small number, and dt really is
an infinitely small number (equal to 0?). This notion caused great consternation
and confusion when it was introduced, but Newton had the advantage of knowing
that the method worked. It gave correct answers, and eventually led to the unraveling
of the mystery of Kepler's Laws, as we shall see. It is difficult to argue with
success, even if it is based on such a patent absurdity as trying to give meaning
to $0/0$.
Newton's intuition
was later justified by the development of the real numbers
and the theory of limits. We understand what he did in that form
today, or else see that it is justified logically in the elaborate setting of
"non-standard analysis." We will interpret the slope of a tangent line as a limit:
the "limiting number" that $\Delta y / \Delta t$ tends to as $\Delta t$ tends to 0.
That limit will be called the "derivative" of the function f at the point $t_0$.
Let us start
with the notion of a limit. Consider the question again: "How quickly is
the function f growing precisely when
?" We might guess that the answer is 0, or that it is not growing at all,
but how would we know? If we examine the difference quotients
|
|
when
we will see
|
|
We want to know
what happens as h gets close to 0. Here, we make an important observation.
We do not believe the difference quotient has a value when $h = 0$,
and we really do not care to evaluate it there. We are really asking how the quotient
behaves when h is close to 0, but different from 0. In that case
we can replace
with
.
And this is a perfectly well-defined number. We see (intuitively for now) that
as we let h approach 0 through values different from 0, then
|
|
We will have
to give a meaning to the word "approaches" or the equivalent "$\to$",
and we will do that in a moment. But the conclusion is the intuitive
one. The instantaneous rate of change of f at
is the slope of the tangent line at
,
the number 0. If we could simply equate instantaneous rate of change with the
slope of the tangent line, we would have nothing more to say, and would take
this as the definition of the instantaneous rate of change.
But there are
two reasons why we cannot stop here. The first is that "slope of the tangent
line" is a geometric concept. It tells us nothing about how to compute
it. We have to learn how to calculate it, especially for more complicated functions
than the one just used. And the second reason is that, while slopes of tangent
lines are prime examples of limits, they are not the only ones. The concept
of limit arises in other ways than through derivative calculations. So we need
a refined concept of limit on which to base derivative calculations, anyway.
Now suppose
we changed the question to: "How quickly is the function f growing
precisely when
?" You see from an imaginary tangent line at
that it really is growing there at a rate different from 0. The appropriate
picture would now be:

Now if we examine
the difference quotients
|
|
when
we will see
|
|
and since
this is equal to
.
Now we see that as
,
|
|
We simply substitute
in
,
whereas, we could not make that substitution in
without some preliminary precaution. As a first tentative definition of a limit
let us say this:
Definition 1: A function $g(t)$ approaches a limit L at $t_0$, written
$$\lim_{t \to t_0} g(t) = L,$$
if $g(t)$ gets and stays as close to the number L as we might desire, if
we only restrict t to be sufficiently close to $t_0$.
This definition
captures the essential idea in words. But it is not quite serviceable for calculation
because of the lack of clear meaning in the phrases "as close to the
number L as we might desire" and "sufficiently close to $t_0$."
What it does capture clearly is that the limit of a function (if it exists)
is a number. And to prove that it exists, one has to establish the truth
of an implication: if we only restrict t to be sufficiently
close to $t_0$, then $g(t)$ gets and stays as close to the number L as we like. This If-Then
aspect of the limit definition is what we will try to understand next.
If I say that
the slope of the graph of the function f above is exactly
at
then no direct calculation of
for any particular nonzero h will establish that fact. The only h we
could use is
,
and that is "verboten." I have to say something more subtle.
Let us establish
an important term here so that we will have the language to express what we
need to say. That will give us control of ideas like "as close as we like"
and "sufficiently close."
Definition 2: Suppose that a positive number $\epsilon$
is chosen. Think of it as an error estimate. Then given $t_0$,
say that another number t is within "$\epsilon$ tolerance of $t_0$" if the distance from t to $t_0$
is less than $\epsilon$, that is, if $|t - t_0| < \epsilon$.
Now let us formulate
a more precise definition of limit using this term.
Definition 3: A function $g(t)$ approaches a limit L at $t_0$, written
$$\lim_{t \to t_0} g(t) = L,$$
if for any "$\epsilon$ tolerance" that is chosen in advance, there is a "$\delta$
tolerance" such that if t is within $\delta$ tolerance of $t_0$
(and t is not equal to $t_0$) then $g(t)$ is within $\epsilon$ tolerance of L.
In this form,
it is clear that in order to establish $\lim_{t \to t_0} g(t) = L$
we must be prepared to associate with each $\epsilon$
tolerance for L a corresponding $\delta$ tolerance for $t_0$,
for which a certain implication is true.
Our final definition
of limit will simply replace the tolerance language with mathematical terminology
more amenable to calculation. Here it is:
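Definition 4: A function $g(t)$ approaches a limit L at $t_0$, written
$$\lim_{t \to t_0} g(t) = L,$$
if for any $\epsilon > 0$ that is chosen in advance, there is a $\delta > 0$
such that
$$0 < |t - t_0| < \delta \implies |g(t) - L| < \epsilon.$$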
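The challenge-and-response character of this definition can be sketched in code. The example below assumes a hypothetical function g(t) = 3t + 1 at t0 = 2, whose limit there is 7; since |g(t) - 7| = 3|t - 2|, the response delta = epsilon / 3 works. The function names and the sampling scheme are illustrative choices only, and checking finitely many sample points is a sanity check, not a proof.

```python
def g(t):
    # Hypothetical example function; its limit at t0 = 2 is L = 7.
    return 3.0 * t + 1.0


def delta_for(eps):
    """Response to an epsilon challenge: |g(t) - 7| = 3|t - 2| < eps when |t - 2| < eps / 3."""
    return eps / 3.0


def check(eps, t0=2.0, L=7.0, samples=1000):
    """Test the implication 0 < |t - t0| < delta => |g(t) - L| < eps on sample points."""
    delta = delta_for(eps)
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # strictly between 0 and delta
        for t in (t0 - offset, t0 + offset):
            if not abs(g(t) - L) < eps:
                return False
    return True


for eps in (0.5, 1e-3, 1e-6):
    print(eps, delta_for(eps), check(eps))
```

Note that t = t0 itself is never sampled, matching the requirement that t be different from t0.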
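We should observe
right away that if a function $g(t)$ has a limit L at $t_0$,
then that limit is unique. We see this by supposing that it had another limit,
say $L'$, at $t_0$.
Then we would have
$$\lim_{t \to t_0} g(t) = L \quad\text{and}\quad \lim_{t \to t_0} g(t) = L'.$$
Now if $L \neq L'$ then let $\epsilon = |L - L'|/2$.
There would have to exist tolerances $\delta_1, \delta_2$
such that on the one hand,
$$0 < |t - t_0| < \delta_1 \implies |g(t) - L| < \epsilon,$$
and on the other,
$$0 < |t - t_0| < \delta_2 \implies |g(t) - L'| < \epsilon.$$
Now suppose
that $\delta$ is any positive number smaller than both $\delta_1, \delta_2$.
Then if $0 < |t - t_0| < \delta$,
it would follow that
$$|g(t) - L| < \epsilon \quad\text{and}\quad |g(t) - L'| < \epsilon.$$
By the triangle
inequality for numbers:
$$|L - L'| \le |L - g(t)| + |g(t) - L'|,$$
but
$$|L - g(t)| + |g(t) - L'| < 2\epsilon = |L - L'|.$$
So this would
mean that
$$|L - L'| < |L - L'|.$$
That is impossible if $|L - L'| > 0$! Therefore $L = L'$
and so the limit is unique.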
It is just for
the reason of being able to formulate such clear arguments that we must insist
on a precise definition of limit.
On this definition,
we see that the function g may not even be defined at $t_0$,
although it should be defined at all points different from $t_0$
but within some $\delta$ tolerance for $t_0$.
The important thing is how g behaves near $t_0$,
not how g behaves at $t_0$.
This theme will recur frequently in the aspect of Calculus that we call the "Art of Approximation."
And in this
regard, if we are also interested in the behavior of g at
,
we make a special definition.
Definition 5: A function $g(t)$
is said to be continuous at a point $t_0$
if g is defined at every point within some $\delta$ tolerance for $t_0$
and if
$$\lim_{t \to t_0} g(t) = g(t_0).$$
Most of the
functions that arise in our models are in fact continuous at every point at
which they are defined. For a continuous function, one calculates the limit
simply by evaluating the function at the point of interest. But the difference
quotients that we will use in a moment to define derivatives,
$$\frac{f(t_0 + h) - f(t_0)}{h},$$
are not defined at $h = 0$, and so are not continuous there. We have to do algebraic
calculations to evaluate their limits at 0, that is, to determine the slope
of the tangent line at $t_0$.
We now define
the derivative of a function $f(t)$
at a point $t_0$
in its domain of definition.
Definition 6: A function $f(t)$
has a derivative at a point $t_0$
on which it is defined if the limit
$$\lim_{h \to 0} \frac{f(t_0 + h) - f(t_0)}{h}$$
exists. In that case, this limit is written $f'(t_0)$,
and its value is the slope of the tangent line to the graph of f at the
point $(t_0, f(t_0))$.
If the limit
does not exist, then we say the function has no derivative at the point $t_0$.
Question 1: Convince yourself that the absolute value function $f(t) = |t|$
does not have a derivative at $t = 0$.
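A numerical hint for this question, offered as a sketch rather than a proof: compute difference quotients of the absolute value function at 0 using positive and then negative values of h. The helper name `one_sided_quotients` is an illustrative choice, not something defined in the text.

```python
def one_sided_quotients(f, t0, hs):
    """Difference quotients (f(t0 + h) - f(t0)) / h for each h in hs."""
    return [(f(t0 + h) - f(t0)) / h for h in hs]


# Approach 0 from the right (h > 0) and from the left (h < 0).
right = one_sided_quotients(abs, 0.0, [0.1, 0.01, 0.001])
left = one_sided_quotients(abs, 0.0, [-0.1, -0.01, -0.001])

print("h > 0:", right)
print("h < 0:", left)
```

The quotients from the right and from the left settle on different numbers, so no single number L can serve as the limit in Definition 4.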
Before we write
down the rules for calculating derivatives, we study the rules for calculating
limits. We will state them first, then see why they are true.
In what follows,
let us assume that f and g are functions and that
|
|
We see from rules
5 and 9 for example that the function
is continuous wherever it is defined, and from rules 7 and 8 that the function
is continuous everywhere.
Let us see why
these are true.
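Rule 1: If $f(t) = c$
is constant, then we want to show that if $\epsilon > 0$
is chosen, there is a $\delta > 0$
such that for any t with $0 < |t - t_0| < \delta$,
$$|f(t) - c| < \epsilon.$$
Proof: Whatever $\epsilon > 0$ is given, $|f(t) - c| = 0 < \epsilon$
for all t just because $f(t) = c$.
So any $\delta > 0$ will satisfy the condition.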
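Rule 2: If $f(t) = t$
is the identity function, then we want to show that if $\epsilon > 0$
is chosen, there is a $\delta > 0$
such that for any t with $0 < |t - t_0| < \delta$,
$$|f(t) - t_0| < \epsilon.$$
Proof: Given $\epsilon > 0$, let $\delta = \epsilon$.
Then since $f(t) = t$, it follows that if $0 < |t - t_0| < \delta$ then
$$|f(t) - t_0| = |t - t_0| < \delta = \epsilon.$$
So $\delta = \epsilon$ will satisfy the condition.
Notice that in
both cases so far, our job was to respond to the challenge of an $\epsilon$
with an appropriate $\delta$ tolerance for t.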
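Rule 3: We have assumed that $\lim_{t \to t_0} f(t) = L$ and $\lim_{t \to t_0} g(t) = M$.
Now if we are given $\epsilon > 0$,
we want to find a $\delta > 0$ such that
$$|(f(t) + g(t)) - (L + M)| < \epsilon$$
whenever $0 < |t - t_0| < \delta$.
Proof: Do it in two steps and use the triangle inequality. First
$$|(f(t) + g(t)) - (L + M)| \le |f(t) - L| + |g(t) - M|$$
by the triangle inequality. Now find tolerances $\delta_1, \delta_2$ such that
$$0 < |t - t_0| < \delta_1 \implies |f(t) - L| < \epsilon/2$$
and
$$0 < |t - t_0| < \delta_2 \implies |g(t) - M| < \epsilon/2.$$
Now let $\delta$ be the smaller of $\delta_1, \delta_2$ so that
$$0 < |t - t_0| < \delta \implies |f(t) - L| < \epsilon/2 \ \text{and}\ |g(t) - M| < \epsilon/2.$$
Therefore
$$|(f(t) + g(t)) - (L + M)| < \epsilon/2 + \epsilon/2 = \epsilon.$$
This shows that
the limit of a sum is the sum of the limits. Next comes a more difficult one.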
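Rule 4: Again, we have assumed that $\lim_{t \to t_0} f(t) = L$ and $\lim_{t \to t_0} g(t) = M$.
Now if we are given $\epsilon > 0$,
we want to find a $\delta > 0$ such that
$$|f(t)g(t) - LM| < \epsilon$$
whenever $0 < |t - t_0| < \delta$.
Proof: We use the triangle inequality again, in a more devious way. First observe that
$$f(t)g(t) - LM = f(t)\,(g(t) - M) + M\,(f(t) - L).$$
So, by the triangle inequality,
$$|f(t)g(t) - LM| \le |f(t)|\,|g(t) - M| + |M|\,|f(t) - L|.$$
Now the idea
is to make each summand $|f(t)|\,|g(t) - M|$ and $|M|\,|f(t) - L|$
separately less than $\epsilon/2$
and argue as we did before. But we have to use a little care. Therefore, let
us first choose $\delta_1$ so that
$$0 < |t - t_0| < \delta_1 \implies |M|\,|f(t) - L| < \epsilon/2.$$
We can do that by observing that $|M| < |M| + 1$
and then choosing the $\delta_1$ so that
$$0 < |t - t_0| < \delta_1 \implies |f(t) - L| < \frac{\epsilon}{2(|M| + 1)}.$$
(We needed the intermediate step in case $M = 0$.)
That was the easy part.
On the other
hand, to make the product $|f(t)|\,|g(t) - M|$
small we have to be sure that near $t_0$, $f(t)$
does not become large without bound. It doesn't. In fact, we already know that
$$0 < |t - t_0| < \delta_1 \implies |f(t) - L| < \frac{\epsilon}{2(|M| + 1)}.$$
Therefore, we can apply the triangle inequality "backwards" to conclude that
$$|f(t)| - |L| \le |f(t) - L|,$$
and so for $0 < |t - t_0| < \delta_1$,
$$|f(t)| < |L| + \frac{\epsilon}{2(|M| + 1)}.$$
That is, $|f(t)| < B$, where we write $B = |L| + \frac{\epsilon}{2(|M| + 1)}$.
Now, find a $\delta_2$ smaller than $\delta_1$ such that
$$0 < |t - t_0| < \delta_2 \implies |g(t) - M| < \frac{\epsilon}{2B}.$$
Then
$$|f(t)|\,|g(t) - M| + |M|\,|f(t) - L| < B \cdot \frac{\epsilon}{2B} + \frac{\epsilon}{2} = \epsilon.$$
Let this $\delta_2$
(which is smaller than $\delta_1$) be your response to $\epsilon$.
It does the job.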
Let us prove
Rule 6, and use it to prove Rule 5.
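Rule 6: If $\lim_{t \to t_0} g(t) = L$
and h is continuous at L, then $\lim_{t \to t_0} h(g(t)) = h(L)$.
Proof: We are given $\epsilon > 0$,
and we want to find a $\delta > 0$ such that
$$|h(g(t)) - h(L)| < \epsilon$$
whenever $0 < |t - t_0| < \delta$.
First find an $\eta > 0$ that will guarantee that
$$|y - L| < \eta \implies |h(y) - h(L)| < \epsilon.$$
This is guaranteed by continuity of h at L.
Now use $\eta$ as your new $\epsilon$.
That is, find a $\delta > 0$ such that
$$0 < |t - t_0| < \delta \implies |g(t) - L| < \eta.$$
Now when $0 < |t - t_0| < \delta$
we just saw that $|g(t) - L| < \eta$, by the way $\delta$
was chosen. Putting it together,
$$0 < |t - t_0| < \delta \implies |g(t) - L| < \eta \implies |h(g(t)) - h(L)| < \epsilon,$$
and we are done!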
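Rule 5: Now to prove Rule 5, we will show that the function $1/t$
is continuous wherever it is defined. Therefore since in Rule 5, we assume that
$\lim_{t \to t_0} g(t) = M$ and that $M \neq 0$,
it will follow from the continuity of $1/t$
and from Rule 6 that
$$\lim_{t \to t_0} \frac{1}{g(t)} = \frac{1}{M}.$$
And then Rule 4 will give the result.
To show that $1/t$
is continuous wherever it is defined, we suppose that $t_0 \neq 0$
and then, given $\epsilon > 0$,
we seek a $\delta > 0$ such that
$$|t - t_0| < \delta \implies \left|\frac{1}{t} - \frac{1}{t_0}\right| < \epsilon.$$
Now
$$\left|\frac{1}{t} - \frac{1}{t_0}\right| = \frac{|t - t_0|}{|t|\,|t_0|}.$$
It is enough
to handle the case where $t_0 > 0$,
since the other case will be similar. Now there is a $\delta_1$ (for instance $\delta_1 = t_0/2$)
with the property that
$$|t - t_0| < \delta_1 \implies |t - t_0| < \frac{t_0}{2}.$$
Then
$$-\frac{t_0}{2} < t - t_0,$$
and this implies that
$$t > \frac{t_0}{2} > 0,$$
which finally implies that
$$\frac{1}{t} < \frac{2}{t_0}.$$
Therefore, if $|t - t_0| < \delta_1$ then
$$\left|\frac{1}{t} - \frac{1}{t_0}\right| < \frac{2\,|t - t_0|}{t_0^2}.$$
Now if we further restrict t to satisfy $|t - t_0| < \delta$,
where $\delta \le \delta_1$ and $\delta \le \epsilon\, t_0^2 / 2$, then
$$\left|\frac{1}{t} - \frac{1}{t_0}\right| < \frac{2}{t_0^2} \cdot \frac{\epsilon\, t_0^2}{2} = \epsilon.$$
That gives the result.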
Rule 7: That all polynomial functions are continuous everywhere is an immediate consequence of Rules 1-4.
We will skip
Rule 8, which is technical, but presents no surprises, and discuss Rule
9.
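Rule 9: The trigonometric functions $\sin t$ and $\cos t$
are continuous everywhere. To show this, it is enough to show two things:
$$\lim_{t \to 0} \sin t = 0 \quad\text{and}\quad \lim_{t \to 0} \cos t = 1.$$
Actually, the
second follows from the first, Rules 7 and 8, and the Pythagorean Identity
$$\cos t = \sqrt{1 - \sin^2 t}$$
when $-\pi/2 < t < \pi/2$.
But we also know that if $0 < t < \pi/2$
then $0 < \sin t < t$.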
To see this, consider the picture of a circle of radius 1. Let

t be the length of arc
Then the length of segment
is equal to
And the area of
.
But the area of the sector
of the circle is
.
Therefore in this case
.
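The same is true for negative t, in the sense that $|\sin t| \le |t|$. Now to show that $\lim_{t \to 0} \sin t = 0$
we observe that given $\epsilon > 0$,
we can find a $\delta > 0$ such that
$$|t| < \delta \implies |\sin t| < \epsilon.$$
Simply choose $\delta = \epsilon$.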
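As a quick sanity check of this choice, the sketch below samples points with |t| < delta = epsilon and confirms that |sin t| stays below epsilon. The function name `sine_within` and the sampling grid are illustrative assumptions; sampling supports the geometric argument but does not replace it.

```python
import math


def sine_within(eps, samples=1000):
    """Check |sin t| < eps at sample points with 0 < |t| < delta, where delta = eps."""
    delta = eps
    for k in range(1, samples + 1):
        t = delta * k / (samples + 1)  # strictly between 0 and delta
        if not (abs(math.sin(t)) < eps and abs(math.sin(-t)) < eps):
            return False
    return True


for eps in (1.0, 0.1, 1e-4):
    print(eps, sine_within(eps))
```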
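Question 2: Show that $\lim_{t \to 0} \cos t = 1$
directly from the picture above instead of using Rule 8.
Now
to finish the proof of Rule 9, we simply use the identities:
$$\sin(t_0 + h) = \sin t_0 \cos h + \cos t_0 \sin h$$
and
$$\cos(t_0 + h) = \cos t_0 \cos h - \sin t_0 \sin h.$$
Question 3: Use these identities to establish that the trigonometric
functions $\sin t$ and $\cos t$
are continuous everywhere.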
If we ask of
a function of time $y = f(t)$
how quickly it is changing at an instant $t_0$,
the answer is given by the limit
$$\lim_{h \to 0} \frac{f(t_0 + h) - f(t_0)}{h}$$
if that limit exists. We call that limit the derivative of f at $t_0$,
and interpret it geometrically as the slope of the tangent line to the graph
of f at the point $(t_0, f(t_0))$.
The chameleon derivative appears in many forms. Sometimes, for example, we will
write it as
|
|
Our task now
is to learn how to calculate it. That will occupy the remainder of this lecture.
Just as there are rules for calculating limits, there are algebraic rules to
calculate derivatives. And if in the end the algebraic rules do not suffice,
then remember that a derivative is always just a limit.
First, we will
observe that if a function
has a derivative at
,
say
then f is continuous at
.
To see this, we cast the derivative in a new light, as a means of approximation.
Calculus provides a very convenient method for approximating functions called
differential approximation. We will describe the general procedure
here, and return to it later in the context of Newton's Method.
Suppose given
a function
|
|
and suppose we know that for the particular number
that the value is another definite number:
.
We might then like to approximate the value of this function at a nearby point
.
That is, we might like to get a good estimate of
when
is small, or, put another way, when
is close to
.
Now, if we define
the new function of
|
|
then this measures the difference between the actual value of
at
and the
-coordinate of the point on the tangent line
to the graph of
at
whose
-coordinate is
.
We draw a picture.
Let
and ![]()

The point on the tangent line
|
|
approximates the actual point
|
|
We may say that in general, for arbitrary
,
the point on the tangent line
|
|
approximates the actual point
|
|
Now, in the picture,
is the length of the vertical segment between the horizontal lines.
From the definition of the derivative, we know that, assuming that
is differentiable at the point ![]()
|
|
This means that
the function of
that appears to the right of the minus sign in the definition
|
|
|
|
is a very good approximation to the function
|
|
The new function
is essentially
,
except that we measure the independent variable, h, from
,
and the graph of
is the tangent line when we measure its independent variable also from
.
This function
is called the "differential approximation" to
.
We have said nothing new. We are simply reinterpreting the derivative.
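To see the differential approximation at work, here is a minimal sketch. It assumes a hypothetical function, the square root, at the base point x0 = 4, where the derivative is known to be 1/4; the names `differential_approximation`, `tangent`, and `ratios` are illustrative and do not come from the text. The code compares the true value at x0 + h with the tangent-line value, and also prints the error divided by h, which should shrink toward 0 as h does.

```python
import math


def differential_approximation(f, fprime, x0):
    """Return the tangent-line function h -> f(x0) + f'(x0) * h."""
    def tangent(h):
        return f(x0) + fprime(x0) * h
    return tangent


def f(x):
    # Hypothetical example function.
    return math.sqrt(x)


def fprime(x):
    # Its derivative, assumed known for this illustration.
    return 0.5 / math.sqrt(x)


x0 = 4.0
tangent = differential_approximation(f, fprime, x0)

ratios = []
for h in (0.5, 0.05, 0.005):
    error = f(x0 + h) - tangent(h)  # actual value minus tangent-line value
    ratios.append(error / h)
    print(f"h = {h:<6} actual = {f(x0 + h):.8f} tangent = {tangent(h):.8f} error/h = {error / h:.2e}")
```

That the error shrinks faster than h itself is exactly what makes the tangent line a good approximation near x0.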
Now why is
if
?
Well, in that case, it is clear that
|
|
but
|
|
and so since
|
|
we see that
|
|
and this means that
|
|
In what follows,
let us assume that f and g are functions and that they have derivatives
at ![]()
|
|


Rule 1 says that the derivative of a constant function is 0.
|
Proof: |
|
Rule 2 says that the derivative of the identity function is 1.
|
Proof: |
|
Rule 3 says that ![]()
Proof: ![]()
|
|
Rule 4 says that ![]()
Proof: This requires a little work.
|
|
Write
|
|
But, by continuity of
,

Similarly,
|
|
That
finishes the proof of the product formula.
Rule 5) We will now observe that Rule 5 can be reduced into two steps.
then we can write
|
|
If we can show that if
then
|
|
then rule 5 will follow from product rule 4.
Now
is the composition of two functions,
|
|
where
|
|
We will show directly that
|
|
and then Rule 6 (called the Chain Rule) will allow us to conclude that,
for
,
|
|
Therefore, we
have to prove two things:
We have to prove that
and we have to prove Rule 6.
Lemma 1: If
,
then ![]()
Proof: We want to calculate
|
|
Then it follows from Limit Rule 5 that ![]()
Rule 6) Now, we prove the Chain Rule. It says that if
and
,
|
then if |
Proof: We use the differential approximation formula. First from
|
|
we can write
|
|
and we know that
|
|
Next, from the
fact that
,
we can write for a new function ![]()
|
|
where
|
|
and we can write
|
|
Now, letting
|
|
we see that
|
|
and since
|
|
we have
![]()
or
|
|
Now we claim
that
|
|
In fact, it is obvious that
|
|
What about the other summand?
|
|
There are two
cases to consider.
Case 1:
In this case,
|
|
Therefore there is a
such that
.
In that case, we may write, while
,
|
|
and it is clear that
|
|
and so
|
|
Meanwhile,
|
|
so in that case,
|
|
This brings
us to
Case 2)
In that case, we must show that
|
|
Now since in general,
certainly there is a
with
|
|
Thus, as long
as
we know that
.
We can guarantee that
by restricting
to be sufficiently close to 0 because
is continuous at
.
End of Proof
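The Chain Rule can be spot-checked numerically. The sketch below assumes two hypothetical functions, an inner function g(x) = x^2 + 1 and an outer function f = sin, and estimates derivatives with a central difference quotient. The central difference is a numerical convenience chosen here, not the one-sided quotient used in the definition, and the step size 1e-6 is an arbitrary choice.

```python
import math


def derivative(func, x, h=1e-6):
    """Central-difference estimate of func'(x); an approximation, not the limit itself."""
    return (func(x + h) - func(x - h)) / (2 * h)


def g(x):
    # Hypothetical inner function.
    return x * x + 1.0


f = math.sin  # hypothetical outer function

x0 = 0.5
composite = derivative(lambda x: f(g(x)), x0)      # derivative of f(g(x)) estimated directly
chain = derivative(f, g(x0)) * derivative(g, x0)   # f'(g(x0)) * g'(x0)
exact = math.cos(g(x0)) * 2 * x0                   # known closed form for this example

print(composite, chain, exact)
```

Agreement of the three numbers to several decimal places is consistent with the rule; it is evidence for one example, not a proof.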
As you might
have guessed, the chain rule is by far the most powerful of the derivative rules.
You can derive almost everything else from it. For example, we may prove Rule
10 easily from the chain rule.
Rule 10) We are saying that
.
Therefore, by the chain rule,
|
|
And so
|
|
Once we prove
Rule 7, Rule 8 will follow by an application of the Chain Rule.
Rule 7) We use Rule 4 and a technique of proof called Mathematical
Induction to show that the functions $f(x) = x^n$,
for n a positive integer, have derivatives $f'(x) = n x^{n-1}$.
Let the proposition $P(n)$ about the natural numbers n:
1, 2, 3, 4, ... state that
$$\frac{d}{dx} x^n = n x^{n-1}.$$
Certainly $P(1)$
is true. It is just Rule 2. Now suppose for some (large) natural number M, $P(M)$
is false. Certainly $M > 1$,
since $P(1)$
is true. An axiomatic property of the natural numbers is that if some subset
is not empty, then it contains a smallest element. This is called the "Well-ordering"
of the Natural numbers.
Therefore let
the smallest number for which the proposition is false be denoted $N$.
We still know that $N > 1$.
Perhaps it is 17 gazillion + 143. Who knows? In any case, we know that $P(N-1)$
is true:
$$\frac{d}{dx} x^{N-1} = (N-1)\, x^{N-2}.$$
But let us now apply Rule 4. Let $f(x) = x^{N-1}$ and $g(x) = x$.
Then
$$\frac{d}{dx} x^N = \frac{d}{dx}\left(x^{N-1} \cdot x\right) = (N-1)\,x^{N-2} \cdot x + x^{N-1} \cdot 1 = N x^{N-1}.$$
But this shows
that the proposition is true for $N$: $P(N)$
is true. This contradiction means that the set of natural numbers for which
the proposition is false is empty. That proves Rule 7.
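The inductive step can be mimicked mechanically. In the sketch below, polynomials are stored as coefficient lists, lowest degree first, which is a representation chosen here for illustration. The function `d_power` builds the derivative of x^n using only the base case (the derivative of x is 1) and the product rule applied to x^(n-1) times x, then the result is compared with n x^(n-1).

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists (lowest degree first)."""
    size = max(len(p), len(q))
    p = p + [0] * (size - len(p))
    q = q + [0] * (size - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def d_power(n):
    """Coefficients of the derivative of x**n, built from the base case and the product rule."""
    if n == 1:
        return [1]                      # base case: the derivative of x is 1
    previous_power = [0] * (n - 1) + [1]  # x**(n-1)
    x = [0, 1]                          # the identity function x
    # (x**(n-1) * x)' = (x**(n-1))' * x + x**(n-1) * (x)'
    return poly_add(poly_mul(d_power(n - 1), x), poly_mul(previous_power, [1]))


for n in range(1, 6):
    print(n, d_power(n))
```

Each call leans on the result for n - 1, just as the proof leans on the truth of the proposition for the previous natural number.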
Question 4: Prove Rule 8 from Rule 7 and using the chain rule.
You may assume the functions
are differentiable and that
.
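The last derivative rule that we will consider here is Rule 9, which explains how to differentiate trigonometric functions. It states:
$$\frac{d}{dx} \sin x = \cos x \quad\text{and}\quad \frac{d}{dx} \cos x = -\sin x$$
for all x.
It is a bit different from the others, as we will see in the proof. In order to calculate
$$\lim_{h \to 0} \frac{\sin(x + h) - \sin x}{h}$$
we will of course appeal to the trigonometric identity
$$\sin(x + h) = \sin x \cos h + \cos x \sin h.$$
And that will lead us to the expression
$$\frac{\sin x \cos h + \cos x \sin h - \sin x}{h}.$$
Now we can examine the expression and we will see
$$\sin x \cdot \frac{\cos h - 1}{h} + \cos x \cdot \frac{\sin h}{h}.$$
We write it this way because, as it happens,
$$\lim_{h \to 0} \frac{\sin h}{h} = 1$$
and
$$\lim_{h \to 0} \frac{\cos h - 1}{h} = 0.$$
Once we know these limits, it is easy to see that
$$\lim_{h \to 0} \frac{\sin(x + h) - \sin x}{h} = \sin x \cdot 0 + \cos x \cdot 1 = \cos x.$$
So $\frac{d}{dx} \sin x = \cos x$.
Also, when we examine the difference quotient for the differentiation of the cosine, we see from the trigonometric identity
$$\cos(x + h) = \cos x \cos h - \sin x \sin h$$
that the difference quotient is
$$\frac{\cos x \cos h - \sin x \sin h - \cos x}{h},$$
and the expression can be written as
$$\cos x \cdot \frac{\cos h - 1}{h} - \sin x \cdot \frac{\sin h}{h},$$
and so those same limits tell us that $\frac{d}{dx} \cos x = -\sin x$.
And that is Rule 9. It is easy to derive all of the other formulas from it. For example, using Rule 5,
$$\frac{d}{dx} \tan x = \frac{d}{dx}\,\frac{\sin x}{\cos x} = \frac{\cos x \cdot \cos x - \sin x \cdot (-\sin x)}{\cos^2 x} = \frac{1}{\cos^2 x} = \sec^2 x,$$
and so on.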
But why are the limits
$$\lim_{h \to 0} \frac{\sin h}{h} = 1 \quad\text{and}\quad \lim_{h \to 0} \frac{\cos h - 1}{h} = 0$$
true? The answer to that is in the geometry of the wrapping function.

For small positive angle
at
the center of the unit circle, construct the sector of arc
Drop
a perpendicular from B to radius OA to construct right triangle
.
And let line segment
be
tangent to the circle at B. Then the area of
is
.
The area of
is
and
the area of
is
![]()
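So, writing $h$ for the small positive angle, we have the inequality
$$\tfrac{1}{2} \sin h \cos h < \tfrac{1}{2} h < \tfrac{1}{2} \tan h.$$
For small $h > 0$, dividing through by $\tfrac{1}{2} \sin h$,
we will have
$$\cos h < \frac{h}{\sin h} < \frac{1}{\cos h}.$$
Now since cosine is continuous, as we showed earlier, and $\cos 0 = 1$,
it is easy to see that
$$\lim_{h \to 0^+} \frac{h}{\sin h} = 1.$$
And therefore, from the Limit Rule 5,
$$\lim_{h \to 0^+} \frac{\sin h}{h} = 1.$$
Actually, we only proved this for $h > 0$,
but the extension of this argument to include $h < 0$
is
easy.
To see that
$$\lim_{h \to 0} \frac{\cos h - 1}{h} = 0$$
we do an algebraic trick. For, say, $0 < |h| < \pi/2$,
$$\frac{\cos h - 1}{h} = \frac{(\cos h - 1)(\cos h + 1)}{h\,(\cos h + 1)} = \frac{-\sin^2 h}{h\,(\cos h + 1)} = -\frac{\sin h}{h} \cdot \frac{\sin h}{\cos h + 1},$$
and so
$$\lim_{h \to 0} \frac{\cos h - 1}{h} = -1 \cdot \frac{0}{2} = 0.$$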
And that finishes our discussion of derivatives of trigonometric functions for now.
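Both trigonometric limits can be watched numerically. This is a small sketch using Python's standard `math` module; the sample values of h are arbitrary choices, and the variable names are illustrative. It also checks the squeeze inequality cos h < h / sin h < 1 / cos h at each sample.

```python
import math

hs = (0.1, 0.01, 0.001)

sine_ratios = [math.sin(h) / h for h in hs]            # expected to approach 1
cosine_ratios = [(math.cos(h) - 1.0) / h for h in hs]  # expected to approach 0
squeezed = [math.cos(h) < h / math.sin(h) < 1.0 / math.cos(h) for h in hs]

for h, s, c, ok in zip(hs, sine_ratios, cosine_ratios, squeezed):
    print(f"h = {h:<6} sin(h)/h = {s:.8f} (cos(h)-1)/h = {c:.8f} squeeze holds: {ok}")
```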
In this exploration,
we will work with 15 built-in examples to illustrate the role of the derivative
to give qualitative information about the behavior of a function, such as where
it is increasing, where decreasing, and so on. And when you are comfortable
with the ideas, then you can try your own functions. We will show you how to
do that, too.
As you supply
your own functions, you will see how the graph looks, and what relation the
graph has to its critical points and inflections.
A function is
a convenient way to represent a relationship between two variables. When two
variables, say x and y stand for measurements of some pair of
quantities that change together in some process, and when for each value that
x takes only one y value corresponds to it, then we say that y depends
on x. This is because we only need to know the value of x to
determine the pair. When x and y are bound together in this way,
we often say also that y is a function of x. The idea is that
somehow each value that x takes determines the value that y takes
in the measurement process.
When, in some process, y is a function of x, we may write: $y = f(x)$.
And we think of f (the function) as a rule that assigns to each value
of x, the y that corresponds to it in the process. Now, processes are most
often thought of as temporal things, but they may be more abstract than that.
When the process is temporal, then one natural choice for the independent variable,
say x, is the measurement of elapsed time itself, as for example, when
we let an object fall, and bind the time to the height above the ground. But
we may, for example imagine an experiment in which we construct circles with
various diameters, and bind together the measurement of diameter to the measurement
of circumference.
In that case, the "process" is not really temporal, but
it is a conceptual and geometrical process in which, for each construction,
given a unit of measurement, a pair of numbers is bound together: We may call
the diameter of the circle D, and the circumference C. Then we
might say that C was a function of D, say $C = f(D)$.
But now something interesting happens. For such an abstract process, we are
permitted to write: $C = \pi D$.
And the functional relationship between D and C is now expressed
in symbolic terms. We have, in fact, an algebraic procedure for producing C
whenever we
know the value of D. Simply multiply D by $\pi$.
So we say: $f(D) = \pi D$.
For "abstract" relationships such as this, or as those proposed by our models, the functional dependence is reduced to an algebraic procedure. And we describe the functional relationship with the formula that is used to produce the value of the dependent variable (y) when the value of the independent variable (x) is known. This is fine. But we must understand that the algebraic procedure is a property of our model. The functional dependence between the variables measured in the process expresses a richer and a deeper thing. As it did for Galileo, for example. In a sense, it expresses a belief (that the model attempts to articulate) that a certain process of measurement will yield the results predicted by the formula.
Below, we will see a few things that can be learned about functions
- as snapshots of processes that bind measured variables - from their pictures.
A graph of a function is indeed a "picture" of it. It is just the
set of bound pairs $(x, y)$, where $y = f(x)$,
drawn
in a plane. The independent variable (often t or x) is represented
on the abscissa (or horizontal axis), and the dependent variable is represented
on the ordinate (or vertical axis).
Much can be learned about the behavior of the variables from a picture like this. The information is qualitative, that is, "fuzzy," but that is usually the way it is with real data before it is modeled by formulas. The examples that we will use will in fact come from formulas. But you should keep in mind that many of the conclusions you draw from these examples will still be meaningful even when the data is not produced by formula, but by some measurement process. Let us start with a simple thing we can observe about a function.
We say that a function is monotonic on an interval (a,b) of values of the independent variable if either it is increasing as the independent variable increases in (a,b) (monotonic increasing), or it is decreasing as the independent variable increases in (a,b) (monotonic decreasing). For example, if the independent variable is the amount spent on advertising, and the dependent variable is net profit, then it is very useful to know if the "function" is monotonic on an interval of spending levels, and which way! That ought to affect corporate strategy.
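Regions of monotonicity can be estimated from sampled values alone, which is close in spirit to reading them off a graph. The sketch below is one possible approach, assuming a hypothetical cubic f(x) = x^3 - 3x on the interval [-2, 2] (not one of the built-in examples); the function name `monotone_intervals` and the sample count are illustrative choices. It walks through the samples and records each maximal run on which the values rise or fall.

```python
def monotone_intervals(f, a, b, n=1000):
    """Split [a, b] into maximal runs on which sampled values of f rise or fall."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    ys = [f(x) for x in xs]
    intervals = []
    start = 0
    direction = None
    for k in range(1, n + 1):
        if ys[k] > ys[k - 1]:
            step = "increasing"
        elif ys[k] < ys[k - 1]:
            step = "decreasing"
        else:
            step = direction  # a flat step continues the current run
        if direction is None:
            direction = step
        elif step != direction:
            # The run ends at the previous sample, which is a turning point.
            intervals.append((direction, xs[start], xs[k - 1]))
            start = k - 1
            direction = step
    intervals.append((direction, xs[start], xs[n]))
    return intervals


def f(x):
    # Hypothetical example function.
    return x ** 3 - 3 * x


intervals = monotone_intervals(f, -2.0, 2.0)
for kind, left, right in intervals:
    print(f"{kind:<10} on [{left:.3f}, {right:.3f}]")
```

The endpoints where one run stops and the next begins are the sampled estimates of local maxima and minima; a finer sampling gives a sharper estimate, but only the derivative pins them down exactly.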
Let's see what this looks like. In the upper right corner of the
screen is a button:
Presently,
example #1 is installed. There are 15 examples, and to get one, just write
the number in the field and press the button. The current example function
definition will appear in the box below:
will always be a function of x (The independent variable will always be x).
You may type numbers into the Get Example field, or just
click
for
the next example. When you tire of our examples, create your own. Just write
their definition in the "f(x) :" box and the system will use
them. Be sure the independent variable is x in your formula.
And you may set the Abscissa, Ordinate, and Domain intervals (for example, to
get a close-up view, or to look at the graph on some restricted domain) by
typing the interval in the appropriate box below the graph screen, and then
pressing the button.
To see the graph, press the graph function button in the cluster of buttons:
You will see the first example.

The 3 highlighted
points correspond to the "zeros" of the function, the values of x
for which the function takes the value 0. Now, what are the regions of monotonicity?
To see the intervals on which y increases with x, press the increasing button in that cluster. You will see:

The section of the graph highlighted in black is the part of the single interval on which f is increasing. As you move from left to right, the graph rises there. Next, to see where f is decreasing, press the decreasing button. You see:

And the regions highlighted in white are over the 2 intervals on which f is decreasing. On each of those intervals, y gets smaller as x gets larger.
What can we say about the points on the borders between increasing and
decreasing f ? For example when f stops decreasing (with increasing
x) and begins to increase, we say that f has attained a local minimum.
This happens at the point whose coordinates are approximately
We
shall see what the exact coordinates are shortly. Again, at the point whose
coordinates are roughly 