Sobolev Spaces on Euclidean Space
Graeme Wilkin
January 29, 2011

1. Introduction

The purpose of these notes is to outline the basic definitions and theorems for Sobolev spaces defined on open subsets of Euclidean space. Of course, there are already many good references on this topic, and, rather than duplicate this here, instead the goal is to give examples where possible to illustrate the theory, and to orient the reader towards the different approaches contained in the literature. In addition, there is an appendix containing some basic results from measure theory (again, this contains examples and references to some of the literature on the subject).

There are a number of more advanced topics that have been left to future versions of these notes; for example, complete proofs of the embedding and compactness theorems (as well as examples where embeddings don’t exist), the chain rule and the behaviour of weak derivatives under co-ordinate transformations. Good references for this material include the book [1] by Adams (a classic on the subject), and Ziemer’s book [16]. Sobolev spaces on manifolds and their use in gauge theory would also be good topics for an expanded version of these notes. Future versions of these notes will also contain more examples.

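Notation. First-order partial derivatives $\frac{\partial u}{\partial x_i}$ are denoted $\partial_{x_i} u$ or $\partial_i u$. Higher-order partial derivatives use the standard notation for multi-indices (see [4, Appendix A]): given a multi-index $\alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{Z}_{\geq 0}^n$ we write $|\alpha| = \alpha_1 + \cdots + \alpha_n$ for the order of $\alpha$, and define the $\alpha$th order partial derivative by

\[ D^\alpha u = \frac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}} = \partial_{x_1}^{\alpha_1} \cdots \partial_{x_n}^{\alpha_n} u. \]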
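As a small illustration of the multi-index notation, the following Python sketch enumerates all multi-indices of order at most k in n variables. The helper names `order` and `multi_indices` are hypothetical and introduced only for this example; the count C(n + k, n) used in the last line is the standard stars-and-bars formula, not a statement taken from these notes.

```python
from itertools import product
from math import comb


def order(alpha):
    """Order |alpha| = alpha_1 + ... + alpha_n of a multi-index."""
    return sum(alpha)


def multi_indices(n, k):
    """All multi-indices alpha with n non-negative integer entries and |alpha| <= k."""
    return [a for a in product(range(k + 1), repeat=n) if order(a) <= k]


if __name__ == "__main__":
    n, k = 2, 2
    for alpha in sorted(multi_indices(n, k), key=order):
        print(alpha, "has order", order(alpha))
    # The number of such multi-indices is the binomial coefficient C(n + k, n).
    print(len(multi_indices(n, k)) == comb(n + k, n))
```

This count is the number of weak derivatives that enter the Sobolev norm of order k in n variables.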

2. Definition of Sobolev spaces

This section contains the definitions needed to define Sobolev spaces on open subsets of $\mathbb{R}^n$. In order to get a feel for distributional derivatives and Sobolev spaces, some basic examples are given throughout the section.

The approach taken in these notes is to follow the historical definition of Sobolev spaces. First, in this section, we define the distributional and weak derivatives, and then define Sobolev spaces in terms of these. Later on, in Section 3.2, we prove the Meyers-Serrin theorem, which says that Sobolev spaces are the completion of the space of smooth functions in the Sobolev norm.

2.1. Distributions and test functions

Let $\Omega \subseteq \mathbb{R}^n$ be open and non-empty, and let $\mathcal{C}^\infty_{cpt}(\Omega)$ denote the space of smooth complex-valued functions with compact support in $\Omega$.

Definition 2.1.

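The space of test functions on $\Omega$, denoted $\mathcal{D}(\Omega)$, is the locally convex topological vector space (more precisely the LF-space) consisting of all the functions in $\mathcal{C}^\infty_{cpt}(\Omega)$, with the following notion of convergence: a sequence $(\phi_m)_{m \in \mathbb{N}} \subset \mathcal{C}^\infty_{cpt}(\Omega)$ converges in $\mathcal{D}(\Omega)$ to the function $\phi \in \mathcal{C}^\infty_{cpt}(\Omega)$ if and only if there is some fixed compact set $K \subset \Omega$ such that the support of $\phi_m - \phi$ is in $K$ for all $m$, and $D^\alpha \phi_m \to D^\alpha \phi$ uniformly for each $\alpha$.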

Remark 2.2.
  1. The notation $\mathcal{D}(\Omega)$ is used to emphasise the topology on $\mathcal{C}^\infty_{cpt}(\Omega)$ described above.

  2. Note that the definition does not imply that the constants from the uniform convergence are independent of $\alpha$.

  3. To see that $\mathcal{D}(\Omega)$ is a locally convex topological vector space, we construct a family of seminorms which induces the topology on $\mathcal{D}(\Omega)$. To this end, denote by $E_{cpt}(\Omega)$ the set of compact exhaustions of $\Omega$, that is, the set of all families $K = (K_i)_{i \in \mathbb{N}}$ such that $K_i \subset \Omega$ is compact, $\bigcup_{i \in \mathbb{N}} K_i = \Omega$ and $K_i \subset K_{i+1}$ for all $i \in \mathbb{N}$. For every compact exhaustion $K$ and every pair $M = (m_i)_{i \in \mathbb{N}}$ and $N = (n_i)_{i \in \mathbb{N}}$ of sequences of natural numbers, denote by $p_{K,M,N} : \mathcal{D}(\Omega) \to \mathbb{R}_{\geq 0}$ the map defined by

    \[ p_{K,M,N}(\phi) = \sum_{i=0}^{\infty} \, \sup_{x \in K_{i+1} \setminus K_i} \, \sup_{0 \leq |\alpha| \leq m_i} n_i \, |D^\alpha \phi(x)|, \qquad \phi \in \mathcal{D}(\Omega). \]

    Note that the sum in this formula is always finite since the support of $\phi$ is compact.

    It is straightforward to check that $p_{K,M,N}$ is a seminorm on $\mathcal{D}(\Omega)$ and that the family $(p_{K,M,N})_{K \in E_{cpt}(\Omega), M, N}$ defines a locally convex topology on $\mathcal{D}(\Omega)$ that exactly recaptures the convergence in $\mathcal{D}(\Omega)$ as defined above (see [14, Chapter 13] for more details on the LF-space structure of $\mathcal{D}(\Omega)$). We present this explicit family of seminorms here, since we do not know of a reference to it in the literature.

Definition 2.3.

A distribution is a continuous complex-valued linear functional on the space of test functions $\mathcal{D}(\Omega)$. The space of distributions is the dual

\[ \mathcal{D}(\Omega)^* = \{ T : \mathcal{D}(\Omega) \to \mathbb{C} \mid T \text{ linear and continuous} \}. \]
Remark 2.4.

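In the above definition, linearity simply means that if $T \in \mathcal{D}(\Omega)^*$ and $\phi, \psi \in \mathcal{D}(\Omega)$, then

\[ T(\lambda \phi + \mu \psi) = \lambda T(\phi) + \mu T(\psi) \quad \text{for all } \lambda, \mu \in \mathbb{C}. \]

Continuity means that whenever a sequence $(\phi_m)_{m \in \mathbb{N}} \subset \mathcal{D}(\Omega)$ converges in $\mathcal{D}(\Omega)$ (in the sense of Definition 2.1) to $\phi \in \mathcal{D}(\Omega)$, then $T(\phi_m) \to T(\phi)$ as a sequence in $\mathbb{C}$.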

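The space $\mathcal{D}(\Omega)^*$ is also equipped with a notion of convergence, defined as follows.

Definition 2.5.

A sequence $(T_n)_{n \in \mathbb{N}} \subset \mathcal{D}(\Omega)^*$ converges in $\mathcal{D}(\Omega)^*$ to $T \in \mathcal{D}(\Omega)^*$ if for every $\phi \in \mathcal{D}(\Omega)$ we have $T_n(\phi) \to T(\phi)$ in $\mathbb{C}$.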

Remark 2.6.

This is the usual notion of convergence in the weak* topology on a dual space.

The following gives some examples of distributions (recall the definition of $L^p_{loc}(\Omega)$ from Appendix A).

Example 2.7.
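  1. Given $x \in \Omega$, the delta functional is the distribution

    \[ \delta_x(\phi) = \phi(x). \]

    Clearly this is linear. To see that it is continuous, note that if $\phi_m \to \phi$ in $\mathcal{D}(\Omega)$, then $\phi_m(x) \to \phi(x)$, and so $\delta_x(\phi_m) \to \delta_x(\phi)$.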

  2. The functional

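    \[ T(\phi) = \int_\Omega \phi(x) \, dx \]

    is a distribution. (Note that since $\phi$ is continuous with compact support, it is also integrable.) Again, this is clearly linear. It is also continuous, since if $\phi_m \to \phi$ in $\mathcal{D}(\Omega)$, then, by definition, there exists a fixed compact set $K$ such that $\operatorname{supp}(\phi_m - \phi) \subseteq K$. Therefore

    \[ T(\phi) - T(\phi_m) = \int_\Omega (\phi(x) - \phi_m(x)) \, dx = \int_K (\phi(x) - \phi_m(x)) \, dx, \]

    and since $\phi_m(x) \to \phi(x)$ uniformly on the compact set $K$, then $\int_K (\phi(x) - \phi_m(x)) \, dx \to 0$. Note that it is essential that $K$ has finite measure for this argument to work.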

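  3. Given $g \in L^1_{loc}(\Omega)$, let $T_g$ be the functional

    \[ T_g(\phi) = \int_\Omega \phi(x) g(x) \, dx. \qquad (2.1) \]

    Note that Hölder’s inequality shows that $|T_g(\phi)| \leq \|\phi\|_{L^\infty(K)} \|g\|_{L^1(K)}$, where $K$ denotes the (compact) support of $\phi \in \mathcal{D}(\Omega)$. Therefore $T_g(\phi)$ is always finite since $\phi \in \mathcal{C}^\infty_{cpt}(\Omega)$ implies that $\|\phi\|_{L^\infty}$ is finite, and so $T_g(\phi) \in \mathbb{C}$ for all $\phi \in \mathcal{D}(\Omega)$. As for the previous examples, clearly $T_g$ is linear, and it only remains to show that it is continuous. Note that if $\phi_m \to \phi$, then there is a fixed compact set $K$ with $\operatorname{supp}(\phi_m - \phi) \subseteq K$, and so

    \[ T_g(\phi) - T_g(\phi_m) = \int_K (\phi(x) - \phi_m(x)) \, g(x) \, dx \to 0 \]

    since $g \in L^1_{loc}(\Omega)$ and $\phi_m(x) \to \phi(x)$ uniformly on $K$. Therefore $T_g \in \mathcal{D}(\Omega)^*$.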

    We will revisit this example later, since it appears in the definition of weak derivative in Section 2.2.

The last example above is an important one: it shows that there is a linear map $L^1_{loc}(\Omega) \to \mathcal{D}(\Omega)^*$ given by $g \mapsto T_g$ (recall that all $L^p$ and $L^p_{loc}$ spaces are defined to be equivalence classes of functions that are equal almost everywhere, and note that this map is well-defined on equivalence classes of functions in $L^1_{loc}(\Omega)$, since $f = g$ a.e. implies that $\int_\Omega f \phi \, dx = \int_\Omega g \phi \, dx$ for any test function $\phi$). In fact, since Hölder’s inequality shows that there is an inclusion $L^p_{loc}(\Omega) \subseteq L^1_{loc}(\Omega)$ for all $1 < p \leq \infty$, there is also a map $L^p_{loc}(\Omega) \to \mathcal{D}(\Omega)^*$. The next theorem says that this map is injective.

Theorem 2.8.

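Let $\Omega \subseteq \mathbb{R}^n$ be open, and let $f$ and $g$ be functions in $L^1_{loc}(\Omega)$. Suppose that the distributions $T_f$ and $T_g$ are equal, i.e. $T_f(\phi) = T_g(\phi)$ for all test functions $\phi \in \mathcal{D}(\Omega)$. Then $f = g$ a.e. in $\Omega$.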

Proof.

This is proved in [8, Theorem 6.5] using convolutions; for variety, we give a slightly different proof here. Firstly, note that it is sufficient to prove the result for real-valued functions $f$ and $g$, since we can take real and imaginary parts. Suppose for contradiction that there exists a set $K$ whose Lebesgue measure is finite and non-zero, and which satisfies $f(x) \neq g(x)$ for all $x \in K$. Since the Lebesgue measure is Borel regular (see Lemmas B.20 and B.21), we can assume without loss of generality that $K$ is compact. Define $K_+ \subseteq K$ to be the subset on which $f(x) > g(x)$, and again note that without loss of generality (swapping $f$ and $g$ if necessary) we can assume that $K_+$ is compact with non-zero measure. Define the constant

\[ C := \int_{K_+} (f(x) - g(x)) \, dx > 0. \]

Now let $(V_n)_{n \in \mathbb{N}}$ be a collection of open sets such that

  1. $K_+ \subset V_n$ and $V_{n+1} \subseteq V_n$ for each $n$, and

  2. the Lebesgue measure of $V_n \setminus K_+$ satisfies $|V_n \setminus K_+| < \frac{1}{n}$.

The existence of each $V_n$ is guaranteed since the Lebesgue measure is Borel regular. Now use Urysohn’s lemma (see Appendix A.3) to construct a smooth non-negative function $\phi_n \in \mathcal{C}^\infty_{cpt}(\Omega)$ such that $0 \leq \phi_n(x) \leq 1$ for all $x \in \Omega$, $\phi_n(x) = 1$ for all $x \in K_+$, and $\phi_n(x) = 0$ for all $x \in \Omega \setminus V_n$. Therefore

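\[ \begin{aligned} \int_\Omega (f - g) \phi_n \, dx &= \int_{K_+} (f - g) \phi_n \, dx + \int_{V_n \setminus K_+} (f - g) \phi_n \, dx + \int_{\Omega \setminus V_n} (f - g) \phi_n \, dx \\ &= \int_{K_+} (f - g) \phi_n \, dx + \int_{V_n \setminus K_+} (f - g) \phi_n \, dx \\ &= C + \int_{V_n \setminus K_+} (f - g) \phi_n \, dx. \end{aligned} \]

The last term in the above equation satisfies the estimate

\[ \left| \int_{V_n \setminus K_+} (f - g) \phi_n \, dx \right| \leq \int_{V_n \setminus K_+} |f - g| \, \phi_n \, dx \leq \int_{V_n \setminus K_+} |f - g| \, dx, \]

and, since $|V_n \setminus K_+| \to 0$ as $n \to \infty$, then

\[ \int_{V_n \setminus K_+} |f - g| \, dx \to 0 \quad \text{as } n \to \infty, \]

since the integral of a fixed integrable function is an absolutely continuous set function (see for example [15, Corollary 10.41]).

Therefore there exists an $n$ such that

\[ \int_\Omega (f - g) \phi_n \, dx = C + \int_{V_n \setminus K_+} (f - g) \phi_n \, dx > 0, \]

which contradicts the assumption that $T_f(\phi_n) = T_g(\phi_n)$, i.e. that $\int_\Omega (f - g) \phi_n \, dx = 0$. Therefore $f = g$ almost everywhere. ∎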

A consequence of this theorem is that the distribution $T_f$ associated to a function $f \in L^1_{loc}(\Omega)$ uniquely determines an equivalence class in $L^1_{loc}(\Omega)$. Therefore, the following definition makes sense.

Definition 2.9.

A distribution $T \in \mathcal{D}(\Omega)^*$ represents the function $f \in L^1_{loc}(\Omega)$ if

\[ T(\phi) = \int_\Omega f(x) \phi(x) \, dx =: T_f(\phi) \]

for all test functions $\phi \in \mathcal{D}(\Omega)$. A function $f \in L^1_{loc}(\Omega)$ is represented by the distribution $T_f \in \mathcal{D}(\Omega)^*$.

Theorem 2.8 shows that each distribution can represent at most one element of $L^1_{loc}(\Omega)$, i.e. the map $L^1_{loc}(\Omega) \to \mathcal{D}(\Omega)^*$ given by $g \mapsto T_g$ is injective. The following example shows that not all distributions represent functions in $L^1_{loc}(\Omega)$, i.e. this map is not surjective.

Example 2.10.

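Given any $x \in \Omega$, let $\delta_x \in \mathcal{D}(\Omega)^*$ be the delta functional defined in Example 2.7. We claim that this does not represent any function in $L^1_{loc}(\Omega)$. To see this, suppose for contradiction that $\delta_x(\phi) = \int_\Omega f(y) \phi(y) \, dy$ for some $f \in L^1_{loc}(\Omega)$ and every $\phi \in \mathcal{D}(\Omega)$. Consider a sequence of bump functions $\phi_n$ such that for all $n$ satisfying $B(x, \frac{1}{n}) \subset \Omega$ we have

  1. $\phi_n(x) = 1$,

  2. $\operatorname{supp} \phi_n = B(x, \frac{1}{n})$,

  3. $0 \leq \phi_n(y) \leq 1$ for all $y \in B(x, \frac{1}{n})$, and

  4. $\int_\Omega \phi_n(y) \, dy = \frac{1}{n}$.

Then, since $|f(y) \phi_n(y)| \leq |f(y)|$, the product $f \phi_n$ has support in $\overline{B(x, 1)}$, and $f \phi_n \to 0$ almost everywhere, dominated convergence shows that $\int_\Omega f(y) \phi_n(y) \, dy \to 0$ as $n \to \infty$, which contradicts $\delta_x(\phi_n) = \phi_n(x) = 1$ for all $n$.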

Therefore the delta functional is an example of a distribution that cannot be represented by a function in $L^1_{loc}(\Omega)$. It can, however, be represented by a measure (see the measure in Example 3.28), and in Section 3.4 we will show that positive distributions can always be represented by measures (see Theorem 3.35).
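The mechanism behind Example 2.10 can be observed numerically. The following Python sketch is only an illustration under stated assumptions: it takes $\Omega = \mathbb{R}$, the point $x = 0$, the particular locally integrable function $f(y) = |y|^{-1/2}$, and one specific family of bump functions (which need not satisfy the integral normalisation in item 4 of the example). The names `phi_n`, `f` and `pairing` are hypothetical helpers. The pairing of $f$ with $\phi_n$ shrinks towards 0 while $\phi_n(0)$ stays equal to 1.

```python
import math


def phi_n(y, n):
    """Bump function with phi_n(0) = 1 and support in [-1/n, 1/n]."""
    t = y * n
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(1.0 - 1.0 / (1.0 - t * t))


def f(y):
    """A locally integrable function on R that is unbounded near 0."""
    return 1.0 / math.sqrt(abs(y))


def pairing(n, cells=20000):
    """Midpoint-rule approximation of the integral of f * phi_n over [-1/n, 1/n].

    The number of cells is even, so no midpoint lands on the singularity y = 0.
    """
    a, b = -1.0 / n, 1.0 / n
    h = (b - a) / cells
    mids = (a + (i + 0.5) * h for i in range(cells))
    return h * sum(f(y) * phi_n(y, n) for y in mids)


if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        # delta_0(phi_n) = phi_n(0) stays 1, while the integral against f decays.
        print(n, phi_n(0.0, n), pairing(n))
```

Since $0 \leq \phi_n \leq 1$, each pairing is bounded by the integral of $|y|^{-1/2}$ over $[-1/n, 1/n]$, which equals $4/\sqrt{n}$.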

2.2. Distributional derivatives and Sobolev spaces

Before defining Sobolev spaces, first we have to define the notion of the derivative of a distribution.

Definition 2.11.

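Let $\Omega \subseteq \mathbb{R}^n$ be open, let $T \in \mathcal{D}(\Omega)^*$, and let $\alpha \in \mathbb{Z}_{\geq 0}^n$. The $\alpha$th distributional derivative of $T$ is the distribution $D^\alpha T$ defined by

\[ (D^\alpha T)(\phi) := (-1)^{|\alpha|} T(D^\alpha \phi) \]

for all test functions $\phi \in \mathcal{D}(\Omega)$. The distributional gradient, denoted $\nabla T$, is the $n$-tuple of distributions

\[ \nabla T = (\partial_1 T, \ldots, \partial_n T). \]

If $T$ and $D^\alpha T$ both represent functions in $L^1_{loc}(\Omega)$ (i.e. $T = T_f$ and $D^\alpha T = T_g$ for some $f, g \in L^1_{loc}(\Omega)$) then we say that $g$ is a weak derivative of $f$, and write $g = D^\alpha f$. In this case we say that the weak derivative of $f$ exists.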

Remark 2.12.
  1. Since the weak derivative is defined by the relation

    \[ \int_\Omega f(x) D^\alpha \phi(x) \, dx = (-1)^{|\alpha|} \int_\Omega g(x) \phi(x) \, dx, \]

    it is only defined up to equivalence almost everywhere.

  2. The distributional derivative always exists for any multi-index $\alpha$, since the definition only involves differentiating test functions, which are smooth. Since partial derivatives of smooth functions commute, distributional derivatives also commute, i.e.

    \[ \partial_i \partial_j T = \partial_j \partial_i T. \]

  3. As we will see in the examples below, the weak derivative does not always exist, and, in fact, may not even exist for any value of $\alpha$ (in Example 2.19 we show that the step function is an example of such a function).

The following lemma shows that the weak derivative extends the notion of classical derivative of differentiable functions. It says that the distributional derivative of the distribution associated to a differentiable function g is the distribution associated to the classical derivative of g.

Lemma 2.13.

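Let $g \in C^{|\alpha|}(\Omega)$. Then for all $\phi \in \mathcal{D}(\Omega)$ we have

\[ (D^\alpha T_g)(\phi) = (-1)^{|\alpha|} \int_\Omega D^\alpha \phi(x) \, g(x) \, dx = \int_\Omega \phi(x) \, D^\alpha g(x) \, dx = T_{D^\alpha g}(\phi). \qquad (2.2) \]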
Proof.

The proof simply involves applying the definitions and the integration by parts formula from Section A.2. ∎

Remark 2.14.
  1. It is important to emphasise that $D^\alpha$ is used to denote both the distributional derivative and the classical derivative in the statement of the lemma: $D^\alpha T_g$ is the distributional derivative of the distribution $T_g$ associated to the function $g$, and $T_{D^\alpha g}$ is the distribution associated to the classical derivative $D^\alpha g$.

  2. It is an important exercise to think through the precise meaning of all of the statements above, to understand the distinction between a weak derivative and a distributional derivative, and to understand the meaning of each term in (2.2).

The next lemma shows that functions that are equal almost everywhere have the same distributional derivatives. As a consequence, when defining Sobolev spaces in Definition 2.16, we can define them as subsets of the $L^p$ and $L^p_{loc}$ spaces (i.e. we consider equivalence classes of functions that are equal almost everywhere).

Lemma 2.15.

If $f = g$ almost everywhere, then $D^\alpha T_f = D^\alpha T_g$ as distributions.

Proof.

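The proof is another straightforward application of the definition of distributional derivative. For any test function $\phi \in \mathcal{D}(\Omega)$ we have

\[ \begin{aligned} (D^\alpha T_f)(\phi) = (-1)^{|\alpha|} T_f(D^\alpha \phi) &= (-1)^{|\alpha|} \int_\Omega f(x) D^\alpha \phi(x) \, dx \\ &= (-1)^{|\alpha|} \int_\Omega g(x) D^\alpha \phi(x) \, dx \quad \text{(since } f = g \text{ a.e.)} \\ &= (-1)^{|\alpha|} T_g(D^\alpha \phi) \\ &= (D^\alpha T_g)(\phi). \end{aligned} \]

Therefore, $(D^\alpha T_f)(\phi) = (D^\alpha T_g)(\phi)$ for all $\phi \in \mathcal{D}(\Omega)$, and so $D^\alpha T_f = D^\alpha T_g$ as elements of $\mathcal{D}(\Omega)^*$. ∎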

Now that we have developed the necessary machinery, we are ready to define Sobolev spaces.

Definition 2.16.

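The Sobolev space $W^{k,p}(\Omega)$ is the space of equivalence classes of all functions $f \in L^p(\Omega)$ such that the weak derivative $D^\alpha f$ exists and is in $L^p(\Omega)$ for all $\alpha$ such that $|\alpha| \leq k$.

The Sobolev space $W^{k,p}_{loc}(\Omega)$ is the space of all functions $f \in L^p_{loc}(\Omega)$ such that the weak derivative $D^\alpha f$ exists and is in $L^p_{loc}(\Omega)$ for all $\alpha$ such that $|\alpha| \leq k$.

The space $W^{k,p}(\Omega)$ has a norm given by

\[ \|f\|_{W^{k,p}(\Omega)} = \sum_{j=0}^{k} \, \sum_{\alpha : |\alpha| = j} \|D^\alpha f\|_{L^p(\Omega)}, \]

and we define $W_0^{k,p}(\Omega)$ to be the closure of the space $\mathcal{C}^\infty_{cpt}(\Omega)$ in the topology induced by this norm.

For a compact subset $K \subset \Omega$ we define the norm

\[ \|f\|_{W^{k,p}(K)} = \sum_{j=0}^{k} \, \sum_{\alpha : |\alpha| = j} \|D^\alpha f\|_{L^p(K)}, \]

where the weak derivatives $D^\alpha f$ are defined on $\Omega$.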
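To make the definition of the Sobolev norm concrete, here is a minimal numerical sketch in Python for the one-dimensional case. It assumes $\Omega = (0, 1)$, $k = 1$, $p = 2$ and the smooth function $f(x) = x^2$, whose weak derivative is its classical derivative $2x$. The helper names `lp_norm` and `sobolev_norm_1d` are hypothetical, and the midpoint rule is just one convenient way to approximate the integrals.

```python
def lp_norm(func, a, b, p, cells=20000):
    """Midpoint-rule approximation of the L^p norm of func on (a, b), for 1 <= p < infinity."""
    h = (b - a) / cells
    total = sum(abs(func(a + (i + 0.5) * h)) ** p for i in range(cells))
    return (h * total) ** (1.0 / p)


def sobolev_norm_1d(derivatives, a, b, p):
    """Sum of the L^p norms of f, f', ..., f^(k): the W^{k,p} norm above in dimension n = 1."""
    return sum(lp_norm(d, a, b, p) for d in derivatives)


if __name__ == "__main__":
    f = lambda x: x * x       # f(x) = x^2 on Omega = (0, 1)
    df = lambda x: 2.0 * x    # classical (hence weak) derivative of f
    # The exact value is sqrt(1/5) + sqrt(4/3).
    print(sobolev_norm_1d([f, df], 0.0, 1.0, 2))
```

The norm is computed as a plain sum of the $L^p$ norms of the derivatives, exactly as in the displayed formula, rather than as an $\ell^p$-type combination of them.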

Lemma 2.17.

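The norm $\|\cdot\|_{W^{k,p}(\Omega)}$ gives $W^{k,p}(\Omega)$ the structure of a normed linear space for $1 \leq p \leq \infty$.

Proof.

Recall that we have to check

  1. The space has a unique element of zero norm, i.e. $f = 0$ if and only if $\|f\|_{W^{k,p}(\Omega)} = 0$.

  2. The norm is absolutely homogeneous with respect to scalar multiplication, i.e. $\|cf\|_{W^{k,p}(\Omega)} = |c| \, \|f\|_{W^{k,p}(\Omega)}$ for all $c \in \mathbb{C}$ and $f \in W^{k,p}(\Omega)$.

  3. The triangle inequality holds, i.e.

    \[ \|f + g\|_{W^{k,p}(\Omega)} \leq \|f\|_{W^{k,p}(\Omega)} + \|g\|_{W^{k,p}(\Omega)} \]

    for all $f, g \in W^{k,p}(\Omega)$.

It is easy to check (1): the result is true for $L^p(\Omega)$, we have $W^{k,p}(\Omega) \subseteq L^p(\Omega)$, and $\|f\|_{L^p(\Omega)} \leq \|f\|_{W^{k,p}(\Omega)}$ for all $f \in W^{k,p}(\Omega)$, so a function of zero Sobolev norm is zero in $L^p(\Omega)$.

The weak derivative commutes with scalar multiplication, i.e. $D^\alpha(cf) = c D^\alpha f$ for all $c \in \mathbb{C}$, and so we also have $\|D^\alpha(cf)\|_{L^p(\Omega)} = |c| \, \|D^\alpha f\|_{L^p(\Omega)}$. Therefore (2) is satisfied by definition of the Sobolev norm.

The triangle inequality for $W^{k,p}(\Omega)$ follows from the definition of the Sobolev norm and the triangle inequality for $L^p(\Omega)$ (which is Minkowski’s inequality, see for example [15, Theorem 8.10]). ∎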

It is worth recalling that $W^{k,p}(\Omega)$ can never be a normed linear space for $0 < p < 1$, since the triangle inequality fails in this case. See for example the remark on p130 of [15], and also [15, Theorem 8.16]. For more discussion of $L^p$ spaces for $0 < p < 1$, see [13, pp35-36].

Remark 2.18.

We will see later, in Section 3.1, that $W^{k,p}(\Omega)$ is a Banach space with this norm.

It is worth studying some examples of distributional and weak derivatives. The first example is the step function, for which the distributional derivative is the delta functional from Example 2.7. This is an important example, since it shows that the step function is not in any Sobolev space $W^{k,p}(\Omega)$ or $W^{k,p}_{loc}(\Omega)$ for $k \geq 1$, because the delta functional cannot be represented by a function.

Example 2.19.

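Let $g : \mathbb{R} \to \mathbb{R}$ be the step function

\[ g(x) = \begin{cases} 1 & x \geq 0, \\ 0 & x < 0. \end{cases} \]

Given a test function $\phi \in \mathcal{C}^\infty_{cpt}(\mathbb{R})$, consider the integral

\[ \int_{\mathbb{R}} g(x) \, \partial_x \phi(x) \, dx = \int_0^\infty \partial_x \phi(x) \, dx = \big[ \phi(x) \big]_0^\infty = -\phi(0). \]

(Recall that $\phi$ vanishes at infinity since it has compact support.) Therefore the distributional derivative of $T_g$ is the linear functional $\partial_x T_g \in \mathcal{D}(\mathbb{R})^*$ given by $(\partial_x T_g)(\phi) = -T_g(\partial_x \phi) = \phi(0)$, i.e. $\partial_x T_g$ is the delta functional $\delta_0$. Example 2.10 shows that this cannot be represented by a function, and therefore the weak derivative of the step function does not exist, so the step function is not in $W^{1,p}(\mathbb{R})$ or $W^{1,p}_{loc}(\mathbb{R})$ for any $p$.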
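The computation in Example 2.19 can be checked numerically for one particular test function. The Python sketch below is an illustration only: the bump function `phi` (centred at 0.25 with radius 0.75, so that $\phi(0) \neq 0$), its hand-computed derivative `dphi`, and the midpoint-rule helper `integrate` are all choices made for this example rather than anything prescribed by the notes.

```python
import math


def phi(x, center=0.25, radius=0.75):
    """Smooth test function with compact support [center - radius, center + radius]."""
    t = (x - center) / radius
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - t * t))


def dphi(x, center=0.25, radius=0.75):
    """Classical derivative of phi, obtained from the chain rule."""
    t = (x - center) / radius
    if abs(t) >= 1.0:
        return 0.0
    return phi(x, center, radius) * (-2.0 * t) / ((1.0 - t * t) ** 2 * radius)


def step(x):
    """The step function g of Example 2.19."""
    return 1.0 if x >= 0.0 else 0.0


def integrate(func, a, b, cells=20000):
    """Midpoint rule; with an even number of cells on [-1, 1] the jump at 0 is a cell boundary."""
    h = (b - a) / cells
    return h * sum(func(a + (i + 0.5) * h) for i in range(cells))


def pairing_with_derivative():
    """Approximate the integral of g(x) * phi'(x) over R (phi vanishes outside [-0.5, 1])."""
    return integrate(lambda x: step(x) * dphi(x), -1.0, 1.0)


if __name__ == "__main__":
    # The two printed numbers should agree up to quadrature error.
    print(pairing_with_derivative(), -phi(0.0))
```

The agreement of the two values reflects $(\partial_x T_g)(\phi) = \phi(0)$: the pairing only sees the value of the test function at the jump.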

Example 2.20.

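Let $f(x) = |x|$. To compute the weak derivative we first consider

\[ \begin{aligned} \int_{\mathbb{R}} f(x) \, \partial_x \phi(x) \, dx &= -\int_{-\infty}^0 x \, \partial_x \phi(x) \, dx + \int_0^\infty x \, \partial_x \phi(x) \, dx \\ &= -\big[ x \phi(x) \big]_{-\infty}^0 + \int_{-\infty}^0 \phi(x) \, dx + \big[ x \phi(x) \big]_0^\infty - \int_0^\infty \phi(x) \, dx \\ &= \int_{-\infty}^0 \phi(x) \, dx - \int_0^\infty \phi(x) \, dx. \end{aligned} \qquad (2.3) \]

Let

\[ g(x) = \begin{cases} 1 & x \geq 0, \\ -1 & x < 0, \end{cases} \]

and note that the previous calculation (2.3) shows that $\int_{\mathbb{R}} f(x) \, \partial_x \phi(x) \, dx = -\int_{\mathbb{R}} g(x) \phi(x) \, dx$. Therefore the weak derivative of $f(x) = |x|$ is the step function $g(x)$.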
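The defining identity for the weak derivative in Example 2.20 can also be tested numerically. The following Python sketch assumes one specific, deliberately non-symmetric test function (a bump centred at 0.25 with radius 0.75); the names `phi`, `dphi`, `sign`, `integrate` and `both_sides` are hypothetical helpers for this illustration.

```python
import math


def phi(x, center=0.25, radius=0.75):
    """Smooth test function with compact support [center - radius, center + radius]."""
    t = (x - center) / radius
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - t * t))


def dphi(x, center=0.25, radius=0.75):
    """Classical derivative of phi, obtained from the chain rule."""
    t = (x - center) / radius
    if abs(t) >= 1.0:
        return 0.0
    return phi(x, center, radius) * (-2.0 * t) / ((1.0 - t * t) ** 2 * radius)


def sign(x):
    """The candidate weak derivative g of |x| from Example 2.20."""
    return 1.0 if x >= 0.0 else -1.0


def integrate(func, a, b, cells=20000):
    """Midpoint rule; with an even number of cells on [-1, 1] the point 0 is a cell boundary."""
    h = (b - a) / cells
    return h * sum(func(a + (i + 0.5) * h) for i in range(cells))


def both_sides():
    """Return (integral of |x| phi'(x), minus the integral of g(x) phi(x)) over [-1, 1]."""
    lhs = integrate(lambda x: abs(x) * dphi(x), -1.0, 1.0)
    rhs = -integrate(lambda x: sign(x) * phi(x), -1.0, 1.0)
    return lhs, rhs


if __name__ == "__main__":
    # The two numbers should agree up to quadrature error.
    print(both_sides())
```

A test function that is even about 0 would make both sides vanish, which is why the bump is shifted off-centre here.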

The next example generalises the method of the previous example to locally Lipschitz functions.

Example 2.21.

In this example we show that if $f$ is locally Lipschitz on $\Omega$ then $f \in W^{1,\infty}_{loc}(\Omega)$. Rademacher’s theorem shows that the partial derivatives of $f$ exist almost everywhere (see Corollary A.17), and the goal of this example is to show that these partial derivatives are equal almost everywhere to the weak derivative of $f$ in each co-ordinate direction.

For each compact set $K \subset \Omega$, let $M_K$ be the associated Lipschitz constant, i.e. for all $x, y \in K$ we have

\[ |f(x) - f(y)| \leq M_K |x - y|. \qquad (2.4) \]

(Note that this differs slightly from Definition A.14, however we can easily extend this to compact sets $K$ by taking an open cover of $K$.)

Equation (2.4) implies that $f \in L^\infty_{loc}(\Omega)$. Therefore the integral

\[ \int_\Omega f(x) \phi(x) \, dx \]

is defined for any $\phi \in \mathcal{C}^\infty_{cpt}(\Omega)$. To show that $f$ has a weak derivative, we need to show that there exists $g$ such that

\[ \int_\Omega f(x) \, \partial_j \phi(x) \, dx = -\int_\Omega g(x) \phi(x) \, dx \]

for all test functions $\phi \in \mathcal{C}^\infty_{cpt}(\Omega)$.

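Let $K = \operatorname{supp} \phi$. Since $K$ is compact, there exists $\varepsilon > 0$ such that if $|h| < \varepsilon$ then $x + h e_j \in \Omega$ for all $x \in K$, and so $\phi(x + h e_j)$ is well-defined for small values of $h$. Therefore

\[ \int_\Omega f(x) \, \partial_j \phi(x) \, dx = \int_\Omega f(x) \lim_{h \to 0} \frac{\phi(x + h e_j) - \phi(x)}{h} \, dx. \]

The next step involves using dominated convergence to interchange the order of integration and differentiation. Since this is a standard technique that is used in many examples, we include all of the details here. First note that since $\phi$ is smooth with compact support, it is uniformly Lipschitz, and so the absolute value of the difference quotients $\frac{\phi(x + h e_j) - \phi(x)}{h}$ is uniformly bounded by a constant (call it $\tilde{M}$) for $|h| < \varepsilon$. Since $f \in L^1_{loc}(\Omega)$ and the difference quotients have compact support, then

\[ \left| f(x) \, \frac{\phi(x + h e_j) - \phi(x)}{h} \right| \leq \tilde{M} \, |f(x)| \in L^1_{loc}(\Omega), \]

and so we can use dominated convergence to write

\[ \int_\Omega f(x) \lim_{h \to 0} \frac{\phi(x + h e_j) - \phi(x)}{h} \, dx = \lim_{h \to 0} \int_\Omega f(x) \, \frac{\phi(x + h e_j) - \phi(x)}{h} \, dx. \]

Changing variables, and recalling that the upper bound on $h$ was chosen so that $x - h e_j \in \Omega$ for all $x \in \operatorname{supp} \phi$, gives us

\[ \begin{aligned} \lim_{h \to 0} \int_\Omega f(x) \, \frac{\phi(x + h e_j) - \phi(x)}{h} \, dx &= \lim_{h \to 0} \int_\Omega \frac{1}{h} f(x) \phi(x + h e_j) \, dx - \lim_{h \to 0} \int_\Omega \frac{1}{h} f(x) \phi(x) \, dx \\ &= \lim_{h \to 0} \int_\Omega \frac{1}{h} f(x - h e_j) \phi(x) \, dx - \lim_{h \to 0} \int_\Omega \frac{1}{h} f(x) \phi(x) \, dx \\ &= \lim_{h \to 0} \int_\Omega \frac{f(x - h e_j) - f(x)}{h} \, \phi(x) \, dx. \end{aligned} \qquad (2.5) \]

(Even though $x - h e_j$ may not be in $\Omega$ for arbitrary $x \in \Omega$, we do have $x - h e_j \in \Omega$ for all $x \in K$. Since the support of $\phi$ is $K \subset \Omega$, we can define

\[ \int_\Omega \frac{1}{h} f(x - h e_j) \phi(x) \, dx := \int_K \frac{1}{h} f(x - h e_j) \phi(x) \, dx, \]

and therefore the integral in the above calculation makes sense.)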

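The quantity $\frac{f(x - h e_j) - f(x)}{h} \phi(x)$ is uniformly bounded for $|h| \leq \frac{1}{2} \varepsilon$ (since $f$ is locally Lipschitz), and so another application of dominated convergence gives us

\[ \lim_{h \to 0} \int_\Omega \frac{f(x - h e_j) - f(x)}{h} \, \phi(x) \, dx = \int_\Omega \lim_{h \to 0} \frac{f(x - h e_j) - f(x)}{h} \, \phi(x) \, dx. \qquad (2.6) \]

Rademacher’s theorem shows that for each $j = 1, \ldots, n$, the partial derivative $\partial_j f$ exists almost everywhere in $\Omega$, and, on the compact set $K = \operatorname{supp} \phi$, it is bounded above by the Lipschitz constant $M_K$. Let $g_j(x)$ be a function defined on all of $\Omega$ that is equal almost everywhere to $\partial_j f(x)$. Therefore

\[ \int_\Omega \lim_{h \to 0} \frac{f(x - h e_j) - f(x)}{h} \, \phi(x) \, dx = -\int_\Omega \lim_{h \to 0} \frac{f(x - h e_j) - f(x)}{-h} \, \phi(x) \, dx = -\int_\Omega g_j(x) \phi(x) \, dx, \]

and so we have shown that

\[ \int_\Omega f(x) \, \partial_j \phi(x) \, dx = -\int_\Omega g_j(x) \phi(x) \, dx. \]

Therefore the weak derivative exists and is equal almost everywhere to $\partial_j f(x)$. Since $|g_j(x)| \leq M_K$ almost everywhere on each compact set $K$, then $f \in W^{1,\infty}_{loc}(\Omega)$.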

Remark 2.22.

The part of the above proof that requires the Lipschitz condition on f is the application of dominated convergence in (2.6). The fact that the derivative of f exists almost everywhere is not sufficient for a weak derivative to exist, for example, the derivative of the step function is zero almost everywhere, but we showed in Example 2.19 that the step function does not have a weak derivative. The reason is that (2.6) fails for the step function (the rest of the proof does go through for the step function).

3. Basic properties of Sobolev spaces

In this section we prove some basic results about Sobolev spaces. The results of Sections 3.1 and 3.3 describe basic functional analytic properties of Sobolev spaces, while Section 3.2 gives an alternative characterisation of Sobolev spaces as the completion of the space of smooth functions. Section 3.4 provides an answer to an earlier question by showing that, although distributions cannot always be represented by locally integrable functions, the positive distributions can always be represented by regular Borel measures.

3.1. Banach and Hilbert space structure of Sobolev spaces

It is well-known that $L^p(\Omega)$ (with the $L^p$ norm) is a Banach space, and that $L^2(\Omega)$ (with the $L^2$ inner product) is a Hilbert space. In a similar way, we can show that the Sobolev spaces $W^{k,p}(\Omega)$ have the structure of a Banach space, and that $W^{k,2}(\Omega)$ has the structure of a Hilbert space, and it is the goal of this section to give the details of this proof. This is a useful theorem, since it allows us to use theorems from functional analysis to study sequences of functions in Sobolev spaces.

Firstly, recall that the space $L^p(\Omega)$, together with the $L^p$ norm, is complete when $1 \leq p \leq \infty$ (see for example [8, Theorem 2.7] or [15, Theorem 8.14]). To extend this to the Sobolev space $W^{k,p}(\Omega)$, we use an inductive argument. The proof of the following lemma gives the basic idea of this argument for $k = 1$.

Lemma 3.1.

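Let $\Omega \subseteq \mathbb{R}^n$ be an open set and $1 \leq p \leq \infty$. Then the space $W^{1,p}(\Omega)$ is complete in the norm $\|\cdot\|_{W^{1,p}(\Omega)}$.

Proof.

Let $(u_m)_{m \in \mathbb{N}}$ be a Cauchy sequence in $W^{1,p}(\Omega)$. Then, by definition of the Sobolev norm, $\|u_m\|_{L^p} \leq \|u_m\|_{W^{1,p}}$, and so $(u_m)_{m \in \mathbb{N}}$ is also Cauchy in $L^p$. Similarly, since $\|\partial_j u_m\|_{L^p} \leq \|u_m\|_{W^{1,p}}$ (again this follows from the definition of the Sobolev norm), we have that $(\partial_j u_m)_{m \in \mathbb{N}}$ is a Cauchy sequence in $L^p$.

Since $L^p(\Omega)$ is complete, there are functions $v_0, v_1, \ldots, v_n$ such that

\[ u_m \xrightarrow{L^p} v_0, \qquad \partial_j u_m \xrightarrow{L^p} v_j, \quad j = 1, \ldots, n. \]

Hölder’s inequality shows that $L^p(\Omega) \subseteq L^1_{loc}(\Omega)$, and so each $u_m$ determines a distribution $T_{u_m} \in \mathcal{D}(\Omega)^*$ given by

\[ T_{u_m}(\phi) = \int_\Omega u_m \phi \, dx \]

for all test functions $\phi \in \mathcal{D}(\Omega)$.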

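Another application of Hölder’s inequality gives the following estimate for any $\phi \in \mathcal{D}(\Omega)$:

\[ |T_{u_m}(\phi) - T_{v_0}(\phi)| \leq \int_\Omega |u_m(x) - v_0(x)| \, |\phi(x)| \, dx \leq \|\phi\|_{L^q} \|u_m - v_0\|_{L^p}, \]

where $q$ is the conjugate Hölder exponent of $p$. (Note that the integral exists since $\sup |\phi|$ is bounded, $\phi$ has compact support, and $u_m - v_0 \in L^1_{loc}(\Omega)$.) Since $u_m \to v_0$ in $L^p$, this shows that $T_{u_m} \to T_{v_0}$ in $\mathcal{D}(\Omega)^*$.

The same argument with $u_m$ replaced by $\partial_j u_m$ and $v_0$ replaced by $v_j$ shows that $T_{\partial_j u_m} \to T_{v_j}$. We then have for every test function $\phi \in \mathcal{D}(\Omega)$

\[ \begin{aligned} T_{v_j}(\phi) &= \lim_{m \to \infty} T_{\partial_j u_m}(\phi) \\ &= -\lim_{m \to \infty} T_{u_m}(\partial_j \phi) \\ &= -T_{v_0}(\partial_j \phi) \\ &= (\partial_j T_{v_0})(\phi) \quad \text{(by definition of distributional derivative).} \end{aligned} \]

Therefore $\partial_j T_{v_0} = T_{v_j}$, so the weak derivative $\partial_j v_0$ exists and, by Theorem 2.8, is equal almost everywhere to $v_j \in L^p(\Omega)$. In particular $v_0 \in W^{1,p}(\Omega)$. Therefore, we have shown that $u_m \to v_0$ in $W^{1,p}(\Omega)$, and so $W^{1,p}(\Omega)$ is complete. ∎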

Using this technique we can now prove the following theorem, which, together with Lemma 2.17, says that $W^{k,p}(\Omega)$ is a Banach space.

Theorem 3.2.

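Let $\Omega \subseteq \mathbb{R}^n$ be open and $1 \leq p \leq \infty$. Then $W^{k,p}(\Omega)$ is complete in the norm $\|\cdot\|_{W^{k,p}}$ for all $k \in \mathbb{Z}_{\geq 0}$. In particular, $W^{k,p}(\Omega)$ is a Banach space for all $1 \leq p \leq \infty$ and $k \in \mathbb{Z}_{\geq 0}$.

Proof.

The proof uses induction on $k$. The case $k = 0$ follows from standard results about $L^p$ spaces (see for example [15, Theorem 8.14]). Suppose that $W^{k-1,p}(\Omega)$ is complete, and let $(u_m)_{m \in \mathbb{N}}$ be a Cauchy sequence in $W^{k,p}(\Omega)$. Then the sequences $(u_m)_{m \in \mathbb{N}}$ and $(\partial_j u_m)_{m \in \mathbb{N}}$ (for $j = 1, \ldots, n$) are Cauchy in $W^{k-1,p}(\Omega)$, and the completeness of $W^{k-1,p}(\Omega)$ shows that there exist functions $v_0, v_1, \ldots, v_n$ such that

\[ u_m \xrightarrow{W^{k-1,p}} v_0, \qquad \partial_j u_m \xrightarrow{W^{k-1,p}} v_j \quad \text{for all } j = 1, \ldots, n. \]

Note that the inductive hypothesis shows that $D^\alpha \partial_j u_m \to D^\alpha v_j$ in $L^p$ for all multi-indices $\alpha$ such that $|\alpha| \leq k - 1$, and so it only remains to show that $\partial_j v_0 = v_j$ for each $j = 1, \ldots, n$.

As in the previous proof we can show that

\[ |T_{\partial_j u_m}(\phi) - T_{v_j}(\phi)| \leq \|\phi\|_{L^q} \|\partial_j u_m - v_j\|_{L^p}, \]

and so for all test functions $\phi \in \mathcal{D}(\Omega)$ we have

\[ T_{v_j}(\phi) = \lim_{m \to \infty} T_{\partial_j u_m}(\phi) = -\lim_{m \to \infty} T_{u_m}(\partial_j \phi) = -T_{v_0}(\partial_j \phi) = (\partial_j T_{v_0})(\phi), \]

and so Theorem 2.8 shows that $v_j = \partial_j v_0$ almost everywhere. This, together with the previous statement that $D^\alpha \partial_j u_m \to D^\alpha v_j$ in $L^p$ for all multi-indices $\alpha$ such that $|\alpha| \leq k - 1$, shows that $D^\alpha u_m \to D^\alpha v_0$ in $L^p$ for all $\alpha$ such that $|\alpha| \leq k$.

Therefore, we have shown that there exists $v_0 \in W^{k,p}(\Omega)$ such that $u_m \xrightarrow{W^{k,p}} v_0$, and so $W^{k,p}(\Omega)$ is complete. ∎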

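In the case $p = 2$, the previous theorem, together with the following inner product, gives $W^{k,2}(\Omega)$ the structure of a Hilbert space.

Definition 3.3.

The inner product on $W^{k,2}(\Omega)$ is defined to be

\[ \langle f, g \rangle_{W^{k,2}(\Omega)} := \sum_{0 \leq |\alpha| \leq k} \int_\Omega D^\alpha f \, \overline{D^\alpha g} \, dx. \qquad (3.1) \]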
Remark 3.4.

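The norm induced by the inner product,

\[ \langle f, f \rangle_{W^{k,2}(\Omega)}^{\frac{1}{2}} = \Big( \sum_{0 \leq |\alpha| \leq k} \|D^\alpha f\|_{L^2(\Omega)}^2 \Big)^{\frac{1}{2}}, \]

is equivalent to the Sobolev norm $\|f\|_{W^{k,2}(\Omega)}$ of Definition 2.16, which is the sum of the same finitely many terms $\|D^\alpha f\|_{L^2(\Omega)}$ (the two norms coincide when $k = 0$). In particular, the two norms have the same Cauchy sequences.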
Theorem 3.5.

$\big( W^{k,2}(\Omega), \langle \cdot, \cdot \rangle_{W^{k,2}(\Omega)} \big)$ is a Hilbert space.

Remark 3.6.
  1. In view of Theorem 3.2, the proof of Theorem 3.5 only requires checking that the axioms for an inner product are satisfied.

  2. In order to emphasise the Hilbert space structure, the space $W^{k,2}(\Omega)$ is often denoted $H^k(\Omega)$.

3.2. Sobolev spaces are the completion of the space of smooth functions in the Sobolev norm (the Meyers-Serrin theorem)

In this section we prove the Meyers-Serrin theorem, which says that the Sobolev spaces defined in Section 2 are the completion of the space of smooth functions in the Sobolev norm. Therefore we now have two equivalent definitions of Sobolev spaces, which gives us a broader range of techniques to draw upon when proving theorems.

First recall the following well-known theorem that says that a normed linear space has a unique completion (see for example [11, Theorem I.3]).

Theorem 3.7.

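If $(V, \|\cdot\|_V)$ is a normed linear space, then there exists a unique complete normed linear space $(\tilde{V}, \|\cdot\|_{\tilde{V}})$ such that $V$ is isometric to a dense subset of $\tilde{V}$.

Let $\mathcal{C}^k(\Omega)$ be the space of $k$-times differentiable functions $f : \Omega \to \mathbb{C}$. Since the weak derivative of a differentiable function is just the classical derivative (Lemma 2.13), the weak derivatives of any $\phi \in \mathcal{C}^k(\Omega)$ exist up to order $k$, and we can define the subspace

\[ S^{k,p}(\Omega) = \{ \phi \in \mathcal{C}^k(\Omega) : \|\phi\|_{W^{k,p}(\Omega)} < \infty \} \subseteq W^{k,p}(\Omega). \]

Let $B^{k,p}(\Omega)$ denote the completion of $S^{k,p}(\Omega)$ in the $W^{k,p}(\Omega)$-norm. Since $W^{k,p}(\Omega)$ is complete by Theorem 3.2, and $S^{k,p}(\Omega) \subseteq W^{k,p}(\Omega)$, we have proved

Lemma 3.8.

For $1 \leq p \leq \infty$ we have

\[ B^{k,p}(\Omega) \subseteq W^{k,p}(\Omega). \]

It turns out that the reverse inclusion also holds for $1 \leq p < \infty$. This is known as the Meyers-Serrin theorem, and the proof will occupy the rest of this section.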

Example 3.9.

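To see that the converse of the previous lemma can never be true for $p = \infty$, in this example we show that $B^{k,\infty}(\Omega) \subsetneq W^{k,\infty}(\Omega)$. Consider first the case $k = 0$ and $\Omega = \mathbb{R}$, where the step function

\[ f(x) = \begin{cases} -1 & \text{if } x < 0, \\ 1 & \text{if } x \geq 0 \end{cases} \]

is not in the completion of $S^{0,\infty}(\mathbb{R})$, since for any continuous function $g \in S^{0,\infty}(\mathbb{R})$ we have $\|f - g\|_{L^\infty} \geq 1$. To extend this example to $W^{k,\infty}(\mathbb{R})$ for $k > 0$, simply consider the function

\[ f(x) = \begin{cases} -x^k & \text{if } x < 0, \\ x^k & \text{if } x \geq 0, \end{cases} \]

and note that $\frac{d^k f}{dx^k}$ is a constant multiple of a step function. It is easy then to extend this idea to the case where the domain is an open subset of $\mathbb{R}^n$.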

Next, we recall some basic facts needed in the proof of Theorem 3.15. The first is the existence of partitions of unity.

Theorem 3.10.

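Let $A$ be an arbitrary subset of $\mathbb{R}^n$, and let $O = \{U_\alpha\}_{\alpha \in I}$ be a collection of open sets in $\mathbb{R}^n$ that cover $A$. Then there exists a collection $\Psi = \{\psi_\beta\}_{\beta \in J} \subset C_0^\infty(\mathbb{R}^n)$ such that

  1. For every $\beta \in J$ and every $x \in \mathbb{R}^n$, we have $0 \leq \psi_\beta(x) \leq 1$.

  2. If $K \subset A$ is compact then all but at most finitely many $\psi_\beta \in \Psi$ vanish identically on $K$.

  3. For every $\beta \in J$ there exists $\alpha \in I$ such that $\operatorname{supp} \psi_\beta \subset U_\alpha$.

  4. For every $x \in A$ we have $\sum_{\beta \in J} \psi_\beta(x) = 1$ (note that the sum makes sense because of the local finiteness condition (2)).

The collection $\Psi$ is called a partition of unity of $A$ subordinate to $O$.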

Proof.

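The case where $A$ is compact is given in [12, Theorem 2.13]. If $A$ is open, then for each $j \in \mathbb{N}$ define

\[ A_j := \{ x \in A : |x| \leq j \text{ and } \operatorname{dist}(x, \partial A) \geq \tfrac{1}{j} \}, \]

and note that $A_j$ is compact and satisfies $A_j \subset \operatorname{int} A_{j+1}$ for each $j$. Moreover, we can also write $A$ as the union of compact sets

\[ A = \bigcup_{j \in \mathbb{N}} A_j = \bigcup_{j \in \mathbb{N}} \big( A_j \setminus \operatorname{int} A_{j-1} \big). \]

Also, for notational convenience in what follows, define $A_0 = A_{-1} = \emptyset$.

Given an open cover $O = \{U_\alpha\}_{\alpha \in I}$ of $A$, for each $j$ we can define an open cover of the compact set $A_j \setminus \operatorname{int} A_{j-1}$ by

\[ O_j := \{ U_\alpha \cap (\operatorname{int} A_{j+1} \setminus A_{j-2}) : \alpha \in I \}. \]

By the result for compact sets, for each $j$ there exists a partition of unity $\Psi_j = \{\psi_{j,n}\}_{n=1}^{N_j}$ for the compact set $A_j \setminus \operatorname{int} A_{j-1}$ that is subordinate to $O_j$, and has finitely many elements. Moreover, since $U_\alpha \cap (\operatorname{int} A_{j+1} \setminus A_{j-2}) \subset A$ for each $\alpha \in I$ and $j \in \mathbb{N}$, then $\operatorname{supp} \psi_{j,n} \subset A$ for each $\psi_{j,n} \in \Psi_j$. Therefore, since each $x \in A$ satisfies $x \in \operatorname{int} A_{j+1} \setminus A_{j-2}$ for at most finitely many $j$, the sum

\[ \sigma(x) = \sum_{j \in \mathbb{N}} \sum_{\psi \in \Psi_j} \psi(x) \]

has at most finitely many non-zero terms for each $x$, and also satisfies $\sigma(x) \geq 1$ for each $x \in A$. Now define the collection of functions

\[ \Psi := \left\{ f_{j,n}(x) = \begin{cases} \dfrac{\psi_{j,n}(x)}{\sigma(x)} & x \in A \\ 0 & x \notin A \end{cases} \; : \; j \in \mathbb{N}, \; 1 \leq n \leq N_j \right\}. \]

This is now a partition of unity of $A$ subordinate to $O$.

In the case where $A$ is an arbitrary subset of $\mathbb{R}^n$ with an open cover $O = \{U_\alpha\}_{\alpha \in I}$, define the open set $B = \bigcup_{\alpha \in I} U_\alpha$, note that $O$ is an open cover of $B$, and apply the previous result to find a partition of unity $\Psi$ of $B$ subordinate to $O$. Since $A \subseteq B$, then $\Psi$ is also a partition of unity of $A$ subordinate to $O$. ∎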
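The normalisation step in the proof (dividing each function by the locally finite sum $\sigma$) can be illustrated by a small one-dimensional Python sketch. The set-up is an assumption made only for this example: $A = [0, 3]$ is covered by the open intervals $U_j = (j - 1, j + 1)$ for $j = 0, 1, 2, 3$, and the names `bump`, `CENTERS`, `sigma` and `psi` are hypothetical. Unlike in the proof, the raw bumps here are not themselves partitions of unity, so $\sigma$ is only known to be positive on $A$, which is all that the division needs.

```python
import math

CENTERS = [0, 1, 2, 3]  # the open cover U_j = (j - 1, j + 1) of A = [0, 3]


def bump(x, center):
    """Smooth function supported in the closure of U_center = (center - 1, center + 1)."""
    t = x - center
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - t * t))


def sigma(x):
    """Sum of all bumps; strictly positive on A because every point of A lies in some U_j."""
    return sum(bump(x, c) for c in CENTERS)


def psi(j, x):
    """Normalised bump psi_j = bump_j / sigma, extended by zero where sigma vanishes."""
    s = sigma(x)
    return bump(x, CENTERS[j]) / s if s > 0.0 else 0.0


if __name__ == "__main__":
    for x in (0.0, 0.3, 1.5, 2.9, 3.0):
        values = [psi(j, x) for j in range(len(CENTERS))]
        # The values lie in [0, 1] and sum to 1 (up to rounding) at each point of A.
        print(x, values, sum(values))
```

Each `psi(j, ·)` vanishes outside $U_j$, so the family is subordinate to the cover, and at any point only the two or three neighbouring bumps contribute to the sum.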

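The second basic fact needed is the convergence of sequences of mollified functions. Let $J$ be a non-negative real-valued function in $C_0^\infty(\mathbb{R}^n)$ such that

  1. $J(x) = 0$ if $|x| \geq 1$.

  2. $\int_{\mathbb{R}^n} J(x) \, dx = 1$.

For example we can choose

\[ J(x) = \begin{cases} k \exp\left( -\dfrac{1}{1 - |x|^2} \right) & \text{if } |x| < 1, \\ 0 & \text{if } |x| \geq 1, \end{cases} \]

where $k$ is chosen so that $\int_{\mathbb{R}^n} J(x) \, dx = 1$. The function $J(x)$ is called a mollifier. For any $\varepsilon > 0$, let $J_\varepsilon(x) = \frac{1}{\varepsilon^n} J\left( \frac{x}{\varepsilon} \right)$, and define the mollification of $u \in L^p(\Omega)$ to be the convolution

\[ (J_\varepsilon * u)(x) = \int_{\mathbb{R}^n} J_\varepsilon(x - y) u(y) \, dy. \]

Lemma 3.11.

If $u \in W^{k,p}(\Omega)$ then $J_\varepsilon * u$ is smooth for all $\varepsilon > 0$.

Since $J_\varepsilon$ is smooth for all $\varepsilon > 0$, this follows from [15, Theorem 9.3].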

Theorem 3.12.

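Let $\Omega$ be an open subset of $\mathbb{R}^n$, and let $\Omega' \subset \Omega$ be an open subset with compact closure in $\Omega$. If $1 \leq p < \infty$ and $u \in W^{k,p}(\Omega)$, then

\[ \lim_{\varepsilon \to 0^+} J_\varepsilon * u = u \]

in $W^{k,p}(\Omega')$.

Proof.

When $k = 0$ this is a standard result for $L^p$ spaces (see for example [15, Theorem 9.6] for a proof). The general case follows by reducing to the $k = 0$ case.

First we show that for any $\varepsilon < \operatorname{dist}(\Omega', \partial\Omega)$ we have $D^\alpha (J_\varepsilon * u) = J_\varepsilon * D^\alpha u$ in the distributional sense on $\Omega'$. To see this, let $\tilde{u}$ denote the zero extension of $u$ from $\Omega$ to all of $\mathbb{R}^n$, and note that for any test function $\phi \in \mathcal{C}^\infty_{cpt}(\Omega')$ we have

\[ \begin{aligned} \int_{\Omega'} (J_\varepsilon * u)(x) \, D^\alpha \phi(x) \, dx &= \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \tilde{u}(x - y) J_\varepsilon(y) D^\alpha \phi(x) \, dx \, dy \\ &= (-1)^{|\alpha|} \int_{\mathbb{R}^n} \int_{\Omega'} D^\alpha u(x - y) J_\varepsilon(y) \phi(x) \, dx \, dy \\ &= (-1)^{|\alpha|} \int_{\Omega'} (J_\varepsilon * D^\alpha u)(x) \, \phi(x) \, dx. \end{aligned} \]

(All of the derivatives above are taken with respect to the variable $x$.)

Since $D^\alpha u \in L^p(\Omega)$ for each $0 \leq |\alpha| \leq k$, the result for $L^p$ spaces shows that

\[ \lim_{\varepsilon \to 0^+} \| D^\alpha (J_\varepsilon * u) - D^\alpha u \|_{L^p(\Omega')} = \lim_{\varepsilon \to 0^+} \| J_\varepsilon * D^\alpha u - D^\alpha u \|_{L^p(\Omega')} = 0. \]

This is true for all $\alpha$ such that $0 \leq |\alpha| \leq k$, and so $J_\varepsilon * u$ converges to $u$ in the $W^{k,p}(\Omega')$ norm. ∎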
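The convergence of mollifications can be observed numerically in a simple case. The Python sketch below is an illustration under assumptions chosen for this example: dimension $n = 1$, $\Omega = \mathbb{R}$, $\Omega' = (-1, 1)$, $p = 1$, $k = 1$ and $u(x) = |x|$, whose weak derivative is the sign function. It uses the identity $D(J_\varepsilon * u) = J_\varepsilon * Du$ from the proof to evaluate the derivative term. All function names are hypothetical, and the midpoint rule introduces a small quadrature error.

```python
import math


def integrate(func, a, b, cells):
    """Midpoint rule on [a, b]."""
    h = (b - a) / cells
    return h * sum(func(a + (i + 0.5) * h) for i in range(cells))


def J_raw(x):
    """Unnormalised mollifier exp(-1 / (1 - x^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


# Normalising constant k, chosen so that J integrates to 1 (approximately).
K = 1.0 / integrate(J_raw, -1.0, 1.0, 4000)


def J_eps(x, eps):
    """Rescaled mollifier J_eps(x) = J(x / eps) / eps in dimension n = 1."""
    return K * J_raw(x / eps) / eps


def mollify(u, x, eps, cells=200):
    """(J_eps * u)(x), integrating over y in [x - eps, x + eps] where J_eps(x - y) is non-zero."""
    return integrate(lambda y: J_eps(x - y, eps) * u(y), x - eps, x + eps, cells)


def sign(x):
    """Weak derivative of u(x) = |x|."""
    return 1.0 if x >= 0.0 else -1.0


def w11_error(eps, cells=400):
    """Approximate W^{1,1}(-1, 1) distance between J_eps * u and u for u(x) = |x|."""
    err0 = integrate(lambda x: abs(mollify(abs, x, eps) - abs(x)), -1.0, 1.0, cells)
    err1 = integrate(lambda x: abs(mollify(sign, x, eps) - sign(x)), -1.0, 1.0, cells)
    return err0 + err1


if __name__ == "__main__":
    for eps in (0.4, 0.2, 0.1):
        # The error should shrink as eps decreases.
        print(eps, w11_error(eps))
```

In this example the derivative term dominates the error and scales roughly linearly in $\varepsilon$, since $J_\varepsilon * \operatorname{sign}$ differs from the sign function only on the interval $(-\varepsilon, \varepsilon)$.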

Next, we introduce the notion of a nested open cover, which will be used in the sequel.

Definition 3.13.

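Let $\Omega$ be an open subset of $\mathbb{R}^n$. A nested open cover of $\Omega$ is a collection of open sets $\{\Omega_j\}_{j \in \mathbb{N}}$ such that

  1. $\Omega_j \subseteq \Omega_{j+1} \subseteq \Omega$ for all $j \in \mathbb{N}$.

  2. For all $x \in \Omega$ there exists $j$ such that $x \in \Omega_j$.

Lemma 3.14.

Let $\Omega$ be an open set in $\mathbb{R}^n$, and let $\{\Omega_j\}_{j \in \mathbb{N}}$ be a nested open cover of $\Omega$. If $f \in W^{k,p}(\Omega)$ satisfies $\|f\|_{W^{k,p}(\Omega_j)} \leq C$ for all $j$, then $\|f\|_{W^{k,p}(\Omega)} \leq C$.

Proof.

The inclusion $\Omega_j \hookrightarrow \Omega$ induces an inclusion $\mathcal{D}(\Omega_j) \hookrightarrow \mathcal{D}(\Omega)$. Therefore the weak derivative of $f$ on $\Omega_j$ is just the restriction of the weak derivative of $f$ on $\Omega$, since for all test functions $\phi \in \mathcal{D}(\Omega_j)$ we have

\[ \int_{\Omega_j} f D^\alpha \phi \, dx = \int_\Omega f D^\alpha \phi \, dx = (-1)^{|\alpha|} \int_\Omega D^\alpha f \, \phi \, dx = (-1)^{|\alpha|} \int_{\Omega_j} D^\alpha f \, \phi \, dx. \]

The dominated convergence theorem shows that $\lim_{j \to \infty} \|D^\alpha f\|_{L^p(\Omega_j)} = \|D^\alpha f\|_{L^p(\Omega)}$ for each $\alpha$, and as a consequence we have

\[ \|D^\alpha f\|_{L^p(\Omega)} \leq \sup_j \|D^\alpha f\|_{L^p(\Omega_j)} \quad \text{for each } \alpha. \]

Therefore $\|f\|_{W^{k,p}(\Omega)} \leq C$. ∎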

Now we are ready to prove that the space of smooth functions is dense in $W^{k,p}(\Omega)$.

Theorem 3.15 (Meyers-Serrin).

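Let $\Omega$ be an open subset of $\mathbb{R}^n$, and let $1 \leq p < \infty$. Then for any $u \in W^{k,p}(\Omega)$, and for every $\varepsilon > 0$, there exists $\phi \in \mathcal{C}^\infty(\Omega)$ such that $\|u - \phi\|_{W^{k,p}(\Omega)} < \varepsilon$.

Proof.

Fix $\varepsilon > 0$. For each $j \in \mathbb{N}$, define the open sets

\[ \Omega_j := \{ x \in \Omega : |x| < j \text{ and } \operatorname{dist}(x, \partial\Omega) > \tfrac{1}{j} \}, \qquad U_j := \Omega_{j+1} \cap \big( \Omega \setminus \overline{\Omega}_{j-1} \big). \]

Then $\{\Omega_j\}_{j \in \mathbb{N}}$ is a nested open cover of $\Omega$, and, in particular, we can apply Lemma 3.14 (we will use this at the end of the proof). Moreover, each $\Omega_j$ has compact closure in $\Omega$, and so Theorem 3.12 applies. Define $O = \{U_j\}_{j \in \mathbb{N}}$, and note that $O$ is also an open cover of $\Omega$ (although it is not nested).

Let $\Psi = \{\psi_j\}_{j \in \mathbb{N}}$ be a partition of unity for $\Omega$ subordinate to $O$, and note that the local finiteness property of partitions of unity shows that $\psi_j \in \mathcal{C}^\infty(U_j)$ for all $j$, and we also have

\[ \sum_{j=1}^{\infty} \psi_j(x) = 1 \]

for all $x \in \Omega$.

From the definition of $U_j$, if $0 < \varepsilon_j < \frac{1}{(j+1)(j+2)} = \frac{1}{j+1} - \frac{1}{j+2}$ then $J_{\varepsilon_j} * (\psi_j u)$ has support in the set

\[ V_j := \Omega_{j+2} \cap \big( \Omega \setminus \overline{\Omega}_{j-2} \big) \subset \Omega. \]

Since $\psi_j u \in W^{k,p}(\Omega)$, by Theorem 3.12 we can find $\varepsilon_j$ such that $0 < \varepsilon_j < \frac{1}{(j+1)(j+2)}$ and

\[ \| J_{\varepsilon_j} * (\psi_j u) - \psi_j u \|_{W^{k,p}(\Omega_{j+2})} < \frac{\varepsilon}{2^{j+1}}. \]

Define

\[ \phi = \sum_{j=1}^{\infty} J_{\varepsilon_j} * (\psi_j u). \]

On any compact subset $K \subset \Omega$, all but finitely many terms in the sum vanish, and so $\phi \in \mathcal{C}^\infty(\Omega)$. Now note that if $x \in \Omega_\ell$, then

\[ u(x) = \sum_{j=1}^{\ell+2} \psi_j(x) u(x), \qquad \phi(x) = \sum_{j=1}^{\ell+2} \big( J_{\varepsilon_j} * (\psi_j u) \big)(x), \]

and so for each $\ell \in \mathbb{N}$

\[ \|u - \phi\|_{W^{k,p}(\Omega_\ell)} \leq \sum_{j=1}^{\ell+2} \| J_{\varepsilon_j} * (\psi_j u) - \psi_j u \|_{W^{k,p}(\Omega_{j+2})} < \tfrac{1}{2} \varepsilon. \]

An application of Lemma 3.14 then shows that $\|u - \phi\|_{W^{k,p}(\Omega)} \leq \frac{1}{2} \varepsilon < \varepsilon$, as required. ∎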

This theorem shows that Wk,pΩBk,pΩ. Combining this with Lemma 3.8 gives us the following corollary, which states that Wk,pΩ is the completion of the space of 𝒞kΩ functions in the Sobolev norm Wk,pΩ.

Corollary 3.16.

Let $\Omega$ be an open subset of $\mathbb{R}^n$, and let $1 \leq p < \infty$. Then

\[ B^{k,p}(\Omega) = W^{k,p}(\Omega) \]

for any $k \in \mathbb{Z}_{\geq 0}$.

Remark 3.17.

The statement of the corollary above is that $W^{k,p}(\Omega)$ is the completion of the space of $\mathcal{C}^k$ functions with respect to the $W^{k,p}$-norm. Since $\mathcal{C}^\infty(\Omega) \subset \mathcal{C}^k(\Omega)$ and Theorem 3.15 is stated for smooth functions, we also have that $W^{k,p}(\Omega)$ is the completion of the space of smooth functions in the Sobolev norm.

3.3. The dual space of a Sobolev space

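In this section $\Omega$ denotes an open subset of $\mathbb{R}^n$, $1 \leq p < \infty$, and $q$ denotes the conjugate exponent to $p$, i.e. $q = \frac{p}{p-1}$ if $1 < p < \infty$ and $q = \infty$ if $p = 1$.

First recall Theorem A.4, which says that the dual of $L^p(\Omega)$ is isomorphic to $L^q(\Omega)$ if $1 \leq p < \infty$. The proof of this theorem involves showing that for each linear functional $\Lambda : L^p(\Omega) \rightarrow \mathbb{R}$ there exists a function $v \in L^q(\Omega)$ (unique up to equivalence in $L^q(\Omega)$) such that

\[ \Lambda(u) = \int_\Omega u v \, dx \]

for all $u \in L^p(\Omega)$. Moreover, as part of the construction, the proof also shows that $\|v\|_{L^q(\Omega)} = \|\Lambda\|_{L^p(\Omega)^*}$. The converse is also true, so we have an isometric isomorphism $L^p(\Omega)^* \cong L^q(\Omega)$.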

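The goal of this section is to provide a description of the dual space to the Sobolev space $W^{k,p}(\Omega)$. It is important to point out that most of the hard work is done in proving the previous theorem for $L^p$ spaces, and that the proofs given below rely heavily on this construction. More details can be found in [1, Chapter 3].

Let $\langle \cdot, \cdot \rangle : L^p(\Omega) \times L^q(\Omega) \rightarrow \mathbb{R}$ denote the dual pairing

\[ \langle u, v \rangle := \int_\Omega u v \, dx \]

for $u \in L^p(\Omega)$ and $v \in L^q(\Omega)$, and, for $N \in \mathbb{N}$, let $L^q(\Omega)^N := L^q(\Omega) \times \cdots \times L^q(\Omega)$ denote the product of $N$ copies of $L^q(\Omega)$.

There is a map $F : L^q(\Omega)^N \rightarrow W^{k,p}(\Omega)^*$ that takes a vector of functions $(v_\alpha)$ to the linear functional $\Lambda(u) = \sum_{0 \leq |\alpha| \leq k} \langle D^\alpha u, v_\alpha \rangle$. The first theorem below shows that this map is surjective, and therefore we can characterise elements of the dual $W^{k,p}(\Omega)^*$ in terms of elements of $L^q(\Omega)^N$.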

Theorem 3.18.

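Given $k \in \mathbb{Z}_{\geq 0}$, let $N$ be the number of multi-indices $\alpha$ such that $0 \leq |\alpha| \leq k$. For every functional $\Lambda \in W^{k,p}(\Omega)^*$ there exists $(v_\alpha)_{0 \leq |\alpha| \leq k} \in L^q(\Omega)^N$ such that for all $u \in W^{k,p}(\Omega)$ we have

\[ \Lambda(u) = \sum_{0 \leq |\alpha| \leq k} \langle D^\alpha u, v_\alpha \rangle. \]

Moreover, if we define $V$ to be the set of all $(v_\alpha)_{0 \leq |\alpha| \leq k} \in L^q(\Omega)^N$ satisfying the previous equation, then

\[ \|\Lambda\|_{W^{k,p}(\Omega)^*} = \inf_{(v_\alpha) \in V} \|(v_\alpha)\|_{L^q(\Omega)^N}, \tag{3.2} \]

and this infimum is attained by some $(v_\alpha) \in L^q(\Omega)^N$.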
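To make the integer $N$ concrete, the following aside counts the multi-indices of order at most $k$ in $n$ variables. The binomial formula is the standard stars-and-bars count; it is not stated in the notes themselves and is included only as an illustration.

```latex
\[
N = \#\left\{ \alpha \in \mathbb{Z}_{\geq 0}^n : |\alpha| \leq k \right\}
  = \binom{n+k}{k}.
\]
% Example: n = 2, k = 1 gives N = 3, with multi-indices
%   (0,0), (1,0), (0,1),
% so a functional on W^{1,p}(\Omega) is represented by a triple
%   (v_{(0,0)}, v_{(1,0)}, v_{(0,1)}) \in L^q(\Omega)^3.
% Example: n = 2, k = 2 gives N = 6, with multi-indices
%   (0,0), (1,0), (0,1), (2,0), (1,1), (0,2).
```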

Proof.

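First note that, by the definition of $W^{k,p}(\Omega)$, there exists a linear map

\[ P : W^{k,p}(\Omega) \rightarrow L^p(\Omega)^N, \qquad u \mapsto (D^\alpha u)_{0 \leq |\alpha| \leq k}. \]

By the definition of the norms on $W^{k,p}(\Omega)$ and $L^p(\Omega)^N$, the map $P$ is an isometry, and therefore $P$ is an isometric isomorphism onto its image.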

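Given $\Lambda \in W^{k,p}(\Omega)^*$ define $\Lambda^* \in P(W^{k,p}(\Omega))^*$, a linear functional on the image of $P$, by

\[ \Lambda^*(P(u)) = \Lambda(u) \quad \text{for all } u \in W^{k,p}(\Omega). \]

Since $P$ is an isometric isomorphism, then

\[ \|\Lambda^*\|_{P(W^{k,p}(\Omega))^*} = \|\Lambda\|_{W^{k,p}(\Omega)^*}. \]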

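The Hahn-Banach theorem (see for example [11, p76]) shows that there is a norm-preserving extension $\tilde{\Lambda}$ of $\Lambda^*$ to all of $L^p(\Omega)^N$, and, together with the characterisation of the dual of $L^p(\Omega)$, this shows that there exists $(v_\alpha) \in L^q(\Omega)^N$ such that

\[ \tilde{\Lambda}(w) = \sum_{0 \leq |\alpha| \leq k} \langle w_\alpha, v_\alpha \rangle \]

for any $w = (w_\alpha) \in L^p(\Omega)^N$. Moreover, we also have

\[ \|\tilde{\Lambda}\|_{(L^p(\Omega)^N)^*} = \left( \sum_{0 \leq |\alpha| \leq k} \|v_\alpha\|_{L^q(\Omega)}^q \right)^{\frac{1}{q}} = \|(v_\alpha)\|_{L^q(\Omega)^N} \]

(with the sum replaced by a maximum when $q = \infty$).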

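Therefore, we have shown that for any $\Lambda \in W^{k,p}(\Omega)^*$ there exists $v = (v_\alpha)_{0 \leq |\alpha| \leq k} \in L^q(\Omega)^N$ such that for all $u \in W^{k,p}(\Omega)$ we have

\[ \Lambda(u) = \Lambda^*(P(u)) = \tilde{\Lambda}(P(u)) = \sum_{0 \leq |\alpha| \leq k} \langle D^\alpha u, v_\alpha \rangle. \]

Moreover, at each stage of the construction, we also showed that

\[ \|\Lambda\|_{W^{k,p}(\Omega)^*} = \|\Lambda^*\|_{P(W^{k,p}(\Omega))^*} = \|\tilde{\Lambda}\|_{(L^p(\Omega)^N)^*} = \|(v_\alpha)\|_{L^q(\Omega)^N}. \]

Unfortunately this map $F : L^q(\Omega)^N \rightarrow W^{k,p}(\Omega)^*$ is not an isomorphism, since it may have a non-trivial kernel, as the next example shows.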

Example 3.19.

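Let $\Omega$ be an open subset of $\mathbb{R}$, and let $\varphi$ be a smooth function on $\Omega$ with compact support. Then

\[ \int_\Omega (\partial_x u) \, \varphi \, dx = - \int_\Omega u \, (\partial_x \varphi) \, dx \tag{3.3} \]

by the definition of weak derivative. Now consider the vector $(\partial_x \varphi, \varphi) \in L^q(\Omega)^2$. The linear functional $\Lambda \in W^{1,p}(\Omega)^*$ associated to this vector is

\[ \Lambda(u) = \langle u, \partial_x \varphi \rangle + \langle \partial_x u, \varphi \rangle, \]

which is zero by (3.3). Therefore, for every non-zero smooth function $\varphi$ with compact support contained in $\Omega$, the vector $(\partial_x \varphi, \varphi) \in L^q(\Omega)^2$ is a non-trivial element of the kernel of the map $F : L^q(\Omega)^2 \rightarrow W^{1,p}(\Omega)^*$.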
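As a concrete check of this cancellation, here is a sketch with choices that are not in the example itself: the interval $\Omega = (-2, 2)$ and the function $u(x) = |x|$, whose weak derivative is the sign function. The test function $\varphi$ is arbitrary, smooth, and compactly supported in $\Omega$, so it vanishes near $x = \pm 2$.

```latex
% Assumptions (illustrative only): \Omega = (-2,2), u(x) = |x|, \partial_x u = \mathrm{sign}(x),
% \varphi smooth with compact support in \Omega, so the boundary terms at \pm 2 vanish.
\begin{align*}
\langle u, \partial_x \varphi \rangle
  &= \int_0^2 x \, \varphi'(x) \, dx - \int_{-2}^0 x \, \varphi'(x) \, dx \\
  &= - \int_0^2 \varphi \, dx + \int_{-2}^0 \varphi \, dx
     && \text{(integrate by parts on each half)}, \\
\langle \partial_x u, \varphi \rangle
  &= \int_0^2 \varphi \, dx - \int_{-2}^0 \varphi \, dx .
\end{align*}
% Adding the two lines gives \Lambda(u) = 0, as predicted by (3.3).
```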

Remark 3.20.

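More generally, if the functional $\Lambda$ is represented by a vector of smooth functions, i.e. $(v_\alpha) \in \mathcal{C}^\infty_{cpt}(\Omega)^N \subset L^q(\Omega)^N$, then we can write

\[ \Lambda(u) = \sum_{0 \leq |\alpha| \leq k} \langle D^\alpha u, v_\alpha \rangle = \sum_{0 \leq |\alpha| \leq k} \langle u, (-1)^{|\alpha|} D^\alpha v_\alpha \rangle. \]

Therefore $\Lambda(u) = \langle u, f \rangle$, where $f = \sum_{0 \leq |\alpha| \leq k} (-1)^{|\alpha|} D^\alpha v_\alpha$. In particular, we see that $\Lambda$ is the zero functional if $f \equiv 0$.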

The next lemma shows that each element of the dual of a Sobolev space can be regarded as an extension of some distribution.

Lemma 3.21.

Let $\Lambda \in W^{k,p}(\Omega)^*$. Then there exists $T \in \mathcal{D}(\Omega)^*$ such that $\Lambda(\phi) = T(\phi)$ for all $\phi \in \mathcal{D}(\Omega)$.

Proof.

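Using the previous theorem, there exists $v = (v_\alpha)_{0 \leq |\alpha| \leq k} \in L^q(\Omega)^N$ such that

\[ \Lambda(u) = \sum_{0 \leq |\alpha| \leq k} \langle D^\alpha u, v_\alpha \rangle \]

for every $u \in W^{k,p}(\Omega)$. Note that if $\phi \in \mathcal{D}(\Omega)$, then

\begin{align*}
\Lambda(\phi) &= \sum_{0 \leq |\alpha| \leq k} \langle D^\alpha \phi, v_\alpha \rangle = \sum_{0 \leq |\alpha| \leq k} \int_\Omega (D^\alpha \phi) \, v_\alpha \, dx \\
&= \sum_{0 \leq |\alpha| \leq k} T_{v_\alpha}(D^\alpha \phi) = \sum_{0 \leq |\alpha| \leq k} (-1)^{|\alpha|} D^\alpha T_{v_\alpha}(\phi),
\end{align*}

where, in the last term, $D^\alpha T_{v_\alpha}$ refers to the distributional derivative of the distribution $T_{v_\alpha}$ associated to $v_\alpha$.

Define

\[ T = \sum_{0 \leq |\alpha| \leq k} (-1)^{|\alpha|} D^\alpha T_{v_\alpha} \in \mathcal{D}(\Omega)^*. \]

Then we have shown that $T(\phi) = \Lambda(\phi)$ for all $\phi \in \mathcal{D}(\Omega)$. ∎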

The previous theorems give different characterisations of elements of the dual of $W^{k,p}(\Omega)$: Theorem 3.18 shows that there is a surjective map $F : L^q(\Omega)^N \rightarrow W^{k,p}(\Omega)^*$, while Lemma 3.21 shows that the restriction of each linear functional to $\mathcal{D}(\Omega)$ is a distribution. Therefore we have maps $L^q(\Omega)^N \rightarrow W^{k,p}(\Omega)^*$ and $W^{k,p}(\Omega)^* \rightarrow \mathcal{D}(\Omega)^*$.

Unfortunately, these results do not give a nice description of the kernel of the first map and the image of the second map. In addition, the second map may have a non-trivial kernel (see Remark 3.24). It turns out that $W_0^{k,p}(\Omega)$ has better properties with respect to the second map, and the next theorem describes the image of the restriction map $W_0^{k,p}(\Omega)^* \rightarrow \mathcal{D}(\Omega)^*$.

Theorem 3.22.

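The dual space $W_0^{k,p}(\Omega)^*$ is isometrically isomorphic to the Banach space consisting of those distributions $T \in \mathcal{D}(\Omega)^*$ that satisfy

\[ T = \sum_{0 \leq |\alpha| \leq k} (-1)^{|\alpha|} D^\alpha T_{v_\alpha} \tag{3.4} \]

for some $v = (v_\alpha) \in L^q(\Omega)^N$, and whose norm is given by

\[ \|T\| := \inf \left\{ \|v\|_{L^q(\Omega)^N} : v \in L^q(\Omega)^N \text{ and } T = \sum_{0 \leq |\alpha| \leq k} (-1)^{|\alpha|} D^\alpha T_{v_\alpha} \right\}. \tag{3.5} \]

Proof.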

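Let $V \subset \mathcal{D}(\Omega)^*$ be the space of distributions satisfying (3.4) for some $v = (v_\alpha) \in L^q(\Omega)^N$, and let $T \in V$. The goal of the proof is to show that $T$ has a unique extension to some $\Lambda \in W_0^{k,p}(\Omega)^*$, and, moreover, that this map $T \mapsto \Lambda$ is the inverse of the restriction map from the previous lemma.

Given $u \in W_0^{k,p}(\Omega)$, let $\{\phi_n\}_{n\in\mathbb{N}}$ be a sequence of test functions converging to $u$ in the $W^{k,p}(\Omega)$-norm (note that this is not the same as convergence in the topology on the space of test functions). Such a sequence exists by the definition of $W_0^{k,p}(\Omega)$. We claim that $\{T(\phi_n)\}_{n\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{R}$, which is a consequence of the following calculation

\begin{align*}
|T(\phi_m) - T(\phi_n)| &\leq \sum_{0 \leq |\alpha| \leq k} \left| T_{v_\alpha}(D^\alpha \phi_m - D^\alpha \phi_n) \right| \\
&\leq \sum_{0 \leq |\alpha| \leq k} \|D^\alpha (\phi_m - \phi_n)\|_{L^p(\Omega)} \|v_\alpha\|_{L^q(\Omega)} && \text{(Hölder's inequality)} \\
&\leq \|\phi_m - \phi_n\|_{W^{k,p}(\Omega)} \sum_{0 \leq |\alpha| \leq k} \|v_\alpha\|_{L^q(\Omega)} && \text{(definition of the $W^{k,p}$ norm)},
\end{align*}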

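which converges to zero, since $\{\phi_n\}_{n\in\mathbb{N}}$ is a Cauchy sequence in $W^{k,p}(\Omega)$. Therefore $\lim_{n\to\infty} T(\phi_n)$ exists, and we claim that the limit only depends on $u$. To see this, consider another sequence $\{\varphi_n\}_{n\in\mathbb{N}}$ of test functions converging to $u$ in the $W^{k,p}(\Omega)$ norm, and note that the same calculation as above shows that

\begin{align*}
|T(\phi_n) - T(\varphi_n)| &\leq \|\phi_n - \varphi_n\|_{W^{k,p}(\Omega)} \sum_{0 \leq |\alpha| \leq k} \|v_\alpha\|_{L^q(\Omega)} \\
&\leq \left( \|\phi_n - u\|_{W^{k,p}(\Omega)} + \|\varphi_n - u\|_{W^{k,p}(\Omega)} \right) \sum_{0 \leq |\alpha| \leq k} \|v_\alpha\|_{L^q(\Omega)},
\end{align*}

which converges to zero as $n \to \infty$. Therefore, we can define

\[ \Lambda(u) := \lim_{n\to\infty} T(\phi_n). \]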

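Clearly $\Lambda$ is linear, since both $T$ and the operation of taking the limit in $W^{k,p}(\Omega)$ are linear. To see that $\Lambda$ is bounded, we compute

\[ |\Lambda(u)| = \lim_{n\to\infty} |T(\phi_n)| \leq \lim_{n\to\infty} \|\phi_n\|_{W^{k,p}(\Omega)} \sum_{0 \leq |\alpha| \leq k} \|v_\alpha\|_{L^q(\Omega)} = \|u\|_{W^{k,p}(\Omega)} \sum_{0 \leq |\alpha| \leq k} \|v_\alpha\|_{L^q(\Omega)}, \]

and so $\|\Lambda\|_{W^{k,p}(\Omega)^*} \leq \sum_{0 \leq |\alpha| \leq k} \|v_\alpha\|_{L^q(\Omega)}$.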

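Therefore, we have shown that $T$ has an extension to $\Lambda \in W_0^{k,p}(\Omega)^*$, and, moreover, this extension is unique since $\mathcal{C}^\infty_{cpt}(\Omega)$ is dense in $W_0^{k,p}(\Omega)$. More precisely, any other bounded linear functional $\Lambda'$ that restricts to $T$ on $\mathcal{C}^\infty_{cpt}(\Omega)$ must satisfy

\begin{align*}
\Lambda(u) - \Lambda'(u) &= \lim_{n\to\infty} \Lambda(\phi_n) - \lim_{n\to\infty} \Lambda'(\phi_n) && \text{(since $\Lambda$ and $\Lambda'$ are both continuous)} \\
&= \lim_{n\to\infty} T(\phi_n) - \lim_{n\to\infty} T(\phi_n) = 0.
\end{align*}

By construction, $\Lambda(\phi) = T(\phi)$ for every test function $\phi$, and so the map $V \rightarrow W_0^{k,p}(\Omega)^*$ is the inverse of the restriction map from Lemma 3.21. To see that this is an isometry, note that Theorem 3.18 shows that the norm on $V$ given by (3.5) is the same as the norm on $W^{k,p}(\Omega)^*$ given by (3.2). Therefore $V$ is isometrically isomorphic to $W_0^{k,p}(\Omega)^*$, which also implies that $V$ is a Banach space. ∎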

Remark 3.23.

The space $V$ is a strict subset of $\mathcal{D}(\Omega)^*$, since there are many distributions that cannot be written as

\[ T = \sum_{0 \leq |\alpha| \leq k} (-1)^{|\alpha|} D^\alpha T_{v_\alpha} \]

for some $(v_\alpha) \in L^q(\Omega)^N$. For example, the delta functional can never be written in this form, since Example 2.10 shows that it cannot be represented by a function.

Remark 3.24.
  1. As part of the previous proof we showed that the restriction map

    \[ W_0^{k,p}(\Omega)^* \rightarrow \mathcal{D}(\Omega)^* \]

    is injective. It is natural to ask whether these results can be extended to $W^{k,p}(\Omega)$; however, the previous proof will not work, since it depends on the fact that, by definition, $W_0^{k,p}(\Omega)$ is the completion of $\mathcal{C}^\infty_{cpt}(\Omega)$ in the $W^{k,p}$ norm (the first step is to approximate an element of $W_0^{k,p}(\Omega)$ by a sequence of smooth functions with compact support).

  2. One could still ask whether there is an alternative proof that works for $W^{k,p}(\Omega)$; however, it turns out that in general the answer is no, since the extension of a linear functional $T \in \mathcal{D}(\Omega)^*$ to a linear functional $\Lambda \in W^{k,p}(\Omega)^*$ may be non-unique. When the domain $\Omega$ is bounded and the boundary has good properties, one can construct examples using the trace operator $W^{1,p}(\Omega) \rightarrow L^p(\partial\Omega)$ (see [4, Section 5.5] for the construction), which is zero on $\mathcal{C}^\infty_{cpt}(\Omega)$ but non-zero in general. Therefore the restriction map $W^{k,p}(\Omega)^* \rightarrow \mathcal{D}(\Omega)^*$ has non-zero kernel, and so we cannot identify $W^{k,p}(\Omega)^*$ with a subspace of $\mathcal{D}(\Omega)^*$ in this case. Note that the trace operator is zero precisely on the subspace $W_0^{1,p}(\Omega)$ (see [4, Theorem 2, Section 5.5] for more details).

3.4. Positive distributions can be represented by measures (the Riesz representation theorem)

Given the results of the previous section on the dual space of a Sobolev space, it is natural to ask whether there is a nice characterisation of $\mathcal{D}(\Omega)^*$ in terms of familiar objects, and it is the goal of this section to answer this question for positive, real-valued distributions.

As we have seen from Theorem 2.8, there is an injective map $L^1_{loc}(\Omega) \hookrightarrow \mathcal{D}(\Omega)^*$. Unfortunately, as explained in Example 2.10, the set $L^1_{loc}(\Omega)$ is too small to provide a unique representative for every distribution. In Theorem 3.35 we show that regular Borel measures are the right class of objects to represent positive distributions.

This theorem is also proved in [12, Theorem 2.14] (for the dual of the space of continuous functions with compact support) and [8, Theorem 6.22] (for the dual of the space of smooth functions with compact support). Both proofs follow a similar strategy, which involves first using the distribution to define an outer measure, and then showing that open sets are all measurable with respect to this outer measure. Rudin also considers the case of complex-valued distributions in [12, Theorem 6.19], and a more general proof (for the dual of the space $\mathcal{C}_{cpt}(\Omega, \mathbb{R}^m)$) is given in [5, Section 1.8].

Note that in [8] the proof only uses the Riemann integral, and in particular it does not involve Lebesgue measure. Since we are assuming the construction of Lebesgue measure (and a construction using outer measure is also given in Definition B.18), we are free to use it here where it simplifies the proof.

For this entire section we use the following notation: let $\Omega$ be an open subset of $\mathbb{R}^n$, let $O(\Omega)$ denote the collection of open subsets of $\Omega$, and let $\mathcal{B}$ denote the Borel $\sigma$-algebra generated by the open subsets of $\Omega$.

Definition 3.25.

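Let $T \in \mathcal{D}(\Omega)^*$. The distribution $T$ is a positive distribution if $T(\phi) \geq 0$ for all $\phi \in \mathcal{D}(\Omega)$ such that $\phi(x) \geq 0$ for all $x \in \Omega$.

In the following, let $U \subseteq \Omega$ be an open set, and define $\mathcal{C}_U$ to be the set of functions $\phi \in \mathcal{C}^\infty_{cpt}(\Omega)$ with $0 \leq \phi \leq 1$ and $\operatorname{supp} \phi \subset U$ (note that Urysohn's lemma shows that this set is nonempty if $U$ is nonempty).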

Lemma 3.26.

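Let $T \in \mathcal{D}(\Omega)^*$ be a positive distribution. Then the function $\mu : O(\Omega) \rightarrow [0, \infty]$ defined by

\[ \mu(U) := \begin{cases} \sup_{\phi \in \mathcal{C}_U} T(\phi) & \text{if $U$ is nonempty} \\ 0 & \text{if } U = \emptyset \end{cases} \tag{3.6} \]

satisfies

  1. $\mu(U_1) \leq \mu(U_2)$ if $U_1 \subseteq U_2$ are open sets,

  2. $\mu\left( \bigcup_n U_n \right) \leq \sum_n \mu(U_n)$ for every countable collection of open subsets $\{U_n\}_{n\in\mathbb{N}} \subset O(\Omega)$.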

Proof.

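The first property follows from the fact that $U_1 \subseteq U_2$ implies that $\mathcal{C}_{U_1} \subseteq \mathcal{C}_{U_2}$.

To prove the second property, we first show that

\[ \mu(U_1 \cup U_2) \leq \mu(U_1) + \mu(U_2) \]

for any open sets $U_1, U_2 \in O(\Omega)$. Given any $\phi \in \mathcal{C}_{U_1 \cup U_2}$, let $K = \operatorname{supp} \phi$, and apply Lemma A.12 to show that there exist functions $\phi_1$ and $\phi_2$ such that $\phi\phi_1 \in \mathcal{C}_{U_1}$, $\phi\phi_2 \in \mathcal{C}_{U_2}$, and $\phi\phi_1 + \phi\phi_2 = \phi$. Therefore

\[ T(\phi) = T(\phi\phi_1) + T(\phi\phi_2) \leq \mu(U_1) + \mu(U_2) \]

for all $\phi \in \mathcal{C}_{U_1 \cup U_2}$, and so $\mu(U_1 \cup U_2) \leq \mu(U_1) + \mu(U_2)$. Induction then shows that for any $N \in \mathbb{N}$

\[ \mu\left( \bigcup_{n=1}^N U_n \right) \leq \sum_{n=1}^N \mu(U_n), \tag{3.7} \]

and so it only remains to extend this to countable collections of open sets. To do this, note that any $\phi \in \mathcal{C}_{\bigcup_n U_n}$ has compact support contained in $\bigcup_n U_n$, and so there exists a finite collection of sets (re-order so that these are $U_1, \ldots, U_N$) such that $\operatorname{supp} \phi \subset \bigcup_{n=1}^N U_n$. Equation (3.7) then gives us

\[ T(\phi) \leq \sum_{n=1}^N \mu(U_n) \leq \sum_{n\in\mathbb{N}} \mu(U_n), \]

which completes the proof. ∎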

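Now extend $\mu$ to a function $\mu^*$ on the set of all subsets of $\Omega$ by

\[ \mu^*(A) := \inf \left\{ \mu(U) : A \subseteq U \text{ and } U \in O(\Omega) \right\}. \tag{3.8} \]

Lemma 3.27.

The function $\mu^*$ is an outer measure on $\Omega$.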

Proof.

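Recall that we have to prove that each of the following conditions holds.

  1. $\mu^*(A) \geq 0$ for all $A \subseteq \Omega$ and $\mu^*(\emptyset) = 0$,

  2. $\mu^*(A_1) \leq \mu^*(A_2)$ if $A_1 \subseteq A_2$, and

  3. $\mu^*\left( \bigcup_n A_n \right) \leq \sum_n \mu^*(A_n)$ for any countable collection of sets $\{A_n\}_{n\in\mathbb{N}} \subset P(\Omega)$.

The first two of the above properties follow easily from the respective definitions of $\mu^*$ and $\mu$, and so it only remains to show countable subadditivity. For any $\varepsilon > 0$, let $\{U_n\}_{n\in\mathbb{N}}$ be a collection of open subsets of $\Omega$ such that $A_n \subseteq U_n$ and $\mu^*(U_n) = \mu(U_n) \leq \mu^*(A_n) + 2^{-n}\varepsilon$ (these sets exist since $\mu^*$ is defined using the infimum). Then

\[ \mu^*\left( \bigcup_n A_n \right) \leq \mu^*\left( \bigcup_n U_n \right) \leq \sum_n \mu^*(A_n) + \varepsilon. \]

Since we can do this for any $\varepsilon > 0$, we have

\[ \mu^*\left( \bigcup_n A_n \right) \leq \sum_n \mu^*(A_n), \]

as required. ∎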

It is worth pausing at this stage to consider some examples.

Example 3.28.
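  1. Given $x \in \Omega$, let $T = \delta_x$, the delta functional at $x$. Then for any subset $A \subseteq \Omega$ we have

    \[ \mu(A) = \begin{cases} 1 & x \in A \\ 0 & x \notin A \end{cases} \]

  2. Let $\Omega = \mathbb{R}^n$ with co-ordinates $(x_1, \ldots, x_n)$, and let $T$ be the distribution defined by integration on the subspace $\mathbb{R}^1 = \{x_2 = \cdots = x_n = 0\}$, i.e.

    \[ T(\phi) = \int_{-\infty}^{\infty} \phi(t, 0, \ldots, 0) \, dt. \]

    Then for any subset $A \subseteq \Omega$ we have $\mu(A) = |\mathbb{R}^1 \cap A|$, where $|\cdot|$ denotes the one-dimensional Lebesgue measure on $\mathbb{R}^1$.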
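For the first example, the value of the outer measure can be traced directly through definitions (3.6) and (3.8). The following is a sketch of that computation for the delta functional at a point $x \in \Omega$; the only ingredient beyond the definitions is the smooth Urysohn lemma, applied to the compact set $\{x\}$.

```latex
% T = \delta_x and U \subseteq \Omega open.
% Case x \in U: every \phi \in \mathcal{C}_U satisfies T(\phi) = \phi(x) \leq 1, and the
%   smooth Urysohn lemma with K = \{x\} gives some \phi \in \mathcal{C}_U with \phi(x) = 1.
% Case x \notin U: \operatorname{supp}\phi \subset U forces \phi(x) = 0 for every \phi \in \mathcal{C}_U.
\[
\mu(U) = \sup_{\phi \in \mathcal{C}_U} \phi(x)
       = \begin{cases} 1 & x \in U, \\ 0 & x \notin U. \end{cases}
\]
% For an arbitrary A \subseteq \Omega:
%   if x \in A, every open U \supseteq A contains x, so \mu^*(A) = 1;
%   if x \notin A, then U = \Omega \setminus \{x\} is open, contains A, and \mu(U) = 0,
%   so \mu^*(A) = 0.
```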

Theorem B.16 shows that to construct a measure $\mu$ from $\mu^*$ we need to restrict to the $\sigma$-algebra of measurable subsets. The next lemma shows that, for the outer measure constructed above, this $\sigma$-algebra contains the Borel $\sigma$-algebra $\mathcal{B}$.

Lemma 3.29.

All open sets $U \in O(\Omega)$ are measurable with respect to $\mu^*$, i.e. for every set $A \subseteq \Omega$ we have

\[ \mu^*(A) = \mu^*(A \cap U) + \mu^*(A \cap (\Omega \setminus U)). \]

Proof.

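Since $A = (A \cap U) \cup (A \cap (\Omega \setminus U))$, then the inequality

\[ \mu^*(A) \leq \mu^*(A \cap U) + \mu^*(A \cap (\Omega \setminus U)) \]

follows from the previous lemma, and so it only remains to show the reverse inequality. First consider the case where $A$ is an open subset of $\Omega$. Given any open set $U \subseteq \Omega$ and any $\varepsilon > 0$, choose $\phi \in \mathcal{C}_{A \cap U}$ such that $T(\phi) \geq \mu^*(A \cap U) - \frac{1}{2}\varepsilon$ (such a $\phi$ exists since $\mu^*(A \cap U) = \mu(A \cap U)$ is defined using the supremum). Let $K = \operatorname{supp} \phi$. Then $\Omega \setminus K$ is open, and $K \subset A \cap U \subseteq U$ implies that $\Omega \setminus U \subseteq \Omega \setminus K$.

Now choose $\psi \in \mathcal{C}_{(\Omega \setminus K) \cap A}$ such that $T(\psi) \geq \mu^*((\Omega \setminus K) \cap A) - \frac{1}{2}\varepsilon$ (again, such a $\psi$ exists since $\mu^*((\Omega \setminus K) \cap A) = \mu((\Omega \setminus K) \cap A)$ is defined using the supremum). Since $\operatorname{supp} \phi = K$ and $\operatorname{supp} \psi \subset (\Omega \setminus K) \cap A \subseteq \Omega \setminus K$, then $\phi$ and $\psi$ have disjoint support, and so

\begin{align*}
\mu^*(A) = \mu(A) &\geq T(\phi) + T(\psi) \\
&\geq \mu^*(A \cap U) - \tfrac{1}{2}\varepsilon + \mu^*((\Omega \setminus K) \cap A) - \tfrac{1}{2}\varepsilon \\
&\geq \mu^*(A \cap U) + \mu^*(A \cap (\Omega \setminus U)) - \varepsilon,
\end{align*}

where the last step follows from Lemma 3.26 and the fact that $\Omega \setminus U \subseteq \Omega \setminus K$. We can do this for any $\varepsilon > 0$, and so $\mu^*(A) \geq \mu^*(A \cap U) + \mu^*(A \cap (\Omega \setminus U))$ for any open set $U \subseteq \Omega$.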

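Now consider the case where $A$ is an arbitrary subset of $\Omega$. Then for any open set $U \subseteq \Omega$ containing $A$ and any open set $V \subseteq \Omega$ we have from Lemma 3.27

\begin{align*}
\mu^*(U) &\geq \mu^*(A) && \text{since } A \subseteq U, \\
\mu^*(U \cap V) &\geq \mu^*(A \cap V) && \text{since } A \cap V \subseteq U \cap V, \\
\text{and} \quad \mu^*(U \cap (\Omega \setminus V)) &\geq \mu^*(A \cap (\Omega \setminus V)) && \text{since } A \cap (\Omega \setminus V) \subseteq U \cap (\Omega \setminus V).
\end{align*}

Therefore

\[ \mu^*(U) = \mu^*(U \cap V) + \mu^*(U \cap (\Omega \setminus V)) \geq \mu^*(A \cap V) + \mu^*(A \cap (\Omega \setminus V)) \]

for every open set $U \subseteq \Omega$ containing $A$, and any open set $V \subseteq \Omega$. Therefore, since $\mu^*(A)$ is defined using the infimum, then

\[ \mu^*(A) \geq \mu^*(A \cap V) + \mu^*(A \cap (\Omega \setminus V)), \]

which completes the proof. ∎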

Therefore, by Theorem B.16, the function $\mu^*$ restricts to a measure (call it $\mu$) on the Borel $\sigma$-algebra $\mathcal{B}$. Note that this measure $\mu$ is given by (3.6) on open sets. The next two lemmas give a characterisation of $\mu$ on compact sets.

Lemma 3.30.

Given any compact set $K \subset \Omega$, and any $\psi \in \mathcal{C}^\infty_{cpt}(\Omega)$ such that $\psi \equiv 1$ on $K$ and $0 \leq \psi \leq 1$ on $\Omega$, we have $\mu(K) \leq T(\psi)$.

Proof.

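(See also [12, p43].) For all $\alpha$ such that $0 < \alpha < 1$, let $V_\alpha = \{x : \psi(x) > \alpha\}$. Then each $V_\alpha$ is open, and since $\psi \equiv 1$ on $K$ we have $K \subset V_\alpha$. Moreover, if $\phi \in \mathcal{C}_{V_\alpha}$ then $\alpha\phi(x) \leq \psi(x)$ for all $x \in V_\alpha$. Therefore $T(\phi) \leq \frac{1}{\alpha} T(\psi)$ (since $T$ is a positive distribution) and we have

\[ \mu(K) \leq \mu(V_\alpha) = \sup \left\{ T(\phi) : \phi \in \mathcal{C}_{V_\alpha} \right\} \leq \frac{1}{\alpha} T(\psi) \]

for all $\alpha$ such that $0 < \alpha < 1$. Therefore $\mu(K) \leq T(\psi)$. ∎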

Corollary 3.31.

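If $K$ is compact, then $\mu(K)$ is finite.

Lemma 3.32.

Let $K \subset \Omega$ be a compact set. Then

\[ \mu(K) = \inf \left\{ T(\psi) : \psi \in \mathcal{C}^\infty_{cpt}(\Omega), \text{ and } \psi \equiv 1 \text{ on } K \right\}. \tag{3.9} \]

Proof.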

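Firstly note that compact sets are closed and therefore elements of the Borel $\sigma$-algebra. Given any $\varepsilon > 0$, let $U$ be an open set such that $K \subset U \subseteq \Omega$ and $\mu(U) \leq \mu(K) + \varepsilon$ (the existence of $U$ follows from outer regularity of $\mu$, which is a direct consequence of the definition of $\mu^*$ in (3.8)). Recall from Urysohn's lemma (Theorem A.11) that there exists $\psi \in \mathcal{C}^\infty_{cpt}(\Omega)$ such that $\operatorname{supp} \psi \subset U$, $0 \leq \psi(x) \leq 1$ for all $x \in U$, and $\psi \equiv 1$ on $K$. Then $\psi \in \mathcal{C}_U$, and so $T(\psi) \leq \mu(U)$ by (3.6). Therefore Lemma 3.30 implies that

\[ \mu(K) \leq T(\psi) \leq \mu(U) \leq \mu(K) + \varepsilon. \]

We can do this for any $\varepsilon > 0$ and any compact set $K \subset \Omega$, therefore (3.9) holds for any compact set $K$. ∎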

Lemma 3.33.

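Given any $\varepsilon > 0$ and any measurable set $A$ there exists an open set $U$ with $A \subseteq U$ and $\mu(U \setminus A) < \varepsilon$.

Proof.

If $\mu(A)$ is finite then the result follows easily, since $A$ is measurable and $\mu^*(A)$ is defined to be the infimum of $\mu(U)$ for $U \supseteq A$ open.

If $\mu(A)$ is infinite, then we first write the open set $\Omega$ as the countable union of compact sets

\[ \Omega = \bigcup_{\ell\in\mathbb{N}} K_\ell \]

(for example we could take each $K_\ell$ to be a closed ball), and note that

\[ A = \bigcup_{\ell\in\mathbb{N}} (A \cap K_\ell). \]

Each $A \cap K_\ell$ is a subset of a compact set, and therefore has finite measure, so we can find an open set $U_\ell$ such that $A \cap K_\ell \subseteq U_\ell$ and

\[ \mu\left( U_\ell \setminus (A \cap K_\ell) \right) < 2^{-\ell}\varepsilon. \]

Then $U = \bigcup_\ell U_\ell$ is an open set containing $A$, and

\[ \mu(U \setminus A) = \mu\left( \bigcup_\ell U_\ell \setminus \bigcup_\ell (A \cap K_\ell) \right) \leq \sum_\ell \mu\left( U_\ell \setminus (A \cap K_\ell) \right) < \varepsilon. \]

∎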

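We can now show that the measure $\mu$ is Borel regular (recall Definition B.8).

Lemma 3.34.

$\mu$ is a regular Borel measure on $\Omega$.

Proof.

Outer regularity of $\mu$ follows easily from the definition of $\mu^*$, and therefore it only remains to show that it is inner regular, i.e. for any measurable set $A \subseteq \Omega$ we have

\[ \mu(A) = \sup \left\{ \mu(K) : K \subseteq A \text{ and $K$ is compact} \right\}. \tag{3.10} \]

Given $\varepsilon > 0$, outer regularity of $\mu$ shows that there exists an open set $U$ such that $\Omega \setminus A \subseteq U$ and $\mu(U \setminus (\Omega \setminus A)) < \varepsilon$. Then, since we also have $\Omega \setminus U \subseteq A$, then

\[ U \setminus (\Omega \setminus A) = U \cap A = A \setminus (\Omega \setminus U), \]

and so the previous lemma shows that there exists a closed set $F = \Omega \setminus U$ such that

\[ \mu(A \setminus F) < \varepsilon. \]

Any closed set $F \subseteq \mathbb{R}^n$ is the countable union of compact sets; for example we can take $K_\ell = F \cap \overline{B(0, \ell)}$ for each $\ell \in \mathbb{N}$ and write $F = \bigcup_\ell K_\ell$. For $F = \Omega \setminus U$ as above, let $F_n = \bigcup_{\ell=1}^n K_\ell$. If $\mu(A)$ is infinite, then $\lim_{n\to\infty} \mu(F_n)$ is infinite also. If $\mu(A)$ is finite, then so is $\mu(F)$, therefore there exists $N$ such that $n \geq N$ implies that $\mu(F_n) > \mu(F) - \varepsilon$.

In both of these cases we see that $\mu(A)$ can be approximated by the measure of compact sets contained in $A$, which completes the proof of (3.10). ∎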

We are now ready to prove the main theorem of this section.

Theorem 3.35.

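Given a positive distribution $T$ there is a unique, positive, regular Borel measure $\mu$ on $\Omega$ such that

  1. $\mu(K) < \infty$ for all compact $K \subset \Omega$, and

  2. for all $\phi \in \mathcal{D}(\Omega)$ we have

    \[ T(\phi) = \int_\Omega \phi(x) \, d\mu. \tag{3.11} \]

Proof.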

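Given such a distribution $T$, we have already constructed a positive regular Borel measure $\mu$, which is defined on $\Omega$ and is finite on compact sets, and so it only remains to show (3.11) for all $\phi \in \mathcal{D}(\Omega)$.

First note that we can reduce to the case of $\phi \geq 0$, since both $T$ and the integral with respect to $\mu$ are linear, and any test function $\phi$ can be written $\phi = \phi_+ - \phi_-$ for non-negative test functions $\phi_+, \phi_- \geq 0$ (Lemma A.13).

For each $j, n \in \mathbb{N}$, define compact sets $K_j^n = \{x \in \Omega : \phi(x) \geq \frac{j}{n}\}$ (these are compact since $\phi$ is continuous with compact support), and define $K_0^n = \operatorname{supp} \phi$. Let $\chi_j^n$ be the characteristic function of $K_j^n$. Then

\[ \phi(x) \leq f_n(x) := \frac{1}{n} \sum_{j \geq 0} \chi_j^n(x), \]

with strict inequality on $\operatorname{supp} \phi$. Moreover, $f_n(x)$ converges pointwise to $\phi(x)$, since $|f_n(x) - \phi(x)| \leq \frac{1}{n}$ for each $n$. Since $f_n(x) \leq \sup_{x\in\Omega} \phi(x) + \frac{1}{n}$, and $\operatorname{supp} \phi$ is compact, we can construct a function in $L^1(\mu)$ that dominates $f_n$, and so the dominated convergence theorem shows that

\[ \lim_{n\to\infty} \int_\Omega f_n(x) \, d\mu = \int_\Omega \phi(x) \, d\mu. \]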

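Therefore it only remains to show that the integral of $f_n$ with respect to $\mu$ converges to $T(\phi)$. To see this, note that for each $\varepsilon > 0$ outer regularity of $\mu$ shows that we can choose $U_j^n$ to be an open set containing $K_j^n$ such that $\mu(U_j^n) < \mu(K_j^n) + \varepsilon$, and use Urysohn's lemma (Theorem A.11) to find $\psi_j^n \in \mathcal{C}^\infty_{cpt}(\Omega)$ such that $\psi_j^n \equiv 1$ on $K_j^n$ and $\operatorname{supp} \psi_j^n \subset U_j^n$. Then by construction, we have

\[ \phi(x) \leq f_n(x) \leq \frac{1}{n} \sum_{j \geq 0} \psi_j^n(x), \]

and therefore $T(\phi) \leq \frac{1}{n} \sum_{j \geq 0} T(\psi_j^n)$. By the definition of $\mu$ on open sets, we also have

\[ \frac{1}{n} \sum_{j \geq 0} T(\psi_j^n) \leq \frac{1}{n} \sum_{j \geq 0} \mu(U_j^n) < \frac{1}{n} \sum_{j \geq 0} \left( \mu(K_j^n) + \varepsilon \right). \]

This is true for all $\varepsilon > 0$, and so

\[ T(\phi) \leq \frac{1}{n} \sum_{j \geq 0} \mu(K_j^n) = \int_\Omega f_n(x) \, d\mu \]

for all $n$. Taking the limit as $n \to \infty$ gives us $T(\phi) \leq \int_\Omega \phi(x) \, d\mu$.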

Similarly, we can approximate ϕ from below by simple functions to obtain the opposite inequality. Since the idea is the same as above then we only sketch the details here.

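For each $j, n \in \mathbb{N}$, let $O_j^n = \{x \in \Omega : \phi(x) > \frac{j}{n}\}$, and let $\xi_j^n$ be the characteristic function of $O_j^n$. Then

\[ g_n(x) := \frac{1}{n} \sum_{j \geq 1} \xi_j^n(x) \leq \phi(x) \]

for all $n$, and $g_n$ converges pointwise to $\phi$ since $|\phi(x) - g_n(x)| \leq \frac{1}{n}$. Dominated convergence then shows that

\[ \lim_{n\to\infty} \int_\Omega g_n(x) \, d\mu = \int_\Omega \phi(x) \, d\mu. \]

For each $j, n \in \mathbb{N}$, inner regularity of $\mu$ implies that we can find a compact set $C_j^n$ such that $C_j^n \subset O_j^n$ and $\mu(C_j^n) > \mu(O_j^n) - \varepsilon$. Then use Urysohn's lemma to find $\psi_j^n \in \mathcal{C}^\infty_{cpt}(\Omega)$ such that $\psi_j^n \equiv 1$ on $C_j^n$ and $\operatorname{supp} \psi_j^n \subset O_j^n$. Then the same argument as before shows that

\[ T(\phi) \geq \frac{1}{n} \sum_{j \geq 1} T(\psi_j^n) \geq \frac{1}{n} \sum_{j \geq 1} \mu(C_j^n) \geq \frac{1}{n} \sum_{j \geq 1} \left( \mu(O_j^n) - \varepsilon \right). \]

This is true for all $\varepsilon > 0$, and so

\[ T(\phi) \geq \frac{1}{n} \sum_{j \geq 1} \mu(O_j^n) = \int_\Omega g_n(x) \, d\mu \rightarrow \int_\Omega \phi \, d\mu. \]

Therefore $T(\phi) = \int_\Omega \phi(x) \, d\mu$, as required. ∎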

Remark 3.36.

Recall that $f \in L^1_{loc}(\Omega)$ defines a distribution $T_f(\phi) = \int_\Omega f\phi \, dx$. Conversely, the Radon-Nikodym theorem and the Lebesgue decomposition (Theorems B.28 and B.30 respectively) show that the distribution can be represented by a function in $L^1(\Omega)$ if and only if the measure $\mu$ from Theorem 3.35 is absolutely continuous with respect to Lebesgue measure. We have already seen that there exist distributions that cannot be represented by functions in $L^1(\Omega) \subset L^1_{loc}(\Omega)$, for example the delta functional from Example 2.10. For these distributions, the measure constructed in Theorem 3.35 will not be absolutely continuous with respect to Lebesgue measure, i.e. it will have a non-trivial singular component with respect to the Lebesgue decomposition (B.7).
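The two extremes described in this remark can be written out side by side. The following sketch assumes a non-negative locally integrable $f$ in the first case and writes $x_0$ for the base point of the delta functional in the second; both identifications rely on the uniqueness statement in Theorem 3.35.

```latex
% (a) f \in L^1_{loc}(\Omega) with f \geq 0 almost everywhere:
\[
T_f(\phi) = \int_\Omega \phi(x) f(x) \, dx = \int_\Omega \phi \, d\mu,
\qquad \mu(A) = \int_A f \, dx ,
\]
% so \mu is absolutely continuous with respect to Lebesgue measure, with density f.

% (b) The delta functional at a point x_0 \in \Omega:
\[
\delta_{x_0}(\phi) = \phi(x_0) = \int_\Omega \phi \, d\mu,
\qquad \mu(A) = \begin{cases} 1 & x_0 \in A, \\ 0 & x_0 \notin A, \end{cases}
\]
% so \mu(\{x_0\}) = 1 although \{x_0\} has Lebesgue measure zero:
% \mu is singular with respect to Lebesgue measure.
```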

4. Embedding and compactness theorems

The goal of this section is to state the Sobolev Embedding Theorem and the Rellich-Kondrachov compactness theorem. For now, the proof has been postponed until a future version of these notes. An excellent source for the embedding and compactness theorems is [1], which also contains many examples that show the bounds from the theorems are sharp.

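First, we have to define the class of domains under consideration. Given an open subset $\Omega \subseteq \mathbb{R}^n$, let

\[ \Omega_\delta := \left\{ x \in \Omega : \operatorname{dist}(x, \partial\Omega) < \delta \right\}. \]

Definition 4.1.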

Let $\Omega$ be an open subset of $\mathbb{R}^n$. We say that $\Omega$ satisfies the cone condition if there exists a finite cone $C$ such that each $x \in \Omega$ is the vertex of a finite cone $C_x$ contained in $\Omega$ and congruent to $C$.

We say that $\Omega$ satisfies the uniform cone condition if there exists a locally finite open cover $\{U_j\}$ of the boundary of $\Omega$ and a corresponding sequence $\{C_j\}_{j\in\mathbb{N}}$ of finite cones, each congruent to some fixed finite cone $C$, such that

  1. there exists $M < \infty$ such that every $U_j$ has diameter less than $M$,

  2. $\Omega_\delta \subset \bigcup_{j=1}^\infty U_j$ for some $\delta > 0$,

  3. $Q_j \equiv \bigcup_{x \in \Omega \cap U_j} (x + C_j) \subset \Omega$ for every $j$, and

  4. for some $R > 1$, every collection of $R$ of the sets $Q_j$ has empty intersection.

Since $\Omega$ is open, continuously differentiable on $\Omega$ does not imply bounded. For $j \in \mathbb{Z}_{\geq 0}$, define $C_B^j(\Omega)$ to be the space of functions in $C^j(\Omega)$ that are bounded and have bounded partial derivatives up to $j$th order.

This is a Banach space with norm

\[ \|f\|_{C_B^j(\Omega)} = \max_{0 \leq |\alpha| \leq j} \sup_{x\in\Omega} |D^\alpha f(x)|. \]

Recall that a linear map $T : A \rightarrow B$ of normed linear spaces is an embedding if $T$ is bounded with respect to the norms on $A$ and $B$. Since the elements of $W^{k,p}(\Omega)$ are equivalence classes of functions defined almost everywhere, the meaning of an inclusion map from $W^{k,p}(\Omega)$ into $C_B^j(\Omega)$ is that each equivalence class in $W^{k,p}(\Omega)$ contains a function in $C_B^j(\Omega)$.

The meaning of an inclusion map from $W^{k,p}(\Omega)$ into $W^{j,q}(\Omega^k)$ (where $\Omega^k$ is the intersection of $\Omega$ with a plane of dimension $k$ in $\mathbb{R}^n$) is that each function in $W^{k,p}(\Omega)$ is the limit of a sequence of $C^\infty$ functions (see Section 3.2) and the restriction of these smooth functions to $\Omega^k$ converges to a limit in $W^{j,q}(\Omega^k)$. For the map to be well-defined, this limit needs to be independent of the original choice of sequence; however, this is guaranteed if the norm on $W^{j,q}(\Omega^k)$ is bounded by a constant times the norm on $W^{k,p}(\Omega)$ (which always occurs in the cases considered below).

Theorem 4.2 (Sobolev Embedding Theorem).

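Let $\Omega \subseteq \mathbb{R}^n$ be an open set satisfying the cone condition, and, for $1 \leq k \leq n$, let $\Omega^k$ be the intersection of $\Omega$ with a plane of dimension $k$ in $\mathbb{R}^n$. Let $j \geq 0$ and $m \geq 1$ be integers, and let $1 \leq p < \infty$. Then

  1. If either $m - \frac{n}{p} > 0$, or $m = n$ and $p = 1$, then

    \[ W^{j+m,p}(\Omega) \hookrightarrow C_B^j(\Omega) \]

    and

    \[ W^{j+m,p}(\Omega) \hookrightarrow W^{j,q}(\Omega^k), \qquad W^{m,p}(\Omega) \hookrightarrow L^q(\Omega) \quad \text{for } p \leq q < \infty. \]

  2. If $1 \leq k \leq n$ and $m - \frac{n}{p} = 0$, then

    \[ W^{j+m,p}(\Omega) \hookrightarrow W^{j,q}(\Omega^k) \quad \text{for } p \leq q < \infty. \]

  3. If $m - \frac{n}{p} < 0$ and either $m - \frac{n}{p} > -\frac{k}{p}$, or $p = 1$ and $m - \frac{n}{p} \geq -\frac{k}{p}$, then

    \[ W^{j+m,p}(\Omega) \hookrightarrow W^{j,q}(\Omega^k) \quad \text{when } p \leq q \text{ and } m - \frac{n}{p} \geq -\frac{k}{q}. \]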

Note that in each case, it is the quantity $m - \frac{n}{p}$ that determines the allowed embeddings. Increasing this quantity by either (a) giving up more derivatives, or (b) increasing the power $p$, allows for “better” embeddings in the following sense: when $m - \frac{n}{p} > 0$ then we get an embedding into the space of continuously differentiable functions (the first case above), and when $k_1 - \frac{n}{p_1} > k_2 - \frac{n}{p_2}$ then we get an embedding $W^{k_1,p_1}(\Omega) \hookrightarrow W^{k_2,p_2}(\Omega)$ (the third case above). The same philosophy applies to the compactness theorem below, as well as the Sobolev multiplication theorem (which has been postponed until a future version of the notes).
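To see how the quantity $m - \frac{n}{p}$ selects a case of Theorem 4.2, here is a short worked instance. The values $n = 3$ and $p = 2$ are chosen purely for illustration, and $\Omega$ is assumed to satisfy the cone condition as in the theorem.

```latex
% Illustration with n = 3, p = 2 (values chosen for this example only).
%
% m = 2:  m - n/p = 2 - 3/2 = 1/2 > 0, so case 1 applies:
\[
W^{j+2,2}(\Omega) \hookrightarrow C_B^j(\Omega).
\]
% m = 1, k = n = 3 (so \Omega^k = \Omega):  m - n/p = -1/2 < 0 and -1/2 > -k/p = -3/2,
% so case 3 applies. The condition m - n/p \geq -k/q reads -1/2 \geq -3/q, i.e. q \leq 6:
\[
W^{j+1,2}(\Omega) \hookrightarrow W^{j,q}(\Omega) \quad \text{for } 2 \leq q \leq 6.
\]
```

With $j = 0$ the second line is the familiar embedding of $W^{1,2}(\Omega)$ into $L^6(\Omega)$ in three dimensions.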

Theorem 4.3 (Rellich-Kondrachov compactness theorem).

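Let $\Omega \subseteq \mathbb{R}^n$ be an open set satisfying the cone condition, let $\Omega_0$ be a bounded open subset of $\Omega$, and let $\Omega_0^k$ be the intersection of $\Omega_0$ with a $k$-dimensional plane in $\mathbb{R}^n$. Let $j \geq 0$ and $m \geq 1$ be integers, and let $1 \leq p < \infty$. Then

  1. If $m - \frac{n}{p} > 0$ then the following embeddings are compact

    \begin{align*}
    & W^{j+m,p}(\Omega) \hookrightarrow C_B^j(\Omega_0), \\
    \text{and} \quad & W^{j+m,p}(\Omega) \hookrightarrow W^{j,q}(\Omega_0) \quad \text{if } 1 \leq q < \infty.
    \end{align*}

  2. If $m - \frac{n}{p} \leq 0$, then the following embeddings are compact

    \begin{align*}
    & W^{j+m,p}(\Omega) \hookrightarrow W^{j,q}(\Omega_0^k) \quad \text{if } 0 > m - \tfrac{n}{p} > -\tfrac{k}{p},\ q \geq 1, \text{ and } m - \tfrac{n}{p} > -\tfrac{k}{q}, \\
    & W^{j+m,p}(\Omega) \hookrightarrow W^{j,q}(\Omega_0^k) \quad \text{if } m - \tfrac{n}{p} = 0 \text{ and } 1 \leq q < \infty.
    \end{align*}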
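As with the embedding theorem, a short worked instance shows how the conditions are used. The values $n = 3$, $p = 2$, $m = 1$, $k = 3$ below are chosen for illustration only, with $\Omega$ and $\Omega_0$ as in the statement of the theorem.

```latex
% Illustration with n = 3, p = 2, m = 1, k = 3 (so \Omega_0^k = \Omega_0).
% Here m - n/p = -1/2, and 0 > -1/2 > -k/p = -3/2, so the first line of case 2 applies.
% The condition m - n/p > -k/q reads -1/2 > -3/q, i.e. q < 6:
\[
W^{j+1,2}(\Omega) \hookrightarrow W^{j,q}(\Omega_0) \text{ is compact for } 1 \leq q < 6.
\]
% Compare with the embedding theorem, where the endpoint q = 6 is allowed:
% the embedding at the endpoint exists but is not covered by the compactness statement.
```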

Appendix A. Notation and basic definitions

A.1. $L^p$ spaces and $L^p_{loc}$ spaces

The spaces $L^p(\Omega)$ and $L^p_{loc}(\Omega)$ form the basis for the definition of the Sobolev spaces $W^{k,p}(\Omega)$ and $W^{k,p}_{loc}(\Omega)$ in Definition 2.16, and so we review some of their basic properties here.

Definition A.1.

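Let $\Omega \subseteq \mathbb{R}^n$ be an open set, let $0 < p < \infty$, and let $\mathcal{M}(\Omega)$ denote the space of Lebesgue measurable functions on $\Omega$. Define

\[ L^p(\Omega) = \left\{ f \in \mathcal{M}(\Omega) : \int_\Omega |f|^p \, dx < \infty \right\} / \sim \]

where $f \sim g$ if $f = g$ almost everywhere. When $p = \infty$, define

\[ L^\infty(\Omega) = \left\{ f \in \mathcal{M}(\Omega) : \operatorname{ess\,sup}_\Omega |f| < \infty \right\} / \sim, \]

where

\[ \operatorname{ess\,sup}_\Omega |f| = \inf \left\{ \alpha \in \mathbb{R} : \left| \{ x \in \Omega : |f(x)| > \alpha \} \right| = 0 \right\} \]

is the essential supremum of $|f|$ on $\Omega$ (here $|\cdot|$ applied to a set denotes its Lebesgue measure).

It is well-known that when $1 \leq p \leq \infty$ the spaces $L^p(\Omega)$ are Banach spaces, with the norm $\|f\|_{L^p(\Omega)} = \left( \int_\Omega |f|^p \right)^{\frac{1}{p}}$ for $p < \infty$ and the essential supremum norm for $p = \infty$ (see for example [12, Theorem 3.11] or [15, Theorem 8.14]).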

Definition A.2.

Let $1 < p < \infty$. The conjugate exponent of $p$ is the real number $1 < q < \infty$ such that

\[ \frac{1}{p} + \frac{1}{q} = 1. \]

If $p = 1$ then the conjugate exponent of $p$ is $q = \infty$, and if $p = \infty$ then the conjugate exponent of $p$ is $q = 1$.

One of the most important inequalities for $L^p$ spaces is Hölder's inequality. For a proof, see for example [12, Theorem 3.5].

Theorem A.3 (Hölder’s inequality).

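Let $\Omega \subseteq \mathbb{R}^n$ be open, and let $f \in L^p(\Omega)$, $g \in L^q(\Omega)$, where $p$ and $q$ are conjugate exponents. Then

\[ \|fg\|_{L^1(\Omega)} \leq \|f\|_{L^p(\Omega)} \|g\|_{L^q(\Omega)}. \]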
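A quick numerical sanity check of the inequality, with choices made here purely for illustration: $\Omega = (0,1) \subset \mathbb{R}$, $f(x) = x$, $g(x) = 1$, and $p = q = 2$ (the Cauchy-Schwarz case).

```latex
% Illustrative choices: \Omega = (0,1), f(x) = x, g(x) = 1, p = q = 2.
\begin{align*}
\|fg\|_{L^1(\Omega)} &= \int_0^1 x \, dx = \tfrac{1}{2}, \\
\|f\|_{L^2(\Omega)}  &= \left( \int_0^1 x^2 \, dx \right)^{1/2} = \tfrac{1}{\sqrt{3}} \approx 0.577, \\
\|g\|_{L^2(\Omega)}  &= \left( \int_0^1 1 \, dx \right)^{1/2} = 1,
\end{align*}
% so \|fg\|_{L^1} = 0.5 \leq 0.577\ldots = \|f\|_{L^2} \|g\|_{L^2}, as the theorem requires.
```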

The following theorem characterises the dual space of $L^p(\Omega)$. It is also well-known; for a proof see for example [2, Chapter IV], [8, Theorem 2.14], or [15, Theorem 10.44].

Theorem A.4.

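Let $\Omega \subseteq \mathbb{R}^n$ be open, let $1 \leq p < \infty$, and let $q$ be the conjugate exponent of $p$. Then

\[ L^p(\Omega)^* \cong L^q(\Omega). \]

Remark A.5.
  1. The isomorphism $L^q(\Omega) \rightarrow L^p(\Omega)^*$ has an explicit form

    \[ L^q(\Omega) \ni f \mapsto \left( T_f : g \mapsto \int_\Omega f(x) g(x) \, dx \right) \in L^p(\Omega)^* \]

  2. It is not true that $L^\infty(\Omega)^* \cong L^1(\Omega)$, since, for any $x \in \Omega$, the Hahn-Banach theorem shows that the delta functional $\delta_x(f) = f(x)$ defined on $\mathcal{C}(\Omega)$ extends to a bounded linear functional (call it $\tilde{\delta}_x$) on $L^\infty(\Omega)$. A similar argument to Example 2.10 shows that $\tilde{\delta}_x$ cannot be represented by a function in $L^1(\Omega)$, i.e. there is no $g \in L^1(\Omega)$ such that

    \[ \tilde{\delta}_x(f) = \int_\Omega f g \, dx \]

    for all $f \in L^\infty(\Omega)$.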

More generally, this theorem is true for any σ-finite measure space (see [15, pp182-185]). Since the proof uses the Radon-Nikodym theorem then the result may not be true if the measure is not σ-finite (see Appendix B for the relevant definitions and statements of the theorems). The following example illustrates this for a simple case.

Example A.6.

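Let $\Sigma = \{\emptyset, X\}$ be the trivial $\sigma$-algebra on a space $X$, and let $\mu$ be the measure $\mu(X) = \infty$, $\mu(\emptyset) = 0$. Then the measurable functions $f : X \rightarrow \mathbb{R}$ are constants, and so we see that $L^1(X, d\mu) \cong \{0\}$ consists of only the zero function. Therefore the dual is $L^1(X, d\mu)^* \cong \{0\}$; however, $L^\infty(X, d\mu) \cong \mathbb{R}$, and so the dual of $L^1(X, d\mu)$ is not isomorphic to $L^\infty(X, d\mu)$ in this case.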

Next, we define the space $L^p_{loc}(\Omega)$, which consists of locally integrable functions, in the sense that their integral is finite on compact sets.

Definition A.7.

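Let $\Omega$ be an open set in $\mathbb{R}^n$, and let $\mathcal{M}(\Omega)$ be the set of Lebesgue measurable functions on $\Omega$. Then

\[ L^p_{loc}(\Omega) = \left\{ f \in \mathcal{M}(\Omega) : f \in L^p(K) \text{ for all compact } K \subset \Omega \right\}. \]

Remark A.8.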

One can easily extend this definition to arbitrary measure spaces that also have a topology (and hence a notion of compactness).

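Clearly we have an inclusion $L^p(\Omega) \hookrightarrow L^p_{loc}(\Omega)$. The following examples show that this is not surjective.

Example A.9.
  1. $\Omega = \mathbb{R}^n$. Let $f \equiv 1$, and note that

    \[ \int_K |f(x)|^p \, dx = m(K) < \infty, \]

    where $m(K)$ denotes the Lebesgue measure of $K$. Therefore $f \in L^p_{loc}(\Omega)$ for any $p > 0$, even though $f \notin L^p(\Omega)$.

  2. $\Omega = (0, \varepsilon)$. Let $f(x) = \frac{1}{x}$, and note that $f$ is bounded on any compact subset $K \subset (0, \varepsilon)$. Therefore $\int_K |f(x)|^p \, dx < \infty$, and so $f \in L^p_{loc}(\Omega)$, but $f \notin L^p(\Omega)$ for any $p \geq 1$. (Note that we can extend this to any $p > 0$ by choosing $f(x) = \frac{1}{x^n}$ for some $n$, or even a function that grows faster at the origin, such as $f(x) = \exp\left( \frac{1}{x^2} \right)$.)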
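The claims in the second example can be verified by a direct computation. The following sketch integrates $x^{-p}$ over $(\delta, \varepsilon)$ and lets $\delta \to 0^+$; the auxiliary parameter $\delta$ is introduced here only for the calculation.

```latex
% f(x) = 1/x on \Omega = (0, \varepsilon); integrate |f|^p over (\delta, \varepsilon), 0 < \delta < \varepsilon.
\[
\int_\delta^\varepsilon x^{-p} \, dx =
\begin{cases}
\log(\varepsilon / \delta) & p = 1, \\[4pt]
\dfrac{\delta^{1-p} - \varepsilon^{1-p}}{p - 1} & p > 1,
\end{cases}
\]
% and both expressions tend to +\infty as \delta \to 0^+, so f \notin L^p(\Omega) for p \geq 1.
%
% For 0 < p < 1 the integral converges:
\[
\int_0^\varepsilon x^{-p} \, dx = \frac{\varepsilon^{1-p}}{1 - p} < \infty,
\]
% which is why a faster-growing function is needed for small p:
% f(x) = x^{-n} fails to be in L^p(\Omega) exactly when np \geq 1.
```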

The spaces $L^p$ and $L^p_{loc}$ for $0 < p < 1$ have radically different properties to those described above for other values of $p$. These properties are discussed further in [13, pp35-36].

A.2. Integration by parts

Since we use integration by parts on open subsets of $\mathbb{R}^n$ in Section 2.2, we recall the formula here. See [4, Appendix C.1] for a more complete description.

Given an open set $\Omega \subseteq \mathbb{R}^n$ with a $C^1$ boundary, define

\[ \nu = (\nu_1, \ldots, \nu_n) \]

to be the outward pointing normal at each point of the boundary $\partial\Omega$, and let $dS$ denote the volume element on the boundary.

Theorem A.10 (Integration by parts).

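Let $\Omega$ be a bounded open subset of $\mathbb{R}^n$ with a $C^1$ boundary, and let $u, v \in C^1(\overline{\Omega})$. Then for all $i = 1, \ldots, n$ we have

\[ \int_\Omega (\partial_{x_i} u) \, v \, dx = - \int_\Omega u \, (\partial_{x_i} v) \, dx + \int_{\partial\Omega} u v \nu_i \, dS. \]

If $u$ has compact support in $\Omega$, then the boundary term disappears, and we have

\[ \int_\Omega (\partial_{x_i} u) \, v \, dx = - \int_\Omega u \, (\partial_{x_i} v) \, dx. \]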
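In one dimension the formula reduces to the familiar statement from calculus. As a sketch, take $\Omega = (a, b)$ (an interval chosen here for illustration), so that the boundary consists of the two points $a$ and $b$, with outward normals $\nu(a) = -1$ and $\nu(b) = +1$, and the boundary integral becomes a sum over these two points.

```latex
% One-dimensional case: \Omega = (a,b), \partial\Omega = \{a, b\}, \nu(a) = -1, \nu(b) = +1.
\[
\int_a^b u'(x) v(x) \, dx
  = - \int_a^b u(x) v'(x) \, dx + u(b) v(b) - u(a) v(a).
\]
% If u has compact support in (a,b), then u(a) = u(b) = 0 and the boundary terms vanish,
% which is exactly the identity used to define the weak derivative.
```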

A.3. The smooth Urysohn lemma and partitions of unity

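The goal of this section is to give some consequences of the smooth Urysohn lemma and the existence of partitions of unity on open subsets of $\mathbb{R}^n$.

Theorem A.11.

Let $\Omega \subseteq \mathbb{R}^n$ be an open set, let $K \subset \Omega$ be compact, and let $U$ be an open set such that $K \subset U \subseteq \Omega$. Then there exists a smooth function $f \in \mathcal{C}^\infty(\Omega)$ such that $0 \leq f(x) \leq 1$ for all $x \in \Omega$, $f \equiv 1$ on $K$, and $f \equiv 0$ on $\Omega \setminus U$. Moreover, there also exists $f \in \mathcal{C}^\infty_{cpt}(\Omega)$ such that $0 \leq f(x) \leq 1$ for all $x \in \Omega$, and $f \equiv 1$ on $K$.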

See [3, Theorem 2.6.1] for a proof.

As a consequence of Urysohn’s lemma, we have the following useful results.

Lemma A.12.

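Let $U_1, U_2$ be open sets in $\mathbb{R}^n$, and let $K \subset U_1 \cup U_2$ be compact. Then there exist non-negative functions $\phi_1, \phi_2$ which are smooth on $U_1 \cup U_2$ and satisfy

  1. $\phi_1(x) + \phi_2(x) = 1$ for all $x \in K$,

  2. $\phi_1 \in \mathcal{C}^\infty_{cpt}(U_1)$ and $\phi_2 \in \mathcal{C}^\infty_{cpt}(U_2)$.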

Proof.

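Urysohn's lemma shows that there exists $\phi \in \mathcal{C}^\infty_{cpt}(U_1 \cup U_2)$ such that $0 \leq \phi \leq 1$ and $\phi \equiv 1$ on $K$. Apply Urysohn's lemma again to find $\tilde{\phi}_1 \in \mathcal{C}^\infty_{cpt}(U_1)$ such that $0 \leq \tilde{\phi}_1 \leq 1$ and $\tilde{\phi}_1 \equiv 1$ on a neighbourhood of the compact set $\operatorname{supp} \phi \cap U_2^c$. Let $\phi_1 = \phi \tilde{\phi}_1$, and note that

  1. $\phi_1 \in \mathcal{C}^\infty_{cpt}(U_1)$,

  2. $0 \leq \phi_1 \leq \phi$,

  3. $\phi_1 \equiv 1$ on $K \cap U_2^c$,

  4. $\operatorname{supp} \phi_1 \subset \operatorname{supp} \phi$ and $\phi_1 = \phi$ on $U_2^c$.

Define $\phi_2 = \phi - \phi_1$, and note that

  1. $0 \leq \phi_2 \leq 1$,

  2. $\phi_2 \in \mathcal{C}^\infty_{cpt}(U_2)$, and

  3. $\phi_1 + \phi_2 \equiv 1$ on $K$.

Therefore $\phi_1$ and $\phi_2$ satisfy the stated conditions. ∎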

Any smooth function can be written as the difference of two non-negative continuous functions, just by taking the positive and negative parts of the original function. The next lemma shows that a smooth function can also be written as the difference of two non-negative smooth functions.

Lemma A.13.

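Let $\phi \in \mathcal{C}^\infty_{cpt}(\Omega)$. Then there exist functions $\phi_+, \phi_- \in \mathcal{C}^\infty_{cpt}(\Omega)$ such that $\phi_+(x) \geq 0$ and $\phi_-(x) \geq 0$ for all $x \in \Omega$ and $\phi(x) = \phi_+(x) - \phi_-(x)$ for all $x \in \Omega$.

Proof.

Using Urysohn's lemma, construct a non-negative smooth function $\psi$ such that $\psi(x) \geq \sup_{x\in\Omega} |\phi(x)|$ on $\operatorname{supp} \phi$, and $\operatorname{supp} \psi \subset \Omega$ is compact. Then both $\psi$ and $\psi - \phi$ are non-negative smooth functions, and so we can define $\phi_+ = \psi$, $\phi_- = \psi - \phi$. ∎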

A.4. A corollary of Rademacher’s theorem

Rademacher's theorem states that a Lipschitz function $f : \Omega \subseteq \mathbb{R}^n \rightarrow \mathbb{R}^m$ is differentiable almost everywhere (with respect to Lebesgue measure). This is used in Example 2.21 as part of the proof that a Lipschitz continuous function is in $W^{1,\infty}_{loc}(\Omega)$. The actual statement used in Example 2.21 is that the partial derivatives of $f$ exist almost everywhere (a slightly weaker statement than Rademacher's theorem). The purpose of this section is to recall the basic definitions and state the theorem. A proof of Rademacher's theorem can be found in [5].

First, recall the following definition.

Definition A.14.

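Let $\Omega \subseteq \mathbb{R}^n$ be an open set. A function $f : \Omega \rightarrow \mathbb{R}^m$ is locally Lipschitz continuous on $\Omega$ if for every $x \in \Omega$ there exists a constant $C_x$ and a neighbourhood $U$ of $x$ such that the following inequality is satisfied for all $y \in U$

\[ |f(x) - f(y)| \leq C_x |x - y|. \tag{A.1} \]

If there exists a uniform constant $C$ such that $|f(x) - f(y)| \leq C|x - y|$ for all $x, y \in \Omega$ then we say that $f$ is uniformly Lipschitz on $\Omega$. The smallest value of the constant $C$ is called the Lipschitz constant

\[ \operatorname{Lip}(f) = \sup_{x, y \in \Omega,\ x \neq y} \frac{|f(x) - f(y)|}{|x - y|}. \tag{A.2} \]

Theorem A.15.

Let $\Omega \subseteq \mathbb{R}^n$ be open, and let $f : \Omega \rightarrow \mathbb{R}$ be locally Lipschitz on $\Omega$. Then $f$ is differentiable almost everywhere in $\Omega$ (with respect to the Lebesgue measure on $\mathbb{R}^n$).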

For a proof, see [5, Section 3.1.2].

Remark A.16.

Uniformly Lipschitz implies absolutely continuous, and so we know that the theorem is true for functions $f : \Omega \to \mathbb{R}$ with one-dimensional domains by the general theory of absolutely continuous functions (see for example [15, Theorem 7.29]). This fact is used in the proof for $n \geq 2$; however, some further analysis is also necessary (see [5, Section 3.1.2] for the details).

As a consequence of this, we have the following.

Corollary A.17.

Let $\Omega$ be an open subset of $\mathbb{R}^n$, and let $f : \Omega \to \mathbb{R}$ be a locally Lipschitz function. Then the partial derivatives of $f$ exist almost everywhere on $\Omega$ (with respect to the Lebesgue measure on $\mathbb{R}^n$).
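As a simple illustration (not taken from [5]), consider the absolute value function $f(x) = |x|$ on $\Omega = \mathbb{R}$. The following worked computation is a sketch showing that $f$ is uniformly Lipschitz with Lipschitz constant 1, and that the set where the derivative fails to exist has Lebesgue measure zero, as the theorem and its corollary predict.

```latex
% The reverse triangle inequality gives the Lipschitz bound,
% and the choice x = 1, y = 0 shows that the constant 1 is attained.
\[
  \bigl|\, |x| - |y| \,\bigr| \leq |x - y|
  \quad \text{for all } x, y \in \mathbb{R},
  \qquad \text{so} \qquad \operatorname{Lip}(f) = 1 .
\]
% The derivative exists everywhere except at the origin,
% and a single point has Lebesgue measure zero.
\[
  f'(x) =
  \begin{cases}
    1  & \text{if } x > 0, \\
    -1 & \text{if } x < 0,
  \end{cases}
  \qquad f'(0) \text{ does not exist},
  \qquad m(\{0\}) = 0 .
\]
```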

Appendix B. Basic results from measure theory

Section 3.4 deals with measures, so for completeness we review the basic definitions here. In particular, this includes the definition of a complex measure. These notes assume knowledge of Lebesgue integration and the basic theorems associated to it (monotone convergence, dominated convergence, etc.), so this material is not included; the purpose is just to recall the important definitions that are used elsewhere in the notes. Of course, there are already good sources for this material such as [12], [15], [7], [10] (and many more!), so only the material relevant to the rest of the notes is covered in this section. Examples are included wherever possible in order to clarify the theory.

Since we want to deal with sets of infinite measure ($\mathbb{R}^n$ is the standard example), we first have to define arithmetic in $[-\infty, +\infty]$. This is an extension of the usual operations of addition and multiplication on $\mathbb{R}$, together with the following definitions.

$$a \cdot \infty = \infty \cdot a = \begin{cases} \infty & \text{if } a > 0, \\ 0 & \text{if } a = 0, \end{cases}$$

with the sign reversed when $a < 0$.

Care must be taken when cancelling terms from an equation: $a + b = a + c$ implies $b = c$ only if $a \in \mathbb{R}$, and $ab = ac$ implies $b = c$ only if $a \in \mathbb{R} \setminus \{0\}$. One consequence of the convention $0 \cdot \infty = 0$ is that the integral of any function over a set of measure zero will be zero, and the integral of the zero function over any set will also be zero.
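The convention that zero times infinity is zero differs from IEEE floating point arithmetic, where the same product is undefined (NaN). The following Python sketch contrasts the two. The helper name `ext_mul` is hypothetical and is introduced here only for illustration.

```python
import math


def ext_mul(a: float, b: float) -> float:
    """Multiply in [-inf, +inf] using the measure-theoretic convention 0 * inf = 0.

    Hypothetical helper for illustration; not part of any library.
    """
    if a == 0 or b == 0:
        # The convention used in integration theory: zero wins over infinity.
        return 0.0
    # For non-zero factors, floating point already gives the extended product.
    return a * b


if __name__ == "__main__":
    print(ext_mul(0.0, math.inf))   # product under the convention
    print(ext_mul(2.0, math.inf))   # positive factor times infinity
    print(0.0 * math.inf)           # raw IEEE product, for comparison
```

With this convention a sum such as "value of a simple function times the measure of the set where it is attained" stays well defined even when the function vanishes on a set of infinite measure.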

Definition B.1.

A collection Σ of subsets of a set X is a σ-algebra on X if all of the following hold.

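  1. $X \in \Sigma$.

  2. If $A \in \Sigma$, then $X \setminus A \in \Sigma$.

  3. If $A_n \in \Sigma$ for all $n \in \mathbb{N}$, then $\bigcup_{n \in \mathbb{N}} A_n \in \Sigma$.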

Example B.2.

  1. The set of all subsets of $X$ forms a σ-algebra.

  2. $\Sigma = \{\emptyset, X\}$ is a σ-algebra, called the trivial σ-algebra on $X$.

  3. The set of all subsets of $\mathbb{R}^n$ that are open is not a σ-algebra, since the complement of an open set is not necessarily open.

  4. The set of all subsets of $\mathbb{R}^n$ that are either open or closed is not a σ-algebra, since it is not closed under the operation of countable unions. For example, in the case $n = 1$

    $$[a, b) = \bigcup_{k \in \mathbb{N}} \left[ a, b - \tfrac{1}{k} \right]$$

    is neither open nor closed.

  5. If $\Sigma$ is a σ-algebra on $X$ and $U \subseteq X$, then the collection

    $$\Sigma_U = \{ A \cap U : A \in \Sigma \}$$

    is a σ-algebra on $U$.

Since open and closed subsets of $\mathbb{R}^n$ are of fundamental importance, it would be useful to have a σ-algebra that contains all of these sets. The σ-algebra of all subsets of $\mathbb{R}^n$ is too large for interesting measures to exist (see [10, Section 5] for more insight into why this is the case), so it would also be useful for this σ-algebra to have some minimality property, i.e. it is the “smallest” σ-algebra that contains all of the open and closed subsets of $\mathbb{R}^n$. The next theorem shows that such a σ-algebra exists.

Theorem B.3.

Let $\mathcal{F}$ be a collection of subsets of a set $X$. Then there exists a unique σ-algebra on $X$, call it $\Sigma$, such that

  1. $\mathcal{F} \subseteq \Sigma$, and

  2. any other σ-algebra $\Sigma'$ containing $\mathcal{F}$ satisfies $\Sigma \subseteq \Sigma'$ (i.e. $\Sigma$ is the smallest σ-algebra containing $\mathcal{F}$).

This σ-algebra is called the σ-algebra generated by $\mathcal{F}$.

Proof of Theorem B.3.

Consider the family of all σ-algebras on $X$ that contain $\mathcal{F}$. Since the set of all subsets of $X$ is a σ-algebra containing $\mathcal{F}$, this family is non-empty. We claim that the intersection $\Sigma$ of all σ-algebras containing $\mathcal{F}$ is also a σ-algebra, and the result will then follow, since such a σ-algebra clearly satisfies both of the conditions of the theorem.

Firstly note that the set $X$ is in every σ-algebra containing $\mathcal{F}$, and so $X \in \Sigma$ also. If a subset $A \subseteq X$ is in $\Sigma$, then it is in every σ-algebra containing $\mathcal{F}$, and so $X \setminus A$ is in every σ-algebra containing $\mathcal{F}$, therefore $X \setminus A \in \Sigma$ also. Therefore it only remains to check that $\Sigma$ is closed under countable unions. To see this, let $\{A_n\}_{n \in \mathbb{N}}$ be a countable collection of sets in $\Sigma$. Then $\{A_n\}_{n \in \mathbb{N}} \subseteq \Sigma'$ for every σ-algebra $\Sigma'$ containing $\mathcal{F}$, and so $A := \bigcup_{n \in \mathbb{N}} A_n \in \Sigma'$ also. Therefore $A \in \Sigma$, and we have shown that $\Sigma$ is a σ-algebra. ∎
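On a finite set the generated σ-algebra can be computed explicitly. The following Python sketch is only an illustration under that finiteness assumption, and it builds the σ-algebra from below (repeatedly adding complements and unions) rather than by intersecting all σ-algebras as in the proof above. The function name `generate_sigma_algebra` is hypothetical.

```python
from itertools import combinations


def generate_sigma_algebra(X, generators):
    """Smallest sigma-algebra on the finite set X containing every set in `generators`.

    Hypothetical helper for illustration. On a finite set, countable unions
    reduce to finite unions, so closing under complements and pairwise unions
    is enough.
    """
    X = frozenset(X)
    sigma = {frozenset(), X} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(sigma)  # snapshot of the sets found so far
        # Close under complements.
        for A in current:
            if X - A not in sigma:
                sigma.add(X - A)
                changed = True
        # Close under pairwise unions.
        for A, B in combinations(current, 2):
            if A | B not in sigma:
                sigma.add(A | B)
                changed = True
    return sigma


if __name__ == "__main__":
    # The sigma-algebra on {1, 2, 3} generated by the single set {1}.
    for member in sorted(generate_sigma_algebra({1, 2, 3}, [{1}]), key=len):
        print(set(member))
```

The loop stops once a full pass adds nothing new, at which point the collection is closed under both operations and is contained in every σ-algebra containing the generators.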

Definition B.4.

The Borel σ-algebra $\mathcal{B}$ on $\mathbb{R}^n$ is the smallest σ-algebra that contains the collection of open and closed subsets of $\mathbb{R}^n$. The sets in $\mathcal{B}$ are called the Borel subsets of $\mathbb{R}^n$.

Definition B.5.

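Let $\Sigma$ be a σ-algebra on a set $X$. A positive measure on $\Sigma$ is a function $\mu : \Sigma \to [0, \infty]$ such that

$$\mu\Bigl(\bigcup_{n \in \mathbb{N}} A_n\Bigr) = \sum_{n \in \mathbb{N}} \mu(A_n) \tag{B.1}$$

for any disjoint collection $\{A_n\}_{n \in \mathbb{N}} \subseteq \Sigma$. A function $\mu$ satisfying (B.1), but without the restriction that the range is $[0, \infty]$, is called a countably additive set function.

A complex measure on $\Sigma$ is a countably additive function $\mu : \Sigma \to \mathbb{C}$ (see [12, Chapter 6] for more about complex measures).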

Definition B.6.

A measure space $(X, \Sigma, \mu)$ consists of a set $X$, a σ-algebra $\Sigma$ of subsets of $X$, and a measure $\mu$ on $\Sigma$.

A measure space $(X, \Sigma, \mu)$ is finite if $\mu(X)$ is finite. A measure space $(X, \Sigma, \mu)$ is σ-finite if $X$ is the countable union of sets $X_n \in \Sigma$ with $\mu(X_n)$ finite for each $n$.

Remark B.7.

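A σ-finite measure is the countable sum of finite measures. To see this, let $(X, \Sigma, \mu)$ be σ-finite, with $X = \bigcup_n X_n$ and $\mu(X_n)$ finite for each $n$. Replacing $X_n$ by $X_n \setminus \bigcup_{k < n} X_k$ if necessary, we may assume that the sets $X_n$ are pairwise disjoint. Define measures

$$\mu_n(E) = \mu(E \cap X_n),$$

and note that $\mu(E) = \sum_n \mu_n(E)$ by countable additivity.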

An important class of measures on n are those defined on the Borel σ-algebra.

Definition B.8.

A Borel measure on n is a measure defined on the Borel σ-algebra.

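A Borel measure $\mu$ is inner regular (resp. outer regular) if for every Borel set $E \subseteq \mathbb{R}^n$ we have

$$\mu(E) = \sup\{\mu(K) : K \text{ compact and } K \subseteq E\} \quad (\text{resp. } \mu(E) = \inf\{\mu(U) : U \text{ open and } E \subseteq U\}).$$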

If a Borel measure μ is both inner and outer regular, then we say that μ is Borel regular.

Remark B.9.

  1. Again, this notion can be extended to measures on a locally compact Hausdorff space (see [12]).

  2. The definition of a Borel regular measure given above is equivalent to the requirement that every measurable set $E$ has the same measure as some Borel sets $B_1 \supseteq E$ and $B_2 \subseteq E$. To see this, let $B_1$ be the intersection of open sets $\{U_n\}_{n \in \mathbb{N}}$ such that $U_n \supseteq E$ and $\mu(U_n \setminus E) < \frac{1}{n}$, and let $B_2$ be the union of compact sets $\{K_n\}_{n \in \mathbb{N}}$ such that $K_n \subseteq E$ and $\mu(E \setminus K_n) < \frac{1}{n}$.

A useful way to construct a measure with certain desired properties is to start with an outer measure. For example, Lebesgue measure and Hausdorff measure can both be constructed using outer measures (see also [12] for a construction of Lebesgue measure that doesn’t use outer measure), and the construction in Section 3.4 of a measure associated to a distribution also uses outer measure.

Definition B.10.

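A function $\mu^* : P(X) \to [0, \infty]$ defined on the power set $P(X)$ of a space $X$ is called an outer measure on $X$ if it satisfies all of the following.

  1. $\mu^*(A) \geq 0$, $\mu^*(\emptyset) = 0$.

  2. $\mu^*(A_1) \leq \mu^*(A_2)$ if $A_1 \subseteq A_2$.

  3. $\mu^*\bigl(\bigcup_n A_n\bigr) \leq \sum_n \mu^*(A_n)$ for any countable collection of sets $\{A_n\}_{n \in \mathbb{N}} \subseteq P(X)$.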

The point of studying outer measures is that it is easy to construct an outer measure with certain properties.

To get a feel for outer measure we recall here two main examples: Lebesgue outer measure and Hausdorff outer measure.

Example B.11 (Lebesgue outer measure).

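The Lebesgue outer measure is defined on compact rectangular subsets

$$R = [a_1, b_1] \times \cdots \times [a_n, b_n] \subseteq \mathbb{R}^n$$

by

$$m^*(R) := \prod_{j=1}^n (b_j - a_j).$$

For an arbitrary subset $E \subseteq \mathbb{R}^n$, consider the collection $K$ of all countable covers $\{R_k\}_{k \in \mathbb{N}}$ of $E$ by rectangular sets $R_k$, and define

$$m^*(E) := \inf_K \sum_k m^*(R_k). \tag{B.2}$$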

It is easy to check that this definition satisfies the conditions of an outer measure (see for example [15, Theorems 3.3 & 3.4]).
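As a worked illustration of the definition (a standard computation, sketched here rather than quoted from [15]), any countable set has Lebesgue outer measure zero. The points $x_k$ and the cubes $Q_k$ below are notation introduced only for this example.

```latex
% Let E = \{x_1, x_2, \dots\} \subseteq \mathbb{R}^n be countable and fix \varepsilon > 0.
% Cover each point x_k by a closed cube Q_k centred at x_k with side length
% (\varepsilon 2^{-k})^{1/n}, so that m^*(Q_k) = \varepsilon 2^{-k}.
\[
  E \subseteq \bigcup_{k=1}^{\infty} Q_k ,
  \qquad
  m^*(E) \leq \sum_{k=1}^{\infty} m^*(Q_k)
         = \sum_{k=1}^{\infty} \varepsilon \, 2^{-k}
         = \varepsilon .
\]
% Since \varepsilon > 0 was arbitrary, the infimum over all covers is zero.
\[
  m^*(E) = 0 .
\]
```

In particular the rational points of $\mathbb{R}^n$ have outer measure zero even though they are dense.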

The next theorem is a useful characterisation of the Lebesgue outer measure.

Theorem B.12.

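Let $E \subseteq \mathbb{R}^n$. Then for each $\varepsilon > 0$, there exists an open set $U \subseteq \mathbb{R}^n$ such that $E \subseteq U$ and $m^*(U) \leq m^*(E) + \varepsilon$.

In particular, we have

$$m^*(E) = \inf\{m^*(U) : U \text{ open and } E \subseteq U\}.$$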

For a proof, see [15, Theorem 3.6].

The Hausdorff outer measure and the associated Hausdorff measure are useful for studying certain subsets of $\mathbb{R}^n$ that have Lebesgue measure zero. For example, for $d < n$ the $d$-dimensional Hausdorff measure of a $d$-dimensional ball in $\mathbb{R}^n$ is non-trivial, even though the Lebesgue measure is zero. Furthermore, the Hausdorff measure can be used to distinguish fractal sets (sets of fractional Hausdorff dimension), and the study of the properties of Hausdorff measure is a major component of Geometric Measure Theory (see [5], [6], [9]).

Example B.13 (Hausdorff outer measure).

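The diameter of a set $E \subseteq \mathbb{R}^n$ is defined to be

$$\delta(E) := \sup_{x, y \in E} |x - y|.$$

Fix $\alpha > 0$ (not necessarily an integer), and let $E \subseteq \mathbb{R}^n$. Given $\varepsilon > 0$, let $K_\varepsilon$ denote the collection of countable covers $\{E_k\}_{k \in \mathbb{N}}$ of $E$ such that $\delta(E_k) < \varepsilon$ for each $k$, and define

$$H_\alpha^\varepsilon(E) = \inf_{K_\varepsilon} \sum_k \delta(E_k)^\alpha.$$

If $\varepsilon_1 < \varepsilon_2$, then $K_{\varepsilon_1} \subseteq K_{\varepsilon_2}$, and so $H_\alpha^{\varepsilon_1}(E) \geq H_\alpha^{\varepsilon_2}(E)$. Therefore

$$H_\alpha(E) := \lim_{\varepsilon \to 0} H_\alpha^\varepsilon(E) \quad \text{exists}$$

as an element of $[0, \infty]$. $H_\alpha(E)$ is called the $\alpha$-dimensional Hausdorff outer measure of $E$. Again, it is easy to check that this is an outer measure (see for example [15, Theorem 11.12]).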
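To see how the exponent $\alpha$ interacts with the size of the covering sets, here is a worked sketch for the unit interval $[0,1] \subseteq \mathbb{R}$. The cover by $N$ equal subintervals is a choice made only for this illustration.

```latex
% Given \varepsilon > 0, choose an integer N > 1/\varepsilon and cover [0,1] by
% the N intervals E_j = [(j-1)/N, j/N], each of diameter 1/N < \varepsilon.
\[
  H_\alpha^{\varepsilon}([0,1])
  \leq \sum_{j=1}^{N} \delta(E_j)^{\alpha}
  = N \cdot N^{-\alpha}
  = N^{1-\alpha} .
\]
% If \alpha > 1, then N^{1-\alpha} \to 0 as N \to \infty, so for every \varepsilon > 0
\[
  H_\alpha^{\varepsilon}([0,1]) = 0
  \qquad \text{and hence} \qquad
  H_\alpha([0,1]) = 0 \quad (\alpha > 1).
\]
% If \alpha = 1, the same cover only gives the bound H_1^{\varepsilon}([0,1]) \leq 1.
```

The estimate reflects the general principle that a set is invisible to Hausdorff outer measures of dimension larger than its own.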

The next definition and theorem show that each outer measure has an associated σ-algebra and that the restriction of the outer measure to this σ-algebra is a measure.

Definition B.14.

Let $\mu^*$ be an outer measure on $X$. A subset $E \subseteq X$ is $\mu^*$-measurable if and only if

$$\mu^*(A) = \mu^*(A \cap E) + \mu^*(A \setminus (A \cap E)) \tag{B.3}$$

for every subset $A \subseteq X$.

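Remark B.15.

  1. An equivalent definition is that $E$ is $\mu^*$-measurable if and only if

    $$\mu^*(A_1 \cup A_2) = \mu^*(A_1) + \mu^*(A_2) \tag{B.4}$$

    whenever $A_1 \subseteq E$ and $A_2 \subseteq X \setminus E$. To see that the first definition implies the second, given any sets $A_1 \subseteq E$ and $A_2 \subseteq X \setminus E$, let $A = A_1 \cup A_2$. Then $A \cap E = A_1$ and $A \setminus (A \cap E) = A_2$, so (B.3) implies (B.4). Conversely, given any set $A \subseteq X$, let $A_1 = A \cap E$ and $A_2 = A \setminus (A \cap E)$. Clearly these satisfy the requirements $A_1 \subseteq E$ and $A_2 \subseteq X \setminus E$, and we have $A = A_1 \cup A_2$. Again, it is clear that (B.4) implies (B.3).

  2. Both of these definitions have the same basic idea: the $\mu^*$-measurable subsets of $X$ are those sets $E$ for which $\mu^*$ is additive whenever an arbitrary set is split into its part inside $E$ and its part outside $E$.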

The next theorem justifies the use of the term “measurable” in the previous definition.

Theorem B.16 (Caratheodory).

Let μ* be an outer measure on X. Then the collection of μ*-measurable subsets of X forms a σ-algebra, and the restriction of μ* to this σ-algebra is a measure.

For a proof, see for example [8, Theorem 1.15].

Remark B.17.

This theorem is used in Section 3.4 to construct the measure associated to a positive distribution.
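On a finite set the measurability condition (B.3) can be checked by brute force. The following Python sketch is an illustration only: the three-point set and the outer measure `outer` are hypothetical examples chosen so that very few sets turn out to be measurable, and the function names are not taken from any library.

```python
from itertools import combinations


def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    items = list(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def measurable_sets(X, outer):
    """Sets E with outer(A) == outer(A & E) + outer(A - E) for every subset A of X."""
    subsets = powerset(X)
    return {E for E in subsets
            if all(outer(A) == outer(A & E) + outer(A - E) for A in subsets)}


X = frozenset({1, 2, 3})


def outer(A):
    # Hypothetical outer measure: 0 on the empty set, 2 on X, 1 on everything else.
    if not A:
        return 0
    return 2 if A == X else 1


def counting(A):
    # Counting measure, viewed as an outer measure on all subsets.
    return len(A)


if __name__ == "__main__":
    print(sorted(map(sorted, measurable_sets(X, outer))))
    print(len(measurable_sets(X, counting)))
```

For `outer`, testing a one-point set $E$ against a two-point set $A$ containing it already violates (B.3), so only the empty set and $X$ are measurable, and they form the trivial σ-algebra. For the counting measure every subset passes the test.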

Definition B.18.

  1. The Lebesgue measure, denoted $m(\cdot)$, is the measure associated to the Lebesgue outer measure from Example B.11.

  2. The $\alpha$-dimensional Hausdorff measure, denoted $H_\alpha$, is the measure associated to the $\alpha$-dimensional Hausdorff outer measure from Example B.13.

Remark B.19.

Open and closed sets are Lebesgue measurable, and therefore the σ-algebra of Lebesgue measurable sets contains the Borel σ-algebra.

Using this definition of Lebesgue measure, together with Theorem B.12, we see that Lebesgue measure is Borel outer regular.

Lemma B.20.

For any Lebesgue measurable set $E \subseteq \mathbb{R}^n$ we have

$$m(E) = \inf\{m(U) : U \text{ open and } E \subseteq U\}.$$

The proof follows by restricting the result of Theorem B.12 to the σ-algebra of Lebesgue-measurable sets. By taking complements, we also see that Lebesgue measure is Borel inner regular.

Lemma B.21.

For any Lebesgue measurable set $E \subseteq \mathbb{R}^n$ we have

$$m(E) = \sup\{m(K) : K \text{ compact and } K \subseteq E\}.$$

This is a consequence of [15, Lemma 3.22], which states that $E$ is measurable if and only if for all $\varepsilon > 0$ there exists a closed set $F \subseteq E$ such that $m(E \setminus F) < \varepsilon$. The lemma above then follows by taking a sequence of compact sets $K_n = F \cap \overline{B(0, n)}$.

A natural question arising from Theorem B.3 is whether two measures that agree on the open and closed subsets of $\mathbb{R}^n$ also agree on the Borel σ-algebra (the minimal σ-algebra generated by these subsets). This question is answered in more generality by the Caratheodory-Hahn Extension Theorem, for which we first need the following definitions.

Definition B.22.

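An algebra of subsets of $X$ is a non-empty collection $\mathcal{A}$ of subsets of $X$ that is closed under the operations of taking complements and finite unions.

Note that, as a consequence, an algebra is also closed under finite intersections, and therefore $X$ and the empty set are both in $\mathcal{A}$. The difference between this definition and that of a σ-algebra is that a σ-algebra is also closed under countable unions. Note that the collection of subsets of $\mathbb{R}^n$ that are either open or closed is not an algebra, since the union of an open set and a closed set need not be open or closed. An example of an algebra that is not a σ-algebra is the collection of all finite unions of intervals of the form $[a, b)$, $(-\infty, b)$ and $[a, \infty)$ in $\mathbb{R}$, which does not contain the countable union $(0, 1) = \bigcup_k [\frac{1}{k}, 1)$.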

Definition B.23.

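A measure on an algebra $\mathcal{A}$ is a function $\mu : \mathcal{A} \to [0, \infty]$ such that $\mu(\emptyset) = 0$, and

$$\mu\Bigl(\bigcup_n A_n\Bigr) = \sum_n \mu(A_n)$$

whenever $\{A_n\}$ is a countable collection of disjoint sets in $\mathcal{A}$ whose union also belongs to $\mathcal{A}$.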

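Given a measure on an algebra $\mathcal{A}$, we can construct an outer measure $\mu^*$ on $X$ as follows. For each subset $A \subseteq X$, let $\mathcal{C}$ be the collection of countable covers $\{A_n\}_{n \in \mathbb{N}}$ of $A$ by sets in $\mathcal{A}$. Define

$$\mu^*(A) = \inf_{\mathcal{C}} \sum_n \mu(A_n). \tag{B.5}$$

Theorem B.24.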

Let $\mu$ be a measure on an algebra $\mathcal{A}$, and let $\mu^*$ be as defined in (B.5). Then

  1. $\mu^*$ is an outer measure,

  2. $\mu^*(A) = \mu(A)$ for all $A \in \mathcal{A}$, and

  3. $A$ is $\mu^*$-measurable for all $A \in \mathcal{A}$.

For a proof, see [15, Theorems 11.18 and 11.19].

Definition B.25.

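Let $\mu$ be a measure on an algebra $\mathcal{A}$. If $\tilde{\mu}$ is a measure on a σ-algebra $\Sigma$ containing $\mathcal{A}$, and $\tilde{\mu}(A) = \mu(A)$ for all $A \in \mathcal{A}$, then we say that $\tilde{\mu}$ is an extension of the measure $\mu$ to the σ-algebra $\Sigma$.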

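Theorem B.16 shows that the outer measure $\mu^*$ defined in (B.5) restricts to a measure on a σ-algebra $\mathcal{A}^*$, which contains $\mathcal{A}$ by Theorem B.24. The next theorem shows that, when $\mu$ is σ-finite, this is the unique extension of $\mu$ to any σ-algebra lying between $\mathcal{A}$ and $\mathcal{A}^*$.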

Theorem B.26 (Caratheodory-Hahn Extension Theorem).

Let $\mu$ be a measure on an algebra $\mathcal{A}$, let $\mu^*$ be the corresponding outer measure, and let $\mathcal{A}^*$ be the σ-algebra of $\mu^*$-measurable sets. Then the restriction of $\mu^*$ to $\mathcal{A}^*$ is an extension of $\mu$. Moreover, if $\mu$ is σ-finite with respect to $\mathcal{A}$, and if $\Sigma$ is any σ-algebra with $\mathcal{A} \subseteq \Sigma \subseteq \mathcal{A}^*$, then $\mu^*$ is the only measure on $\Sigma$ that is an extension of $\mu$.

For a proof see [15, Theorem 11.20].

Sobolev spaces are defined in terms of distributions, and in many of the examples from Sections 2 and 3 we consider distributions $T \in \mathcal{D}(\Omega)^*$ that are represented by a function $f \in L^1_{loc}(\Omega)$, i.e. $T = T_f$ where

$$T_f(\phi) := \int_\Omega f \phi \, dx.$$

Many distributions cannot be represented by a function, for example the delta functional from Example 2.7. The main theorem of Section 3.4 shows that instead of using functions to represent distributions, the right class of objects to look at is the class of regular Borel measures (see Theorem 3.35). A natural question is to ask when a measure can be represented by a function, and, if not, then how can this failure be expressed in terms of properties of the measure. This is the content of the Lebesgue decomposition and Radon-Nikodym theorem.

Definition B.27.

Let $\mu$ and $\nu$ be measures on the same σ-algebra $\Sigma$ on a space $X$. The measure $\nu$ is absolutely continuous with respect to $\mu$ if $\nu(E) = 0$ for every set $E \in \Sigma$ with $\mu(E) = 0$. The measure $\nu$ is singular with respect to $\mu$ if there is a set $Z \in \Sigma$ with $\mu(Z) = 0$, and $\nu(E) = 0$ for every $E \in \Sigma$ such that $E \subseteq X \setminus Z$.

In other words, if sets of μ-measure zero are also sets of ν-measure zero, then ν is absolutely continuous with respect to μ. If ν is supported on a set of μ-measure zero then it is singular with respect to μ.
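The two notions can be illustrated on $\mathbb{R}$ with Lebesgue measure $m$ playing the role of $\mu$. The following is a sketch of two standard examples; the symbols $\delta_0$ and $\nu_f$ are notation introduced only for this illustration.

```latex
% Singular example: the Dirac measure at the origin.
\[
  \delta_0(E) =
  \begin{cases}
    1 & \text{if } 0 \in E, \\
    0 & \text{if } 0 \notin E .
  \end{cases}
\]
% Take Z = \{0\}. Then m(Z) = 0 and \delta_0(E) = 0 whenever
% E \subseteq \mathbb{R} \setminus Z, so \delta_0 is singular with respect to m.
%
% Absolutely continuous example: a measure with an integrable density f \geq 0.
\[
  \nu_f(E) = \int_E f \, dx ,
  \qquad
  m(E) = 0 \ \Longrightarrow\ \nu_f(E) = 0 ,
\]
% so \nu_f is absolutely continuous with respect to m.
```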

Theorem B.28 (Radon-Nikodym theorem).

Let $(X, \Sigma, \mu)$ be a σ-finite measure space, and let $\alpha$ be a finite measure on $\Sigma$ that is absolutely continuous with respect to $\mu$. Then there exists $f \in L^1(X, d\mu)$ such that

$$\alpha(E) = \int_E f \, d\mu \tag{B.6}$$

for each $E \in \Sigma$.

Theorem B.29.

Let $(X, \Sigma, \mu)$ be a measure space, and let $\sigma$ be a measure on $\Sigma$ that is singular with respect to $\mu$. Then there exists a set $Z$ with $\mu(Z) = 0$, and

$$\sigma(E) = \sigma(E \cap Z)$$

for each $E \in \Sigma$.

Theorem B.30 (Lebesgue Decomposition).

Let $\mu$ be a σ-finite measure on a σ-algebra $\Sigma$, and let $\nu$ be a finite measure on $\Sigma$. Then there is a unique decomposition

$$\nu = \alpha + \sigma, \tag{B.7}$$

where $\alpha$ and $\sigma$ are measures on $\Sigma$ such that $\alpha$ is absolutely continuous with respect to $\mu$, and $\sigma$ is singular with respect to $\mu$.

See [12] or [15] for different proofs of the above statements. Note that Rudin in [12] considers the more general case of a complex measure.
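For measures given by finitely many point masses, the decomposition and the density in the Radon-Nikodym theorem can be written down directly. The following Python sketch covers only that finite, discrete situation and is not a substitute for the proofs cited above. The function name `lebesgue_decomposition` and the dictionary representation of a measure are assumptions made for this illustration.

```python
from fractions import Fraction


def lebesgue_decomposition(mu, nu):
    """Split nu = alpha + sigma with respect to mu, for point masses on a finite set.

    mu, nu: dicts mapping points to non-negative weights (hypothetical representation).
    Returns (alpha, sigma, f), where alpha has density f with respect to mu and
    sigma is carried by the mu-null set Z = {x : mu(x) = 0}.
    """
    points = set(mu) | set(nu)
    # Z is the mu-null set on which the singular part lives.
    Z = {x for x in points if mu.get(x, 0) == 0}
    alpha = {x: (nu.get(x, 0) if x not in Z else 0) for x in points}
    sigma = {x: (nu.get(x, 0) if x in Z else 0) for x in points}
    # Density of alpha with respect to mu, defined where mu is positive.
    f = {x: Fraction(nu.get(x, 0)) / Fraction(mu[x]) for x in points if x not in Z}
    return alpha, sigma, f


if __name__ == "__main__":
    mu = {"a": 2, "b": 1, "c": 0}
    nu = {"a": 1, "b": 3, "c": 5}
    alpha, sigma, f = lebesgue_decomposition(mu, nu)
    for x in sorted(mu):
        print(x, alpha[x], sigma[x], f.get(x))
```

Here the point `"c"` has zero weight under `mu`, so all of the mass of `nu` at that point goes into the singular part, while on the remaining points the density is the ratio of the weights.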

The following simple example shows that σ-finiteness is a necessary condition in the Radon-Nikodym theorem. Another example using the counting measure is described in [12, pp. 123-124].

Example B.31.

Let $\Sigma = \{\emptyset, X\}$ be the trivial σ-algebra on a set $X$, and let $\mu$ and $\nu$ be measures on $X$ with $\mu(\emptyset) = 0$, $\mu(X) = \infty$, $\nu(\emptyset) = 0$, and $\nu(X) = 1$. Note that $\nu$ is absolutely continuous with respect to $\mu$, and that $\mu$ is not a σ-finite measure on $X$. Then the $\Sigma$-measurable functions $f : X \to \mathbb{R}$ are the constants (since $f$ measurable implies that $f^{-1}(U) \in \Sigma$ for all open sets $U$), and so $\int_X f \, d\mu = \pm\infty$ for any non-zero measurable function, while the integral of the zero function is zero. Since $\nu(X) = 1$, there cannot exist any measurable function $f$ such that $\int_X f \, d\mu = \nu(X)$, and therefore the Radon-Nikodym theorem does not hold in this case.

The next lemma is a consequence of the well-known Vitali covering lemma.

Lemma B.32.

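Let $\Omega \subseteq \mathbb{R}^n$ be an open set. Then for all $\delta > 0$ there exists a countable collection $\{B_k\}_{k \in \mathbb{N}}$ of disjoint closed balls in $\Omega$ such that

  1. $\operatorname{diam} B_k \leq \delta$ for all $k$, and

  2. $\Omega \setminus \bigcup_k B_k$ has Lebesgue measure zero.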

See [5, Corollary 2, p28] for a proof.

References

  [1] Robert A. Adams. Sobolev spaces. Pure and Applied Mathematics, Vol. 65. Academic Press [A subsidiary of Harcourt Brace Jovanovich, Publishers], New York-London, 1975.
  [2] S. Banach. Theory of linear operations, volume 38 of North-Holland Mathematical Library. North-Holland Publishing Co., Amsterdam, 1987. Translated from the French by F. Jellett, with comments by A. Pełczyński and Cz. Bessaga.
  [3] Lawrence Conlon. Differentiable manifolds: a first course. Birkhäuser Advanced Texts: Basler Lehrbücher. [Birkhäuser Advanced Texts: Basel Textbooks]. Birkhäuser Boston Inc., Boston, MA, 1993.
  [4] Lawrence C. Evans. Partial differential equations, volume 19 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 1998.
  [5] Lawrence C. Evans and Ronald F. Gariepy. Measure theory and fine properties of functions. Studies in Advanced Mathematics. CRC Press, Boca Raton, FL, 1992.
  [6] Herbert Federer. Geometric measure theory. Die Grundlehren der mathematischen Wissenschaften, Band 153. Springer-Verlag New York Inc., New York, 1969.
  [7] Paul R. Halmos. Measure Theory. D. Van Nostrand Company, Inc., New York, N. Y., 1950.
  [8] Elliott H. Lieb and Michael Loss. Analysis, volume 14 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, second edition, 2001.
  [9] Frank Morgan. Geometric measure theory: a beginner’s guide. Academic Press Inc., San Diego, CA, third edition, 2000.
  [10] John C. Oxtoby. Measure and category: a survey of the analogies between topological and measure spaces, volume 2 of Graduate Texts in Mathematics. Springer-Verlag, New York, second edition, 1980.
  [11] Michael Reed and Barry Simon. Methods of modern mathematical physics. I. Functional analysis. Academic Press Inc. [Harcourt Brace Jovanovich Publishers], New York, second edition, 1980.
  [12] Walter Rudin. Real and complex analysis. McGraw-Hill Book Co., New York, third edition, 1987.
  [13] Walter Rudin. Functional analysis. International Series in Pure and Applied Mathematics. McGraw-Hill Inc., New York, second edition, 1991.
  [14] François Trèves. Topological vector spaces, distributions and kernels. Academic Press, New York, 1967.
  [15] Richard L. Wheeden and Antoni Zygmund. Measure and integral: an introduction to real analysis. Pure and Applied Mathematics, Vol. 43. Marcel Dekker Inc., New York, 1977.
  [16] William P. Ziemer. Weakly differentiable functions: Sobolev spaces and functions of bounded variation, volume 120 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1989.