We discuss the covariant derivative, which is a coordinate-independent way of differentiating one vector field with respect to another. This will be useful for defining the acceleration of a curve, which is the covariant derivative of the velocity vector with respect to itself, and for defining geodesics, which are curves with zero acceleration. We follow Lee’s Riemannian Manifolds, $\S2$ and $\S4$.


Tangent vector as derivation

Let $M$ be a smooth manifold of dimension $n$, which means every point $p \in M$ has a neighborhood $U \subset M$ that is identified with an open subset of $\R^n$ via a smooth coordinate chart, so we can identify $p \in M$ with its image $(x^i) \equiv (x^1,\dots,x^n) \in \R^n$, although formally each coordinate $x^i$ is a map from $U$ to $\R$.

Attached to each $p \in M$ is the tangent space $T_pM$, which is a vector space of dimension $n$. Each tangent vector $v \in T_pM$ represents an equivalence class of curves passing through $p$ with velocity $v$.

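However, there is another, more abstract definition of a tangent vector as a derivation, which means each tangent vector $v \in T_pM$ acts on the space $C^\infty(M)$ of smooth functions $f \colon M \to \R$ by taking the directional derivative along $v$. Writing $v = (v^1,\dots,v^n)$ in local coordinates:

$$v(f) = \sum_{i=1}^n v^i \, \frac{\partial f}{\partial x^i}(p).$$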

Thus, each local coordinate system $(x^i)$ gives rise to a basis $(\partial_i)$ of $T_pM$ consisting of the partial derivatives $\partial_i \equiv \partial/\partial x^i$. Henceforth, we use the Einstein summation convention and hide the $\sum_{i=1}^n$ symbol.

The defining property of the derivation operation is that it satisfies the product rule over $C^\infty(M)$:

$$v(fg) = f(p) \, v(g) + g(p) \, v(f).$$

Moreover, it is linear in $v$, which is clear from the definition.

Now suppose we have a vector field $w$ on $M$, which means at each point $p \in M$ we have a tangent vector $w(p) \in T_pM$ that varies smoothly with $p$. If we write $w(p) = (w^i)$ in the basis $(\partial_i)$ of $T_pM$, each $w^i$ is a smooth function from $M$ to $\R$, so we can differentiate it the usual way in the direction of a tangent vector $v$. Then we can collect the results together into a vector, which we denote by $\nabla_v w$:

$$\nabla_v w = \big( v(w^1), \dots, v(w^n) \big) = \part{w}{x} \, v,$$

where $\part{w}{x}$ is the Jacobian matrix of $w$. In terms of components, this is (using the Einstein convention):

$$\nabla_v w = v^i \, \frac{\partial w^j}{\partial x^i} \, \partial_j.$$

This is the Euclidean way of differentiating vector fields. However, it turns out there are many other possible ways of defining how to differentiate vector fields, as we see below.
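As a small numerical illustration of the Euclidean derivative (the Jacobian of $w$ applied to $v$), here is a minimal Python sketch. The function name `euclidean_derivative`, the finite-difference step `h`, and the example field `rotation_field` are all assumptions made for this illustration; they do not come from Lee's text.

```python
import numpy as np

def euclidean_derivative(w, p, v, h=1e-6):
    """Approximate the Jacobian of w at p applied to v by a central difference."""
    p = np.asarray(p, dtype=float)
    v = np.asarray(v, dtype=float)
    return (np.asarray(w(p + h * v)) - np.asarray(w(p - h * v))) / (2 * h)

# Hypothetical example: the rotation field w(x, y) = (-y, x) on R^2.
def rotation_field(p):
    x, y = p
    return np.array([-y, x])

# The Jacobian of w is [[0, -1], [1, 0]], so along v = (1, 0) we expect
# a result close to (0, 1), up to floating-point error.
result = euclidean_derivative(rotation_field, [1.0, 2.0], [1.0, 0.0])
print(result)
```

The central difference is only an approximation in general; for a linear field such as this one it is exact up to rounding.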


Connection

We generalize the example above and define an abstract way of differentiating a vector field with respect to another, via the notion of a connection (sometimes also called linear or affine connection).

Let $TM = \bigsqcup_{p \in M} T_pM$ be the tangent bundle of $M$, which is the collection of all tangent spaces of $M$ glued to $M$ at the base points, so each element in $TM$ is of the form $(p,v)$ with $p \in M$ and $v \in T_pM$. This has a smooth manifold (vector bundle) structure. Let $\pi \colon TM \to M$ denote the projection operator which recovers the base point, $\pi(p,v) = p$.

A section of $TM$ is a map $F \colon M \to TM$ such that the composition $\pi \circ F \colon M \to M$ is the identity, which means each $p \in M$ is mapped to its own tangent space, $F(p) = (p,v)$ for some $v \in T_pM$. Thus, a section of $TM$ is simply a vector field on $M$. In other words, we can abstractly think of a vector field as a section of the tangent bundle. We say the vector field is smooth if the section $F \colon M \to TM$ is smooth as a map between manifolds; this coincides with the usual definition of smooth vector fields.

Let $\T(M)$ denote the space of smooth sections of $TM$, i.e., the space of smooth vector fields on $M$. This is an infinite-dimensional vector space, with pointwise addition and scalar multiplication.

We define a connection on $M$ as a map

$$\nabla \colon \T(M) \times \T(M) \to \T(M),$$

written $(v,w) \mapsto \nabla_v w$, satisfying the properties:

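  • $\nabla_v w$ is linear over $C^\infty(M)$ in $v$: $\nabla_{f v_1 + g v_2} w = f \, \nabla_{v_1} w + g \, \nabla_{v_2} w$ for all $f, g \in C^\infty(M)$.
  • $\nabla_v w$ is linear over $\R$ in $w$: $\nabla_v (a w_1 + b w_2) = a \, \nabla_v w_1 + b \, \nabla_v w_2$ for all $a, b \in \R$.
  • most importantly, $\nabla$ satisfies the product rule, which is the vector analog of the scalar product rule we have seen above: $\nabla_v (f w) = f \, \nabla_v w + v(f) \, w$ for all $f \in C^\infty(M)$.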

We call $\nabla_v w$ the covariant derivative of $w$ in the direction of $v$.


Christoffel symbols

Given the rules above, we can compute the formula for the covariant derivative as follows. Let $(E_i)$ be a local frame for $TM$ on an open subset $U \subset M$, which means $E_i \in \T(U)$ is a vector field on $U$, and at each $p \in U$, the vectors $(E_1(p), \dots, E_n(p))$ form a basis for $T_pM$; an example is the partial derivative vector field $E_i = \partial_i$. For any indices $i$ and $j$, we can expand the vector field $\nabla_{E_i} E_j$ in terms of this same frame:

$$\nabla_{E_i} E_j = \Gamma_{ij}^k \, E_k,$$

which defines $n^3$ smooth functions $\Gamma_{ij}^k$ on $U$, called the Christoffel symbols of $\nabla$ with respect to $(E_i)$.

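Then for any vector fields $v,w \in \T(U)$, we can compute the covariant derivative $\nabla_v w$ as follows. First, write $v = v^i E_i$ and $w = w^j E_j$ in terms of the local frame $(E_i)$. Then by the product rule:

$$\nabla_v w = \nabla_v (w^j E_j) = w^j \, \nabla_v E_j + v(w^j) \, E_j.$$

And by linearity in $v$ over $C^\infty(U)$, we can write the first term as:

$$w^j \, \nabla_v E_j = w^j \, \nabla_{v^i E_i} E_j = v^i w^j \, \nabla_{E_i} E_j = v^i w^j \, \Gamma_{ij}^k \, E_k,$$

where we have used the definition of the Christoffel symbols, $\nabla_{E_i} E_j = \Gamma_{ij}^k E_k$. Combining the two equations above (and renaming the dummy index), we obtain the formula:

$$\nabla_v w = \big( v(w^k) + v^i w^j \, \Gamma_{ij}^k \big) \, E_k.$$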

For example, the Euclidean connection, which is the Euclidean way of differentiating vector fields described above, corresponds to the choice $\Gamma_{ij}^k = 0$. But as the formula above shows, there are many other connections. In fact, for $M = \R^n$, or when $M$ is covered by a single chart, each choice of $n^3$ smooth functions $\Gamma_{ij}^k$ specifies a connection via the formula above (Lee, Lemma 4.4).
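The component formula $(\nabla_v w)^k = v(w^k) + v^i w^j \Gamma_{ij}^k$ can be evaluated at a single point with a few lines of NumPy. The sketch below is only an illustration: the function name, the array layout `christoffel[i, j, k]` for $\Gamma_{ij}^k$, and the test point are assumptions. As a check it uses a standard fact not derived in this post: in polar coordinates $(r, \theta)$ on the punctured plane, the Euclidean connection has $\Gamma_{\theta\theta}^r = -r$ and $\Gamma_{r\theta}^\theta = \Gamma_{\theta r}^\theta = 1/r$, with all other symbols zero. The constant Cartesian field $\partial_x = \cos\theta \, \partial_r - \frac{\sin\theta}{r} \, \partial_\theta$ should then have zero covariant derivative in every direction.

```python
import numpy as np

def covariant_derivative(v, w, dw, christoffel):
    """Components of nabla_v w at a single point.

    v, w        : shape (n,), components v^i and w^j in the frame (E_i)
    dw          : shape (n, n), dw[k, i] = E_i(w^k)
    christoffel : shape (n, n, n), christoffel[i, j, k] = Gamma_{ij}^k
    """
    directional = dw @ v  # v(w^k) = v^i E_i(w^k)
    correction = np.einsum("i,j,ijk->k", v, w, christoffel)  # v^i w^j Gamma_{ij}^k
    return directional + correction

# Polar coordinates on the punctured plane; index 0 = r, index 1 = theta.
r, theta = 2.0, 0.7  # arbitrary test point (assumption)
christoffel = np.zeros((2, 2, 2))
christoffel[1, 1, 0] = -r
christoffel[0, 1, 1] = christoffel[1, 0, 1] = 1.0 / r

# The constant field d/dx written in the polar frame, and its partial derivatives.
w = np.array([np.cos(theta), -np.sin(theta) / r])
dw = np.array([
    [0.0, -np.sin(theta)],                        # d_r w^r, d_theta w^r
    [np.sin(theta) / r**2, -np.cos(theta) / r],   # d_r w^theta, d_theta w^theta
])

v = np.array([0.3, -1.2])  # arbitrary direction (assumption)
result = covariant_derivative(v, w, dw, christoffel)
print(result)  # both components vanish up to rounding
```

Here the partial derivatives of the components are not zero, yet the Christoffel term cancels them exactly, which is the sense in which the covariant derivative is coordinate-independent.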


Covariant derivative along curves

Let $\gamma \colon I \to M$ be a smooth curve, where $I \subset \R$ is an interval. At any time $t \in I$, the velocity $\dot \gamma(t)$ is invariantly defined as the push-forward $\gamma_\ast(d/dt)$ of the time differential operator $d/dt \in T_tI$, which means it acts on functions by:

$$\dot \gamma(t) f = \frac{d}{dt} (f \circ \gamma)(t).$$

This abstract definition agrees with the usual coordinate definition of velocity, so if we write $\gamma(t) = (\gamma^1(t),\dots,\gamma^n(t))$ in coordinates, then $\dot \gamma(t) = \dot \gamma^i(t) \partial_i$ where $\dot \gamma^i(t) = \frac{d}{dt} \gamma^i(t)$.

A vector field along a curve $\gamma$ is a smooth map $v \colon I \to TM$ such that $v(t) \in T_{\gamma(t)}M$; an example is the velocity vector field $\dot \gamma(t) \in T_{\gamma(t)}M$. Let $\T(\gamma)$ denote the space of vector fields along $\gamma$.

Then we can show (Lee, Lemma 4.9) that any linear connection $\nabla$ on $M$ determines a unique operator

$$D_t \colon \T(\gamma) \to \T(\gamma)$$

satisfying the properties:

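  • Linearity over $\R$: $D_t(a v + b w) = a \, D_t v + b \, D_t w$ for all $a, b \in \R$.
  • Product rule: $D_t(f v) = \dot f \, v + f \, D_t v$ for all $f \in C^\infty(I)$.
  • If $v$ is extendible, which means there is a vector field $\tilde v$ on a neighborhood of the image of $\gamma$ such that $v(t) = \tilde v_{\gamma(t)}$, then: $D_t v(t) = \nabla_{\dot \gamma(t)} \tilde v$.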

For any $v \in \T(\gamma)$, $D_tv$ is called the covariant derivative of $v$ along $\gamma$.

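The uniqueness of $D_t$ follows because we can compute a formula for it. Given $v \in \T(\gamma)$ and $t_0 \in I$, choose coordinates near $\gamma(t_0)$, and write $v(t) = v^j(t) \partial_j$. Then by linearity and the product rule:

$$D_t v = \dot v^j \, \partial_j + v^j \, D_t \partial_j.$$

And since $\partial_j$ is extendible, the second term is:

$$v^j \, D_t \partial_j = v^j \, \nabla_{\dot \gamma} \partial_j = \dot \gamma^i v^j \, \nabla_{\partial_i} \partial_j = \dot \gamma^i v^j \, \Gamma_{ij}^k \, \partial_k,$$

where $\Gamma_{ij}^k$ are the Christoffel symbols with respect to $(\partial_i)$, evaluated at $\gamma(t)$. Thus, we obtain the formula for the covariant derivative of $v$ along $\gamma$:

$$D_t v(t) = \big( \dot v^k(t) + \dot \gamma^i(t) \, v^j(t) \, \Gamma_{ij}^k(\gamma(t)) \big) \, \partial_k.$$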

This also proves existence, since we can define $D_t v$ via the formula above, which satisfies the defining properties.
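To make the coordinate formula $(D_t v)^k = \dot v^k + \dot\gamma^i v^j \Gamma_{ij}^k(\gamma(t))$ concrete, here is a minimal sketch that evaluates it at one instant. The function name and array layout are assumptions for illustration. The example again uses the standard polar-coordinate Christoffel symbols of the Euclidean connection on the punctured plane ($\Gamma_{\theta\theta}^r = -r$, $\Gamma_{r\theta}^\theta = \Gamma_{\theta r}^\theta = 1/r$), which are not derived in this post.

```python
import numpy as np

def covariant_derivative_along_curve(v, v_dot, gamma_dot, christoffel):
    """Components of D_t v at one time t.

    v, v_dot    : shape (n,), components v^j(t) and their time derivatives
    gamma_dot   : shape (n,), velocity components of the curve at time t
    christoffel : shape (n, n, n), christoffel[i, j, k] = Gamma_{ij}^k at gamma(t)
    """
    return v_dot + np.einsum("i,j,ijk->k", gamma_dot, v, christoffel)

# Unit circle in polar coordinates: gamma(t) = (r, theta) = (1, t).
r = 1.0
christoffel = np.zeros((2, 2, 2))
christoffel[1, 1, 0] = -r
christoffel[0, 1, 1] = christoffel[1, 0, 1] = 1.0 / r

gamma_dot = np.array([0.0, 1.0])  # velocity is d/dtheta
v = gamma_dot.copy()              # take v to be the velocity field itself
v_dot = np.zeros(2)               # its components are constant in t

result = covariant_derivative_along_curve(v, v_dot, gamma_dot, christoffel)
print(result)  # components (-1, 0) in the frame (d/dr, d/dtheta)
```

Although the components of $v$ are constant in these coordinates, $D_t v$ is not zero: it points toward the origin, as one expects for uniform circular motion in the plane.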


Geodesics

Let $M$ be a manifold with a linear connection $\nabla$, and let $\gamma$ be a smooth curve in $M$.

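We define the acceleration of $\gamma$ to be the vector field $D_t \dot \gamma = \nabla_{\dot \gamma} \dot \gamma$ along $\gamma$. From the formula above, we can write the acceleration of $\gamma$ in components as:

$$D_t \dot \gamma(t) = \big( \ddot \gamma^k(t) + \dot \gamma^i(t) \, \dot \gamma^j(t) \, \Gamma_{ij}^k(\gamma(t)) \big) \, \partial_k.$$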

Note that this says to differentiate $\dot \gamma$ with respect to time (along $\gamma$), we need to take the covariant derivative of $\dot \gamma$ with respect to itself. This sounds a bit odd, but ultimately this is because $\dot \gamma = \gamma_*(d/dt)$ is the pushforward of the time differential operator, so it also represents the time derivative on $\gamma$.

Then we can define a geodesic (with respect to $\nabla$) to be a curve $\gamma$ with zero acceleration: $D_t \dot \gamma \equiv 0$. From the formula above, we see that $\gamma(t) = (\gamma^1(t),\dots,\gamma^n(t))$ is a geodesic if and only if its components satisfy the geodesic equation:

$$\ddot \gamma^k(t) + \dot \gamma^i(t) \, \dot \gamma^j(t) \, \Gamma_{ij}^k(\gamma(t)) = 0, \qquad k = 1, \dots, n.$$

This is a system of second-order ordinary differential equations for the functions $\gamma^i(t)$, and we can show existence and uniqueness of solutions, at least for a small interval of time around the initial condition. Furthermore, we can prove that for any starting point $p \in M$ and initial velocity $v \in T_pM$, there is a unique maximal geodesic $\gamma \colon I \to M$ with $\gamma(0) = p$ and $\dot \gamma(0) = v$, defined on some open interval $I \subset \R$, where maximal means the domain $I$ cannot be extended further (Lee, p. 59).

For example, for the Euclidean connection where $\Gamma_{ij}^k = 0$, the geodesic equation becomes $\ddot \gamma^k(t) = 0$, which means the geodesics are straight lines $\gamma(t) = \gamma(0) + \dot \gamma(0) t$, defined for all time $t \in \R$. But for other choices of connections, we often cannot solve the geodesic equation explicitly.
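When no closed form is available, the geodesic equation $\ddot\gamma^k = -\dot\gamma^i \dot\gamma^j \Gamma_{ij}^k(\gamma)$ can be integrated numerically as a first-order system in $(\gamma, \dot\gamma)$. The sketch below uses a hand-written fourth-order Runge-Kutta step; the function names, step count, and initial data are assumptions for illustration. As a sanity check it again uses the standard polar-coordinate Christoffel symbols of the Euclidean connection (not derived in this post), for which the geodesics must be straight lines in the plane.

```python
import numpy as np

def polar_christoffel(x):
    """Gamma_{ij}^k of the Euclidean connection in polar coordinates (0 = r, 1 = theta)."""
    r = x[0]
    g = np.zeros((2, 2, 2))
    g[1, 1, 0] = -r
    g[0, 1, 1] = g[1, 0, 1] = 1.0 / r
    return g

def geodesic_rhs(state, christoffel):
    """Right-hand side of the first-order system for (gamma, gamma_dot)."""
    n = state.size // 2
    x, x_dot = state[:n], state[n:]
    x_ddot = -np.einsum("i,j,ijk->k", x_dot, x_dot, christoffel(x))
    return np.concatenate([x_dot, x_ddot])

def integrate_geodesic(x0, v0, christoffel, t_end, steps=1000):
    """Integrate the geodesic equation from t = 0 to t_end with classical RK4."""
    state = np.concatenate([np.asarray(x0, dtype=float), np.asarray(v0, dtype=float)])
    h = t_end / steps
    for _ in range(steps):
        k1 = geodesic_rhs(state, christoffel)
        k2 = geodesic_rhs(state + 0.5 * h * k1, christoffel)
        k3 = geodesic_rhs(state + 0.5 * h * k2, christoffel)
        k4 = geodesic_rhs(state + h * k3, christoffel)
        state = state + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    n = state.size // 2
    return state[:n], state[n:]

# Start at (r, theta) = (1, 0) with velocity d/dtheta, i.e. at the Cartesian
# point (1, 0) moving in the direction (0, 1). The straight line is (1, t).
x_end, v_end = integrate_geodesic([1.0, 0.0], [0.0, 1.0], polar_christoffel, t_end=1.0)
r_end, theta_end = x_end
print(r_end * np.cos(theta_end), r_end * np.sin(theta_end))  # close to (1, 1)
```

Converting the endpoint back to Cartesian coordinates recovers the point on the straight line $\gamma(t) = \gamma(0) + \dot\gamma(0) t$, up to integration error. For a different connection, only the `christoffel` callable would need to change.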

Finally, we note that in our discussion we have not mentioned the metric property of $M$. Indeed, we can define the notion of a geodesic with respect to a linear connection $\nabla$ on $M$, and in principle there are many choices of $\nabla$, each giving different geodesics. However, it turns out that when $M$ is endowed with a Riemannian metric structure, there is a unique connection that is “compatible” with the metric, called the Levi-Civita connection, which we will discuss further next time.