# Taylor's Method

## What is it?

It's a method for recovering the 3D configuration (pose) of an articulated object, e.g. the human body, from its 2D projection.

The only assumption we make is that the 3D → 2D projection can be roughly modeled by a scaled orthographic projection.

Under this assumption, only three sources of information are needed:

• the 2D projection (2D position) of some points of the 3D model
• the lengths l of the segments of the 3D model (for the human body: the limb lengths)
• the scale s of the scaled orthographic projection

## Who invented it?

The idea was presented by Camillo J. Taylor in his paper

which appeared in the Journal of Computer Vision and Image Understanding, Vol.80, pp. 677-684, 2000.

## When is a scaled orthographic projection a good assumption?

3D-to-2D projections are divided into:

• parallel projections, where the direction of projection (DOP) is the same for all 3D points
  • in orthographic projections the DOPs are perpendicular to the image plane
  • in oblique projections the DOPs are not perpendicular to the image plane
• perspective projections, where the DOPs are not the same for all 3D points

In general 2D images of the 3D world are perspective projections: objects become smaller as they get further from the camera.

In a parallel projection, two objects of the same 3D size are projected to the same 2D size, regardless of their distances to the camera.

But if the depth extent of the object of interest is small compared to the camera ↔ object distance, the scaled orthographic projection is a good approximation of the perspective one. This is why it is also called weak perspective projection.
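To make this approximation argument concrete, here is a minimal Python sketch that compares both projections for a toy object. The pinhole model `u = f*x/z`, the choice of the mean depth as reference depth, and all names (`perspective`, `weak_perspective`, `max_deviation`, `make_object`) are assumptions made for illustration and are not taken from Taylor's paper.

```python
def perspective(point, f=1.0):
    """Pinhole projection: every point is divided by its own depth z."""
    x, y, z = point
    return (f * x / z, f * y / z)


def weak_perspective(point, z_ref, f=1.0):
    """Scaled orthographic projection: one common scale s = f / z_ref."""
    x, y, _ = point
    s = f / z_ref
    return (s * x, s * y)


def max_deviation(points, f=1.0):
    """Largest gap between both projections, expressed in object units."""
    z_ref = sum(p[2] for p in points) / len(points)  # assumption: mean depth
    s = f / z_ref
    worst = 0.0
    for p in points:
        up, vp = perspective(p, f)
        uw, vw = weak_perspective(p, z_ref, f)
        gap = ((up - uw) ** 2 + (vp - vw) ** 2) ** 0.5
        worst = max(worst, gap / s)  # divide by s to undo the image scaling
    return worst


def make_object(distance):
    """Hypothetical object: 1 unit wide, 0.4 units deep, centred at `distance`."""
    return [(0.5, 0.0, distance - 0.2), (-0.5, 0.0, distance + 0.2)]


near = max_deviation(make_object(2.0))   # camera close to the object
far = max_deviation(make_object(20.0))   # camera far away from the object
print(f"near: {near:.4f}  far: {far:.4f}")
```

For the same object, the deviation drops by roughly a factor of ten when the camera distance grows by a factor of ten, which is the behaviour the paragraph above describes.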

Here is a sample of a 3D pose (markers: light blue) projected to a virtual image plane (the transparent rectangle) by a scaled orthographic projection. The 2D pose markers are colored in dark blue:

## How does it work?

Assume the projection from 3D to 2D can be roughly modeled by a scaled orthographic projection, and that:

• we know the 2D positions (u,v) in the image plane of some of the 3D model points (x,y,z), with u = s*x + dx and v = s*y + dy, where (dx,dy) is a 2D translation
• we know the relative lengths of the model segments
• we have an approximation for the scale s of the scaled orthographic projection

If we know the 2D positions (u1,v1) and (u2,v2) of the start and end point of a segment, we also know its projected 2D length l'. Comparing l' with the 3D length l of the segment tells us how strongly the segment is foreshortened. Given the scale s of the projection, this foreshortening constrains the offset dz = z1 - z2 between the corresponding 3D points (x1,y1,z1) and (x2,y2,z2) up to a sign.

The sign ambiguity remains because we cannot know which of the two points has the smaller z-coordinate. It is the information loss inherent in any 3D to 2D projection.

## The maths of Taylor's method in a nutshell

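For a segment of 3D length l between the points (x1,y1,z1) and (x2,y2,z2) we have

$$ l^2 = (x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2 $$

Under the scaled orthographic projection u = s*x + dx, v = s*y + dy, the 2D differences are

$$ u_1 - u_2 = s\,(x_1 - x_2), \qquad v_1 - v_2 = s\,(y_1 - y_2) $$

Substituting and solving for dz = z1 - z2 gives

$$ dz = \pm\sqrt{\,l^2 - \frac{(u_1 - u_2)^2 + (v_1 - v_2)^2}{s^2}\,} $$

Note that the two squared terms (u1-u2)^2 and (v1-v2)^2 are added, not subtracted.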
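The relation between l, s and the 2D endpoints of a segment can be sketched in a few lines of Python. The function name `depth_offset` and its argument layout are hypothetical, and raising an error for a negative value under the square root is a design choice of this sketch, not something prescribed by the method.

```python
import math


def depth_offset(p1, p2, length, scale):
    """Return |dz| for one segment.

    p1, p2: 2D endpoints (u1, v1) and (u2, v2) in the image plane
    length: 3D segment length l
    scale:  scale s of the scaled orthographic projection
    """
    du = p1[0] - p2[0]
    dv = p1[1] - p2[1]
    # Squared terms are ADDED: ((u1-u2)^2 + (v1-v2)^2) / s^2
    radicand = length ** 2 - (du ** 2 + dv ** 2) / scale ** 2
    if radicand < 0:
        # The projected segment is longer than s * l, so l or s must be off.
        raise ValueError("projected length exceeds scale * length")
    return math.sqrt(radicand)


# Segment of length 5 whose projection is 3 long at scale 1: |dz| = 4.
dz = depth_offset((0.0, 0.0), (3.0, 0.0), length=5.0, scale=1.0)
print(dz)  # 4.0
```

The function returns only the magnitude of dz; the sign has to come from somewhere else.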

## Do we get a unique result?

No! Unfortunately not…

Even if we know

• the scale of the scaled orthographic projection
• and the true bone lengths

there is still a huge set of possible poses that fit Taylor's equations.

Reason: we have the +/- sign ambiguity for each bone of the body model (for each bone: 2 possible values of dz), so for a body model with N bones we have 2^N possible poses.
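To see where the 2^N comes from, here is a small Python sketch that enumerates all sign combinations for a toy skeleton. The skeleton, the dz magnitudes, and the name `enumerate_depths` are made up for illustration, and the root joint is assumed to sit at z = 0.

```python
from itertools import product


def enumerate_depths(bones, dz_abs, root=0, root_z=0.0):
    """Yield one {joint: z} dict per sign combination.

    bones:  list of (parent, child) joint ids, ordered so that every
            parent already has a depth when its bone is processed
    dz_abs: list of |dz| values, one per bone
    """
    for signs in product((+1, -1), repeat=len(bones)):
        z = {root: root_z}
        for (parent, child), magnitude, sign in zip(bones, dz_abs, signs):
            z[child] = z[parent] + sign * magnitude
        yield z


# Hypothetical skeleton with N = 3 bones: 0 -> 1, 1 -> 2, 1 -> 3
bones = [(0, 1), (1, 2), (1, 3)]
dz_abs = [0.3, 0.2, 0.2]

poses = list(enumerate_depths(bones, dz_abs))
print(len(poses))  # 8
```

All of these depth assignments share the same x and y coordinates, so they all project to the same 2D pose. A bone with dz = 0 contributes two identical candidates, so the number of distinct poses can be smaller than 2^N.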

The following animation shows some of these poses to illustrate how ambiguous the result still is: all the displayed 3D poses project to the same 2D pose!

## Example of a wrong 3D pose that maps to the same 2D pose as the ground truth pose

Ground truth 3D pose:

Wrong pose, but it maps to the same 2D pose as the ground truth pose!

## What happens if we estimate the bone length or the scale wrong?

Then the dz value will be wrong!

Consider the difference under the square root: the minuend is l^2 and the subtrahend is ((u1-u2)^2 + (v1-v2)^2) / s^2.

Bone length:

• if the bone length l is estimated too big ⇒ minuend will be too big ⇒ dz will be too big
• if the bone length l is estimated too small ⇒ minuend will be too small ⇒ dz will be too small

Scale:

• if the scale s is estimated too big ⇒ subtrahend will be too small ⇒ dz will be too big
• if the scale s is estimated too small ⇒ subtrahend will be too big ⇒ dz will be too small
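The four cases above can be checked numerically with a short Python sketch. The segment values (projected length 3, true l = 5, true s = 1) and the 10% errors are arbitrary example numbers, and `dz_magnitude` is a hypothetical helper name.

```python
import math


def dz_magnitude(du, dv, length, scale):
    """|dz| for a segment with 2D endpoint differences du, dv."""
    return math.sqrt(length ** 2 - (du ** 2 + dv ** 2) / scale ** 2)


du, dv = 3.0, 0.0            # example 2D differences of one segment
true_l, true_s = 5.0, 1.0    # assumed ground-truth length and scale
reference = dz_magnitude(du, dv, true_l, true_s)

cases = [
    ("l 10% too big", true_l * 1.1, true_s),
    ("l 10% too small", true_l * 0.9, true_s),
    ("s 10% too big", true_l, true_s * 1.1),
    ("s 10% too small", true_l, true_s * 0.9),
]
for label, l, s in cases:
    estimate = dz_magnitude(du, dv, l, s)
    trend = "too big" if estimate > reference else "too small"
    print(f"{label:16s} dz = {estimate:.3f} ({trend}, true dz = {reference:.3f})")
```

Overestimating either l or s pushes dz above the true value, and underestimating either pulls it below.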

Underestimating the scale s ⇒ dz values get too small (same effect as underestimating the bone lengths):

Overestimating the scale s ⇒ dz values get too big (same effect as overestimating the bone lengths):