|
For another use of the term median in geometry, see Median (geometry).
The geometric median of a discrete set of sample points in a Euclidean space is the point minimizing the sum of distances to the sample points. This generalizes the median, which has the property of minimizing the sum of distances for one-dimensional data, and provides a central tendency in higher dimensions. It is also known as the Fermat–Weber point or 1-median.[1] The geometric median is an important estimator of location in statistics. It is also a standard problem in facility location, where it models the problem of locating a facility to minimize the cost of transportation. The special case of the problem for three points in the plane (that is, m = 3 and n = 2) is sometimes also known as Fermat's problem; it arises in the construction of minimal Steiner trees, and was originally posed as a problem by Pierre de Fermat to Evangelista Torricelli, who solved it. Its solution is now known as the Fermat point of the triangle formed by the three sample points. Alfred Weber's name is associated with the more general Fermat–Weber problem due to a discussion of the problem in his 1909 book on facility location. Wesolowsky (1993) provides a survey of the problem. See Fekete, Mitchell, and Beurer (2003) for generalizations of the problem to non-discrete point sets.
[edit] DefinitionFormally, for given a set of m points
Note that argmin means the argument for which the sum is minimized. In this case, it is the point y from where the sum of all Euclidean distances to the xi's is minimum. [edit] Properties
[edit] Special cases
[edit] ComputationDespite being an easy to understand concept, computing the geometric median poses a challenge. The centroid or center of mass, defined similarly to the geometric median as minimizing the sum of the squares of the distances to each sample, can be found by a simple formula — its coordinates are the averages of the coordinates of the samples — but no such formula is known for the geometric median, and it has been shown that no formula involving only arithmetic operations and kth roots can exist in general.[3] However, it is straightforward to calculate an approximation to the geometric median using an iterative procedure in which each step produces a more accurate approximation. Procedures of this type can be derived from the fact that the sum of distances is a convex function, since the distance to each sample point is convex and the sum of convex functions remains convex. Therefore, procedures that decrease the sum of distances at each step cannot get trapped in a local optimum. One common approach of this type, called Weiszfeld's algorithm[4], is a form of iteratively re-weighted least squares. This algorithm defines a set of weights that are inversely proportional to the distances from the current estimate to the samples, and creates a new estimate that is the weighted average of the samples according to these weights. That is, Bose et al (2003) describe more sophisticated geometric optimization procedures for finding approximately optimal solutions to this problem. [edit] Implicit formulaIf y is distinct from all the given points, xj, then y is the geometric median if and only if it satisfies: This is equivalent to: which is closely related to Weiszfeld's algorithm. If y is equal to some of the given points, then y is the geometric median if and only if there are vectors uj such that: where for xj ≠ y, and for xj = y, [edit] See also
[edit] Notes
[edit] References
Directorio de Enlaces Directorio dmoz Directorio espejo dmoz Pedro Bernardo |