A mathematical problem related to big data was solved by Jean-Francois Puget, an engineer in the Solutions Analytics and Optimization group at IBM France. The problem was first mentioned on Data Science Central, and an award was offered to the first data scientist to solve it.
Bryan Gorman, Principal Physicist and Chief Scientist at Johns Hopkins University Applied Physics Laboratory, made a significant breakthrough in July and won $500. Jean-Francois Puget completely solved the problem, independently of Bryan, and won a $1,000 award.

Figure: example of a rare, special permutation investigated to prove the theorem.
The competition was organized and financed by Data Science Central. Participants from around the world submitted a number of interesting approaches. The mathematical question was asked by Vincent Granville, a leading data scientist and co-founder of Data Science Central. Granville initially proposed a solution after performing large-scale Monte Carlo simulations, but his solution turned out to be incorrect.
The problem consisted of finding an exact formula for a new type of correlation and goodness-of-fit metric, designed specifically for big data, that generalizes Spearman's rank correlation coefficient and is especially robust for unbounded, ordinal data found in large data sets. From a mathematical point of view, the new metric is based on L1 rather than L2 theory: in other words, it relies on absolute rather than squared differences. Using squares (or higher powers) is what makes traditional metrics such as R squared notoriously sensitive to outliers, and is why savvy statistical modelers avoid them. In big data, outliers are plentiful and can render the conclusions of a statistical analysis invalid, so this is a critical issue. This outlier issue is sometimes referred to as the curse of big data.
-tj