Unraveling the mystery of multiple sclerosis with big data

Stuart Schlossman
MS Research Study and Reports

The State University of New York (SUNY) at Buffalo is home to one of the leading multiple sclerosis (MS) research centers in the world. It’s one spot where big-data-powered analysis is helping researchers understand potential causes and treatments of the disease, accelerating the race to a cure.
What causes MS is not precisely known. Currently it is believed to originate from some complex combination of a virus and gene defect(s), perhaps in association with such environmental factors as sunlight and cigarette smoke.
Dr. Murali Ramanathan is co-director of the Data Intensive Discovery Initiative at the SUNY research center. A technique developed there, called AMBIENCE, enables researchers to search efficiently for interactions among multiple genetic variations, called single nucleotide polymorphisms (SNPs, pronounced “snips”), and environmental factors that raise the risk of developing multiple sclerosis.
The data sets used in this multivariable research total more than 250TB, and the analysis is computationally demanding because the researchers are looking for significant interactions among thousands of genetic and environmental factors.
In this research, there were two main issues to overcome: crunching through the immense data set and achieving sophisticated and easily customizable analytic models across a wide range of data sets. The researchers wanted to see not only which individual variables were significant, but also which combinations of variables stood out.
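To make the idea of "combinations of variables that stand out" concrete, here is a minimal Python sketch. It is not the AMBIENCE method itself; it only illustrates one generic way to score pairs of variables, by asking how much more a pair reveals about an outcome than its two members do separately (using Shannon mutual information). The variable names (snp_a, smoker, snp_b) and the toy data are made up for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(values):
    """Shannon entropy (in bits) of a sequence of hashable values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """I(X; Y) = H(X) + H(Y) - H(X, Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def rank_pairs(features, outcome):
    """Score each pair of variables by the information the pair carries
    about the outcome beyond what the two variables carry individually."""
    single = {name: mutual_information(col, outcome)
              for name, col in features.items()}
    scores = []
    for a, b in combinations(features, 2):
        joint = list(zip(features[a], features[b]))
        gain = mutual_information(joint, outcome) - single[a] - single[b]
        scores.append((gain, a, b))
    return sorted(scores, reverse=True)


# Hypothetical toy data: the outcome is the XOR of snp_a and smoker,
# so neither variable is informative alone, but the pair is.
features = {
    "snp_a":  [0, 0, 1, 1, 0, 0, 1, 1],
    "smoker": [0, 1, 0, 1, 0, 1, 0, 1],
    "snp_b":  [0, 0, 0, 0, 1, 1, 1, 1],
}
outcome = [0, 1, 1, 0, 0, 1, 1, 0]

ranked = rank_pairs(features, outcome)
for gain, a, b in ranked:
    print(f"{a} + {b}: {gain:.2f} bits")

# The search space grows quickly with the number of variables:
print(math.comb(1000, 2))  # 499500 pairs among 1,000 variables
print(math.comb(1000, 3))  # 166167000 triples among 1,000 variables
```

In this toy example only the snp_a and smoker pair scores above zero, even though each variable looks uninformative on its own. The last two lines hint at why an exhaustive search over thousands of factors becomes so computationally demanding.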
Running the required algorithms on sample data with commodity hardware took almost a week. It quickly dawned on the researchers that running them on all the data would take many weeks, and that the results would in turn lead to additional questions, algorithm adjustments, data changes, and so on.