Mountains of health care data are being collected from many avenues: research studies, clinical trials, electronic medical records and even smartphones.

Researchers around the globe are mining these data in depth, looking for patterns that offer insight into the symptoms and causes of diseases and that could improve prediction and prevention.

But all the data in the world can’t solve problems unless the right questions are being asked. And researchers at the school of nursing are doing just that.


The Mutation of Genetic Factors in Obesity-related Cancers

Dr. Su Yon Jung works in the relatively new field of molecular genetic cancer epidemiology, examining how lifestyle factors and genetic variants interact in disease outcomes. By definition, epidemiology is the study (scientific, systematic, and data-driven) of the distribution (frequency, pattern) and determinants (causes, risk factors) of health-related outcomes in specified populations (neighborhood, school, city, state, country, global).

In cancer epidemiology, behavioral and observational studies have been conducted in parallel with genome-wide association studies. Cancer researchers widely agree that one-sided research (either behavioral or genomic) cannot fully capture the essential risk factors for cancer.

“Gene-behavior interaction research provides different results than the genetic or behavioral study alone,” said Jung. “For example, a small number of people may have the BRCA gene, but for those that do, if you are also obese, you have increased risk of cancer.”

She has used data from the Women’s Health Initiative dbGaP Study, which has recruited post-menopausal women from large academic centers around the United States to address the most common causes of death, disability and impaired quality of life. This initiative looked at cardiovascular disease, cancer, and osteoporosis in this population.  More than 160,000 women have been involved over the past 20 years.


Jung found, for example, that obesity interacts with genetic variants and that both factors play a key role in altering the risk of cancer.

“This is all very exciting to me and my colleagues. If we can link behaviors and genetics, researchers can target efforts within the group with risk genotypes to promote intervention strategies. We can also develop data on potential genetic targets in clinical trials for cancer prevention and intervention to reduce cancer risk.”

What Came First – Sleep Apnea or Hypertension?

Dr. Paul Macey’s research has long focused on sleep apnea. Now, working with data gathered by the UCLA Health System, he is examining more than 200,000 records to determine whether sleep apnea leads to hypertension or hypertension leads to sleep apnea.

“People have believed that sleep apnea leads to hypertension, but it is not clear or certain that this is what is going on,” said Macey. “So we are looking at all the records to see if someone was diagnosed with both, which came first.”

He likes using the large data set that comes from the hospital system because “it is hard for most researchers to individually recruit such a large group of people.  Large data allows us to look at patterns.”

“Going beyond those diagnoses, big data allows us to ask questions about whether there are ways to predict that someone will get sleep apnea. If we go back a year earlier and look at their lab values or prescriptions or how often they were visiting their primary care physician—is there any pattern?”

Macey theorizes that if such a correlation exists, the medical record of a patient with these factors could be flagged so that a doctor might recommend a sleep study.

Macey says the possibilities with data are endless. “We can put in all sorts of variables and determine whether we can predict something; for example, a prescription for anti-anxiety medicine plus five primary care visits plus a diagnosis of depression might mean a 95% chance of hypertension.”

“I believe that using big data can transform the practice of health care and our search for what is making people get sick, especially for chronic diseases.”

Understanding High-Risk Behavior

Dr. Dorothy Wiley collects her own “big data.” Over the past 25 years, she has studied under, and since worked with, mentors who led the largest study of HIV-infected and uninfected gay, bisexual and other men who have sex with men (MSM) across the United States. These men enrolled when they were 30 or older, either in 1984-85 or in 2001-03, when the study reopened for enrollment. She and her team study data on these men, who have been examined twice a year, some of them for up to 33 years.

Her most recent results found that older HIV-positive MSM are at higher risk of becoming infected with the HPVs that cause most anal cancers.

“Invasive anal cancer is a health crisis for gay, bisexual and other men who have sex with men,” she said. “Right now, invasive anal cancer rates among HIV-infected men who have sex with men surpass rates for seven of the top 10 cancers in men.”

Interestingly, other behavioral factors also affect the risk of infection. Tobacco has long been associated with HPV cancers, such as cervical cancer, in women. Likewise, Wiley and her team report that smoking increases the risk of infection with specific types of HPV among both HIV-infected and uninfected older men by up to 20%. Similarly, they recently reported that testosterone levels measured in blood were linked to increased risk of infection with high-risk HPV. Again, while female sex hormones are linked to HPV infections and cervical cancer in women, no prior studies have evaluated testosterone in men.

She is also a mentor to three students who are using data from the Healthcare Cost and Utilization Project (HCUP), the most comprehensive source of hospital care data in the United States, which includes information on in-patient stays, ambulatory surgery and services visits, and emergency department encounters. HCUP allows researchers and others to study health care delivery and patient outcomes over time at the national, regional, state, and community levels.

“If they tried to recruit their own participants, they might get 25, 50, maybe 100 participants, which doesn’t really allow you to define a pattern,” said Wiley.

One student is looking at individuals who come to the emergency room with PTSD versus other mental illnesses and whether the PTSD diagnosis changes the probability of hospitalization. Another is looking at the trend in soccer and futsal (a derivative of soccer played with five-man teams on a basketball-style court) injuries, particularly in children. Is there a similarity in injuries and, if so, can they be prevented somehow? Finally, a third student is looking at whether individuals with breast cancer have better survival rates if they participate in clinical trials; in other words, what is the value of participating in clinical trials?

The Future of Data

Not all research uses big data, but for the research that is using big data, the right questions need to be asked. And these researchers are asking the right questions for discoveries that will make a difference in improving health.