Big data isn’t dead, says AI expert


It takes a while to scroll all the way to the bottom of the Google Scholar page for Atif Farid.

As senior research scientist at Ontrak, Inc. and AI professor at the University of North Carolina Charlotte, Farid has authored or co-authored over 50 research articles on topics ranging from cloud computing to service-oriented architecture frameworks for small spacecraft control. He holds licenses and certifications in machine learning, data science, blockchain, and parallel computing, among other topics. In his spare time, he volunteers as a research scientist for NASA at the Squirrel Valley Observatory in western North Carolina, watching the skies for asteroids and other near-Earth objects with the potential to make planetary conditions inhospitable for humans and pretty much all non-tardigrade lifeforms.


This afternoon he will lead a breakout session on the use of AI in healthcare at the AIM Institute’s virtual Heartland Developer Conference.

Farid’s presentation will touch on a variety of issues, processes and technologies related to applying AI to the healthcare field, including predictive analysis, deep learning, image recognition, natural language processing, data gravity and big data.

He will discuss some of his own research as well as research produced by his company, Ontrak (formerly Catasys), which helps people whose untreated behavioral health problems worsen chronic medical conditions improve their health and reduce their medical expenses.


Farid said he is especially excited to share an innovation he devised that uses gated recurrent units, a type of recurrent neural network used in deep learning, to build a framework for transferring knowledge from social media to the mental health domain.

“Whoever speaks with us has a digital footprint somehow, whether it be on Facebook, Instagram, Twitter, or some sort of online forum where they were sharing ideas,” he said. 

Relying on such publicly available datasets helps inform technologies relevant to patients, Farid said, and leads to treatments that factor in the influence of a person’s ZIP code and social environment down to the neighborhood level. For example, a person from around 96th Street and 2nd Avenue in Manhattan might respond well to a bedside manner that could alienate someone from Grand Forks, North Dakota. And vice versa.

“There’s a significant difference,” Farid said. “A New Yorker will go and speak to the point. Whereas in Iowa City, Iowa (for instance), they will be very relaxed, and they will be talking to you about weather and the landscape and all that.” 

He’s not just saying this. The “geo-dispersion” of communication styles is borne out by a dataset of several terabytes, built from textual data sets accumulated over multiple years. Now that’s big data.

“People might say big data is dead, Hadoop is dead,” Farid said. “People who say that don’t know Hadoop or big data at all.”

He likened those who make such sweeping generalizations to amateurs who think they can fly a plane because they’ve played a lot of Flight Simulator.

Natural language processing, unsurprisingly, will take a starring role in Farid’s breakout session. 

Despite the prevalence of speech-to-text, text-to-speech, automated online assistants and apps like Google Translate, he believes we have barely scratched the surface of what natural language processing can do.

“Natural language processing is still in its infancy. When an infant is born, the very first thing they do is listen. And then they see, and then their mind is absolute ground-zero to start embedding information. That’s where natural language processing is at this moment,” he said.

For a field still in its infancy, Farid has been at it for a long time. His 1996 master’s thesis was on the use of machine language to extract English translations of Urdu, the national language of Pakistan. The following year, he published an article on the use of AI to recommend homeopathic medicines to patients.

“I was, like, born to do this,” he laughed.

Editor’s note: a previous version of this article contained superfluous details about the data collection process and has been revised to be more concise.

This story is part of the AIM Archive

This story is part of the AIM Institute Archive on Silicon Prairie News. AIM gifted SPN to the Nebraska Journalism Trust in January 2023.
