Data Science: Refining Data into Knowledge, Turning Knowledge into Action

Jennifer Phillips-Cremins, Rob Riggleman, Dan Roth, (upper row, left to right) Victor Preciado, Eric Stach, and Paris Perdikaris (bottom row, left to right)
Jennifer Phillips-Cremins, Rob Riggleman, Dan Roth, (upper row, left to right) Victor Preciado, Eric Stach, and Paris Perdikaris (bottom row, left to right) each use elements of data science in their fields of research, which cut across topics as diverse as genetics, medical imaging, materials design and more.

More data is being produced across diverse fields within science, engineering, and medicine than ever before, and our ability to collect, store, and manipulate it grows by the day. With scientists of all stripes reaping the raw materials of the digital age, there is an increasing focus on developing better strategies and techniques for refining this data into knowledge, and that knowledge into action.

Enter data science, where researchers try to sift through and combine this information to understand relevant phenomena, build or augment models, and make predictions.

One powerful technique in data science’s armamentarium is machine learning, a type of artificial intelligence that enables computers to automatically generate insights from data without being explicitly programmed as to which correlations they should attempt to draw.

Advances in computational power, storage, and sharing have enabled machine learning to be more easily and widely applied, but new tools for collecting reams of data from massive, messy, and complex systems—from electron microscopes to smart watches—are what have allowed it to turn entire fields on their heads.

“This is where data science comes in,” says Susan Davidson, Weiss Professor in Computer and Information Science (CIS) at Penn’s School of Engineering and Applied Science. “In contrast to fields where we have well-defined models, like in physics, where we have Newton’s laws and the theory of relativity, the goal of data science is to make predictions where we don’t have good models: a data-first approach using machine learning rather than using simulation.”

Penn Engineering’s formal data science efforts include the establishment of the Warren Center for Network & Data Sciences, which brings together researchers from across Penn with the goal of fostering research and innovation in interconnected social, economic and technological systems. Other research communities, including Penn Research in Machine Learning and the student-run Penn Data Science Group, bridge the gap between schools, as well as between industry and academia. Programmatic opportunities for Penn students include a Data Science minor for undergraduates, and a Master of Science in Engineering in Data Science, which is directed by Davidson and jointly administered by CIS and Electrical and Systems Engineering.

Penn academic programs and researchers on the leading edge of the data science field will soon have a new place to call home: Amy Gutmann Hall. The 116,000-square-foot, six-floor building, located on the northeast corner of 34th and Chestnut Streets near Lauder College House, will centralize resources for researchers and scholars across Penn’s 12 schools and numerous academic centers while making the tools of data analysis more accessible to the entire Penn community.

Faculty from all six departments in Penn Engineering are at the forefront of developing innovative data science solutions, primarily relying on machine learning, to tackle a wide range of challenges. Researchers show how they use data science in their work to answer fundamental questions in topics as diverse as genetics, “information pollution,” medical imaging, nanoscale microscopy, materials design, and the spread of infectious diseases.

Read more about how Jennifer Phillips-Cremins, Rob Riggleman, Dan Roth, Victor Preciado, Eric Stach, and Paris Perdikaris use data science at Penn Today, or in Penn Engineering Magazine.