Penn Engineers: New Machine Learning Approach Predicts Which 2D Materials Can Be Synthesized

By Lauren Salig

Machine learning and artificial intelligence are being applied to an increasing number of tasks, from recognizing faces in photos, to recommending movies, even to driving cars. The key ingredient that enables machine learning to be so effective is the availability of staggering amounts of labeled data. People have long been labeling data for Google, Facebook and Netflix by tagging friends in pictures, identifying stop signs in grainy images before logging in, and rating films and TV shows.

But using machine learning in materials science, which attempts to design and make materials for use in future technologies, has proven to be more difficult due to the lack of labeled data in the field. In materials science, data about materials that have successfully been created is considered labeled data, but information about the vast pool of materials postulated but yet to be synthesized is unlabeled. As such, creating new materials can feel a bit like guesswork for scientists, but a recent study at Penn strives to bring more clarity to the synthesis of new materials through an innovative machine learning technique.

Vivek Shenoy, Eduardo D. Glandt President’s Distinguished Professor with appointments in Materials Science and Engineering, Mechanical Engineering and Applied Mechanics, and Bioengineering, oversaw the study, which was led by Nathan Frey, a graduate student in Shenoy’s group and a National Defense Science and Engineering graduate fellow. Collaborators from Drexel University and the University of Puerto Rico at Mayagüez also contributed to the research.

The study was published in the journal ACS Nano.

Shenoy and Frey set out to apply machine learning to materials science, specifically focusing on the creation of two-dimensional (2D) materials, or materials with a thickness of just one or a few layers of atoms. Much of the research currently available on 2D materials focuses on demonstrating the possible synthesis of 2D materials and their potential to hold unique, useful properties. Theoretical calculations posit many promising 2D materials, but only a handful have proved possible to actually synthesize. Shenoy and Frey’s goal was to distinguish synthesizable materials from the illusory materials that are possible in theory but not in reality.

“The problem is, in general, we have no idea which of these proposed materials can be made in the lab,” Frey says. “It’s kind of like having a gallery of real, original paintings, and we’d like to buy some more but we need to be able to tell originals from fakes. It’s a hard problem, but if we study the originals enough and identify what makes them unique, we can learn to recognize the forgeries.”

Distinguishing 2D materials that are synthesizable masterpieces from their “fake” counterparts is a difficult challenge for materials scientists. Experimentally testing materials synthesis is an expensive, time-consuming process fraught with unpredictability, and scientists also lack a solid body of labeled data, or known data about synthesized materials, on which to base typical machine learning approaches. Shenoy and Frey addressed this problem by using a different machine learning technique that focuses on analyzing successful examples of 2D materials: the masterpieces we already have.

“We applied a machine learning method called ‘positive and unlabeled’ learning to the problem of figuring out which materials should be the easiest to synthesize in the lab,” says Frey. “Our ‘positive’ data are the materials that have already been successfully made. We invent new materials by taking the existing ones and imagining switching the atoms that make them up with similar elements from the periodic table. All these theorized materials are ‘unlabeled,’ because we don’t know if they can be made or not.”
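To make the idea concrete, here is a minimal sketch of one common way to do positive and unlabeled learning, written in Python. It is an illustration only, not the model used in the study: the article does not describe the team's algorithm or the material descriptors it used. The two-number feature vectors, the function name pu_bagging_scores and the nearest-centroid rule are all assumptions made for this example. In each round, a random sample of the unlabeled materials is provisionally treated as "negative," and the remaining unlabeled materials are scored by whether they sit closer to the known positives or to that provisional negative group. Averaging over many rounds gives each unlabeled material a score between 0 and 1.

```python
import random
from statistics import fmean


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [fmean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pu_bagging_scores(positives, unlabeled, rounds=200, seed=0):
    """Score each unlabeled item by how often it looks 'positive'.

    Hypothetical helper for illustration. Each round draws a bootstrap
    sample of unlabeled items as provisional negatives, then lets every
    unlabeled item left out of that sample vote: positive if it is
    closer to the positive centroid than to the provisional-negative one.
    """
    rng = random.Random(seed)
    n = len(unlabeled)
    votes = [0] * n
    counts = [0] * n
    pos_c = centroid(positives)
    for _ in range(rounds):
        # Bootstrap sample (with replacement), deduplicated into a set.
        bag = set(rng.choices(range(n), k=len(positives)))
        neg_c = centroid([unlabeled[i] for i in bag])
        for i, x in enumerate(unlabeled):
            if i in bag:
                continue  # only score items not used as negatives
            counts[i] += 1
            if sq_dist(x, pos_c) < sq_dist(x, neg_c):
                votes[i] += 1
    # Items that were never left out get a neutral score of 0.5.
    return [v / c if c else 0.5 for v, c in zip(votes, counts)]


# Made-up descriptors: three "already synthesized" materials ...
positives = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1]]
# ... and six theorized ones, two of which resemble the positives.
unlabeled = [[1.1, 1.0], [5.0, 5.0], [4.8, 5.2],
             [5.1, 4.9], [0.95, 1.05], [5.2, 5.1]]

scores = pu_bagging_scores(positives, unlabeled)
for features, score in zip(unlabeled, scores):
    print(features, round(score, 2))
```

In this toy data, the theorized materials whose features resemble the known positives receive high scores and the rest receive low ones. A real application would swap in physically meaningful descriptors and a stronger classifier, but the structure stays the same: no material is ever labeled a confirmed failure.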

In this way, the team used the limited data available to predict whether the synthesis of specific 2D materials is likely to succeed or fail. If scientists could consistently apply machine learning in this way, they could focus on making materials that have a high likelihood of successful synthesis and integration into future technology and avoid expending time and resources on materials that will likely fail.

In this study, Shenoy and Frey focused on a family of layered materials known as MAX phases that can be chemically altered to create a class of 2D materials called MXenes, which are of particular interest for real-world applications.

“The 2D nature of these MXenes imparts them with all sorts of interesting properties that aren’t seen in conventional 3D materials. MXenes in particular have an amazingly wide range of applications from energy storage to water purification and biosensing,” says Frey.

The research team’s “positive and unlabeled” machine learning model predicted 18 MXene compounds that are good candidates for experimental synthesis. Some of these potential compounds contain elements never seen before in MXenes, expanding the list of 2D materials that are alluring options for future industry applications.

Although, at this point, the research simply suggests that certain 2D materials could be successfully created, Frey is already thinking about how, once synthesized, those materials could become critical components in the coming waves of technological progress.

“These materials could be used in next-generation battery technologies or as building blocks for information processing platforms that surpass currently available computers,” says Frey.

In addition to Shenoy and Frey, the study's contributors were Jin Wang, a postdoctoral researcher in the Department of Materials Science and Engineering; Gabriel Iván Vega Bellido, an undergraduate student from the University of Puerto Rico participating in the NSF-MRSEC summer REU program; and two collaborators from the Drexel Nanomaterials Institute, Babak Anasori and Yury Gogotsi.

The research was supported by the Army Research Office through contract W911NF-16-1-0447, the National Science Foundation through grants EFMA-542879, CMMI-1727717, DMR-1740795 and MRSEC DMR-1720530, the Department of Defense through the National Defense Science & Engineering Graduate Fellowship Program, and the National Institute of General Medical Sciences of the National Institutes of Health under Award Number T34GM008419.