New cars are packed with helpful technology. Downward-facing cameras help drivers stay within lanes, and adaptive cruise control can brake and accelerate a vehicle based on other drivers’ speeds. Likewise, banks use encryption software that changes your banking information into code that only your bank can use and read, and bank software even analyzes financial markets to make investments.
These features are based on software systems that rely on over 100 million lines of code, with separate programs for each component of each system. But as technology evolves, the software behind these systems needs to keep up.
Stephanie Weirich, ENIAC President’s Distinguished Professor in Computer and Information Science, aims to make software systems more reliable, maintainable and secure. Her research improves tools that help programmers to determine the correctness of their code, which is applicable to a broad scope of software. Specifically, Weirich researches and improves Haskell, a programming language that places a lot of emphasis on correctness, thanks to its basis in logic and mathematical theories.
“People might not realize how much computational power underlies our society,” Weirich says. “Cars, for example, possess very strong correctness requirements as they have become so reliant on computation. If banks mess up their code, it can cause disaster for our financial systems. The security and correctness of these programs is very important.”
If a hacker goes after the software behind a less-correct (and thus less-protected) component of a car, such as the brakes, the results could be dangerous and devastating. Not only could the hacker gain control, but because each individual system is interconnected as a whole, the other programs for different components could be prone to errors as well. Similarly, if a driver is using lane-keeping assist and adaptive cruise control on the highway, a bug in a less-correct braking system might tell the adaptive cruise control that the car is braking when in fact it isn’t, which could be deadly.
“Automation is everywhere. Cars today are just computers that have a steering wheel instead of a keyboard,” says Antal Spector-Zabusky, one of Weirich’s doctoral students. “It’s very important that these computers are as reliable as possible so that everything functions correctly, along with every other interlocking system, to prevent software from crashing and to ward off hackers.”
TRAVELING ACROSS LANGUAGES
Programs for embedded systems, like those found in cars, are typically written in a programming language called “C.” Programmers make sure that their software will use data correctly by combining relevant variables into classifications known as “data types.” Types are what allow a programmer to assign rules to all of the different components of a computer program.
Weirich focuses on Haskell because she uses it to improve the type system of the language itself, which leads to even more extensive correctness for programmers. She’s making the types more expressive, and as a result, programmers can make better use of the type system to help them develop correct code. For example, you could represent a date, such as February 29, 2019, using three integers: 2, 29 and 2019. However, the non-expressive “integer” type does not capture the relationship between these numbers. A more expressive type used for storing dates would flag this value as invalid by encoding the fact that February only has 28 days in non-leap years.
While these tools and systems are not directly usable across different languages, the ideas are. For example, Weirich says Mozilla’s Rust language, a new programming language similar to C, draws from research on type systems, such as the type system research in the Haskell community. Wherever they’re implemented, the more expressive the type system, the more it can check complex, intricate relationships between components of a program. By contrast, a less expressive type system might not be able to detect when such relationships are violated when the program is compiled, resulting in errors and incorrect behavior at runtime.
Stronger types and better system verification software allow programmers to ensure they’re writing code correctly. Weirich has also worked with Spector-Zabusky to improve Haskell’s compiler, which is what turns the Haskell language into a language used by computers.
“Instead of getting rid of bugs afterward, you get rid of them in the first place,” Weirich says. “The idea is that since you’re ruling out bugs at the beginning by how you are defining your types, you might be shortening development time. Also, because you don’t have to implement a wrong program and then redo that program, you’re shortening the maintenance time, because the compiler can help you figure out what part of the code can be changed.”
DEEPSPEC
Many professors and students in the Department of Computer and Information Science collaborate in a group called Programming Languages @ Penn, or PLClub. This includes Weirich and Spector-Zabusky, who have been working on a project called DeepSpec, a National Science Foundation flagship program, more formally known as “Expeditions in Computing: The Science of Deep Specification.” The DeepSpec project is a collaboration between Penn, MIT, Princeton and Yale.
“DeepSpec is examining this question: What does it really take to specify software correctly?”, says Spector-Zabusky. “We want to specify software that is used in the real world.”
To specify software is a fundamental component of ensuring that a program is as correct as possible. Specifications range in intensity, all the way from simple specifications, such as ensuring that an application won’t crash when it is used, to deep specifications, which could include ensuring that a complex numerical simulation computes correctly.
Weirich’s research directly informs the DeepSpec project, particularly her work with SpectorZabusky to verify the Haskell compiler. The group aims to develop computer system verification for an entire computer system, which includes the operating system, hardware code and every other component. This takes correctness properties a step further, or deeper, than types can, resulting in a higher degree of confidence in these systems.
COMPLEXITY IN CS
Computer science’s complexity is what originally attracted Weirich to the field.
“Everything changes rapidly, and there’s always new stuff,” she says. “Computer science is very broad, so it would be impossible to keep up with every aspect of every field. It makes more sense to gain expertise in specific areas.”
Weirich has been accumulating expertise in statically typed programming languages like Haskell for over twenty years. She continues to do so, and students from all corners of the University, from freshmen to doctoral candidates, benefit tremendously.
This semester, Weirich is teaching CIS 552: Advanced Programming to graduate students and select undergraduates.
“In Advanced Programming, I demonstrate ideas that are most expressible in the Haskell language,” Weirich says. “I take ideas from my research and get to teach them to people who want to become software developers. This gives them not only a new way to develop code, but also a new perspective on programming.”
In CIS 120: Programming Languages and Techniques, which Weirich will teach in spring 2020, she introduces freshmen to computer science through program design. She says she enjoys teaching this course, partly because she sees students progress from battling the difficult content to understanding it.
“Overall, undergraduates recognize that so many different fields now rely on computation,” she says. “There’s a big distribution in skill level and understanding, so throughout the semester, it’s rewarding to see that switch to understanding at different points for different students.”
This article appeared in the Fall 2019 edition of Penn Engineering Magazine.