This is the first in a series of interviews we plan to do with programming languages researchers working in industry.
In this post, I interview Facebook’s Avik Chaudhuri, who has worked on language implementations at Facebook and Adobe (and is an alumnus of our group here at PLUM). Thanks, Avik, for taking the time to do this!
The interview is broken into three parts: Background; Facebook’s new language, Flow; and reflections on the value of a PhD and the challenges of research in industry.
Background
How did you become interested in programming languages?
I think my love for programming languages has a lot to do with the happiness I derive from programming. I was fortunate to be in a school where I learned programming very early (in LOGO, then BASIC). When I was an undergrad at IIT Delhi, one of my first courses introduced me to functional programming, which was a whole new world. I did an internship at INRIA Rocquencourt with Gerard Huet, where I wrote a distributed structured editor in OCaml, while soaking in lunch conversations about OCaml, Coq, and the like from the very creators of these incredible tools. During my undergrad thesis work with Sanjiva Prasad, I dabbled in things like polytypic programming and proof-carrying code, which exposed me to a lot of cool papers and advanced ideas in the field.
I did my PhD at UC Santa Cruz, advised by Martín Abadi. I learned about things like pi calculus, object types, and secure communication. Combining ideas from these diverse areas, I developed formal models of dynamic access control, which I used to analyze the designs of secure storage systems and operating systems. I continued this theme of work during a post-doc with Jeff Foster at Maryland, where I studied the security of web applications (Rails) and mobile applications (Android) using PL techniques (type systems, symbolic analysis). On the side, I also resumed working on pure PL topics like concurrency control and type systems for dynamic languages.
After your post-doc, you worked at Adobe Research. What projects did you work on there?
Serendipitously, while I was thinking about type systems for dynamic languages during my post-doc, an opportunity came by at Adobe to apply some of these ideas to a new version of ActionScript, the language of Flash applications. We developed a type inference algorithm that we could layer on to existing ActionScript programs to improve their performance without changing their runtime behaviors: the additional (but optional) types would simply enable more VM-level optimizations. Much of this work was done with an intern, Aseem Rastogi, and was published at POPL’12. More aggressive work followed, on new features and an improved VM, but ultimately the project was abandoned as the industry moved away from browser plugins and towards open standards.
Facebook and Flow
Most recently you’ve been at Facebook, working primarily on a new dialect of JavaScript called Flow. What is Flow, and how did it get started?
Flow is a gradually typed extension to JavaScript. This means that you can add type checking “gradually” to JavaScript programs, in exchange for incremental benefits such as finding bugs early, better tooling, and more code transformation opportunities.
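To make "gradually" concrete, here is a minimal sketch. It is written in TypeScript rather than Flow, whose syntax and inference differ in detail, and the function names are hypothetical rather than taken from the interview. The first function opts out of checking with `any`; the second is the same code after annotations are added, at which point a type checker can reject mistaken calls before the program runs.

```typescript
// Step 1: unchecked code. The `any` annotation opts `items` out of type
// checking, so the checker accepts every call, including mistaken ones.
function totalLengthUntyped(items: any): number {
  let total = 0;
  for (const item of items) {
    total += item.length;
  }
  return total;
}

// Step 2: the same function with a precise parameter type.
// Its runtime behavior is unchanged; only the checking is stricter.
function totalLength(items: string[]): number {
  let total = 0;
  for (const item of items) {
    total += item.length;
  }
  return total;
}

const viaUntyped = totalLengthUntyped(["ab", "cde"]);
const viaTyped = totalLength(["ab", "cde"]);

// A call such as totalLength([1, 2]) is now reported by the type checker,
// whereas totalLengthUntyped([1, 2]) is accepted without complaint.
```

The two versions can coexist in one codebase, which is what lets a team add types file by file instead of all at once.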
When I arrived at Facebook in late 2013, Facebook programmers had spent a few years using an internally developed, gradually typed extension to PHP called Hack, and this use had moved them towards realizing the value of static typing, especially for large codebases. Developers missed the benefits of static typing when they had to switch from PHP (which powers the Facebook back-end) to JavaScript (for the front-end), and so the time was ripe for something like Flow.
What’s been happening with Flow, and what’s next?
Flow development has moved extremely quickly: in a little more than a year, with a very small team, we went from the first line of code to an alpha-version open-source launch (in November 2014). Facebook has a culture of moving fast and open-sourcing its projects early: our aim was not to perfect everything before releasing the code, but instead incrementally improve it in public. We also managed to reuse a lot of Hack’s system infrastructure code, letting us focus on implementing the type system.
So far, a few teams have begun using Flow inside Facebook, and adoption is growing pretty fast overall: eventually, we want to see most of our JavaScript type checked by Flow.
There has also been a lot of interest in Flow externally: the topic of adding types to JavaScript has heated up in recent times, and developers seem to like that Flow encourages the use of JavaScript’s powerful features, working with them rather than against them. Since the launch, the project has seen lots of downloads on GitHub and some notable external contributions. The project is still in its early days: developers regularly find bugs, but instead of complaining, they mostly help in getting them fixed. This aspect of outside involvement has been very successful. We hope that as we continue to work on ironing out issues, Flow’s external adoption will continue to increase organically.
How have the PL ideas learned in your academic training been important in the development of Flow?
There was a lot of influence of academic PL ideas in the early design of the type system, and in thinking about scalability. For example, we needed to be very careful about deciding what kind of type inference we wanted: trading off time and space complexity against precision, and having enough expressivity to model JavaScript’s sophisticated semantics while remaining decidable. So we did a thorough study of the state of the art. System issues were important too: e.g., to scale seamlessly to millions of lines of code, we needed to plan for parallelization and incrementalization.
That said, as the design and implementation evolved, we chose to accumulate some technical debt in exchange for moving fast and catering to a variety of special cases. The expectation is to formalize parts of the type system and prove theorems about its guarantees sometime soon.
What are some of the novelties of Flow that are important in practice but undervalued in the research literature? What are warts/realities that follow from the pressures of large-scale use?
Flow tries very hard not to disrupt the developer’s “flow.” In particular, it does not introduce any noticeable delay into the fast edit-run cycle of the typical JavaScript developer. To achieve this, Flow performs checking in the background, maintaining the state of the codebase in a persistent server, so that queries from the editor can be met with an effectively instantaneous response; furthermore, when files change, the server proactively and incrementally type checks these files to update its state as quickly as possible. Flow also employs massive parallelization at each stage of its type checking. This is possible due to specific choices in the design of the type system that enable modular analysis and separate compilation.
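One generic way to picture incremental rechecking is sketched below in TypeScript. This is an assumption-laden illustration, not Flow's actual algorithm: the `DependencyGraph` type, the `filesToRecheck` function, and the file names are all hypothetical. The idea is that a long-running server keeps a map of which files import which, and when some files change it rechecks only those files and the files that depend on them, directly or indirectly.

```typescript
// Hypothetical representation: each file maps to the files it imports.
type DependencyGraph = Map<string, string[]>;

// Return the changed files plus everything that transitively imports them.
function filesToRecheck(imports: DependencyGraph, changed: string[]): Set<string> {
  // Invert the import edges: for each file, who depends on it?
  const dependents = new Map<string, string[]>();
  imports.forEach((deps, file) => {
    for (const dep of deps) {
      const list = dependents.get(dep) ?? [];
      list.push(file);
      dependents.set(dep, list);
    }
  });

  // Worklist traversal from the changed files along the inverted edges.
  const result = new Set<string>(changed);
  const queue = changed.slice();
  while (queue.length > 0) {
    const file = queue.pop() as string;
    for (const dependent of dependents.get(file) ?? []) {
      if (!result.has(dependent)) {
        result.add(dependent);
        queue.push(dependent);
      }
    }
  }
  return result;
}

// Usage: app.js imports util.js, which imports base.js; other.js is unrelated.
const graph: DependencyGraph = new Map<string, string[]>();
graph.set("app.js", ["util.js"]);
graph.set("util.js", ["base.js"]);
graph.set("other.js", []);

const stale = filesToRecheck(graph, ["base.js"]);
```

In this example, editing `base.js` marks three files for rechecking and leaves `other.js` alone. A type system designed for modular analysis can shrink that set further, since a dependent only needs rechecking when the types a file exposes actually change.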
Flow accommodates a lot of common JavaScript idioms that arise from a combination of powerful language features. Flow does this by employing data-flow and control-flow analysis (the kind of analyses compilers do to transform code) and feeding the precise information obtained from such analysis back into a subtyping-based inference algorithm. In other words, by modeling the flow of values precisely through code, abstracting this information with types, and using logical reasoning, it can find far more bugs with far less manual effort, and the precision of inferred types is also much better than in most traditional gradual type systems. Doing this well pays off handsomely, because it means that developers don’t have to change the way they code in JavaScript to use Flow.
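A small example of the kind of idiom this covers is the ordinary null check. The sketch below is in TypeScript, which performs a comparable control-flow-based narrowing; it is not Flow code, and unlike Flow's inference it relies on explicit parameter annotations. The `User` type and `displayName` function are hypothetical.

```typescript
interface User {
  name: string;
  nickname?: string; // optional: may be undefined at run time
}

function displayName(user: User | null): string {
  if (user === null) {
    return "anonymous";
  }
  // Past the early return, the checker knows `user` is a User, not null,
  // so property access is allowed without a cast.
  if (typeof user.nickname === "string") {
    // Inside this branch `user.nickname` is known to be a string.
    return user.nickname.toUpperCase();
  }
  return user.name;
}
```

The code is written the way a JavaScript developer would already write it; the checker follows the tests and early returns rather than demanding a different style.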
Ultimately, the driving force behind Flow is to add real value to developers: increasing their productivity by finding bugs early and providing them better tooling, while not requiring a lot of effort in exchange for those benefits. These constraints are hard, and working within them has spurred several innovations. At the same time, some important academic concerns, such as soundness (or type safety), are not primary goals, even though they might closely coincide with providing real value. (That is, well-typed Flow programs are not guaranteed to be free of run-time errors, though those errors are within the scope of errors already possible in JavaScript.) For example, soundness would allow us to exploit type inference to aggressively transform code for refactoring or performance. We do try to aim for soundness, and in the future might rely more on dynamic checks to achieve strong soundness guarantees. Still, it might just take time to convince people that soundness is needed: developers are motivated by real evidence and not just “principles.” So we often have to consciously decide between short-term wins vs. long-term wins, advocating for prudent choices case by case.
A view from industry
What is your view of the value of a PhD at a non-research-oriented company like Google or Facebook? Why should people pursue a PhD if they aren’t going to end up at a university or research lab?
This is a very interesting point of conversation, and my views on it have evolved to the point that I see good arguments on both sides.
When I was in academia, I interviewed at companies where people with PhDs would insist that while they don’t publish on a regular basis, they are still doing important work: working on hard problems that have real impact. While I didn’t appreciate this idea then, it has grown on me since. A PhD can offer very deep knowledge in a particular area, and this can be an advantage, since it may be harder and may take longer to absorb that kind of knowledge indirectly from an unfocused process of self-study. Thus, doing a PhD may enable you to get ambitious projects off the ground sooner, having a better feel for what to avoid, what the most elegant or efficient way of doing something is, and where to push the state of the art.
Also, a PhD trains you to think independently, evaluate approaches, be persistent in the face of failure, and execute a multi-year project successfully with multiple intermediate results; I think this makes you overall more confident and better suited to take on (and provide leadership to) longer-term, technically difficult problems with unclear specifications. And the juiciest problems that require this kind of expertise don’t necessarily arise in academic research: indeed, it is just as common for industrial problems to question prevalent assumptions and trigger new lines of research. Certainly a lot of innovations at companies like Google and Facebook are driven in this manner.
That said, I work every day with lots of extremely talented engineers who have learned their craft solely from hands-on, real-world experience, and provide as much value to the company as several PhDs combined. Another interesting trend is that some of these astute programmers derive for themselves key ideas that people in academia have understood for some time. As an immediate example I come across often, JavaScript programming is becoming more functional and less stateful, since programmers can see that stateful code is harder to understand and easier to mess up.
What are the lessons you’ve learned in undertaking these projects?
Starting a new, groundbreaking project may be fun, but it is very important to have a convincing case on why that project is important, and to have (enough of) a concrete plan for how that project can be completed. Having a fun idea is seldom enough: you have to convince others that the project is worth doing, and is actually doable; otherwise you may end up wasting valuable resources.
Another thing I’ve found useful to remember is that “code wins arguments”: while others are debating the pros and cons of various approaches, a quick prototype helps narrow down the conversation drastically. A variant of the same principle is “done is better than perfect”: instead of making your customers wait a long time to get any value, it is often better to quickly give them something, and then add value over time. In the case that your product fails, you’ll be quicker to learn and adapt from that failure and try something else.
Finally, I’ve learned that it is very important to communicate your ideas and collaborate with your colleagues. Working alone is great for a while, but it doesn’t scale and often produces sub-optimal results: you need all kinds of help in delivering a successful product. At the least, communicating and collaborating offers different perspectives on the same problem, which not only helps generate completely new ideas, but also helps test your own ideas against new assumptions and thus makes them stronger.