DL vs Engineering - OpenAI weeks 7&8

openai Mar 30, 2020

These last couple of weeks have involved a lot of code, from my PR to huggingface, to my PR to Swift for TensorFlow, to my current work on implementing a text-to-SQL model on the Spider dataset. Since that time was spent mostly coding, I feel like I have picked up on a few differences between deep learning and engineering.

As I've mentioned in previous posts, I come from a software engineering background. I've deployed client projects in Vue.js, Scala, SwiftUI, Kotlin, and Node.js. What that means is that I have a decent amount of experience in both frontend and backend development. I realized that deep learning definitely has a different feel and workflow than fullstack development. What follows are some of the differences I've noticed as a novice in deep learning.

Feedback

The feedback loop is the biggest difference I've noticed between DL and engineering. The time spent between writing code and seeing the results can vary dramatically.

For example, when I'm writing code to load or prepare a dataset, it's very comparable to writing backend code. In both cases I want to take data from one place, process it, and return it to another place. Verifying that code can be pretty straightforward: just run some data through it. It's even possible to write test cases for the dataset.
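As a minimal sketch of what such a test case could look like, assume a hypothetical `preprocess` function that normalizes a question and splits it into tokens. The function, its field names, and the sample record are stand-ins for illustration, not code from a real project.

```python
def preprocess(example):
    """Hypothetical preprocessing step: normalize the question and tokenize it."""
    question = example["question"].strip().lower()
    return {"tokens": question.split(), "query": example["query"]}


def test_preprocess():
    raw = {
        "question": "  How many singers do we have? ",
        "query": "SELECT count(*) FROM singer",
    }
    out = preprocess(raw)
    # Whitespace is trimmed and the text is lowercased before splitting.
    assert out["tokens"] == ["how", "many", "singers", "do", "we", "have?"]
    # The target query is passed through untouched.
    assert out["query"] == raw["query"]


test_preprocess()
```

A check like this runs in milliseconds, which is what makes this part of the work feel so much like ordinary backend development.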

Even the first steps of creating a model are similar to engineering. An architecture or design pattern is chosen, and you can move forward implementing and connecting the pieces. If you run into an error at run time, it's not incredibly hard to debug, since you usually know where to look based on the stack trace.

Where it gets rough is when you start training the model on the data. This is not comparable to any other type of programming I've experienced before. Your model could output NaNs after training for a while, and it takes some time to understand where they came from. Your model could output a negative loss (you probably want a positive one) and you don't know what's causing it. Maybe your model doesn't learn at all; where do you even start to debug that? Finding these sorts of issues, at least for me, is very time consuming, and it seems like they can only be solved through a lot of trial and error.
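One thing that can shorten the search a little is failing loudly at the first bad step instead of noticing the problem hours later. The sketch below is only an illustration: `check_loss` is a hypothetical helper, the loss values are made up, and it assumes a loss such as cross-entropy that should never be negative (some losses legitimately are, hence the `allow_negative` flag).

```python
import math


def check_loss(loss, step, allow_negative=False):
    """Raise as soon as the loss looks wrong, so the failing step is known."""
    if math.isnan(loss) or math.isinf(loss):
        raise ValueError(f"non-finite loss {loss} at step {step}")
    if loss < 0 and not allow_negative:
        raise ValueError(f"negative loss {loss} at step {step}")
    return loss


# Made-up loss values standing in for a real training loop.
losses = [2.31, 1.87, 1.52, float("nan"), 1.10]
for step, loss in enumerate(losses):
    try:
        check_loss(loss, step)
    except ValueError as err:
        print(f"stopping early: {err}")
        break
```

This does not explain why the loss went bad, but it does pin down when, which narrows the trial and error to the batch and update just before that step.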

Then, the feedback loop can get really long. At this point, it's a matter of waiting for the model to train and checking the results. This process can take hours depending on the kind of data and the dataset. The unfortunate part is that this waiting is necessary in order to test different hyperparameters and get the best results.

Though it might seem like a frustrating process, it can be very rewarding. And that's a perfect segue into the next section, rewards.

Rewards

In engineering, the feedback loop can be pretty short, especially in frontend development. For example, using something like Vue.js, React, or SwiftUI means that as I'm writing the code for some view, I can see that same view updated in real time on the same screen. This sort of work is immediately satisfying, since you can constantly see exactly what you're making. That feedback loop is longer in backend development, since it can take time to see results and therefore more time to feel satisfied with a feature. Then in DL, that feedback loop can be stretched even further.

This doesn't mean that DL is any less satisfying. On the contrary, it can be more satisfying, just with more varied time between results.

For example, a couple of weeks ago, I was working on training a model for a summarization task. It took me more than a week to get it working, since I ran into the issues I described in the previous section, and at times it got a little frustrating. Then, I finally got some results. I summarized a CNN article and got the following:

CNN's Soledad O'Brien takes a tour of the "forgotten floor," where mentally ill inmates are housed in Miami before trial. An inmate housed on the "forgotten floor.

I was ecstatic to see these results. These word combinations didn't appear together anywhere in the article, but the model was able to create this text which summarized the article and (somewhat) maintained the grammar.

This last week I was working on the text-to-SQL task: given a SQL schema and an English question, such as "How many singers do we have?", output an equivalent SQL query. After some experimentation, I got the following:

SELECT count(*) FROM singer GROUP BY singer_id

This is a valid SQL query. The model learned SQL grammar and learned to translate English into a SQL query. These are the kinds of results that get me excited about deep learning. Though this process was very time consuming, and there was frustration involved, I felt very satisfied with the results.
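A quick way to check validity for a generated query is to run it against a throwaway database. The sketch below uses Python's built-in `sqlite3` module with a hypothetical, simplified `singer` table and made-up rows (not the actual Spider schema), and compares the output with a hand-written reference query that I am assuming is the intended answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema and rows; not the actual Spider schema.
conn.execute("CREATE TABLE singer (singer_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO singer (name) VALUES (?)", [("Ana",), ("Ben",), ("Cy",)]
)

predicted = "SELECT count(*) FROM singer GROUP BY singer_id"
reference = "SELECT count(*) FROM singer"  # assumed intended query, written by hand

# Invalid SQL would make execute() raise sqlite3.OperationalError here.
predicted_rows = conn.execute(predicted).fetchall()
reference_rows = conn.execute(reference).fetchall()

print(predicted_rows)  # [(1,), (1,), (1,)]
print(reference_rows)  # [(3,)]
```

In this toy setup the predicted query parses and runs, but the GROUP BY clause produces one count per singer rather than a single total, so it is worth checking validity and exact correctness separately.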

I've been getting very excited about my final project. In my next blog post I will describe my final project and share my project proposal.