Hacking away - OpenAI weeks 3&4

OpenAI Feb 29, 2020

While writing this blog post, I realized that since the field of deep learning moves so fast, heavy technical detail won't age well. So I'll focus instead on sharing my learning process and the lessons I've learned.

Lessons learned

First big lesson: community is invaluable.

These past two weeks I have been learning about language models. Many times I found myself with questions and a fuzzy understanding of how things worked, so I would reach out to my mentor, Melanie. Talking through my confusion, explaining those concepts back to her, and getting feedback was really helpful. I also spent time implementing a CoQA leaderboard paper with another scholar, Pamela; that pair programming helped me see different approaches to problem solving. And since I'm remote, I've been reviewing language model topics with local colleagues. All of these conversations were essential to building an intuition and a working understanding.

I also spent time going over a Hugging Face tutorial, and ended up contributing to the repo. The goal of the tutorial was to build a conversational bot using transfer learning with transformer models. Since my final project will be closely tied to transformers, I thought it would be helpful to dive into the code. While I was exploring the code and attempting to fine-tune the model, I noticed that the repo was using an old version of the library, so I tried to update it. The update wasn't trivial and I ended up submitting a pull request. I found it really valuable to interact with the maintainers, and it was satisfying to contribute something as a newcomer.
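To give a flavor of what that kind of tutorial involves, here is a minimal, dependency-free sketch of the input representation commonly built for dialogue fine-tuning: persona sentences, dialogue history, and a candidate reply concatenated into one token sequence, with speaker tokens marking turns. The token names and the `build_input` helper are illustrative assumptions on my part, not code from the repo.

```python
# Illustrative special tokens (not the repo's actual vocabulary).
SPECIAL = {"bos": "<bos>", "eos": "<eos>", "s1": "<speaker1>", "s2": "<speaker2>"}

def build_input(persona, history, reply):
    """Flatten persona words + alternating dialogue turns into one token list."""
    sequence = [SPECIAL["bos"]] + persona
    turns = history + [reply]
    for i, turn in enumerate(turns):
        # The final turn is the bot's reply (speaker2); alternate backwards from it.
        speaker = SPECIAL["s2"] if (len(turns) - i) % 2 == 1 else SPECIAL["s1"]
        sequence += [speaker] + turn
    return sequence + [SPECIAL["eos"]]

tokens = build_input(["i", "like", "dogs"], [["hi"]], ["hello", "there"])
print(tokens)
```

A real fine-tuning setup would map these tokens to IDs with the library's tokenizer and feed them to the model; this sketch only shows how the turns are stitched together.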

And finally, I have been attending the Swift for TensorFlow design meetings. It's exciting to be part of a community building a tool that I believe in, and it's very helpful to see what goes into building a tool for deep learning. I'm hoping to contribute more in the coming weeks.

Second big lesson: tooling is invaluable.

I started out doing everything in Google Colab notebooks. The hardware that comes with Colab is decently performant, and I thought it would be a good way to get quick feedback. One problem: I kept running into errors, and I instinctively started printing out shapes and tensors to get an idea of what was going on. Eventually I realized how much that process was slowing me down, so I downloaded my code and ran it in an IDE with debugging enabled. The IDE was helpful for understanding the flow and architecture of the programs. There are benefits to both tools, so I found myself switching between them depending on the task. I think it's always valuable to know what tools are available and their limitations.
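The print-debugging I fell back on in Colab looked roughly like the sketch below. To keep it dependency-free, it uses nested lists as stand-in tensors and a hypothetical `shape_of` helper instead of a framework's `.shape` attribute; the variable names are illustrative.

```python
def shape_of(x):
    """Return the shape of a nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0] if x else None
    return tuple(dims)

# Pretend activations with layout (batch, sequence, hidden).
activations = [[[0.0] * 8] * 4] * 2
print("activations:", shape_of(activations))  # prints: activations: (2, 4, 8)
```

Sprinkling prints like this answers "what shape is this tensor right now?", but a debugger with breakpoints answers "how did it get that shape?", which is why I ended up using both.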

Last big lesson: direction is invaluable.

I have found that having a direction to work towards, a goal or a vision, is the most important part of staying motivated. This isn't specific to deep learning, and it isn't something I learned these last couple of weeks. But I've always found it helpful for staying motivated in any big endeavor, and the Scholars Program is no different. The vision I've been working towards, in this case, is my final project. I started writing out a draft for it as soon as I knew I was accepted to the program. Since then, all of my work has been building towards that final project, and it has kept me motivated.

Next time I'll be focusing even more on resources that will help with my final project. And I'm excited to get more experience under my belt!