Crunching Yelp Data to a Job at Crunchbase: Alumni Spotlight on Newton Le

At The Data Incubator we run a free eight-week data science fellowship to help our Fellows land industry jobs. We love Fellows with diverse academic backgrounds that go beyond what companies traditionally think of when hiring data scientists. Newton was a Fellow in our Summer 2016 cohort who landed a job with one of our hiring partners, Crunchbase.

Tell us about your background. How did it set you up to be a great data scientist? 

I have an electrical engineering and computer science degree from UC Berkeley, which gave me a strong coding foundation. I am also almost done with a PhD in structural engineering at UC Davis, which gave me a lot of experience solving analytical problems computationally.

What do you think you got out of The Data Incubator?

Beyond the miniprojects and learning the data science tools available, The Data Incubator really made me see my potential. Challenged to do something interesting with data, I came up with a fun idea that I had no clue how to execute, but I pushed myself to learn new tools and produce something useful within a few days. Seeing what others could do every week, I was inspired to add significant improvements to my capstone project and learned a lot each step of the way. Super smart instructors and talented peers really inspired me as well. Everyone had their strengths and I could learn something different from each person. The environment didn’t feel competitive at all and was very collaborative. I actually miss coming in every day to work with my pod. Go Team Lannister!

Could you tell us about your Data Incubator project?

Rate to Plate recommends recipes and restaurants based on a user’s ratings of restaurants. It first generates each restaurant’s flavor profile using a TF-IDF analysis of the Yelp review text, with a focus on key flavor-indicating terms, which I scraped from a pretty comprehensive list of food terms on BBC Food. The user’s flavor profile is then obtained by aggregating the flavor profiles of the restaurants they rated, weighted by their ratings. This profile is matched against other restaurants and against recipes that I scraped from Epicurious. The bulk of the data is from the Yelp academic dataset, which I supplemented by implementing a live-scrape feature on my app.
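
For readers who want to see the shape of such a pipeline, here is a minimal sketch in plain Python. It is not Newton's code: the flavor terms, review text, and function names are all made up for illustration, the TF-IDF weighting uses one common smoothed formula, and cosine similarity is assumed as the matching step since the interview does not say which measure the app used.

```python
import math
from collections import Counter

# Hypothetical flavor vocabulary (the project used terms scraped from BBC Food).
FLAVOR_TERMS = ["spicy", "sweet", "smoky", "garlic", "creamy"]

# Toy review text per restaurant, invented for this example.
REVIEWS = {
    "Thai Spot": "spicy spicy garlic noodles with sweet chili",
    "Smokehouse": "smoky ribs smoky brisket sweet sauce",
    "Dessert Bar": "sweet creamy cake sweet creamy custard",
    "Curry House": "spicy garlic curry spicy creamy sauce",
}


def tfidf_profiles(reviews, terms):
    """Return {restaurant: [TF-IDF weight for each flavor term]}."""
    counts = {name: Counter(text.lower().split()) for name, text in reviews.items()}
    n_docs = len(reviews)
    # Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1
    idf = {}
    for term in terms:
        df = sum(1 for c in counts.values() if c[term] > 0)
        idf[term] = math.log((1 + n_docs) / (1 + df)) + 1
    profiles = {}
    for name, c in counts.items():
        total = sum(c.values()) or 1  # avoid dividing by zero on empty text
        profiles[name] = [c[term] / total * idf[term] for term in terms]
    return profiles


def user_profile(profiles, ratings):
    """Rating-weighted average of the profiles of the rated restaurants."""
    total = sum(ratings.values())
    dim = len(next(iter(profiles.values())))
    return [
        sum(stars * profiles[name][i] for name, stars in ratings.items()) / total
        for i in range(dim)
    ]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(profiles, ratings):
    """Rank unrated restaurants by similarity to the user's flavor profile."""
    user = user_profile(profiles, ratings)
    candidates = [name for name in profiles if name not in ratings]
    return sorted(candidates, key=lambda n: cosine(user, profiles[n]), reverse=True)


profiles = tfidf_profiles(REVIEWS, FLAVOR_TERMS)
ranking = recommend(profiles, {"Thai Spot": 5, "Dessert Bar": 1})
print(ranking)
```

With these toy inputs, a user who rated the Thai restaurant highly gets the curry place ranked ahead of the smokehouse, because its profile shares the spicy and garlic terms. The same ranking function could be pointed at recipe text instead of restaurant reviews, provided both are scored against the same flavor vocabulary.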

What advice would you give to someone who is applying for The Data Incubator, particularly someone with your background?

Start thinking of problems you can solve with data now. Start learning Python and exploring the wealth of modules available for it. You can install Jupyter notebook on your local machine, which will help you play around with scraping and machine learning. The two months go by really fast, and anything you can do ahead of time will help you tremendously.

What’s your favorite thing you learned while at The Data Incubator? This can be a technology, concept, or whatever you want!

HackerRank challenges are fun, and practicing them helped me nail the technical challenges I was given in interviews, impress the interviewers, and land the position. The solutions The Data Incubator provided were always clever and elegantly written, which really helped me think of algorithms from different angles. In fact, one of my interviews was actually conducted on HackerRank, so being familiar with the format and interface really helped.

Tell us a little about your new job!

I’ll be working as a data engineer for Crunchbase. From what I understand so far, one of my first tasks will be helping integrate disparate sources of data into one cohesive database.
