At The Data Incubator we run a free eight-week data science fellowship to help our Fellows land industry jobs. We love Fellows with diverse academic backgrounds that go beyond what companies traditionally think of when hiring data scientists. Athena was a Fellow in our Fall 2016 cohort who landed a job with Brighterion.
My background is in astronomy. My research consisted of developing and performing computer simulations of star formation in the early universe. The goal of these simulations was to better understand what stellar clusters looked like in regions of the universe that telescopes cannot observe. Thus I was already familiar with computer programming and visualizing data. This was very helpful in the transition to data science. Knowing how to present my research clearly to a range of audiences — both beginning students and other experts in the field — has helped as well!
What do you think you got out of The Data Incubator?
Tons! Just about everything I know about machine learning I learned at TDI. I met lots of great, friendly, and supportive people through TDI as well. This includes the instructors and mentors as well as the other Fellows in my cohort, many of whom I’m sure I will keep up with for many years to come. Through TDI I’ve also made contacts with other companies and data scientists in the San Francisco Bay Area, which has been quite helpful in getting those job interviews!
Could you tell us about your Data Incubator project?
I analyzed Yelp business and review data to determine which features most strongly predict a business’s star rating. The goal was to build a web app that lets business owners figure out which changes would most increase their star rating. The features I examined included attributes like the type of food a business serves, its overall ambience (e.g. ‘upscale’ versus ‘dive bar’), and the text of its Yelp reviews. My web app allows a user to look at average star ratings and other correlated features for different subsets of businesses (e.g. the average star rating of all Italian restaurants, or the phrases most commonly used in Italian restaurant reviews). The next part of the web app lets a business enter the URL of its Yelp page. That URL is fed into a rating predictor, which estimates the business’s current rating. Finally, the rating predictor estimates how changing various features would alter the business’s rating in the future. For instance, maybe better service would most increase its star rating, or maybe it should try adding more bacon items to its menu.
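To make the idea concrete, here is a minimal sketch of the "which change would help most" step. It is not Athena's actual code or model: the toy businesses, the feature names, and the helper functions `feature_effects` and `suggest_changes` are all hypothetical, and the effect of a feature is estimated crudely as the difference in average star rating between businesses that have it and businesses that do not. A real predictor would presumably use a fitted model and far more data.

```python
from statistics import mean

# Hypothetical toy data, made up for illustration; these are not Yelp records.
# Each business has binary attributes and a star rating.
BUSINESSES = [
    {"features": {"italian": 1, "upscale": 1, "good_service": 1}, "stars": 4.5},
    {"features": {"italian": 1, "upscale": 0, "good_service": 1}, "stars": 4.0},
    {"features": {"italian": 0, "upscale": 0, "good_service": 0}, "stars": 2.5},
    {"features": {"italian": 0, "upscale": 1, "good_service": 0}, "stars": 3.0},
    {"features": {"italian": 1, "upscale": 0, "good_service": 0}, "stars": 3.0},
    {"features": {"italian": 0, "upscale": 1, "good_service": 1}, "stars": 4.5},
]


def feature_effects(businesses):
    """Estimate each feature's effect as mean(stars | has it) - mean(stars | lacks it)."""
    effects = {}
    for name in sorted(businesses[0]["features"]):
        with_f = [b["stars"] for b in businesses if b["features"][name]]
        without_f = [b["stars"] for b in businesses if not b["features"][name]]
        effects[name] = mean(with_f) - mean(without_f)
    return effects


def suggest_changes(business_features, effects):
    """Rank the features a business lacks by estimated rating gain, largest first."""
    missing = [
        (name, gain)
        for name, gain in effects.items()
        if not business_features.get(name) and gain > 0
    ]
    return sorted(missing, key=lambda pair: pair[1], reverse=True)


# Usage: an Italian restaurant that is neither upscale nor known for good service.
effects = feature_effects(BUSINESSES)
for name, gain in suggest_changes({"italian": 1, "upscale": 0, "good_service": 0}, effects):
    print(f"{name}: estimated gain of {gain:.2f} stars")
```

A difference in averages only captures correlation, so a ranking like this is a starting point for suggestions rather than a guarantee that a change will raise the rating.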
What advice would you give to someone who is applying for The Data Incubator, particularly someone with your background?
Get familiar with Python, Pandas, and SQL. Look around online for datasets to play with, and see what interesting relationships you can find in that data. If you’re an observational astronomer, look for astronomy journal papers that describe the implementation of machine learning techniques on an astronomical dataset, and see if you can use those techniques on your data as well!
What’s your favorite thing you learned while at The Data Incubator? This can be a technology, concept, or whatever you want!
There are lots of awesome data scientists out there who come from so many different backgrounds, from finance to physics to philosophy. They will be great colleagues to work with!
Where are you working now? Tell us a little about your new job!
I will be working at Brighterion, a company that develops AI and machine learning models for various clients. A lot of their work is in credit card fraud detection. I will get to update some of their current fraud detection models using new credit card transaction data, and I’ll be developing new models as well. These models are applied to literally billions of transactions a month, so it’s a major and exciting challenge!