FROM ACADEMIA TO INDUSTRY: HOW A PHYSICIST TURNED INTO A DATA SCIENTIST

Vagner Zeizer C. Paes
Jun 6, 2021

In this story, I will tell you about my transition from academia to industry.

Everything started when I finished my Ph.D. in Physics (2017) at the Federal University of Rio Grande do Sul (UFRGS), Brazil. I was looking for a second career option in case science scholarships in Brazil became very hard or virtually impossible to obtain, which I was afraid would happen.

In 2009, during my undergraduate studies, I started working with programming languages in earnest, even if at a basic level, writing some programs in Mathematica and Fortran. This continued in my Master's, where I modeled some magnetic materials of technological interest. During my Ph.D. in Physics I was encouraged to do some programming, and I decided that this was the branch I wanted to specialize in. Over those four and a half years, I learned several programming languages, mainly Python, Mathematica, C++, and Fortran 90.

I started my postdoc at the Federal University of Paraná in August 2018, and since I could stay home and save some money, I decided to really invest in an alternative career. Another university degree would take too long and was unnecessary. Then, by chance, I was talking to a friend who is a programmer, and he told me that in IT recruiters do not care much about degrees; they care about your experience. I found that information really useful. So I had decided on a path to follow: something related to IT!

At that time, in 2018, Udacity Nanodegrees were in the spotlight, even though they were a little expensive. However, a few months earlier I had had the bad habit of smoking, which is quite harmful and literally burns your money! With a simple calculation, I realized that the money I would invest in Udacity was approximately the same as what I used to spend on cigarettes each month! So it was a matter of choice, and this time I made the right one, fortunately :) The main reasons I chose Udacity were the project-based learning, which is quite fun, and the idea of building your own portfolio.

Okay, now I had to decide what to pursue, so I started from scratch. I enrolled in the Intro to Programming Nanodegree, so that I could then decide whether to specialize in something related to Artificial Intelligence, Autonomous Systems, Data Science, Cloud Computing, Web Development, etc…

The first Nanodegree was very easy to complete, and I was leaning towards something that would use Python. Luckily, I got in touch with a friend who had been a postdoc in the United Kingdom and had returned to Brazil, and he told me that he had gotten a very good job as a Data Scientist at a startup in São Paulo. Voilà! He encouraged me to invest my time in this field, as it would probably be very rewarding, both professionally and personally. I started reading about data science and realized that my background in physics would indeed be helpful! So I decided to follow this path for good. It was time to focus on it!

Udacity Nanodegrees

My GitHub repository can be found here.

During my first postdoc at UFPR I completed the following two Nanodegrees (at that time there was a Udacity Brazil, and Nanodegree prices were lower):

Nanodegree Data Science Foundations I: I learned the basics of Python for data science, such as NumPy, Pandas, Seaborn, and Matplotlib. I also did some data wrangling and data visualization. A simple Nanodegree, yet helpful.

Nanodegree Data Science Foundations II: this Nanodegree was fantastic. It started with SQL, which can be hard sometimes. It continued with data collection and exploration in Pandas, using the Tweepy API. It was a very nice project! Statistics was the next step, with a focus on A/B testing. The last project was about machine learning for fraud detection, and it was very nice too. The machine learning overview and projects in this Nanodegree were very good as an introduction. Nowadays this Nanodegree is roughly equivalent to the Data Analyst Nanodegree.

It was around the final months of 2019 when Udacity suddenly announced that it would leave Brazil :( It was a dirty trick. Nanodegree prices suddenly increased significantly, and I decided to think about alternatives…

In the middle of 2019, there was a huge discount on Udacity Brazil courses and I decided to get one just to get my hands dirty again.

Nanodegree AI Programming with Python: I reviewed some mathematics related to AI and Data Science, and I also learned some good deep learning practices while developing a very nice flower classifier.

Having finished this Nanodegree, it was time to prepare myself for a short-term postdoc in Chemnitz, Germany. Meanwhile, I continued my data science studies by reading more books and articles. After I returned from Germany, the pandemic started and, luckily for me, I started a postdoc at UFRGS and could stay home and save money. However, as time went by, the situation for scientists only got worse. I therefore had to prepare myself for the worst-case scenario: no scholarship for several months after the current one ended! Fortunately, one lucky day I took a look at Udacity's homepage and realized that Nanodegree prices had become more affordable. It was time to enroll again.

Nanodegree Introduction to Machine Learning with TensorFlow: I learned supervised learning, unsupervised learning, and deep learning. In particular, I found the courses on unsupervised techniques really insightful, and the project was great (as well as time-consuming!).

Nanodegree Data Scientist: this Nanodegree was the greatest one. And it was very cheap :) I learned the data science thought process, how to carry out a well-structured data science project, data engineering, a lot of Natural Language Processing, and recommender systems. There was also the Capstone Project, for which I chose to predict churn at a fictitious company named Sparkify. The Capstone Project was amazing and really time-consuming, and I learned a lot of PySpark.

Some other courses that I do recommend

Mario Filho Data Science: in his courses, you can learn how to construct a data science solution from zero to deployment. Really interesting and at that time it was a breakthrough course. There is also another interesting course on Freelancing.

José Portilla (Udemy): he teaches several very good data science courses on Udemy. From a data science overview to computer vision, his courses provide simple explanations, examples, and projects.

Jones Granatyr (Udemy): he teaches very good courses from Machine Learning for Beginners to Genetic Algorithms in Python (everything in Brazilian Portuguese).

I also used other platforms and took courses that I do not recommend, but they are not worth listing here.

Books that I read (links to Amazon are provided)

1- Data Science from Scratch.

2- Data Science for Business.

3- Storytelling with Data.

4- Projetos de Ciência de Dados com Python.

5- Introdução à ciência de dados.

6- Introdução à linguagem SQL.

7- Analítica de Dados com Hadoop.

8- Approaching (almost) any machine learning problem.

Breaking into the Field

When my CNPq scholarship ended, I started applying for data scientist jobs. Of course, LinkedIn is the best place for that. I was called to interview for two positions. It was a busy time, with four interviews in less than seven days (a single job can involve more than one interview). In the end, Bright Photomedicine decided to hire me. It was a tough process, with three interviews of about one hour each, and some of them were even recorded! But I liked it. It was very rewarding.

From my first application to being hired, it took about two and a half months. This period was so short because I was very well prepared: I had taken several courses and had also learned how to handle interviews through Udacity's courses. That is one thing I enjoy about Udacity: they do not only teach you how to code or how to do a data science project, they also pave the way to getting a job! Udacity students can get their CVs and LinkedIn profiles reviewed whenever they want!

Summary

I think I can highlight the following points about transitioning to data science:

1- It will be easier if you have a STEM degree. Master's and Ph.D. degrees also can help.

2- Good courses are essential if you want to showcase your skills in the field. However, Data Science is much more than just coding!

3- A strong GitHub repository is mandatory. Most recruiters do look at it.

4- Prepare for interviews! Otherwise, they can overwhelm you!

5- Study data science daily, or at least read something every day. It pays off in the long run. Diversify your sources: articles, books, etc…

6- I have many friends from academia who broke into the field easily. So, if you know Python and are interested in data science, keep up the good work and keep applying for jobs! Ph.D.s are welcome in many companies and startups! They appreciate the scientist's mindset, in the sense of a lifelong learning lifestyle :)

7- Dive deep into the field. There are a lot of people trying to break into data science nowadays. If you really want data science, go deep. It will be a tough journey, yet a pleasant one :)

Hope you have enjoyed reading this article :)

Please, give it some claps :)

You can add me on LinkedIn here.

Vagner Zeizer C. Paes

Data Scientist; Data Passionate; Applied Machine Learning; Data Analysis