By Thomas Bazzi - Fullstack student
During my 5 years of computer science studies and my 10 years as a developer, I have come across no fewer than a hundred programming languages, from the simplest to the most complex, from the most mainstream to the most obsolete. Each has its own characteristics, syntax, strengths and limitations. Today there are thousands of programming languages, but only a small "elite" stands out in each domain (Web, embedded systems, management software, etc.). So the question is: why has Python joined this elite? Why is it the preferred language of data scientists? Several factors have made the language a key player in many fields, including data science. Here are some of the main ones:
It is an open-source language
When we think of open source, we think of freedom and of unlimited potential, with no owner imposing restrictions. Indeed, Python belongs only to its users and contributors. They form a huge worldwide community that keeps growing, improving the language and its development environments (IDEs), and enriching it with useful new libraries.
Another reason why Python is so popular is its easy syntax.
Indeed, it is much easier to read than languages such as C, C++ and even Java. One example is the declaration of variables. Python infers the type of a variable from the value assigned to it, so there is no need to declare the type explicitly as one must do in C, nor to allocate memory and manage pointers by hand.
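To make the point about variable declarations concrete, here is a minimal sketch. The variable names and values are made up for illustration; only built-in Python features are used.

```python
# Python infers the type of each value at runtime; no declaration is needed.
count = 42            # int
ratio = 3.14          # float
name = "Python"       # str
values = [1, 2, 3]    # list; its memory is managed by the interpreter

# Collect the name of the type Python inferred for each value.
inferred_types = [type(v).__name__ for v in (count, ratio, name, values)]
print(inferred_types)

# The same name can later be bound to a value of another type.
count = "forty-two"
print(type(count).__name__)
```

In C, each of these variables would need an explicit type, and the list would typically require manual memory allocation.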
An Object Oriented Programming language
Python is an Object Oriented Programming language, which gives it the main advantages of this paradigm: modularity, abstraction, productivity, reusability and safety.
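As a small illustration of these ideas, the sketch below defines two hypothetical classes (`Dataset` and `LabelledDataset` are names invented for this example). The first hides its data behind methods (abstraction), and the second reuses it through inheritance (reusability).

```python
class Dataset:
    """Hypothetical class bundling data with the operations that use it."""

    def __init__(self, values):
        # Internal state, accessed only through the methods below.
        self._values = list(values)

    def mean(self):
        return sum(self._values) / len(self._values)


class LabelledDataset(Dataset):
    """Reuses Dataset through inheritance and adds a label."""

    def __init__(self, values, label):
        super().__init__(values)
        self.label = label

    def describe(self):
        return f"{self.label}: mean={self.mean():.1f}"


temperatures = LabelledDataset([18.0, 21.0, 24.0], "temperature")
print(temperatures.describe())  # prints "temperature: mean=21.0"
```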
A wide range of Python libraries!
Python has a wide range of libraries for data science and data analytics. What is a library? Pre-written code that lets you perform tasks from the simplest (doing calculations, importing large data sets) to the most complex (creating your own Machine Learning models). When it comes to Deep Learning, we usually talk about frameworks. Here are the most commonly used Python libraries:
NumPy
Useful for mathematical calculations such as matrix multiplication and array operations.
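A minimal sketch of these operations, assuming the library described here is NumPy (the name is inferred from the description and from the mention of Numpy later in the article). The matrices are made up for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

product = a @ b              # matrix multiplication
elementwise = a * b          # element-by-element multiplication
column_sums = a.sum(axis=0)  # sum down each column

print(product)
print(elementwise)
print(column_sums)
```

Note the difference between `@`, which multiplies matrices, and `*`, which multiplies the arrays element by element.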
SciPy
Useful for scientific calculations, with modules for optimisation, linear algebra, visualisation and many other mathematical concepts.
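A short sketch of two of these modules, assuming the library described here is SciPy (an inference from the description, not stated in the text). The linear system and the function to minimise are invented examples.

```python
import numpy as np
from scipy import linalg, optimize

# Linear algebra: solve the system A x = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)
print(x)

# Optimisation: find the minimum of f(t) = (t - 2)^2 + 1.
result = optimize.minimize_scalar(lambda t: (t - 2) ** 2 + 1)
print(result.x, result.fun)
```

The system has the exact solution x = (2, 3), and the function reaches its minimum value of 1 at t = 2, so the printed values should be close to those numbers.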
Pandas
Contains tools and functions that make data analysis faster and less complex. It has 2 important data structures: Series, which are one-dimensional indexed arrays of values (integers, strings, etc.), and DataFrames, which are two-dimensional indexed structures organised in rows and columns. All this makes it easy to load data into Python from Excel files, CSV files or SQL databases. Pandas provides a variety of useful operations on Series and DataFrames, such as mean, sum and group by.
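The two structures and a few of these operations can be sketched as follows. The data is hypothetical and built in memory so that the example is self-contained.

```python
import pandas as pd

# A Series: one-dimensional, indexed values (hypothetical prices).
prices = pd.Series([10.0, 12.5, 9.0], index=["a", "b", "c"])

# A DataFrame: two-dimensional, with rows and named columns (hypothetical sales).
df = pd.DataFrame({
    "city": ["Paris", "Lyon", "Paris", "Lyon"],
    "sales": [100, 80, 120, 60],
})

average_price = prices.mean()                      # average of a Series
total_sales = df["sales"].sum()                    # sum of a column
sales_by_city = df.groupby("city")["sales"].sum()  # group by, then sum

print(average_price)
print(total_sales)
print(sales_by_city)
```

With real data, the DataFrame would typically come from `pd.read_csv`, `pd.read_excel` or `pd.read_sql` rather than from a dictionary.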
Scikit-learn
It is a Python package for Machine Learning! It includes a wide range of Machine Learning algorithms and lets you implement simple or complex pipelines. Its great advantage is that it is compatible with other Python libraries, especially Pandas and Numpy. For example, this package contains regression algorithms, and it lets you measure how accurate their predictions are.
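A minimal regression sketch, assuming the package described here is scikit-learn (inferred from the description). The data is synthetic and follows y = 2x + 1 exactly, so the fit is perfect by construction. For regressors, the `score` method returns the R² coefficient rather than a classification accuracy.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data following y = 2x + 1 (made up for illustration).
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = 2 * X.ravel() + 1

model = LinearRegression().fit(X, y)

r2 = model.score(X, y)                            # R^2 on the training data
prediction = model.predict(np.array([[5.0]]))[0]  # predicted y for x = 5

print(model.coef_[0], model.intercept_)
print(r2, prediction)
```

The input `X` is a NumPy array here, but a Pandas DataFrame could be passed in the same way, which is the compatibility the paragraph above refers to.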
Matplotlib and Seaborn
These are very useful libraries for visualising data in the form of graphs and histograms.
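As an illustration, here is a small Matplotlib sketch that draws a line plot and a histogram side by side. The data is hypothetical, and the figure is written to an in-memory buffer so that the example runs without a display; in a notebook you would normally just show the figure.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no display required
import matplotlib.pyplot as plt

values = [1, 2, 2, 3, 3, 3, 4, 4, 5]  # hypothetical data

fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

# Left panel: a simple graph of the values in order.
ax_line.plot(range(len(values)), values, marker="o")
ax_line.set_title("Line plot")

# Right panel: a histogram of the same values.
counts, bin_edges, _ = ax_hist.hist(values, bins=5)
ax_hist.set_title("Histogram")

fig.tight_layout()

# Render the figure as PNG bytes in memory.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

Seaborn builds on Matplotlib and offers higher-level plotting functions; it is left out here to keep the sketch short.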
Finally, Python's strength comes from the stability, modernity and variety of its development environments (IDEs).
Notebooks are an example of a tool for organising code that has become essential for data scientists. They bring a "storytelling" approach to code, keeping the work organised, readable and elegant to present. Notebooks can be run in several environments, on local machines or in the cloud. Examples include Jupyter (available on our JULIE learning platform) and Google Colaboratory.
To conclude, one only has to look at the polls on developers' favourite programming languages (such as the Kaggle poll which puts Python at the top, far ahead of the others) to see how popular Python is and to deduce that it is winning the battle against the other programming languages.
Will it last forever?