Tuesday, February 11, 2020

Python – Environmental Setup for Data Science



How do you set up a working environment for Python machine learning on your local computer?
Libraries and Packages:
To understand machine learning, you need a basic knowledge of Python programming. In addition, a number of libraries and packages are commonly used for machine learning tasks, as listed below:
numpy - provides N-dimensional array objects
pandas - a data analysis library that includes dataframes
matplotlib - a 2D plotting library for creating graphs and plots
scikit-learn - provides algorithms for data analysis and data mining tasks
seaborn - a data visualization library based on matplotlib
Installation:  
Method 1: 
 Download and install Python separately from python.org on various operating systems as explained below: 
To install Python after downloading, double click the .exe (for Windows) or .pkg (for Mac) file and follow the instructions on the screen. 
For Linux OS, check if Python is already installed by using the following command at the prompt:
$ python --version
If Python 2.7 or later is not installed, install Python with your distribution's package manager. Note that the command and package name vary between distributions.
On Debian derivatives such as Ubuntu, you can use apt:
$ sudo apt-get install python3
Now, open the command prompt and run the following command to verify that Python is installed correctly:
$ python3 --version
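The same version check can also be done from inside Python. The following is a minimal sketch, not part of the original instructions; the minimum version `(3, 6)` and the helper name `python_is_recent_enough` are assumptions chosen only for illustration.

```python
import sys

# Assumed minimum version, for illustration only; adjust to your needs.
MIN_VERSION = (3, 6)


def python_is_recent_enough(min_version=MIN_VERSION):
    """Return True if the running interpreter is at least min_version."""
    # sys.version_info[:2] is a (major, minor) tuple, compared element by element.
    return sys.version_info[:2] >= min_version


print("Running Python", sys.version.split()[0])
print("Meets minimum:", python_is_recent_enough())
```

Running this with both `python` and `python3` shows whether the two commands point to different interpreters on your system.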
Similarly, we can download and install necessary libraries like numpy, matplotlib etc. individually using installers like pip. For this purpose, you can use the commands shown here:
$ pip install numpy
$ pip install matplotlib
$ pip install pandas
$ pip install seaborn
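Before installing anything, it can help to see which of the libraries are still missing. The sketch below uses only the standard library; the `PACKAGES` mapping and the `missing_packages` helper are hypothetical names introduced for this example. Note that scikit-learn is installed as `scikit-learn` but imported as `sklearn`.

```python
import importlib.util
import sys

# Maps the pip package name to the module name used in `import`.
PACKAGES = {
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
    "seaborn": "seaborn",
}


def missing_packages(packages=PACKAGES):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module in packages.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    # `python -m pip` ties the install to the interpreter running this script.
    print("Install with:", sys.executable, "-m pip install", " ".join(missing))
else:
    print("All listed packages are importable.")
```

The script only prints a suggested command and does not install anything itself.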
Method 2:
Alternatively, to install Python together with scientific computing and machine learning packages in one step, you can install the Anaconda distribution. It is a Python distribution for Linux, Windows, and macOS that bundles machine learning packages such as numpy, scikit-learn, and matplotlib. It also includes Jupyter Notebook, an interactive Python environment. You can install Python 2.7 or any 3.x version as required.
To download the free Anaconda Python distribution from Continuum Analytics (now Anaconda, Inc.), do the following:
Visit the official site and open its download page. Note that installation may take 15-20 minutes because the installer contains Python, associated packages, a code editor, and some other files. Depending on your operating system, choose the installer as explained here:
For Windows: Select the Anaconda for Windows section and look in the column with Python 2.7 or 3.x. You can find that there are two versions of the installer, one for 32-bit Windows, and one for 64-bit Windows. Choose the relevant one.
For Mac OS: Scroll to the Anaconda for OS X section. Look in the column with Python 2.7 or 3.x. Note that here there is only one version of the installer: the 64-bit version.   
For Linux OS: Select the "Anaconda for Linux" section and look in the column with Python 2.7 or 3.x.
Note that you have to ensure that Anaconda’s Python distribution installs into a single directory, and does not affect other Python installations, if any, on your system.
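When several Python installations coexist, it is useful to confirm which one is actually running. The sketch below is one possible check, not an official Anaconda tool; the `interpreter_info` helper is a hypothetical name, and testing for a `conda-meta` folder is only a heuristic for spotting a conda-managed installation.

```python
import os
import sys


def interpreter_info():
    """Describe the Python interpreter that is running this script."""
    return {
        "executable": sys.executable,  # full path of the running interpreter
        "prefix": sys.prefix,          # installation directory of this Python
        # Heuristic (an assumption): conda-managed installs contain "conda-meta".
        "looks_like_conda": os.path.isdir(os.path.join(sys.prefix, "conda-meta")),
    }


for key, value in interpreter_info().items():
    print(f"{key}: {value}")
```

If the printed prefix is the Anaconda directory, the libraries you import come from that installation rather than from another Python on the system.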
To work with graphs and plots, we will need these Python library packages: matplotlib and seaborn. 
If you are using Anaconda Python, your system already has numpy, matplotlib, pandas, seaborn, etc. installed. Start Anaconda Navigator to open either Jupyter Notebook or the Spyder IDE.
After opening either of them, type the following commands:
import numpy
import matplotlib
Now, we need to check whether the installation was successful. To do this, go to the command line and type the following command:
$ python
Python 3.6.3 |Anaconda custom (32-bit)| (default, Oct 13 2017, 14:21:34) 
[GCC 7.2.0] on linux
Next, you can import the required libraries and print their versions as shown:
>>> import numpy
>>> print(numpy.__version__)
1.14.2
>>> import matplotlib
>>> print(matplotlib.__version__)
2.1.2
>>> import pandas
>>> print(pandas.__version__)
0.22.0
>>> import seaborn
>>> print(seaborn.__version__)
0.8.1
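The interactive check above can also be written as a small script. This is a sketch under one assumption: it uses `importlib.metadata`, which is part of the standard library only from Python 3.8 onward, so it will not run on the Python 3.6 session shown earlier. The `installed_versions` helper is a hypothetical name introduced here.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_versions(names):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


LIBRARIES = ["numpy", "pandas", "matplotlib", "scikit-learn", "seaborn"]

for name, version in installed_versions(LIBRARIES).items():
    print(f"{name}: {version or 'not installed'}")
```

Unlike importing each library, this reads the installed package metadata, so it reports a missing package instead of stopping with an ImportError.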
