Peter Sondergaard, Senior Vice President at Gartner, once aptly said that “Information is the oil of the 21st century, and analytics is the combustion engine”. The quote gave the world of analytics a new direction. People started to understand the need for analytics on the petabytes of data being generated, data with a lot of potential to propel tomorrow’s world. This in turn signaled the start of the development of several data science packages that make it easier and more compact to analyze tons of data.
Though many data science packages have been developed, a lot of users prefer one of the best developed to date: Anaconda. Anaconda is a complete, open source data science package with a community of over 6 million active users. It was first released on 17th July 2012 as an attempt to build a comprehensive collection of packages used for data science, and its latest release, 5.2.0, launched on 30th May 2018. Many people don’t know that it was fully written in Python with cross-platform operating system support. The package is very easy to download and install, and it is well supported on Linux, macOS, and Windows.
So what is the ANACONDA Data Science package?
One of the most important reasons Anaconda is so popular among beginners is its easy onboarding process: it ships a comprehensive collection of up to 1,000 packages together with conda, a very useful package and virtual environment manager, which saves us from installing each and every data science package we may need to analyze those large chunks of data.
Apart from this, a major reason why Anaconda has such a strong community is that it is available for the most used programming and scripting languages in the field, namely Python and R. With the rise of data analytics as a career, the use of Python and R has shifted sharply, and Anaconda, as a collection of data science packages, came to the rescue of people in the data analytics industry.
The open source packages can be installed individually from the large Anaconda repository with the conda install command, or with the pip install command that is installed with Anaconda. Pip packages provide many of the features of conda packages, and in most cases the two can work together.
The default environment is Python 3.6, but you can also easily install Python 3.5, Python 2.7, or R. Another useful feature is that every time Anaconda Navigator launches, it checks whether new software is available and prompts you to update if necessary, so users do not have to look for recent updates themselves.
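Since an installation can hold several Python versions side by side, it can help to confirm which one is actually running. The following is a minimal sketch using only the standard library; the function name describe_interpreter is a hypothetical helper chosen for illustration, not part of Anaconda.

```python
import platform
import sys


def describe_interpreter():
    """Return the version and install prefix of the running interpreter.

    Hypothetical helper for illustration; it only reads standard-library values.
    """
    return {
        "version": platform.python_version(),  # full version string
        "major_minor": sys.version_info[:2],   # (major, minor) tuple
        "prefix": sys.prefix,                  # folder the interpreter was started from
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print("{}: {}".format(key, value))
```

Run from inside an environment, the prefix should point at that environment's folder rather than at the base installation.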
Organizations can also use an advanced version of Anaconda through the Enterprise option. Anaconda Enterprise is an AI/ML enablement platform that empowers organizations to develop, govern, and automate AI/ML and data science from laptop through training to production. It lets organizations scale from individual data scientists to collaborative teams of thousands, and go from a single server to thousands of nodes for model training and deployment.
Setting up the environment for Anaconda Data Science Package in Windows / Linux
Before exploring data science further, it’s necessary to set up the working platform (in this case Anaconda) to start working hands-on and extracting useful insights from large chunks of data. Because Anaconda already contains some very useful data science packages, it saves us from installing the major data science libraries for Python, such as NumPy, one by one, leaving more room for creative work.
In this section, the steps to install and set up Anaconda in Linux / Windows will be covered.
To install Anaconda on Linux:
1. Go to https://www.anaconda.com/download/ and download the Linux installer for the 32-bit / 64-bit version of Anaconda based on Python 2.7 / 3.7.
2. When the download is done, use the following command to install Anaconda, replacing the file name with that of the installer you downloaded: bash Anaconda2-2.4.0-Linux-x86_64.sh
3. We need to get the conda command working in the Linux command prompt. Anaconda will ask you whether it needs to do that, so answer “yes”.
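Once the installer has finished, one way to confirm that the conda command is reachable is to look it up on the PATH and ask for its version. This is a minimal sketch, assuming a fresh terminal session; conda_version is a hypothetical helper name, and the only conda call it relies on is conda --version.

```python
import shutil
import subprocess


def conda_version():
    """Return the output of `conda --version`, or None if conda is not on PATH."""
    executable = shutil.which("conda")
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # some older releases print the version to stderr
        universal_newlines=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    version = conda_version()
    if version is None:
        print("conda was not found on PATH; open a new terminal or re-check the installer prompt")
    else:
        print(version)
```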
To install Anaconda on Windows:
1. Go to https://www.anaconda.com/download/ and download the Windows installer for the 32-bit / 64-bit version of Anaconda based on Python 2.7 / 3.7.
2. Run the installer.
3. Set up the environment
Anaconda virtual environment setup in the Windows command prompt
Once the installation is done, it’s time to set up an environment. An interesting comparison of conda, pip, and virtualenv commands can be found at http://conda.pydata.org/docs/_downloads/conda-pip-virtualenv-translator.html.
1 – Use the following command in your operating system command line. Replace “nameoftheenv” with the actual name you want your environment to have: conda create -n nameoftheenv anaconda
2 – Make sure you agree to proceed with the setup by typing “y” when conda lists the packages it will install and asks whether to proceed. After a while, you should be ready to go. Anaconda will create the environment in its default location, but options are available if you want to change the location.
3 – Now that you have an environment, you can activate it in the command line:
In Windows, type activate nameoftheenv
In Linux, type source activate nameoftheenv
Or you can point to it with your Python IDE (integrated development environment).
4 – If you activate it in the command line, you can start up the Jupyter (or IPython) IDE with the following command: jupyter notebook (on older installations, ipython notebook)
Jupyter Notebook (formerly known as IPython Notebook) is an interactive Python development interface that runs in the browser. It’s useful for adding structure to your code.
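The create and activate steps above differ only in the activation command per operating system, which can be sketched as a small helper that prints the commands to type. This is an illustration under the assumption that the commands are exactly those given in the steps; environment_commands is a hypothetical name and the script does not run conda itself.

```python
import sys


def environment_commands(env_name, platform=sys.platform):
    """Return the create and activate commands from the steps above.

    Hypothetical helper: it only builds the command strings.
    """
    create = "conda create -n {} anaconda".format(env_name)
    if platform.startswith("win"):
        activate = "activate {}".format(env_name)
    else:
        activate = "source activate {}".format(env_name)
    # conda 4.4 and later also accept `conda activate <name>` on every platform.
    return [create, activate]


if __name__ == "__main__":
    for command in environment_commands("nameoftheenv"):
        print(command)
```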
For every package mentioned that isn’t installed in the default Anaconda environment:
A. Activate your environment in the command line.
B. Use either conda install libraryname or pip install libraryname in the command line, replacing “libraryname” with the name of the package.
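To find out which packages still need installing in the active environment, a short check with the standard library can list the ones that cannot be imported. This is a minimal sketch; missing_packages and install_hints are hypothetical helper names, and the hint assumes the install name matches the import name, which is not always the case.

```python
import importlib.util


def missing_packages(import_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in import_names
            if importlib.util.find_spec(name) is None]


def install_hints(import_names):
    """Suggest a conda install command for each missing module.

    Assumption: the package name equals the import name. Some packages differ,
    so check the package's own documentation when a hint fails.
    """
    return ["conda install {}".format(name)
            for name in missing_packages(import_names)]


if __name__ == "__main__":
    for hint in install_hints(["numpy", "pandas", "matplotlib"]):
        print(hint)
```

If nothing is printed, all the listed modules are already available in the environment.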
Anaconda, the organization that supports the Anaconda data science packages, also conducts an annual conference, AnacondaCon, for people who use data science and predictive analytics to generate useful insights from data, reflecting the passionate and eclectic nature of the growing data analytics community.
Overall, AnacondaCon is one of the best conferences in the annual calendar for keeping up to date with recent advancements in data analytics and related fields. The next AnacondaCon is scheduled for April 3 – 5, 2019 in Austin, TX. Join in, if you are nearby, to experience data analytics in a much deeper sense.
Travis Oliphant, Founder, Director, CEO, and Data Scientist, compiled a slideshow that covers the details of the Anaconda data science package and how it is used in the real world, along with some great insights into the data analytics industry.
Keeping in mind the rising need for data analysts and scientists, we need to become familiar with the popular tools used for data analysis, and Anaconda is one of them. The support that Anaconda provides as a platform with multiple data analytics libraries pre-installed is the icing on the cake, and there are many things to start exploring with the Anaconda data science packages. Moreover, you can also refer to the post on Data Analysis using Python to get a clearer view of how the whole process of data analytics works.
So start getting familiar with this highly useful and compact package of world-class data science libraries to generate useful insights from data that is lying idle and could power tomorrow’s world.