When you install Python on your computer, it already comes with many modules and functions that you can use directly or import as needed.
One of the great things about a programming language is the community around it.
This community develops solutions to specific problems that the official releases don’t cover, or simply better solutions than the ones designed and built by the official maintainers.
To use these solutions, you have to install them separately and then import them the same way you import regular modules.
You can find these packages on PyPI, which stands for Python Package Index.
Before coding something, I advise you to go to PyPI and check whether someone has already published a package that does what you want.
For instance, you might want to do web scraping, a task that requires simulating a person navigating a webpage and then scraping information from that site.
Doing so requires, among other things, a lot of HTML and XML parsing, which you could do by hand or with a widely used library called Beautiful Soup.
To use Beautiful Soup, you have to install it as an extra on top of your standard Python installation.
Python allows you to install third-party packages through pip, a package-management system.
First, check that pip is installed and ready. It should be, since pip has been included by default since Python 3.4.
pip --version
The output should be something similar to this:
pip 20.1.1 from /home/renan/.local/lib/python3.6/site-packages/pip (python 3.6)
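If you would rather run the same check from inside Python, here is a minimal sketch. It assumes Python 3.7 or newer and that you care about the pip belonging to the interpreter that runs the script, which is why it goes through sys.executable. The function name pip_version_line is only an illustrative choice.

```python
import subprocess
import sys


def pip_version_line():
    """Return the output of `python -m pip --version`, or None if pip is unavailable."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the output as str
    )
    if result.returncode != 0:
        # A non-zero exit code usually means pip is missing for this interpreter.
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    print(pip_version_line() or "pip is not available for this interpreter")
```

The line it prints has the same shape as the output shown above, including the path that tells you which Python installation this pip belongs to.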
After confirming that pip is properly installed, you can install Beautiful Soup with the following command:
pip install beautifulsoup4
In general, the command is always pip install <name of the package>.
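Sometimes you want a script to install a package for you. A common approach is to call pip as a subprocess through the current interpreter rather than importing pip as a library. The sketch below follows that approach; the helper names pip_install_command and install are hypothetical, and the example only prints the command so that nothing gets installed by accident.

```python
import subprocess
import sys


def pip_install_command(package):
    # Build the argument list for "pip install <package>", run through the
    # current interpreter so the package lands in this Python environment.
    return [sys.executable, "-m", "pip", "install", package]


def install(package):
    # check_call raises subprocess.CalledProcessError if pip exits with an error.
    subprocess.check_call(pip_install_command(package))


if __name__ == "__main__":
    # Only print the command here; call install("beautifulsoup4") to really install.
    print(" ".join(pip_install_command("beautifulsoup4")))
```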
After that, you can easily use the new library.
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup("<h1>My<p>crazy<i>HTML")
>>> print(soup.prettify())
<html>
 <body>
  <h1>
   My
  </h1>
  <p>
   crazy
   <i>
    HTML
   </i>
  </p>
 </body>
</html>
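The session above does not name a parser, so Beautiful Soup picks one from those installed, and the exact tree can differ between machines. As a small sketch, assuming the parser from the standard library is good enough for your case, you can pass "html.parser" explicitly. The helper name and the sample markup are made up for illustration, and the function returns None when bs4 is not installed.

```python
def heading_and_emphasis(markup):
    """Return (text of the first <h1>, text of the first <i>), or None without bs4."""
    try:
        from bs4 import BeautifulSoup
    except ImportError:
        return None
    # Naming the parser keeps the result independent of which parsers are installed.
    soup = BeautifulSoup(markup, "html.parser")
    # soup.h1 and soup.i are the first matching tags; this sketch assumes both exist.
    return soup.h1.get_text(), soup.i.get_text()


if __name__ == "__main__":
    print(heading_and_emphasis("<h1>Title</h1><p>Some <i>text</i></p>"))
```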
The command pip show <package name> shows a detailed description of the package.
pip show beautifulsoup4
This displays output similar to the following:
Name: beautifulsoup4
Version: 4.8.2
Summary: Screen-scraping library
Home-page: http://www.crummy.com/software/BeautifulSoup/bs4/
Author: Leonard Richardson
Author-email: [email protected]
License: MIT
Location: /Users/renanmoura/opt/anaconda3/lib/python3.7/site-packages
Requires: soupsieve
Required-by:
Notice the fields ‘Requires’ and ‘Required-by’.
When you install a package using pip, it automatically looks up and installs the dependencies of that package.
So it installs soupsieve for you, and if some other installed library depends on beautifulsoup4, it will be listed under ‘Required-by’.
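You can read part of this information from Python as well. The sketch below uses importlib.metadata from the standard library (Python 3.8 or newer) to get the installed version and the declared requirements of a package. The function name describe is hypothetical. Note that the requirement strings are what the package declares, so they may include optional extras, and this sketch does not compute an equivalent of ‘Required-by’.

```python
from importlib import metadata


def describe(package):
    """Return (version, requirements) for an installed package, or None if it is missing."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    # requires() returns None when the package declares no dependencies.
    return version, metadata.requires(package) or []


if __name__ == "__main__":
    info = describe("beautifulsoup4")
    if info is None:
        print("beautifulsoup4 is not installed")
    else:
        version, requirements = info
        print("Version:", version)
        print("Requires:", ", ".join(requirements))
```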
You can use pip list to see all the packages already installed. Your list might be different, but the output should be similar to this:
Package                            Version
---------------------------------- -------------------
alabaster                          0.7.12
anaconda-client                    1.7.2
anaconda-navigator                 1.9.12
anaconda-project                   0.8.3
applaunchservices                  0.2.1
appnope                            0.1.0
appscript                          1.0.1
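A similar list can be built from inside Python. This is a rough sketch using importlib.metadata (Python 3.8 or newer), not a reimplementation of pip list: it reports the distributions visible to the current interpreter, and the function name installed_packages is an illustrative choice.

```python
from importlib import metadata


def installed_packages():
    """Return sorted (name, version) pairs for distributions visible to this interpreter."""
    pairs = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip entries whose metadata is missing or broken
            pairs.add((name, meta.get("Version", "")))
    # Sort case-insensitively by package name.
    return sorted(pairs, key=lambda pair: pair[0].lower())


if __name__ == "__main__":
    for name, version in installed_packages():
        print(f"{name:<34} {version}")
```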
You can use the following command to upgrade pip itself:
python -m pip install --upgrade pip
The -m option tells Python to run the pip module as a script with that interpreter instead of calling the pip executable directly, so the pip executable can be safely removed and replaced by the newer version.
Finally, if you want to uninstall a package, the command is very straightforward.
pip uninstall beautifulsoup4