How to Extract All Links from a Website in Python


In this Python tutorial, you will see how to use the Python programming language to extract all the links from a web page. The script below collects both the absolute and relative links found on a website.

In this tutorial, we use the beautifulsoup and requests modules. You can also do web scraping in Python using only the requests-html library.



Extract all Links from a Website

To extract all links from a website, use the requests and Beautiful Soup libraries in Python. requests makes the HTTP request to the website, and beautifulsoup parses the HTML of the web page.
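Before fetching a live page, it helps to see the parsing step in isolation. The sketch below uses a small hardcoded HTML snippet (no network needed) to show how BeautifulSoup finds anchor tags and reads their href attributes:

```python
from bs4 import BeautifulSoup

# A small hardcoded HTML snippet standing in for a downloaded page
html = """
<html><body>
  <a href="https://example.com/about">About</a>
  <a href="/contact">Contact</a>
  <a>No href here</a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# link.get("href") returns None for anchors without an href attribute,
# so we keep only the anchors that actually carry one
hrefs = [a.get("href") for a in soup.find_all("a") if a.get("href")]
print(hrefs)  # ['https://example.com/about', '/contact']
```

The same two calls, find_all("a") and get("href"), do all the work in the full script below.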

Install requests module


  pip install requests

If you have a problem installing the requests library, you can check out this article: how-to-install-requests-library-in-python

Install Beautifulsoup


  pip install bs4

Python code to extract all links from a website


import requests as rq
from bs4 import BeautifulSoup

url = input("Enter Link: ")
# Add a scheme if the user omitted it
if not url.startswith(("http://", "https://")):
    url = "https://" + url
data = rq.get(url)
soup = BeautifulSoup(data.text, "html.parser")

# Collect the href attribute of every anchor tag,
# skipping anchors that have no href
links = []
for link in soup.find_all("a"):
    href = link.get("href")
    if href is not None:
        links.append(href)

# Writing the output to a file (myLinks.txt) instead of to stdout
# You can change 'a' to 'w' to overwrite the file each time
with open("myLinks.txt", "a") as saved:
    for link in links:
        print(link, file=saved)
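The script saves relative links (such as /contact) exactly as they appear in the HTML. If you want every saved link to be absolute, one common approach is urljoin from the standard library's urllib.parse module. A minimal sketch, using a hypothetical base URL and a hand-picked mix of links for illustration:

```python
from urllib.parse import urljoin

base_url = "https://example.com/blog/"  # hypothetical page the links came from

# A mix of absolute and relative links, as find_all("a") might return them
links = ["https://other.site/page", "/contact", "post-1.html"]

# urljoin leaves absolute URLs untouched and resolves relative
# ones against the base URL
absolute_links = [urljoin(base_url, link) for link in links]
print(absolute_links)
# ['https://other.site/page', 'https://example.com/contact', 'https://example.com/blog/post-1.html']
```

In the full script, you would pass the url variable as the base and apply urljoin to each href before writing it to the file.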
    


Summary and Conclusion:

In this article, we have seen how to use the requests and beautifulsoup libraries to extract all the links from a web page. If you have any questions, leave them in the comment section.
