Do It Yourself – Website Tutorials
Web scraping is a valuable skill for any data professional: it lets you treat pages across the web as a data source. In this tutorial, we show you how to parse a web page into a CSV data file using the Python package BeautifulSoup.
In this example, we scrape graphics card listings from Newegg.com.
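As a rough sketch of the approach described above (not the code from the video), the snippet below parses a small inline HTML sample with BeautifulSoup. The class names `item-container`, `item-brand`, and `item-title`, the sample markup, and the helper names `fetch_html` and `parse_listings` are assumptions made for illustration; a real page's markup will differ and needs to be inspected first. `find_all` is the current spelling of the `findAll` method.

```python
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page and return its raw HTML bytes (not called below)."""
    with urlopen(url) as response:
        return response.read()


def parse_listings(html):
    """Return one dict per product container found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    listings = []
    # Hypothetical class names; inspect the real page to find the actual ones.
    for container in soup.find_all("div", class_="item-container"):
        brand = container.find("span", class_="item-brand")
        title = container.find("a", class_="item-title")
        listings.append({
            # Fall back to an empty string when a field is missing.
            "brand": brand.get_text(strip=True) if brand else "",
            "product_name": title.get_text(strip=True) if title else "",
        })
    return listings


# Made-up sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<div class="item-container">
  <span class="item-brand">ExampleBrand</span>
  <a class="item-title" href="#">Example Card 8GB, OC Edition</a>
</div>
<div class="item-container">
  <a class="item-title" href="#">Unbranded Card 4GB</a>
</div>
"""

listings = parse_listings(SAMPLE_HTML)
for row in listings:
    print(row)
```

To run this against a live page, pass the result of `fetch_html(url)` to `parse_listings`. Check the site's terms of use before scraping it.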
Python Code:
https://code.datasciencedojo.com/datasciencedojo/tutorials/tree/master/Web%20Scraping%20with%20Python%20and%20BeautifulSoup
Sublime:
https://www.sublimetext.com/3
Anaconda:
JavaScript beautifier:
https://beautifier.io/
If you do not see the "Open command window here" option in Windows 10, follow this tutorial:
https://www.tenforums.com/tutorials/72024-open-command-window-here-add-windows-10-a.html
—
Table of Contents:
0:00 – Introduction
1:28 – Setting up Anaconda
3:00 – Installing Beautiful Soup
3:43 – Setting up urllib
6:07 – Retrieving the Web Page
10:47 – Evaluating Web Page
11:27 – Converting Listings into Line Items
16:13 – Using the JavaScript Beautifier
16:31 – Reading Raw HTML for Items to Scrape
18:34 – Building the Scraper
22:11 – Using the “findAll” Function
27:26 – Testing the Scraper
29:07 – Creating the .csv File
32:18 – End Result
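The "Creating the .csv File" step could be sketched with Python's standard `csv` module, as below. This is one possible approach rather than the video's own code. The column names (`brand`, `product_name`, `shipping`), the sample row, and the helper name `write_listings` are assumptions. Using `csv.DictWriter` means commas inside product names are quoted automatically, so they do not split a column.

```python
import csv
import io


def write_listings(rows, out_file):
    """Write scraped rows to an open text file as CSV with a header line."""
    # Hypothetical column names; adjust to whatever fields you scrape.
    writer = csv.DictWriter(out_file, fieldnames=["brand", "product_name", "shipping"])
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample data; note the comma inside the product name.
rows = [
    {
        "brand": "ExampleBrand",
        "product_name": "Example Card 8GB, OC Edition",
        "shipping": "Free Shipping",
    },
]

# Write to an in-memory buffer so the example leaves no file behind.
buffer = io.StringIO()
write_listings(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, open the target with `open("products.csv", "w", newline="")` (the filename is a placeholder) and pass that file object to `write_listings`.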
—
Learn more about Data Science Dojo here:
https://datasciencedojo.com/data-science-bootcamp/
Watch the latest video tutorials here:
See what our past attendees are saying here:
https://datasciencedojo.com/bootcamp/reviews/#videos
—
Like Us: https://www.facebook.com/datasciencedojo
Follow Us: https://twitter.com/DataScienceDojo
Connect with Us: https://www.linkedin.com/company/datasciencedojo
Also find us on:
Instagram: https://www.instagram.com/data_science_dojo
Vimeo: https://vimeo.com/datasciencedojo
#webscraping #python #pythontutorial