Lecturer |
Jun.-Prof. Dr. Tristan Becker, TU Dresden, Junior Professor in Business Administration, esp. Management Science |
Date | September 10 and 11, 2024, with classes from 9:00 a.m. to 4:00 p.m. |
Room/Address |
TU Dresden, Georg-Schumann-Bau (SCH/B37) |
Seminar content |
The internet contains vast amounts of open data. In most cases, the
desired data cannot simply be downloaded in a structured form; instead,
it is distributed across many web pages. Web scraping is a technique
for automating the extraction of such data. There are numerous
applications, such as gathering price data from online shops,
collecting information from social media websites like Facebook or
Twitter, gathering data from job networks, and collecting general
information on sports results or movie scores. With web scraping, the
data from a large number of web pages can be collected quickly and
saved in a structured data set. This data holds potential for all kinds
of research projects using, e.g., statistical or optimization methods.
In this course, we will explore the fundamentals of web scraping with
Python 3. We will learn how to access APIs with Python and cover the
basics of web scraping. This includes an overview of the fundamental
elements that make up websites, libraries for web scraping (such as
requests, Beautiful Soup, Scrapy, and Selenium WebDriver), and a brief
discussion of data storage. We will also work through some examples of
scraping real websites. |
Prerequisites | We recommend basic programming skills in Python 3. |
Certificate |
Ph.D. students from the Faculty of Business and Economics, TU Dresden, can earn a
certificate according to § 9 of the doctoral regulations (PromO 2018):
Ph.D. students of Business Administration: § 9 (1) No. 5 or 6;
Ph.D. students of Business Information Systems: § 9 (1) No. 6;
Ph.D. students of Economics: § 9 (1) No. 6.
Ph.D. students from other universities can earn a certificate as well. |
Assignment | Students must complete a brief web scraping assignment: pick a website and apply the web scraping skills from this course to compile a data set (e.g., weather data, sports results, or price data). Both the code and the data must be submitted. |
Registration |
To register, send an e-mail to Dr. Uta Schwarz:
uta.schwarz@tu-dresden.de (phone: +49 351 463-33141) |