9.2 Practice for Data Maturity Evaluation (2 hours)
Chapter 10 The Future of Data Governance (4 hours)
10.1 Development Trends in Data Governance (2 hours)
10.2 Case Study (2 hours)
The lab includes:
Chapter 1 Big Data Management (2 hours)
Data Management and Data Visualization with Hadoop
Chapter 2 Python & R (2 hours)
This chapter covers the installation of Python and R and the basic syntax of both languages.
2.1 Python & R installation
2.2 Basic syntax of Python & R
Chapter 3 Web Crawler (6 hours)
3.1 Requests library (2 hours)
This section explains HTML tags and how to use the Requests library in Python.
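As a minimal sketch of the two topics in this section, the block below downloads a page with Requests and counts the HTML tags it contains. The helper names `fetch_html` and `count_tags` are assumptions made for illustration and are not taken from the course material. Tag counting uses the standard-library `html.parser` module so that it can be tried without a network connection.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts how often each opening tag appears in an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # Called once for every opening tag, e.g. <p> or <img src="...">.
        self.counts[tag] += 1


def count_tags(html):
    """Return a Counter mapping tag name -> number of opening tags."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


def fetch_html(url, timeout=10):
    """Download a page with Requests and return its decoded text."""
    # Imported here so count_tags can be used without the dependency.
    import requests  # third-party: pip install requests

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    return response.text


# Usage (needs network access; the URL is only a placeholder):
#   page = fetch_html("https://example.com")
#   print(count_tags(page).most_common(5))
```

Setting a `timeout` and calling `raise_for_status()` keeps a crawler from hanging on slow servers or silently processing error pages.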
3.2 BeautifulSoup Library - Picture Grab (2 hours)
This section focuses on the BeautifulSoup library in Python and how to use it to download images from the internet.
3.3 BeautifulSoup Library - Text Grab (2 hours)
This section teaches students to program a web crawler that collects text information from the internet.
Chapter 4 Text Mining (4 hours)
4.1 jieba library (2 hours)
This section explains how to process text with the jieba library.
4.2 TF-IDF algorithm (2 hours)
This section explains how to use the TF-IDF algorithm to extract keywords and measure text similarity.
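The two uses named above can be sketched in plain Python. This is one common TF-IDF variant (term frequency normalized by document length, idf = log(N / df)), not necessarily the exact formula used in the lab. The documents are given as already tokenized lists; in the course they would presumably come from a segmenter such as jieba. All function names and the sample data are illustrative assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF weights for a list of tokenized documents.

    tf(t, d) = count of t in d / number of tokens in d
    idf(t)   = log(N / df(t)), df(t) = number of documents containing t
    """
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_keywords(weight, k=3):
    """Return the k terms with the highest weight (ties broken alphabetically)."""
    ranked = sorted(weight.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def cosine_similarity(a, b):
    """Cosine similarity between two sparse weight dictionaries."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Illustrative sample data, not from the course.
docs = [
    ["data", "governance", "policy"],
    ["data", "quality", "governance"],
    ["network", "visualization", "data"],
]
weights = tf_idf(docs)
```

A term that appears in every document (here `data`) gets weight zero, which is why it never shows up as a keyword and does not contribute to similarity.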
Chapter 5 Network Data Processing (4 hours)
5.1 SQL Statements (2 hours)
This part explains the basic SQL statements: SELECT, UPDATE, INSERT, and DELETE.
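The four statements can be tried out with Python's built-in `sqlite3` module and an in-memory database. This is a small sketch only: the `users` table and its rows are made up for illustration, and the lab may use a different database system.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# INSERT: add rows (the ? placeholders keep values separate from the SQL text).
cur.executemany(
    "INSERT INTO users (name, city) VALUES (?, ?)",
    [("Alice", "Beijing"), ("Bob", "Shanghai"), ("Carol", "Beijing")],
)

# UPDATE: change a column in the rows that match the WHERE clause.
cur.execute("UPDATE users SET city = ? WHERE name = ?", ("Shenzhen", "Bob"))

# DELETE: remove the rows that match the WHERE clause.
cur.execute("DELETE FROM users WHERE name = ?", ("Carol",))

# SELECT: read the remaining rows back.
rows = cur.execute("SELECT name, city FROM users ORDER BY id").fetchall()

conn.commit()
conn.close()
```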
5.2 Generation of Network Data (2 hours)
This part explains how to use SQL statements to clean data and generate social network data that can be used for analysis.
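One way this could look, as a sketch under assumptions: raw records of who posted in which discussion thread are cleaned with SQL, then a self-join turns co-participation into a weighted edge list. The table names `raw_posts` and `posts`, the columns, and the sample rows are hypothetical, and the course data may be structured differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_posts (author TEXT, thread_id INTEGER)")
conn.executemany(
    "INSERT INTO raw_posts VALUES (?, ?)",
    [
        (" Alice ", 1), ("bob", 1), ("ALICE", 2), ("Bob", 2), ("carol", 2),
        (None, 2), ("", 3), ("bob", 1),  # missing, empty, and duplicate rows
    ],
)

# Cleaning: normalize case and whitespace, drop missing names and duplicates.
conn.execute("""
    CREATE TABLE posts AS
    SELECT DISTINCT LOWER(TRIM(author)) AS author, thread_id
    FROM raw_posts
    WHERE author IS NOT NULL AND TRIM(author) <> ''
""")

# Edge generation: two authors are tied if they posted in the same thread.
# The condition a.author < b.author keeps one row per undirected pair.
edges = conn.execute("""
    SELECT a.author AS source, b.author AS target, COUNT(*) AS weight
    FROM posts AS a
    JOIN posts AS b
      ON a.thread_id = b.thread_id AND a.author < b.author
    GROUP BY a.author, b.author
    ORDER BY source, target
""").fetchall()

conn.close()
```

Each resulting row is a (source, target, weight) triple, where weight counts the threads the two authors share.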
Chapter 6 Network Visualization (4 hours)
6.1 NetDraw (2 hours)
This part provides an overview of the NetDraw software and explains the structure of VNA data.
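To make the VNA structure concrete, the sketch below writes node attributes and ties in the commonly documented layout: a `*Node data` section whose first column is `ID`, followed by a `*Tie data` section whose first two columns are `from` and `to`. The function name `to_vna`, the attribute names, and the sample data are assumptions for illustration; the exact rules of the format should be checked against the NetDraw documentation.

```python
def quote(value):
    """Wrap a value in double quotes if it contains a space."""
    text = str(value)
    return f'"{text}"' if " " in text else text


def to_vna(nodes, node_attrs, ties, tie_attr="strength"):
    """Build VNA text from node attributes and weighted ties.

    nodes:      dict mapping node ID -> dict of attribute values
    node_attrs: list of attribute names to export, in column order
    ties:       iterable of (source, target, value) tuples
    """
    lines = ["*Node data", " ".join(["ID"] + node_attrs)]
    for node_id, attrs in nodes.items():
        row = [quote(node_id)] + [quote(attrs[name]) for name in node_attrs]
        lines.append(" ".join(row))
    lines.append("*Tie data")
    lines.append(f"from to {tie_attr}")
    for source, target, value in ties:
        lines.append(f"{quote(source)} {quote(target)} {value}")
    return "\n".join(lines) + "\n"


# Illustrative sample data, not from the course.
vna_text = to_vna(
    nodes={"alice": {"group": "A"}, "bob": {"group": "B"}},
    node_attrs=["group"],
    ties=[("alice", "bob", 2)],
)
```

Writing `vna_text` to a file with a `.vna` extension would give a file that NetDraw is expected to open, assuming the layout above matches the installed version.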
6.2 Data Visualization (2 hours)
This section explains how to use the NetDraw software to visualize and analyze social network data.
Chapter 7 Data Management Cases (4 hours)
Case Study on Data Quality
Chapter 8 Course Project (6 hours)