课程详述
COURSE SPECIFICATION
以下课程信息可能根据实际授课需要或在课程检讨之后产生变动。如对课程有任何疑问,
请联系授课教师。
The following course information may be subject to change, either during the session because of unforeseen
circumstances or following review of the course at the end of the session. Queries about the course should be
directed to the course instructor.
1.
课程名称 Course Title
大数据治理与商业模式 Big Data Governance and Business Model
2.
授课院系
Originating Department
信息系统与管理工程系 Division of Information Systems & Management Engineering
3.
课程编号
Course Code
MIS303
4.
课程学分 Credit Value
3
5.
课程类别
Course Type
专业核心课 Major Core Courses
6.
授课学期
Semester
秋季 Fall
7.
授课语言
Teaching Language
英语 English
8.
授课教师、所属学系、联系方式(如属团队授课,请列明其他授课教师)
Instructor(s), Affiliation & Contact
(For team teaching, please list all instructors)
陈琨,信息系统与管理工程系, chenk@sustech.edu.cn
Kun Chen, Division of Information Systems & Management Engineering,
chenk@sustech.edu.cn
9.
实验员/助教、所属学系、联系方式
Tutor/TA(s), Contact
待公布 To be announced
10.
选课人数限额(可不填)
Maximum Enrolment (Optional)
40
11.
授课方式
Delivery Method
习题/辅导/讨论
Tutorials
实验/实习
Lab/Practical
其它(请具体注明)
Other (Please specify)
总学时
Total
学时数
Credit Hours
32
64
12.
先修课程、其它学习要求
Pre-requisites or Other
Academic Requirements
EBA203 管理信息系统 或 MIS205 数据管理与数据库
EBA203 Management Information Systems or MIS205 Data Management and Databases
13.
后续课程、其它学习规划
Courses for which this course
is a pre-requisite
None
14.
其它要求修读本课程的学系
Cross-listing Dept.
None
教学大纲及教学日历 SYLLABUS
15.
教学目标 Course Objectives
课程旨在培养学生将数据治理融合到机构的商业模式中。学生将学会如何运用信息技术,根据法规和机构运营制定并执行
规章制度以保护组织的信息资产。通过介绍制定有效数据治理的流程,课程将帮助学生掌握如何将来自企业各个部门的不
同期望和专业知识汇集在一起,从而做出有数据支持的决策。同时,课程将介绍有效数据治理的完整生命周期,即从元数
据管理到隐私和合规。
This course provides students with knowledge of how data governance fits into an organization’s business
model. Students will learn how to work with information technology, regulations and operations to set policies and initiate
good practices that safeguard an organization’s information assets. By introducing the processes for developing
effective data governance, the course enhances students’ understanding of how an organization can bring
together diverse expectations and expertise across the enterprise to enable data-informed decision making. In addition,
the course introduces students to the complete life cycle of effective data governance, from metadata management to
privacy and compliance.
16.
预达学习成果 Learning Outcomes
经过学习该门课程,学生将会获得:
了解什么是数据以及数据的运作
评估数据管理能够解决的业务问题
评估数据管理和治理遇到的挑战
解释完整的数据治理模型
掌握如何通过数据治理减轻监管和业务风险
阐明数据质量策略
记录数据治理的商务业务需求
创建数据质量改进计划
At the end of this course, students will be able to:
understand what data are and how they work
assess the business issues that data management can resolve
evaluate and explain the challenges inherent in data management and governance
explain data governance maturity models
understand how to mitigate regulatory and operational risk through data governance
articulate a data quality strategy
document the business needs for data governance
create a data quality improvement plan
17.
课程内容及教学日历 (如授课语言以英文为主,则课程内容介绍可以用英文;如团队教学或模块教学,教学日历须注明
主讲人)
Course Contents (in Parts/Chapters/Sections/Weeks. Please indicate the name of the instructor for each course section
if this is a team-taught or modular course.)
本课程将概述数据治理和商业模式。 它将涵盖基于商业模式,代理的角色和职责,管理,治理沟通,法规,隐私问题,
数据道德和风险管理来构建治理基础结构。随着数据量的增加和数据用途的扩展,现实应用越来越强调大数据环境下数据
质量的问题。本课程将解决数据质量问题,并讨论数据治理和商业模型中的各种问题。
理论(32 学时)
第一章 数据概念和定义(4 学时)
1.1 数据分类 (1 学时)
1.2 数据预处理 (3 学时)
第二章 数据挖掘和数据分析技术(4 学时)
2.1 分类、聚类技术(2 学时)
2.2 关联关系挖掘(2 学时)
第三章 数据治理原则和风险(2 学时)
3.1 数据治理顶层架构(1 学时)
3.2 数据隐私保护(1 学时)
第四章 元数据管理和数据治理(2 学时)
4.1 元数据定义(1 学时)
4.2 基于元数据的管理(1 学时)
第五章 数据治理构建业务案例(4 学时)
5.1 区块链与数据共享(2 学时)
5.2 5G 技术与数据安全 (2 学时)
第六章 通过治理实现数据质量(4 学时)
6.1 数据管控流程(2 学时)
6.2 数据管控绩效(2 学时)
第七章 数据质量评估和测量(2 学时)
7.1 数据质量需求(1 学时)
7.2 数据质量分析(1 学时)
第八章 数据评估方案(2 学时)
8.1 数据监察及评估 (1 学时)
8.2 数据提升 (1 学时)
第九章 数据质量的战略方法(4 学时)
9.1 数据管理成熟度评估模型 (2 学时)
9.2 数据管理成熟度评估实践(2 学时)
第十章 数据治理的未来(4 学时)
10.1 国内外数据治理的演变与发展(2 学时)
10.2 未来趋势 (2 学时)
实验(32 学时)
第一章 数据管理(2 学时)
大数据存储和描述工具 Hadoop
第二章 Python & R(2 学时)
本章重点介绍 Python & R 的安装以及基本语法。
2.1 Python & R 安装
2.2 Python & R 的基本语法
第三章 Web 爬虫程序(6 学时)
3.1 Requests 库(2 学时)
本节说明了 HTML 标记以及 Python Requests 库的用法。
3.2 BeautifulSoup 库-图片抓取(2 学时)
本节重点介绍 Python 中的 BeautifulSoup 库,并使用该库从 Internet 上获取图像。
3.3 BeautifulSoup 库-文本抓取(2 学时)
本节主要辅导学生编写网络爬虫程序,以从网上获取文本信息。
第四章 文本挖掘(4 学时)
4.1 jieba 库(2 学时)
本节主要说明 jieba 库的处理。
4.2 TF-IDF 算法(2 学时)
本节主要说明如何使用 TF-IDF 算法提取关键字并寻求文本相似性。
第五章 网络数据处理(4 学时)
5.1 SQL 语句(2 学时)
这部分主要说明基本的 SQL 语句:select,update,insert,delete
5.2 网络数据的生成(2 学时)
本部分主要说明如何使用 SQL 语句清洗数据并生成可用于分析的社交网络数据。
第六章 网络可视化(4 学时)
6.1 NetDraw(2 学时)
本部分主要概述 NetDraw 软件,并说明 VNA 数据的结构。
6.2 数据可视化(2 学时)
本节说明如何使用 NetDraw 软件进行社交网络数据可视化分析。
第七章 数据管理案例(4 学时)
数据质量案例研究
第八章 课程项目(6 学时)
This course provides an overview of data governance and business models. It covers building a governance
infrastructure based on the business model, agents’ roles and responsibilities, stewardship, governance communications,
regulatory compliance, privacy concerns, data ethics, and risk management. The course addresses data quality as a
continuous issue in data management and emphasizes the challenges of data quality in the context of big data as
volumes of data increase and the uses for data expand. It draws on case studies, trends, techniques, and
best practices in examining data governance and business models.
Topics include the following:
Chapter 1 Data Concepts and Definitions (4 hours)
1.1 Classification of Data (2 hours)
1.2 Data Pre-processing (2 hours)
Chapter 2 Data Mining and Data Analysis Techniques (4 hours)
2.1 Classification and Clustering Techniques (2 hours)
2.2 Association Mining (2 hours)
Chapter 3 Data Governance Principles and Risk (2 hours)
3.1 The Architecture of Data Governance (1 hour)
3.2 Data Privacy Protection (1 hour)
Chapter 4 Metadata Management and Data Governance (2 hours)
4.1 Metadata (1 hour)
4.2 Metadata Management (1 hour)
Chapter 5 Building a Business Case for Data Governance (4 hours)
5.1 Blockchain and Data Sharing (2 hours)
5.2 5G Technology and Data Security (2 hours)
Chapter 6 Data Quality Through Governance (4 hours)
6.1 The Process of Data Quality Control (2 hours)
6.2 The Evaluation of Data Quality Control (2 hours)
Chapter 7 Data Quality Assessment and Measurement (2 hours)
7.1 Requirements for Data Quality Assessment (1 hour)
7.2 Methods for Data Quality Assessment (1 hour)
Chapter 8 Data Assessment Scenarios (2 hours)
8.1 Methods for Evaluating Data (1 hour)
8.2 Case Study (1 hour)
Chapter 9 A Strategic Approach to Data Maturity (4 hours)
9.1 A Model for Data Maturity Evaluation (2 hours)
9.2 Practice for Data Maturity Evaluation (2 hours)
Chapter 10 The Future of Data Governance (4 hours)
10.1 Development Trends in Data Governance (2 hours)
10.2 Case Study (2 hours)
The lab sessions include:
Chapter 1 Big Data Management (2 hours)
Data management and data visualization with Hadoop
Chapter 2 Python & R (2 hours)
This chapter covers the installation and basic syntax of Python and R.
2.1 Python & R installation
2.2 Basic syntax of Python & R
Chapter 3 Web Crawler (6 hours)
3.1 Requests library (2 hours)
This section explains HTML tags and the use of the Requests library in Python.
3.2 BeautifulSoup library - image scraping (2 hours)
This section focuses on the BeautifulSoup library in Python and uses it to download images from the internet.
3.3 BeautifulSoup library - text scraping (2 hours)
This section guides students in writing a web crawler that collects text from the internet.
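As a rough illustration of the text-scraping step in 3.3, the sketch below pulls paragraph text out of an HTML string. It is a stand-in under stated assumptions: it uses Python's standard-library `html.parser` instead of Requests and BeautifulSoup so that it runs offline without extra installs, and the sample HTML, the class `ParagraphExtractor`, and the function `extract_paragraphs` are invented for illustration.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text inside every <p> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []   # finished paragraph strings
        self._depth = 0        # greater than 0 while inside a <p> element
        self._buffer = []      # text fragments of the current paragraph

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "p" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Join fragments and collapse runs of whitespace.
                text = " ".join("".join(self._buffer).split())
                if text:
                    self.paragraphs.append(text)
                self._buffer = []

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def extract_paragraphs(html):
    """Return the text of each <p> element in document order."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


# Made-up page content; in the lab the HTML would come from an HTTP response.
SAMPLE_HTML = """
<html><body>
  <h1>Data governance news</h1>
  <p>Metadata <b>management</b> matters.</p>
  <div>navigation</div>
  <p>Privacy and compliance.</p>
</body></html>
"""

if __name__ == "__main__":
    for line in extract_paragraphs(SAMPLE_HTML):
        print(line)
```

Text inside nested inline tags such as `<b>` is kept because the parser only tracks whether it is currently inside a paragraph; headings and navigation blocks are skipped.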
Chapter 4 Text Mining (4 hours)
4.1 jieba library (2 hours)
This section explains how to process text with the jieba library.
4.2 TF-IDF algorithm (2 hours)
This section explains how to use the TF-IDF algorithm to extract keywords and measure text similarity.
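The keyword extraction and similarity steps named in 4.2 can be sketched in plain Python. This is a minimal sketch, not the course's lab code: it assumes the documents are already tokenized (for Chinese text, a segmenter such as jieba would produce the tokens), it uses the unsmoothed weighting tf = count / document length and idf = log(N / df), which is only one of several common variants, and the toy corpus and function names are invented.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses tf = count / document length and idf = log(N / df).
    """
    n_docs = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def top_keywords(vector, k=2):
    """Terms with the highest weights; ties are broken alphabetically."""
    ranked = sorted(vector.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy corpus (invented); real lab data would be segmented text.
DOCS = [
    "data quality needs data governance".split(),
    "data governance needs metadata".split(),
    "web crawler collects text".split(),
]
VECTORS = tf_idf(DOCS)

if __name__ == "__main__":
    print(top_keywords(VECTORS[0]))
    print(cosine_similarity(VECTORS[0], VECTORS[1]))
```

Terms that occur in every document receive a weight of zero under this idf definition, which is why terms unique to one document rank highest as keywords.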
Chapter 5 Network Data Processing (4 hours)
5.1 SQL statements (2 hours)
This part explains the basic SQL statements: SELECT, UPDATE, INSERT, DELETE.
5.2 Generation of network data (2 hours)
This part explains how to use SQL statements to clean data and generate social network data that can be used for analysis.
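To make the cleaning-then-generating idea in this chapter concrete, here is a small sketch that uses the four basic statements on an in-memory SQLite database through Python's standard `sqlite3` module. The table `replies`, its columns, and the sample rows are hypothetical; the course may use a different database system and dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE replies (author TEXT, reply_to TEXT)")

# INSERT: load raw interaction records (made-up sample data).
rows = [
    ("Alice", "bob"),
    ("alice ", "bob"),
    ("bob", "carol"),
    ("carol", None),      # missing target
    ("dave", "dave"),     # self-reply
]
cur.executemany("INSERT INTO replies VALUES (?, ?)", rows)

# UPDATE: normalize user names so that spelling variants collapse to one node.
cur.execute(
    "UPDATE replies SET author = lower(trim(author)), "
    "reply_to = lower(trim(reply_to))"
)

# DELETE: drop records that cannot form a valid tie between two users.
cur.execute("DELETE FROM replies WHERE reply_to IS NULL OR author = reply_to")

# SELECT: aggregate the cleaned records into a weighted edge list.
cur.execute("""
    SELECT author, reply_to, COUNT(*) AS weight
    FROM replies
    GROUP BY author, reply_to
    ORDER BY author, reply_to
""")
edges = cur.fetchall()
conn.close()

if __name__ == "__main__":
    for source, target, weight in edges:
        print(source, target, weight)
```

Each resulting row is one directed tie (source, target, weight), which is the shape most network tools expect as input.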
Chapter 6 Network Visualization (4 hours)
6.1 NetDraw (2 hours)
This part provides an overview of the NetDraw software and explains the structure of VNA data.
6.2 Data visualization (2 hours)
This section explains how to use NetDraw to visualize and analyze social network data.
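As an illustration of the VNA structure mentioned in 6.1, the sketch below writes a small network as VNA text. It assumes the commonly used layout of a `*Node data` section followed by a `*Tie data` section, each starting with a header row of column names; the NetDraw documentation should be checked for the full format, including how to quote values that contain spaces. The attribute name `group`, the function `to_vna`, and the sample nodes are invented.

```python
def to_vna(nodes, edges):
    """Render nodes and weighted edges as VNA text.

    nodes: list of (node_id, group) pairs.
    edges: list of (source, target, weight) triples.
    Assumes identifiers and attribute values contain no spaces.
    """
    lines = ["*Node data", "ID group"]
    lines += [f"{node_id} {group}" for node_id, group in nodes]
    lines += ["*Tie data", "from to strength"]
    lines += [f"{src} {dst} {weight}" for src, dst, weight in edges]
    return "\n".join(lines) + "\n"


# Made-up example network.
NODES = [("alice", "staff"), ("bob", "student"), ("carol", "student")]
EDGES = [("alice", "bob", 2), ("bob", "carol", 1)]

if __name__ == "__main__":
    print(to_vna(NODES, EDGES), end="")
```

Saving the returned string to a file with a `.vna` extension would give a file that can be tried in NetDraw.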
Chapter 7 Data Management Cases (4 hours)
Case study on data quality
Chapter 8 Course Project (6 hours)
18.
教材及其它参考资料 Textbook and Supplementary Readings
John Ladley, Data Governance: How to Design, Deploy and Sustain an Effective Data Governance Program (The Morgan
Kaufmann Series on Business Intelligence), 1st Edition
课程评估 ASSESSMENT
19.
评估形式 Type of Assessment | 评估时间 Time | 占考试总成绩百分比 % of final score | 违纪处罚 Penalty | 备注 Notes
出勤 Attendance | | | |
课堂表现 Class Performance | 课上 In class | 10 | |
小测验 Quiz | | | |
课程项目 Projects | 学期中 During the semester | 20 | |
平时作业 Assignments | | | |
期中考试 Mid-Term Test | | | |
期末考试 Final Exam | 期末 End of semester | 30 | |
期末报告 Final Presentation | 期末 End of semester | 40 | |
其它(可根据需要改写以上评估方式) Others (The above may be modified as necessary) | | | |
20.
记分方式 GRADING SYSTEM
A. 十三级等级制 Letter Grading
B. 二级记分制(通过/不通过) Pass/Fail Grading
课程审批 REVIEW AND APPROVAL
21.
本课程设置已经过以下责任人/委员会审议通过
This Course has been approved by the following person or committee of authority