Lecture 18 Kernel PCA and manifold learning
Lecture 19 Expectation-maximization (EM) algorithm and Gaussian mixture models
Lecture 20 Social network analysis: Google PageRank
Lecture 21 Neural networks
Lecture 22 Deep learning: convolutional neural networks (CNN) and recurrent neural networks (RNN)
Lecture 23 Restricted Boltzmann machines and generative models
Lecture 24 Recommender systems
Lab sessions (32 hours)
Chapter 1: Software installation and data preprocessing. Demonstrates how to install Jupyter Notebook. Case study 1:
preprocessing a teen market segmentation dataset. Case study 2: hypertension data analysis. The session covers how to
standardize and discretize variables, handle missing values, and detect outliers. (2 hours)
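The four preprocessing steps named in Chapter 1 can be sketched with the Python standard library alone. This is a minimal illustration, not the course's own notebook: the blood pressure readings, the bin edges, and every function name below are made up for the example, and the outlier rule shown (1.5 times the interquartile range) is only one common choice.

```python
import statistics

def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in values]

def standardize(values):
    """Z-score: subtract the mean, divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def discretize(values, edges):
    """Map each value to a bin index; edges are ascending cut points."""
    return [sum(v >= e for e in edges) for v in values]

def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical systolic readings in mmHg; None marks a missing value.
readings = [118, 122, None, 130, 125, 210, 119, 127]

filled = impute_mean(readings)           # missing-value handling
z_scores = standardize(filled)           # standardization
bins = discretize(filled, [120, 140])    # 0: <120, 1: 120-139, 2: >=140
outliers = iqr_outliers(filled)          # outlier detection

print(bins)
print(outliers)
```

In practice the order of these steps matters: imputing with the mean before removing outliers, as done here for brevity, lets an extreme reading pull the imputed value upward.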
Chapter 2: Classification models. Case study: optical character recognition with SVM. Compares the results obtained with
different kernel functions, examines which kernels suit which kinds of data, and tunes hyperparameters. (2 hours)
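To make the idea of "different kernel functions" concrete, the sketch below evaluates three widely used kernels on one pair of points. It is only an illustration under assumed settings: the function names, the sample vectors, and the default values of `degree`, `coef0`, and `gamma` are chosen for this example and are not taken from the course material.

```python
import math

def linear_kernel(x, z):
    """Plain inner product of the two vectors."""
    return sum(a * b for a, b in zip(x, z))

def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """(x . z + coef0) ** degree"""
    return (linear_kernel(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=0.5):
    """exp(-gamma * ||x - z||^2); approaches 0 as the points move apart."""
    return math.exp(-gamma * math.dist(x, z) ** 2)

# Two made-up feature vectors.
x, z = (1.0, 2.0), (3.0, 4.0)

print(linear_kernel(x, z))
print(polynomial_kernel(x, z))
print(rbf_kernel(x, z))
```

The same pair of points gets a very different similarity score under each kernel, which is why kernel choice and its parameters (such as `gamma` for the RBF kernel) are treated as hyperparameters to tune.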
Chapter 3: Classification models. Case study: building an automatic breast cancer diagnosis model with the K-nearest
neighbors algorithm. Case study: building a personal credit risk assessment model with a decision tree. Parameter tuning
shows how different K values, split points, and similar settings lead to different results. (2 hours)
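The effect of K on a nearest-neighbor classifier can be seen on a toy example. The sketch below is a minimal K-nearest neighbors implementation using Euclidean distance and majority vote; the two-dimensional points and their labels are invented for illustration and have nothing to do with the actual breast cancer dataset used in the lab.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Classify query by majority vote among its k nearest training points."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Made-up 2-D feature vectors with made-up labels.
train = [
    ((1.0, 1.0), "benign"),
    ((1.2, 0.8), "benign"),
    ((0.9, 1.3), "benign"),
    ((3.0, 3.2), "malignant"),
    ((3.1, 2.9), "malignant"),
    ((2.0, 2.0), "malignant"),
]
query = (1.8, 1.7)

for k in (1, 3, 5):
    print(k, knn_predict(train, query, k))
```

With these points, K = 1 follows the single closest neighbor and predicts "malignant", while K = 3 and K = 5 are outvoted by the surrounding "benign" points. A small K is sensitive to individual points; a larger K smooths the decision.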
Chapter 4: Regression models. Case study: a model for predicting medical expenses. Covers univariate and multiple
regression, parameter estimation (least squares and maximum likelihood), and tuning regularization to address
overfitting and multicollinearity. (2 hours)
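For the univariate case, least squares and its regularized variant both have closed forms, which the sketch below computes directly. It assumes a ridge (L2) penalty on the slope only, with the intercept left unpenalized; the `ridge` parameter name and the age and expense figures are hypothetical and serve only to show the shrinkage effect.

```python
def fit_line(xs, ys, ridge=0.0):
    """Fit y = a + b*x by least squares.

    With ridge > 0 the objective adds ridge * b**2, which shrinks the
    slope toward zero; the intercept is not penalized.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + ridge)
    intercept = y_mean - slope * x_mean
    return intercept, slope

# Hypothetical data: age in years, annual medical expenses in thousands.
ages = [20, 30, 40, 50, 60]
expenses = [2.0, 3.1, 3.9, 5.2, 5.8]

ols_intercept, ols_slope = fit_line(ages, expenses)
ridge_intercept, ridge_slope = fit_line(ages, expenses, ridge=1000.0)

print(ols_intercept, ols_slope)
print(ridge_intercept, ridge_slope)
```

The penalized slope is smaller in magnitude than the ordinary least squares slope. In multiple regression the same idea stabilizes coefficients when predictors are strongly correlated, which is how regularization helps with multicollinearity.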
Chapter 5: Logistic regression and naive Bayes. Case study: mobile phone spam message filtering and gender prediction
from Chinese names with the naive Bayes algorithm, including methods for handling continuous variables (discretization
and probability distribution functions). Case study: use logistic regression to classify iris species and analyze maximum likelihood