Commit b1db3675 authored by whzecomjm

modi lecnote download, add html download

parent 435575fe
+82 −6
%% Cell type:markdown id: tags:

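# Batch-downloading PDF lecture notes

The example here is Solomyak's course GMT and Fractals.

`urlretrieve` saves a URL to a local file. In Python 3 it lives in `urllib.request`; in Python 2 it lives in `urllib`.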
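
If the same notebook has to run under both interpreters, the import location mentioned above could be handled with a fallback import. This is only a minimal sketch; the commented usage line uses a made-up URL and file name.

```python
# Import urlretrieve from wherever the running interpreter keeps it.
try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve  # Python 2

# Hypothetical usage (placeholder URL and file name):
# urlretrieve('http://example.com/notes.pdf', 'notes.pdf')
```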

%% Cell type:code id: tags:

``` python
from urllib import request
from bs4 import BeautifulSoup
import re

url=r'http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/index.html'
link=r'http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/'

# proxy={'http':'http://localhost:80'}
headers = ("User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36")  # pretend to be a browser
opener = request.build_opener()
opener.addheaders = [headers]
request.install_opener(opener)
# Installing the opener globally means urlretrieve also sends the browser-like header.

contents = request.urlopen(url).read().decode()
soup = BeautifulSoup(contents,"html.parser")
for tag in soup.find_all('a'):
    pdf = tag.get('href')
    if not pdf:  # skip anchors without an href, which would break link + pdf
        continue
    pdfurl = link + pdf
    print(pdfurl + "\n")
    # urlretrieve saves the file; Python 3: urllib.request, Python 2: urllib
    pdfdir = '/Users/whzecomjm/Desktop/' + pdf
    # On Windows: pdfdir = 'C:/Users/whzec/Desktop/' + pdf
    # Uncomment the next line to actually download each file.
    # request.urlretrieve(pdfurl, pdfdir)
```
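
The loop above builds each address by concatenating `link` and the raw `href`, which only works while every link on the page is a relative path to a file. A more defensive variant might resolve each `href` against the page URL and keep only PDF targets. The sketch below is an assumption-laden illustration: the helper name `pdf_urls` is hypothetical, and the `mailto:` and `example.com` entries are invented test inputs, not links from the course page.

```python
from urllib.parse import urljoin, urlparse

def pdf_urls(hrefs, base_url):
    """Return absolute URLs for the hrefs that point at .pdf files."""
    urls = []
    for href in hrefs:
        if not href:  # tag.get('href') returns None when the attribute is missing
            continue
        absolute = urljoin(base_url, href)  # handles relative and absolute hrefs
        if urlparse(absolute).path.lower().endswith('.pdf'):
            urls.append(absolute)
    return urls

base = 'http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/index.html'
# Invented sample of href values, including cases the plain concatenation mishandles.
hrefs = ['syll.pdf', None, 'Lec1.pdf',
         'mailto:someone@example.com',
         'http://example.com/other/notes.PDF']
result = pdf_urls(hrefs, base)
```

With BeautifulSoup, the `hrefs` list could be produced as `[tag.get('href') for tag in soup.find_all('a')]`.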

%% Output

    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/syll.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/hw1.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/hw2.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/hw2a.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/hw3.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/hw3a.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/hw4.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/Lec1.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/Lec2.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/Lec3.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/Lec4.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/Lec5.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/Lec6.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/Lec7.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/Lec8.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/Lec9.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/Lec10.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/Lec11.pdf
    
    http://u.math.biu.ac.il/~solomyb/TEACH/18/GMT/Bernotes.pdf
    

%% Cell type:markdown id: tags:

# Downloading the Tsinghua lecture notes on computational conformal geometry

Link: https://mp.weixin.qq.com/s/H9MBGEBw8T6BwcUnBcARDg

Reference: https://blog.csdn.net/qq_36369941/java/article/details/88411464

%% Cell type:code id: tags:

``` python
from urllib import request
from bs4 import BeautifulSoup
import re

url=r'https://mp.weixin.qq.com/s/H9MBGEBw8T6BwcUnBcARDg'

headers = ("User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36")  # pretend to be a browser
opener = request.build_opener()
opener.addheaders = [headers]
request.install_opener(opener)
# Installing the opener globally means urlretrieve also sends the browser-like header.

contents = request.urlopen(url).read().decode()
soup = BeautifulSoup(contents,"html.parser")
urls_li = soup.select("#js_content > ol > li")
for url_li in urls_li:
    # a_tag (not url) so the page address stored in url is not overwritten
    for a_tag in url_li.select('a'):
        url_href = a_tag.get("href")
        url_title = a_tag.string
        if url_title is not None:
            ddir = '/Users/whzecomjm/Desktop/' + url_title + '.html'
            print(url_title)
            # print(url_href)
            # Uncomment the next line to save each article as HTML.
            # request.urlretrieve(url_href, ddir)

```
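
Both cells install a global opener so that `urlretrieve` sends the browser-like header. An alternative that leaves global state untouched might attach the header to a single `Request` and stream the response to disk. This is a sketch only: `build_request` and `download` are hypothetical helper names, and the shortened `USER_AGENT` string is a placeholder for the full one used above.

```python
import shutil
from urllib import request

# Placeholder value; substitute the full browser string from the cell above.
USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; WOW64)"

def build_request(url):
    """Create a Request carrying the User-Agent header for this URL only."""
    return request.Request(url, headers={"User-Agent": USER_AGENT})

def download(url, dest):
    """Fetch url with the custom header and write the body to dest."""
    with request.urlopen(build_request(url)) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)

# Hypothetical usage (placeholder URL and file name):
# download('http://example.com/article.html', 'article.html')
```

`urlretrieve` itself expects a URL string rather than a `Request` object, which is why this sketch falls back to `urlopen` plus `shutil.copyfileobj`.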

%% Output

    《计算共形几何-理论篇》教程简介
    十年磨一剑:《计算共形几何(理论篇)》即将出版
    几何为万物赋能——建筑、医疗、动漫、游戏…… | 凤凰卫视世纪大讲堂
    清华笔记:计算共形几何讲义 (0) 背景简介
    清华笔记:计算共形几何讲义 (1)代数拓扑
    清华笔记:计算共形几何讲义 (2)代数拓扑
    清华笔记:计算共形几何讲义 (3)微分拓扑
    清华笔记:计算共形几何讲义 (4)单纯同调
    清华笔记:计算共形几何讲义 (4.5)相对同调 Mayer-Vietoris 序列
    清华笔记:计算共形几何讲义 (5)上同调理论
    清华笔记:计算共形几何讲义 (6)上同调的霍奇理论
    清华笔记:计算共形几何讲义 (7)矢量场设计
    清华笔记:计算共形几何讲义 (8)狭缝映射(Slit Map)的存在性
    清华笔记:计算共形几何讲义 (9)全纯微分
    清华笔记:计算共形几何讲义 (10)纪念米尔扎哈尼——泰希米勒(Teichmuller)空间
    清华笔记:计算共形几何讲义 (11)黎曼映照(Riemann Mapping)的存在性
    清华笔记:计算共形几何讲义 (12)极值长度
    清华笔记:计算共形几何讲义 (13)Koebe 迭代收敛性
    清华笔记:计算共形几何讲义 (14)共形模的计算
    清华笔记:计算共形几何讲义 (15)拓扑圆盘的调和映照
    清华笔记:计算共形几何讲义 (16)拓扑球面的调和映照
    清华笔记:计算共形几何讲义 (17)全纯二次微分(holomorphic quadratic differential)
    清华笔记:计算共形几何讲义 (18)拟共形映射(Quasi-Conformal Map)
    离散曲面曲率流 (Discrete Surface Ricci Flow ) I
    清华笔记:计算共形几何讲义 (20)离散曲面曲率流 (Discrete Surface Ricci Flow)II
    清华笔记:计算共形几何讲义 (21)离散曲面曲率流 (Discrete Surface Ricci Flow)III
    清华笔记:计算共形几何讲义 (22)离散曲面曲率流 (Discrete Surface Ricci Flow)IV
    清华笔记:计算共形几何讲义 (23)离散曲面曲率流 (Discrete Surface Ricci Flow)V
    清华笔记:计算共形几何讲义 (24)Teichmuller 映射
    清华笔记:计算共形几何讲义 (25) 共形几何的概率解释
    清华笔记:计算共形几何讲义 (26) 单值化定理证明
    计算共形几何讲义:纤维丛和陈类
    计算共形几何讲义:阿贝尔定理
    计算共形几何讲义:雅可比定理
    计算共形几何讲义:黎曼-罗赫定理
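
Some of the titles listed above contain characters such as `|` that are not valid in file names on every system, so using a title directly as the file name may fail once the download line is enabled. One possible workaround is to replace those characters first. The helper `safe_filename` and the sample title below are hypothetical.

```python
import re

def safe_filename(title, replacement='_'):
    """Replace characters that common file systems reject in file names."""
    cleaned = re.sub(r'[\\/:*?"<>|]', replacement, title)
    return cleaned.strip()

# Invented sample title containing several problematic characters.
title = 'Lecture 1: Intro | Part A/B'
name = safe_filename(title) + '.html'  # 'Lecture 1_ Intro _ Part A_B.html'
```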