Song websites: how to scrape mp3 and lyrics

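The workflow for getting audio and lyrics from a song website:

1. Enter a song name and look up the id of the matching song on the site

2. Use the song id to download the lyrics

This is simple URL concatenation.

3. Use the song id to download the mp3 audio

First send a POST request with the id to obtain the audio resource path,

then send a GET request to fetch the audio resource itself.

Four network requests cover the whole job:

search for the song, fetch the lyrics, fetch the audio resource path, fetch the audio resource.

Note that all four requests have to look like normal browser requests.

A GET request needs request headers configured; a POST request needs both request headers and a request body.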
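As a rough illustration of the two request shapes described above, the sketch below builds (but does not send) one GET and one POST request with the standard library's `urllib`. The helper names `build_get` and `build_post`, the shortened header values, and the `/weapi/example` path are assumptions made for illustration; the article's own code uses a `requests.Session` instead.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Browser-like headers; the values here are illustrative assumptions.
BROWSER_HEADERS = {
    "Accept": "*/*",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64)",
}


def build_get(url):
    """Build a GET request: headers only, no body."""
    return Request(url, headers=BROWSER_HEADERS, method="GET")


def build_post(url, form):
    """Build a POST request: headers plus a form-encoded body."""
    body = urlencode(form).encode("utf-8")
    headers = dict(BROWSER_HEADERS)
    headers["Content-Type"] = "application/x-www-form-urlencoded"
    return Request(url, data=body, headers=headers, method="POST")


# Nothing is sent here; the objects only describe the requests.
get_req = build_get("http://music.x.com/api/song/lyric/?id=1")
post_req = build_post("http://music.x.com/weapi/example", {"s": "hello"})
print(get_req.get_method(), get_req.data)
print(post_req.get_method(), post_req.data)
```

The only structural difference between the two is the body: the GET request carries none, while the POST request carries the encoded form data together with a matching `Content-Type` header.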

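1. Search the site for the song

First, prepare to mimic a normal browser request.

Configure the Session.

There is an encryption/decryption step involved; see the GitHub repo for details.

    import requests
    from http import cookiejar

    # Constructor of the crawler class (the class statement is not shown in the article).
    # Encrypyed is the encryption helper from the repo.
    def __init__(self, timeout=60, cookie_path="."):
        # Headers copied from a real browser session
        self.headers = {
            "Accept": "*/*",
            "Accept-Encoding": "gzip,deflate,sdch",
            "Accept-Language": "zh-CN,zh;q=0.8,gl;q=0.6,zh-TW;q=0.4",
            "Connection": "keep-alive",
            "Content-Type": "application/x-www-form-urlencoded",
            "Host": "music.x.com",
            "Referer": "http://music.x.com/search/",
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36",
        }
        # Session for API calls: shared headers plus a cookie jar
        self.session = requests.Session()
        self.session.headers.update(self.headers)
        self.session.cookies = cookiejar.LWPCookieJar(cookie_path)
        # Separate plain session for downloading the audio files
        self.download_session = requests.Session()
        self.timeout = timeout
        self.ep = Encrypyed()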

Wrap the POST request in a method:

    import click

    def post_request(self, url, params):
        """
        POST request.
        :return: dict
        """
        # Encrypt the parameters before sending them as the request body
        data = self.ep.encrypted_request(params)
        resp = self.session.post(url, data=data, timeout=self.timeout)
        result = resp.json()
        if result["code"] != 200:
            click.echo("post_request error")
        else:
            return result
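One thing to watch in the method above: when the status code is not 200 it only prints a message and implicitly returns `None`, so a caller that indexes into the result will fail with a `TypeError`. A minimal sketch of a stricter check follows. `ApiError` and `check_result` are hypothetical names, not part of the original repo, and the only assumption about the response is the `code` field that the article's code already reads.

```python
class ApiError(Exception):
    """Raised when the API reports a non-200 status code (hypothetical)."""


def check_result(result):
    """Return the decoded JSON dict, or raise if the request did not succeed."""
    code = result.get("code")
    if code != 200:
        raise ApiError("request failed with code {}".format(code))
    return result


# A successful response passes through unchanged.
ok = check_result({"code": 200, "result": {"songCount": 1}})

# A failed response raises instead of silently returning None.
try:
    check_result({"code": 400})
except ApiError as err:
    message = str(err)
    print(message)
```

Raising keeps the failure close to its cause; whether that suits a command-line tool better than printing and continuing is a design choice.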

Run the search:

    def search(self, search_content, search_type, limit=9):
        """
        Search API.
        :params search_content: text to search for
        :params search_type: type of search
        :params limit: number of results to return
        :return: dict
        """
        url = "http://music.x.com/weapi/xxx/get/web?csrf_token="
        params = {"s": search_content, "type": search_type, "offset": 0, "sub": "false", "limit": limit}
        result = self.post_request(url, params)
        return result

Read the search result:

    result = self.search(song_name, search_type=1, limit=limit)
    if result["result"]["songCount"] <= 0:
        click.echo("Song {} not existed.".format(song_name))
    else:
        songs = result["result"]["songs"]
        if quiet:
            # In quiet mode, take the first hit without asking
            song_id, song_name = songs[0]["id"], songs[0]["name"]
            song = Song(song_id=song_id, song_name=song_name, song_num=song_num)
            return song
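The selection step can also be tried offline against a hand-written dictionary. The sketch below assumes only the response shape that the code above reads (`result.songCount`, and `id` and `name` on each entry of `result.songs`); the `first_song` helper and the sample values are made up for illustration.

```python
def first_song(result):
    """Return (song_id, song_name) of the top hit, or None if nothing matched."""
    body = result.get("result", {})
    if body.get("songCount", 0) <= 0:
        return None
    top = body["songs"][0]
    return top["id"], top["name"]


# Invented sample data in the shape the article's code expects.
sample = {
    "code": 200,
    "result": {
        "songCount": 2,
        "songs": [
            {"id": 101, "name": "First Song"},
            {"id": 102, "name": "Second Song"},
        ],
    },
}
empty = {"code": 200, "result": {"songCount": 0}}

print(first_song(sample))
print(first_song(empty))
```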

Download the lyrics

The download itself is simple:

    lyricUrl = "http://music.x.com/api/song/lyric/?id={}&lv=-1&csrf_token={}".format(song_id, csrf)
    lyricResponse = self.session.get(lyricUrl)
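String formatting works here because the id is numeric. As an alternative sketch, `urllib.parse.urlencode` builds the same query string and escapes values where needed. The `lyric_url` helper name is an assumption; the endpoint and parameters are the ones shown above.

```python
from urllib.parse import urlencode

LYRIC_ENDPOINT = "http://music.x.com/api/song/lyric/"


def lyric_url(song_id, csrf=""):
    """Build the lyric URL with the same query parameters as above."""
    query = urlencode({"id": song_id, "lv": -1, "csrf_token": csrf})
    return LYRIC_ENDPOINT + "?" + query


url = lyric_url(12345)
print(url)
```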

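The response is JSON; pull the lyrics out of it:

    import json

    lyricJSON = lyricResponse.json()
    lyrics = lyricJSON["lrc"]["lyric"].split("\n")
    lyricList = []
    for word in lyrics:
        # Each line is expected to look like "[mm:ss.xxx]text"
        time = word[1:6]   # "mm:ss"
        name = word[11:]   # text after the 11-character timestamp tag
        p = Node(time, name)   # Node: helper class holding time and name (not shown)
        lyricList.append(p)
    json_string = json.dumps([node.__dict__ for node in lyricList], ensure_ascii=False, indent=4)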
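The fixed slices above assume every line starts with an 11-character tag such as `[00:12.345]`. As a hedged alternative, the sketch below parses the timestamp with a regular expression, so that other fraction lengths, blank lines, and metadata tags are tolerated. `LyricLine` and `parse_lrc` are hypothetical stand-ins (the article's `Node` class is not shown), and the sample lyric text is invented.

```python
import json
import re
from dataclasses import asdict, dataclass

# Matches "[mm:ss]" or "[mm:ss.fff]" followed by the lyric text.
LRC_LINE = re.compile(r"^\[(\d{2}:\d{2})(?:\.\d{1,3})?\](.*)$")


@dataclass
class LyricLine:
    time: str  # "mm:ss"
    name: str  # lyric text, same field name as in the article


def parse_lrc(text):
    """Turn raw LRC text into a list of LyricLine, skipping unparsable lines."""
    lines = []
    for raw in text.split("\n"):
        match = LRC_LINE.match(raw.strip())
        if match:
            lines.append(LyricLine(match.group(1), match.group(2).strip()))
    return lines


# Invented sample: two timed lines, one blank line, one metadata tag.
sample = "[00:01.50]first line\n[00:12.345]second line\n\n[by:someone]"
parsed = parse_lrc(sample)
json_string = json.dumps([asdict(p) for p in parsed], ensure_ascii=False, indent=4)
print(json_string)
```

Lines that do not carry a timestamp are dropped here; keeping them under a separate key would be an equally reasonable choice.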

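Write it to a newly created local file:

    import os

    if not os.path.exists(folder):
        os.makedirs(folder)
    fpath = os.path.join(folder, str(song_num) + "_" + song_name + ".json")
    # UTF-8 is set explicitly because the JSON holds non-ASCII lyrics (ensure_ascii=False)
    with open(fpath, "w", encoding="utf-8") as text_file:
        text_file.write(json_string)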

Download the audio in two steps

First get the audio resource path:

    url = "http://music.x.com/weapi/song/enhance/player/url?csrf_token="
    csrf = ""
    params = {"ids": [song_id], "br": bit_rate, "csrf_token": csrf}
    result = self.post_request(url, params)
    # Download URL of the song
    song_url = result["data"][0]["url"]
    # The song is not available
    if song_url is None:
        click.echo("Song {} is not available due to copyright issue.".format(song_id))
    else:
        return song_url

Then fetch the audio resource:

    if not os.path.exists(fpath):
        # stream=True defers the body so it can be read chunk by chunk
        resp = self.download_session.get(song_url, timeout=self.timeout, stream=True)
        length = int(resp.headers.get("content-length"))
        label = "Downloading {} {}kb".format(song_name, int(length / 1024))

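Download while showing progress:

    # Continues inside the `if not os.path.exists(fpath):` block
    with click.progressbar(length=length, label=label) as progressbar:
        with open(fpath, "wb") as song_file:
            for chunk in resp.iter_content(chunk_size=1024):
                if chunk:
                    song_file.write(chunk)
                    # Advance by the real chunk size; the last chunk may be shorter than 1024
                    progressbar.update(len(chunk))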
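The chunked-write loop can be exercised without a network connection. The sketch below copies from any binary stream in fixed-size chunks and reports progress through a callback. `copy_with_progress` is a hypothetical helper, and an in-memory `io.BytesIO` stands in for the streamed HTTP response. It advances progress by the actual chunk length, since the final chunk is usually shorter than the chunk size.

```python
import io


def copy_with_progress(source, target, chunk_size=1024, on_progress=None):
    """Copy source to target chunk by chunk; return the number of bytes written."""
    written = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        target.write(chunk)
        written += len(chunk)
        if on_progress is not None:
            on_progress(len(chunk))
    return written


fake_audio = io.BytesIO(b"x" * 2500)  # stand-in for the response body
out = io.BytesIO()
steps = []  # records the size of every chunk that was reported
total = copy_with_progress(fake_audio, out, on_progress=steps.append)
print(total, steps)
```

With click, `on_progress` could be the `update` method of the bar returned by `click.progressbar`.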
