Python real-time file sync: Python automatic file sync and backup v1.2 [ops essential] /12/31

Time: 2020-06-25 17:06:37

This post was last edited by We. on -1-4 08:18

The v1 version is packaged here; download it if you are interested:

同步备份v1.rar

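Disclaimer:

Thanks to @ for the reminder: this approach is only suitable for sync and backup inside a LAN. It has no encryption or authentication and is not designed to traverse a firewall.

What follows is the v1.2 content:

Multi-process downloading has been added.

Change the directories and you can use it as-is.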

--------------------------------------------------------------------------------------------------------------------------

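Requirement: the platform packages the virtual machine backup files onto server A, and they then need to be synced to server B as a backup (only the A-to-B direction matters).

Approach:

Server A acts as the server. It periodically walks its own file tree and writes the directory and file listing into a manifest (check) file.

Server B acts as the client. It downloads the manifest, walks its own file tree to see whether it matches the server's, and downloads any files that are missing locally.

Transfer goes over HTTP, using a simple HTTP service started with Python. If there is a firewall, open the port; if not, nothing else is needed.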
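The comparison step in this approach boils down to two set differences. Here is a minimal sketch, assuming both listings are already available as lists of path strings. `plan_sync` and its argument names are hypothetical and not part of the original script:

```python
def plan_sync(server_paths, local_paths):
    """Return (to_download, to_delete) for a one-way A -> B sync."""
    server = set(server_paths)
    local = set(local_paths)
    to_download = sorted(server - local)   # on A but missing on B
    to_delete = sorted(local - server)     # on B but no longer on A
    return to_download, to_delete


if __name__ == '__main__':
    server = ['/H3C_Backup/a.bak', '/H3C_Backup/b.bak']
    local = ['/H3C_Backup/b.bak', '/H3C_Backup/old.bak']
    print(plan_sync(server, local))
```

Set lookups keep the comparison fast even with a very large number of paths, whereas repeated `in` checks against a list scan the whole list each time.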

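Production environment: Python 3.7.9, two CentOS 7.9 servers.

Start the HTTP service in the backup directory on the server:

nohup is used to run the HTTP service in the background; otherwise the console is tied up and cannot be used for anything else.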
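The exact command was only shown in a screenshot, so it is not reproduced here. As a rough equivalent, the same kind of static file server can be started from Python's standard library. This is a sketch, not the author's command: `serve_directory`, the bind address, and the default port are assumptions (8000 matches the URLs used by the client script).

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def serve_directory(directory, port=8000, host='0.0.0.0'):
    """Serve `directory` over plain HTTP from a background thread."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    server = ThreadingHTTPServer((host, port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


if __name__ == '__main__':
    server = serve_directory('/H3C_Backup')
    try:
        threading.Event().wait()   # keep the main thread alive until Ctrl+C
    except KeyboardInterrupt:
        server.shutdown()
        server.server_close()
```

Both the `directory` argument and `ThreadingHTTPServer` need Python 3.7 or later, which fits the environment described above. Like the original setup, this serves files without encryption or authentication, so it belongs on a trusted LAN only.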

Server:

[Python]
import os

path = '/H3C_Backup'


def func(path):
    # Walk the backup tree and collect every directory and every file path.
    contents = os.walk(path, topdown=True)
    dir = []
    file = []
    for (root, dirs, files) in contents:
        dir.append(root)
        for i in files:
            file.append(root + '/' + i)
    return [dir, file]


content = func(path)

# content.txt lists the directories, one path per line.
with open(path + '/' + 'content.txt', 'w', encoding='utf-8') as f:
    for i in content[0]:
        f.write(i)
        f.write('\n')

# file.txt lists the files, one path per line.
with open(path + '/' + 'file.txt', 'w', encoding='utf-8') as f:
    for i in content[1]:
        f.write(i)
        f.write('\n')

Client:

[Python]
import os
import shutil
import multiprocessing

import requests


def init():
    # Fetch the two manifests: file.txt (files) and content.txt (directories).
    url = ['http://172.172.172.1:8000/file.txt', 'http://172.172.172.1:8000/content.txt']
    download_file = requests.get(url[0], stream=True)
    with open('/download/file.txt', 'wb') as f:
        for chunk in download_file.iter_content(chunk_size=4096):
            f.write(chunk)
    download_content = requests.get(url[1], stream=True)
    with open('/download/content.txt', 'wb') as f:
        for chunk in download_content.iter_content(chunk_size=4096):
            f.write(chunk)


def function(path):
    # Walk the tree with os.walk() to collect every directory and file.
    file = []
    dir = []
    x = os.walk(path, topdown=True)
    for (root, dirs, files) in x:
        dir.append(root)
        for i in files:
            file.append(root + '/' + i)
    return [dir, file]


def check_dir(path):
    # Get the local directories.
    x = function(path)
    dir_so = x[0]
    # Clean up the server's directory list (strip the newlines).
    with open('/download/content.txt', 'r', encoding='utf-8') as dirs:
        dir_dst = dirs.readlines()
    dir_dst_info = []
    for i in dir_dst:
        i = i.replace('\n', '')
        print(i)
        dir_dst_info.append(i)
    # Create directories missing locally, remove those gone from the server.
    for i in dir_dst_info[1:] + dir_so:
        if i not in dir_so:
            os.mkdir(i)
            print('Created ' + i)
        if i not in dir_dst_info:
            try:
                shutil.rmtree(i)
                print('Deleted ' + i)
            except OSError:
                # Typically already removed together with its parent directory.
                pass


def download(url, path):
    download_file = requests.get(url, stream=True)
    with open(path, 'wb') as f:
        for chunk in download_file.iter_content(chunk_size=10240):
            f.write(chunk)
    print('Added ' + path)


def check_file(path):
    x = function(path)
    file_so = x[1]
    pool = multiprocessing.Pool(processes=10)
    # Clean up the server's file list (strip the newlines).
    with open('/download/file.txt', 'r', encoding='utf-8') as files:
        files_dst = files.readlines()
    files_dst_info = []
    for i in files_dst:
        i = i.replace('\n', '')
        files_dst_info.append(i)
    # Download what is missing, delete what is extra.
    for i in file_so + files_dst_info:
        if i not in file_so:
            url = 'http://172.172.172.1:8000' + i
            pool.apply_async(download, (url, i,))
        if i not in files_dst_info:
            os.remove(i)
            print('Deleted ' + i)
    pool.close()
    pool.join()


if __name__ == '__main__':
    path = '/H3C_Backup'
    init()
    check_dir(path)
    check_file(path)
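One thing to watch with the `download` function: it writes straight to the final path, and the client only compares file names. A transfer that dies halfway therefore leaves a partial file that the next run treats as already synced. A possible hardening is to download to a temporary name and rename on success. The sketch below uses the standard library's `urllib` instead of `requests` so that it has no third-party dependency. `download_atomic` and the `.part` suffix are hypothetical, not part of the original script.

```python
import os
import shutil
import urllib.request


def download_atomic(url, path, chunk_size=1024 * 1024):
    """Download url to path via a temporary '.part' file.

    urlopen raises HTTPError for 4xx/5xx responses, so an error page is
    never saved under the name of a backup file.
    """
    tmp_path = path + '.part'
    try:
        with urllib.request.urlopen(url) as response, open(tmp_path, 'wb') as f:
            shutil.copyfileobj(response, f, chunk_size)
        os.replace(tmp_path, path)   # atomic when both are on one filesystem
    except BaseException:
        # Drop the incomplete temporary file, then re-raise the error.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process is killed outright, a stray `.part` file can remain. It is not listed in the server's file.txt, so the existing cleanup in `check_file` would delete it on the next run.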

Ten processes are off and running. There is 12 TB of data in total, so it will take a while.

With 12 processes running together, CPU usage is rather high.

The speed is not bad either: 80 GB after a short while.

I checked this morning: 10 TB transferred and still running. I'll let it finish at its own pace.

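To be improved:

1. The code itself could be written more cleanly.

2. The trigger mechanism needs work.

3. Would raw TCP over a socket be faster than HTTP?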
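For item 2, one possible direction is to let the client poll the manifest and run a full sync only when it has changed since the last successful run. This is a sketch under that assumption. `manifest_digest`, `manifest_changed`, `remember_manifest`, and the state file are hypothetical names, not part of the original script.

```python
import hashlib


def manifest_digest(manifest_path):
    """Return the SHA-256 hex digest of a manifest file such as file.txt."""
    h = hashlib.sha256()
    with open(manifest_path, 'rb') as f:
        for block in iter(lambda: f.read(65536), b''):
            h.update(block)
    return h.hexdigest()


def manifest_changed(manifest_path, state_path):
    """True if the manifest differs from the digest saved in state_path."""
    try:
        with open(state_path, 'r', encoding='utf-8') as f:
            previous = f.read().strip()
    except FileNotFoundError:
        previous = None   # no successful sync recorded yet
    return manifest_digest(manifest_path) != previous


def remember_manifest(manifest_path, state_path):
    """Record the current digest; call this only after a successful sync."""
    with open(state_path, 'w', encoding='utf-8') as f:
        f.write(manifest_digest(manifest_path))
```

A cron job or a simple loop could download file.txt, call `manifest_changed`, run the sync if it returns True, and then call `remember_manifest`. Since the manifest holds only paths, a file whose contents change under the same name would not trigger anything.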

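Also, why is multithreading so underwhelming here, even slower than a single thread? Multiprocessing feels like a bit of a waste of CPU. And how does Xunlei implement its downloading?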
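On the thread question: downloading is mostly waiting on the network and the disk, and CPython generally releases the GIL during blocking I/O, so a thread pool is a common alternative to ten processes. The post does not show the threaded attempt, so the cause of its slowdown cannot be pinned down from here. The sketch below is one way to rerun that experiment with `concurrent.futures`. `download_all` is a hypothetical helper, and `worker` stands for any function with the same `(url, path)` signature as the client's `download`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(jobs, worker, max_workers=10):
    """Run worker(url, path) for every (url, path) job on a thread pool.

    Returns a list of (path, error) pairs for the jobs that failed.
    """
    failures = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, url, path): path for url, path in jobs}
        for future in as_completed(futures):
            error = future.exception()
            if error is not None:
                failures.append((futures[future], error))
    return failures
```

Collecting the failures also addresses a quiet gap in the multiprocessing version: `pool.apply_async` stores any exception in its result object, and since the script never calls `.get()` on those results, a failed download goes unreported.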

Suggestions are welcome.
