最近做了一个小功能,将一个页面上面的所有图片下载下来生成一个PDF文件。发现了一个非常好用的库reportlab, pyPdf。只需要几行代码就能实现功能,如果没有安装可以通过pip安装:
pip install reportlab -i https://pypi.douban.com/simple
pip install pyPdf -i https://pypi.douban.com/simple
注: -i表示使用豆瓣的镜像服务
操作过程
下面记录下我的处理的过程:
(1)如果只是简单的需要将图片进行拼接成PDF可以直接使用如下的方式:
from reportlab.lib.pagesizes import A4, portrait, landscape
from reportlab.pdfgen import canvas
def convert_images_to_pdf(img_path, pdf_path):
    pages = 0
    (w, h) = portrait(A4)
    c = canvas.Canvas(pdf_path, pagesize = portrait(A4))
    l = os.listdir(img_path)
    l.sort(key= lambda x:int(x[:-4]))
    for i in l:
        f = img_path + os.sep + str(i)
        c.drawImage(f, 0, 0, w, h)
        c.showPage()
        pages = pages + 1
    c.save()
(2)如果需要根据不同的尺寸的图片设置横屏还是竖屏模式,可以考虑使用如下方式实现:
import os, shutil
from PIL import Image
from reportlab.lib.pagesizes import A4, portrait, landscape
from reportlab.pdfgen import canvas
from pyPdf import PdfFileWriter, PdfFileReader
def convert_image_to_pdf(img_path, pdf_path):
    img = Image.open(img_path)
    (w0, h0) = img.size
    if w0 > h0:
        (w, h) = landscape(A4)
        c = canvas.Canvas(pdf_path, pagesize = landscape(A4))
        c.drawImage(img_path, 0, 0, w, h)
        c.showPage()
        c.save()
    else:
        (w, h) = portrait(A4)
        c = canvas.Canvas(pdf_path, pagesize = portrait(A4))
        c.drawImage(img_path, 0, 0, w, h)
        c.showPage()
        c.save()
def convert_images_to_pdf(img_path, pdf_path):
    pages = 0
    tmp_path = '.' + os.sep + 'temp'
    if not os.path.exists(tmp_path):
        os.mkdir(tmp_path)
    list = os.listdir(img_path)
    list.sort(key=lambda x:int(x[:-4]))
    output = PdfFileWriter()
    for item in list:
        img = img_path + os.sep + str(item)
        pdf = tmp_path + os.sep + str(pages + 1) + ".pdf"
        convert_image_to_pdf(img, pdf)
        input = PdfFileReader(file(pdf, "rb"))
        pageCount = input.getNumPages()
        pages = pages + 1
        for iPage in range(0, pageCount):
            output.addPage(input.getPage(iPage))
    outputStream = file(pdf_path, "wb")
    output.write(outputStream)
    outputStream.close()
    shutil.rmtree(tmp_path)