ruby: Download all images from a Tumblr blog


# Usage:
#   [sudo] gem install mechanize
#   ruby tumblr-photo-ripper.rb

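require 'rubygems'
require 'fileutils' # FileUtils.mkdir_p is used below
require 'mechanize'

# Your Tumblr subdomain, e.g. "jamiew" for "jamiew.tumblr.com"
site = "doctorwho"

# Images are saved into a directory named after the subdomain
FileUtils.mkdir_p(site)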

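concurrency = 8 # number of images downloaded in parallel
num = 50        # posts requested per API page
start = 0       # offset of the first post on the current page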

loop do
  puts "start=#{start}"

  url = "http://#{site}.tumblr.com/api/read?type=photo&num=#{num}&start=#{start}"
  page = Mechanize.new.get(url)
  doc = Nokogiri::XML.parse(page.body)

  # Keep only the largest rendition (max-width 1280) of each photo
  images = (doc/'post photo-url').select { |x| x['max-width'].to_i == 1280 }
  image_urls = images.map { |x| x.content }

  image_urls.each_slice(concurrency).each do |group|
    threads = []
    group.each do |url|
      threads << Thread.new {
        puts "Saving photo #{url}"
        begin
          file = Mechanize.new.get(url)
          filename = File.basename(file.uri.to_s.split('?')[0])
          file.save_as("#{site}/#{filename}")
        rescue Mechanize::ResponseCodeError => e
          puts "Error getting file, #{e}"
        end
      }
    end
    threads.each{|t| t.join }
  end

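  # Decide whether this was the last page by counting posts, not images:
  # one post may contribute a different number of images (e.g. a photoset),
  # so the image count is not a reliable end-of-feed signal.
  post_count = doc.search('post').count
  puts "#{images.count} images found in #{post_count} posts (num=#{num})"
  if post_count < num
    puts "our work here is done"
    break
  else
    start += num
  end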

end
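
The two reusable ideas in the script are deriving a local filename from a photo URL and downloading in fixed-size batches of threads. The sketch below isolates both using only the standard library, so it runs without a network connection. The names `local_filename` and `process_in_batches` are hypothetical helpers introduced for illustration and are not part of the original script or of Mechanize.

```ruby
require 'uri'

# Hypothetical helper: derive the local filename from a photo URL,
# ignoring any query string (the script does this with split('?')).
def local_filename(url)
  File.basename(URI.parse(url).path)
end

# Hypothetical helper: run the block for every item, at most
# `concurrency` threads at a time, and return the results in order.
# Each slice is fully joined before the next slice starts, which
# mirrors the each_slice / Thread.new / join pattern in the script.
def process_in_batches(items, concurrency, &block)
  items.each_slice(concurrency).flat_map do |group|
    threads = group.map { |item| Thread.new { block.call(item) } }
    threads.map(&:value) # value joins the thread and returns its result
  end
end

# Example usage with made-up URLs; no requests are sent.
urls = [
  "http://example.com/media/tumblr_aaa_1280.jpg?x=1",
  "http://example.com/media/tumblr_bbb_1280.png"
]
names = process_in_batches(urls, 8) { |u| local_filename(u) }
puts names.inspect
```

In the real script the block passed to `process_in_batches` would fetch and save the file; waiting for each batch to finish keeps the number of simultaneous connections bounded by `concurrency`.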
