Rendered/interactive JavaScript with gtk/webkit/jswebkit


# This is a downloader middleware that can be used to get rendered JavaScript pages using WebKit.
#
# This could be extended to handle form requests and load errors, but this is the bare-bones code to get it done.
#
# The advantage over the Selenium-based approaches I've seen is that it only makes one request and you don't have to set up Selenium.

from scrapy.http import Request, FormRequest, HtmlResponse

import gtk
import webkit
import jswebkit

class WebkitDownloader( object ):
    def process_request( self, request, spider ):
        # Form requests are not handled here; returning None lets Scrapy download them as usual.
        if( type(request) is not FormRequest ):
            webview = webkit.WebView()
            # Leave the GTK main loop as soon as the page has finished loading.
            webview.connect( 'load-finished', lambda v,f: gtk.main_quit() )
            webview.load_uri( request.url )
            gtk.main()  # blocks until 'load-finished' fires
            # Read the rendered DOM back through the page's JavaScript context.
            js = jswebkit.JSContext( webview.get_main_frame().get_global_context() )
            renderedBody = str( js.EvaluateScript( 'document.documentElement.innerHTML' ) )
            return HtmlResponse( request.url, body=renderedBody )

# Snippet imported from snippets.scrapy.org (which no longer works)
# author: jdwilson
# date : Sep 06, 2011
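
The snippet does not show how the middleware gets enabled. A minimal sketch of the project settings follows, assuming the class is saved in a module named `myproject/middlewares.py`. Both that module path and the priority value 543 are hypothetical choices for illustration and do not come from the original snippet. `DOWNLOADER_MIDDLEWARES` is Scrapy's standard setting for registering downloader middlewares.

```python
# settings.py (sketch)
# "myproject.middlewares" is a hypothetical module path; replace it with the
# module where WebkitDownloader actually lives in your project.
# 543 is an arbitrary example priority, not a value from the original snippet.
DOWNLOADER_MIDDLEWARES = {
    "myproject.middlewares.WebkitDownloader": 543,
}
```

With this in place, Scrapy calls `process_request` for each outgoing request. Because the middleware returns an `HtmlResponse` itself, the regular downloader is skipped for those requests.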
