Javascript loops, async functions and headless browser

【Posted】2020-05-19 16:18:58 【Question】:

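Using Playwright, Microsoft's new headless browser library, I built something that returns neither an error nor any other result.

I have run out of ideas, so I am asking for a hint about what I am doing wrong.

This code should simply launch several headless browsers asynchronously. Instead, the browser launch hangs and the application is stuck in an infinite loop. The code below is a simple Node.js script that reproduces the behavior.

Thanks for reading and for your help ;)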

const playwright = require('playwright');

log('start playwright async');

let maxRunners = 1;
let running = 0;
let list = [1,2,3,4,5,6,7,8,9,0,11,12,13,14,15];

log('start job');

while (list.length > 0) {

    if (running < maxRunners) {
        log('runner started');
        running++;

        let entry = list[0];
        list.shift();

        log('start browser loop');
        for (const browserType of ['chromium', 'firefox', 'webkit']) {
            log('fire async');
            (async () => {
                log('loop next');
                log('launch: ', browserType);
                const browser = await playwright[browserType].launch({
                    headless: false
                });
                log(browserType, ' launched');
                const context = await browser.newContext();
                log('open new page');
                const page = await context.newPage('http://whatsmyuseragent.org/');
                log('page opened');
                log('make screenshot');
                await page.screenshot({path: `example-${browserType}.png`});
                log('screenshot made');
                log('close browser');
                await browser.close();
                log('browser closed');
                log('loop succeed');

                running--;
            })();
            log('end async');
        }
        log('end loop');

        if (running === 0 && list.length === 0) {
            log('job finished');
        }
    }
}

log('end playwright script');

function log(...msgs) {
    let date = new Date();
    let timeString = date.toISOString().substr(11, 8);
    let msg = '';
    for (let i in msgs) {
        msg += msgs[i];
    }

    console.log(timeString, ':', msg);
}

Output:

20:53:29 : start playwright async
20:53:29 : start job
20:53:29 : runner started
20:53:29 : start browser loop
20:53:29 : fire async
20:53:29 : loop next
20:53:29 : launch: chromium
20:53:29 : end async
20:53:29 : fire async
20:53:29 : loop next
20:53:29 : launch: firefox
20:53:29 : end async
20:53:29 : fire async
20:53:29 : loop next
20:53:29 : launch: webkit
20:53:29 : end async
20:53:29 : end loop

【Comments】:

【Answer 1】:

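I don't believe your async functions are actually being evaluated.

Could you create a list of promises and use Promise.all(), instead of calling an async function once per iteration?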
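
A minimal sketch of this suggestion. It assumes a hypothetical `fakeLaunch` helper that stands in for `playwright[browserType].launch()`, so the snippet runs without Playwright installed; `launchAll` and the delay value are also made up for illustration:

```javascript
// Hypothetical stand-in for playwright[browserType].launch():
// resolves with the browser name after a short delay.
function fakeLaunch(browserType) {
  return new Promise(resolve => setTimeout(() => resolve(browserType), 50));
}

async function launchAll(browserTypes) {
  // Collect one promise per browser instead of firing and forgetting.
  const promises = [];
  for (const browserType of browserTypes) {
    promises.push(fakeLaunch(browserType));
  }
  // Resolves once every promise has resolved; results keep the input order.
  return Promise.all(promises);
}

launchAll(['chromium', 'firefox', 'webkit'])
  .then(launched => console.log('all launched:', launched.join(', ')));
```

Because the caller gets a single promise back, it can `await` it (or chain `.then`) rather than polling a counter in a loop.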

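【Comments】:

The debugger stops inside the loop and steps into the browser launch. This is only an example; the real code runs a function that launches the browser through a promise.

I added the output. You can see that the async function is evaluated and that it stops at the launch.

【Answer 2】: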

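An immediately invoked async function starts running, but the surrounding code does not wait for it to complete. For example: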

const promise = (time = 1, shouldThrowError = false) => new Promise((resolve, reject) => {
  const timeInMs = time * 1000
  setTimeout(() => {
    console.log(`Waited ${time} secs`)
    if (shouldThrowError) return reject(new Error('Promise failed'))
    resolve(time)
  }, timeInMs)
});

// Start executing the first immediately invoked async function
(async () => {
  try {
    console.log('starting first promise')
    await promise(1)
    console.log('finished first promise')
  } catch (error) {
    console.error(error)
  }
})();
// This starts executing without waiting for the previous promise to finish
(async () => {
  try {
    console.log('starting second promise')
    await promise(1)
    console.log('finished second promise')
  } catch (error) {
    console.error(error)
  }
})();

Change this block of your code:

        (async () => {
            log('loop next');
            ...
            log('loop succeed');

            running--;
        })();

to:

        const process = async () => {
            log('loop next');
            ...
            log('loop succeed');

            running--;
        };
        await process()

Also, to be able to use await, you should wrap all your code in an async function:

(async () => {
   ...all your code
})();
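
To see why the original `while` loop never ends, here is a small sketch that needs no browser. It assumes a hypothetical `fakeTask` helper in place of the Playwright calls, and a bounded `for` loop in place of the endless `while`. A synchronous loop never hands control back to the event loop, so the code after an `await` (including `running--`) cannot run until the loop exits. Awaiting inside an async wrapper gives it that chance:

```javascript
// Hypothetical stand-in for the Playwright work: resolves after 10 ms.
const fakeTask = () => new Promise(resolve => setTimeout(resolve, 10));

async function demo() {
  let running = 0;

  // Fire and forget, as in the question.
  running++;
  const pending = (async () => {
    await fakeTask();
    running--; // can only run once the event loop gets a turn
  })();

  // Bounded synchronous loop standing in for `while (list.length > 0)`.
  // The continuation above cannot run during it, so `running` stays at 1
  // and a check like `running < maxRunners` would never become true.
  for (let i = 0; i < 1e6; i++) { /* busy */ }
  const duringLoop = running;

  // Awaiting yields to the event loop, so the task can finish.
  await pending;
  const afterAwait = running;

  return { duringLoop, afterAwait };
}

demo().then(({ duringLoop, afterAwait }) => {
  console.log('running during sync loop:', duringLoop); // 1
  console.log('running after await:', afterAwait);      // 0
});
```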

【Comments】:

【Answer 3】:

There are a few things you can improve in your code:

const playwright = require('playwright');

(async()=>{

  log('start playwright async');

  let maxRunners = 1;
  let running = 0;
  let list = [1,2,3,4,5,6,7,8,9,0,11,12,13,14,15];

  log('start job');
  const promises = [];
  while (list.length > 0) {

      if (running < maxRunners) {
          log('runner started');
          running++;

          let entry = list[0];
          list.shift();

          log('start browser loop');
          for (const browserType of ['chromium', 'firefox', 'webkit']) {
              log('fire async');
              promises.push((async () => {
                  log('loop next');
                  log('launch: ', browserType);
                  const browser = await playwright[browserType].launch({
                      headless: false
                  });
                  log(browserType, ' launched');
                  const context = await browser.newContext();
                  log('open new page');
                  const page = await context.newPage('http://whatsmyuseragent.org/');
                  log('page opened');
                  log('make screenshot');
                  await page.screenshot({path: `example-${browserType}.png`});
                  log('screenshot made');
                  log('close browser');
                  await browser.close();
                  log('browser closed');
                  log('loop succeed');

                  running--;
              })());
              log('end async');
          }
          log('end loop');
      } else {
        // No free runner: wait for the started tasks before looping again
        await Promise.all(promises);
      }
  }

  await Promise.all(promises);
  log('job finished');
  log('end playwright script');

  function log(...msgs) {
      let date = new Date();
      let timeString = date.toISOString().substr(11, 8);
      //date.setSeconds(45); // specify value for SECONDS here
      //var timeString = date.toISOString().substr(11, 8);
      let msg = '';
      for (let i in msgs) {
          msg += msgs[i];
      }

      console.log(timeString, ':', msg);
  }
})()

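First, let's wrap everything in an async function:

(async()=>{
})();

Then, let's keep track of those tasks/promises:

const promises = [];
...
log('fire async');
promises.push((async () => {
})());

If you run out of workers, you need to wait for them:

if (running < maxRunners) {
...
} else {
   await Promise.all(promises);
}

That should get you going.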
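
The idea of starting at most `maxRunners` tasks and waiting for them before starting more can also be sketched as a simple batch runner. This is only an illustration under stated assumptions: `runJob` is a hypothetical stand-in for the per-entry browser work, and `runInBatches` is a name made up for this sketch, not part of Playwright:

```javascript
// Hypothetical stand-in for the per-entry browser work.
function runJob(entry) {
  return new Promise(resolve => setTimeout(() => resolve(entry * 2), 20));
}

// Start at most `maxRunners` jobs, wait for all of them,
// then start the next batch.
async function runInBatches(list, maxRunners, job) {
  const results = [];
  for (let i = 0; i < list.length; i += maxRunners) {
    const batch = list.slice(i, i + maxRunners).map(entry => job(entry));
    // Awaiting here yields to the event loop, so the jobs can actually finish.
    const batchResults = await Promise.all(batch);
    results.push(...batchResults);
  }
  return results;
}

runInBatches([1, 2, 3, 4, 5], 2, runJob)
  .then(results => console.log('job finished:', results));
```

A batch always waits for its slowest job. A sliding-window pool would keep every runner busy, at the cost of slightly more bookkeeping.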

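【Comments】:

Thanks for the helpful hints. Waiting for the promises is better than evaluating the condition over and over. I now also clear the promises array once all of them have finished.

【Answer 4】: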

Along the same lines as @hardkoded's great answer:

In short, use async functions together with Promises.

A basic example that illustrates the same point:

function delay() {
  return new Promise(resolve => setTimeout(resolve, 300));
}

async function delayedLog(item) {
  // notice that we can await a function
  // that returns a promise
  await delay();
  console.log(item);
}

async function processArray(array) {
  // map array to promises
  const promises = array.map(delayedLog);
  // wait until all promises are resolved
  await Promise.all(promises);
  console.log('Done!');
}
Try it out and use it to understand asynchronous behavior in JavaScript!
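
One way to experiment is to compare awaiting tasks one after another with starting them all and awaiting `Promise.all`. The sketch below records the order of events for both; `task`, `sequential` and `parallel` are hypothetical names chosen for this illustration:

```javascript
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

// Hypothetical task: records its start and end so the interleaving is visible.
async function task(name, events) {
  events.push(`start ${name}`);
  await delay(20);
  events.push(`end ${name}`);
}

// One after another: each task finishes before the next one starts.
async function sequential(names) {
  const events = [];
  for (const name of names) {
    await task(name, events);
  }
  return events;
}

// All at once: every task starts before any of them finishes.
async function parallel(names) {
  const events = [];
  await Promise.all(names.map(name => task(name, events)));
  return events;
}

(async () => {
  // order: start a, end a, start b, end b
  console.log(await sequential(['a', 'b']));
  // order: start a, start b, end a, end b
  console.log(await parallel(['a', 'b']));
})();
```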

【Comments】:
