Puppeteer: A way to reduce the amount of repetitive work #86

Open
mylamour opened this issue Jun 16, 2021 · 1 comment

Recently our team had to manage a lot of tickets in ServiceNow. ServiceNow provides a CLI tool, but I could not get it from the ServiceNow store, so I wrote an automation script with Puppeteer instead. This tutorial shares what I learned along the way.

0x01 Intro

Puppeteer is a Node library which provides a high-level API to control headless Chrome or Chromium over the DevTools Protocol. It can also be configured to use full (non-headless) Chrome or Chromium. It is useful for automation tasks such as form submission, UI testing, and keyboard input.

Puppeteer Basics

Install it with npm install puppeteer (or the shorthand npm i puppeteer). Some basic operations are listed below.

  • take a screenshot
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  await page.screenshot({ path: 'example.png' });

  await browser.close();
})();
  • save a pdf
  await page.pdf({ path: 'hn.pdf', format: 'a4' });
  • Keyboard input
await page.keyboard.press("Tab");
await page.keyboard.type('this is funny')
  • Click
await page.waitForSelector('#header_attachment')
await page.click('#header_attachment')

I suggest calling waitForSelector before click, because the page's response time varies.
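
The wait-then-click pair can be wrapped in a small helper so it is not repeated at every call site. This is only a sketch: clickWhenReady and its timeoutMs default are names made up for illustration, and it assumes page is a Puppeteer Page.

```javascript
// Hypothetical helper (not part of Puppeteer): wait for an element, then click it.
// `page` is assumed to be a Puppeteer Page; `timeoutMs` is an illustrative default.
async function clickWhenReady(page, selector, timeoutMs = 30000) {
  // Wait until the element exists and is visible before interacting with it.
  await page.waitForSelector(selector, { visible: true, timeout: timeoutMs });
  await page.click(selector);
}
```

With this in place, the two lines above become await clickWhenReady(page, '#header_attachment'). If the element never appears, waitForSelector rejects with a timeout error instead of clicking blindly.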

  • emulate a mobile device
const iPhone = puppeteer.devices['iPhone 6'];
await page.emulate(iPhone);

You can also emulate other devices such as the Nexus 10 or Pixel 2. The full device list is in Puppeteer's device descriptors.

  • Cookie operations
const fs = require('fs');

const cookies = await page.cookies()
fs.writeFile('cookie.json', JSON.stringify(cookies, null, 2), function(err) {
  if (err) throw err;
  console.log('completed write of cookies');
});

Even with saved cookies, it may not be possible to reuse the same session. In that case, use the userDataDir launch option to persist the browser profile data.
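
For completeness, here is a rough sketch of the other half of the cookie workflow: reading cookie.json back and restoring it before navigating. The helper names saveCookies and loadCookies are hypothetical, and page is assumed to be a Puppeteer Page that exposes cookies() and setCookie().

```javascript
const fs = require('fs');

// Hypothetical helpers; `page` is assumed to be a Puppeteer Page.
async function saveCookies(page, file) {
  const cookies = await page.cookies();
  fs.writeFileSync(file, JSON.stringify(cookies, null, 2));
}

async function loadCookies(page, file) {
  if (!fs.existsSync(file)) return false; // nothing saved yet
  const cookies = JSON.parse(fs.readFileSync(file, 'utf8'));
  // setCookie takes the cookie objects as separate arguments.
  await page.setCookie(...cookies);
  return true;
}
```

Call loadCookies(page, 'cookie.json') before page.goto(...) so the request is sent with the restored cookies. When the site ties the session to more than cookies, launching with puppeteer.launch({ userDataDir: './userdata' }), as in the scenario script below, keeps the whole profile instead.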

Puppeteer Advanced

  • Load extension
const chromeExtension = 'extension/path';
const browser = await puppeteer.launch({
  headless: false, // run with a visible browser window
  args: [
    '--no-sandbox',
    `--load-extension=${chromeExtension}`,
    `--disable-extensions-except=${chromeExtension}`,
  ],
});
  • Executing JavaScript in the page context
    await page.evaluate(() => {
      // Find the first <td> whose text is "(empty)" and click it
      var aTags = document.getElementsByTagName("td");
      var searchText = "(empty)";
      var found;
  
      for (var i = 0; i < aTags.length; i++) {
        if (aTags[i].textContent == searchText) {
          found = aTags[i];
          found.click();
          break;
        }
      }
    });
  • Parallel
const { Cluster } = require('puppeteer-cluster');
(async () => {
  const cluster = await Cluster.launch({
    concurrency: Cluster.CONCURRENCY_CONTEXT,
    maxConcurrency: 2,
  });
  await cluster.task(async ({ page, data: url }) => {
    await page.goto(url);
    const screen = await page.screenshot();
  });
  cluster.queue('http://www.google.com/');
  cluster.queue('http://www.wikipedia.org/');
  await cluster.idle();
  await cluster.close();
})();
  • Anti-detection (too large a topic to cover briefly here)

0x02 Scenario

I had two main operations to automate: the first is assigning tickets to each person, and the second is closing tickets. Below is part of the tool's code, showing how to build the automation with Puppeteer.

const puppeteer = require('puppeteer-debug')
const fs = require('fs');
// const iPhone = puppeteer.devices['iPhone 6'];

; (async function () {

  const browser = await puppeteer.launch({
    // devtools: true,
    headless: false,
    userDataDir: './userdata'
  })

  const page = await browser.newPage()
  // await page.emulate(iPhone);
  await page.goto('https://paypal.service-now.com/sc_task_list.do?xxxxxxxxx&yyyyyyyy')
  await page.waitForTimeout(50000)
  
  // One owner per line; trim and skip blank lines (e.g. a trailing newline)
  const owners = fs.readFileSync('owners', 'utf8').split("\n").map(s => s.trim()).filter(Boolean);

  for (const owner of owners) {

    console.log("Assign ticket to " + owner)

    // Runs in the page: click the first <td> whose text is "(empty)"
    await page.evaluate(() => {
      var aTags = document.getElementsByTagName("td");
      var searchText = "(empty)";
      var found;
  
      for (var i = 0; i < aTags.length; i++) {
        if (aTags[i].textContent == searchText) {
          found = aTags[i];
          found.click();
          break;
        }
      }
    });


    // Repeat the same keyboard sequence 20 times, moving down a row in between
    for (let i = 0; i < 20; i++) {
      await page.keyboard.press('Enter')
      await page.waitForTimeout(2000)
      await page.keyboard.type(owner)
      await page.waitForTimeout(2000)
      await page.keyboard.press('ArrowDown')
      await page.waitForTimeout(2000)
      await page.keyboard.press('Enter')
      await page.waitForTimeout(3000)
      if (i != 19) {
        await page.keyboard.press('ArrowDown')
      }
    }

    await page.evaluate(() => {
      document.getElementsByTagName("button")[4].click();
    });

    await page.waitForTimeout(5000)

  }

  await puppeteer.debug() // go to REPL
  await browser.close()

})()

The code is not difficult. Here are some tricks I learned from it.

  • The mobile version of a site is often friendlier to automation.
  • If you can't reach an element because its selector is dynamic, find it with JavaScript in the page context, or use keyboard operations if the page allows them.
  • Wait for a selector before acting on it, because response time varies even for the same page.
  • If you need to log in every time, try saving the user data from the main site rather than from the search page.
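
The second tip (finding an element with JavaScript in the page context) can be generalized a little. The sketch below is a hypothetical helper, clickByText, that assumes page is a Puppeteer Page; unlike the script above it trims whitespace before comparing and reports whether anything was clicked.

```javascript
// Hypothetical helper: click the first element matching `selector` whose
// trimmed text equals `text`. `page` is assumed to be a Puppeteer Page.
async function clickByText(page, selector, text) {
  // The callback runs inside the browser page, so `document` is the page's DOM.
  // Values from Node must be passed as extra arguments to page.evaluate.
  return page.evaluate((sel, wanted) => {
    for (const el of document.querySelectorAll(sel)) {
      if (el.textContent.trim() === wanted) {
        el.click();
        return true;
      }
    }
    return false; // nothing matched, so the caller can decide what to do
  }, selector, text);
}
```

For the ticket list, this would be called as await clickByText(page, 'td', '(empty)'). Returning a boolean makes it possible to stop the loop when no matching cell is left, rather than typing into whatever happens to have focus.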

Here is the final result:

test.mp4

0x03 Conclusion

Over the last year we grew very fast and there was a lot of work to do. After a period of testing, I have to say that automating this kind of work is necessary, especially for daily tasks.


mylamour commented Dec 4, 2024

After almost 3 years, I am using it again for some basic tasks. Today's scenario: take a screenshot of each order on https://certcloud.cn

const puppeteer = require('puppeteer');
const setTimeout = require("node:timers/promises").setTimeout;

(async () => {
    const browser = await puppeteer.launch({headless: false, args: ['--lang=en-US,en','--start-fullscreen']});
    const page = await browser.newPage();
    await page.setViewport({ width: 2056, height: 1440});
    await page.goto('https://certcloud.cn/user/login', {waitUntil: 'networkidle0'});
    await setTimeout(3000);
    await page.type('#normal_login_username', 'yyyyyyy');
    await page.type('#normal_login_password', 'xxxxxxx');
    // Start waiting for the navigation before clicking, so a fast redirect is not missed
    await Promise.all([
        page.waitForNavigation({ waitUntil: 'networkidle0' }),
        page.click('button[type="submit"]'),
    ]);
    await page.goto('https://certcloud.cn/cert/orders?page_size=100&page=1', {waitUntil: 'networkidle0'});
    await setTimeout(3000);

    const links = await page.evaluate(() => {
        const eles = document.querySelectorAll('a[href^="/cert/orders/"]');
        return Array.from(eles).map(ele => ele.href);
    });
    console.log(links);

    for (let link of links) {
        await page.goto(link, {waitUntil: 'networkidle0'});
        let certCommonName = await page.evaluate(() => document.querySelectorAll('p')[0].title.replace(/[<>:"/\\|?*]/g, ''));
        await page.screenshot({path: `./screenshots/${certCommonName}.png`});
    }
  
    await browser.close();
})();
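
Two small things can trip up the screenshot loop: the ./screenshots directory has to exist, and a title that is empty after filtering would give a file named ".png". The sketch below shows one way to handle both. screenshotPath and ensureDir are hypothetical helpers, and the order-N fallback name is an assumption for illustration.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: build a screenshot path from a page title.
function screenshotPath(dir, title, index) {
  // Same character filter as the script above, plus trimming.
  const cleaned = String(title || '').replace(/[<>:"/\\|?*]/g, '').trim();
  // Fall back to a numbered name so an empty title does not produce ".png".
  const base = cleaned || `order-${index}`;
  return path.join(dir, `${base}.png`);
}

// As far as I know page.screenshot does not create missing directories,
// so create the target directory up front (no error if it already exists).
function ensureDir(dir) {
  fs.mkdirSync(dir, { recursive: true });
}
```

In the loop, call ensureDir('./screenshots') once before iterating, then pass screenshotPath('./screenshots', certCommonName, i) as the path option. Note that two orders with the same common name would still overwrite each other.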
