【譯】Headless Chrome 入門指南

时间:2023-06-27 11:15:32 编辑:浏览器知识

本文翻譯自：Getting Started with Headless Chrome
原文更新時間：July 28,2017
作者：Eric Bidelman（Engineer @ Google. Working on Chrome & the web.）
譯者：Pandorym

Headless Chrome由 Chrome 59 帶來。這是一種在無界面環境中運行 Chrome 瀏覽器的方法。本質上，就是運行無界面的的 Chrome ！他為命令行帶來了 Chromium 和 Blink 渲染引擎提供的所有現代的 Web 平台特性。

他有什麼用呢？

用於對於自動化測試和不需要能看到UI外殼的服務器環境，Headless 瀏覽器是一個極好的工具。舉個例子，你可能要對一個真實的 Web 頁面進行一些測試，為它創建一個 PDF ，或者就是檢查噶瀏覽器如何渲染一個 URL 。

提示：Headless模式在Mac和Linux上是Chrome 59可用。Windows的支持將在Chrome 60到來。你可以打開chrome://version檢查當前版本。

開啟 Headless（CLI）

啟動 Headless 模式最簡單的方式四在命令行中打開 Chrome 二進制文件。如果你已經安裝Chrome 59+，攜帶--headless標識啟動Chrome：

chrome \
  --headless \                   # 使用 headless 模式.
  --disable-gpu \                # 現在暫時還需要.
  --remote-debugging-port=9222 \ 
  https://www.chromestatus.com   # 將打開的 URL. 默認打開 about:blank.

注意：現在，你需要包含--disable-gpu參數。不過這個參數最終會被丟棄。

chrome需要指向你安裝的 Chrome。確切位置因平台而異。由於我在 Mac 上，我為我安裝的每個版本的 Chrome 都創建了方便的別名。

如果你在Chrome的穩定版頻道不能進行這個測試，我推薦使用chrome-canary：

alias chrome="/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome"
alias chrome-canary="/Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary"
alias chromium="/Applications/Chromium.app/Contents/MacOS/Chromium"

從這裡下載 Chrome Canary。

命令行特性

在某些情況下，你可能不需要以編程方式編寫 Headless Chrome。這是一些對執行普通任務有用的命令行標識。

打印 DOM

--dump-dom標識，打印document.body.innerHTML到標準輸出：

chrome --headless --disable-gpu --dump-dom https://www.chromestatus.com/

創建 PDF

--print-to-pdf標識，為頁面創建一個PDF文件：

chrome --headless --disable-gpu --print-to-pdf https://www.chromestatus.com/

取得截屏

要捕獲一個頁面的屏幕截圖，使用--screenshot標識：

# Size of a standard letterhead.
chrome --headless --disable-gpu --screenshot --window-size=1280,1696 https://www.chromestatus.com/

# Nexus 5x
chrome --headless --disable-gpu --screenshot --window-size=412,732 https://www.chromestatus.com/

使用--screenshot標識運行，將在當前工作目錄產生一個命名為screenshot.png的文件。如果你正在尋找完整的頁面截圖，事情會更複雜一些。這有一個 Davil Schnurr 發佈的優秀的Blog，包含你需要的大部分內容。查看：使用 Headless Chrome 作為自動截屏工具。

REPL 模式（read-eval-print loop）

--repl標識使 Headless 運行砸地一個特定的模式。你可以通過命令行，要求在瀏覽器中執行JS表達式：

$ chrome --headless --disable-gpu --repl https://www.chromestatus.com/
[0608/112805.245285:INFO:headless_shell.cc(278)] Type a Javascript expression to evaluate or "quit" to exit.
>>> location.href
{"result":{"type":"string","value":"https://www.chromestatus.com/features"}}
>>> quit
$

脫離瀏覽器界面調試Chrome

當你使用--remote-debug-port=9222運行 Chrome 時，他將啟動一個實例，并激活 DevTools 協議。這個協議用於連接 Chrome ，和驅動 Headless 瀏覽器實例。他也用於使用 Sublime，VS Code 和 Node 等工具遠程調試應用程序。#協同

由於你沒有瀏覽器 UI 查看頁面，在另一個瀏覽器進入http://localhost:9222以檢查他正在工作。你將在視察頁面看到一個列表，在這你可以通過點擊來查看 Headless 的渲染。

DevTools remote debugging UI

在這，你可以使用熟悉的 DevTools 特性像通常那樣去檢查、調試和調整頁面。如果以編程方式使用 Headless，這個頁面也是一個強大的調試工具，能看到所有原始的 DevTools 協議命令通過連線，於瀏覽器進行通信。

使用編程方式（Node）

啟動 Chrome

在之前的章節，我們使用--headless --remote-debugging-port=9222手動啟動Chrome。然後，為了完全自動化測試，你可能想在你的應用程序中生成Chrome。

一個方法是使用child-process：

const execFile = require('child_process').execFile;

function launchHeadlessChrome(url, callback) {
  // Assuming MacOSx.
  const CHROME = '/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome';
  execFile(CHROME, ['--headless', '--disable-gpu', '--remote-debugging-port=9222', url], callback);
}

launchHeadlessChrome('https://www.chromestatus.com', (err, stdout, stderr) => {
  ...
});

但你想要一個跨多平台的解決方案時，事情就變得棘手了。就看看這硬編碼的 Chrome 路徑吧 ?

使用 ChromeLauncher

Lighthouse 是一個用於測試Web應用程序質量的極好的工具。有一個用於啟動 Chrome 的強大的模塊，之前被包含在 Lightouse 內部，但現在可以獨立使用了。chrome-launcher NPM 模塊將尋找 Chrome 的安裝位置、設置一個調試實例、啟動瀏覽器，并在你的應用程序完成時結束他。最優秀的是它跨多平台工作，感謝 Node ！

默認的，chrome-launcher將啟動 Chrome Canary （如果安裝了），但你能手動更改選擇使用哪一個 Chrome。要使用他，先從 npm 安裝：

yarn add chrome-launcher

Example – 使用chrome-launcher啟動 Headless

const chromeLauncher = require('chrome-launcher');

// Optional: set logging level of launcher to see its output.
// Install it using: yarn add lighthouse-logger
// const log = require('lighthouse-logger');
// log.setLevel('info');

/**
 * Launches a debugging instance of Chrome.
 * @param {boolean=} headless True (default) launches Chrome in headless mode.
 *     False launches a full version of Chrome.
 * @return {Promise<ChromeLauncher>}
 */
function launchChrome(headless=true) {
  return chromeLauncher.launch({
    // port: 9222, // Uncomment to force a specific port of your choice.
    chromeFlags: [
      '--window-size=412,732',
      '--disable-gpu',
      headless ? '--headless' : ''
    ]
  });
}

launchChrome().then(chrome => {
  console.log(`Chrome debuggable on port: ${chrome.port}`);
  ...
  // chrome.kill();
});

運行這個腳本并不會做什麼事，但你應該在任務管理器中看到一個 Chrome 實例啟動，他載入了about:blank。記著，這不會有任何瀏覽器界面，We’re headless.

要控制瀏覽器，我們需要 DevTools 協議！

檢索關於界面的信息

chrome-remote-interface是一個很好的 Node 包，他提供了 DevTools 協議可用的 APIs。你可以使用他操作 Headless Chrome，跳轉到頁面，獲取關於頁面的相關信息。

警告：DevTools 協議可以做很多有趣的事，但作為入門選項他令人沮喪。我建議先花費一點時間瀏覽 DevTools Protocol Viewer，然後移步到chrome-remote-interface的API文檔，看看他是如何包裝原始協議的。

讓我們安裝這個庫：

yarn add chrome-remote-interface

Examples

Example – 打印 user agent

const CDP = require('chrome-remote-interface');

...

launchChrome().then(async chrome => {
  const version = await CDP.Version({port: chrome.port});
  console.log(version['User-Agent']);
});

返回值大概像這樣：HeadlessChrome/60.0.3082.0

Example – 檢查這個網站是否存在 web app manifest

const CDP = require('chrome-remote-interface');

...

(async function() {

const chrome = await launchChrome();
const protocol = await CDP({port: chrome.port});

// Extract the DevTools protocol domains we need and enable them.
// See API docs: https://chromedevtools.github.io/devtools-protocol/
const {Page} = protocol;
await Page.enable();

Page.navigate({url: 'https://www.chromestatus.com/'});

// Wait for window.onload before doing stuff.
Page.loadEventFired(async () => {
  const manifest = await Page.getAppManifest();

  if (manifest.url) {
    console.log('Manifest: ' + manifest.url);
    console.log(manifest.data);
  } else {
    console.log('Site has no app manifest');
  }

  protocol.close();
  chrome.kill(); // Kill Chrome.
});

})();

Example – 使用 DOM APIs 取得頁面的<title>

const CDP = require('chrome-remote-interface');

...

(async function() {

const chrome = await launchChrome();
const protocol = await CDP({port: chrome.port});

// Extract the DevTools protocol domains we need and enable them.
// See API docs: https://chromedevtools.github.io/devtools-protocol/
const {Page, Runtime} = protocol;
await Promise.all([Page.enable(), Runtime.enable()]);

Page.navigate({url: 'https://www.chromestatus.com/'});

// Wait for window.onload before doing stuff.
Page.loadEventFired(async () => {
  const js = "document.querySelector('title').textContent";
  // Evaluate the JS expression in the page.
  const result = await Runtime.evaluate({expression: js});

  console.log('Title of page: ' + result.result.value);

  protocol.close();
  chrome.kill(); // Kill Chrome.
});

})();

使用 Selenium，WebDriver，和ChromeDriver

現在，Selenium打開了一個完整的 Chrome 實例。換句話說，這是一個自動化的解決方案，但不是完全 Headless。但是Selenium可以配置運行 Headless Chrome，只需少量配置即可。如果你想要配置的完全指南，我推薦 Running Selenium with Headless Chrome，但我已經在列出了一些例子，讓你可以快速開始。

使用 ChromeDriver

ChromeDriver 2.3.0 支持 Chrome 59 及之後的版本，並且工作於 Headless Chrome。在某些情況下，你可能需要 Chrome 60 來避免 Bugs。舉個例子，我們已經知道 Chrome 59 的獲取截圖存在問題。

安裝：

yarn add selenium-webdriver chromedriver

Example：

const fs = require('fs');
const webdriver = require('selenium-webdriver');
const chromedriver = require('chromedriver');

// This should be the path to your Canary installation.
// I'm assuming Mac for the example.
const PATH_TO_CANARY = '/Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary';

const chromeCapabilities = webdriver.Capabilities.chrome();
chromeCapabilities.set('chromeOptions', {
  binary: PATH_TO_CANARY // Screenshots require Chrome 60. Force Canary.
  'args': [
    '--headless',
  ]
});

const driver = new webdriver.Builder()
  .forBrowser('chrome')
  .withCapabilities(chromeCapabilities)
  .build();

// Navigate to google.com, enter a search.
driver.get('https://www.google.com/');
driver.findElement({name: 'q'}).sendKeys('webdriver');
driver.findElement({name: 'btnG'}).click();
driver.wait(webdriver.until.titleIs('webdriver - Google Search'), 1000);

// Take screenshot of results page. Save to disk.
driver.takeScreenshot().then(base64png => {
  fs.writeFileSync('screenshot.png', new Buffer(base64png, 'base64'));
});

driver.quit();

使用 WebDriverIO

WebDriverIO 是一個 Selenium WebDirver 上的的高級別 API。

安裝：

yarn add webdriverio chromedriver

Example – 過濾 chromestatus.com 上的 CSS 特性

const webdriverio = require('webdriverio');
const chromedriver = require('chromedriver');

// This should be the path to your Canary installation.
// I'm assuming Mac for the example.
const PATH_TO_CANARY = '/Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary';
const PORT = 9515;

chromedriver.start([
  '--url-base=wd/hub',
  `--port=${PORT}`,
  '--verbose'
]);

(async () => {

const opts = {
  port: PORT,
  desiredCapabilities: {
    browserName: 'chrome',
    chromeOptions: {
      binary: PATH_TO_CANARY // Screenshots require Chrome 60. Force Canary.
      args: ['--headless']
    }
  }
};

const browser = webdriverio.remote(opts).init();

await browser.url('https://www.chromestatus.com/features');

const title = await browser.getTitle();
console.log(`Title: ${title}`);

await browser.waitForText('.num-features', 3000);
let numFeatures = await browser.getText('.num-features');
console.log(`Chrome has ${numFeatures} total features`);

await browser.setValue('input[type="search"]', 'CSS');
console.log('Filtering features...');
await browser.pause(1000);

numFeatures = await browser.getText('.num-features');
console.log(`Chrome has ${numFeatures} CSS features`);

const buffer = await browser.saveScreenshot('screenshot.png');
console.log('Saved screenshot...');

chromedriver.stop();
browser.end();

})();

FAQ

我需要--disable-gpu標識嗎？

是的，現在需要。--disable-gpu標識是暫時處理一些 Bug 的必要條件。在未來的 Chrome 版本中不需要這個標識。看 https://crbug.com/546953#c152 和 https://crbug.com/695212 獲得更多信息。

所以我仍然需要 Xvfb？

不需要。Headless Chrome 不使用窗口，所以不再需要像 Xvfb 這樣的展示服務。你可以拋開它，開心地運行自動化測試。

什麼是 Xvfb？ Xvfb 是一個類 Unix 系統的內存顯示服務器，可以讓你運行圖形應用程序（如 Chrome），無需附加物理顯示器。許多人使用 Xvfb 運行早期版本的 Chrome 進行「Headless」測試。

我要如何創建一個 Docker 容器運行 Headless Chrome？

查看 lighthouse-ci。他有一個例子 Dockerfile，它使用 Ubuntu 為基本映像，並在 APP Engine Flexible 容器中安裝、運行 Lightouse。

我可以和 Selenium / WebDriver / ChromeDriver 一起使用嗎？

可以，看上文「使用 Selenium，WebDriver，和ChromeDriver」。

他和 PhantomJS 有什麼關係？

Headless Chrome 是一個於 PhantomJS 相似的工具。兩者都可用於 Headless 環境下的自動化測試。兩者主要的區別在於，Phantom 使用舊版本的 WebKit 作為渲染引擎，Headless Chrome 使用最新版本的 Blink。

目前，Phantom 提供了比 DevTools 協議高級別的 API。

在哪提交 bugs？

對於 Headless Chrome 的 Bugs，crbug.com。
對於 DevTools 協議的 Bugs，github.com/ChromeDevTools/devtools-protocol。

文章TAG：chrome 【譯】Headless Chrome 入門指南

加载全部内容