Capture HAR files from a headless Chrome instance.
Under the hood this module uses chrome-remote-interface to instrument Chrome.
Install this module from NPM:
npm install chrome-har-capturer
Start Chrome like this:
google-chrome --remote-debugging-port=9222 --headless
Important note: this is a complete rewrite that uses recent JavaScript features and takes full advantage of Chrome's headless mode. As a result, it requires Node.js 7.6.0 or later and works only in headless mode. It also introduces breaking changes in the API. If these requirements cannot be met, stay on version 0.9.5, but note that it is no longer supported.
The command line utility can be used to generate HAR files from a list of URLs. The following options are available:
-h, --help output usage information
-t, --host <host> Chrome Debugging Protocol host
-p, --port <port> Chrome Debugging Protocol port
-x, --width <dip> frame width in DIP
-y, --height <dip> frame height in DIP
-o, --output <file> write to file instead of stdout
-c, --content also capture the requests body
-k, --cache allow caching
-a, --agent <agent> user agent override
-b, --block <URL> URL pattern (*) to block (can be repeated)
-H, --header <header> Additional headers (can be repeated)
-i, --insecure ignore certificate errors
-g, --grace <ms> time to wait after the load event
-u, --timeout <ms> time to wait before giving up with a URL
-r, --retry <number> number of retries on page load failure
-e, --retry-delay <ms> time to wait before starting a new attempt
-d, --post-data <bytes> maximum POST data size to be returned
-l, --parallel <n> load <n> URLs in parallel
Alternatively this module provides a simple API that can be used to write custom applications. See the command line utility source code for a working example.
The run() function starts loading a batch of URLs and returns an event emitter (see below for the list of supported events).
urls is an array of URLs.
options is an object with the following optional properties:
- host: Chrome Debugging Protocol host. Defaults to localhost;
- port: Chrome Debugging Protocol port. Defaults to 9222;
- width: frame width in DIP. Defaults to a Chrome-defined value;
- height: frame height in DIP. Defaults to a Chrome-defined value;
- content: if true, also capture the requests body. Defaults to false;
- cache: if true, allow caching. Defaults to false;
- timeout: milliseconds to wait before giving up on a URL;
- retry: number of retries on page load failure. Defaults to 0;
- retry-delay: time to wait before starting a new attempt. Defaults to 0;
- postData: maximum POST data size (in bytes) to be returned. Defaults to unlimited;
- parallel: if true, load the URLs in parallel (warning: this may spoil time-based metrics). Defaults to false;
- preHook: function returning a Promise executed before each page load:
    - url: the current URL;
    - client: CDP client instance;
    - index: index of url in urls;
    - urls: input URL array.
- postHook: function returning a Promise executed after each page load event:
    - url: the current URL;
    - client: CDP client instance;
    - index: index of url in urls;
    - urls: input URL array.
  If this hook resolves to a value then it is included in the resulting HAR object as the value of the _user key of this URL's page object.
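As an illustration of the options above, here is a minimal sketch of an options object with both hooks. The helper name buildRunOptions, the timeout and retry values, and the shape of the value returned by postHook are made up for this example. It also assumes that the hook arguments arrive in the order listed above and that client is a chrome-remote-interface client exposing Runtime.evaluate.

```javascript
// Hypothetical helper (not part of the module): builds an options object
// for run() using only properties documented above.
function buildRunOptions({ port = 9222, captureContent = false } = {}) {
    return {
        host: 'localhost',
        port,
        content: captureContent,
        timeout: 30000, // arbitrary: give up on a URL after 30 seconds
        retry: 1,       // arbitrary: one extra attempt on failure
        // Runs before each page load and must return a Promise.
        // Assumption: arguments arrive as (url, client, index, urls).
        preHook: async (url, client, index, urls) => {
            console.error(`[${index + 1}/${urls.length}] about to load ${url}`);
        },
        // Runs after each page load event. The resolved value is expected to
        // end up under the _user key of this URL's page object.
        // Assumption: client exposes the CDP method Runtime.evaluate.
        postHook: async (url, client, index, urls) => {
            const { result } = await client.Runtime.evaluate({
                expression: 'document.title'
            });
            return { title: result.value };
        }
    };
}

const options = buildRunOptions({ captureContent: true });
// With the module installed, the object would then be passed along, e.g.:
// const emitter = require('chrome-har-capturer').run(['http://example.com/'], options);
```

Keeping the hooks small and side-effect free makes it easier to tell whether a slow capture comes from the page itself or from the hook.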
function (url, index, urls) {}
Emitted when Chrome is about to load url. index is the index of url in urls. urls is the array passed to run().
function (url, index, urls) {}
Emitted when Chrome finished loading url. index is the index of url in urls. urls is the array passed to run().
function (url, err, index, urls) {}
Emitted when Chrome cannot load url. The Error object err contains the failure reason. Failed URLs will not appear in the resulting HAR object. index is the index of url in urls. urls is the array passed to run().
function (har) {}
Emitted when all the URLs have been processed. If all the URLs fail then a valid empty HAR object is returned. har is the resulting HAR object.
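The four handlers above could be wired up roughly as follows. This is only a sketch: the event names 'load', 'done', 'fail' and 'har' are assumptions matched to the four signatures in order, watchRun is a hypothetical helper, and a plain EventEmitter with made-up data stands in for the emitter returned by run() so that the snippet runs without Chrome.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: attaches one listener per documented handler signature.
// The event names used here are assumptions, not confirmed by the text above.
function watchRun(emitter, onFinish) {
    const failures = [];
    emitter.on('load', (url, index, urls) => {
        console.error(`loading ${index + 1}/${urls.length}: ${url}`);
    });
    emitter.on('done', (url, index, urls) => {
        console.error(`finished ${url}`);
    });
    emitter.on('fail', (url, err, index, urls) => {
        // Failed URLs do not appear in the HAR, so record them separately.
        failures.push({ url, reason: err.message });
    });
    emitter.on('har', (har) => onFinish(har, failures));
    return emitter;
}

// Stand-in for the emitter returned by run(urls, options).
const fake = new EventEmitter();
let summary = null;
watchRun(fake, (har, failures) => {
    summary = { pages: har.log.pages.length, failed: failures.length };
});

// Simulate one successful and one failed URL with made-up data.
const urls = ['http://a.example/', 'http://b.example/'];
fake.emit('load', urls[0], 0, urls);
fake.emit('done', urls[0], 0, urls);
fake.emit('fail', urls[1], new Error('timeout'), 1, urls);
fake.emit('har', { log: { version: '1.2', pages: [{}], entries: [] } });
console.log(summary);
```

Collecting failures alongside the HAR is useful because the final object alone does not show which URLs were dropped.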
Generate a single-page HAR from an array of raw events that come from the Chrome Debugging Protocol (e.g., from chrome-remote-interface). Returns a Promise that fulfills to the generated HAR.
url is the page URL;
log is the array of events in the form:
    {
        method: '...',
        params: {...}
    }
Events to be provided are:
- Page.domContentEventFired;
- Page.loadEventFired;
- Network.requestWillBeSent;
- Network.dataReceived;
- Network.responseReceived;
- Network.resourceChangedPriority;
- Network.loadingFinished;
- Network.loadingFailed.
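A recorded event stream usually contains more than the events listed above, so one possible way to prepare the log is to filter it first. The sketch below uses a hypothetical helper, collectLog, and made-up sample events. With chrome-remote-interface such objects can be gathered from the client's generic 'event' notification, but that wiring is left out here.

```javascript
// Event names copied from the list above.
const HAR_EVENTS = new Set([
    'Page.domContentEventFired',
    'Page.loadEventFired',
    'Network.requestWillBeSent',
    'Network.dataReceived',
    'Network.responseReceived',
    'Network.resourceChangedPriority',
    'Network.loadingFinished',
    'Network.loadingFailed'
]);

// Hypothetical helper: keeps only the expected events, preserving their
// order, and reduces each one to the { method, params } shape.
function collectLog(rawEvents) {
    return rawEvents
        .filter(({ method }) => HAR_EVENTS.has(method))
        .map(({ method, params }) => ({ method, params }));
}

// Made-up sample: the Runtime event is not needed and gets dropped.
const sample = [
    { method: 'Network.requestWillBeSent', params: { requestId: '1' } },
    { method: 'Runtime.consoleAPICalled', params: {} },
    { method: 'Page.loadEventFired', params: { timestamp: 1 } }
];
const log = collectLog(sample);
console.log(log.map((event) => event.method));
```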
options is an object with the following optional properties:
- content: if true, also expect the requests body. Defaults to false.
When content is true, synthetic events in the following form are also expected; they represent the reply of the Network.getResponseBody method:
    {
        method: 'Network.getResponseBody',
        params: {
            requestId: '...',
            body: '...',
            base64Encoded: true/false
        }
    }
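The synthetic event can be assembled from a Network.getResponseBody reply with a small wrapper. This is a sketch under assumptions: toResponseBodyEvent is a hypothetical helper, the request ID and body are made up, and the reply is assumed to carry the body and base64Encoded fields that the CDP method returns.

```javascript
// Hypothetical helper: wraps a Network.getResponseBody reply in the
// synthetic event shape shown above.
function toResponseBodyEvent(requestId, reply) {
    return {
        method: 'Network.getResponseBody',
        params: {
            requestId,
            body: reply.body,
            base64Encoded: reply.base64Encoded
        }
    };
}

// Made-up reply for a made-up request ID.
const bodyEvent = toResponseBodyEvent('1000.1', {
    body: 'aGVsbG8=',
    base64Encoded: true
});
console.log(JSON.stringify(bodyEvent, null, 4));
```

The resulting object would be appended to the log array next to the regular events for the same requestId.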