This short tutorial explains how to replay a browser's captured network traffic offline using Python.
Once you have captured the browser's traffic as a HAR (HTTP Archive) file, you can set up a proxy server that serves the recorded responses for the requests found in the HAR.
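A HAR file is just JSON; the only fields this tutorial relies on are each entry's request URL and its recorded response. Simplified, and following the HAR 1.2 field names, the relevant structure looks roughly like this (the URL and values are placeholders, not from any real capture):

# Simplified shape of a HAR file; only the fields used in this tutorial are shown.
har = {
    "log": {
        "entries": [
            {
                "request": {"url": "https://example.com/api"},
                "response": {
                    "status": 200,
                    "statusText": "OK",
                    "httpVersion": "http/2.0",
                    "headers": [{"name": "Content-Type", "value": "application/json"}],
                    "content": {"text": "{\"hello\": \"world\"}"},
                },
            }
        ]
    }
}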
Let's start by creating a dictionary that maps each request URL in the HAR to its recorded response.
import json

# Build a lookup table: request URL -> recorded response object from the HAR.
dict_url_response = {}

with open("har/cap1.har", "r") as fob:
    data = json.load(fob)

entries = data["log"]["entries"]
for entry in entries:
    url = entry["request"]["url"]
    response = entry["response"]
    dict_url_response[url] = response
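As an optional sanity check, you can print how many responses were loaded and peek at one of the captured URLs:

# Optional: confirm the HAR was parsed and the lookup table is populated.
print(f"Loaded {len(dict_url_response)} responses from the HAR")
if dict_url_response:
    sample_url = next(iter(dict_url_response))
    print("Example captured URL:", sample_url)
    print("Recorded status:", dict_url_response[sample_url]["status"])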
Now create the addon class that mitmproxy will use to intercept requests and answer them from this dictionary.
# Note: this import path matches older mitmproxy releases (around 6.x);
# recent versions expose Response and Headers from mitmproxy.http instead.
from mitmproxy.net.http import Response, Headers

class Interception:
    def request(self, flow):
        try:
            # Look up the recorded response for this request URL.
            url = flow.request.url
            har_response = dict_url_response[url]
            text = har_response["content"]["text"]
            byt = text.encode()
            # Rebuild the response headers from the HAR entry.
            list_headers = []
            for obj in har_response["headers"]:
                list_headers.append((obj["name"].encode(), obj["value"].encode()))
            headers = Headers(list_headers)
            response = Response(
                http_version=har_response["httpVersion"].encode(),
                status_code=har_response["status"],
                reason=har_response["statusText"].encode(),
                headers=headers,
                content=byt,
                trailers=None,
                timestamp_start=0.,
                timestamp_end=1.
            )
        except KeyError:
            # The URL was not captured: answer with a plain-text placeholder instead.
            print("Request URL not found in HAR")
            headers = Headers([
                (b'Content-Type', b'text/plain; charset=utf-8'),
            ])
            response = Response(
                http_version=b"http/2.0",
                status_code=200,
                reason=b"OK",
                headers=headers,
                content=b"Request URL not found in HAR\n",
                trailers=None,
                timestamp_start=0.,
                timestamp_end=1.
            )
        # Setting flow.response makes mitmproxy reply without contacting the real server.
        flow.response = response
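If you prefer not to embed mitmproxy programmatically (shown in the next step), the same class can also be used as a regular mitmproxy addon script. For example, a file (here hypothetically named replay_addon.py) containing the HAR-loading code and the Interception class from above only needs the standard addon registration at the end:

# replay_addon.py (hypothetical file name): HAR loading code + Interception class from above,
# followed by the addon list that mitmproxy picks up when running a script.
addons = [Interception()]

It can then be run with mitmdump -s replay_addon.py --listen-port 8080.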
Finally, we just have to start the mitmproxy server with the interception addon attached.
from mitmproxy.options import Options
from mitmproxy.proxy.config import ProxyConfig
from mitmproxy.proxy.server import ProxyServer
from mitmproxy.tools.dump import DumpMaster
import threading
import asyncio

intercept = Interception()

# Runs the mitmproxy event loop in a background thread so the script keeps running.
def thread_func(loop, m):
    asyncio.set_event_loop(loop)
    m.run_loop(loop.run_forever)

options = Options(
    listen_host='0.0.0.0',
    listen_port=8080,
    http2=True
)
m = DumpMaster(
    options,
    with_termlog=True,
    with_dumper=True
)
m.addons.add(intercept)

# ProxyConfig and ProxyServer belong to older mitmproxy releases (around 6.x);
# the rewritten proxy core in newer versions no longer uses these classes.
config = ProxyConfig(options)
m.server = ProxyServer(config)

loop = asyncio.get_event_loop()
t = threading.Thread(
    target=thread_func,
    args=(loop, m)
)
t.start()
The thread_func is used to keep the proxy server running in the background. The proxy listens on port 8080 and is reachable locally at 127.0.0.1:8080. In order to test the HAR replay you have to make the browser go through the proxy server; for Chrome on macOS the command would be:
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --proxy-server=127.0.0.1:8080
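You can also check the replay without a browser, for example with the requests library (not part of the original setup, just a quick test). Note that for HTTPS URLs the client must either trust the mitmproxy CA certificate (normally installed by visiting mitm.it through the proxy) or skip verification, as done here. The URL below is only a placeholder; use one that actually appears in your HAR:

import requests

# Route the request through the local mitmproxy instance.
proxies = {
    "http": "http://127.0.0.1:8080",
    "https": "http://127.0.0.1:8080",
}

# Replace with a URL that is present in your HAR capture.
test_url = "https://example.com/api"

# verify=False because mitmproxy re-signs HTTPS traffic with its own CA certificate.
r = requests.get(test_url, proxies=proxies, verify=False)
print(r.status_code)
print(r.text[:200])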
Sometimes the responses captured in the HAR are gzip or brotli compressed. Such cases are taken into account in this GitHub repository:
https://github.com/endgame-is-on/HAReplay
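As a rough illustration of the idea (not the repository's exact code), one way to handle this is to re-compress the body so that it matches the recorded Content-Encoding header, and to decode the content first if the HAR marks it as base64. The helper below is a minimal sketch assuming the third-party brotli package is installed:

import base64
import gzip

import brotli  # third-party package, assumed installed (pip install brotli)

def build_body(har_response):
    """Return response bytes consistent with the recorded Content-Encoding header."""
    content = har_response["content"]
    raw = content.get("text", "")
    # HAR may store binary bodies as base64 (content["encoding"] == "base64").
    if content.get("encoding") == "base64":
        body = base64.b64decode(raw)
    else:
        body = raw.encode()
    # Re-compress so the body matches the captured headers.
    encoding = ""
    for header in har_response["headers"]:
        if header["name"].lower() == "content-encoding":
            encoding = header["value"].lower()
    if "gzip" in encoding:
        body = gzip.compress(body)
    elif "br" in encoding:
        body = brotli.compress(body)
    return body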
Also published on Medium (endgame-is-on).