Merge pull request #18 from piercefreeman/feature/ca-install-cli
Make the installation process easier for third-party libraries by extending our executable with an install-ca command. This currently supports installation on macOS and Ubuntu. We also add build logic for the Python library so it can be deployed via PyPI with the Go executable bundled. Specifically:

- Add a new build extension phase that builds the Go executable and deposits it into the correct asset directory. This is somewhat of a rare pattern (arguably an anti-pattern) in distutils-based pipelines, since we're not building a shared .so library. This is intentional: our goal is to deliver the appropriate Go executable as a separate process, not to integrate it at the code level with our Python application.
- Add separate runners to build component wheels on Ubuntu and macOS.
- Combine wheels in the final workflow and upload via Poetry to PyPI.
Showing 18 changed files with 418 additions and 63 deletions.
Binary file not shown.
@@ -1,3 +1,80 @@
-# groove-python
+# Groove
 
-Python APIs for Groove.
+Python APIs for Groove, a proxy server built for web crawling and unit test mocking. Highlights of its primary features:
+
+- HTTP and HTTPS support over HTTP/1 and HTTP/2.
+- Local CA certificate generation and installation on Mac and Linux to support system curl and Chromium.
+- Different tiers of caching support, from disabling caching completely to aggressively maintaining all body archives.
+- Limit outbound requests of the same URL to 1 concurrent request to save on bandwidth if requests are already in flight.
+- Record and replay requests made to outgoing servers. Recreate testing flows in unit tests while separating them from crawling business logic.
+- 3rd party proxy support for commercial proxies.
+- Custom TLS Hello Client support to maintain a Chromium-like TLS handshake while intercepting requests and re-forwarding packets.
+
+For more information, see the [GitHub](https://github.com/piercefreeman/grooveproxy) project.
+
+## Usage
+
+Add groove to your project and install the local certificates that allow for HTTPS certificate generation:
+
+```
+pip install groove
+install-ca
+```
+
+Instantiating Groove with the default parameters is fine for most deployments. To ensure resources are cleaned up once you're done with the proxy, wrap your code in the `launch` context manager.
+
+```
+from groove.proxy import Groove
+from requests import get
+
+proxy = Groove()
+with proxy.launch():
+    response = get(
+        "https://www.example.com",
+        proxies={
+            "http": proxy.base_url_proxy,
+            "https": proxy.base_url_proxy,
+        }
+    )
+    assert response.status_code == 200
+```
+
+Create a fully fake outbound for testing:
+
+```
+from groove.proxy import Groove
+from requests import get
+
+records = [
+    TapeRecord(
+        request=TapeRequest(
+            url="https://example.com:443/",
+            method="GET",
+            headers={},
+            body=b"",
+        ),
+        response=TapeResponse(
+            status=200,
+            headers={},
+            body=b64encode("Test response".encode())
+        ),
+    )
+]
+
+proxy = Groove()
+with proxy.launch():
+    proxy.tape_load(
+        TapeSession(
+            records=records
+        )
+    )
+
+    response = get(
+        "https://www.example.com",
+        proxies={
+            "http": proxy.base_url_proxy,
+            "https": proxy.base_url_proxy,
+        }
+    )
+    assert response.content == b"Test response"
+```
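The replay example in the README stores the response body base64-encoded and then asserts on the decoded bytes. Below is a minimal sketch of that round trip using only the standard library. The helper names `encode_tape_body` and `decode_tape_body` are hypothetical and not part of Groove's API, and it is an assumption that the proxy decodes the body this way before serving it.

```python
from base64 import b64decode, b64encode


def encode_tape_body(text: str) -> bytes:
    """Hypothetical helper: base64-encode a text body for a tape record."""
    return b64encode(text.encode())


def decode_tape_body(encoded: bytes) -> bytes:
    """Hypothetical helper: recover the raw bytes a client would receive."""
    return b64decode(encoded)


encoded = encode_tape_body("Test response")
decoded = decode_tape_body(encoded)
print(encoded)  # b'VGVzdCByZXNwb25zZQ=='
print(decoded)  # b'Test response'
```

Note that the README snippet itself would also need `from base64 import b64encode` in scope for `b64encode` to resolve.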
@@ -0,0 +1,75 @@
+from distutils.command.build_ext import build_ext
+from distutils.core import Distribution
+from distutils.errors import (CCompilerError, CompileError, DistutilsExecError,
+                              DistutilsPlatformError)
+from distutils.extension import Extension
+from os import chmod, stat
+from pathlib import Path
+from shutil import copyfile
+from subprocess import run
+
+
+class GoExtension(Extension):
+    def __init__(self, name, path):
+        super().__init__(name, sources=[])
+        self.path = path
+
+
+extensions = [
+    GoExtension(
+        #"groove",
+        "groove.assets.grooveproxy",
+        # Assume we have temporarily copied over the proxy folder into our current path
+        "./proxy",
+    )
+]
+
+
+class BuildFailed(Exception):
+    pass
+
+
+class GoExtensionBuilder(build_ext):
+    def run(self):
+        try:
+            build_ext.run(self)
+        except (DistutilsPlatformError, FileNotFoundError):
+            raise BuildFailed("File not found. Could not compile extension.")
+
+    def build_extension(self, ext):
+        try:
+            if isinstance(ext, GoExtension):
+                extension_root = Path(__file__).parent.resolve() / ext.path
+                ext_path = self.get_ext_fullpath(ext.name)
+                result = run(["go", "build", "-o", str(Path(ext_path).absolute())], cwd=extension_root)
+                if result.returncode != 0:
+                    raise CompileError("Go build failed")
+            else:
+                build_ext.build_extension(self, ext)
+        except (CCompilerError, DistutilsExecError, DistutilsPlatformError, ValueError):
+            raise BuildFailed('Could not compile C extension.')
+
+
+def build(setup_kwargs):
+    distribution = Distribution({"name": "python_ctypes", "ext_modules": extensions})
+    distribution.package_dir = "python_ctypes"
+
+    cmd = GoExtensionBuilder(distribution)
+    cmd.ensure_finalized()
+    cmd.run()
+
+    # This is somewhat of a hack with go executables; this pipeline will package
+    # them as .so files but they aren't actually built libraries. We maintain
+    # this convention only for the ease of plugging in to poetry and distutils that
+    # use this suffix to indicate the build architecture and run on the
+    # correct downstream client OS.
+    for output in cmd.get_outputs():
+        relative_extension = Path(output).relative_to(cmd.build_lib)
+        copyfile(output, relative_extension)
+        mode = stat(relative_extension).st_mode
+        mode |= (mode & 0o444) >> 2  # copy each read bit to the matching execute bit
+        chmod(relative_extension, mode)
+
+
+if __name__ == "__main__":
+    build({})
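The last step of `build` marks each copied artifact as executable with the expression `mode |= (mode & 0o444) >> 2`. As a small illustration of that bit arithmetic, the sketch below isolates it in a pure function. The name `add_execute_bits` is hypothetical and does not appear in the commit.

```python
def add_execute_bits(mode: int) -> int:
    """Return `mode` with the execute bit set for every class that can read.

    `mode & 0o444` keeps only the read bits (user, group, other); shifting
    right by two moves each read bit onto the execute bit of the same class.
    """
    return mode | ((mode & 0o444) >> 2)


print(oct(add_execute_bits(0o644)))  # 0o755
print(oct(add_execute_bits(0o600)))  # 0o700
```

The effect is that a file only becomes executable for users who could already read it, so a private file (`0o600`) stays private.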
@@ -0,0 +1,10 @@
+from subprocess import run
+
+from groove.assets import get_asset_path
+
+
+def install_ca():
+    # subprocess.run expects the program and its arguments as one sequence.
+    run(
+        [str(get_asset_path("grooveproxy")), "install-ca"],
+    )
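`subprocess.run` takes the program and its arguments as a single sequence; a second positional argument is forwarded to `Popen` as `bufsize` rather than treated as a command-line argument. Below is a minimal sketch of invoking a subcommand that way. The helper `run_subcommand` is hypothetical, and the Python interpreter stands in for the bundled `grooveproxy` binary so the example runs anywhere.

```python
import sys
from subprocess import run


def run_subcommand(executable: str, subcommand_args: list[str]) -> str:
    """Run `executable` with the given arguments and return its stdout."""
    result = run(
        [executable, *subcommand_args],  # one sequence: program first, then arguments
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Stand-in for something like [<path to grooveproxy>, "install-ca"].
output = run_subcommand(sys.executable, ["-c", "print('ok')"])
print(output.strip())  # ok
```

Passing `check=True` is a design choice here: it surfaces a failed installation as an exception instead of silently ignoring the exit code.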