Export space #1458
Comments
I was considering this idea some time ago, but I couldn't find a working solution. Calling the endpoint that triggers the export is feasible. The problem comes after the POST request that initiates the export: another form is displayed that updates the status via AJAX requests, and once the status reaches 100%, another request is sent with the link to download the generated export file. Does anyone have any ideas on how to capture these AJAX requests that follow the initial POST request to the site?
@gkowalc This is how I eventually solved it, by inspecting the HTTP requests that Confluence sends. My script hard-codes HTML as the export format, but I am sure it can be parameterized further if needed. The function returns a direct URL to the zipped content, which can then be downloaded with an HTTP GET request. I used BeautifulSoup to parse the HTML responses:

import time
from bs4 import BeautifulSoup

# Method of a class that keeps the Confluence client in self.confluence_client
def __get_space_html_download_url(self, space_key: str) -> str:
    try:
        # Load the export form to obtain the CSRF token (atl_token)
        url = f"spaces/exportspacehtml.action?key={space_key}"
        response = self.confluence_client.get(url, advanced_mode=True)
        parsed_html = BeautifulSoup(response.text, "html.parser")
        atl_token = parsed_html.find("input", { "name": "atl_token" }).get("value")
        form_data = {
            "atl_token": atl_token,
            "exportType": "TYPE_HTML",
            "contentOption": "visibleOnly",
            "includeComments": True,
            "confirm": "Export"
        }
        # bypass self.confluence_client.post method because it serializes form data as JSON which is wrong
        url = self.confluence_client.url_joiner(url=self.confluence_client.url, path=f"spaces/doexportspace.action?key={space_key}")
        response = self.confluence_client.session.post(url, headers=self.confluence_client.form_token_headers, data=form_data)
        # The response page carries the URL to poll for export progress
        parsed_html = BeautifulSoup(response.text, "html.parser")
        poll_url = parsed_html.find("meta", { "name": "ajs-pollURI" }).get("content")
        running_task = True
        while running_task:
            # Poll once per second until the export reports completion
            progress_response = self.confluence_client.get(poll_url)
            print(progress_response)
            if progress_response['complete']:
                # The completion message is HTML that contains the download link
                parsed_html = BeautifulSoup(progress_response['message'], "html.parser")
                download_url = parsed_html.find("a", { "class": "space-export-download-path" }).get("href")
                return download_url
            time.sleep(1)
        return None
    except Exception as e:
        print(e)
        return None

Maybe someone can tweak it further, make it more general (choice of export format and what to export), and create a PR so it becomes part of the library. 🙂
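As a starting point for making the snippet more general, the form payload could be built from parameters instead of being hard-coded. This is only a minimal sketch: `build_export_form` is a hypothetical helper, not something the library provides. Only `TYPE_HTML` and `visibleOnly` come from the snippet above; any other values would need to be confirmed by inspecting the requests Confluence sends, the same way the HTML case was worked out.

```python
from typing import Any, Dict


def build_export_form(
    atl_token: str,
    export_type: str = "TYPE_HTML",
    content_option: str = "visibleOnly",
    include_comments: bool = True,
) -> Dict[str, Any]:
    """Build the form payload for doexportspace.action (hypothetical helper).

    The defaults mirror the hard-coded values in the snippet above.
    Other values for export_type / content_option are unverified assumptions.
    """
    if not atl_token:
        raise ValueError("atl_token must not be empty")
    return {
        "atl_token": atl_token,
        "exportType": export_type,
        "contentOption": content_option,
        "includeComments": include_comments,
        "confirm": "Export",
    }
```

The returned dictionary can be passed as `data=` to the `session.post` call in place of the literal `form_data`.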
Hmm, I was setting up a session with the requests library to make use of the atl_token, but your approach with Beautiful Soup (BS4) looks promising. I will experiment with the code based on your idea, and if I come up with a working solution, I will send a pull request (PR) soon. Thanks for sharing your idea.
Using BeautifulSoup is not a must here; it can easily be replaced with simple regular expressions. I used it so the code is more readable. Feel free to experiment with the snippet.
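For illustration, a regex-based replacement for the `atl_token` lookup might look like the sketch below. `extract_atl_token` is a hypothetical name, and the pattern assumes the `name` attribute appears before `value` inside the `<input>` tag, which may not hold for every Confluence version.

```python
import re
from typing import Optional

# Assumes name="atl_token" comes before value="..." within the same <input> tag.
_ATL_TOKEN_RE = re.compile(
    r'<input\b[^>]*\bname=["\']atl_token["\'][^>]*\bvalue=["\']([^"\']*)["\']',
    re.IGNORECASE,
)


def extract_atl_token(html: str) -> Optional[str]:
    """Return the atl_token value from an HTML page, or None if it is not found."""
    match = _ATL_TOKEN_RE.search(html)
    return match.group(1) if match else None
```

Returning `None` when nothing matches lets the caller decide how to fail, instead of raising an `AttributeError` on a missing element.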
Hello @aleksvujic and others interested in this feature. I think we might hit an issue when an export takes too long (the CSRF token might expire). I ran my tests on relatively small space exports that finished in a couple of minutes.
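One small mitigation for long exports is to bound the polling loop with a deadline, so the caller gets a clear signal instead of waiting forever. The sketch below does not solve token expiry itself; `poll_until_complete`, its parameters, and the default timeout are hypothetical, and it only assumes the status payload has a `complete` key as in the snippet earlier in this thread.

```python
import time
from typing import Any, Callable, Dict, Optional


def poll_until_complete(
    fetch_status: Callable[[], Dict[str, Any]],
    timeout_seconds: float = 600.0,
    interval_seconds: float = 1.0,
) -> Optional[Dict[str, Any]]:
    """Call fetch_status until it reports completion or the deadline passes.

    Returns the final status dict, or None if the timeout was reached.
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        status = fetch_status()
        if status.get("complete"):
            return status
        time.sleep(interval_seconds)
    return None
```

In the earlier snippet this could be called as `poll_until_complete(lambda: self.confluence_client.get(poll_url))`, with a `None` result treated as "export did not finish in time".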
Can you please implement the "Export space" functionality that is available via the GUI in your Python library as well? In the GUI, it is reached by clicking Space settings in the left sidebar and then clicking Export space in the Manage space section.