Commit

DiscordChatExporterPy 2.7.0
* Add asset channel as way to export assets (#91)

* Addition of an optional asset channel to save attachments to another channel and preserve them from being not displayed in the final transcript (as long as the asset channel doesn't get deleted)

* Fetch attachments properly as Bytes and send them into the attachment channel

* propagate errors from sending assets to asset channel to user level

* Cleanup code after 1st review

Signed-off-by: doluk <69309597+doluk@users.noreply.github.com>

* Add new attribute to README.md

Signed-off-by: doluk <69309597+doluk@users.noreply.github.com>

* cleanup

Signed-off-by: doluk <69309597+doluk@users.noreply.github.com>

* Fix missing keyword argument

---------

Signed-off-by: doluk <69309597+doluk@users.noreply.github.com>

* Implement member caching across multiple instances of MemberConstruct (#92)

* feat: implement caching members in MessageConstructs

* version bump

* Asset Handler (#96)

- Removal of asset_channel feature
- Addition of AssetHandler origin class
- Addition of two examples for implementing AssetHandlers in form of LocalFileHostHandler and DiscordChannelHandler (latter one has the same functionality as the asset_channel)
- Bugfix: Preventing code blocks from being parsed for other markdown
- Bugfix: Fix timestamp markdown
- Bugfix: Fix div tags for pinned messages
- Bugfix: Fix parsing and css of here and everyone mentions
- Bugfix: Attempting to create a transcript of an empty channel would lead to an IndexError
- Bugfix: Fixing erroring in some channels but not others #98
- Addition of list markdown
- Addition of heading markdown

* README update

* Explain the concept of AttachmentHandlers (#99)

* Explain the AttachmentHandler concept

Signed-off-by: doluk <69309597+doluk@users.noreply.github.com>

* Deduplicate

Signed-off-by: doluk <69309597+doluk@users.noreply.github.com>

---------

Signed-off-by: doluk <69309597+doluk@users.noreply.github.com>

---------

Signed-off-by: doluk <69309597+doluk@users.noreply.github.com>
Co-authored-by: Lukas Dobler <69309597+doluk@users.noreply.github.com>
Co-authored-by: Dan <77840397+Void-ux@users.noreply.github.com>
Co-authored-by: mahtoid <git@mahto.id>
4 people authored Apr 1, 2024
1 parent 51e85d3 commit aa82847
Showing 11 changed files with 397 additions and 57 deletions.
191 changes: 184 additions & 7 deletions README.md
@@ -94,13 +94,14 @@ This would be the main function to use within chat-exporter.

**Optional Argument(s):**<br/>
`limit`: Integer value to set the limit (amount of messages) the chat exporter gathers when grabbing the history (default=unlimited).<br/>
`tz_info`: String value of a [TZ Database name](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List) to set a custom timezone for the exported messages (default=UTC)<br/>
`guild`: `discord.Guild` object which can be passed in to solve bugs for certain forks<br/>
`military_time`: Boolean value to set a 24h format for times within your exported chat (default=False | 12h format)<br/>
`fancy_times`: Boolean value which toggles the 'fancy times' (Today|Yesterday|Day)<br/>
`before`: `datetime.datetime` object which allows to gather messages from before a certain date
`after`: `datetime.datetime` object which allows to gather messages from after a certain date
`bot`: `commands.Bot` object to gather members who are no longer in your guild.
`tz_info`: String value of a [TZ Database name](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List) to set a custom timezone for the exported messages (default=UTC).<br/>
`guild`: `discord.Guild` object which can be passed in to solve bugs for certain forks.<br/>
`military_time`: Boolean value to set a 24h format for times within your exported chat (default=False | 12h format).<br/>
`fancy_times`: Boolean value which toggles the 'fancy times' (Today|Yesterday|Day).<br/>
`before`: `datetime.datetime` object which allows gathering messages from before a certain date.<br/>
`after`: `datetime.datetime` object which allows gathering messages from after a certain date.<br/>
`bot`: `commands.Bot` object to gather members who are no longer in your guild.<br/>
`attachment_handler`: `chat_exporter.AttachmentHandler` object that exports assets so they remain available after the `channel` has been deleted.<br/>

**Return Argument:**<br/>
`transcript`: The HTML build-up for you to construct the HTML File with Discord.
@@ -149,6 +150,7 @@ This would be for people who want to filter what content to export.
`military_time`: Boolean value to set a 24h format for times within your exported chat (default=False | 12h format)<br/>
`fancy_times`: Boolean value which toggles the 'fancy times' (Today|Yesterday|Day)<br/>
`bot`: `commands.Bot` object to gather members who are no longer in your guild.<br/>
`attachment_handler`: `chat_exporter.AttachmentHandler` object that exports assets so they remain available after the `channel` has been deleted.<br/>

**Return Argument:**<br/>
`transcript`: The HTML build-up for you to construct the HTML File with Discord.
@@ -183,6 +185,178 @@ async def purge(ctx: commands.Context, tz_info: str, military_time: bool):
```
</details>


<p align="right">(<a href="#top">back to top</a>)</p>

---
## Attachment Handler

Due to Discord's newly introduced restrictions on their CDN, we have introduced an Attachment Handler. This handler
helps you avoid the 'broken' or 'dead' assets that appear when attachments hosted by Discord reach their
expiration date.

The `AttachmentHandler` serves as a template for implementing your own asset handler. Below are two basic examples of
how to use it: one stores files on a local web server, the other stores them on Discord *(the latter is only an
example and will still run into the expiration issue)*.

If you do not specify an attachment handler, chat-exporter will continue to use the (proxy) URLs for the assets.
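
The fallback described above can be pictured with a small sketch. This is only an illustration of the idea, not chat-exporter's actual code: `FakeAttachment`, `FakeHandler`, `resolve_attachment`, and the `example.invalid` URLs are hypothetical stand-ins so the snippet runs without Discord.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeAttachment:
    """Hypothetical stand-in for discord.Attachment (only the fields used here)."""
    filename: str
    url: str
    proxy_url: str


class FakeHandler:
    """Hypothetical handler that pretends the file was re-hosted elsewhere."""

    async def process_asset(self, attachment: FakeAttachment) -> FakeAttachment:
        attachment.url = f"https://files.example.invalid/{attachment.filename}"
        attachment.proxy_url = attachment.url
        return attachment


async def resolve_attachment(attachment, handler=None):
    """Return the attachment to embed: re-hosted if a handler is given, unchanged otherwise."""
    if handler is None:
        # No handler: keep the original (proxy) URLs.
        return attachment
    return await handler.process_asset(attachment)


def make_attachment() -> FakeAttachment:
    return FakeAttachment(
        filename="cat.png",
        url="https://cdn.example.invalid/cat.png",
        proxy_url="https://media.example.invalid/cat.png",
    )


untouched = asyncio.run(resolve_attachment(make_attachment()))
rehosted = asyncio.run(resolve_attachment(make_attachment(), FakeHandler()))
print(untouched.url)
print(rehosted.url)
```

Without a handler the attachment keeps its Discord URLs; with one, both `url` and `proxy_url` point at the re-hosted copy.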

<details><summary><b>Concept</b></summary>

The concept behind an AttachmentHandler is simple. The following is a short general procedure for writing your own
AttachmentHandler to fit your storage solution. Here we assume that the attachments are stored in cloud storage.

1. Subclassing
Start by subclassing `chat_exporter.AttachmentHandler` and implementing the `__init__` method if needed. It should look
something like this:

```python
from chat_exporter import AttachmentHandler
from cloud_wrapper import CloudClient


class MyAttachmentHandler(AttachmentHandler):
    def __init__(self, *args, **kwargs):
        # Your initialization code here
        # in our case we just create the cloud client
        self.cloud_client = CloudClient()

```

2. Override `process_asset`
The `process_asset` method is called for each asset in the chat. Here we implement the upload logic and generate the
asset URL from the uploaded asset.

```python
import io
import aiohttp
from chat_exporter import AttachmentHandler
from cloud_wrapper import CloudClient
from discord import Attachment


class MyAttachmentHandler(AttachmentHandler):
    async def process_asset(self, attachment: Attachment):
        # Your upload logic here, in our example we just upload the asset to the cloud

        # first we need to authorize the client
        await self.cloud_client.authorize()

        # then we fetch the content of the attachment
        async with aiohttp.ClientSession() as session:
            async with session.get(attachment.url) as res:
                if res.status != 200:
                    res.raise_for_status()
                data = io.BytesIO(await res.read())
                data.seek(0)

        # and upload it to the cloud; we get back some sort of identifier for the uploaded file
        asset_id = await self.cloud_client.upload(data)

        # now we can generate the asset url from the identifier
        asset_url = await self.cloud_client.get_share_url(asset_id, shared_with="everyone")

        # and set the url attribute of the attachment to the generated url
        attachment.url = asset_url
        return attachment

```

Note
1. The `process_asset` method should return the attachment object with its `url` attribute set to the generated URL.
2. The `process_asset` method should be async, since you will likely need to perform async operations such as
fetching the content of the attachment or uploading it to the cloud.
3. You are free to add other methods to your class and call them from `process_asset` if you need to do some
work before or after the upload of the asset. However, `process_asset` is the only method that is
called from chat-exporter.
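
The three notes can be condensed into a runnable sketch. It deliberately avoids Discord and any real storage: `AttachmentHandlerBase` only mirrors the shape of `chat_exporter.AttachmentHandler`, and `FakeAttachment`, `InMemoryHandler`, `_upload`, and the `example.invalid` URL are hypothetical names chosen for illustration.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeAttachment:
    """Hypothetical stand-in for discord.Attachment."""
    filename: str
    url: str


class AttachmentHandlerBase:
    """Stand-in mirroring the shape of chat_exporter.AttachmentHandler."""

    async def process_asset(self, attachment):
        raise NotImplementedError


class InMemoryHandler(AttachmentHandlerBase):
    """Keeps 'uploaded' bytes in a dict instead of a real storage backend."""

    def __init__(self, url_base: str):
        self.url_base = url_base.rstrip("/")
        self.stored = {}

    async def _upload(self, name: str, data: bytes) -> str:
        # Note 3: extra helper methods are fine; only process_asset is the entry point.
        await asyncio.sleep(0)  # placeholder for a real async upload
        self.stored[name] = data
        return f"{self.url_base}/{name}"

    async def process_asset(self, attachment):
        # Note 2: async, because downloading and uploading are I/O bound.
        data = attachment.filename.encode()  # placeholder for the downloaded bytes
        # Note 1: return the same attachment with its url replaced.
        attachment.url = await self._upload(attachment.filename, data)
        return attachment


handler = InMemoryHandler("https://files.example.invalid/")
result = asyncio.run(
    handler.process_asset(FakeAttachment("notes.txt", "https://cdn.example.invalid/notes.txt"))
)
print(result.url)
```

In a real handler, the placeholder bytes would come from the attachment itself and `_upload` would talk to your storage service.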

</details>

**Examples:**

<ol>
<details><summary>AttachmentToLocalFileHostHandler</summary>

Assuming you have a file server running, which serves the content of the folder `/usr/share/assets/`
under `https://example.com/assets/`, you can easily use the `AttachmentToLocalFileHostHandler` like this:
```python
import io
import discord
from discord.ext import commands
import chat_exporter
from chat_exporter import AttachmentToLocalFileHostHandler

...

# Establish the file handler
file_handler = AttachmentToLocalFileHostHandler(
    base_path="/usr/share/assets",
    # no trailing slash: the handler joins url_base and the file name with "/"
    url_base="https://example.com/assets",
)

@bot.command()
async def save(ctx: commands.Context):
    transcript = await chat_exporter.export(
        ctx.channel,
        attachment_handler=file_handler,
    )

    if transcript is None:
        return

    transcript_file = discord.File(
        io.BytesIO(transcript.encode()),
        filename=f"transcript-{ctx.channel.name}.html",
    )

    await ctx.send(file=transcript_file)

```
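
To make the relationship between the served folder and the public URL concrete, here is a minimal sketch of how a stored file name could map to both locations. It is not the handler's actual implementation: `build_asset_location` is a hypothetical helper, and percent-encoding only the URL (not the name on disk) is a design choice of this sketch.

```python
import time
from pathlib import PurePosixPath
from urllib.parse import quote


def build_asset_location(base_path, url_base, filename, timestamp=None):
    """Return (path on disk, public URL) for an asset stored under a timestamped name."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    stored_name = f"{ts}_{filename}"
    # The file keeps its plain name on disk ...
    disk_path = PurePosixPath(base_path) / stored_name
    # ... while the URL is percent-encoded and joined with exactly one slash.
    url = f"{url_base.rstrip('/')}/{quote(stored_name)}"
    return disk_path, url


disk_path, url = build_asset_location(
    "/usr/share/assets",
    "https://example.com/assets/",
    "my cat.png",
    timestamp=1711929600,
)
print(disk_path)
print(url)
```

Prefixing the name with a timestamp reduces the chance that two attachments with the same file name overwrite each other.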
</details>

<details><summary>AttachmentToDiscordChannelHandler</summary>

Assuming you want to store your attachments in a Discord channel, you can use the `AttachmentToDiscordChannelHandler`.
Please note that Discord's recent changes regarding content links will result in the attachment links breaking
after 24 hours. While this is therefore not a recommended way to store your attachments, it should give you a good
idea of how to store attachments asynchronously.

```python
import io
import discord
from discord.ext import commands
import chat_exporter
from chat_exporter import AttachmentToDiscordChannelHandler

...

# Establish the channel handler
# (bot.get_channel only finds the channel once the bot's cache is populated)
channel_handler = AttachmentToDiscordChannelHandler(
    channel=bot.get_channel(CHANNEL_ID),
)

@bot.command()
async def save(ctx: commands.Context):
    transcript = await chat_exporter.export(
        ctx.channel,
        attachment_handler=channel_handler,
    )

    if transcript is None:
        return

    transcript_file = discord.File(
        io.BytesIO(transcript.encode()),
        filename=f"transcript-{ctx.channel.name}.html",
    )

    await ctx.send(file=transcript_file)

```
</details>
</ol>
<p align="right">(<a href="#top">back to top</a>)</p>

---
@@ -204,6 +378,7 @@ async def purge(ctx: commands.Context, tz_info: str, military_time: bool):
---
## Additional Functions


<details><summary><b>Link Function</b></summary>
Downloading exported chats can build up a bunch of unwanted files on your PC, which can get annoying. Additionally, not everyone wants to download content from Discord.

@@ -274,6 +449,8 @@ It simply makes a request to the given URL and echos (prints) the content for yo

</details>



---
## Attributions

15 changes: 13 additions & 2 deletions chat_exporter/__init__.py
@@ -1,11 +1,22 @@
from chat_exporter.chat_exporter import export, raw_export, quick_export, link, quick_link
from chat_exporter.chat_exporter import (
    export,
    raw_export,
    quick_export,
    link,
    quick_link,
    AttachmentHandler,
    AttachmentToLocalFileHostHandler,
    AttachmentToDiscordChannelHandler)

__version__ = "2.6.1"
__version__ = "2.7.0"

__all__ = (
    export,
    raw_export,
    quick_export,
    link,
    quick_link,
    AttachmentHandler,
    AttachmentToLocalFileHostHandler,
    AttachmentToDiscordChannelHandler,
)
7 changes: 7 additions & 0 deletions chat_exporter/chat_exporter.py
@@ -4,6 +4,7 @@

from chat_exporter.construct.transcript import Transcript
from chat_exporter.ext.discord_import import discord
from chat_exporter.construct.attachment_handler import AttachmentHandler, AttachmentToLocalFileHostHandler, AttachmentToDiscordChannelHandler


async def quick_export(
@@ -61,6 +62,7 @@ async def export(
    before: Optional[datetime.datetime] = None,
    after: Optional[datetime.datetime] = None,
    support_dev: Optional[bool] = True,
    attachment_handler: Optional[AttachmentHandler] = None,
):
    """
    Create a customised transcript of your Discord channel.
@@ -74,6 +76,7 @@ async def export(
    :param fancy_times: (optional) boolean - set javascript around time display
    :param before: (optional) datetime.datetime - allows before time for history
    :param after: (optional) datetime.datetime - allows after time for history
    :param attachment_handler: (optional) attachment_handler.AttachmentHandler - allows custom asset handling
    :return: string - transcript file make up
    """
    if guild:
@@ -91,6 +94,7 @@ async def export(
            after=after,
            support_dev=support_dev,
            bot=bot,
            attachment_handler=attachment_handler,
        ).export()
    ).html

@@ -104,6 +108,7 @@ async def raw_export(
    military_time: Optional[bool] = False,
    fancy_times: Optional[bool] = True,
    support_dev: Optional[bool] = True,
    attachment_handler: Optional[AttachmentHandler] = None,
):
    """
    Create a customised transcript with your own captured Discord messages
@@ -115,6 +120,7 @@ async def raw_export(
    :param bot: (optional) discord.Client - set getting member role colour
    :param military_time: (optional) boolean - set military time (24hour clock)
    :param fancy_times: (optional) boolean - set javascript around time display
    :param attachment_handler: (optional) AttachmentHandler - allows custom asset handling
    :return: string - transcript file make up
    """
    if guild:
@@ -132,6 +138,7 @@ async def raw_export(
            after=None,
            support_dev=support_dev,
            bot=bot,
            attachment_handler=attachment_handler
        ).export()
    ).html

68 changes: 68 additions & 0 deletions chat_exporter/construct/attachment_handler.py
@@ -0,0 +1,68 @@
import datetime
import io
import pathlib
from typing import Union

import aiohttp
import discord


class AttachmentHandler:
    """Handle the saving of attachments (images, videos, audio, etc.)
    Subclass this to implement your own asset handler."""

    async def process_asset(self, attachment: discord.Attachment) -> discord.Attachment:
        """Implement this to process the asset and return the attachment with its url pointing to the stored copy.
        :param attachment: discord.Attachment
        :return: discord.Attachment
        """
        raise NotImplementedError

class AttachmentToLocalFileHostHandler(AttachmentHandler):
    """Save the assets to a local file host and embed the assets in the transcript from there."""

    def __init__(self, base_path: Union[str, pathlib.Path], url_base: str):
        if isinstance(base_path, str):
            base_path = pathlib.Path(base_path)
        self.base_path = base_path
        self.url_base = url_base

    async def process_asset(self, attachment: discord.Attachment) -> discord.Attachment:
        """Save the asset under base_path and point the attachment's urls at the file host.
        :param attachment: discord.Attachment
        :return: discord.Attachment
        """
        # prefix the name with a timestamp so equally named attachments do not overwrite each other
        file_name = f"{int(datetime.datetime.utcnow().timestamp())}_{attachment.filename}".replace(' ', '%20')
        asset_path = self.base_path / file_name
        await attachment.save(asset_path)
        file_url = f"{self.url_base}/{file_name}"
        attachment.url = file_url
        attachment.proxy_url = file_url
        return attachment


class AttachmentToDiscordChannelHandler(AttachmentHandler):
    """Save the attachment to a discord channel and embed the assets in the transcript from there."""

    def __init__(self, channel: discord.TextChannel):
        self.channel = channel

    async def process_asset(self, attachment: discord.Attachment) -> discord.Attachment:
        """Re-upload the asset to the configured channel and return the newly created attachment.
        :param attachment: discord.Attachment
        :return: discord.Attachment
        """
        try:
            # download the original attachment as bytes
            async with aiohttp.ClientSession() as session:
                async with session.get(attachment.url) as res:
                    if res.status != 200:
                        res.raise_for_status()
                    data = io.BytesIO(await res.read())
            data.seek(0)
            # send it to the asset channel and use the attachment of the resulting message
            attach = discord.File(data, attachment.filename)
            msg: discord.Message = await self.channel.send(file=attach)
            return msg.attachments[0]
        except discord.errors.HTTPException as e:
            # discords http errors, including missing permissions
            raise e