Discord bot with NOS fine-tuning API #314
Conversation
- added tests for fine-tuning API (1c9ffac to ed4c513)
Think this can be merged as is. Training interface will need cleanup over time.
@dataclass
class LoRAPromptModel:
can this not live with the training service?
Yes, possibly. Just kept it here for simplicity. Also, this is specific to the Discord bot (thread_id etc.), which has nothing to do with training parameters.
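To make the separation concrete, here is a minimal sketch of what a bot-side `LoRAPromptModel` dataclass might hold. The field names beyond `thread_id` are illustrative assumptions, not the actual fields in this PR:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoRAPromptModel:
    """Bot-side bookkeeping for a fine-tuning request.

    Couples the Discord thread that requested training with the resulting
    model, so later generation requests on that thread can be routed to
    the right LoRA weights. Kept out of the training service because none
    of this is a training parameter.
    """

    thread_id: int                   # Discord thread the request came from
    thread_name: str                 # human-readable thread name
    model_id: Optional[str] = None   # registry id assigned after training
    job_id: Optional[str] = None     # training job handle, if still running

    @property
    def ready(self) -> bool:
        """True once training has finished and the model is registered."""
        return self.model_id is not None
```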
@@ -0,0 +1,4 @@
discord==2.3.2
discord.py==2.3.2
Still unclear to me why/if we need both discord and discord.py.
I'm not sure; I pulled this from your example, so I figured you would know.
@@ -0,0 +1 @@
DISCORD_BOT_TOKEN=
Probably need a more sophisticated way to manage this going forward. Doesn't look like we are persisting this secret anywhere though which is good.
What's wrong with this? This is a pretty standard way of showing that a .env file needs to be created with DISCORD_BOT_TOKEN= specified.
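For context, the usual pattern is to keep the real token out of version control entirely and read it from the environment at startup. A minimal sketch (the function name is illustrative, not code from this PR):

```python
import os


def load_bot_token(var: str = "DISCORD_BOT_TOKEN") -> str:
    """Read the bot token from the environment, failing loudly if unset.

    The .env template ships an empty DISCORD_BOT_TOKEN= line; the real
    value is injected at runtime (e.g. `docker run --env-file .env`),
    so the secret is never persisted in the repository.
    """
    token = os.environ.get(var, "").strip()
    if not token:
        raise RuntimeError(f"{var} is not set; populate it in your .env file")
    return token
```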
WORKDIR /tmp/$PROJECT
ADD requirements.txt .
RUN pip install -r requirements.txt
How happy are we with keeping this in examples? This means it will be included in the wheel file, I believe?
Top-level examples are not included in the wheel file. Only subdirectories under nos are.
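This behavior typically comes from the package discovery configuration. A sketch of the kind of setup.py stanza that produces it (an assumption about this repo's config, not a quote from it):

```python
# setup.py (sketch): restricting discovery to `nos` and its subpackages
# means top-level directories like examples/ are never packaged into
# the wheel, even though they live in the repository.
from setuptools import find_packages, setup

setup(
    name="nos",
    packages=find_packages(include=["nos", "nos.*"]),
)
```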
raise RuntimeError("NOS server is not healthy")

logger.debug("Server is healthy!")
NOS_VOLUME_DIR = Path(client.Volume())
Does the training volume get blown out when we restart the server?
No, it's persistent as long as the permissions are set correctly. We could use docker volumes instead of volume mounts.
logger.debug("no attachments to train on, returning!")
return

if "sks" not in prompt:
[nit] maybe a different tag? "sks" seems arbitrary, would prefer "INSTANCE" or "OBJECT"
Both "INSTANCE" and "OBJECT" are common words in tokenizers. We need to pick something unique, a token the model hasn't seen before.
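The intuition (from DreamBooth-style fine-tuning) is that the subject identifier should not collide with a word the base model already has strong priors for. A toy sketch of that check, using a stand-in word list rather than a real tokenizer vocabulary:

```python
# Toy stand-in for a tokenizer vocabulary of common words; a real check
# would consult the base model's tokenizer instead.
COMMON_WORDS = {"instance", "object", "person", "dog", "photo"}


def is_good_identifier(candidate: str, vocab: set) -> bool:
    """Heuristic: a good subject identifier should NOT already be a
    common token the base model associates with existing concepts.
    A rare string like "sks" survives this check; everyday words like
    "OBJECT" do not."""
    return candidate.lower() not in vocab
```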
# Run inference on the trained model
st = time.perf_counter()
response = client.Run(
so this always does a single generation run at the end of training on the new thread?
Yes, it's a validation prompt to make sure it generated something reasonable.
except grpc.RpcError as e:
    raise NosClientException(f"Failed to train model (details={(e.details())})", e)

def Volume(self, name: str = None) -> str:
should this live in grpc? maybe utils?
It's a client utility, so we have everything under grpc for now.
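A sketch of what such a `Volume()` helper might do, resolving a named directory under a persistent base path and creating it on first use so training artifacts survive server restarts. The layout and signature here are assumptions for illustration, not the actual implementation:

```python
from pathlib import Path
from typing import Optional


def volume(name: Optional[str] = None, base: Optional[Path] = None) -> str:
    """Resolve (and create, if needed) a persistent volume directory.

    With no name, returns the base volumes directory; with a name,
    returns a subdirectory for that volume. Hypothetical default
    location: ~/.nos/volumes.
    """
    base = base or Path.home() / ".nos" / "volumes"
    path = base / name if name else base
    path.mkdir(parents=True, exist_ok=True)
    return str(path)
```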
@@ -3,7 +3,7 @@
diffusers>=0.17.1
huggingface_hub
memray
pyarrow>=12.0.0
-ray>=2.6.1
+ray[default]>=2.6.1
?
We need this now for the Ray job submission client; will add some docs to explain this.
@@ -2,9 +2,11 @@
set -e
set -x

echo "Starting Ray server with OMP_NUM_THREADS=${OMP_NUM_THREADS}..."
# Get number of cores
NCORES=$(nproc --all)
are we sure we want to be maxing this out to all cores?
I think so. The nos.init() has a utilization kwarg that optionally allows users to specify <100% CPU core utilization.
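A sketch of how a utilization fraction could map to a core count, mirroring what a `utilization` kwarg on `nos.init()` might do under the hood (the function name and validation here are assumptions):

```python
import os


def cpu_count_for(utilization: float = 1.0) -> int:
    """Map a target utilization fraction in (0, 1] to a core count.

    utilization=1.0 reproduces the `nproc --all` default; smaller
    fractions let a user cap OMP_NUM_THREADS below the machine total.
    Always returns at least one core.
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    total = os.cpu_count() or 1
    return max(1, int(total * utilization))
```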
BASE_MODEL = "runwayml/stable-diffusion-v1-5"

# Init NOS server, wait for it to spin up then confirm its healthy.
client = InferenceClient()
[nit] do we want to roll these into an init function?
async def generate(ctx, *, prompt):
    """Create a callback to read messages and generate images from prompt

    Usage:
Should we add this to docs? I don't think it's necessary since it's not part of the main service/API; it might be better suited to a README if we release this by itself.
Somewhat evolving set of docstrings for now, so better to keep them here until we have fully fleshed out demos.
@@ -65,6 +65,7 @@
// Service information repsonse
message ServiceInfoResponse {
  string version = 1;  // (e.g. "0.1.0")
+  string runtime = 2;  // (e.g. "cpu", "gpu", "local" etc)
did we version the API?
Yes. We also check if the server/client versions are consistent with this ServiceInfo routine.
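A sketch of the kind of check a ServiceInfo round-trip enables. The exact compatibility policy (major.minor match, patch releases free to differ) is an assumption for illustration:

```python
def versions_compatible(client_version: str, server_version: str) -> bool:
    """Treat client and server as compatible when their major.minor
    components match, letting patch releases differ. Versions are
    assumed to be dotted strings like "0.1.0"."""
    c_major, c_minor, *_ = client_version.split(".")
    s_major, s_minor, *_ = server_version.split(".")
    return (c_major, c_minor) == (s_major, s_minor)
```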
hooks = {"on_completed": (register_model, (job_id,), {})}

# Spawn a thread to monitor the job
def monitor_job_hook(job_id: str, timeout: int = 600, retry_interval: int = 5):
is there a way to do this without explicitly monitoring the training run?
We'll need a bunch of post-training hooks to do a lot of book-keeping (model registry, upload to hub etc). For now this is a placeholder hook for registering custom models.
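A sketch of the polling shape such a hook takes. The status strings and callable signatures here are assumptions; the real hook would query the training job's status instead of a caller-supplied function:

```python
import time


def monitor_job(get_status, on_completed, timeout: int = 600, retry_interval: int = 5):
    """Poll a job until it reaches a terminal state, then fire the
    post-training hook (e.g. model registration, upload to hub).

    `get_status` returns one of "PENDING" / "RUNNING" / "SUCCEEDED" /
    "FAILED"; `on_completed` runs only on success.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status == "SUCCEEDED":
            on_completed()
            return status
        if status == "FAILED":
            return status
        time.sleep(retry_interval)
    raise TimeoutError("job did not finish within timeout")
```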
Summary
Related issues
Checks
- make lint: I've run make lint to lint the changes in this PR.
- make test: I've made sure the tests (make test-cpu or make test) are passing.