Static analysis utilities to improve AI request generation (#291)
* Proof of concept of static analysis to expand function context for AI request generation

* Wire up tests for static analysis with an example repo

* Move expand-function tests to their own test file

* Use logger in the expand-function module and also fix directory traversal

* Return undefined if no value was found for expanded defn

* Factor out helper for getting out-of-scope identifiers and add utilities for creating workspace/executeCommand RPC calls

* Also analyze function expressions, just in case

* Update followImport with a docstring explaining why it was used (and hopefully should not be anymore)

* Add failing test for filtering out web standard globals from identifiers returned by static analysis

* Filter out web standard globals from the identifiers returned by static analysis

* Add a type guard for definition responses from tsserver

* Remove dead code and shuffle some files

* Fix type error

* Refactor forEach

* Comment out some debug logs

* Refactor getDefinitionText to provide info on whether it returned a function

* Add TODO for tomorrow

* Terminate language server when process exits

* Add janky test for recursively expanding out-of-scope vars (needs work)

* Add tests for identifier analyzer and update its behavior for property accesses on out of scope vars

* Update workers-types in static analysis test repo

* Implement a singleton pattern for the typescript server instance (child process) and fix up some tests

* Add some tests for a drizzlified codebase and handle pnpm packages properly when extracting package names

* Update comment

* Move search-function into a folder

* Write some tests for search-file and search-function

* Update comments and logs

* Move out of scope identifier analysis out of searchFunction

* Tidy up search-function tests

* Add README

* Fix typecheck on studio

* Add docstring to expandFunction

* Start ts language server process if ai is enabled

* Resolve path that is passed to getTSServer

* Ignore directories more aggressively when searching for function

* Remove srcPath argument as it is unneeded

* Recursively expand context

* Improve some logging and make test-static-analysis runnable

* Add a nontrivial column to db schema

* Expand logger with trace level and account for reopening changed files in lang server

* Add trace logging

* Format

* Implement absolutely unhinged method for mapping back to source function but it works

* Add option to skip source map searches

* Remove type from prompt

* Document search-compiled-function

* Make search-source-function async

* Fix tests for async search file, and remove .trace logging

* Format

* Add link to Ade's work in the README

* Also expand middleware context

* Do not serialize 3rd party functions into context

* Update identifier analyzer to have a scope stack (broke property access logic however)

* Fix issues with property access. Now start tackling inner scope issues

* Fix inner scope issues

* Break out unit tests to be more granular

* Split out identifier analyzer into smaller unit tests (again)

* Continue with identifier-analyzer tests

* Add hacky solution for middleware source function detection issue

* Fix expand-function tests except for the one on Drizzle schema expansion (schema.stuff)

* Improve type-narrowing for functions passed to analyzeOutOfScopeIdentifiers

* Add a test for the source map reader to cover case where certain middleware could not be mapped back

* Add some comments

* Implement source hints from hits from the source map

* Update blurb about eslint

* Set up tests for full file context

* Expand entirety of imported file if we are dealing with namespace imports

* Add middleware context to the QA tester prompt

* Turn off noisy logs bit by bit and try to speed up context builder

* Start optimizing source map lookups

* More comments

* Start refactoring findSourceFunction to handle batch lookups

* Modify tests to handle batch lookups

* Update some comments

* Add performance boost

* Cleanup

* Move inference router to its own folder

* Factor out handler and middleware expansion utilities from the inference route

* Fix expand-handler test route

* Add docstrings and clean up some code for getting source map and compiled index.js

* Log when we terminate tsserver

* Start renaming things to make more sense

* Add back prefiltering of bearerAuth middleware

* Clean up some debug logs and implement an explicit debug logging switch on expandFunction

* Factor out extractPackage

* Do not throw errors in inference handler if context expansion fails

* Add better docstrings to expand-function

* Get definition text of type assignments

* Extract out FunctionContextTypes

* Remove unnecessary any

* Update docstring in identifierAnalyzer

* Update context-for-import to use module resolution helper

* No need for xml escaping

* Remove comment

* Update expand function README

* Update find-source-function

* Clean up expand-handler use of buildAiContext

* Remove TODO comments by unknown types

* Add changelog entry

* Add helper is-dependency

* Update find-source-function tests after more fiddling with the test-static-analysis codebase

* Comment out debug logs

* Remove debug logs

* Comment out one more debug log
brettimus authored Oct 7, 2024
1 parent ef904b0 commit c799b26
Showing 75 changed files with 65,041 additions and 213 deletions.
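One of the commits above implements a singleton pattern for the TypeScript server instance (a child process). A minimal sketch of that pattern might look like the following — the names (`getTSServer`, `TSServerInstance`) mirror the commit messages, but the implementation here is an illustrative assumption, not the project's actual code:

```typescript
// Hypothetical sketch of a singleton for an expensive child-process connection.
type TSServerInstance = { projectRoot: string; startedAt: number };

let instancePromise: Promise<TSServerInstance> | null = null;

async function startServer(projectRoot: string): Promise<TSServerInstance> {
  // In the real module this would spawn `typescript-language-server` and
  // perform the JSON-RPC `initialize` handshake; here we just fake it.
  return { projectRoot, startedAt: Date.now() };
}

export async function getTSServer(projectRoot: string): Promise<TSServerInstance> {
  // Cache the *promise*, not the resolved instance, so concurrent callers
  // during startup all await the same in-flight initialization instead of
  // each spawning their own child process.
  if (!instancePromise) {
    instancePromise = startServer(projectRoot);
  }
  return instancePromise;
}
```

Caching the promise (rather than the instance) is the key detail: it closes the race window between the first call and the server becoming ready.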
2 changes: 1 addition & 1 deletion DEVELOPMENT.md
@@ -45,7 +45,7 @@ Follow the instructions in the [`client-library-otel` README](./packages/client-

## Developing

This project uses typescript, biome and pnpm workspaces. The frontend package also uses eslint for linting purposes, all other packages use biome for linting (formatting is always done with biome).
This project uses typescript, biome and pnpm workspaces. Linting and formatting are handled with [biome](https://biomejs.dev/).

In the project root you can format all typescript codebases with `pnpm run format`.

6 changes: 6 additions & 0 deletions api/package.json
@@ -22,6 +22,8 @@
"db:drop": "drizzle-kit drop",
"db:seed": "tsx scripts/seed.ts",
"db:studio": "drizzle-kit studio",
"expand-function": "tsx src/lib/expand-function/tests/expand-function-smoke-test.ts",
"expand-function:debug": "node --inspect-brk -r tsx/cjs src/lib/expand-function/tests/expand-function-smoke-test.ts",
"build": "pnpm run db:generate && tsc",
"format": "biome check . --write",
"lint": "biome lint .",
@@ -56,6 +58,10 @@
"minimatch": "^10.0.1",
"openai": "^4.47.1",
"source-map": "^0.7.4",
"typescript": "^5.5.4",
"typescript-language-server": "^4.3.3",
"vscode-jsonrpc": "^8.2.1",
"vscode-uri": "^3.0.8",
"ws": "^8.17.1",
"zod": "^3.23.8"
},
2 changes: 1 addition & 1 deletion api/src/app.ts
@@ -9,7 +9,7 @@ import logger from "./logger.js";

import type * as webhoncType from "./lib/webhonc/index.js";
import appRoutes from "./routes/app-routes.js";
import inference from "./routes/inference.js";
import inference from "./routes/inference/index.js";
import settings from "./routes/settings.js";
import source from "./routes/source.js";
import traces from "./routes/traces.js";
6 changes: 6 additions & 0 deletions api/src/constants.ts
@@ -1 +1,7 @@
import path from "node:path";

export const DEFAULT_DATABASE_URL = "file:fpx.db";

export const USER_PROJECT_ROOT_DIR = path.resolve(
process.env.FPX_WATCH_DIR ?? process.cwd(),
);
19 changes: 16 additions & 3 deletions api/src/index.node.ts
@@ -7,8 +7,9 @@ import { drizzle } from "drizzle-orm/libsql";
import figlet from "figlet";
import type { WebSocket } from "ws";
import { createApp } from "./app.js";
import { DEFAULT_DATABASE_URL } from "./constants.js";
import { DEFAULT_DATABASE_URL, USER_PROJECT_ROOT_DIR } from "./constants.js";
import * as schema from "./db/schema.js";
import { getTSServer } from "./lib/expand-function/tsserver/index.js";
import { setupRealtimeService } from "./lib/realtime/index.js";
import { getSetting } from "./lib/settings/index.js";
import { resolveWebhoncUrl } from "./lib/utils.js";
@@ -76,8 +77,7 @@ server.on("error", (err) => {
//
// Additionally, this will watch for changes to files in the project directory,
// - If a file changes, send a new probe to the service
const watchDir = process.env.FPX_WATCH_DIR ?? process.cwd();
startRouteProbeWatcher(watchDir);
startRouteProbeWatcher(USER_PROJECT_ROOT_DIR);

// Set up websocket server
setupRealtimeService({ server, path: "/ws", wsConnections });
@@ -92,3 +92,16 @@ if (proxyRequestsEnabled ?? false) {
logger.debug("Proxy requests feature enabled.");
await webhonc.start();
}

// Check settings to see if AI is enabled and, if so, proactively start the TypeScript language server
const aiEnabled = await getSetting(db, "aiEnabled");
if (aiEnabled ?? false) {
logger.debug(
"AI Request Generation enabled. Starting typescript language server",
);
try {
await getTSServer(USER_PROJECT_ROOT_DIR);
} catch (error) {
logger.error("Error starting TSServer:", error);
}
}
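A related commit above terminates the language server when the process exits, and another logs when tsserver is terminated. A sketch of such a shutdown hook is below — the helper name and the loose `KillableProcess` type are assumptions for illustration, not the project's actual API:

```typescript
// Hedged sketch: ensure a spawned language-server child process is killed
// when the parent process exits. The structural type lets tests pass a fake
// in place of a real ChildProcess.
type KillableProcess = {
  killed: boolean;
  kill: (signal?: NodeJS.Signals) => boolean;
};

export function terminateOnExit(
  child: KillableProcess,
  log: (msg: string) => void = console.log,
) {
  const shutdown = () => {
    log("Terminating language server");
    if (!child.killed) {
      child.kill("SIGTERM");
    }
  };
  // "exit" fires on normal termination; real code would likely also hook
  // SIGINT/SIGTERM so Ctrl+C cleans up the child as well.
  process.on("exit", shutdown);
  return shutdown; // returned so it can also be invoked manually
}
```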
12 changes: 10 additions & 2 deletions api/src/lib/ai/anthropic.ts
@@ -19,12 +19,14 @@ type GenerateRequestOptions = {
handler: string;
baseUrl?: string;
history?: Array<string>;
handlerContext?: string;
openApiSpec?: string;
middleware?: {
handler: string;
method: string;
path: string;
}[];
middlewareContext?: string;
};

/**
@@ -42,9 +44,11 @@ export async function generateRequestWithAnthropic({
method,
path,
handler,
handlerContext,
history,
openApiSpec,
middleware,
middlewareContext,
}: GenerateRequestOptions) {
logger.debug(
"Generating request data with Anthropic",
@@ -54,18 +58,22 @@
`method: ${method}`,
`path: ${path}`,
`handler: ${handler}`,
`openApiSpec: ${openApiSpec}`,
`middleware: ${middleware}`,
// `handlerContext: ${handlerContext}`,
// `openApiSpec: ${openApiSpec}`,
// `middleware: ${middleware}`,
// `middlewareContext: ${middlewareContext}`,
);
const anthropicClient = new Anthropic({ apiKey, baseURL: baseUrl });
const userPrompt = await invokeRequestGenerationPrompt({
persona,
method,
path,
handler,
handlerContext,
history,
openApiSpec,
middleware,
middlewareContext,
});

const toolChoice: Anthropic.Messages.MessageCreateParams.ToolChoiceTool = {
8 changes: 8 additions & 0 deletions api/src/lib/ai/index.ts
@@ -8,22 +8,26 @@ export async function generateRequestWithAiProvider({
method,
path,
handler,
handlerContext,
history,
openApiSpec,
middleware,
middlewareContext,
}: {
inferenceConfig: Settings;
persona: string;
method: string;
path: string;
handler: string;
handlerContext?: string;
history?: string[];
openApiSpec?: string;
middleware?: {
handler: string;
method: string;
path: string;
}[];
middlewareContext?: string;
}) {
const {
openaiApiKey,
@@ -43,9 +47,11 @@
method,
path,
handler,
handlerContext,
history,
openApiSpec,
middleware,
middlewareContext,
}).then(
(parsedArgs) => {
return { data: parsedArgs, error: null };
@@ -67,9 +73,11 @@
method,
path,
handler,
handlerContext,
history,
openApiSpec,
middleware,
middlewareContext,
}).then(
(parsedArgs) => {
return { data: parsedArgs, error: null };
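The hunks above show `generateRequestWithAiProvider` resolving each provider call to a `{ data, error }` result rather than throwing (consistent with the commit "Do not throw errors in inference handler if context expansion fails"). The general shape of that convention can be sketched as follows — the `dispatch` wrapper is an illustrative assumption, not code from the diff:

```typescript
// Sketch of a { data, error } result convention: callers branch on the
// result instead of wrapping every inference call in try/catch.
type Result<T> =
  | { data: T; error: null }
  | { data: null; error: { message: string } };

async function dispatch<T>(fn: () => Promise<T>): Promise<Result<T>> {
  try {
    return { data: await fn(), error: null };
  } catch (e) {
    return {
      data: null,
      error: { message: e instanceof Error ? e.message : String(e) },
    };
  }
}
```

This keeps the route handler's happy path flat: a failed context expansion or provider error degrades to an error payload instead of a 500.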
14 changes: 11 additions & 3 deletions api/src/lib/ai/openai.ts
@@ -11,13 +11,15 @@ type GenerateRequestOptions = {
method: string;
path: string;
handler: string;
handlerContext?: string;
history?: Array<string>;
openApiSpec?: string;
middleware?: {
handler: string;
method: string;
path: string;
}[];
middlewareContext?: string;
};

/**
@@ -35,9 +37,11 @@ export async function generateRequestWithOpenAI({
method,
path,
handler,
handlerContext,
history,
openApiSpec,
middleware,
middlewareContext,
}: GenerateRequestOptions) {
logger.debug(
"Generating request data with OpenAI",
@@ -46,19 +50,23 @@
`persona: ${persona}`,
`method: ${method}`,
`path: ${path}`,
`handler: ${handler}`,
`openApiSpec: ${openApiSpec}`,
`middleware: ${middleware}`,
// `handler: ${handler}`,
// `handlerContext: ${handlerContext}`,
// `openApiSpec: ${openApiSpec}`,
// `middleware: ${middleware}`,
// `middlewareContext: ${middlewareContext}`,
);
const openaiClient = new OpenAI({ apiKey, baseURL: baseUrl });
const userPrompt = await invokeRequestGenerationPrompt({
persona,
method,
path,
handler,
handlerContext,
history,
openApiSpec,
middleware,
middlewareContext,
});

const response = await openaiClient.chat.completions.create({
18 changes: 18 additions & 0 deletions api/src/lib/ai/prompts.ts
@@ -32,31 +32,37 @@ export const invokeRequestGenerationPrompt = async ({
method,
path,
handler,
handlerContext,
history,
openApiSpec,
middleware,
middlewareContext,
}: {
persona: string;
method: string;
path: string;
handler: string;
handlerContext?: string;
history?: Array<string>;
openApiSpec?: string;
middleware?: {
handler: string;
method: string;
path: string;
}[];
middlewareContext?: string;
}) => {
const promptTemplate =
persona === "QA" ? qaTesterPrompt : friendlyTesterPrompt;
const userPromptInterface = await promptTemplate.invoke({
method,
path,
handler,
handlerContext: handlerContext ?? "NO HANDLER CONTEXT",
history: history?.join("\n") ?? "NO HISTORY",
openApiSpec: openApiSpec ?? "NO OPENAPI SPEC",
middleware: formatMiddleware(middleware),
middlewareContext: middlewareContext ?? "NO MIDDLEWARE CONTEXT",
});
const userPrompt = userPromptInterface.value;
return userPrompt;
@@ -87,9 +93,15 @@ Here is the OpenAPI spec for the handler:
Here is the middleware that will be applied to the request:
{middleware}
Here is some additional context for the middleware that will be applied to the request:
{middlewareContext}
Here is the code for the handler:
{handler}
Here is some additional context for the handler source code, if you need it:
{handlerContext}
`.trim(),
);

@@ -113,9 +125,15 @@ Here is the OpenAPI spec for the handler:
Here is the middleware that will be applied to the request:
{middleware}
Here is some additional context for the middleware that will be applied to the request:
{middlewareContext}
Here is the code for the handler:
{handler}
Here is some additional context for the handler source code, if you need it:
{handlerContext}
REMEMBER YOU ARE A QA. MISUSE THE API. BUT DO NOT MISUSE YOURSELF.
Keep your responses short-ish. Including your random data.
`.trim(),
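The prompt template above interpolates `formatMiddleware(middleware)` alongside explicit fallbacks like `"NO MIDDLEWARE CONTEXT"`, but the helper itself is not shown in this diff. A plausible sketch, following the same fallback convention (an assumption, not the actual implementation):

```typescript
// Hypothetical sketch of the formatMiddleware helper referenced in the
// prompt-building code: render each middleware entry for the LLM prompt,
// with a sentinel string when none apply.
type MiddlewareEntry = { handler: string; method: string; path: string };

function formatMiddleware(middleware?: MiddlewareEntry[]): string {
  if (!middleware || middleware.length === 0) {
    return "NO MIDDLEWARE";
  }
  return middleware
    .map((m) => `[${m.method}] ${m.path}\n${m.handler}`)
    .join("\n\n");
}
```

Explicit sentinel strings ("NO MIDDLEWARE", "NO HANDLER CONTEXT") tell the model the field was genuinely empty, rather than leaving a blank slot it might misread.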
