feat: Support Structured Outputs #813

Merged
merged 5 commits into from
Aug 7, 2024

eiixy
Contributor

@eiixy eiixy commented Aug 7, 2024

codecov bot commented Aug 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.88%. Comparing base (774fc9d) to head (68ee228).
Report is 35 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #813      +/-   ##
==========================================
+ Coverage   98.46%   98.88%   +0.42%     
==========================================
  Files          24       26       +2     
  Lines        1364     1347      -17     
==========================================
- Hits         1343     1332      -11     
+ Misses         15        9       -6     
  Partials        6        6              


@sashabaranov
Owner

sashabaranov commented Aug 7, 2024

Thank you for the PR!

Could we please add some tests here? Both integration and unit tests. I believe we also need to update the ChatCompletionResponse to make this work

@h0rv

h0rv commented Aug 7, 2024

Would love to add this to https://github.com/instructor-ai/instructor-go to streamline parsing the responses into Go structs.

@eiixy
Contributor Author

eiixy commented Aug 7, 2024

> Thank you for the PR!
>
> Could we please add some tests here? Both integration and unit tests. I believe we also need to update the ChatCompletionResponse to make this work

Thanks for the feedback! I've added the requested unit tests. Please review and let me know if anything else is needed.

)
checks.NoError(t, err, "CreateChatCompletion (use json_schema response) returned error")
var result = make(map[string]string)
err = json.Unmarshal([]byte(resp.Choices[0].Message.Content), &result)
Owner

It would be super cool if we could do that automatically based on JSON schema in the future!

Owner

I believe it is not possible to construct a Go type based on json schema description in the current form.

What we could do is use struct tags, as https://github.com/invopop/jsonschema does, and then be able both to generate a JSON schema from a Go struct and to automatically unmarshal structured responses.

Owner

Added this to our v2 roadmap, as it might require breaking changes: #801

Contributor Author

Implement it like this?

package jsonschema

func Unmarshal(schema Definition, data []byte, v any) error {
	// TODO
	return nil // placeholder so the stub compiles
}
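
For illustration only, one possible reading of the signature proposed above is sketched here. It assumes a simplified local `Definition` type as a stand-in for `jsonschema.Definition`, and it only checks that required properties are present before decoding; neither the stand-in type nor that validation rule comes from the PR itself.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Definition is a simplified, hypothetical stand-in for jsonschema.Definition.
type Definition struct {
	Type       string                `json:"type"`
	Properties map[string]Definition `json:"properties,omitempty"`
	Required   []string              `json:"required,omitempty"`
}

// Unmarshal checks that data is a JSON object containing every property
// listed in schema.Required, then decodes data into v.
func Unmarshal(schema Definition, data []byte, v any) error {
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(data, &raw); err != nil {
		return fmt.Errorf("response is not a JSON object: %w", err)
	}
	for _, name := range schema.Required {
		if _, ok := raw[name]; !ok {
			return fmt.Errorf("missing required property %q", name)
		}
	}
	return json.Unmarshal(data, v)
}

func main() {
	schema := Definition{
		Type: "object",
		Properties: map[string]Definition{
			"PascalCase": {Type: "string"},
			"SnakeCase":  {Type: "string"},
		},
		Required: []string{"PascalCase", "SnakeCase"},
	}

	// A made-up model reply used only to exercise the sketch.
	reply := []byte(`{"PascalCase":"HelloWorld","SnakeCase":"hello_world"}`)

	var out map[string]string
	err := Unmarshal(schema, reply, &out)
	fmt.Println(out, err)
}
```

A fuller version would presumably also check property types and nested objects; this sketch stops at required keys to keep the idea visible.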

Owner

@sashabaranov sashabaranov Aug 7, 2024

@eiixy Yes, pretty much! The only problem is that you need a definition and Go type. In this test the definition is

openai.ChatCompletionResponseFormatJSONSchema{
	Name: "cases",
	Schema: jsonschema.Definition{
		Type: jsonschema.Object,
		Properties: map[string]jsonschema.Definition{
			"PascalCase": jsonschema.Definition{Type: jsonschema.String},
			"CamelCase":  jsonschema.Definition{Type: jsonschema.String},
			"KebabCase":  jsonschema.Definition{Type: jsonschema.String},
			"SnakeCase":  jsonschema.Definition{Type: jsonschema.String},
		},
		Required:             []string{"PascalCase", "CamelCase", "KebabCase", "SnakeCase"},
		AdditionalProperties: false,
	},
	Strict: true,
},

and Go type is map[string]string.

Ideally, we would like to define a type such as

type MyStructuredResponse struct {
	PascalCase string
	CamelCase string
	KebabCase string
	SnakeCase string
}

That way the struct could be used automatically both as the schema definition (or be converted to one) and as the target for unmarshalling the structured response. Potentially, we could have a function that creates a jsonschema.Definition from a given struct.

EDIT: types in the last code sample
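
As a rough sketch of the idea in this comment, the following shows how a schema could be derived from a struct with reflection. `Definition` is a simplified local stand-in for `jsonschema.Definition`, and `SchemaFor` is a hypothetical helper name; only a few scalar field kinds are handled, every exported field is treated as required, and Go field names are used as property names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// Definition is a simplified, hypothetical stand-in for jsonschema.Definition.
type Definition struct {
	Type                 string                `json:"type"`
	Properties           map[string]Definition `json:"properties,omitempty"`
	Required             []string              `json:"required,omitempty"`
	AdditionalProperties any                   `json:"additionalProperties,omitempty"`
}

// SchemaFor builds an object schema from the exported fields of a struct value.
// It is a hypothetical helper, not part of the library.
func SchemaFor(v any) (Definition, error) {
	t := reflect.TypeOf(v)
	if t == nil || t.Kind() != reflect.Struct {
		return Definition{}, fmt.Errorf("expected a struct value, got %T", v)
	}
	def := Definition{
		Type:                 "object",
		Properties:           make(map[string]Definition),
		AdditionalProperties: false,
	}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if !f.IsExported() {
			continue
		}
		var typ string
		switch f.Type.Kind() {
		case reflect.String:
			typ = "string"
		case reflect.Bool:
			typ = "boolean"
		case reflect.Int, reflect.Int64:
			typ = "integer"
		case reflect.Float64:
			typ = "number"
		default:
			return Definition{}, fmt.Errorf("field %s: unsupported kind %s", f.Name, f.Type.Kind())
		}
		def.Properties[f.Name] = Definition{Type: typ}
		def.Required = append(def.Required, f.Name)
	}
	return def, nil
}

type MyStructuredResponse struct {
	PascalCase string
	CamelCase  string
	KebabCase  string
	SnakeCase  string
}

func main() {
	schema, err := SchemaFor(MyStructuredResponse{})
	if err != nil {
		panic(err)
	}
	encoded, err := json.MarshalIndent(schema, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(encoded))

	// The same struct is the unmarshalling target for a made-up model reply.
	reply := `{"PascalCase":"HelloWorld","CamelCase":"helloWorld","KebabCase":"hello-world","SnakeCase":"hello_world"}`
	var resp MyStructuredResponse
	if err := json.Unmarshal([]byte(reply), &resp); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", resp)
}
```

Because the property names are the Go field names, the standard decoder can fill the struct without any tags; a tag-aware version would need to read `json` tags when naming properties.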

@gspeicher gspeicher Aug 7, 2024

Thanks for putting this together so quickly. I already have a plaintext JSON schema that I have been sending to the API in the system prompt, and already have a Go struct to hold the unmarshalled response. I was hoping to reuse both with the new Structured Outputs feature without having to manually convert the schema to a jsonschema.Definition but I don't see any way around that.

Could you possibly change the accepted schema type from jsonschema.Definition to json.Marshaler from the standard Go encoding/json package? The jsonschema.Definition godoc itself states "It is fairly limited, and you may have better luck using a third-party library." but by defining Schema as a jsonschema.Definition we are precluded from using any third-party library to construct the schema. This would be a backward compatible change since jsonschema.Definition already implements MarshalJSON.

Thanks for your consideration.
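
To make the request concrete, here is a minimal sketch of what accepting a `json.Marshaler` could look like. `JSONSchemaFormat` is a hypothetical, trimmed-down stand-in for the request type, and `RawSchema` and `Encode` are illustrative names that do not exist in the library; the point is only that a plaintext schema can be passed through without converting it to a `jsonschema.Definition`.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// JSONSchemaFormat is a hypothetical, simplified request type whose Schema
// field accepts anything that implements json.Marshaler.
type JSONSchemaFormat struct {
	Name   string         `json:"name"`
	Schema json.Marshaler `json:"schema"`
	Strict bool           `json:"strict"`
}

// RawSchema holds a JSON schema as plain text (illustrative helper type).
type RawSchema string

// MarshalJSON emits the schema text unchanged after checking it is valid JSON.
func (s RawSchema) MarshalJSON() ([]byte, error) {
	if !json.Valid([]byte(s)) {
		return nil, errors.New("schema is not valid JSON")
	}
	return []byte(s), nil
}

// Encode serializes the format the way it would appear in a request body.
func Encode(f JSONSchemaFormat) (string, error) {
	b, err := json.Marshal(f)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	schema := RawSchema(`{"type":"object","properties":{"SnakeCase":{"type":"string"}},"required":["SnakeCase"],"additionalProperties":false}`)
	body, err := Encode(JSONSchemaFormat{Name: "cases", Schema: schema, Strict: true})
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

Any type with a `MarshalJSON` method, including one produced by a third-party schema library, would fit the same field.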

Contributor Author

@eiixy eiixy Aug 8, 2024

@sashabaranov Could you review the feasibility of this implementation plan? #819

type MyStructuredResponse struct {
    PascalCase string `json:"pascal_case" required:"true" description:"PascalCase"`
    CamelCase  string `json:"camel_case" required:"true" description:"CamelCase"`
    KebabCase  string `json:"kebab_case" required:"true" description:"KebabCase"`
    SnakeCase  string `json:"snake_case" required:"true" description:"SnakeCase"`
}
schema := jsonschema.Warp(MyStructuredResponse{})
resp, err := c.CreateChatCompletion(
    ctx,
    openai.ChatCompletionRequest{
        Model: openai.GPT4oMini,
        Messages: []openai.ChatCompletionMessage{
            {
                Role: openai.ChatMessageRoleSystem,
                Content: "Please enter a string, and we will convert it into the following naming conventions:" +
                    "1. PascalCase: Each word starts with an uppercase letter, with no spaces or separators." +
                    "2. CamelCase: The first word starts with a lowercase letter, " +
                    "and subsequent words start with an uppercase letter, with no spaces or separators." +
                    "3. KebabCase: All letters are lowercase, with words separated by hyphens `-`." +
                    "4. SnakeCase: All letters are lowercase, with words separated by underscores `_`.",
            },
            {
                Role:    openai.ChatMessageRoleUser,
                Content: "Hello World",
            },
        },
        ResponseFormat: &openai.ChatCompletionResponseFormat{
            Type: openai.ChatCompletionResponseFormatTypeJSONSchema,
            JSONSchema: &openai.ChatCompletionResponseFormatJSONSchema{
                Name:   "cases",
                Schema: schema,
                Strict: true,
            },
        },
    },
)
checks.NoError(t, err, "CreateChatCompletion (use json_schema response) returned error")
if err == nil {
    _, err = schema.Unmarshal(resp.Choices[0].Message.Content)
    checks.NoError(t, err, "CreateChatCompletion (use json_schema response) unmarshal error")
}

Owner

@sashabaranov sashabaranov left a comment

Thank you!!

@sashabaranov sashabaranov merged commit 623074c into sashabaranov:master Aug 7, 2024
3 checks passed