Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(pinecone): Add rerank task for Pinecone component #773

Conversation

MubeenKodvavi
Copy link
Contributor

@MubeenKodvavi MubeenKodvavi commented Oct 23, 2024

Because

This commit

  • Added Pinecone Rerank Support.
    • Inputs: Documents and Query input. We can restrict output to n documents by setting optional field (top-n)
    • Output: Reranked Documents and Scores
  • Made URL in Pinecone Setup Optional as for rerank input URL is not required. For others, this will still need to be input.

@MubeenKodvavi
Copy link
Contributor Author

Test Details

Tested locally by creating a pinecone component's rerank task and verified data using sample input from API documentation

Sample Pipeline Recipe:

version: v1beta

variable:
  query:
    title: Query
    instill-format: string
  documents:
    title: Documents
    instill-format: array:string

component: 
  pinecone-0:
    type: pinecone
    input:
      query: ${variable.query}
      documents: ${variable.documents}
      top-n: 3
    setup:
      api-key: ${secret.pineconetoken} 
    task: TASK_RERANK

output:
  reranked_documents:
    title: Reranked Documents
    value: ${pinecone-0.output.documents}
  scores:
    title: scores
    value: ${pinecone-0.output.scores} 

Input:

image

Component Output:

image

@MubeenKodvavi MubeenKodvavi marked this pull request as ready for review October 23, 2024 20:40
Copy link
Contributor

@chuang8511 chuang8511 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution!

I left a few comments that we need you to modify.
And, if possible, please also help us add the test code.

}

return &rerankReq{
Model: "bge-reranker-v2-m3",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though it only provides a model now, we can make it a field with a default value.

You can refer to this.

"instillUIOrder": 0,
"properties": {
"query": {
"description": "The query to rerank the documents",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"description": "The query to rerank the documents",
"description": "The query to rerank the documents.",

Now, we always add period in the end for description fields.

"type": "string"
},
"documents": {
"description": "The documents to rerank",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"description": "The documents to rerank",
"description": "The documents to rerank.",

"type": "array"
},
"top-n": {
"description": "The number of results to return sorted by relevance. Defaults to the number of inputs.\n\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"description": "The number of results to return sorted by relevance. Defaults to the number of inputs.\n\n",
"description": "The number of results to return sorted by relevance. Defaults to the number of inputs.",

"instillUIOrder": 0,
"properties": {
"documents": {
"description": "Reranked documents",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"description": "Reranked documents",
"description": "Reranked documents.",

"type": "string"
},
"instillUIOrder": 0,
"title": "Reranked documents",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"title": "Reranked documents",
"title": "Reranked Documents",

resp := rerankResp{}
req.SetResult(&resp).SetBody(inputStruct.asRequest())
if _, err := req.Post(rerankPath); err != nil {
job.Error.Error(ctx, httpclient.WrapURLError(err))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recently, I found the clients don't reply the meaningful error message in err when sending http request.

Could you add the errMsg like this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WrapURLError is a helper to add an end-user message to http transport errors.
This message is extracted in ErrorHandler interface for component job.

@@ -145,7 +167,34 @@ func (e *execution) Execute(ctx context.Context, jobs []*base.Job) error {
job.Error.Error(ctx, err)
continue
}
case taskRerank:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be good to have the test code if possible.

MubeenKodvavi and others added 8 commits December 4, 2024 18:09
Define task in pinecone's definition.json
Define inputs and outputs of component in task.json
URL is no longer required as Rerank task does not require an index
…tput response

Implement required structs to parse input data to API request and parse API output to component output
Introduce new client to make rerank task
…changes

run campogen to update documentation with new changes
… formed

remove api version 2024-10 from headers as this is latest stable version now and supports rerank api by default
…client

pinecone is redirecting towards oldest stable version that does not support rerank endpoint
also recommended by pinecone in their documentation to specify api version in headers
@chuang8511 chuang8511 force-pushed the MubeenKodvavi/add-rerank-task-for-pinecone branch from 979bc0b to 2c808cf Compare December 4, 2024 18:12
Copy link
Contributor

@chuang8511 chuang8511 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@chuang8511
Copy link
Contributor

@MubeenKodvavi
I modified the code a bit.

Will merge it when CI finishes.

Thanks for your contribution! 🙏

@chuang8511 chuang8511 merged commit e1fd611 into instill-ai:main Dec 6, 2024
9 checks passed
donch1989 pushed a commit that referenced this pull request Dec 9, 2024
🤖 I have created a release *beep* *boop*
---


##
[0.48.4-beta](v0.48.3-beta...v0.48.4-beta)
(2024-12-09)


### Features

* **pinecone:** Add rerank task for Pinecone component
([#773](#773))
([e1fd611](e1fd611))
* **vdp:** upload raw inputs for run log
([#904](#904))
([960f4c2](960f4c2))


### Bug Fixes

* **component,http:** fix the request body marshalling
([#928](#928))
([b47cc71](b47cc71))


### Miscellaneous Chores

* release v0.48.4-beta
([b08878e](b08878e))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
jvallesm pushed a commit that referenced this pull request Dec 10, 2024
🤖 I have created a release *beep* *boop*
---


##
[0.48.4-beta](v0.48.3-beta...v0.48.4-beta)
(2024-12-09)


### Features

* **pinecone:** Add rerank task for Pinecone component
([#773](#773))
([e1fd611](e1fd611))
* **vdp:** upload raw inputs for run log
([#904](#904))
([960f4c2](960f4c2))


### Bug Fixes

* **component,http:** fix the request body marshalling
([#928](#928))
([b47cc71](b47cc71))


### Miscellaneous Chores

* release v0.48.4-beta
([b08878e](b08878e))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Status: No status
Development

Successfully merging this pull request may close these issues.

[Pinecone] Add rerank task
3 participants