Generating package summaries with GPT #2368
Replies: 5 comments 10 replies
-
Results
All results below were generated from the unprocessed README file, copied and pasted from the rendered Markdown, using the prompt above. No editing has been done to the results.
Alamofire
SemanticVersion
Satin
FlyingFox
ArgumentParser
Swift Crypto
-
If you go forward with this, then it must be opt-in, not opt-out, for package maintainers. I personally would be very upset if I learned that descriptions that may or may not contain confidently wrong statements had been automatically generated for my packages without asking me first.
-
Naturally, running READMEs, which are entirely unstructured and unstandardised, through an LLM is going to produce some unexpected cases. I'm not sure if we have analytics for this, but I've definitely seen packages with no README, packages with READMEs in other languages (such as Chinese), and boilerplate READMEs (either with no information in them, or just the standard "This is a Swift Package" template). There are READMEs containing FAQs, TODO lists, version history, testimonials, and links to podcast episodes and books. There's so much variety here, and plenty of areas where GPT could get confused.

But no system is perfect, and I think I've seen enough examples here (and separately) where it's output very strong descriptions, even in some of these tough scenarios. I think the prompt could be tweaked over time to weed out bad results. There's also more we can do in the implementation to make it safer, including filtering out bad READMEs and potentially starting with only the top 500 packages (based on internal scoring) to test it out.

I also think introducing a button in the UI to report a package summary could be handy: one which simply opens a GitHub issue with a pre-filled form. This would help surface bad results so we, as a community, can reassess and make changes to the prompt, to the README, and, if necessary, override using the SPI YML.
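To make the "filtering out bad READMEs" idea concrete, here is a minimal sketch of a pre-filter. Everything in it is an illustrative assumption: the word-count threshold and the boilerplate phrases are guesses, not anything SPI actually uses.

```python
# Hypothetical pre-filter for READMEs before sending them to the LLM.
# The threshold and boilerplate markers below are illustrative guesses,
# not part of any real SPI pipeline.

BOILERPLATE_MARKERS = [
    "this is a swift package",        # stock template text
    "a description of this package",  # default README stub wording
]

def is_summarisable(readme: str, min_words: int = 30) -> bool:
    """Return False for READMEs unlikely to yield a useful summary."""
    text = readme.strip().lower()
    if not text:
        return False                  # no README at all
    if len(text.split()) < min_words:
        return False                  # too short to say anything useful
    if any(marker in text for marker in BOILERPLATE_MARKERS):
        return False                  # unedited template boilerplate
    return True
```

A filter like this would also pair naturally with the "top 500 packages" rollout: only packages that pass both the internal score cut-off and the README check would be summarised at first.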
-
Heard about this conversation through the SPI podcast and wanted to chime in. I think using it to create a 2-3 sentence summary for a package could be a fantastic use case, specifically to supplement search results. That said, I think the summary should be something that a package developer can override, perhaps with a couple-of-sentence summary in the structure of the SPI.yml file. I also think the field should be clearly identified, maybe with a background color or a caption-like subtitle underneath it (my design skills suck, so I'll defer to nearly anyone else about good affordances here). The idea being that when generated by an LLM, the summary should be highlighted as dynamic content.

Somewhat related: using LLM generation to deal with stemming issues in search is absolutely brilliant. A longer summary, generated and stored as a text field within the index and then used with the Postgres-style search mechanism (the classic stemmer/ranking mechanisms), would, I suspect, improve ranked relevance for those quirky results where we've got two or more words CamelCased together: easy to visually disambiguate, but something that stemmer fails with, and extending that system in Postgres is an unfortunate challenge.

I don't know the details of ChatGPT as a tool, but I'm wondering if it's also multilingual (Whisper, also by OpenAI, clearly is). If so, that could further increase the value of summarised search results if we can ask it to translate into English for a search-only summary view. I wouldn't want to display that translated, search-focused result, but I suspect it may solve the issue of getting a reasonable summary from some of the packages that have a README in another language (the ones I remember being Mandarin, but it wouldn't surprise me to see Japanese or various European languages either). There are some downsides to that, but I think overall the search results would be notably improved.
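The CamelCase problem described above can be sketched quickly: a stock stemmer treats "FlyingFox" as one opaque token, but splitting it before indexing lets each word be stemmed and ranked normally. This helper is a hypothetical pre-processing step, not part of SPI's actual search pipeline.

```python
import re

# Split a CamelCased identifier into its component words, so that a
# stemmer sees "Flying" and "Fox" rather than the opaque token
# "FlyingFox". Handles ALL-CAPS runs like "URLSession" too.

def split_camel_case(token: str) -> list[str]:
    """Split a CamelCased identifier into its component words."""
    return re.findall(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+", token)
```

Feeding the split words (alongside, not instead of, the original token) into the indexed text is one cheap way to get the disambiguation without extending Postgres itself.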
-
Finally got access to a GPT-4 API key and did some more testing with this tonight. The results are different, but still very good. There's much more structure to the prompt, which is great. All of these results were generated from raw, unprocessed Markdown files, so we would not need to render or transform anything. This is with a smaller word limit than above, but it really struggles with word limits 😂

Prompt
You are a technical editor that summarises Markdown input. Output
-
We are not trying to find a use for GPT in SPI, but there are a couple of places (one in this thread and one in #2369) where it could be genuinely useful.
The first and most practical one is README summarisation. GPT is good at summarising text, and we don’t have good, concise descriptions for most packages in the index. We have the package name and sometimes a one-sentence description, and then the full README file, which is too long and often contains much more than a package description.
Having reliably good summaries would be very useful for package search results, category and author lists, and also in a slightly redesigned package page metadata area. It would also give us another relevant field for search results without bringing in the entire README file.
It’s important to note that if we do include package summaries as metadata, they would be overridable by humans via a new key in the
.spi.yml
file. We would always prioritise human-entered text over AI-generated text.

We’re not quite at “New Issue“ with it yet, but we’re convinced enough that it’s worth a discussion.
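The precedence rule described above is simple enough to sketch. Note the "summary" key is a hypothetical name for the proposed new .spi.yml field; nothing with that name exists today.

```python
# Hypothetical precedence rule: a human-entered summary from .spi.yml
# always wins over an AI-generated one. "summary" is an illustrative
# key name, not an existing .spi.yml field.

def effective_summary(spi_yml: dict, generated: str) -> str:
    human = (spi_yml.get("summary") or "").strip()
    if human:
        return human      # human-entered text always takes priority
    return generated      # fall back to the GPT-generated summary
```

Only regenerating the fallback when the README changes (see Costs below) would leave the human-entered value untouched indefinitely.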
Costs
Running v3 queries is very cheap: some back-of-the-envelope calculations suggest we could summarise every README in the current index for under $50. We would then only need to regenerate a summary when we see the README change, and at most something like once a week.
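For anyone wanting to redo the back-of-the-envelope calculation, the model is just tokens times price. All three inputs below are illustrative assumptions, not figures from this post: plug in the real package count, average request size, and per-1K-token price to get an actual estimate.

```python
# Back-of-the-envelope cost model for the one-off summarisation run.
# The example inputs (5,000 packages, 500 tokens per request, $0.02
# per 1K tokens) are illustrative assumptions, not figures from SPI.

def summarisation_cost(packages: int, tokens_per_request: int,
                       price_per_1k_tokens: float) -> float:
    """Total cost in dollars to summarise every package once."""
    total_tokens = packages * tokens_per_request
    return total_tokens / 1000 * price_per_1k_tokens

# summarisation_cost(5_000, 500, 0.02) -> 50.0
```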
GPT-4 queries are much more expensive, so the results would need to be correspondingly better. TBD when we get the new API key.
Results
These results were made using the
text-davinci-003
model. We are on the waiting list for a GPT-4 API key. The full text of each README was entered without any pre-processing and processed with the following prompt:

Results follow this post.
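As a sketch of how a README could be packaged into a Completions request for this model: the instruction text and max_tokens value below are placeholders, not the actual prompt used here (which isn't reproduced above), though the payload fields themselves (model, prompt, max_tokens, temperature) are standard Completions API parameters.

```python
# Sketch of an OpenAI Completions request payload for text-davinci-003.
# The instruction wording and max_tokens are placeholder guesses, not
# the prompt from this post.

def build_completion_payload(readme_markdown: str) -> dict:
    instruction = "Summarise the following package README in 2-3 sentences."
    return {
        "model": "text-davinci-003",
        "prompt": f"{instruction}\n\n{readme_markdown}",
        "max_tokens": 120,   # keeps the summary short; value is a guess
        "temperature": 0.2,  # low temperature for factual summaries
    }
```

The payload would then be POSTed to the Completions endpoint with an API key; only the dictionary construction is shown here.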
Alternatives
We did some experimentation with the Kagi summariser, but it produces summaries that are too long and always include installation instructions, a problem that GPT had too, although one we could work around with the prompt.