Summarizer merge (Jaseci-Labs#1198)
* Relocated unused modules to _inactive directory
1. cl_summer
2. ent_ext
3. zs_classifier
4. bi_ner
5. gpt3

* Merged BART_SUM and T5_SUM to summarization

* reverting cl_summer

* updating action config
AshishMahendra authored Jul 31, 2023
1 parent dd556de commit e68e9f5
Showing 24 changed files with 97 additions and 227 deletions.
File renamed without changes.
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion jaseci_ai_kit/config.py
@@ -7,7 +7,7 @@
from .jac_nlp.jac_nlp.tfm_ner.action_config import TFM_NER_ACTION_CONFIG
from .jac_nlp.jac_nlp.use_enc.action_config import USE_ENC_ACTION_CONFIG
from .jac_nlp.jac_nlp.use_qa.action_config import USE_QA_ACTION_CONFIG
from .jac_nlp.jac_nlp.bart_sum.action_config import BART_SUM_ACTION_CONFIG
from .jac_nlp.jac_nlp.summarization.action_config import BART_SUM_ACTION_CONFIG

ACTION_CONFIGS = {
"cl_summer": CL_SUMMER_ACTION_CONFIG,
61 changes: 11 additions & 50 deletions jaseci_ai_kit/jac_nlp/README.md
@@ -28,10 +28,7 @@ The `jac_nlp` package contains a collection of state-of-the-art NLP models that
- [Summarizer (`cl_summer`)](#summarizer-cl_summer)
- [Actions](#actions-8)
- [Example Jac Usage](#example-jac-usage-8)
- [T5 Summarization (`t5_sum`)](#t5-summarization-t5_sum)
- [Actions](#actions-9)
- [Example Jac Usage:](#example-jac-usage-9)
- [Bart Summarization (`bart_sum`)](#bart-summarization-bart_sum)
- [Summarization (`summarization`)](#summarization-summarization)
- [Actions](#actions-10)
- [Example Jac Usage:](#example-jac-usage-10)
- [Topic Modeling Modules](#topic-modeling-modules)
@@ -662,51 +659,15 @@ walker cl_summer_example {
```

For a complete example visit [here](jac_nlp/cl_summer/README.md)
### T5 Summarization (`t5_sum`)
`t5_sum` uses the T5 transformer model to perform abstractive summary on a body of text.

#### Actions

* `classify_text`: use the T5 model to summarize a body of text
* **Input**:
* `text` (string): text to summarize
* `min_length` (integer): the least amount of words you want returned from the model
* `max_length` (integer): the most amount of words you want returned from the model
* **Input datafile**
`**data.json**`
```
{
"text": "The US has passed the peak on new coronavirus cases, President Donald Trump said and predicted that some states would reopen this month. The US has over 637,000 confirmed Covid-19 cases and over 30,826 deaths, the highest for any country in the world. At the daily White House coronavirus briefing on Wednesday, Trump said new guidelines to reopen the country would be announced on Thursday after he speaks to governors. We'll be the comeback kids, all of us, he said. We want to get our country back. The Trump administration has previously fixed May 1 as a possible date to reopen the world's largest economy, but the president said some states may be able to return to normalcy earlier than that.",
"min_length": 30,
"max_length": 100
}
```

#### Example Jac Usage:
```jac
# Use the T5 model to summarize a given piece of text
walker summarization {
can t5_sum.classify_text;
has data = "data.json";
data = file.load_json(data);
summarized_text = t5_sum.classify_text(
text = data["text"],
min_length = data["min_length"],
max_length = data["max_length"]
);
report summarized_text;
}
```

For a complete example visit [here](jac_nlp/t5_sum/README.md)

### Bart Summarization (`bart_sum`)
### Summarization (`summarization`)

`bart_sum` uses the BART transformer model to perform abstractive summary on a body of text.
`summarization` uses the BART transformer model to perform abstractive summary on a body of text.

#### Actions

There are 2 ways to use `bart_sum` module.
There are 2 ways to use `summarization` module.
1. Given a text, it will return the summary of the text.
2. Given a web page url, it will return the summary of the web page.

@@ -724,15 +685,15 @@ Following example will return the summary of the a single text.

```jac
walker test_summarize_single {
can bart_sum.summarize;
report bart_sum.summarize("There was once a king of Scotland whose name was Robert Bruce. He needed to be both brave and wise because the times in which he lived were wild and rude.", 10);
can summarization.summarize;
report summarization.summarize("There was once a king of Scotland whose name was Robert Bruce. He needed to be both brave and wise because the times in which he lived were wild and rude.", 10);
}
```
You can also pass a list of texts to get the summary of all the texts.
```jac
walker test_summarize_batch {
can bart_sum.summarize;
report bart_sum.summarize(
can summarization.summarize;
report summarization.summarize(
["There was once a king of Scotland whose name was Robert Bruce. He needed to be both brave and wise because the times in which he lived were wild and rude.",
"There was once a king of Scotland whose name was Robert Bruce. He needed to be both brave and wise because the times in which he lived were wild and rude.",
"There was once a king of Scotland whose name was Robert Bruce. He needed to be both brave and wise because the times in which he lived were wild and rude."],
@@ -744,12 +705,12 @@ Following example will return the summary of the web page.

```jac
walker test_summarize_url {
can bart_sum.summarize;
report bart_sum.summarize(null, "https://in.mashable.com/");
can summarization.summarize;
report summarization.summarize(null, "https://in.mashable.com/");
}
```

For a complete example visit [here](jac_nlp/bart_sum/README.md)
For a complete example visit [here](jac_nlp/summarization/README.md)

## Topic Modeling Modules

@@ -1,6 +1,6 @@
BART_SUM_ACTION_CONFIG = {
"module": "jac_nlp.bart_sum",
"loaded_module": "jac_nlp.bart_sum.bart_sum",
SUMMARIZATION_ACTION_CONFIG = {
"module": "jac_nlp.summarization",
"loaded_module": "jac_nlp.summarization.summarization",
"local_mem_requirement": 2100,
"remote": {
"Service": {
@@ -26,7 +26,7 @@
"creationTimestamp": None,
},
"data": {
"prod_up": "uvicorn jac_nlp.bart_sum:serv_actions --host 0.0.0.0 --port 80"
"prod_up": "uvicorn jac_nlp.summarization:serv_actions --host 0.0.0.0 --port 80"
},
},
"Deployment": {

This file was deleted.

1 change: 0 additions & 1 deletion jaseci_ai_kit/jac_nlp/jac_nlp/bart_sum/__init__.py

This file was deleted.

23 changes: 0 additions & 23 deletions jaseci_ai_kit/jac_nlp/jac_nlp/bart_sum/tests/fixtures/bart_sum.jac

This file was deleted.

4 changes: 2 additions & 2 deletions jaseci_ai_kit/jac_nlp/jac_nlp/config.py
@@ -1,4 +1,4 @@
from .action_configs.bart_sum_action_config import BART_SUM_ACTION_CONFIG
from .action_configs.summarization_action_config import SUMMARIZATION_ACTION_CONFIG
from .action_configs.bi_enc_action_config import BI_ENC_ACTION_CONFIG
from .action_configs.cl_summer_action_config import CL_SUMMER_ACTION_CONFIG
from .action_configs.sbert_sim_action_config import SBERT_SIM_ACTION_CONFIG
@@ -10,7 +10,7 @@
from .action_configs.sentiment_action_config import SENTIMENT_ACTION_CONFIG

ACTION_CONFIGS = {
"bart_sum": BART_SUM_ACTION_CONFIG,
"summarization": SUMMARIZATION_ACTION_CONFIG,
"bi_enc": BI_ENC_ACTION_CONFIG,
"cl_summer": CL_SUMMER_ACTION_CONFIG,
"sbert_sim": SBERT_SIM_ACTION_CONFIG,
@@ -1,29 +1,29 @@
---
title: Text Summarization with BART
title: Text Summarization
---

# **Bart Summarizer (`bart_sum`)**
# **Summarization (`summarization`)**

Module `bart_sum` uses the `bart-large-cnn` to get the abstractive summary of a text.
Module `summarization` uses the `philschmid/bart-large-cnn-samsum` to get the abstractive summary of a text.

1. Import [`bart_sum`](#1-import-summarizer-bart_sum-module-in-jac) module in jac
1. Import [`summarization`](#1-import-summarizer-summarization-module-in-jac) module in jac
2. [Summarizer](#2-summarizer)

# **Walk through**

## **1. Import Summarizer (`bart_sum`) module in jac**
## **1. Import Summarizer (`summarization`) module in jac**
1. For executing jaseci Open terminal and run follow command.
```
jsctl -m
```
2. Load bart_sum module in jac
2. Load summarization module in jac
```
actions load module jac_nlp.bart_sum
actions load module jac_nlp.summarization
```


## **2. Summarizer**
There are 2 ways to use `bart_sum` module.
There are 2 ways to use `summarization` module.
1. Given a text, it will return the summary of the text.
2. Given a web page url, it will return the summary of the web page.

@@ -40,15 +40,15 @@ Following example will return the summary of the a single text.

```jac
walker test_summarize_single {
can bart_sum.summarize;
report bart_sum.summarize("There was once a king of Scotland whose name was Robert Bruce. He needed to be both brave and wise because the times in which he lived were wild and rude.", 10);
can summarization.summarize;
report summarization.summarize("There was once a king of Scotland whose name was Robert Bruce. He needed to be both brave and wise because the times in which he lived were wild and rude.", 10);
}
```
You can also pass a list of texts to get the summary of all the texts.
```jac
walker test_summarize_batch {
can bart_sum.summarize;
report bart_sum.summarize(
can summarization.summarize;
report summarization.summarize(
["There was once a king of Scotland whose name was Robert Bruce. He needed to be both brave and wise because the times in which he lived were wild and rude.",
"There was once a king of Scotland whose name was Robert Bruce. He needed to be both brave and wise because the times in which he lived were wild and rude.",
"There was once a king of Scotland whose name was Robert Bruce. He needed to be both brave and wise because the times in which he lived were wild and rude."],
@@ -62,15 +62,15 @@ Following example will return the summary of the web page.

```jac
walker test_summarize_url {
can bart_sum.summarize;
report bart_sum.summarize(null, "https://in.mashable.com/");
can summarization.summarize;
report summarization.summarize(null, "https://in.mashable.com/");
}
```

### Setup Parameters
* `tokenizer` - Tokenizer to be used for tokenizing the text. Type: `str` Default: `facebook/bart-large-cnn`
* `model` - Model to be used for summarizing the text. Type: `str` Default: `facebook/bart-large-cnn`

* `model` - Model to be used for summarizing the text. Type: `str` Default: `philschmid/bart-large-cnn-samsum`

# **References**
* [Bart Summarizer](https://huggingface.co/transformers/model_doc/bart.html)
* [Bart Summarizer Paper](https://arxiv.org/abs/1910.13461)
* [Summarization](https://huggingface.co/transformers/model_doc/bart.html)
* [Summarization Paper](https://arxiv.org/abs/1910.13461)
1 change: 1 addition & 0 deletions jaseci_ai_kit/jac_nlp/jac_nlp/summarization/__init__.py
@@ -0,0 +1 @@
from .summarization import * # noqa
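
Because this commit changes the `ACTION_CONFIGS` key from `"bart_sum"` to `"summarization"`, code that still looks up the old name would presumably fail with a `KeyError`. The following is a minimal sketch of one way a caller might bridge that rename. It is not part of this commit: `LEGACY_MODULE_NAMES` and `resolve_action_config` are hypothetical names, and the `ACTION_CONFIGS` dict here is a trimmed stand-in for the real one.

```python
# Hypothetical helper, not part of the repository: maps module names that
# this commit retires onto the merged `summarization` module.
LEGACY_MODULE_NAMES = {
    "bart_sum": "summarization",
    "t5_sum": "summarization",
}

# Trimmed stand-in for the real ACTION_CONFIGS dict; only the keys and the
# "module" field matter for this illustration.
ACTION_CONFIGS = {
    "summarization": {"module": "jac_nlp.summarization"},
    "cl_summer": {"module": "jac_nlp.cl_summer"},
}


def resolve_action_config(name):
    """Return the config for `name`, following a legacy alias if one exists."""
    canonical = LEGACY_MODULE_NAMES.get(name, name)
    try:
        return ACTION_CONFIGS[canonical]
    except KeyError:
        # Report the name the caller actually passed, not the alias target.
        raise KeyError(f"unknown action module: {name!r}") from None
```

An alias like this only redirects the config lookup. Jac code that called `t5_sum.classify_text` would still need to be updated to call `summarization.summarize`, since the action name differs.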