make termination only need one exhausted path
Dan-wanna-M committed Oct 21, 2023
1 parent fd57ccb commit e0dd1b5
Showing 5 changed files with 9 additions and 9 deletions.
2 changes: 1 addition & 1 deletion Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 3 additions & 3 deletions README.md
@@ -79,10 +79,10 @@ In this project, a slightly modified version of BNF is used. The key differences
The possible tokens listed are the tokens that can be accepted by the sampler in its current state.
The following rule defines whether a token is listed in the return value of `Sampler::all_possible_tokens` with a given BNF:

- - The sampler has not terminated or gets into an invalid state. In other words, there are still terms not consumed in the sampler, and the current input token can be accepted by the sampler.
+ - The sampler has not terminated or gets into an invalid state. In other words, the current input token can be accepted by the sampler, and no path exists such that all the terminals and nonterminals are consumed in the path.

- - e.g. With `<start>::=<A><B><C>, <A>::='cryscan', <B>::='hex', <C>::='wanicca'`,`<start>` will create a sampler that terminates after `cryscan`,`hex`,`wanicca` are inputed in this exact sequence, and goes into an invalid state otherwise.
- - e.g. `<sequence>::=<any!>|<any!><sequence>` will create a sampler that never terminate as `<sequence>` can always become `<any!><sequence>`.
+ - e.g. With `<start>::=<A><B><C>, <A>::='boy', <B>::='next', <C>::='door'`,`<start>` will create a sampler that terminates after `boy`,`next`,`door` are inputed in this exact sequence, and goes into an invalid state otherwise.
+ - e.g. `<sequence>::=<any!>|<any!><sequence>` will create a sampler that terminates after any input token because of the path where `<sequence>` become `<any!>`. In other words, `<any!>` is the only nonterminal in the path and is consumed.

- For a given terminal, only the longest possible token is listed.

4 changes: 2 additions & 2 deletions assets/grammar.bnf
@@ -1,4 +1,4 @@
- <start>::=<string>
- <string>::=' '<word>'"'
+ <start>::=<word>'*'
+ <string>::=<any!>|<any!><string>
<word>::=<letter>|<letter><word>
<letter>::='a'|'b'|'c'|'d'|'e'|'f'|'g'|'h'|'i'|'j'|'k'|'l'|'m'|'n'|'o'|'p'|'q'|'r'|'s'|'t'|'u'|'v'|'w'|'x'|'y'|'z'|'A'|'B'|'C'|'D'|'E'|'F'|'G'|'H'|'I'|'J'|'K'|'L'|'M'|'N'|'P'|'P'|'Q'|'R'|'S'|'T'|'U'|'V'|'W'|'X'|'Y'|'Z'
2 changes: 1 addition & 1 deletion bnf_sampler/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "bnf_sampler"
- version = "0.3.2"
+ version = "0.3.3"
edition = "2021"
license = "MIT OR Apache-2.0"
description = "A crate that uses recursive descent algorithm to ensure tokens produced by a large language model follow a Backus Naur Form schema."
4 changes: 2 additions & 2 deletions bnf_sampler/src/sampler.rs
@@ -311,7 +311,7 @@ impl Sampler {
self.stack_arena.clear();
// println!("failed: {:?}",failed_prefixs);
}
- println!("stack: {:?}, {:?}", stack, now.elapsed());
+ // println!("stack: {:?}, {:?}", stack, now.elapsed());
// println!("{:?}",failed_prefixs);
}
entry.insert(self.token_ids.clone());
@@ -374,7 +374,7 @@ impl Sampler {
self.stacks.swap_remove(i);
}
if accepted {
- if self.stacks.is_empty() || self.stacks.iter().all(|x| x.is_empty()) {
+ if self.stacks.is_empty() || self.stacks.iter().any(|x| x.is_empty()) {
return Ok(AcceptTokenResult::End);
}
Ok(AcceptTokenResult::Continue)
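
The effect of switching `all` to `any` can be seen in a small standalone sketch. This is not the crate's code: the `Outcome` enum, the two helper functions, and the use of `Vec<&str>` as a "stack of symbols still to consume" are hypothetical stand-ins for `AcceptTokenResult` and `self.stacks`. It models the state after one token has been accepted under `<sequence>::=<any!>|<any!><sequence>`, where one path is exhausted and the other still expects `<sequence>`.

```rust
/// Hypothetical stand-in for the sampler's result type.
#[derive(Debug, PartialEq)]
enum Outcome {
    End,
    Continue,
}

/// Rule before this commit: end only when every live path is exhausted.
fn outcome_all(stacks: &[Vec<&str>]) -> Outcome {
    if stacks.is_empty() || stacks.iter().all(|s| s.is_empty()) {
        Outcome::End
    } else {
        Outcome::Continue
    }
}

/// Rule after this commit: end as soon as one live path is exhausted.
fn outcome_any(stacks: &[Vec<&str>]) -> Outcome {
    if stacks.is_empty() || stacks.iter().any(|s| s.is_empty()) {
        Outcome::End
    } else {
        Outcome::Continue
    }
}

fn main() {
    // Toy state after one accepted token:
    // path 1 chose <any!> and has nothing left to consume,
    // path 2 chose <any!><sequence> and still expects <sequence>.
    let stacks: Vec<Vec<&str>> = vec![vec![], vec!["<sequence>"]];

    println!("all-rule: {:?}", outcome_all(&stacks));
    println!("any-rule: {:?}", outcome_any(&stacks));
}
```

Under the old rule the unfinished second path keeps the toy sampler running, which matches the old README wording that `<sequence>` never terminates; under the new rule the exhausted first path is enough to end.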