-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
significant overhead while running the FST due to heap allocations #15
Comments
As follows is the patch I mentioned. It bypasses the multi-version decoder/state abstraction, but it makes the problem clear: diff --git a/decoder_v1.go b/decoder_v1.go
index d56e61d..5befe50 100644
--- a/decoder_v1.go
+++ b/decoder_v1.go
@@ -69,6 +69,15 @@ func (d *decoderV1) stateAt(addr int, prealloc fstState) (fstState, error) {
return state, nil
}
+func (d *decoderV1) stateAtNoAlloc(addr int) (fstStateV1, error) {
+ var state fstStateV1
+ err := state.at(d.data, addr)
+ if err != nil {
+ return state, err
+ }
+ return state, nil
+}
+
type fstStateV1 struct {
data []byte
top int
diff --git a/encoding.go b/encoding.go
index 988d486..6418529 100644
--- a/encoding.go
+++ b/encoding.go
@@ -50,6 +50,7 @@ type decoder interface {
getRoot() int
getLen() int
stateAt(addr int, prealloc fstState) (fstState, error)
+ stateAtNoAlloc(addr int) (fstStateV1, error)
}
func loadDecoder(ver int, data []byte) (decoder, error) {
diff --git a/fst.go b/fst.go
index 3140042..ff2300d 100644
--- a/fst.go
+++ b/fst.go
@@ -160,7 +160,7 @@ func (f *FST) Accept(addr int, b byte) int {
// IsMatchWithVal returns if this state is a matching state in this Automaton
// and also returns the final output value for this state
func (f *FST) IsMatchWithVal(addr int) (bool, uint64) {
- s, err := f.decoder.stateAt(addr, nil)
+ s, err := f.decoder.stateAtNoAlloc(addr)
if err != nil {
return false, 0
}
@@ -170,7 +170,7 @@ func (f *FST) IsMatchWithVal(addr int) (bool, uint64) {
// AcceptWithVal returns the next state for this Automaton on input of byte b
// and also returns the output value for the transition
func (f *FST) AcceptWithVal(addr int, b byte) (int, uint64) {
- s, err := f.decoder.stateAt(addr, nil)
+ s, err := f.decoder.stateAtNoAlloc(addr)
if err != nil {
return noneAddr, 0
}
@@ -178,6 +178,35 @@ func (f *FST) AcceptWithVal(addr int, b byte) (int, uint64) {
return next, output
} What do you think we could do about it without going around the architecture? My first thought was to add a +type FSTRunner struct {
+ fst *FST
+ prealloc fstState
+}
+
+func (f *FST) Runner() (*FSTRunner, error) {
+ return &FSTRunner { f, nil }, nil
+}
+
+func (r *FSTRunner) AcceptWithVal(addr int, b byte) (int, uint64) {
+ s, err := r.fst.decoder.stateAt(addr, r.prealloc)
+ r.prealloc = s
+ if err != nil {
+ return noneAddr, 0
+ }
+ _, next, output := s.TransitionFor(b)
+ return next, output
+}
+
+func (r *FSTRunner) IsMatchWithVal(addr int) (bool, uint64) {
+ s, err := r.fst.decoder.stateAt(addr, r.prealloc)
+ r.prealloc = s
+ if err != nil {
+ return false, 0
+ }
+ return s.Final(), s.FinalOutput()
+} I've tested it and performance-wise it actually works a bit better than the first solution. |
Thanks @thelink2012 . I think what'd be better here is for you to raise this diff as a pull request instead. The team will get to reviewing it at the soonest possible. |
Sure @abhinavdangeti. After looking a bit more into the code I noticed |
Hey, thanks for building
vellum
, it's awesome!I have a service which uses vellum's FST for searching needles in a haystack, and while profiling it, I found significant overhead at the calls to
IsMatchWithVal
andAcceptWithVal
, both of which are linked todecoderV1.stateAt -> runtime.newobject
.I took at look at the code and found it's due to
stateAt
heap allocating afstStateV1
at every call toIsMatchWithVal
andAcceptWithVal
. In other words, heap allocation occurs for every character in the string being searched on.I hacked a patch (see next comment) and found the service to reduce its response time by half. So I guess this is significant enough to discuss upstream.
The text was updated successfully, but these errors were encountered: