The docs object expects (for technical reasons) that every word occurs with frequency 1. A word that occurs several times in a document must appear several times, each time with frequency 1.

In the quanteda package, dfm objects also allow values greater than 1. If you do your preprocessing in quanteda and use quanteda::dfm2lda to convert your object into the required structure, one more step is needed to meet the requirements for the docs object. Execute the following line:

    docs = lapply(docs, function(x) rbind(rep(x[1,], x[2,]), 1))

This replicates words with multiple occurrences and prevents the error message "all(sapply(docs, function(x) all(x[2, ] == 1))) is not TRUE" in LDARep and similar functions.

Unfortunately, this yields a numeric matrix (at least in R 4.1.1), whereas LDARep expects an integer matrix.
There might be a more elegant solution, but this did the trick for me:
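One possible way to handle the numeric-versus-integer problem raised in the comment is sketched below. This is not the commenter's original code, which is not preserved here; it is a minimal sketch that assumes each element of docs is a two-row matrix with word ids in row 1 and counts in row 2. The helper name expand_docs and the toy data are hypothetical and used only for illustration.

```r
# Hypothetical helper: replicate each word id by its count and
# return two-row integer matrices whose second row is all 1L.
expand_docs <- function(docs) {
  lapply(docs, function(x) {
    # Repeat every word id as often as its count says.
    ids <- rep(x[1, ], times = x[2, ])
    # Start from an integer matrix so the result is never numeric,
    # even for a document without any words.
    mat <- matrix(1L, nrow = 2, ncol = length(ids))
    mat[1, ] <- as.integer(ids)
    mat
  })
}

# Toy input (assumed structure): word ids in row 1, counts in row 2.
docs <- list(
  rbind(c(0L, 2L, 5L), c(2L, 1L, 3L)),
  rbind(c(1L, 4L), c(1L, 1L))
)

fixed <- expand_docs(docs)

# The condition quoted in the error message now holds,
# and every matrix is stored as integer.
stopifnot(all(sapply(fixed, function(x) all(x[2, ] == 1))))
stopifnot(all(sapply(fixed, is.integer)))
```

The only difference from the one-liner above is that the frequency row is built from 1L instead of 1, so rbind never promotes the matrix to numeric.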