Description
Trying to run Optimization, following this tutorial on a custom dataset, raises:
File /usr/local/lib/python3.9/dist-packages/octis/models/pytorchavitm/AVITM.py:77, in AVITM.train_model(self, dataset, hyperparameters, top_words)
74 self.set_params(hyperparameters)
76 if self.use_partitions:
---> 77 train, validation, test = dataset.get_partitioned_corpus(use_validation=True)
79 data_corpus_train = [' '.join(i) for i in train]
80 data_corpus_test = [' '.join(i) for i in test]
TypeError: cannot unpack non-iterable NoneType object
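For illustration, the same TypeError can be reproduced by tuple-unpacking a None return value, which is apparently what happens here when `get_partitioned_corpus` finds no partition metadata. The stub below is a hypothetical stand-in, not OCTIS code:

```python
def get_partitioned_corpus(use_validation=True):
    # Hypothetical stand-in: mimics the suspected behavior of returning
    # None when the loaded dataset carries no train/val/test partitions.
    return None

try:
    train, validation, test = get_partitioned_corpus(use_validation=True)
except TypeError as e:
    print(e)  # cannot unpack non-iterable NoneType object
```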
What I Did
Here's the code for creating the custom dataset from a list of strings...
from nltk.tokenize import word_tokenize
from tqdm import tqdm

# docs is a list of strings
# collect tokens
tokens = []
for d in tqdm(docs):
    tokens += word_tokenize(d.lower())

# write vocab file
with open("octis_dataset/vocabulary.txt", "w+") as f:
    for s in tqdm(set(tokens)):
        f.write(s + "\n")
import pandas as pd

# create corpus tsv
df = pd.DataFrame(docs)

# partition
tr_data = df.sample(48500, random_state=420)
te_data = df.query("index not in @tr_data.index").sample(12900, random_state=420)
val_data = df.query("index not in @tr_data.index and index not in @te_data.index")
df = pd.concat([tr_data, te_data, val_data])

# write tsv
df.to_csv("octis_dataset/corpus.tsv", sep="\t", header=False)
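For comparison, OCTIS's custom-dataset format expects each line of corpus.tsv to carry a partition label ("train"/"val"/"test") next to the document text, with no index column; without that column there is nothing for `get_partitioned_corpus` to return. A minimal sketch of that layout, using toy documents and an assumed three-way split:

```python
import pandas as pd

# Toy documents standing in for the real corpus (assumption for illustration)
docs = ["first document", "second document", "third document"]
partitions = ["train", "val", "test"]

df = pd.DataFrame({"text": docs, "partition": partitions})
# Tab-separated, no header row, and no index column
df.to_csv("corpus.tsv", sep="\t", header=False, index=False)
```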
And here is the code to optimize the model...
from octis.optimization.optimizer import Optimizer
import time

# model, dataset, coherence, search_space, optimization_runs and
# model_runs are defined as in the tutorial
optimizer = Optimizer()
start = time.time()
optimization_result = optimizer.optimize(
    model, dataset, coherence, search_space, number_of_call=optimization_runs,
    model_runs=model_runs, save_models=True,
    extra_metrics=None,  # to keep track of other metrics
    save_path='results/test_neuralLDA/'
)
end = time.time()
duration = end - start
optimization_result.save_to_csv("results_neuralLDA.csv")
print('Optimizing model took: ' + str(round(duration)) + ' seconds.')
And this raises the error.
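As a workaround sketch (not OCTIS code), the unpack site can be guarded so the failure surfaces as a readable message instead of a bare TypeError; the helper name below is hypothetical:

```python
def unpack_partitions(parts):
    # Hypothetical guard: fail with a descriptive message when the
    # dataset returns None instead of (train, validation, test).
    if parts is None:
        raise ValueError(
            "no train/val/test partitions found; "
            "check that corpus.tsv has a partition column"
        )
    train, validation, test = parts
    return train, validation, test
```

In the failing call, this would wrap the result of `dataset.get_partitioned_corpus(use_validation=True)`.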