Make Meilisearch 1.0 a requirement
duncanmcclean committed Oct 16, 2023
1 parent f601a5d commit 95ea27a
Showing 1 changed file with 1 addition and 140 deletions.
141 changes: 1 addition & 140 deletions README.md
@@ -7,6 +7,7 @@ This addon provides a [Meilisearch](https://www.meilisearch.com/) search driver
* PHP 8.1+
* Laravel 9+
* Statamic 4
* Meilisearch 1.0+

### Installation

@@ -146,143 +147,3 @@ http {
```

Then restart the server, or run `sudo service nginx restart`.

### Quirks

Meilisearch can only index 1000 words, which isn't great for long Markdown articles.

> [!NOTE]
> As of version 0.24.0 the 1000 word limit [no longer exists](https://github.com/meilisearch/meilisearch/issues/1770) on documents, which makes the driver much better suited to the longer Markdown files you may use with Statamic.

#### Solution 1
On earlier versions, you can overcome this by breaking the content into smaller chunks:

```php
// requires `use Illuminate\Support\Str;` at the top of the config file
'articles' => [
    'driver' => 'meilisearch',
    'searchables' => ['collection:articles'],
    'fields' => ['id', 'title', 'locale', 'url', 'date', 'type', 'content'],
    'transformers' => [
        'date' => function ($date) {
            return $date->format('F jS, Y'); // February 22nd, 2020
        },
        'content' => function ($content) {
            // determine the number of 900-word sections to break the content field into
            // (at least one, so empty content does not cause a division by zero)
            $segments = max(1, (int) ceil(Str::wordCount($content) / 900));

            // count the total content length and figure out how many characters each segment needs
            $charactersLimit = max(1, (int) ceil(Str::length($content) / $segments));

            // now create X segments of Y characters
            // the goal is to create many segments with only ~900 words each
            // mb_str_split() splits on characters, so multibyte characters are never cut in half
            $output = mb_str_split($content, $charactersLimit);

            $response = [];
            foreach ($output as $key => $segment) {
                $response["content_{$key}"] = $segment;
            }

            return $response;
        }
    ],
],
```

This creates a few extra fields such as `content_0`, `content_1`, ... `content_12`. A search still runs across all of them and returns the most relevant result, but the JavaScript client can no longer show highlights for matching words, because you can't easily tell whether to show the highlights from `content_1` or `content_8`. If you go this route, make sure each entry has a synopsis you can show instead of highlights. I wouldn't recommend it at the moment.
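The transformer above cuts the content by character count, so a segment can end in the middle of a word. As a minimal sketch of an alternative, the content could be split on word boundaries instead. `splitIntoWordSegments()` is a hypothetical helper, not part of this addon, and it assumes the content is plain text or Markdown:

```php
<?php

/**
 * Split text into segments of at most $maxWords words each, breaking only
 * on whitespace so that no word or multibyte character is cut in half.
 * Hypothetical helper for illustration; not part of the addon.
 *
 * @return array<string, string> keyed as content_0, content_1, ...
 */
function splitIntoWordSegments(string $content, int $maxWords = 900): array
{
    // Split on any run of whitespace; the "u" flag treats the input as UTF-8.
    // preg_split() returns false on invalid UTF-8, so fall back to an empty list.
    $words = preg_split('/\s+/u', trim($content), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    $segments = [];
    foreach (array_chunk($words, max(1, $maxWords)) as $index => $chunk) {
        // Original whitespace (including line breaks) is collapsed to single spaces.
        $segments["content_{$index}"] = implode(' ', $chunk);
    }

    return $segments;
}
```

Calling `splitIntoWordSegments($content)` from the `content` transformer would return the same `content_N` shape as the character-based version. Note that it collapses line breaks, which may matter if the indexed text is later rendered as Markdown.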


#### Solution 2
If you need more fine-grained control and want to break content down into paragraphs or even sentences, you could use an Artisan command to parse the entries in a Statamic collection, split the content and store it in a database. Then sync the individual items to Meilisearch using the `php artisan scout:import` command.

1. Create a new database migration (make sure the table has an `origin` UUID column so you can link each row to its parent entry)
2. Create a new Model and add the `Searchable` trait from Scout.
3. Create an Artisan command to parse all the entries and bulk import the existing ones:

```php
private function parseAll()
{
    // disable search syncing while the rows are created
    Article::withoutSyncingToSearch(function () {
        // process all published entries in the collection
        Entry::query()
            ->where('collection', 'articles')
            ->where('published', true)
            ->get()
            ->each(function ($entry) {
                // push individual paragraphs or sentences to a collection
                // (customSplitMethod() is a placeholder for your own splitting logic)
                $segments = $entry->customSplitMethod();

                $segments->each(function ($data) {
                    try {
                        $article = new Article($data);
                        $article->save();
                    } catch (\Illuminate\Database\QueryException $e) {
                        // report the failure and continue with the next segment
                        $this->error($e->getMessage());
                    }
                });
            });
    });

    $total = Article::count();
    $this->info("Imported {$total} entries into the articles index.");

    $this->info('Bulk import the records with:');
    $this->info('php artisan scout:import "App\Models\Article" --chunk=100');
}
```
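The `customSplitMethod()` call in the command is a placeholder for your own splitting logic. As a rough sketch of what it might do, the function below splits content into one row per paragraph. The function name and the `origin`, `position` and `content` keys are assumptions made for illustration; match them to the columns in your own migration:

```php
<?php

/**
 * Split an entry's content into one row per paragraph.
 * Hypothetical helper; the array keys are assumed column names.
 *
 * @return array<int, array{origin: string, position: int, content: string}>
 */
function splitIntoParagraphs(string $originId, string $content): array
{
    // Paragraphs are separated by at least one blank line.
    // \R matches any line ending (\n, \r\n or \r).
    $paragraphs = preg_split('/\R\s*\R/u', trim($content), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    $rows = [];
    foreach ($paragraphs as $position => $paragraph) {
        $rows[] = [
            'origin'   => $originId,        // ID of the parent entry
            'position' => $position,        // order of the paragraph within the entry
            'content'  => trim($paragraph),
        ];
    }

    return $rows;
}
```

Inside the command, the result could be wrapped with `collect(splitIntoParagraphs($entry->id(), $text))`, where `$text` is the entry's content as a string, so that the `each()` call works on it.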

4. Add listeners to the `EventServiceProvider` to watch for save and delete events on the collection (to keep it in sync):

```php
protected $listen = [
    'Statamic\Events\EntrySaved' => [
        'App\Listeners\ScoutArticleUpdated',
    ],
    'Statamic\Events\EntryDeleted' => [
        'App\Listeners\ScoutArticleDeleted',
    ],
];
```

5. Create the event listeners, for example:

```php
// in App\Listeners\ScoutArticleDeleted
public function handle(EntryDeleted $event)
{
    if ($event->entry->collectionHandle() !== 'articles') return;

    // get the ID of the original entry
    $id = $event->entry->id();

    // remove all rows with this origin ID from the search index, then from the database
    $paragraphs = Article::where('origin', $id);
    $paragraphs->unsearchable();
    $paragraphs->delete();
}

// in App\Listeners\ScoutArticleUpdated
public function handle(EntrySaved $event)
{
    // ... same as above ...

    // only re-import entries that are published
    if (!$event->entry->published()) return;

    // TODO: split $event->entry into paragraphs again and save them to the database,
    // they will re-sync automatically through the Searchable trait.
}
```

6. Add a placeholder (empty) index to the search config, so the index can be created in Meilisearch before you import the existing entries:

```php
// required as a placeholder where we store the paragraphs later
'articles' => [
    'driver' => 'meilisearch',
    'searchables' => [], // empty
    'settings' => [
        'filterableAttributes' => ['type', 'entity', 'locale'],
        'distinctAttribute' => 'origin', // if you only want to return one result per entry
        // any search settings
    ],
],
```