improve fb2
ewilan-riviere committed Sep 20, 2023
1 parent cc7b60c commit 7c5671f
Showing 4 changed files with 372 additions and 100 deletions.
83 changes: 45 additions & 38 deletions README.md
@@ -9,7 +9,11 @@
[![tests][tests-src]][tests-href]
[![codecov][codecov-src]][codecov-href]

PHP package to read metadata and extract covers from eBooks (`.epub`, `.cbz`, `.cbr`, `.cb7`, `.cbt`, `.pdf`) and audiobooks (`.mp3`, `.m4a`, `.m4b`, `.flac`, `.ogg`).
PHP package to read metadata and extract covers from eBooks, comics and audiobooks.

- eBooks: `.epub`, `.pdf`
- Comics: `.cbz`, `.cbr`, `.cb7`, `.cbt` (metadata from [github.com/anansi-project](https://github.com/anansi-project))
- Audiobooks: `.mp3`, `.m4a`, `.m4b`, `.flac`, `.ogg`
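
The three families listed above can be told apart by file extension alone. The following is a minimal sketch of that mapping; `formatFamily()` is a hypothetical helper written for illustration and is not part of the package's API.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of kiwilan/php-ebook): maps a file path to
 * the format family named in the list above, based only on its extension.
 */
function formatFamily(string $path): ?string
{
    $families = [
        'ebook' => ['epub', 'pdf'],
        'comic' => ['cbz', 'cbr', 'cb7', 'cbt'],
        'audiobook' => ['mp3', 'm4a', 'm4b', 'flac', 'ogg'],
    ];

    // pathinfo() returns an empty string when the path has no extension.
    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));

    foreach ($families as $family => $extensions) {
        if (in_array($extension, $extensions, true)) {
            return $family;
        }
    }

    return null; // unknown or unlisted extension
}

echo formatFamily('library/volume-01.cbz') ?? 'unknown', PHP_EOL;
```

The lookup is case-insensitive, so `Dune.EPUB` and `dune.epub` land in the same family.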

_Supports Linux, macOS and Windows._

@@ -27,19 +31,11 @@ This package was built for [`bookshelves-project/bookshelves`](https://github.co
- **PHP extensions**:
- [`zip`](https://www.php.net/manual/en/book.zip.php) (native, optional) for `.EPUB`, `.CBZ`
- [`phar`](https://www.php.net/manual/en/book.phar.php) (native, optional) for `.CBT`
- [`rar`](https://www.php.net/manual/en/book.rar.php) (optional) for `.CBR`
- [`rar`](https://www.php.net/manual/en/book.rar.php) (optional) for `.CBR` ([`p7zip`](https://www.7-zip.org/) binary can be used instead)
- [`imagick`](https://www.php.net/manual/en/book.imagick.php) (optional) for `.PDF`
- [`intl`](https://www.php.net/manual/en/book.intl.php) (native, optional) for `Transliterator` for better slugify
- [`fileinfo`](https://www.php.net/manual/en/book.fileinfo.php) (native, optional) for better detection of file type

| Type | Supported | Requirement | Uses |
| :-------------------------------------: | :-------: | :------------------------------------------------------------------------------------------------------: | :----------------------------: |
| `.epub`, `.cbz` || N/A | `zip` PHP extension |
| `.cbt` || N/A | `phar` PHP extension |
| `.cbr` || [`rar` PHP extension](https://github.com/cataphract/php-rar) or [`p7zip`](https://www.7-zip.org/) binary | PHP `rar` or `p7zip` |
| `.cb7` || [`p7zip`](https://www.7-zip.org/) binary | `p7zip` binary |
| `.pdf` || Optional (for extraction) [`imagick` PHP extension](https://github.com/Imagick/imagick) | `smalot/pdfparser` (included) |
| `.mp3`, `.m4a`, `.m4b`, `.flac`, `.ogg` || N/A | `kiwilan/php-audio` (included) |
- To know more about requirements, see [Supported formats](#supported-formats).
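
Since every extension in the list above is optional, it can help to check which ones the current PHP runtime actually provides. The sketch below uses PHP's built-in `extension_loaded()`; the function name `optionalExtensionReport()` and the shape of its return value are assumptions made for illustration, not part of the package.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of kiwilan/php-ebook): reports which of the
 * optional PHP extensions from the requirements list are currently loaded.
 *
 * @return array<string, array{enables: string, loaded: bool}>
 */
function optionalExtensionReport(): array
{
    // Extension name => what it enables, as stated in the requirements list.
    $optional = [
        'zip' => '.epub, .cbz',
        'phar' => '.cbt',
        'rar' => '.cbr',
        'imagick' => '.pdf',
        'intl' => 'Transliterator (slugify)',
        'fileinfo' => 'file type detection',
    ];

    $report = [];
    foreach ($optional as $extension => $enables) {
        $report[$extension] = [
            'enables' => $enables,
            'loaded' => extension_loaded($extension),
        ];
    }

    return $report;
}

foreach (optionalExtensionReport() as $extension => $info) {
    printf(
        "%-8s %-7s %s\n",
        $extension,
        $info['loaded'] ? 'loaded' : 'missing',
        $info['enables']
    );
}
```

This only covers PHP extensions. The `p7zip` binary mentioned for `.CBR` would need a separate check on the system path.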

> **Warning**
>
@@ -48,6 +44,7 @@ This package was built for [`bookshelves-project/bookshelves`](https://github.co
## Features

- Support some formats:
- 🔎 Read metadata from **eBooks** and **audiobooks**
- 🖼️ Extract covers from **eBooks** and **audiobooks**
- 📚 Support metadata
@@ -60,36 +57,14 @@ This package was built for [`bookshelves-project/bookshelves`](https://github.co
- 🔖 Chapters extraction (`EPUB` only)
- 📦 `EPUB` and `CBZ` creation supported
<!-- - 📝 `EPUB` and `CBZ` metadata update supported -->
- Works perfectly with [kiwilan/php-opds](https://github.com/kiwilan/php-opds): PHP package to generate OPDS feeds (not included)

### Roadmap

- [ ] More formats support: `.mobi`, `.azw`, `.azw3`, `.djvu`, `.fb2`
- [ ] More formats support: `.djvu`
- [ ] Better `.epub` creation support
- [ ] Add `.epub` metadata update support

### Formats

There are many different formats for eBooks and comics. To learn more, see:

- [Comparison of e-book formats](https://en.wikipedia.org/wiki/Comparison_of_e-book_formats) for eBooks
- [Comic book archive](https://en.wikipedia.org/wiki/Comic_book_archive) for comics
- Amazing [MobileRead wiki](https://wiki.mobileread.com/wiki/Category:Formats)

| Name | Extensions | Supported | Notes |
| :--------------: | :-------------------------------------------------------------: | :-------: | :-----------: |
| EPUB (IDPF) | `.epub` || |
| Kindle (Amazon) | `.azw`, `.azw3`, `.kf8`, `.kfx` || _proprietary_ |
| Mobipocket | `.mobi`, `.prc` || _deprecated_ |
| PDF | `.pdf` || |
| iBook (Apple) | `.ibooks` || _proprietary_ |
| DjVu | `.djvu`, `.djv` || |
| Rich Text Format | `.rtf` || |
| FictionBook | `.fb2` || |
| Broadband eBooks | `.lrf`, `.lrx` || |
| Palm Media | `.pdb` || |
| CBA | `.cbz`, `.cbr`, `.cb7`, `.cbt` || |
| Audio | See [`kiwilan/php-audio`](https://github.com/kiwilan/php-audio) || |

## Installation

You can install the package via composer:
@@ -100,7 +75,7 @@ composer require kiwilan/php-ebook

## Usage

With eBook files (`.epub`, `.cbz`, `.cba`, `.cbr`, `.cb7`, `.cbt`, `.pdf`) or audiobook files (`mp3`, `m4a`, `m4b`, `flac`, `ogg`).
With eBook files or audiobook files (to know more about formats, see [Supported formats](#supported-formats)).

```php
use Kiwilan\Ebook\Ebook;
@@ -293,9 +268,41 @@ $creator->addDirectory('./', 'path/to/directory')
->save();
```

## More
## Supported formats

There are many different formats for eBooks and comics. To learn more, see:

- [Comparison of e-book formats](https://en.wikipedia.org/wiki/Comparison_of_e-book_formats) for eBooks
- [Comic book archive](https://en.wikipedia.org/wiki/Comic_book_archive) for comics
- Amazing [MobileRead wiki](https://wiki.mobileread.com/wiki/Category:Formats)

- [kiwilan/php-opds](https://github.com/kiwilan/php-opds): PHP package to generate OPDS feeds
`.epub`, `.pdf`, `.mobi`, `.prc`,

- Kindle: `.azw`, `.azw3`, `.kf8`, `.kfx`

`.cbz`, `.cbr`, `.cb7`, `.cbt`

| Name | Extensions | Supported | Notes | Uses | Has cover |
| :--------------: | :-------------------------------------: | :-------: | :-----------: | :----------------------------------------------------------------------: | :-------------------------------------------------------------------------: |
| EPUB (IDPF) | `.epub` || | Native [`zip`](https://www.php.net/manual/en/book.zip.php) ||
| Kindle (Amazon) | `.azw`, `.azw3`, `.kf8`, `.kfx` || _proprietary_ | Native [`filesystem`](https://www.php.net/manual/en/book.filesystem.php) | ✅ (See [MOBI cover note](#mobi-cover-note)) |
| Mobipocket | `.mobi`, `.prc` || _deprecated_ | Native [`filesystem`](https://www.php.net/manual/en/book.filesystem.php) ||
| PDF | `.pdf` || | [`smalot/pdfparser`](https://github.com/smalot/pdfparser) (included) | Uses [`imagick`](https://www.php.net/manual/en/book.imagick.php) |
| iBook (Apple) | `.ibooks` || _proprietary_ | | N/A |
| DjVu | `.djvu`, `.djv` || | | N/A |
| Rich Text Format | `.rtf` || | | N/A |
| FictionBook | `.fb2` || | Native [`filesystem`](https://www.php.net/manual/en/book.filesystem.php) ||
| Broadband eBooks | `.lrf`, `.lrx` || | | N/A |
| Palm Media | `.pdb` || | | N/A |
| Comics CBZ | `.cbz` || | ||
| Comics CBR | `.cbr` || | ||
| Comics CB7 | `.cb7` || | ||
| Comics CBT | `.cbt` || | ||
| Audio | `.mp3`, `.m4a`, `.m4b`, `.flac`, `.ogg` || | See [`kiwilan/php-audio`](https://github.com/kiwilan/php-audio) | [Depends on format](https://github.com/kiwilan/php-audio#supported-formats) |

### MOBI cover note

Mobipocket files and derivatives (`.mobi`, `.prc`, `.azw`, `.azw3`, `.kf8`, `.kfx`) can embed a cover image in the file. The native `php-ebook` implementation can extract this cover, but its resolution is low.

## Testing

60 changes: 38 additions & 22 deletions src/Formats/Fb2/Fb2Module.php
@@ -23,59 +23,75 @@ public static function make(Ebook $ebook): EbookModule

public function toEbook(): Ebook
{
$titleInfo = $this->parser->getTitleInfo();
$descriptionInfo = $this->parser->getDescription() ?? null;
if (! $descriptionInfo) {
return $this->ebook;
}

$this->ebook->setTitle($titleInfo['book-title'] ?? null);
$this->ebook->setTitle($descriptionInfo->title?->bookTitle);

$authors = $titleInfo['author'] ?? null;
$authors = $descriptionInfo->title?->author;
if (is_array($authors)) {
foreach ($authors as $author) {
$firstName = $author['first-name'] ?? null;
$lastName = $author['last-name'] ?? null;
$firstName = $author->firstName ?? null;
$lastName = $author->lastName ?? null;
$author = new BookAuthor(
name: "$firstName $lastName",
);
$this->ebook->setAuthor($author);
}
}

$keywords = $titleInfo['keywords'] ?? null;
$keywords = $descriptionInfo->title?->keywords;
if (is_string($keywords)) {
$keywords = explode(',', $keywords);
}

$genre = $titleInfo['genre'] ?? null;
$genre = $descriptionInfo->title?->genre;
$this->ebook->setTags([
$genre,
...$keywords,
]);

$lang = $titleInfo['lang'] ?? null;
$lang = $descriptionInfo->title?->lang;
$this->ebook->setLanguage($lang);

$description = $titleInfo['annotation'] ?? null;
$description = $descriptionInfo->title?->annotation;
$description = $this->arrayToHtml($description);

$this->ebook->setDescription($this->descriptionToString($description));
$this->ebook->setDescriptionHtml($this->descriptionToHtml($description));

$documentInfo = $this->parser->getDocumentInfo();
$uuid = $documentInfo['id'] ?? null;
$uuid = new BookIdentifier($uuid, 'uuid');
$this->ebook->setIdentifier($uuid);
$documentInfo = $descriptionInfo->document;
$uuid = $documentInfo?->id ?? null;
if ($uuid) {
$uuid = new BookIdentifier($uuid, 'uuid');
$this->ebook->setIdentifier($uuid);
}

$publishInfo = $this->parser->getPublishInfo();
$publisher = $publishInfo['publisher'] ?? null;
$publishInfo = $descriptionInfo->publish;
if ($publishInfo) {
$this->ebook->setPublisher($publishInfo?->publisher ?? null);

$this->ebook->setPublisher($publisher);
$year = $publishInfo->year ?? null;
if ($year) {
$year = new \DateTime($year);
$this->ebook->setPublishDate($year);
}

$year = $publishInfo['year'] ?? null;
$year = new \DateTime($year);
$this->ebook->setPublishDate($year);
if ($publishInfo->isbn) {
$isbn = new BookIdentifier($publishInfo->isbn);
$this->ebook->setIdentifier($isbn);
}
}

if ($descriptionInfo->title?->sequence) {
$series = $descriptionInfo->title->sequence->name ?? null;
$number = $descriptionInfo->title->sequence->number ?? null;

$isbn = $publishInfo['isbn'] ?? null;
$isbn = new BookIdentifier($isbn);
$this->ebook->setIdentifier($isbn);
$this->ebook->setSeries($series);
$this->ebook->setVolume($number);
}

return $this->ebook;
}
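
Two details of the `toEbook()` logic above may deserve a defensive variant. First, spreading `$keywords` with `...` throws an `Error` when the value is `null`, which can happen if the FB2 file has no keywords. Second, PHP's documented date formats note that a bare four-digit string passed to `new \DateTime()` may be read as `HH MM` rather than as a year. The sketch below shows one way to handle both cases; `mergeTags()` and `parseYear()` are hypothetical helpers, not part of the package, and they assume the FB2 year field holds a plain four-digit year.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: merges a genre and a keyword list into a flat list
 * of tags, tolerating null values and comma-separated keyword strings.
 *
 * @param array<mixed>|string|null $keywords
 * @return string[]
 */
function mergeTags(?string $genre, array|string|null $keywords): array
{
    if (is_string($keywords)) {
        $keywords = explode(',', $keywords);
    }

    // array_values() avoids spreading string keys; "?? []" avoids spreading null.
    $tags = [$genre, ...array_values($keywords ?? [])];

    // Trim whitespace and turn non-string entries into empty strings.
    $tags = array_map(
        static fn ($tag): string => is_string($tag) ? trim($tag) : '',
        $tags
    );

    // Drop empty entries and reindex the result.
    return array_values(array_filter($tags, static fn (string $tag): bool => $tag !== ''));
}

/**
 * Hypothetical helper: parses a four-digit year into January 1st of that
 * year, or returns null when the input is missing or not a plain year.
 */
function parseYear(?string $year): ?DateTime
{
    $year = $year === null ? '' : trim($year);
    if (preg_match('/^\d{4}$/', $year) !== 1) {
        return null;
    }

    // "!" resets every field not present in the format, so only the year is set.
    $date = DateTime::createFromFormat('!Y', $year);

    return $date ?: null;
}

print_r(mergeTags('sf', 'space, opera'));
var_dump(parseYear('2023')?->format('Y-m-d'));
```

Returning `null` from `parseYear()` keeps the caller's existing `if ($year)` guard meaningful, so an unparseable year simply leaves the publish date unset.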