Update to 1.16.0
joniles committed Feb 10, 2021
1 parent 3119eef commit cd420e7
Showing 2 changed files with 57 additions and 57 deletions.
112 changes: 56 additions & 56 deletions README.md
@@ -1,56 +1,56 @@
RTF Parser Kit
==============

I have often been frustrated by the lack of comprehensive support for working with RTF in Java, and the need to use RTF parsers which are incomplete and form part of larger projects whose libraries I don't want to import just to use the RTF parser. The RTF Parser Kit project is an attempt to address these points.

The idea is to provide a "kit" of components which can either be used "as-is", for example to extract plain text or HTML from an RTF file, or can be used as a component in a larger application which requires the capability to parse RTF documents.

What's currently included?
--------------------------
* Raw RTF Parser - parses RTF, sends events representing content to a listener. Performs minimal processing - you get the RTF commands and data exactly as they appear in the file.
* Standard RTF Parser - parses RTF, sends events representing content to a listener. Handles character encoding, Unicode and so on, so you don't have to. This is probably the parser you want to use.
* Text Converter - demonstrates very simple text extraction from an RTF file
* RTF Dump - another demonstration, this time writing the RTF file contents as XML

Getting Started
===============

To install the library, you can either download the latest JAR directly from the GitHub releases page,
or you can add RTF Parser Kit as a dependency using Maven:

```xml
<dependency>
    <groupId>com.github.joniles</groupId>
    <artifactId>rtfparserkit</artifactId>
    <version>1.16.0</version>
</dependency>
```

Once you have the library, you have a choice of two parsers: the standard parser and the raw parser. The raw parser carries out minimal processing on the RTF, while the standard parser handles character encodings and translates commands which represent special characters into their Unicode equivalents. Most people will want to use the standard parser.
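
To give a feel for the kind of translation described above, here is a minimal standalone sketch. It is not the library's implementation and uses none of its classes. It assumes the text contains RTF `\'hh` hex escapes, that everything else is plain ASCII, and that the code page is already known (windows-1252 in the usage note). `HexEscapeSketch` and `decode` are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

// Illustration only: a hypothetical helper, not part of RTF Parser Kit.
final class HexEscapeSketch {

    /**
     * Replaces each \'hh escape with the byte it names, then decodes the
     * resulting byte sequence using the supplied charset.
     * Assumes every non-escaped character is plain ASCII.
     */
    static String decode(String rtfText, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int i = 0;
        while (i < rtfText.length()) {
            char c = rtfText.charAt(i);
            boolean isHexEscape = c == '\\'
                && i + 3 < rtfText.length()
                && rtfText.charAt(i + 1) == '\'';
            if (isHexEscape) {
                // Two hex digits follow the backslash and apostrophe.
                bytes.write(Integer.parseInt(rtfText.substring(i + 2, i + 4), 16));
                i += 4;
            } else {
                bytes.write(c);
                i++;
            }
        }
        return new String(bytes.toByteArray(), charset);
    }
}
```

For example, `HexEscapeSketch.decode("caf\\'e9", Charset.forName("windows-1252"))` yields `café`. A raw-style parser would hand you the escape as it appears in the file, whereas a standard-style parser would hand you the decoded character.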

The parser is invoked like this:
```java
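// Open the RTF file and wrap the stream in a source the parser can read.
InputStream is = new FileInputStream("/path/to/my/file.rtf");
IRtfSource source = new RtfStreamSource(is);
IRtfParser parser = new StandardRtfParser();
// MyRtfListener is your own IRtfListener implementation.
MyRtfListener listener = new MyRtfListener();
parser.parse(source, listener);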
```
You provide input to the parser via a class that implements the `IRtfSource` interface. Two implementations are provided for you: `RtfStreamSource`, for reading RTF from a stream, and `RtfStringSource`, for reading RTF from a string.

The other thing you need to provide the parser with is a listener class, which implements the `IRtfListener` interface. The interface consists of a set of methods which the parser calls to inform you when it encounters different parts of the document structure. The set of methods, along with some comments describing their purpose, can be seen [here](https://github.com/joniles/rtfparserkit/blob/master/RTF%20Parser%20Kit/src/com/rtfparserkit/parser/IRtfListener.java).

You don't need to implement all of the `IRtfListener` interface yourself. If you wish, you can subclass `RtfListenerAdaptor`, which provides empty implementations of all of the `IRtfListener` methods, and then override just the methods you are interested in.
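
The adaptor approach can be sketched without the library on the classpath. The example below is a self-contained illustration: `Listener` and `ListenerAdaptor` are hypothetical, cut-down stand-ins for `IRtfListener` and `RtfListenerAdaptor` (the real interface declares more methods than shown), and `replay` is a made-up helper that fires events roughly as a parser might for a tiny document.

```java
// Illustration only: simplified stand-ins, not the library's own types.
final class ListenerAdaptorSketch {

    /** Hypothetical, cut-down listener interface (stand-in for IRtfListener). */
    interface Listener {
        void processDocumentStart();
        void processDocumentEnd();
        void processGroupStart();
        void processGroupEnd();
        void processString(String text);
    }

    /** Adaptor with empty bodies, so subclasses override only what they need. */
    static class ListenerAdaptor implements Listener {
        @Override public void processDocumentStart() { }
        @Override public void processDocumentEnd() { }
        @Override public void processGroupStart() { }
        @Override public void processGroupEnd() { }
        @Override public void processString(String text) { }
    }

    /** Overrides a single callback to accumulate the document's text. */
    static class TextCollector extends ListenerAdaptor {
        private final StringBuilder text = new StringBuilder();

        @Override public void processString(String s) {
            text.append(s);
        }

        String getText() {
            return text.toString();
        }
    }

    /** Fires events in the order a parser might for a very small document. */
    static void replay(Listener listener) {
        listener.processDocumentStart();
        listener.processGroupStart();
        listener.processString("Hello ");
        listener.processString("World");
        listener.processGroupEnd();
        listener.processDocumentEnd();
    }

    public static void main(String[] args) {
        TextCollector collector = new TextCollector();
        replay(collector);
        System.out.println(collector.getText()); // prints "Hello World"
    }
}
```

With the real library, the equivalent of `TextCollector` would extend `RtfListenerAdaptor` and be passed to `parser.parse(source, listener)`.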

An example text extractor is provided. You can invoke it like this:
```java
new StreamTextConverter().convert(new RtfStreamSource(inputStream), outputStream, "UTF-8");
```
This code reads an RTF file from the `inputStream` and writes the resulting text to the `outputStream` in the encoding specified by the last argument.

A second example text extractor is also provided. This one extracts text from the RTF file into a string:
```java
StringTextConverter converter = new StringTextConverter();
converter.convert(new RtfStreamSource(inputStream));
String extractedText = converter.getText();
```
2 changes: 1 addition & 1 deletion pom.xml
@@ -4,7 +4,7 @@
<modelVersion>4.0.0</modelVersion>
<groupId>com.github.joniles</groupId>
<artifactId>rtfparserkit</artifactId>
-<version>1.15.0</version>
+<version>1.16.0</version>
<packaging>jar</packaging>

<name>RTF Parser Kit</name>