Jeremy Foster edited this page Sep 19, 2024 · 1 revision

Apache Kafka has been implemented to support the Queue. The Queue provides a performant hub that all other systems, dependent or independent, can interact with. This supports horizontal and vertical scaling and allows services to be separated geographically. The Queue also enables automated management of the lifecycle and lifespan of content. Audio and video content is never uploaded to the Queue, although its metadata is. Audio and video content is instead served by additional services that are physically closer to the content, which improves performance and reduces network bandwidth use.
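To illustrate the split between media files and metadata, here is a minimal sketch of a message that carries only metadata and a pointer to where the file is stored. The `ContentMetadata` class, its field names, and the example path are assumptions made for illustration; they are not the actual TNO message schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ContentMetadata:
    """Hypothetical metadata record; field names are illustrative only."""

    source: str        # which ingest service captured the content
    content_type: str  # e.g. "audio" or "video"
    title: str
    file_path: str     # where the media file lives, outside the queue
    size_bytes: int    # size of the media file, not of this message


def to_queue_message(metadata: ContentMetadata) -> bytes:
    """Serialize only the metadata; the media bytes are never included."""
    return json.dumps(asdict(metadata), sort_keys=True).encode("utf-8")


clip = ContentMetadata(
    source="recorder-tv",
    content_type="video",
    title="Evening news",
    file_path="/mnt/av/2024/09/19/evening-news.mp4",
    size_bytes=734_003_200,
)
message = to_queue_message(clip)
```

The message stays a few hundred bytes regardless of how large the recording is, which is the property the paragraph above relies on.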

The Queue solution has not yet been fully designed or implemented.

More information from Apache Kafka

Producers

Kafka refers to a service that publishes or pushes content to the queue as a Producer. A producer pushes content to a Kafka Topic. Each topic is an immutable log maintained by Kafka, and content can be acted upon once it is in the queue.
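The idea of a topic as an append-only log can be sketched without a Kafka broker. The `Topic` and `Producer` classes below are in-memory stand-ins written for this page, not the Kafka client API; they only show that publishing appends to the end of the log and that earlier entries are not rewritten.

```python
from typing import List


class Topic:
    """In-memory stand-in for a Kafka topic: an append-only log."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._log: List[bytes] = []

    def append(self, message: bytes) -> int:
        """Add a message to the end of the log and return its offset."""
        self._log.append(message)
        return len(self._log) - 1

    def read(self, offset: int) -> bytes:
        """Return the message stored at the given offset."""
        return self._log[offset]

    def __len__(self) -> int:
        return len(self._log)


class Producer:
    """Hypothetical producer that publishes to a single topic."""

    def __init__(self, topic: Topic) -> None:
        self.topic = topic

    def publish(self, message: bytes) -> int:
        return self.topic.append(message)


news = Topic("news")
producer = Producer(news)
first_offset = producer.publish(b"story-1")
second_offset = producer.publish(b"story-2")
```

There is deliberately no update or delete method: once a message has an offset, later publishes only add new offsets after it.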

The TNO solution will have several ingest services (Producers) that pull or receive content from 3rd parties. This content will be pushed to the queue so that additional processes can be performed and content can be indexed, searched, and viewed by TNO subscribers.

| Producer | Description |
| --- | --- |
| API Listener | Open RESTful API that 3rd parties can push content to |
| API Requester | Service that pulls content from 3rd party APIs |
| Web Reader | Service that crawls websites for content |
| Recorder - Stream | Service that records streamed video |
| Recorder - TV | Service that records TV |
| Recorder - Radio | Service that records radio |
| File Share Listener | Service that listens for file uploads to file shares |
| Editor App | Web application that provides editors the ability to manually add content |

Producers will make a request to the TNO database for their Data Source configuration. This configuration controls the producer: whether it is enabled, when it will run, and how often it will run.
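A minimal sketch of how a producer might act on that configuration follows. The `DataSourceConfig` fields (`is_enabled`, `start_hour`, `stop_hour`, `interval`) are assumed names chosen to mirror the three settings described above; the real TNO configuration schema may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class DataSourceConfig:
    """Hypothetical shape of a Data Source configuration record."""

    name: str
    is_enabled: bool      # whether the producer is enabled
    start_hour: int       # when it may run: first hour of the window (inclusive)
    stop_hour: int        # when it may run: last hour of the window (exclusive)
    interval: timedelta   # how often it runs inside the window


def should_run(
    config: DataSourceConfig, now: datetime, last_run: Optional[datetime]
) -> bool:
    """Decide whether the producer should run at the given moment."""
    if not config.is_enabled:
        return False
    if not (config.start_hour <= now.hour < config.stop_hour):
        return False
    if last_run is None:
        return True  # never ran before, so run immediately
    return now - last_run >= config.interval


config = DataSourceConfig("web-reader", True, 6, 18, timedelta(minutes=15))
now = datetime(2024, 9, 19, 10, 0)
run_first_time = should_run(config, now, None)
run_too_soon = should_run(config, now, now - timedelta(minutes=5))
```

Keeping the decision in one pure function makes the schedule easy to test without a database or a running service.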

Consumers

Kafka refers to a service that subscribes to or pulls content from the queue as a Consumer. A consumer pulls content from a Kafka Topic or Stream. Each topic is an immutable log maintained by Kafka, which updates the consumers listening to it when new content has been added. Each stream is a query of information from one or more topics within the Kafka queue. Kafka updates the consumers listening to a stream when new content is added that matches the stream's criteria.
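The difference between reading a topic and reading a stream can be sketched in memory. The `Topic`, `Consumer`, and `stream` names below are stand-ins written for this page, not the Kafka API, and the `stream` function is a one-shot filter rather than a continuously updated query. It shows a consumer tracking its own position in a topic, and a stream selecting matching messages across several topics.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class Topic:
    """In-memory stand-in for a Kafka topic."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.log: List[Message] = []

    def append(self, message: Message) -> None:
        self.log.append(message)


class Consumer:
    """Reads a topic from its own offset, so each message is seen once."""

    def __init__(self, topic: Topic) -> None:
        self.topic = topic
        self.offset = 0

    def poll(self) -> List[Message]:
        """Return messages added since the last poll and advance the offset."""
        new_messages = self.topic.log[self.offset:]
        self.offset = len(self.topic.log)
        return new_messages


def stream(topics: List[Topic], matches: Callable[[Message], bool]) -> List[Message]:
    """Simplified stream: a filtered view across one or more topics."""
    return [m for topic in topics for m in topic.log if matches(m)]


tv = Topic("tv")
radio = Topic("radio")
tv.append({"type": "video", "title": "Evening news"})
radio.append({"type": "audio", "title": "Morning show"})

consumer = Consumer(tv)
first_batch = consumer.poll()
tv.append({"type": "video", "title": "Late news"})
second_batch = consumer.poll()

audio_only = stream([tv, radio], lambda m: m["type"] == "audio")
```

The second poll returns only the message added after the first poll, because the consumer, not the topic, remembers how far it has read.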

The TNO solution will have several services (Consumers) that pull content from the queue and perform additional processes such as transcription, natural language processing, indexing, analysis, searching, moving content to cloud storage, uploading to 3rd party media services, and viewing.

| Consumer | Description |
| --- | --- |
| Store on File Share | Video and audio content is downloaded to file shares |
| Upload to Media Service | Video and audio content is uploaded to cloud media service |
| Transcribe | Kafka consumer process to extract text from video and audio content |
| NLP Process | Kafka consumer process to perform natural language processing |
| Index/Search | Elasticsearch storage and indexing of content for the purpose of search |
| Store Content | All metadata and content is stored within TNO for its licensed lifecycle |
| Archive/Clean | All content has license rules that must enforce lifecycle and lifespan |
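As one possible reading of the Archive/Clean consumer, the sketch below selects content whose licensed lifespan has ended. The `StoredContent` class and the idea of expressing a license as a number of days (`license_days`) are assumptions for illustration; the actual license rules are not specified on this page.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class StoredContent:
    """Hypothetical stored item with an assumed day-based license rule."""

    content_id: int
    published_on: datetime
    license_days: int  # assumed: how long the content may be kept


def expired_ids(items: List[StoredContent], now: datetime) -> List[int]:
    """Return ids of items whose licensed lifespan has ended as of `now`."""
    return [
        item.content_id
        for item in items
        if now >= item.published_on + timedelta(days=item.license_days)
    ]


library = [
    StoredContent(1, datetime(2024, 1, 1), license_days=30),
    StoredContent(2, datetime(2024, 9, 1), license_days=90),
]
to_clean = expired_ids(library, datetime(2024, 9, 19))
```

Here only the first item is selected, since its 30-day window ended in January while the second item's 90-day window is still open.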

Diagram

The following diagram provides the flow of data in relation to the Kafka queue. It represents the heart of the solution.

[Diagram: Queue]
