Skip to content

Database

Jeremy Foster edited this page Sep 21, 2022 · 3 revisions

The TNO solution has several data sources and data storage locations. To support the configuration and web applications there is a relational PostgreSQL database. This database provides tables to structure and manage editor, subscriber, and administrator features within the Editor and Subscriber web applications.

In addition to the relational database, there is also a non-relational database used for indexing and searching content, and a non-relational queue used for storing the body of content.

Data Model (ERD)

Table Description
User User account information
Role User roles
Claim User access privileges
Account User account management
Report Automated report configuration
Alert Automated alert configuration
Source List of content sources
Ingest Configuration details for ingesting 3rd party content
Product List of product designations to control where content is displayed
Filter Configurable options to search for content
Preference User preference configuration options
Component Web page component configuration
ContentReference Reference to ingested content to map to the Kafka Topic
Content Content created or edited within TNO
TonePool Manage different types of tone pools
Category Categorize content to be part of different reports
Tag Provide a way to identify content so that it can be organized
Action Perform an action or rule on content
MediaType Identify different types of media for ingest services
FileReference Reference information for files associated with content
ContentLink Identify related content
ContentTone Many-to-many relationship between content and tone pools
ContentAction Many-to-many relationship between content and actions
ContentTag Many-to-many relationship between content and tags
ContentCategory Many-to-many relationship between content and categories
TimeTracking Capture time spent on content by resource

Audit Columns

All tables will have the following common auditing columns.

Column Type Required Description
createdById uuid * Loose foreign key reference to User, or blank guid for non-User.
createdBy varchar(50) * Username to identify the source.
createdOn timestamp * Date and time when created. Trigger to update.
updatedById uuid * Loose foreign key reference to User, or blank guid for non-User.
updatedBy varchar(50) * Username to identify the source.
updatedOn timestamp * Date and time when updated. Trigger to update.
version bigint * Auto-incremented and used for optimistic concurrency.

Data Source

Manage the various data sources being imported into TNO.

Column Type Required Description
id int PK, Sequence
name varchar(50) *, Unique
code varchar(10) *, Unique Abbreviation to identify data source. Used as soft link from content.
description varchar(2000)
isEnabled boolean *
dataLocationId int *, FK Foreign key to data location.
mediaTypeId int *, FK Foreign key to media type.
licenseId int *, FK Foreign key to license.
ownerId int FK Foreign key to user.
scheduleType int Identifies the type of schedule.
topic varchar(50) * The Kafka Topic this data source will be push to.
connection json * Connection information.
parentId int FK Foreign key to parent data source. Used for audio/video capture/clip.
retryLimit int Maximum number of retries after a failure.

Data Service

Provides a way for services to update the state of a data source.

Column Type Required Description
dataSourceId int PK, FK Foreign key to the parent data source
failedAttempts int * Number of sequential failures that have occurred
lastRanOn timestamp When the service ran successfully the last time

Media Types

Identify the type of data source [RSS, ATOM, Newspaper, Radio News, Talk Radio, Regional, AV Archive, CC News, TV News, Today's Edition, Transcript, CP News, Scrum, Webcast]

Column Type Required Description
id int PK, Sequence
name varchar(50) *, Unique
description varchar(2000)
isEnabled boolean *
sortOrder int *

Licenses

Manage the different types of licenses.

Column Type Required Description
id int PK, Sequence
name varchar(50) *, Unique
description varchar(2000)
isEnabled boolean *
ttl int * Number of days before content is purged. "0" represents forever.
sortOrder int *

Data Locations

A data location represents a place where files are stored. Currently we only support local volume storage, but in time will support others (i.e. Azure, NAS, FTP, Local).

Column Type Required Description
id int PK, Sequence
name varchar(50) *, Unique
description varchar(2000)
isEnabled boolean *
sortOrder int *
locationType int * A way to identify types so that there can be multiple locations of the same time.
connection json * Connection settings and configuration

Schedules

Manage the different types of schedules and when processes are run (and how often). A process will run as often as the schedule allows. Each time the processes runs it will connect to the configuration to determine whether it should still run.

Column Type Required Description
id int PK, Sequence
name varchar(50) *, Unique
description varchar(2000)
isEnabled boolean *
delayMS int * Number of milliseconds before the process attempts to run again.
scheduleType timestamp * Whether the schedule is repeating or managed.
startAt time The time of day to start.
stopAt time The time of day to stop.
runOn timestamp The time of day to begin running. "null" is used when the schedule is continuous
repeat int * Number of times to run before waiting for next RunAt. "0" is used for continuous running.
runOnWeekDays int * Which weekdays to run on. [0=N/A, 1=Sunday, 2=Monday, 4=Tuesday, 8=Wednesday, 16=Thursday, 32=Friday, 64=Saturday]
runOnMonth int * Which month to run on. [0=N/A, 1=January, 2=February, 4=March, 8=April, 16=May, 32=June, 64=July, 128=August, 256=September, 512=October, 1024=November, 2048=December]
dayOfMonth int * Which day of the month to run on. "0" means this setting is not applicable.

Content Reference

A content reference provides a way to ensure duplication of content is not pushed to the Kafka Topics. Each reference will maintain the current status of its content and its location within the Kafka Topic.

Column Type Required Description
source varchar(50) PK The data source abbreviation code.
uid varchar(100) PK The unique key to identify the content from the source.
topic varchar(50) * The topic the content was pushed.
partition int * The partition within the Kafka Topic.
offset int * The offset within the Kafka Topic.
status int * The status of the content within TNO processing [InProgress(0), Success(1), Transcribed(2), NLP(3), Failed(-1)]
publishedOn timestamp When the content was published by the source.

Content

Content represents a story or news captured by or created within the solution.

Column Type Required Description
id long PK
status int * The current status of the content, used to determine if it has been published.
workflowStatus int * The current status of background services running processes on the content.
contentTypeId int *, FK Controls the UI form
mediaTypeId int *, FK Identifies the type of media the content represents
licenseId int *, FK License this content must obey
seriesId int FK Link to series this content is related to
ownerId int FK Link to editor who is responsible for this content
dataSourceId int *, FK Link to original source of content
source varchar(100) * Short code to identity original source of content
headline varchar(500) * Story headline
uid varchar(100) * Unique identification provided by original source of content
page varchar(10) * The page the story originated from
publishedOn timestamp PK When the story was published
summary text * Summary of content
transcription text * Transcription of content
sourceURL varchar(500) * URL to original source of content
Clone this wiki locally