A JSON list of podcast hosts, and the patterns they use in audio URLs.
If you have the URL of a podcast's audio file, this JSON file will help you work out who the host is; and abilities that host may have that may impact listener privacy.
This data is synched to podcast-privacy.com, which offers an API.
Each podcast host has an image, in optimised .png format. The filename is programmatically calculated from the podcast host's name, in lower case, with non-alphanumeric characters encoded as a "-". Here's our ugly code:
$imageFilename=str_replace(' ','-',str_replace('.','-',strtolower($host['hostname'])));
For now, the simplest way is to add to the file at src/hosts.json
. Each podcast host may have multiple entries, but the URL patterns should be unique.
Each entry must contain the following properties:
-
pattern
: a unique string to spot within the audio URL as viewed in the enclosure (more accurately, within the domain portion of the URL) -
hostname
: a humanly-readable name of the podcast that this is hosted with -
retailer
: a boolean showing whether podcasters can buy hosting on this podcast host (BBC isn't; Libsyn is). -
iab
: a boolean showing whether a podcast host is able to offer IAB Certified Compliant statistics. This does not mean that all statistics from this host are certified compliant. -
abilities_stats
: a boolean showing whether this podcast host uses its logs to provide stats -
abilities_tracking
: a boolean showing whether this podcast host uses its logs to provide tracking and attribution -
abilities_dynamicaudio
: a boolean showing whether this podcast host is capable of dynamic content insertion, used for advertising or content. -
hosturl
: a website link (escaped) that links to the homepage of the podcast host.
Each entry can contain the following properties:
-
rss-pattern
: an ADDITIONAL pattern for the RSS URL, to help with cases like LibsynPro which uses the same infrastructure for hosting retail and non-retail shows -
privacyhosturl
: a link to the privacy policy of the podcast host -
notes
: a freeform text link with details of evidence of 'abilities' claims. Ideally, all podcast hosts that have these indicated will have evidence in this field. -
pi-slug
: a freeform text value that is the podcast host slug used by Podcast Index to report who hosts what. Podcast Index lists many more host names.
Podnews uses the below to extract a host's name, and privacy details, in podcast pages (example).
$stmt = $db->prepare("SELECT * FROM `podcasts-hosts` WHERE (INSTR(:url,pattern) AND INSTR(:rssurl,`rss-pattern`)) OR INSTR(:url,pattern) ORDER BY `rss-pattern` ASC LIMIT 1");
$stmt->execute(array(':url'=>parse_url($podcast['audiourl'],PHP_URL_HOST).parse_url($podcast['audiourl'],PHP_URL_PATH),':rssurl'=>$podcast['feedUrl']));
$host = $stmt->fetch(PDO::FETCH_ASSOC);
(You're recommended to remove the query string, as above).
Improvements are welcome. This list won't adequately spot original hosts via Feedburner or similar, but otherwise will catch most podcasts.