Bayesian spam filter for activitystrea.ms data
e14n/activityspam
This is an experimental server for filtering Activity Streams (http://activitystrea.ms/) data for spam. It is released under the Apache 2.0 license.

The approach more or less copies the filtering from "A Plan for Spam", but makes pseudo-tokens for Activity Streams fields. So something like:

    {
        id: "urn:uuid:7e4ed55a-2b99-48c8-a274-42819b2ddd39",
        url: "http://example.net/status/35",
        published: "2011-09-23T10:49:00Z",
        actor: {
            displayName: "John Smith",
            id: "urn:uuid:bff0ecdd-a944-4d92-aed3-d6af8f13d610",
            url: "http://example.net/status/johnsmith"
        },
        verb: "post",
        object: {
            id: "urn:uuid:81e43564-c66f-40c5-878b-733275229521",
            type: "note",
            content: "<a href='http://example.com/viagra-spam'>Buy Viagra Now!</a>"
        }
    }

would tokenize as:

    id=urn:uuid:7e4ed55a-2b99-48c8-a274-42819b2ddd39
    url=http://example.net/status/35
    published=2011-09-23T10:49:00Z
    actor.displayName=John-Smith
    actor.id=urn:uuid:bff0ecdd-a944-4d92-aed3-d6af8f13d610
    actor.url=http://example.net/status/johnsmith
    verb=post
    object.id=urn:uuid:81e43564-c66f-40c5-878b-733275229521
    object.type=note
    a href http://example.com/viagra-spam Buy Viagra Now a

Field values become a single `path=value` token each (with spaces replaced by hyphens, as in `John-Smith`), while the object's content is split into plain word and URL tokens.

There may be some value in grabbing the domains of URLs (example.com and example.net here).
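
The tokenization described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the server's actual code: the function names `tokenizeActivity`, `tokenizeContent`, and `urlDomains` are hypothetical, as are the choices to treat only `object.content` as free text and to split it with a simple regular expression. The domain helper shows one possible way to act on the idea of grabbing URL domains.

```javascript
// Hypothetical sketch: flatten an activity into field pseudo-tokens
// ("path=value") plus plain word/URL tokens for the object's content.
function tokenizeActivity(activity) {
  const tokens = [];

  function walk(value, path) {
    if (value !== null && typeof value === "object") {
      // Recurse into nested objects, extending the dotted path.
      for (const key of Object.keys(value)) {
        walk(value[key], path.concat(key));
      }
      return;
    }
    const name = path.join(".");
    if (name === "object.content") {
      // Assumption: only object.content is treated as free text.
      tokens.push(...tokenizeContent(String(value)));
    } else {
      // Assumption: whitespace inside a value becomes "-" so that
      // each field yields exactly one token (e.g. "John-Smith").
      tokens.push(name + "=" + String(value).trim().replace(/\s+/g, "-"));
    }
  }

  walk(activity, []);
  return tokens;
}

// Keep URLs whole; otherwise keep runs of letters and digits.
// Punctuation and markup brackets are dropped.
function tokenizeContent(text) {
  const matches = text.match(/https?:\/\/[^\s'"<>]+|[A-Za-z0-9]+/g);
  return matches || [];
}

// Collect the distinct host names of any http(s) URLs found in the tokens.
function urlDomains(tokens) {
  const domains = new Set();
  for (const token of tokens) {
    const match = token.match(/https?:\/\/[^\s'"<>]+/);
    if (match) {
      try {
        domains.add(new URL(match[0]).hostname);
      } catch (e) {
        // Ignore strings that look like URLs but do not parse.
      }
    }
  }
  return [...domains];
}

// Usage with a small, made-up activity.
const sample = {
  verb: "post",
  object: { type: "note", content: "Buy <b>now</b> at http://example.com/x" }
};
const sampleTokens = tokenizeActivity(sample);
console.log(sampleTokens);
console.log(urlDomains(sampleTokens));
```

The resulting tokens would then feed a Bayesian classifier in the same way ordinary words do; the domain names could be added as extra tokens if they turn out to be useful signals.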