Commit

Merge branch 'jbarker/blacklist_malicious_url' into 'main'
Blacklist malicious url in openwebtext

See merge request ADLR/megatron-lm!669
jon-barker committed Jul 11, 2023
2 parents be0e123 + 7316f73 commit 7809781
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion tools/openwebtext/blacklist_urls.py
@@ -1,4 +1,6 @@
-# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
 
+# WARNING! This file contains a blacklist of known malicious sites and thus contains some NSFW language.
+
 
 import glob
@@ -47,6 +49,7 @@
 'google',
 'gunprime',
 'gyazo',
+'horsefucker',
 'hotdealstar',
 'imagefap',
 'imageshack',
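
For readers unfamiliar with how a list like this is typically consumed, here is a minimal sketch of a domain-name blacklist check. It is not the code in `tools/openwebtext/blacklist_urls.py`: the names `DOMAIN_BLACKLIST`, `site_name`, and `is_blacklisted` are hypothetical, the set holds only a few of the entries visible in the hunk above, and the "second-to-last host label" rule is a simplifying assumption that misreads multi-part suffixes such as `.co.uk`.

```python
from urllib.parse import urlparse

# Hypothetical subset of the blacklist, copied from entries visible in the diff.
DOMAIN_BLACKLIST = {"google", "gunprime", "gyazo", "hotdealstar", "imagefap", "imageshack"}


def site_name(url):
    """Return a naive guess at the site name: the host label before the final suffix."""
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    # Assumption: the suffix is a single label (".com", ".org"); not true for ".co.uk".
    return labels[-2] if len(labels) >= 2 else host


def is_blacklisted(url, blacklist=DOMAIN_BLACKLIST):
    """True if the URL's site name appears in the blacklist set."""
    return site_name(url) in blacklist


if __name__ == "__main__":
    for candidate in ("https://i.gyazo.com/abc.png", "https://example.org/page"):
        print(candidate, is_blacklisted(candidate))
```

Under this scheme, adding one string to the set is enough to drop every URL on that site, whatever its subdomain or path.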
