inject method - API extension request [rt.cpan.org #5941] #9

Open
oalders opened this issue Aug 24, 2020 · 0 comments
Migrated from rt.cpan.org#5941 (status was 'open')

Requestors:

Attachments:

From on 2004-04-05 23:25:04
:

Perl version: v5.8.3 built for i386-linux-thread-multi
HTML::Parser version: 3.36
on Linux 2.4.25, Debian testing dist

I am working on web browser emulation and found that I need some level of preprocessing in the HTML parser. A useful primitive for this would be the ability to inject input immediately after the current parse token.

As best I can tell, when a browser hits a chunk of content such as:
<script>
document.write('<a href="http://www.perl.org/">the stuff</a>');
</script>
it essentially injects that text into the input parse buffer immediately after the </script> end tag.
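
To make the requested semantics concrete, here is a minimal sketch in plain Perl. It assumes nothing about the attached patch: ToyParser is a hypothetical stand-in written only for this illustration, not HTML::Parser, and its inject() simply prepends the chunk to the not-yet-parsed input so that it is tokenized right after the token currently being handled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ToyParser is a hypothetical stand-in, not HTML::Parser. It only splits
# input into tags and text so that the effect of inject() is visible.
package ToyParser;

sub new {
    my ($class, %args) = @_;
    my $self = { buf => '', handler => $args{handler} };
    return bless $self, $class;
}

# Put $chunk at the front of the unparsed input, so it is tokenized
# immediately after the token currently being handled.
sub inject {
    my ($self, $chunk) = @_;
    $self->{buf} = $chunk . $self->{buf};
    return $self;
}

sub parse {
    my ($self, $html) = @_;
    $self->{buf} .= $html;
    while (length $self->{buf}) {
        # Consume one token: a <...> tag, a run of text, or a stray '<'.
        $self->{buf} =~ s/\A(<[^>]*>|[^<]+|<)//s or last;
        my $token = $1;
        my $type  = $token =~ /\A<.*>\z/s ? 'tag' : 'text';
        $self->{handler}->($self, $type, $token);
    }
    return $self;
}

package main;

my @seen;
my $p = ToyParser->new(handler => sub {
    my ($parser, $type, $token) = @_;
    push @seen, $token;
    # Pretend the script wrote a link; the script body itself is ignored.
    $parser->inject('<a href="http://www.perl.org/">the stuff</a>')
        if $token eq '</script>';
});
$p->parse('<p><script>x</script>after</p>');
print join('|', @seen), "\n";
```

In this toy model the injected link is tokenized after </script> and before the text "after", which is the ordering the request asks HTML::Parser to support.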

The attached patch adds an ->inject(chunk) method to HTML::Parser objects. It is far from clean, but it shows my intent.

Here is a sample use of the inject method to do simple preprocessing:

#!/usr/bin/perl
use strict;
use warnings;
use lib 'blib/lib';
use lib 'blib/arch';
use HTML::Parser qw();
use URI::Escape qw();
use IO::String qw();
use IO::Handle qw();

my $h = <<'EOF';
<deftag name="foo">bar</deftag>
<deftag name="navbar">
  <foo>
  <table>
  <tr><td><a href="http://www.perl.org/">perl</a>
  <tr><td><a href="http://www.apache.org/">apache</a>
  <tr><td><a href="http://www.mozilla.org/">mozilla</a>
  </table>
</deftag>
<html><head><title>foo</title></head><body>
<navbar>
Testing 1... 2... 3...
</body></html>
EOF

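my %special = ();       # tag name => replacement HTML captured from a <deftag>
my $cdt = undef;        # name of the <deftag> currently being captured
my $p;
my @out = (\*STDOUT);   # output stack; the first handle receives the output
$p = HTML::Parser->new(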
    'start_h' => [ sub { my($tag, $attr, $txt) = @_;
        if(exists $special{$tag}) {
            $p->inject($special{$tag});
        } elsif($tag eq 'deftag') {
            $cdt = $attr->{'name'};
            unshift @out, IO::String->new();
        } else {
            $out[0]->print($txt);
        }
    }, 'tag,attr,text' ],
    'text_h' => [ sub { $out[0]->print(shift) }, 'text' ],
    'end_h'  => [ sub { my($tag, $txt) = @_;
        if($tag eq '/deftag') {
            $special{$cdt} = ${$out[0]->string_ref()};
            shift @out;
        } else {
            $out[0]->print($txt);
        }
    }, 'tag,text' ],
) or die "No parser: $!";
$p->parse($h);

