1.3.6 documentation
Inside Embperl - How the embedded Perl code is actually processed

If Embperl encounters a piece of Perl code ([+/-/!/$ .... $/!/-/+]), it takes the following steps:


1. Remove anything which looks like an HTML tag


2. Translate HTML escapes to their corresponding ASCII characters


3. Remove all carriage returns


4. Eval the Perl code into a subroutine


5. Call the subroutine


6. Escape special characters in the return value


7. Send the return value as output to the destination (browser or file)

Steps 1-4 take place only the first time the Perl code is encountered. Embperl stores the eval'ed subroutine, so all subsequent requests only need to execute steps 5-7.

Steps 6 and 7 take place only for code surrounded by [+ ... +].
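
The seven steps can be sketched in plain Perl. This is only an illustration under stated assumptions, not Embperl's actual implementation: run_block, %compiled and $compile_count are hypothetical names, and only three HTML escapes are handled.

```perl
use strict;
use warnings;

my %compiled;            # hypothetical cache: block source -> compiled sub
my $compile_count = 0;   # how many times steps 1-4 actually ran

sub run_block {
    my ($source, $is_output_block) = @_;

    # Steps 1-4 run only the first time this source text is seen.
    my $sub = $compiled{$source} //= do {
        $compile_count++;
        (my $code = $source) =~ s/<[^>]*>//g;   # 1. remove tag-like text
        $code =~ s/&lt;/</g;                    # 2. translate a few escapes
        $code =~ s/&gt;/>/g;
        $code =~ s/&amp;/&/g;                   #    (&amp; last, on purpose)
        $code =~ tr/\r//d;                      # 3. remove carriage returns
        eval "sub { $code }" or die "compile error: $@";   # 4. eval into a sub
    };

    my $value = $sub->();                       # 5. call the subroutine
    return '' unless $is_output_block;          # [- ... -] blocks stop here

    $value =~ s/&/&amp;/g;                      # 6. escape special characters
    $value =~ s/</&lt;/g;
    $value =~ s/>/&gt;/g;
    return $value;                              # 7. the caller sends this out
}

my $src = qq{ <BR>\n \$a = "This '&gt;' is a greater-than sign"\n <BR> };
print run_block($src, 1), "\n" for 1 .. 2;
print "compiled $compile_count time(s)\n";
```

The second call finds the compiled subroutine in the cache, so only steps 5-7 are repeated.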

What does this mean?

Let's take a piece of code like the following:

 [+ <BR>
 $a = "This '&gt;' is a greater-than sign"
 <BR> +]

1. Remove the HTML tags. Now it looks like
 $a = "This '&gt;' is a greater-than sign"

The <BR>s were inserted by some WYSIWYG HTML editor (e.g., by hitting return to make the source more readable). Such editors also often generate "random" tags like <FONT>. Embperl removes them so they don't cause syntax errors.

There are cases where you actually want the HTML tag to be there. For example, suppose you want to output something like

 [+ "<FONT COLOR=$col>" +]

If you write it this way, Embperl will just remove everything, leaving only

 [+ "" +]

There are several ways to handle this correctly.

 a. <FONT COLOR=[+$col+]>
    Move the HTML tag out of the Perl code. This is the best way, but
    it is not possible every time.

 b. [+ "\<FONT COLOR=$col>" +]
    You can escape the opening angle bracket of the tag with `\'.

 c. [+ "&lt;FONT COLOR=$col&gt;" +]
    You can use the HTML escapes instead of the ASCII characters.
    Most HTML editors will automatically do this.  (In this case,
    you don't have to worry about it at all.)

 d. Set optRawInput (see below).
    This will completely disable the removal of HTML tags.

NOTE: In cases b-d, you must also be aware of output escaping (see below).

You should also be aware that Embperl will interpret Perl's input (diamond) operator, as in <STDIN>, as an HTML tag and will remove it. So instead of

  [- $line = <STDIN>; -]

you need to write either

 a. [- $line = \<STDIN>; -]
 b. [- $line = &lt;STDIN&gt;; -]

Again, if you use a high-level HTML editor, it will probably write version (b) for you automatically.
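
A minimal sketch of the tag removal and the backslash escape, written as plain Perl. strip_tags is a hypothetical helper, and dropping the backslash after protecting the bracket is an assumption made for the illustration; Embperl's real parser may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper, not Embperl's own code: delete anything shaped like
# <...>, but keep a '<' that is protected by a preceding backslash.
sub strip_tags {
    my ($code) = @_;
    $code =~ s/(?<!\\)<[^>]*>//g;   # unescaped tag-like text disappears
    $code =~ s/\\</</g;             # assumption: \< becomes a plain <
    return $code;
}

print strip_tags('"<FONT COLOR=$col>"'), "\n";   # the whole tag is removed
print strip_tags('$line = <STDIN>;'),    "\n";   # the filehandle read is lost
print strip_tags('$line = \<STDIN>;'),   "\n";   # escaped, so it survives
```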

2. Translate HTML escapes to ASCII characters

Since Perl doesn't understand things like $a &lt; $b, Embperl will translate it to $a < $b. If we take the example from earlier, it will now look like

 $a = "This '>' is a greater-than sign"

This step is done to make it easy to write Perl code in a high-level HTML editor. You do not have to worry that your editor is writing &gt; instead of > in the source.

Again, sometimes you need to have such escapes in your code. You can write them

 a. \&gt;
    Escape them with a `\' and Embperl will not translate them.

 b. &amp;gt;
    Write the first `&' as its HTML escape (&amp;).  A normal HTML
    editor will do this on its own if you enter &gt; as text.

 c. Set optRawInput (see below)
    This will completely disable the input translation.
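
The translation and its two escape hatches (a) and (b) can be sketched as follows. unescape_input and %entity are hypothetical names, only three escapes are covered, and removing the protective backslash is an assumption of this sketch.

```perl
use strict;
use warnings;

# Hypothetical table covering only three escapes.
my %entity = (lt => '<', gt => '>', amp => '&');

sub unescape_input {
    my ($code) = @_;
    # A backslash in front of an escape protects it; otherwise translate it.
    # The substitution is a single pass, so the '&' produced from &amp; is
    # not translated a second time.
    $code =~ s/(\\?)&(lt|gt|amp);/$1 ? "&$2;" : $entity{$2}/ge;
    return $code;
}

print unescape_input('$a &lt; $b'), "\n";   # ordinary escape, translated
print unescape_input('\&gt;'),      "\n";   # form (a): protected by backslash
print unescape_input('&amp;gt;'),   "\n";   # form (b): only &amp; is translated
```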

Since not everyone likes writing in a high-level or WYSIWYG HTML editor, there is an option to disable steps 1 and 2. You can set optRawInput in EMBPERL_OPTIONS to tell Embperl to leave the Perl code as it is. Setting this option is highly recommended if you write your HTML in a plain-text (ASCII) editor. You normally don't want to set it if you use some sort of high-level HTML editor.

You can also set optRawInput inside your document by using $optRawInput, but be aware that it has no effect on the current block, because the current block is translated before it is executed. So put the assignment and the code that depends on it in separate blocks:

 [- $optRawInput = 1 -]
 [- $line = <FILEHANDLE> -]

3. Remove all carriage returns

All carriage returns (\r) are removed from the Perl code, so you can write source on a DOS/Windows platform and execute it on a UNIX server. (Perl doesn't like getting carriage returns in the code it parses.)
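
In plain Perl, removing the carriage returns amounts to a single tr/// with the delete flag. The variable names below are illustrative only.

```perl
use strict;
use warnings;

my $code = "\$x = 1;\r\n\$y = 2;\r\n";   # source saved with DOS line endings
(my $clean = $code) =~ tr/\r//d;         # delete every carriage return

print $clean =~ /\r/ ? "CRs remain\n" : "only newlines left\n";
```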

4. Eval the Perl code into a subroutine

The next step generates a subroutine out of your Perl code. In the above example it looks like:

sub foo { $a = "This '>' is a greater-than sign" }

The subroutine is now stored in the Perl interpreter in its internal precompiled format and can be called later as often as necessary without doing steps 1-4 again. Embperl recognizes if you request the same document a second time and will just call the compiled subroutine. This will also speed up the execution of dynamic tables and loops, because the code inside must be compiled only on the first iteration.
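
The mechanism can be tried out with Perl's string eval. This sketch uses an anonymous subroutine rather than a named one like foo; how Embperl actually names and stores its subroutines is not shown here.

```perl
use strict;
use warnings;

my $code = q{$a = "This '>' is a greater-than sign"};

# String eval parses and compiles the text once and returns a code reference.
my $sub = eval "sub { $code }" or die "compile error: $@";

# Each later call runs the already-compiled code; nothing is parsed again,
# which is what makes repeated requests and loop iterations cheap.
my @results = map { $sub->() } 1 .. 3;
print "$results[0]\n";
```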

5. Call the subroutine

Now the subroutine can be called to actually execute the code.

If Embperl isn't executing a [+ ... +] block we are done. If it is a [+ ... +] block, Embperl needs to generate output, so it continues.

6. Escape special characters in the return value

Our example returns the string:

"This '>' is a greater-than sign"

The greater-than sign is literal text (and not the end of an HTML tag), so according to the HTML specification it must be sent to the browser as &gt;. In most cases this won't be a problem, because the browser will display the correct text even if we send a literal '>'. We could also have written &gt; directly in our Perl string. But when the string is, for example, the result of a database query and/or includes characters from national character sets, it is absolutely necessary to send them correctly escaped to the browser to get the desired result.
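
A minimal sketch of HTML-escaping a return value. html_escape and %html are hypothetical names, and only four characters are covered; national characters would need further table entries.

```perl
use strict;
use warnings;

# Hypothetical table covering the four characters HTML treats specially.
my %html = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');

sub html_escape {
    my ($text) = @_;
    $text =~ s/([&<>"])/$html{$1}/g;   # replace each special character once
    return $text;
}

print html_escape(q{This '>' is a greater-than sign}), "\n";
```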

A special case is the <A> HTML tag. Since it includes a URL, the text must be URL-escaped instead of HTML-escaped. This means special characters like `&' must be sent as their hexadecimal ASCII codes (`&' becomes %26) and blanks must be translated to a `+' sign. If you do not do this, your browser may not be able to interpret the URL correctly.


   <A HREF="http://host/script?name=[+$n+]">

When $n is "My name", the URL requested when you click on the hyperlink will be

   http://host/script?name=My+name
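
URL-escaping a query-string value can be sketched like this. url_escape is a hypothetical helper that works on single-byte characters only; it is not Embperl's own routine.

```perl
use strict;
use warnings;

# Hypothetical helper: blanks become '+', other unsafe characters become %XX.
sub url_escape {
    my ($value) = @_;
    # Leave letters, digits, a few safe marks and (for now) blanks alone.
    $value =~ s/([^A-Za-z0-9_.~ -])/sprintf('%%%02X', ord $1)/ge;
    $value =~ tr/ /+/;               # blanks are sent as '+'
    return $value;
}

my $n = 'My name';
print "http://host/script?name=", url_escape($n), "\n";
```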


In some cases it is useful to disable escaping. This can be done by the variable $escmode.

Example: (For better readability, we assume that optRawInput is set. Without it, you need to account for the Embperl pre-processing described in steps 1-3.)

    [+ "<FONT COLOR=5>" +]

    This will be sent to the browser as &lt;FONT COLOR=5&gt;, so you
    will see the tag on the browser screen instead of the browser
    switching the color.

    [+ local $escmode=0 ; "<FONT COLOR=5>" +]

    This will (locally) turn off escaping and send the text as a plain
    HTML tag to the browser, so the color of the output will change.

    NOTE: You cannot set $escmode more than once inside a [+ ... +] block.
    Embperl uses the first setting of $escmode it encounters inside the block.
    If you need to change $escmode more than once, you must use multiple
    [+ ... +] blocks.
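
The effect of `local' can be tried outside Embperl with an ordinary package variable. In this sketch $escape is a hypothetical stand-in for $escmode (1 = escape, 0 = raw), and emit and raw_block are illustrative names.

```perl
use strict;
use warnings;

our $escape = 1;   # hypothetical stand-in for $escmode

sub emit {
    my ($text) = @_;
    return $text unless $escape;     # escaping switched off
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

sub raw_block {
    local $escape = 0;               # restored automatically on return
    return emit('<FONT COLOR=5>');
}

print emit('<FONT COLOR=5>'), "\n";  # escaped
print raw_block(), "\n";             # raw tag
print emit('<FONT COLOR=5>'), "\n";  # escaped again: local has been undone
```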

7. Send the return value as output to the destination (browser/file)

Now everything is done and the output can be sent to the browser. If you haven't set dbgEarlyHttpHeaders, the output is buffered until the successful completion of document execution, and is sent to the browser along with the HTTP headers. If an error occurs, an error document is sent instead.

The content length and every <META HTTP-EQUIV=...> are added to the HTTP header before it is sent. If Embperl is executed as a subrequest or the output is going to a file, no HTTP header is sent.
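
Buffering is what makes both the error document and the content length possible: nothing is sent until the whole page has run. The following is only a sketch of that idea. render is a hypothetical function, the page is modelled as a code reference that returns its body, and the body is assumed to be a byte string.

```perl
use strict;
use warnings;

# Hypothetical sketch: run the page, keep its output in memory, then decide.
sub render {
    my ($page) = @_;                 # code reference returning the body
    my $body = eval { $page->() };   # undef if the page died
    $body = "<H1>Error</H1>\n" unless defined $body;   # error document instead
    # The length is known only because the whole body was buffered first.
    return "Content-Length: " . length($body) . "\r\n\r\n" . $body;
}

print render(sub { "hello\n" });
print render(sub { die "something went wrong\n" });
```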


© 1997-2012 Gerald Richter / ecos gmbh