[wikka-community] Coding guideline about backslash.

Marjolein Katsma javawoman
Mon Oct 8 15:31:41 GMT 2007


At 14:09 2007-10-08, Mahefa wrote:
>Which coding guideline about backslash?
>
>My coding style for writing a backslash inside a string, whether
>double quote or single quote is used as delimiter; is to expressly
>escape it with another backslash.
>
>These 2 strings are the same to write a string composed of 2
>characters: a backslash and the letter n.
>'\\n' and '\n'

In PHP they do come out the same, but for a narrow reason: within single 
quotes only two escapes exist, \\ for a backslash and \' for a single 
quote. Every other character stands for itself, including a backslash 
followed by anything else. So '\\n' and '\n' both read "a backslash and the 
letter n".

It's not a matter of a rule for backslashes, but a rule for using single or 
double quotes:
- use single quotes for LITERALS (every character stands for itself)
- use double quotes only for strings that need to be INTERPOLATED

It's mainly in "interpolated" strings that you need an escape character 
to make a "special" character stand for itself instead of something to be 
interpolated; in a literal only the backslash and the single quote itself 
can ever need one.
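
For anyone who wants to verify this, here is a minimal sketch to run on 
your own PHP installation. The variable names are made up for 
illustration; the behaviour shown is that of single- and double-quoted 
strings in PHP.

```php
<?php
// Single quotes: only \\ and \' are escapes; everything else is literal.
$a = '\\n';   // backslash, n
$b = '\n';    // backslash, n (\n is not an escape in single quotes)

// Double quotes: escape sequences and variables are interpreted.
$c = "\n";    // one newline character
$d = "\\n";   // backslash, n

var_dump($a === $b);    // bool(true)
var_dump(strlen($a));   // int(2)
var_dump(strlen($c));   // int(1)
var_dump($a === $d);    // bool(true)

$name = 'Wikka';
echo 'Hello $name', "\n";   // prints: Hello $name
echo "Hello $name", "\n";   // prints: Hello Wikka
```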


>I prefer the first one, and the reasons are:
>
>1) When in the future, someone changes my singlequote in doublequote,
>errors due to this change are minimized.

When someone changes the single quotes to double quotes they must have a 
reason for that - and that brings with it the responsibility to consider 
whether any character may need to be escaped.

>2) clarity: I don't need to think if the character that follow the
>backslash has a special meaning when eventually combined with it. I
>just have to count the number of backslashes and divide them by 2.

Yes, you DO need to think (not to divide by two but to escape special 
characters), because you need to make a reasoned decision to use double 
quotes in the first place, instead of the generally preferred single 
quotes. If you just need one or two interpolated characters, it's better to 
concatenate them with the rest of the string still in single quotes.

So if you start with:
         echo 'This is a very looong string to be written to the output.';
and want to add a newline to that, the solution is NOT
         echo "This is a very looong string to be written to the output.\n";
but instead:
         echo 'This is a very looong string to be written to the output.'."\n";


>Consider you want to write a constant \\192.168.1.2\d$, using a single
>quote as delimiter.
>
>If you use '\\192.168.1.2\d$', the string will be : \192.168.1.2\d$

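Yes, it will: \\ is one of the only two escapes PHP recognizes in single 
quotes, so the leading pair collapses to one backslash. The \d stays as 
written, because a backslash before any other character stands for itself.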
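
A quick way to check what such a literal really contains is to print it 
and count its characters. This is only a sketch; $share is a hypothetical 
variable name.

```php
<?php
// The single-quoted literal from the quoted message.
$share = '\\192.168.1.2\d$';

echo $share, "\n";          // prints: \192.168.1.2\d$
var_dump(strlen($share));   // int(15): one leading backslash, not two
```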

>You can easily get in trouble if you used to consider that escaping
>backslash is not needed within single-quote-delimited strings.

Escaping a backslash is only needed in a literal where it would otherwise 
be read as an escape: directly before another backslash or before the 
closing quote. Anywhere else there is nothing to escape, and the backslash 
simply stands for itself.

>To write the string correctly, you must do one of the 4 proposals below.
>'\\\192.168.1.2\d$' or '\\\\192.168.1.2\d$' or '\\\192.168.1.2\\d$' or
>'\\\\192.168.1.2\\d$'
>
>For me, having in mind that I always escape my backslashes, the 4th
>proposal is what I can read and understand more easily.

Actually, I find that hard to read: at first glance it looks like "four 
backslashes, an IP address, two backslashes, the letter d and a dollar 
sign", and the reader has to halve every run to see what is really meant.
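
To see what the four spellings quoted above actually produce, a small 
sketch (the array name $variants is hypothetical):

```php
<?php
// The four single-quoted spellings proposed in the quoted message.
$variants = array(
    '\\\192.168.1.2\d$',
    '\\\\192.168.1.2\d$',
    '\\\192.168.1.2\\d$',
    '\\\\192.168.1.2\\d$',
);

// All four collapse to one and the same string.
var_dump(count(array_unique($variants)));   // int(1)
echo $variants[0], "\n";                    // prints: \\192.168.1.2\d$
```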


That said, there is one special case to consider, and that is using such 
strings as (building blocks for) regular expressions, because regular 
expressions have their own layer of special characters and escaping: the 
PHP engine hands the string off to the PCRE engine, which *itself* 
interprets the regex string it is handed. To PCRE, characters such as 
. * + ? ( ) [ ] { } ^ $ | and the backslash have a special meaning outside 
a character class; inside a character class most characters are literal 
(the dash is a literal only when put first or last).

In that case the simple solution is to still write the string as you mean 
it, as a literal whenever possible, keeping it readable for humans, and to 
use preg_quote() to escape whatever PCRE would treat as special.
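
As an illustration, here is a sketch of what preg_quote() does to a 
literal that contains PCRE special characters. The variable names are 
hypothetical, and the "/" delimiter is just an assumption for this example.

```php
<?php
// The literal from the discussion above; its value is \192.168.1.2\d$
$literal = '\\192.168.1.2\d$';

// preg_quote() puts a backslash before every character PCRE treats as special.
// The second argument names the pattern delimiter so that it gets escaped too.
$quoted = preg_quote($literal, '/');
echo $quoted, "\n";   // prints: \\192\.168\.1\.2\\d\$

// The quoted text matches the original literal, but not a look-alike.
var_dump(preg_match('/^' . $quoted . '$/', $literal));             // int(1)
var_dump(preg_match('/^' . $quoted . '$/', '\\192x168x1x2\d$'));   // int(0)
```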


To summarize:
- use single quotes for literals, as much as possible
- use double quotes ONLY when you need strings to be interpolated (by PHP), 
and whenever feasible concatenate only those bits with single-quoted literals

That much is pure PHP.

For regular expressions you need to consider how those strings are going to 
be used:
- when using regular expressions, use PCRE (preg_*) in PHP: that keeps the 
syntax consistent, expressions usable as building blocks, and our RE 
library (in the making) easier to maintain
- when *creating* regular expressions, use the above rules for strings
- when the *resulting* string (after PHP has done any interpolation 
already!) /might/ contain any character that is a "special character" in 
PCRE, use preg_quote() to escape it before passing it to the PCRE engine 
with one of the (other) preg_* functions. See http://php.net/preg-quote.php .
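
The rules above could be combined roughly like this. It is only a sketch 
under assumed names: $label, $unsafe and $safe, and the version-label 
example itself, are invented for illustration.

```php
<?php
// Hypothetical plain text to search for; '.' and '+' are special in PCRE.
$label = 'v1.0+';

// Regex building blocks are written as single-quoted literals;
// only the plain-text part goes through preg_quote().
$unsafe = '/^' . $label . '-(\d+)$/';
$safe   = '/^' . preg_quote($label, '/') . '-(\d+)$/';

var_dump(preg_match($unsafe, 'v1.0+-42'));     // int(0): '+' was read as a quantifier
var_dump(preg_match($safe, 'v1.0+-42', $m));   // int(1)
echo $m[1], "\n";                              // prints: 42
```

Keeping the regex syntax in single quotes means PHP passes \d through 
untouched, so the only escaping left to think about is PCRE's own.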

I don't think the last bits are on our coding guidelines page yet - I'll 
add that soon.

>--
>Mahefa Randimbisoa (aka DotMG)


--
JavaWoman
Web Standards Compliance Officer, Wikka Development Crew
http://wikkawiki.org/JavaWoman
Skype: callto://goneagain




