# Encoding and Decoding PHP

The functions in this section transform data from one form to another. This includes stripping certain characters, substituting some characters for others, and translating data into some encoded form.

string addcslashes(string text, string characters)
The addcslashes function returns the text argument after escaping characters in the style of the C programming language. Briefly, this means special characters are replaced with codes, such as \n replacing a newline character, and other characters outside ASCII 32-126 are replaced with backslash octal codes.

The optional characters argument may contain a list of characters to be escaped, which overrides the default of escaping all special characters. The characters are specified with octal notation. You may specify a range using two periods as in the example below.

<?
// escape ASCII 0 through octal 37
$s = addcslashes($s, "\0..\37");
?>
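
As a rough illustration of the range syntax, the sketch below escapes every character from ASCII 0 through octal 37 and then restores the original with stripcslashes. The sample string and the variable names are invented for this illustration.

```php
<?php
// Sample text containing a newline and a tab (both below ASCII 32).
$raw = "Line 1\nLine 2\tend";

// Escape everything in the range ASCII 0 through octal 37.
$escaped = addcslashes($raw, "\0..\37");

// The control characters are now two-character backslash codes.
print($escaped . "\n");

// stripcslashes turns the codes back into the original characters.
$restored = stripcslashes($escaped);
print($restored === $raw ? "restored\n" : "different\n");
```

Printable characters fall outside the range, so they pass through untouched.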

string addslashes(string text)
The addslashes function returns the text argument with backslashes preceding characters that have special meaning in database queries. These are single quotes ('), double quotes ("), and backslashes themselves (\).

<?
// add slashes to text
$phrase = addslashes("I don't know");

// build query
$Query = "SELECT * ";
$Query .= "FROM comment ";
$Query .= "WHERE text like '%$phrase%'";

print($Query);
?>

string base64_decode(string data)
The base64_decode function translates data from MIME base64 encoding into 8-bit data. Base64 encoding is used for transmitting data across protocols, such as email, where raw binary data would otherwise be corrupted.

<?
$data = "VGhpcyBpcyBhIAptdWx0aS1saW5lIG1lc3NhZ2UK";
print(base64_decode($data));
?>

string base64_encode(string text)
The base64_encode function converts text, such as email, to a form that will pass through 7-bit systems uncorrupted.

<?
$text = "This is a \nmulti-line message\n";
print(base64_encode($text));
?>
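
The two base64 functions are inverses of each other, which a short round trip makes visible. This is only a minimal sketch, and the variable names are chosen for illustration.

```php
<?php
// Encode a string, then decode the result again.
$original = "This is a \nmulti-line message\n";
$encoded = base64_encode($original);
$decoded = base64_decode($encoded);

// The encoded form contains only characters that survive 7-bit transport.
print($encoded . "\n");

// Decoding yields exactly the bytes we started with.
print($decoded === $original ? "round trip ok\n" : "mismatch\n");
```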

string basename(string path)
The basename function returns only the filename part of a path. Directories are understood to be strings of numbers and letters separated by slash characters (/). When running on Windows, backslashes (\) are used as well. The flip side to this function is dirname, which returns the directory.

<?
$path = "/usr/local/bin/ls";
print(basename($path));
?>

string bin2hex(string data)
The bin2hex function returns the data argument with each byte replaced by its hexadecimal representation. Each byte becomes two hexadecimal digits, with the most significant digit first.

<?
//print book title in hex
//436f7265205048502050726f6772616d6d696e67
$s = "Core PHP Programming";
$s = bin2hex($s);
print($s);
?>

string chop(string text)
The chop function returns the text argument with any trailing whitespace removed. If you wish to remove both trailing and leading whitespace, use the trim function. If you wish to remove leading whitespace only, use ltrim. Whitespace includes spaces, tabs, and other nonprintable characters, including nulls (ASCII 0).

<?
print("\"" .
    chop("This has whitespace   ") .
    "\"");
?>

string chr(integer ascii_code)
Use chr to get the character for an ASCII code. This function is helpful for situations where you need to use a nonprinting character that has no backslash code, or where the backslash code is ambiguous. Imagine a script that writes to a formatted text file. Ordinarily you would use \n for an end-of-line marker. But the behavior may be different when your script is moved from Windows to Linux, because Windows uses a carriage return followed by a linefeed. If you wish to enforce that each line end with a linefeed only, you can use chr(10) as in the example below.
An alternative to chr is sprintf. The %c code stands for a single character, and you may specify an ASCII value for the character. Additionally, some functions, such as ereg_replace, accept integers that are interpreted as ASCII codes. If you need the ASCII code for a character, use ord.

<?
//open a test file
$fp = fopen("data.txt", "w");

//write a couple of records that have
//linefeeds for end markers
fwrite($fp, "data record 1" . chr(10));
fwrite($fp, "data record 2" . chr(10));

//close file
fclose($fp);
?>
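
The relationship between chr, sprintf, and ord described above can be sketched in a few lines. The variable names are illustrative only.

```php
<?php
// Three ways to arrive at the same linefeed character.
$fromChr = chr(10);
$fromSprintf = sprintf("%c", 10);
$fromEscape = "\n";

print($fromChr === $fromSprintf ? "chr and sprintf agree\n" : "they differ\n");
print($fromChr === $fromEscape ? "chr and the backslash code agree\n" : "they differ\n");

// ord reverses chr: it returns the code of the first character.
print(ord($fromChr) . "\n");
print(ord("A") . "\n");
print(chr(65) . "\n");
```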

string chunk_split(string data, integer length, string marker)
The chunk_split function returns the data argument after inserting an end-of-line marker at regular intervals. By default a carriage return and a linefeed are inserted every 76 characters. Optionally, you may specify a different length and a different marker string.
Sascha Schumann added this function specifically to break base64 codes up into 76-character chunks. Although ereg_replace can mimic this functionality, chunk_split is faster. It isn't appropriate for breaking prose between words. That is, it isn't intended for performing a soft wrap.

<?
$encodedData = chunk_split(base64_encode($rawData));
?>
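
To make the optional arguments concrete, here is a small sketch that uses a length of 8 and a bare linefeed as the marker. Both values are arbitrary choices so that the effect shows up on short input.

```php
<?php
// Encode a short string; the result is 28 characters long.
$encoded = base64_encode("Core PHP Programming");

// Insert "\n" after every 8 characters, including the final partial chunk.
$chunked = chunk_split($encoded, 8, "\n");
print($chunked);

// Removing the markers gives back the unbroken string.
$rejoined = str_replace("\n", "", $chunked);
print($rejoined === $encoded ? "same data\n" : "data changed\n");
```

Note that the marker also follows the last chunk, so the result always ends with it.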

string convert_cyr_string(string text, string from, string to)
Use convert_cyr_string to convert text in one Cyrillic character set to another. The from and to arguments are single-character codes.

<?

$new = convert_cyr_string($old, "a", "w");

?>

string dirname(string path)
The dirname function returns only the directory part of a path. The trailing slash is not included in the return value. Directories are understood to be separated by slashes (/). On Windows, backslashes (\) may be used, too. If you need to get the filename part of a path, use basename.

<?
$path = "/usr/local/bin/ls";
print(dirname($path));
?>

string escapeshellcmd(string command)
The escapeshellcmd function adds a backslash before any characters that may cause trouble in a shell command. This function should be used to filter user input before it is used in exec or system.
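
A minimal sketch of the idea follows. The input string is invented, the command is printed rather than executed, and the exact escape character depends on the operating system, so no particular output is claimed.

```php
<?php
// Simulated user input that tries to chain a second command.
$input = "report.txt; rm -rf /";

// Escape shell metacharacters such as the semicolon.
$safe = escapeshellcmd($input);

// Show what would be handed to exec or system.
print("ls -l " . $safe . "\n");
```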


string hebrev(string text, integer length)
Unlike English, Hebrew text reads right to left, which makes working with strings inconvenient at times. The hebrev function reverses the orientation of Hebrew text, but leaves English alone. Hebrew characters are assumed to be in the ASCII range 224 through 251, inclusive. The optional length argument specifies a maximum length per line. Lines that exceed this length are broken.

<?
print(hebrev("Hebrew"));
?>

string hebrevc(string text, integer length)
The hebrevc function operates exactly like hebrev, except that BR tags are inserted before end-of-line characters.

string htmlentities(string text)
The htmlentities function returns the text argument with certain characters translated into HTML entities. This list conforms to the ISO-8859-1 standard. The nl2br function is similar: it translates line breaks to BR tags. You can use strip_tags to remove HTML tags altogether.

<?
$text = "Use <HTML> to begin a document.";
print(htmlentities($text));
?>

string htmlspecialchars(string text)
The htmlspecialchars function works like htmlentities, except that a smaller set of entities is used. They are amp, quot, lt, and gt.

<?
$text = "Use <HTML> to begin a document.";
print(htmlspecialchars($text));
?>
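
All four entities can be seen at once with a string that contains an ampersand, double quotes, and angle brackets. The sample text is made up for this sketch.

```php
<?php
// Text containing each character that htmlspecialchars translates.
$text = 'Tom & Jerry say "Use <HTML> to begin a document."';

// & becomes &amp;, " becomes &quot;, < becomes &lt;, > becomes &gt;
$escaped = htmlspecialchars($text);
print($escaped . "\n");
```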

The ip2long function takes an IP address and returns an integer. This allows you to compress a 16-byte string into a 4-byte integer. Use long2ip to reverse the process.

Use long2ip to get the textual representation of an IP address. Use ip2long to reverse the process.
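
The two functions can be paired in a quick round trip. This sketch uses a private-range address picked for illustration; its integer form is small enough to fit in a signed 32-bit integer.

```php
<?php
// Convert a dotted address to an integer and back again.
$address = "10.0.0.1";
$number = ip2long($address);
$text = long2ip($number);

print($number . "\n"); // 10 * 16777216 + 1 = 167772161
print($text . "\n");
```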

string ltrim(string text)
The ltrim function returns the text argument with any leading whitespace removed. If you wish to remove whitespace on the end of the string, use chop. If you wish to remove whitespace from the beginning and end, use trim. Whitespace includes spaces, tabs, and other nonprintable characters, including nulls (ASCII 0).

<?
$text = "      Leading whitespace";
print("<PRE>" . ltrim($text) . "</PRE>");
?>

string nl2br(string text)
The nl2br function inserts <BR> before every newline in the text argument and returns the modified text.

<?
$text = "line1\nline2\nline3\n";
print(nl2br($text));
?>

string number_format(double value, integer precision, string decimal, string thousands)
The number_format function returns a formatted representation of the value argument as an integer with commas inserted to separate thousands. The optional precision argument specifies the number of digits after the decimal point, which by default is zero. The optional decimal and thousands arguments must be used together. They override the default use of periods and commas for decimal points and thousands separators.
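
The effect of each optional argument is easiest to see side by side. The value below is arbitrary, and the variable names are only for illustration.

```php
<?php
$value = 1234567.891;

// Defaults: no decimals, comma between thousands.
$plain = number_format($value);
print($plain . "\n"); // 1,234,568

// Two digits after the decimal point.
$twoPlaces = number_format($value, 2);
print($twoPlaces . "\n"); // 1,234,567.89

// Comma as decimal point and period between thousands.
$swapped = number_format($value, 2, ",", ".");
print($swapped . "\n"); // 1.234.567,89
```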

integer ord(string character)
The ord function returns the ASCII code of the first character in the character argument. This function allows you to deal with characters by their ASCII values, which often can be more convenient than using backslash codes, especially if you wish to take advantage of the order of the ASCII table. If you need to find the character associated with an ASCII code, use the chr function.

<?
/*
** Decompose a string into its ASCII codes.
** Test for codes below 32 because these have
** special meaning and we may not want to
** print them.
*/
$text = "Line 1\nLine 2\n";

print("ASCII Codes for '$text'<BR>\n");
print("<TABLE>\n");
for($i=0; $i < strlen($text); $i++)
{
    print("<TR>");
    print("<TH>");
    if(ord($text[$i]) > 31)
    {
        print($text[$i]);
    }
    else
    {
        print("(unprintable)");
    }
    print("</TH>");
    print("<TD>");
    print(ord($text[$i]));
    print("</TD>");
    print("</TR>\n");
}
print("</TABLE>\n");
?>

string pack(string format, ...)
The pack function takes inspiration from the Perl function of the same name. It allows you to put data in a compact format readable on all platforms. Format codes in the first argument match with the arguments that follow it. The codes determine how the values are stored. An optional number, called the repeat count, may follow the format code. It specifies how many of the following arguments to use. The repeat count may also be *, which matches the remaining arguments. Some of the codes use the repeat count differently.

<?
//create some packed data
$packedData = pack("ca10n", 65, "hello", 1970);

//display ASCII code for each character
print("<PRE>");
for($i=0; $i < strlen($packedData); $i++)
{
    print("0x" . dechex(ord($packedData[$i])) . " ");
}
print("</PRE>\n");

//unpack the data
$Data = unpack("cOne/a10Two/nThree", $packedData);

//show all elements of the unpacked array
foreach($Data as $key => $value)
{
    print("$key = $value <BR>\n");
}
?>

parse_str(string query)
The parse_str function parses the query argument as if it were an HTTP GET query. A variable is created in the current scope for each field in the query. You may wish to use this function on the output of parse_url.

<?
$query = "name=Leon&occupation=Web+Engineer";
parse_str($query);
print("$name <BR>\n");
print("$occupation <BR>\n");
?>

array parse_url(string query)
The parse_url function breaks a URL into an associative array with the following elements: fragment, host, pass, path, port, query, scheme, user. The query is not evaluated as with the parse_str function.
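
The following sketch breaks a made-up URL into its elements and then evaluates the query separately. The URL and variable names are hypothetical. The sketch also passes parse_str a second argument so the fields land in an array, which assumes a PHP version that supports that form.

```php
<?php
// A hypothetical URL that uses every element parse_url can return.
$url = "http://leon:secret@www.example.com:8080/path/page.php?name=Leon&occupation=Web+Engineer#top";

// Break the URL into its parts and list them.
$parts = parse_url($url);
foreach ($parts as $key => $value) {
    print("$key = $value\n");
}

// The query element is still one raw string; parse_str splits it.
parse_str($parts["query"], $fields);
print($fields["occupation"] . "\n");
```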


string quoted_printable_decode(string text)
The quoted_printable_decode function converts a quoted-printable string into 8-bit binary form. In quoted-printable encoding, a byte that cannot be sent as plain text is written as an equal sign followed by two hexadecimal digits, and the function turns each such code back into the byte it stands for.

This function performs the same function as imap_qprint but does not require the IMAP extension.

<?
//prints 1 + 1 = 2
$text = "1 + 1 =3D 2";
print(quoted_printable_decode($text));
?>

string quotemeta(string command_text)
The quotemeta function returns the command_text argument with backslashes preceding special characters. These characters are the period (.), backslash (\), plus sign (+), asterisk (*), question mark (?), square brackets ([ and ]), caret (^), dollar sign ($), and parentheses. Compare this function to addslashes and escapeshellcmd. If your intention is to ensure that user data will cause no harm when placed within a shell command, use escapeshellcmd.
The quotemeta function may be adequate for assembling PHP code passed to eval. Notice in the example below how characters with special meaning inside double quotes are escaped by quotemeta, thus defeating an attempt at displaying the password variable.

<?
//simulate user input
$input = '$password';

//assemble safe PHP command
$cmd = '$text = "' . quotemeta($input) . '";';

//execute command
eval($cmd);

//print new value of $text
print($text);
?>

string rawurldecode(string url_text)
The rawurldecode function returns the url_text string translated from URL format into plain text. It reverses the action of rawurlencode. This function is safe for use with binary data. The urldecode function is not.

<?
print(rawurldecode("mail%20leon%40clearink.com"));
?>

string rawurlencode(string url_text)
The rawurlencode function returns the url_text string translated into URL format. This format uses percent signs (%) to specify characters by their ASCII code, as required by the HTTP specification. This allows you to pass information in a URL that includes characters that have special meaning in URLs, such as the ampersand (&). This function is safe for use with binary data. Compare this to urlencode, which is not.

<?
print(rawurlencode("mail leon@clearink.com"));
?>

string serialize(value)
Use serialize to transform a value into an ASCII string that may be later turned back into the same value using the unserialize function. The serialized value may be stored in a file or a database for retrieval later. In fact, this function offers a great way to store complex data structures in a database without writing any special code.
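
A minimal sketch of the whole process follows. The array contents are invented, and a real script would write the string to a file or database column between the two steps.

```php
<?php
// A nested structure to be stored.
$record = array(
    "name" => "Leon",
    "languages" => array("PHP", "C"),
    "year" => 1999
);

// Turn the structure into a plain string suitable for storage.
$stored = serialize($record);
print($stored . "\n");

// Later, turn the string back into an equivalent PHP value.
$restored = unserialize($stored);
print($restored["languages"][0] . "\n");
```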

string sql_regcase(string regular_expression)
The sql_regcase function translates a case-sensitive regular expression into a case-insensitive regular expression. This is unnecessary for use with PHP's built-in regular expression functions but can be useful when creating regular expressions for external programs such as databases.

<?
//print [Mm][Oo][Zz][Ii][Ll][Ll][Aa]
print(sql_regcase("Mozilla"));
?>

string str_replace(string target, string replacement, string text)
The str_replace function attempts to replace all occurrences of target in text with replacement. This function is safe for replacing strings in binary data. It's also a much faster alternative to ereg_replace. Note that str_replace is case sensitive.

<?
$text = "Search results with keywords highlighted.";
print(str_replace("keywords", "<B>keywords</B>", $text));
?>

string strip_tags(string text, string ignore)
The strip_tags function attempts to remove all SGML tags from the text argument. This includes HTML and PHP tags. The optional ignore argument may contain tags to be left alone. This function uses the same algorithm used by fgetss. If you want to preserve tags, you may wish to use htmlentities.

<?
//create some test text
$text = "<P><B>Paragraph One</B><P>Paragraph Two";

//strip out all tags except paragraph and break
print(strip_tags($text, "<P><BR>"));
?>

string stripcslashes(string text)
The stripcslashes function complements addcslashes. It removes backslash codes that conform to the C style. See addcslashes, above, for more details.

<?
//create some test text
$text = "Line 1\\x0ALine 2\\x0A";

//convert backslashes to actual characters
print(stripcslashes($text));
?>

string stripslashes(string text)
The stripslashes function returns the text argument with backslash encoding removed. It complements addslashes. By default, PHP is configured to add slashes to user input. Use stripslashes to remove slashes before sending submitted form fields to the browser.

<?
$text = "Leon\'s Test String";
print("Before: $text<BR>\n");
print("After: " . stripslashes($text) . "<BR>\n");
?>

string strrev(string text)
The strrev function returns the text argument in reverse order.

<?
print(strrev("abcdefg"));
?>

string strtolower(string text)
The strtolower function returns the text argument with all letters changed to lowercase. Other characters are unaffected. Locale affects which characters are considered letters, and you may find that letters with accents and umlauts are being ignored.

<?
print(strtolower("Hello World"));
?>

string strtoupper(string text)
The strtoupper function returns the text argument with all letters changed to uppercase. Other characters are unaffected. Locale affects which characters are considered letters, and you may find that letters with accents and umlauts are being ignored.

<?
print(strtoupper("Hello World"));
?>

string strtr(string text, string original, string translated)
When passed three arguments, the strtr function returns the text argument with characters matching the second argument changed to those in the third argument. If original and translated aren't the same length, the extra characters are ignored.
At the time of writing, a second prototype for strtr was being planned that allows you to pass two arguments. The second argument must be an associative array. The indices specify strings to be replaced, and the values specify replacement text. If a substring matches more than one index, the longer substring will be used. The process is not iterative. That is, once substrings are replaced, they are not further matched. This function is safe to use with binary strings.

<?
$text = "Wow! This is neat.";
$original = "!.";
$translated = ".?";

// turn sincerity into sarcasm
print(strtr($text, $original, $translated));
?>

string substr_replace(string text, string replacement, integer start, integer length)
Use substr_replace to replace one substring with another. Unlike str_replace, which searches for matches, substr_replace simply removes a length of text and inserts the replacement argument. The arguments operate similarly to substr. The start argument is an index into the text argument with the first character numbered as zero. If start is negative, counting will begin at the last character of the text argument instead of the first. The number of characters replaced is determined by the optional length argument or the end of the string. If length is negative, the replaced portion stops that many characters from the end of the string. In any case, if the combination of start and length calls for a string of negative length, a single character is removed.

<?
$text = "My dog's name is Angus.";

//replace Angus with Gus
print(substr_replace($text, "Gus", 17, 5));
?>

string trim(string text)
The trim function strips whitespace from both the beginning and end of a string. Compare this function to ltrim and chop. Whitespace includes spaces, tabs, and other nonprintable characters, including nulls (ASCII 0).

<?
$text = "   whitespace   ";
print("\"" . trim($text) . "\"");
?>

string ucfirst(string text)
Use the ucfirst function to capitalize the first character of a string. Compare this function to strtoupper and ucwords. As with these other functions, your locale determines which characters are considered letters.

<?
print(ucfirst("i forgot to capitalize something."));
?>

string ucwords(string text)
Use the ucwords function to capitalize every word in a string. Compare it to strtoupper and ucfirst. As with these other functions, your locale determines which characters are considered letters.

<?
print(ucwords("core PHP programming"));
?>

array unpack(string format, string data)
The unpack function transforms data created by the pack function into an associative array. The format argument follows the same rules used for pack except that each element is separated by a slash to allow them to be named. These names are used as the keys in the returned associative array. See the pack example.

value unserialize(string data)
Use unserialize to transform serialized data back into a PHP value. The description of serialize has an example of the entire process.

string urldecode(string url_text)
The urldecode function returns the url_text string translated from URL format into plain text. It is not safe for binary data.

<?
print(urldecode("mail%20leon%40clearink.com"));
?>

string urlencode(string url_text)
The urlencode function returns the url_text string translated into URL format. This format uses percent signs (%) to specify characters by their ASCII code. This function is not safe for use with binary data.

<?
print(urlencode("mail leon@clearink.com"));
?>
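
The visible difference between the two encoders is how they treat a space, which the following sketch shows using the same sample string. The variable names are illustrative.

```php
<?php
$text = "mail leon@clearink.com";

// urlencode writes a space as a plus sign.
$plus = urlencode($text);
print($plus . "\n"); // mail+leon%40clearink.com

// rawurlencode writes a space as %20.
$raw = rawurlencode($text);
print($raw . "\n"); // mail%20leon%40clearink.com

// Each decoder reverses its matching encoder.
print(urldecode($plus) . "\n");
print(rawurldecode($raw) . "\n");
```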