A server-side application code written in any particular language often contains pieces of code written in different, foreign language. The foreign language code embedding produces sufficient additional functionality to the application.
Since the languages are using different syntax, the embedding process creates sufficient difficulties for the utilization of all the foreign language syntax features in particular source code.
We'll study this problem with two most popular foreign code embedding: SQL and RegEx.
PHP, Swift and JS permit embedding the SQL strings in source code. All the listed languages allow to use character string object as the source of SQL instruction.
Here is an example from source code, written in PHP (source):
$sql = "SELECT id, firstname, lastname FROM MyGuests";
Here is an example from Swift source code (source):
let createTableString = """
CREATE TABLE Contact(
Id INT PRIMARY KEY NOT NULL,
Name CHAR(255));
"""
let insertStatementString = "INSERT INTO Contact (Id, Name) VALUES (?, ?);"
let queryStatementString = "SELECT * FROM Contact;"
let updateStatementString = "UPDATE Contact SET Name = 'Chris' WHERE Id = 1;"
let deleteStatementStirng = "DELETE FROM Contact WHERE Id = 1;"
let malformedQueryString = "SELECT Stuff from Things WHERE Whatever;"
let querySql = "SELECT * FROM Contact WHERE Id = ?;"
One more example from source code, written in JS (source)
var sql = "CREATE TABLE customers (name VARCHAR(255), address VARCHAR(255))";
To be embedded, the SQL expression should follow multiple restrictions. For example, an embedded code must exclude any comments. Also, included in the source code, the SQL expressions cannot be nicely formatted and immediately loose their native readability:
For example consider the original version (source) of simple SQL query:
/*
Get sales rep employees
that reports to Anthony
Arguments:
none
Returns:
List of employees
*/
SELECT
lastName, firstName
FROM
employees
WHERE
reportsTo = 1143
AND jobTitle = 'Sales Rep';
Embedded in JS source code it would look similar to this piece:
var employeeListSQLString = "SELECT lastName, firstName FROM employees WHERE reportsT = 1143 AND jobTitle = 'Sales Rep';"
Because of its simplicity this line can be understandable. For comparison we will try this piece of SQL code:
/*
Report the single product price dispersion
Source: http://www.mysqltutorial.org/mysql-inner-join.aspx
Arguments:
%productCode% -- code of product. Example: 'S10_1678'
Returning:
Single product price dispersion.
*/
SELECT
orderNumber,
productName,
msrp,
priceEach
FROM
products p
INNER JOIN
orderdetails o ON p.productcode = o.productcode
AND p.msrp > o.priceEach # report only if less than MSRP
WHERE
p.productcode = %productCode%;
After its linearization for embedding:
var productPriceDispersionSQL = "SELECT orderNumber, productName, msrp, priceEach FROM products p INNER JOIN orderdetails o ON p.productcode = o.productcode AND p.msrp > o.priceEach WHERE p.productcode = %productCode%;"
One can notice, that the complexity of reading the line increased dramatically.
Also, the embedded code can lose its functionality. For example, this SQL code utilizes the functionality of version-dependent command:
CREATE TABLE t1 (
k INT AUTO_INCREMENT,
KEY (k)
) /*!50110 KEY_BLOCK_SIZE=1024; */
The process of embedding catastrophically increases the complexity of the RegEx code understanding.
For example, the original source of multi-line and richly commented RegEx code can be readable and understandable from the first sight (source):
#
# Match a 20th or 21st century date in
# yyyy-mm-dd format
#
(19|20)\d\d # year (group 1)
[- /.] # separator
(0[1-9]|1[012]) # month (group 2)
[- /.] # separator
(0[1-9]|[12][0-9]|3[01]) # day (group 3)
An embedding of the code above inside a JS string, transforms it into "write-only" version:
var dateMatchRegEx = "(19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])"
In the code below, the reader can notice the http
marker in this piece of JS code:
var expression = /https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)/gi;
The marker http
may give a hint about the source of expression, but it is impossible to recognize the details of the rest of the expression string.
One should avoid embedding any foreign code in the native source code.
One can store the foreign code, like SQL or RegEx, in external files or storage, and use class-like wrappers to recall or extract the source from the native file or resource.