Preface

This style guide is a distillation of my own experience with additional examples and verbiage compiled from several sources, including:

In many cases, the above cited references conflict with each other, and with what I have observed as preferable practice. In those situations, I've done my best to state the justification for the recommendation I made.

Introduction

Source code development usually involves balancing requirements, some of which conflict. For example, from a computer's perspective, the best source code requires the least interpretation at run time. From a network's perspective, the best source code transmits the fewest total bytes. For someone doing maintenance, source code that is neatly formatted, well commented, and blocked into logical groups of statements takes the least effort to work with, and is therefore most efficient. For the original programmer, writing code in a creative streak, using only enough structure to keep the code organized in their mind will seem to be the most efficient pattern, until it's time to debug the code and to ensure all of the requirements are met. When that change of mindset occurs, the original programmer suddenly finds their needs to be quite similar to those of the maintainer, and stopping to fix the code structure and add documentation makes sense.

For compiled languages such as C and C++, the conflict between the formatted source and least interpretation requirements is resolved by the compiler, but that adds at least one more step to the development process, and the resulting binary runs on only one type of platform. Languages such as PHP and Python use a run-time "compiler", so they don't require additional steps in development, but the "compiler" must be run each time the code is used, generally producing "bytecodes" that are then interpreted by a virtual machine. (Java and C# also run bytecode on a virtual machine, although their source is normally compiled to bytecode ahead of time.) This reduces portability issues but increases the processing requirements. If a bytecode cache is used, the compiler only has to be run when a source code change is detected, but the bytecode interpreter is still generally less efficient than running native code directly on the underlying hardware. In most cases, the difference is negligible, and the increased portability is used to justify the loss of efficiency.

Network efficiency is most often increased by using compression to reduce the amount of data transmitted. However, any compression algorithm has to be in place on both ends of the wire (the receiver must use a matching decompression algorithm to recover the original data without loss or corruption), and it increases the computation overhead. Compression settings are generally established in server configuration, and are therefore outside the scope of program development. What an application developer does need to be concerned with is reducing the total amount of data presented to the compressor in the first place. Eliminating extraneous whitespace may not make much of a difference on a page that is viewed only infrequently, but for a Web site with millions of page views a month, it could reduce network traffic by as much as 10%.
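As a rough illustration of how much of a payload can be whitespace, the following sketch counts the bytes in a small fragment of markup before and after its leading indentation is stripped. The markup is a made-up sample, and the measurement is of the uncompressed text only; the saving after compression would normally be smaller.

```php
<?php
// Hypothetical fragment of generated markup, indented with spaces.
$formatted = "<ul>\n"
           . "        <li>One</li>\n"
           . "        <li>Two</li>\n"
           . "</ul>\n";

// Strip leading spaces and tabs from every line (the "m" flag makes ^ match at each line start).
$stripped = preg_replace('/^[ \t]+/m', '', $formatted);

$before = strlen($formatted);
$after  = strlen($stripped);

printf("formatted: %d bytes, stripped: %d bytes, saved: %d bytes\n",
       $before, $after, $before - $after);
```

Running a count like this against real pages is one way to decide whether the whitespace is worth worrying about at all.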

In order to decide where the "best" balance can be found, a major consideration is how the lowest total lifetime cost of a piece of software can be achieved. For long-lived products, neatly formatted and well commented source code matters a great deal, because the cost of maintenance skyrockets every time someone has to figure out what the existing code is doing. Over the life of a properly maintained program, maintenance costs can be expected to far outweigh the original cost of development. For a throw-away piece of code that will only be used once, it may make sense to forgo formatting and documentation, but doing so can lead to bad programming habits. If the code is retained as part of the documentation, or could be used "occasionally" rather than once, it should be treated as carefully as any other part of the system.

Similarly, expending original development effort to achieve maximum network performance versus reducing development time should be decided in light of reducing the total lifetime cost of the application. With a very high traffic Web site, sending neatly formatted HTML code to the browser could significantly increase transmission costs, especially for uncompressed traffic. However, the higher cost of discovering an error in an unformatted page could be more than the savings from eliminating whitespace sent to the browser. As the number of times a particular page is served declines, the savings from eliminating HTML formatting drop, but maintenance costs are likely to stay the same. Thus it makes sense to try to send properly formatted HTML code when serving Web pages in nearly every case. Most of what a Web developer can do to reduce network overhead involves things like using tabs rather than spaces, and commenting out code in PHP rather than in HTML so that the comments are never transmitted.
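The difference between the two kinds of comment can be seen in a minimal sketch like the one below. It captures the page output in a buffer (something a real page would not normally need to do) so the transmitted text can be inspected. The comment wording and the paragraph are placeholders.

```php
<?php
// Capture what would be sent to the browser so it can be inspected.
ob_start();
?>
<!-- HTML comment: this line is transmitted to every visitor -->
<?php // PHP comment: removed on the server, never transmitted ?>
<p>Hello</p>
<?php
$sent = ob_get_clean();

echo strlen($sent), " bytes would be sent\n";
```

Only the HTML comment and the paragraph end up in `$sent`; the PHP comment costs nothing on the wire.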

Balancing requirements for source formatting, computation and network overhead is only one example of resolving conflicts during program development. Other conflicts are beyond the scope of this document, so these three are the ones addressed here.

Miscellaneous Considerations

Consistent user experience

Within an application, pages should have a consistent look-and-feel so a new user can quickly learn their way around. Having pages that react and/or display information in similar ways makes finding information and using it easier for both novices and experienced users.

Readability

While the original author of a Web page may have a clear understanding of how the code is constructed when they first finish writing it, the structure and functionality may be obscure to another developer who is given the task of debugging or enhancing the page as interactions with other resources and requirements change. Even the original author may have difficulty following the code after a significant time has passed and they have been working on other tasks. This is one of the major reasons why software maintenance costs frequently far outweigh the costs of development. Following this Style Guide when writing source files will make their structure easier to follow.

The original source code for any given program should be human readable. The parsers, interpreters and/or compilers for nearly every programming language ignore whitespace outside of quoted strings. There are rare exceptions, e.g. Python, where indentation rather than delimiters or control structure keywords is used to define logical structure. In most other languages, however, whitespace is allowed purely for the convenience of human readers and writers. Use the whitespace to make your source code easier to comprehend. Not only will it make maintaining the program easier, but it will also make it much simpler to look at the code as it is being written and see that it functions correctly, which keeps even large, complex projects feasible.

While it is frequently used to perform server operations not directly visible to a user, PHP is also used to generate source code (HTML, JavaScript, CSS) that is sent to a browser for interpretation. Users occasionally look at the source code for Web pages, and skilled QA personnel and maintainers will use the source sent to the browser as a tool in their efforts. Without paying attention, it is trivial to get PHP to create Web source code that is nearly incomprehensible: badly formatted HTML with random line lengths and no rhyme or reason for use of whitespace is a frequent result.

There is no excuse for such a mess: PHP can and should be used to generate HTML code that is just as readable as the original PHP code itself. It doesn't take a lot more work to ensure the generated code is properly formatted in the first place. Once it's done, no more work is required to create comprehensible HTML code unless the PHP source is changed: The server will just as happily emit properly formatted Web source code as not. Use the power of automation to write a better Web!
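One way to keep generated markup readable is to build it with explicit indentation and line breaks. The sketch below is only an illustration: render_list() and its parameters are hypothetical names, the menu items are invented, and the type declarations assume PHP 7 or later.

```php
<?php
/**
 * Render an unordered list, indented with one tab per nesting level.
 * render_list() is a hypothetical helper written for this sketch.
 */
function render_list(array $items, int $depth = 0): string
{
    $pad  = str_repeat("\t", $depth);
    $html = $pad . "<ul>\n";
    foreach ($items as $item) {
        // Escape the text so the generated markup stays valid.
        $html .= $pad . "\t<li>" . htmlspecialchars($item, ENT_QUOTES, 'UTF-8') . "</li>\n";
    }
    return $html . $pad . "</ul>\n";
}

// Emit the list one level deep, as it might appear inside a <body> element.
echo render_list(['Home', 'Courses & Programs', 'Contact'], 1);
```

Because the indentation lives in one helper, every list the application emits is formatted the same way without further effort.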

XHTML compliance

As of this writing (May 2013), most of MIT Sloan’s existing PHP Web pages include a DOCTYPE specification indicating they are compliant with XHTML 1.0 Strict, the most restrictive HTML standard. However, a very large number of the pages are not actually compliant with the DTD that defines the standard, which could result in cross-browser compatibility issues: The fact that some (or most) browsers ignore or fix up coding errors doesn't mean they all do (e.g., Internet Explorer frequently has its own ideas of how things should work). Validated HTML code is most likely to function in the greatest number of browsers. Using a validation tool during development, such as the HTML Validator add-on for Firefox, is strongly advisable to eliminate errors and warnings.

PHP errors

Code must run error free, and must not rely on warnings and notices being hidden to meet this requirement. For instance, never access a variable that you did not set yourself (such as $_POST array keys) without first checking it with isset(). When developing code, check the Apache error log frequently to catch problems early, and eliminate warnings and errors as they are introduced.
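A minimal sketch of the isset() check follows. The helper name post_string() and the form field name 'email' are hypothetical, and the type declarations assume PHP 7 or later.

```php
<?php
/**
 * Return a trimmed string from $_POST, or a default when the key is
 * missing or is not a string. post_string() is a hypothetical helper.
 */
function post_string(string $key, string $default = ''): string
{
    if (isset($_POST[$key]) && is_string($_POST[$key])) {
        return trim($_POST[$key]);
    }
    return $default;
}

// 'email' is an example field name only.
$email = post_string('email');
echo $email === '' ? "No email address was submitted.\n" : "Email received.\n";
```

Reading the key through a helper like this means a missing field never raises a notice or warning, and every caller handles the "not submitted" case the same way.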

PHP should be configured to report as many errors and warnings as possible: Use E_ALL | E_NOTICE, or preferably E_ALL | E_NOTICE | E_STRICT. (E_ALL already includes E_NOTICE, and from PHP 5.4 onward it includes E_STRICT as well.) With error logging enabled and display errors disabled, even a badly written script won't tell the world everything that's wrong, but the problems will be logged so they can be corrected. If a sufficiently high level of reporting is not configured in php.ini, an error_reporting() call at the start of each script can fix the problem.
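At the top of a script, the reporting setup described above might look like the following sketch. It uses E_ALL alone on the assumption of PHP 5.4 or later, where E_ALL covers E_STRICT; on older versions E_ALL | E_STRICT would be needed. Setting these values in php.ini is preferable where that is possible.

```php
<?php
// Report everything. E_ALL already includes E_NOTICE.
error_reporting(E_ALL);

// Log problems for the developer instead of showing them to visitors.
ini_set('display_errors', '0');
ini_set('log_errors', '1');
```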

Debugging code

No debugging code may be left in place when pushing to the production server unless it is commented out: no var_dump() or print_r() calls, and no die() or exit() calls that were used solely during development.

Example URLs

Use example.com, example.org and example.net for all example URLs and email addresses, per RFC 2606.

Copyright declarations

Every source file should have a copyright declaration, even if it is a copyleft license notice or a dedication of the code to the public domain. This will avoid ambiguities about the author's intent. Note that under U.S. copyright law, works published before March 1, 1989 without a copyright notice could lose protection; a notice is no longer legally required, but it still makes the claim unambiguous. Therefore, if any copyrights are going to be reserved, any documents (including source code) should have a copyright declaration affixed before they are first made publicly available.

Similarly, every Web page sent to a browser should have a copyright declaration embedded in the HTML <head> section of the document and a visible copyright claim statement if copyrights are being reserved for the page and/or its contents.
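One possible way to emit both declarations from a single source is sketched below. The helper copyright_notice(), the holder name, the "copyright" meta name and the CSS class are all assumptions made for illustration; this guide does not prescribe a particular form. The type declarations assume PHP 7 or later.

```php
<?php
/**
 * Build an HTML-escaped copyright notice.
 * copyright_notice() is a hypothetical helper written for this sketch.
 */
function copyright_notice(string $holder, int $year): string
{
    return htmlspecialchars("Copyright $year $holder. All rights reserved.", ENT_QUOTES, 'UTF-8');
}

// 'Example Organization' is a placeholder for the real rights holder.
$notice = copyright_notice('Example Organization', (int) date('Y'));

echo "<head>\n";
echo "\t<meta name=\"copyright\" content=\"$notice\" />\n";  // embedded declaration
echo "</head>\n";
echo "<body>\n";
echo "\t<p class=\"copyright\">$notice</p>\n";               // visible declaration
echo "</body>\n";
```

Generating both from one string keeps the embedded and visible declarations from drifting apart.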

REF: http://mitsloan.mit.edu/shared/content/PHP_Code_Style_Guide.php

MIT PHP Code Style Guide