Chapter 1: Clean Code
The Art of Clean Code?
Let’s say you believe that messy code is a significant impediment. Let’s say that you accept that the only way to go fast is to keep your code clean. Then you must ask yourself: “How do I write clean code?” It’s no good trying to write clean code if you don’t know what it means for code to be clean!
Books on art don’t promise to make you an artist. All they can do is give you some of the tools, techniques, and thought processes that other artists have used. So too this book cannot promise to make you a good programmer. It cannot promise to give you “code-sense.” All it can do is show you the thought processes of good programmers and the tricks, techniques, and tools that they use.
Chapter 2: Meaningful Names
Class Names
Classes and objects should have noun or noun phrase names like Customer, WikiPage, Account, and AddressParser. Avoid words like Manager, Processor, Data, or Info in the name of a class. A class name should not be a verb.
Method Names
Methods should have verb or verb phrase names like postPayment, deletePage, or save. Accessors, mutators, and predicates should be named for their value and prefixed with get, set, and is according to the JavaBean standard.
When constructors are overloaded, use static factory methods with names that describe the arguments. For example,
Complex fulcrumPoint = Complex.FromRealNumber(23.0);
is generally better than
Complex fulcrumPoint = new Complex(23.0);
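A minimal sketch of how such a factory might be declared. Only the name FromRealNumber comes from the text (standard Java style would spell it fromRealNumber); the fields, the accessors, and the second factory fromPolar are assumptions added for illustration.

```java
// Hypothetical minimal Complex class; only the FromRealNumber name comes from the text.
final class Complex {
    private final double real;
    private final double imaginary;

    // Private constructor: callers must go through a named factory.
    private Complex(double real, double imaginary) {
        this.real = real;
        this.imaginary = imaginary;
    }

    // Name kept as spelled in the text; usual Java style would be fromRealNumber.
    static Complex FromRealNumber(double real) {
        return new Complex(real, 0.0);
    }

    // Assumed second factory: shows how names tell apart two (double, ...) creations.
    static Complex fromPolar(double magnitude, double angleRadians) {
        return new Complex(magnitude * Math.cos(angleRadians),
                           magnitude * Math.sin(angleRadians));
    }

    double real() { return real; }
    double imaginary() { return imaginary; }
}
```

Making the constructor private is one way to enforce use of the named factories; the text itself only says the factory form is generally better.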
Avoid Encodings
Interfaces and Implementations
These are sometimes a special case for encodings. For example, say you are building an ABSTRACT FACTORY for the creation of shapes. This factory will be an interface and will be implemented by a concrete class. What should you name them? IShapeFactory and ShapeFactory? I prefer to leave interfaces unadorned. The preceding I, so common in today’s legacy wads, is a distraction at best and too much information at worst. I don’t want my users knowing that I’m handing them an interface. I just want them to know that it’s a ShapeFactory. So if I must encode either the interface or the implementation, I choose the implementation. Calling it ShapeFactoryImp, or even the hideous CShapeFactory, is preferable to encoding the interface.
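As a rough sketch of that naming choice: the names ShapeFactory and ShapeFactoryImp come from the text, while the Shape type and the makeCircle and makeSquare methods are hypothetical and exist only to make the example compile.

```java
// Hypothetical Shape type, introduced only for this sketch.
interface Shape {
    String name();
}

// The interface carries the clean name that callers see.
interface ShapeFactory {
    Shape makeCircle();
    Shape makeSquare();
}

// The implementation carries the encoding, as the text prefers.
final class ShapeFactoryImp implements ShapeFactory {
    @Override
    public Shape makeCircle() {
        return () -> "circle";
    }

    @Override
    public Shape makeSquare() {
        return () -> "square";
    }
}
```

Client code declares variables of type ShapeFactory and never needs to mention ShapeFactoryImp outside the place where the factory is constructed.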
Chapter 3: Functions
Small
The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that.
Do One Thing & Use Descriptive Names
FUNCTIONS SHOULD DO ONE THING. THEY SHOULD DO IT WELL. THEY SHOULD DO IT ONLY.
It is hard to overestimate the value of good names. Remember Ward’s principle: “You know you are working on clean code when each routine turns out to be pretty much what you expected.” Half the battle to achieving that principle is choosing good names for small functions that do one thing.
Function Arguments
The ideal number of arguments for a function is zero (niladic). Next comes one (monadic), followed closely by two (dyadic). Three arguments (triadic) should be avoided where possible. More than three (polyadic) requires very special justification—and then shouldn’t be used anyway.
Even obvious dyadic functions like assertEquals(expected, actual) are problematic. How many times have you put the actual where the expected should be? The two arguments have no natural ordering. The expected, actual ordering is a convention that requires practice to learn.
Arguments are most naturally interpreted as inputs to a function.
appendFooter(s);
Does this function append s as the footer to something? Or does it append some footer to s? Is s an input or an output?
report.appendFooter();
In general, output arguments should be avoided. If your function must change the state of something, have it change the state of its owning object.
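A small sketch of the second form. Only the call report.appendFooter() comes from the text; the Report class, its constructor, and the footer wording are assumptions.

```java
// Hypothetical Report class; only the appendFooter name comes from the text.
final class Report {
    private final StringBuilder body = new StringBuilder();

    Report(String text) {
        body.append(text);
    }

    // No argument needed: the method changes the state of its owning object.
    void appendFooter() {
        body.append("\n-- end of report --");
    }

    String text() {
        return body.toString();
    }
}
```

With this shape there is no s to puzzle over: the object being modified is the one the method is called on.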
Chapter 4: Comments
Comments Do Not Make Up for Bad Code
One of the more common motivations for writing comments is bad code. We write a module and we know it is confusing and disorganized. We know it’s a mess. So we say to ourselves, “Ooh, I’d better comment that!” No! You’d better clean it!
Don’t Use a Comment When You Can Use a Function or a Variable
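One way to read this heading, sketched with a hypothetical Employee class: rather than commenting a raw condition such as a flag test combined with an age test, give the condition a name. The fields, constants, and the eligibility rule itself are assumptions for illustration.

```java
// Hypothetical Employee class used only to illustrate the heading above.
final class Employee {
    static final int HOURLY_FLAG = 0x1;
    static final int RETIREMENT_AGE = 65;

    private final int flags;
    private final int age;

    Employee(int flags, int age) {
        this.flags = flags;
        this.age = age;
    }

    // The method name says what a comment would otherwise have to explain.
    boolean isEligibleForFullBenefits() {
        boolean isHourly = (flags & HOURLY_FLAG) != 0;
        boolean isPastRetirementAge = age > RETIREMENT_AGE;
        return isHourly && isPastRetirementAge;
    }
}
```

A caller then writes if (employee.isEligibleForFullBenefits()) and needs no comment to explain the check.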
Chapter 5: Formatting
The Purpose of Formatting
Code formatting is important. It is too important to ignore and it is too important to treat religiously. Code formatting is about communication, and communication is the professional developer’s first order of business.
Chapter 6: Objects and Data Structures
Data/Object Anti-Symmetry
Procedural code (code using data structures) makes it easy to add new functions without changing the existing data structures. OO code, on the other hand, makes it easy to add new classes without changing existing functions.
The complement is also true:
Procedural code makes it hard to add new data structures because all the functions must change. OO code makes it hard to add new functions because all the classes must change.
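The trade-off can be sketched with two versions of the same area calculation. All class names here are hypothetical; the point is only where a change lands in each style.

```java
// Procedural style: plain data structures plus a function that switches on type.
final class SquareData {
    double side;
}

final class CircleData {
    double radius;
}

final class Geometry {
    // Adding a perimeter() function here touches no data structure,
    // but adding a new shape means editing every function like this one.
    static double area(Object shape) {
        if (shape instanceof SquareData) {
            SquareData s = (SquareData) shape;
            return s.side * s.side;
        }
        if (shape instanceof CircleData) {
            CircleData c = (CircleData) shape;
            return Math.PI * c.radius * c.radius;
        }
        throw new IllegalArgumentException("unknown shape: " + shape);
    }
}

// Object-oriented style: each class hides its data behind behavior.
interface Shape {
    // Adding a new Shape class touches no existing code,
    // but adding perimeter() here means editing every implementing class.
    double area();
}

final class Square implements Shape {
    private final double side;
    Square(double side) { this.side = side; }
    @Override public double area() { return side * side; }
}

final class Circle implements Shape {
    private final double radius;
    Circle(double radius) { this.radius = radius; }
    @Override public double area() { return Math.PI * radius * radius; }
}
```

Neither version is better in general; which one fits depends on whether new types or new functions are the more likely change.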
The Law of Demeter
There is a well-known heuristic called the Law of Demeter that says a module should not know about the innards of the objects it manipulates.
More precisely, the Law of Demeter says that a method f of a class C should only call the methods of these:
• C
• An object created by f
• An object passed as an argument to f
• An object held in an instance variable of C
The method should not invoke methods on objects that are returned by any of the allowed functions. In other words, talk to friends, not to strangers.
Train Wrecks
final String outputDir = ctxt.getOptions().getScratchDir().getAbsolutePath();
This kind of code is often called a train wreck because it looks like a bunch of coupled train cars. Chains of calls like this are generally considered to be sloppy style and should be avoided. It is usually best to split them up as follows:
Options opts = ctxt.getOptions();
File scratchDir = opts.getScratchDir();
final String outputDir = scratchDir.getAbsolutePath();
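Splitting the chain still leaves the caller navigating through three objects. If ctxt is a real object rather than a plain data structure, another option is to ask it to do the work. The sketch below assumes hypothetical Context and Options classes and a made-up method scratchFileFor; none of these are the actual API behind the snippet above.

```java
import java.io.File;

// Hypothetical stand-ins for the ctxt and options objects in the snippet above.
final class Options {
    private final File scratchDir;
    Options(File scratchDir) { this.scratchDir = scratchDir; }
    File getScratchDir() { return scratchDir; }
}

final class Context {
    private final Options options;
    Context(Options options) { this.options = options; }

    // The context does the navigation itself, so callers talk only to
    // their immediate friend and never see Options or the scratch directory.
    File scratchFileFor(String fileName) {
        return new File(options.getScratchDir(), fileName);
    }
}
```

Inside scratchFileFor, the method calls only an object held in an instance variable and an object it creates, which is within the Law of Demeter as listed above.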
Conclusion
Objects expose behavior and hide data. This makes it easy to add new kinds of objects without changing existing behaviors. It also makes it hard to add new behaviors to existing objects. Data structures expose data and have no significant behavior. This makes it easy to add new behaviors to existing data structures but makes it hard to add new data structures to existing functions.
In any given system we will sometimes want the flexibility to add new data types, and so we prefer objects for that part of the system. Other times we will want the flexibility to add new behaviors, and so in that part of the system we prefer data types and procedures. Good software developers understand these issues without prejudice and choose the approach that is best for the job at hand.
Chapter 7: Error Handling
Error handling is important, but if it obscures logic, it’s wrong.
Use Exceptions Rather Than Return Codes
Use Unchecked Exceptions
The debate is over. For years Java programmers have debated over the benefits and liabilities of checked exceptions. When checked exceptions were introduced in the first version of Java, they seemed like a great idea. The signature of every method would list all of the exceptions that it could pass to its caller. Moreover, these exceptions were part of the type of the method. Your code literally wouldn’t compile if the signature didn’t match what your code could do.
At the time, we thought that checked exceptions were a great idea; and yes, they can yield some benefit. However, it is clear now that they aren’t necessary for the production of robust software. C# doesn’t have checked exceptions, and despite valiant attempts, C++ doesn’t either. Neither do Python or Ruby. Yet it is possible to write robust software in all of these languages. Because that is the case, we have to decide—really—whether checked exceptions are worth their price.
What price? The price of checked exceptions is an Open/Closed Principle violation. If you throw a checked exception from a method in your code and the catch is three levels above, you must declare that exception in the signature of each method between you and the catch. This means that a change at a low level of the software can force signature changes on many higher levels. The changed modules must be rebuilt and redeployed, even though nothing they care about changed.
Checked exceptions can sometimes be useful if you are writing a critical library: You must catch them. But in general application development the dependency costs outweigh the benefits.
Define Exception Classes in Terms of a Caller’s Needs
In fact, wrapping third-party APIs is a best practice. When you wrap a third-party API, you minimize your dependencies upon it: You can choose to move to a different library in the future without much penalty. Wrapping also makes it easier to mock out third-party calls when you are testing your own code. One final advantage of wrapping is that you aren’t tied to a particular vendor’s API design choices. You can define an API that you feel comfortable with.
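A minimal sketch of such a wrapper. VendorPort and its two checked exceptions are hypothetical stand-ins for a vendor library, and LocalPort and PortDeviceFailure are illustrative names for our own code; making PortDeviceFailure unchecked is an assumption that follows the preceding section.

```java
// Hypothetical third-party exceptions and class, standing in for a vendor library.
class DeviceResponseException extends Exception {
    DeviceResponseException(String message) { super(message); }
}

class DeviceLockedException extends Exception {
    DeviceLockedException(String message) { super(message); }
}

class VendorPort {
    private final boolean locked;
    VendorPort(boolean locked) { this.locked = locked; }

    void open() throws DeviceResponseException, DeviceLockedException {
        if (locked) {
            throw new DeviceLockedException("port is locked");
        }
    }
}

// Our own exception type, defined in terms of what callers need to know.
class PortDeviceFailure extends RuntimeException {
    PortDeviceFailure(Exception cause) { super(cause); }
}

// The wrapper is the only class that knows about the vendor's exception types.
class LocalPort {
    private final VendorPort innerPort;
    LocalPort(VendorPort innerPort) { this.innerPort = innerPort; }

    void open() {
        try {
            innerPort.open();
        } catch (DeviceResponseException | DeviceLockedException e) {
            throw new PortDeviceFailure(e);
        }
    }
}
```

Callers handle one failure type, and swapping the vendor library means changing LocalPort only.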
Don’t Return Null & Don’t Pass Null
When we return null, we are essentially creating work for ourselves and foisting problems upon our callers. All it takes is one missing null check to send an application spinning out of control.
It’s easy to say that the problem with the code above is that it is missing a null check, but in actuality, the problem is that it has too many. If you are tempted to return null from a method, consider throwing an exception or returning a SPECIAL CASE object instead. If you are calling a null-returning method from a third-party API, consider wrapping that method with a method that either throws an exception or returns a special case object.
Returning null from methods is bad, but passing null into methods is worse. Unless you are working with an API which expects you to pass null, you should avoid passing null in your code whenever possible.
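A minimal sketch of returning a special case instead of null, here an empty list. The Employee and EmployeeDirectory classes are hypothetical; Collections.emptyList() is the standard library call.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical payroll types, used only to illustrate the advice above.
final class Employee {
    private final int pay;
    Employee(int pay) { this.pay = pay; }
    int getPay() { return pay; }
}

final class EmployeeDirectory {
    private final Map<String, List<Employee>> byDepartment = new HashMap<>();

    void add(String department, Employee employee) {
        byDepartment.computeIfAbsent(department, key -> new ArrayList<>()).add(employee);
    }

    // Never returns null: an unknown department yields an empty list.
    List<Employee> getEmployees(String department) {
        List<Employee> found = byDepartment.get(department);
        return (found == null) ? Collections.<Employee>emptyList() : found;
    }

    // The caller can iterate without any null check.
    int totalPay(String department) {
        int total = 0;
        for (Employee employee : getEmployees(department)) {
            total += employee.getPay();
        }
        return total;
    }
}
```

Because getEmployees always returns a list, totalPay needs no guard and an unknown department simply sums to zero.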
Chapter 8: Boundaries
Learning Tests Are Better Than Free
Learning tests verify that the third-party packages we are using work the way we expect them to. Once integrated, there are no guarantees that the third-party code will stay compatible with our needs. The original authors will have pressures to change their code to meet new needs of their own. They will fix bugs and add new capabilities. With each release comes new risk. If the third-party package changes in some way incompatible with our tests, we will find out right away.
Whether you need the learning provided by the learning tests or not, a clean boundary should be supported by a set of outbound tests that exercise the interface the same way the production code does. Without these boundary tests to ease the migration, we might be tempted to stay with the old version longer than we should.
Clean Boundaries
Interesting things happen at boundaries. Change is one of those things. Good software designs accommodate change without huge investments and rework. When we use code that is out of our control, special care must be taken to protect our investment and make sure future change is not too costly.
Code at the boundaries needs clear separation and tests that define expectations. We should avoid letting too much of our code know about the third-party particulars. It’s better to depend on something you control than on something you don’t control, lest it end up controlling you.
We manage third-party boundaries by having very few places in the code that refer to them. We may wrap them as we did with Map, or we may use an ADAPTER to convert from our perfect interface to the provided interface. Either way our code speaks to us better, promotes internally consistent usage across the boundary, and has fewer maintenance points when the third-party code changes.
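A sketch of what wrapping a Map could look like. The Sensor and Sensors classes and their methods are assumptions for illustration, not necessarily the code the text refers to.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain type for this sketch.
final class Sensor {
    private final String id;
    Sensor(String id) { this.id = id; }
    String id() { return id; }
}

// The Map is an implementation detail; the rest of the code sees only Sensors.
final class Sensors {
    private final Map<String, Sensor> sensorsById = new HashMap<>();

    void register(Sensor sensor) {
        sensorsById.put(sensor.id(), sensor);
    }

    // Fails loudly instead of leaking the Map's null-on-miss behavior.
    Sensor getById(String id) {
        Sensor sensor = sensorsById.get(id);
        if (sensor == null) {
            throw new IllegalArgumentException("no sensor with id " + id);
        }
        return sensor;
    }
}
```

Only Sensors refers to java.util.Map, so a change in how sensors are stored has a single maintenance point.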
Chapter 9: Unit Tests
The Three Laws of TDD
By now everyone knows that TDD asks us to write unit tests first, before we write production code. But that rule is just the tip of the iceberg. Consider the following three laws:
First Law: You may not write production code until you have written a failing unit test.
Second Law: You may not write more of a unit test than is sufficient to fail, and not compiling is failing.
Third Law: You may not write more production code than is sufficient to pass the currently failing test.
Keeping Tests Clean
Test code is just as important as production code. It is not a second-class citizen. It requires thought, design, and care. It must be kept as clean as production code.
Tests Enable the -ilities
If you don’t keep your tests clean, you will lose them. And without them, you lose the very thing that keeps your production code flexible. Yes, you read that correctly. It is unit tests that keep our code flexible, maintainable, and reusable. The reason is simple. If you have tests, you do not fear making changes to the code! Without tests every change is a possible bug. No matter how flexible your architecture is, no matter how nicely partitioned your design, without tests you will be reluctant to make changes because of the fear that you will introduce undetected bugs.
Clean Tests
What makes a clean test? Three things. Readability, readability, and readability. Readability is perhaps even more important in unit tests than it is in production code. What makes tests readable? The same thing that makes all code readable: clarity, simplicity, and density of expression. In a test you want to say a lot with as few expressions as possible.
The BUILD-OPERATE-CHECK pattern is made obvious by the structure of these tests. Each of the tests is clearly split into three parts. The first part builds up the test data, the second part operates on that test data, and the third part checks that the operation yielded the expected results.
One Assert per Test
There is a school of thought that says that every test function in a JUnit test should have one and only one assert statement. This rule may seem draconian, but the advantage is that those tests come to a single conclusion that is quick and easy to understand.
Notice that I have changed the names of the functions to use the common given-when-then convention. This makes the tests even easier to read. Unfortunately, splitting the tests as shown results in a lot of duplicate code.
I think the single assert rule is a good guideline. I usually try to create a domain-specific testing language that supports it. But I am not afraid to put more than one assert in a test. I think the best thing we can say is that the number of asserts in a test ought to be minimized.
Single Concept per Test
Perhaps a better rule is that we want to test a single concept in each test function. We don’t want long test functions that go testing one miscellaneous thing after another.
Clean tests follow five other rules that form the acronym F.I.R.S.T.:
Fast: Tests should be fast. They should run quickly. When tests run slow, you won’t want to run them frequently. If you don’t run them frequently, you won’t find problems early enough to fix them easily. You won’t feel as free to clean up the code. Eventually the code will begin to rot.
Independent: Tests should not depend on each other. One test should not set up the conditions for the next test. You should be able to run each test independently and run the tests in any order you like. When tests depend on each other, then the first one to fail causes a cascade of downstream failures, making diagnosis difficult and hiding downstream defects.
Repeatable: Tests should be repeatable in any environment. You should be able to run the tests in the production environment, in the QA environment, and on your laptop while riding home on the train without a network. If your tests aren’t repeatable in any environment, then you’ll always have an excuse for why they fail. You’ll also find yourself unable to run the tests when the environment isn’t available.
Self-Validating: The tests should have a boolean output. Either they pass or fail. You should not have to read through a log file to tell whether the tests pass. You should not have to manually compare two different text files to see whether the tests pass. If the tests aren’t self-validating, then failure can become subjective and running the tests can require a long manual evaluation.
Timely: The tests need to be written in a timely fashion. Unit tests should be written just before the production code that makes them pass. If you write tests after the production code, then you may find the production code to be hard to test. You may decide that some production code is too hard to test. You may not design the production code to be testable.
Chapter 10: Classes
Class Organization
Following the standard Java convention, a class should begin with a list of variables. Public static constants, if any, should come first. Then private static variables, followed by private instance variables. There is seldom a good reason to have a public variable.
Public functions should follow the list of variables. We like to put the private utilities called by a public function right after the public function itself. This follows the stepdown rule and helps the program read like a newspaper article.
Classes Should Be Small!
With functions we measured size by counting physical lines. With classes we use a different measure. We count responsibilities.
The name of a class should describe what responsibilities it fulfills. In fact, naming is probably the first way of helping determine class size. If we cannot derive a concise name for a class, then it’s likely too large.
The Single Responsibility Principle
The Single Responsibility Principle (SRP) states that a class or module should have one, and only one, reason to change. This principle gives us both a definition of responsibility and a guideline for class size. Classes should have one responsibility: one reason to change.
By minimizing coupling in this way, our classes adhere to another class design principle known as the Dependency Inversion Principle (DIP). In essence, the DIP says that our classes should depend upon abstractions, not on concrete details.
Chapter 11: Systems
Separate Constructing a System from Using It
Software systems should separate the startup process, when the application objects are constructed and the dependencies are “wired” together, from the runtime logic that takes over after startup.
Dependency Injection
A powerful mechanism for separating construction from use is Dependency Injection (DI), the application of Inversion of Control (IoC) to dependency management. Inversion of Control moves secondary responsibilities from an object to other objects that are dedicated to the purpose, thereby supporting the Single Responsibility Principle.
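A minimal sketch of the idea using plain constructor injection, with no DI container involved. Every name here (MessageSender, WelcomeService, ConsoleSender, Application) is hypothetical.

```java
// Abstraction that the high-level class depends on.
interface MessageSender {
    void send(String recipient, String text);
}

// High-level class: receives its dependency instead of constructing it.
final class WelcomeService {
    private final MessageSender sender;

    WelcomeService(MessageSender sender) {
        this.sender = sender;
    }

    void welcome(String user) {
        sender.send(user, "Welcome, " + user + "!");
    }
}

// One concrete detail; a test could pass in a different implementation.
final class ConsoleSender implements MessageSender {
    @Override
    public void send(String recipient, String text) {
        System.out.println("to " + recipient + ": " + text);
    }
}

// Construction is kept apart from use: wiring happens once, at startup.
final class Application {
    public static void main(String[] args) {
        WelcomeService service = new WelcomeService(new ConsoleSender());
        service.welcome("Ada");
    }
}
```

WelcomeService has no knowledge of which sender it gets, so a test can hand it a recording fake while main wires in the real one.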
Scaling Up
It is a myth that we can get systems “right the first time.” Instead, we should implement only today’s stories, then refactor and expand the system to implement new stories tomorrow. This is the essence of iterative and incremental agility. Test-driven development, refactoring, and the clean code they produce make this work at the code level.
Test Drive the System Architecture
An optimal system architecture consists of modularized domains of concern, each of which is implemented with Plain Old Java (or other) Objects. The different domains are integrated together with minimally invasive Aspects or Aspect-like tools. This architecture can be test-driven, just like the code.
Chapter 12: Emergence
According to Kent Beck, a design is “simple” if it follows these rules:
• Runs all the tests
• Contains no duplication
• Expresses the intent of the programmer
• Minimizes the number of classes and methods
The rules are given in order of importance.
Tight coupling makes it difficult to write tests. So, similarly, the more tests we write, the more we use principles like DIP and tools like dependency injection, interfaces, and abstraction to minimize coupling. Our designs improve even more.
Simple Design Rules: Refactoring
The TEMPLATE METHOD pattern is a common technique for removing higher-level duplication. For example:
public class VacationPolicy {
    public void accrueUSDivisionVacation() {
        // code to calculate vacation based on hours worked to date
        // …
        // code to ensure vacation meets US minimums
        // …
        // code to apply vacation to payroll record
        // …
    }

    public void accrueEUDivisionVacation() {
        // code to calculate vacation based on hours worked to date
        // …
        // code to ensure vacation meets EU minimums
        // …
        // code to apply vacation to payroll record
        // …
    }
}
The two methods differ only in the legal-minimum step. Applying TEMPLATE METHOD keeps the shared steps in one place and leaves that step to subclasses:
public abstract class VacationPolicy {
    public void accrueVacation() {
        calculateBaseVacationHours();
        alterForLegalMinimums();
        applyToPayroll();
    }

    private void calculateBaseVacationHours() { /* … */ }
    protected abstract void alterForLegalMinimums();
    private void applyToPayroll() { /* … */ }
}

public class USVacationPolicy extends VacationPolicy {
    @Override protected void alterForLegalMinimums() {
        // US specific logic
    }
}

public class EUVacationPolicy extends VacationPolicy {
    @Override protected void alterForLegalMinimums() {
        // EU specific logic
    }
}
Chapter 13: Concurrency
Concurrency is a decoupling strategy. It helps us decouple what gets done from when it gets done. In single-threaded applications what and when are so strongly coupled that the state of the entire application can often be determined by looking at the stack backtrace.
Concurrency Defense Principles
Single Responsibility Principle: Keep your concurrency-related code separate from other code.
• Corollary: Limit the Scope of Data. Take data encapsulation to heart; severely limit the access of any data that may be shared.
• Corollary: Use Copies of Data. If there is an easy way to avoid sharing objects, the resulting code will be far less likely to cause problems. You might be concerned about the cost of all the extra object creation. It is worth experimenting to find out if this is in fact a problem. However, if using copies of objects allows the code to avoid synchronizing, the savings in avoiding the intrinsic lock will likely make up for the additional creation and garbage collection overhead.
• Corollary: Threads Should Be as Independent as Possible. Attempt to partition data into independent subsets that can be operated on by independent threads, possibly in different processors.
Know Your Library
Review the classes available to you. In the case of Java, become familiar with java.util.concurrent, java.util.concurrent.atomic, java.util.concurrent.locks.
Beware Dependencies Between Synchronized Methods
Dependencies between synchronized methods cause subtle bugs in concurrent code. The Java language has the notion of synchronized, which protects an individual method. However, if there is more than one synchronized method on the same shared class, then your system may be written incorrectly.
Recommendation: Avoid using more than one method on a shared object.
There will be times when you must use more than one method on a shared object. When this is the case, there are three ways to make the code correct:
• Client-Based Locking—Have the client lock the server before calling the first method and make sure the lock’s extent includes code calling the last method.
• Server-Based Locking—Within the server create a method that locks the server, calls all the methods, and then unlocks. Have the client call the new method.
• Adapted Server—create an intermediary that performs the locking. This is an example of server-based locking, where the original server cannot be changed.
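The first two options can be sketched with a hypothetical shared iterator. Calling hasNext() and then next() as two separate synchronized calls lets another thread slip in between them; the sketch shows a client that holds the lock across both calls and a server method that does both steps under one lock. IntegerIterator, Clients, and tryNext are assumed names, and the adapted-server variant is not shown.

```java
import java.util.OptionalInt;

// Hypothetical shared object with more than one synchronized method.
final class IntegerIterator {
    private final int limit;
    private int nextValue = 0;

    IntegerIterator(int limit) { this.limit = limit; }

    synchronized boolean hasNext() { return nextValue < limit; }

    synchronized int next() { return nextValue++; }

    // Server-based locking: one method holds the lock across both steps.
    synchronized OptionalInt tryNext() {
        return hasNext() ? OptionalInt.of(next()) : OptionalInt.empty();
    }
}

final class Clients {
    // Client-based locking: the caller holds the server's lock
    // from the hasNext() check through the next() call.
    static int sumWithClientLock(IntegerIterator iterator) {
        int sum = 0;
        while (true) {
            synchronized (iterator) {
                if (!iterator.hasNext()) {
                    return sum;
                }
                sum += iterator.next();
            }
        }
    }

    // Server-based locking seen from the client: a single call per step.
    static int sumWithServerLock(IntegerIterator iterator) {
        int sum = 0;
        OptionalInt value = iterator.tryNext();
        while (value.isPresent()) {
            sum += value.getAsInt();
            value = iterator.tryNext();
        }
        return sum;
    }
}
```

Client-based locking works here because synchronized methods lock on the object itself, which is the same monitor the client acquires; every client has to remember to do this, which is why the server-based form is usually easier to keep correct.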
Testing Threaded Code
Proving that code is correct is impractical. Testing does not guarantee correctness. However, good testing can minimize risk. This is all true in a single-threaded solution. As soon as there are two or more threads using the same code and working with shared data, things get substantially more complex.
Recommendation: Write tests that have the potential to expose problems and then run them frequently, with different programmatic configurations and system configurations and load. If tests ever fail, track down the failure. Don’t ignore a failure just because the tests pass on a subsequent run.
Chapter 16: Refactoring SerialDate
Boy Scout Rule: We should leave the module a bit cleaner than we found it.
I understand why DayDate inherits from Comparable and Serializable. But why does it inherit from MonthConstants? The class MonthConstants is just a bunch of static final constants that define the months. Inheriting from classes with constants is an old trick that Java programmers used so that they could avoid using expressions like MonthConstants.January, but it’s a bad idea. MonthConstants should really be an enum.
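A sketch of what such an enum might look like. The name Month, the index field, and the fromIndex helper are assumptions for illustration, not the refactored code from the chapter.

```java
// A sketch of months as an enum instead of inherited int constants.
enum Month {
    JANUARY(1), FEBRUARY(2), MARCH(3), APRIL(4), MAY(5), JUNE(6),
    JULY(7), AUGUST(8), SEPTEMBER(9), OCTOBER(10), NOVEMBER(11), DECEMBER(12);

    final int index;

    Month(int index) {
        this.index = index;
    }

    // Converts a 1-based month number; rejects anything outside 1..12.
    static Month fromIndex(int monthIndex) {
        for (Month m : values()) {
            if (m.index == monthIndex) {
                return m;
            }
        }
        throw new IllegalArgumentException("Invalid month index " + monthIndex);
    }
}
```

Methods that take a Month instead of an int can no longer be handed 13 or 0, so range checks move to the single conversion point.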
It’s generally a bad idea for base classes to know about their derivatives. To fix this, we should use the ABSTRACT FACTORY pattern and create a DayDateFactory. This factory will create the instances of DayDate that we need and can also answer questions about the implementation, such as the maximum and minimum dates.
Again, we see the pattern of one method calling its twin with a flag. It is usually a bad idea to pass a flag as an argument to a function, especially when that flag simply selects the format of the output.
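One common way to remove such a flag is to expose one method per output format. The sketch below uses a hypothetical MonthNames class; it is not the code under discussion in the chapter.

```java
// Hypothetical helper: one method per format instead of monthName(month, shortened).
final class MonthNames {
    private static final String[] LONG_NAMES = {
        "January", "February", "March", "April", "May", "June",
        "July", "August", "September", "October", "November", "December"
    };

    // Each method states its output format in its name, so no flag is needed.
    static String longName(int month) {
        return LONG_NAMES[checkedIndex(month)];
    }

    static String shortName(int month) {
        return LONG_NAMES[checkedIndex(month)].substring(0, 3);
    }

    // Maps a 1-based month number to a 0-based array index.
    private static int checkedIndex(int month) {
        if (month < 1 || month > 12) {
            throw new IllegalArgumentException("Invalid month: " + month);
        }
        return month - 1;
    }
}
```

A call such as MonthNames.shortName(9) reads without needing to look up what a bare true or false argument would have meant.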
Chapter 17: Smells and Heuristics
Comments
• C1: Inappropriate Information (Comments should be reserved for technical notes about the code and design.)
• C2: Obsolete Comment
• C3: Redundant Comment (Comments should say things that the code cannot say for itself.)
• C4: Poorly Written Comment
• C5: Commented-Out Code
Environment
• E1: Build Requires More Than One Step
• E2: Tests Require More Than One Step
Functions
• F1: Too Many Arguments
• F2: Output Arguments
• F3: Flag Arguments
• F4: Dead Function
General
• G1: Multiple Languages in One Source File
• G2: Obvious Behavior Is Unimplemented (Following “The Principle of Least Surprise,” any function or class should implement the behaviors that another programmer could reasonably expect.)
• G3: Incorrect Behavior at the Boundaries
• G4: Overridden Safeties
• G5: Duplication (This is the DRY, or Don’t Repeat Yourself, principle. Every time you see duplication in the code, it represents a missed opportunity for abstraction. That duplication could probably become a subroutine or perhaps another class outright.
A more subtle form is the switch/case or if/else chain that appears again and again in various modules, always testing for the same set of conditions. These should be replaced with polymorphism. Still more subtle are the modules that have similar algorithms, but that don’t share similar lines of code. This is still duplication and should be addressed by using the TEMPLATE METHOD, or STRATEGY pattern.)
• G6: Code at Wrong Level of Abstraction
• G7: Base Classes Depending on Their Derivatives
• G8: Too Much Information (Good software developers learn to limit what they expose at the interfaces of their classes and modules. The fewer methods a class has, the better. The fewer variables a function knows about, the better. The fewer instance variables a class has, the better.)
• G9: Dead Code
• G10: Vertical Separation (Variables and functions should be defined close to where they are used. Local variables should be declared just above their first usage and should have a small vertical scope. We don’t want local variables declared hundreds of lines distant from their usages.)
• G11: Inconsistency (If you do something a certain way, do all similar things in the same way. This goes back to the principle of least surprise. Be careful with the conventions you choose, and once chosen, be careful to continue to follow them.)
• G12: Clutter
• G13: Artificial Coupling
• G14: Feature Envy (The methods of a class should be interested in the variables and functions of the class they belong to, and not the variables and functions of other classes. When a method uses accessors and mutators of some other object to manipulate the data within that object, then it envies the scope of the class of that other object. It wishes that it were inside that other class so that it could have direct access to the variables it is manipulating.)
• G15: Selector Arguments
• G16: Obscured Intent
• G17: Misplaced Responsibility
• G18: Inappropriate Static (In general you should prefer nonstatic methods to static methods. When in doubt, make the function nonstatic. If you really want a function to be static, make sure that there is no chance that you’ll want it to behave polymorphically.)
• G19: Use Explanatory Variables
• G20: Function Names Should Say What They Do
• G21: Understand the Algorithm
• G22: Make Logical Dependencies Physical
• G23: Prefer Polymorphism to If/Else or Switch/Case
• G24: Follow Standard Conventions
• G25: Replace Magic Numbers with Named Constants
• G26: Be Precise (Expecting the first match to be the only match to a query is probably naive. Using floating point numbers to represent currency is almost criminal. Avoiding locks and/or transaction management because you don’t think concurrent update is likely is lazy at best. Declaring a variable to be an ArrayList when a List will do is overly constraining. Making all variables protected by default is not constraining enough.)
• G27: Structure over Convention
• G28: Encapsulate Conditionals
• G29: Avoid Negative Conditionals
• G30: Functions Should Do One Thing
• G31: Hidden Temporal Couplings
• G32: Don’t Be Arbitrary
• G33: Encapsulate Boundary Conditions
• G34: Functions Should Descend Only One Level of Abstraction
• G35: Keep Configurable Data at High Levels
• G36: Avoid Transitive Navigation
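Three of the heuristics above (G25, G28, and G29) can be sketched together. The RetryTimer class and its retry rule are hypothetical and exist only to show the shape of the code.

```java
// Hypothetical class illustrating G25, G28, and G29 together.
final class RetryTimer {
    // G25: a named constant instead of a bare 3 inside the condition.
    private static final int MAX_ATTEMPTS = 3;

    private final int attempts;
    private final boolean expired;

    RetryTimer(int attempts, boolean expired) {
        this.attempts = attempts;
        this.expired = expired;
    }

    // G28: the compound condition gets a name of its own.
    // G29: the name is positive (shouldRetry), so callers avoid !shouldNotRetry().
    boolean shouldRetry() {
        return hasAttemptsLeft() && !expired;
    }

    private boolean hasAttemptsLeft() {
        return attempts < MAX_ATTEMPTS;
    }
}
```

A caller writes if (timer.shouldRetry()) instead of repeating the attempt count and expiry checks at every call site.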
Names
• N1: Choose Descriptive Names
• N2: Choose Names at the Appropriate Level of Abstraction
• N3: Use Standard Nomenclature Where Possible
• N4: Unambiguous Names
• N5: Use Long Names for Long Scopes (Variable names like i and j are just fine if their scope is five lines long.)
• N6: Avoid Encodings (Names should not be encoded with type or scope information. Prefixes such as m_ or f are useless in today’s environments.)
• N7: Names Should Describe Side-Effects
Tests
• T1: Insufficient Tests
• T2: Use a Coverage Tool!
• T3: Don’t Skip Trivial Tests
• T4: An Ignored Test Is a Question about an Ambiguity
• T5: Test Boundary Conditions
• T6: Exhaustively Test Near Bugs
• T7: Patterns of Failure Are Revealing
• T8: Test Coverage Patterns Can Be Revealing
• T9: Tests Should Be Fast
from <Clean Code> by Robert C. Martin