High Quality Good Code

Ideas for Code that Sucks Less

Passionate Discussions Ahead

This is not...

This is not a clean code talk and doesn't want to be one.

IMHO, everything that sounds like a religion or movement in computer science should be viewed skeptically.

This is...

This is a motivation for better coding "practices". Feel inspired, not governed.

No patterns, no frameworks, no movements.

Disclaimer

Everything that follows is extremely biased, based on my 25 years of experience as a developer and tester.

What is good code?

What do you think is good code?

  • Simple
  • Readable
  • Modular
  • Layered
  • Designed
  • Efficient
  • Testable
  • Consistent
  • Focused
  • Standardized
  • Scalable
  • Tested
  • Secure
  • Safe
  • IT WORKS!

Code is like a good book. It can entertain, teach, or advise. It should always be easy to read, pleasing to the eye, and provide something new. If allowed, you can easily build on it.

Unlike a book, code should never be unpredictable or surprising.

René Schwietzke

Deficiencies

There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies.

C.A.R. Hoare, The 1980 ACM Turing Award Lecture

But Why?

We still don't know why all the fuzz

Who consumes the code?

Whom we are programming for besides the PM and PO?

Machine Yourself Colleague
Team Customer Humanity AI

The three channels

Identical but different

The Maschine

  • Your original target
  • Describe your ideas in a language the machine understands
  • Because assembler is hard, you use an intermediate high-level representation
  • Your code drives correctness and efficiency
  • The machine mostly does not know what correctness is
  • Efficiency is an outcome that is not directly part of the code

The Human

  • Your indirect target
  • Reads code differently than the machine
  • Syntactic correctness is not key
  • Requires information the machines does not need
  • Fading memory about purpose and design

The AI

  • Truth by pattern
  • Count and probability driven

The Maschine

The one that cares the least

  • Satisfy the compiler
  • All about the syntax
  • Basics such as concurrency and resource usage
  • "I do what you tell me!"
  • Context-free aka names and comments are meaningless

/**
 * Adds a key-value pair to the front of the LRU map or
 * moves the existing key to the front and updates
 * the value.
 *
 * @param key the key
 * @param value the value
 * @return the old value of the key, if it existed
 */
public V put(final K key, final V value)
{
    // see if we are at capacity first
    if (m1.size() >= slotSize)
    {
        // recycle the old
        final FastHashMap<K, V> oldM3 = m3;
        oldM3.clear();

        m3 = m2;
        m2 = m1;
        m1 = oldM3;
    }

    // update the cache
    final V old = m1.put(key,  value);

    return old;
}
        

public V put(K k,V v){if(m1.size()>=slotSize)
{var o=m3;o.clear();m3=m2;m2=m1;m1=o;}var p=m1.put(k,v);
return p;}
        

You

You don't know what you did last week

  • Halflife of knowledge is less than three weeks
  • You often don't remember why you have done that
  • You reverse engineer your own code, often
  • Multi-tasking, multi-branching... bad
  • Vacation? Bad!
  • You document for yourself first!

/**
 * Turns a string value into a java.lang.Number.
 *
 * <p>If the string starts with {@code 0x} or {@code -0x} (lower or upper case) or {@code #} or {@code -#}, it
 * will be interpreted as a hexadecimal Integer - or Long, if the number of digits after the
 * prefix is more than 8 - or BigInteger if there are more than 16 digits.
 * </p>
 * <p>Then, the value is examined for a type qualifier on the end, i.e. one of
 * {@code 'f', 'F', 'd', 'D', 'l', 'L'}.  If it is found, it starts
 * trying to create successively larger types from the type specified
 * until one is found that can represent the value.</p>
 *
 * <p>If a type specifier is not found, it will check for a decimal point
 * and then try successively larger types from {@link Integer} to
 * {@link BigInteger} and from {@link Float} to
 * {@link BigDecimal}.</p>
 *
 * <p>
 * Integral values with a leading {@code 0} will be interpreted as octal; the returned number will
 * be Integer, Long or BigDecimal as appropriate.
 * </p>
 *
 * <p>Returns {@code null} if the string is {@code null}.</p>
 *
 * <p>This method does not trim the input string, i.e., strings with leading
 * or trailing spaces will generate NumberFormatExceptions.</p>
 *
 * @param str  String containing a number, may be null
 * @return Number created from the string (or null if the input is null)
 * @throws NumberFormatException if the value cannot be converted
 */
public static Number createNumber(final String str) {
    ...
}

Code might last forever

Why should I? It will be rewritten soon

  • Code might be long lived
  • Especially backend code lives long
  • UI and mobile app code might rotate quickly (nowadays)
  • Enterprise code might live forever just because of the huge customer base
  • Library code changes rarely

There is empirical evidence that code must be rewritten about three times before it is any good.

There is empirical evidence that code is often rewritten because one does not understand it.

How long does code live?

GIT

Linux

How long does code live?

Node

Rust

For the Team

  • You must be replaceable!
  • Sharing is caring
  • Your knowledge lives here!

/**
 * Capitalizes a String changing the first character to title case as
 * per {@link Character#toTitleCase(int)}. No other characters are changed.
 *
 * <p>For a word based algorithm, see {@link org.apache.commons.text.WordUtils#capitalize(String)}.
 * A {@code null} input String returns {@code null}.</p>
 *
 * <pre>
 * StringUtils.capitalize(null)  = null
 * StringUtils.capitalize("")    = ""
 * StringUtils.capitalize("cat") = "Cat"
 * StringUtils.capitalize("cAt") = "CAt"
 * StringUtils.capitalize("'cat'") = "'cat'"
 * </pre>
 *
 * @param str the String to capitalize, may be null
 * @return the capitalized String, {@code null} if null String input
 */
public static String capitalize(final String str) {
    final int strLen = length(str);
    if (strLen == 0) {
        return str;
    }

    final int firstCodepoint = str.codePointAt(0);
    final int newCodePoint = Character.toTitleCase(firstCodepoint);
    if (firstCodepoint == newCodePoint) {
        // already capitalized
        return str;
    }

    final int[] newCodePoints = new int[strLen]; // cannot be longer than the char array
    int outOffset = 0;
    newCodePoints[outOffset++] = newCodePoint; // copy the first code point
    for (int inOffset = Character.charCount(firstCodepoint); inOffset < strLen; ) {
        final int codePoint = str.codePointAt(inOffset);
        newCodePoints[outOffset++] = codePoint; // copy the remaining ones
        inOffset += Character.charCount(codePoint);
     }
    return new String(newCodePoints, 0, outOffset);
}
        

Customer

  • Anyone not using the codebase but the API
  • Can be internal or external
  • Training, demos, and examples
  • It is about usage and safety
  • Stability and predictability are important

/**
 * Appends the specified element to the end of this list (optional
 * operation).
 *
 * Lists that support this operation may place limitations on what
 * elements may be added to this list.  In particular, some
 * lists will refuse to add null elements, and others will impose
 * restrictions on the type of elements that may be added.  List
 * classes should clearly specify in their documentation any restrictions
 * on what elements may be added.
 *
 * @param e element to be appended to this list
 * @return {@code true} (as specified by {@link Collection#add})
 * @throws UnsupportedOperationException if the {@code add} operation
 *         is not supported by this list
 * @throws ClassCastException if the class of the specified element
 *         prevents it from being added to this list
 * @throws NullPointerException if the specified element is null and this
 *         list does not permit null elements
 * @throws IllegalArgumentException if some property of this element
 *         prevents it from being added to this list
 */
boolean add(E e);

/**
 * Appends the specified element to the end of this list.
 *
 * @param e element to be appended to this list
 * @return {@code true} (as specified by {@link Collection#add})
 */
 public boolean add(E e) {...}

Humanity watches

  • Your code might become legacy
  • You set a lasting example
  • Complexity lasts forever
  • Your code "goes viral"
  • You might screw up mankind
  • Open source or not, doesn't matter

boolean contains(CharSequence s)
boolean contentEquals(CharSequence cs)
boolean contentEquals(StringBuffer sb)
String strip()
String stripIndent()
String stripLeading()
String stripTrailing()
CharSequence subSequence(int beginIndex, int endIndex)
String substring(int beginIndex)
String substring(int beginIndex, int endIndex)
char[] toCharArray()
String toLowerCase()
String toLowerCase(Locale locale)
String toString()
String toUpperCase()
String toUpperCase(Locale locale)
String translateEscapes()
String trim()
        

for the (in)human machine - AI

  • Correct means "used often"
  • Correct is a vote
  • Code is taken out of context
  • Your garbage code might shape the future!

public static String capitalizeString(String string) {
  string = string.toLowerCase();
  return string.substring(0, 1).toUpperCase() + string.substring(1);
}

public static String capitalizeFirstLetter(String str) {
    if (str == null || str.isEmpty()) {
        return str;
    }

    char[] charArray = str.toCharArray();
    charArray[0] = Character.toUpperCase(charArray[0]);
    return new String(charArray);
}

public static String capitalizeFirstLetter(String input) {
    if (input == null || input.isEmpty()) {
        return input; // Return the input as is if it's null or empty.
    }

    // Convert the first character to uppercase and concatenate it with the rest of the string.
    return Character.toUpperCase(input.charAt(0)) + input.substring(1);
}

String capitalizedString = IntStream.rangeClosed(0, string.length() - 1)
      .filter(i -> Character.isLetter(string.charAt(i)))
      .findFirst()
      .map(i -> string.substring(0, i)
            + string.charAt(i).toUpperCase()
            + string.substring(i + 1))
      .orElse(string);

My Change Frequency

Project Change Frequency

Under the Magnifying Glass

Various Code and Coding Aspects

Documentation

General

  • Documentation outside code ages quickly
  • Only beginners read outside documentation
  • Setup and things might be okish someplace else
  • Architecure docs might not fit code
  • Audience has to be kept in mind
  • Pleasing to the eye
  • Preserves knowledge
  • Must preserve non-obvious knownledge
  • Reading text is easier than interpreting code
  • Use the right place for the doc: class vs. method vs. inline

Documentation

  • Purpose
  • Suggested usage
  • When not to use
  • Design ideas (audience)
  • Concurrency
  • Efficiency and performance
  • STATE (!!)
  • API or not and how much
  • Security

/**
 * Resizable-array implementation of the {@code List} interface.  Implements
 * all optional list operations, and permits all elements, including
 * {@code null}.  In addition to implementing the {@code List} interface,
 * this class provides methods to manipulate the size of the array that is
 * used internally to store the list.  (This class is roughly equivalent to
 * {@code Vector}, except that it is unsynchronized.)
 *
 * <p>The {@code size}, {@code isEmpty}, {@code get}, {@code set},
 * {@code iterator}, and {@code listIterator} operations run in constant
 * time.  The {@code add} operation runs in <i>amortized constant time</i>,
 * that is, adding n elements requires O(n) time.  All of the other operations
 * run in linear time (roughly speaking).  The constant factor is low compared
 * to that for the {@code LinkedList} implementation.
 *
 * <p>Each {@code ArrayList} instance has a <i>capacity</i>. The capacity is
 * the size of the array used to store the elements in the list.  It is always
 * at least as large as the list size.  As elements are added to an ArrayList,
 * its capacity grows automatically.  The details of the growth policy are not
 * specified beyond the fact that adding an element has constant amortized
 * time cost.
 *
 * <p><strong>Note that this implementation is not synchronized.</strong>
 * If multiple threads access an {@code ArrayList} instance concurrently,
 * and at least one of the threads modifies the list structurally, it
 * <i>must</i> be synchronized externally...
 *
 * The iterators returned by this class's {@link #iterator() iterator} and
 * {@link #listIterator(int) listIterator} methods are <em>fail-fast</em>...
 * Thus, in the face of concurrent modification, the iterator fails quickly and
 * cleanly, rather than risking arbitrary, non-deterministic behavior
 * at an undetermined time in the future.
 *
 * <p>Note that the fail-fast behavior of an iterator cannot be guaranteed
 * as it is, generally speaking, impossible to make any hard guarantees in the
 * presence of unsynchronized concurrent modification.
        

Class

  • What is the main purpose
  • What should you not do with it
  • Similar to other classes? What is the difference?
  • Talk about algorithms and assumptions
  • Concurrency and state
  • Performance characteristics
  • Memory requirements
  • Your audience shapes the depth

/**
 * This class implements a vector of bits that grows as needed. Each
 * component of the bit set has a {@code boolean} value. The
 * bits of a {@code BitSet} are indexed by nonnegative integers.
 * Individual indexed bits can be examined, set, or cleared. One
 * {@code BitSet} may be used to modify the contents of another
 * {@code BitSet} through logical AND, logical inclusive OR, and
 * logical exclusive OR operations.
 *
 * <p>By default, all bits in the set initially have the value
 * {@code false}.
 *
 * <p>Every bit set has a current size, which is the number of bits
 * of space currently in use by the bit set. Note that the size is
 * related to the implementation of a bit set, so it may change with
 * implementation. The length of a bit set relates to logical length
 * of a bit set and is defined independently of implementation.
 *
 * <p>Unless otherwise noted, passing a null parameter to any of the
 * methods in a {@code BitSet} will result in a
 * {@code NullPointerException}.
 *
 * <p>A {@code BitSet} is not safe for multithreaded use without
 * external synchronization.
 *
 * @author  Arthur van Hoff
 * @author  Michael McCloskey
 * @author  Martin Buchholz
 * @since   1.0
 */
    

Fields, Constants, Others

  • What is the purpose
  • Why this type and size
  • Any contraints
  • Don't hide fields later in the code

/**
 * The default initial capacity - MUST be a power of two.
 */
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

/**
 * The maximum capacity, used if a higher value is implicitly specified
 * by either of the constructors with arguments.
 * MUST be a power of two <= 1<<30.
 */
static final int MAXIMUM_CAPACITY = 1 << 30;

/**
 * The internal field corresponding to the serialField "bits".
 */
private long[] words;

/**
 * The number of words in the logical size of this BitSet.
 */
private transient int wordsInUse = 0;
        

Methods

  • Purpose and details
  • Input, output
  • Does it change state?
  • Edge and error-cases
  • Concurrency
  • Performance
  • Memory
  • Talk about API stability

/**
 * Returns a new bit set containing all the bits in the given byte
 * buffer between its position and limit.
 *
 * <p>More precisely,
 * <br>{@code BitSet.valueOf(bb).get(n) == ((bb.get(bb.position()+n/8) & (1<<(n%8))) != 0)}
 * <br>for all {@code n < 8 * bb.remaining()}.
 *
 * <p>The byte buffer is not modified by this method, and no
 * reference to the buffer is retained by the bit set.
 *
 * @param bb a byte buffer containing a little-endian representation
 *        of a sequence of bits between its position and limit, to be
 *        used as the initial bits of the new bit set
 * @return a {@code BitSet} containing all the bits in the buffer in the
 *         specified range
 * @since 1.7
 */
public static BitSet valueOf(ByteBuffer bb) {
    bb = bb.slice().order(ByteOrder.LITTLE_ENDIAN);
    int n;
    for (n = bb.remaining(); n > 0 && bb.get(n - 1) == 0; n--)
        ;
    long[] words = new long[(n + 7) / 8];
    bb.limit(n);
    int i = 0;
    while (bb.remaining() >= 8)
        words[i++] = bb.getLong();
    for (int remaining = bb.remaining(), j = 0; j < remaining; j++)
        words[i] |= (bb.get() & 0xffL) << (8 * j);
    return new BitSet(words);
}

Inline

  • Build visual structure
  • Explain each step
  • Don't explain the obvious
  • Most code fails here
  • Start to program in comments before writing code

public static int getLevenshteinDistance(CharSequence s, CharSequence t, final int threshold) {
   if (s == null || t == null) {
       throw new IllegalArgumentException("Strings must not be null");
   }
   if (threshold < 0) {
       throw new IllegalArgumentException("Threshold must not be negative");
   }

   // Huge explaination of the algorithm was here (removed by me)

   int n = s.length(); // length of s
   int m = t.length(); // length of t

   // if one string is empty, the edit distance is necessarily the length of the other
   if (n == 0) {
       return m <= threshold ? m : -1;
   }
   if (m == 0) {
       return n <= threshold ? n : -1;
   }
   if (Math.abs(n - m) > threshold) {
       // no need to calculate the distance if the length difference is greater than the threshold
       return -1;
   }

   if (n > m) {
       // swap the two strings to consume less memory
       final CharSequence tmp = s;
       s = t;
       t = tmp;
       n = m;
       m = t.length();
   }

   int[] p = new int[n + 1]; // 'previous' cost array, horizontally
   int[] d = new int[n + 1]; // cost array, horizontally
   int[] tmp; // placeholder to assist in swapping p and d

   // fill in starting table values
   final int boundary = Math.min(n, threshold) + 1;
   for (int i = 0; i < boundary; i++) {
       p[i] = i;
   }
   // these fills ensure that the value above the rightmost entry of our
   // stripe will be ignored in following loop iterations
   Arrays.fill(p, boundary, p.length, Integer.MAX_VALUE);
   Arrays.fill(d, Integer.MAX_VALUE);

   // iterates through t
   for (int j = 1; j <= m; j++) {
       final char jOfT = t.charAt(j - 1); // jth character of t
       d[0] = j;

       // compute stripe indices, constrain to array size
       final int min = Math.max(1, j - threshold);
       final int max = j > Integer.MAX_VALUE - threshold ? n : Math.min(n, j + threshold);

       // the stripe may lead off of the table if s and t are of different sizes
       if (min > max) {
           return -1;
       }

       // ignore entry left of leftmost
       if (min > 1) {
           d[min - 1] = Integer.MAX_VALUE;
       }

       // iterates through [min, max] in s
       for (int i = min; i <= max; i++) {
           if (s.charAt(i - 1) == jOfT) {
               // diagonally left and up
               d[i] = p[i - 1];
           } else {
               // 1 + minimum of cell to the left, to the top, diagonally left and up
               d[i] = 1 + Math.min(Math.min(d[i - 1], p[i]), p[i - 1]);
           }
       }

       // copy current distance counts to 'previous row' distance counts
       tmp = p;
       p = d;
       d = tmp;
   }

   // if p[n] is greater than the threshold, there's no guarantee on it being the correct
   // distance
   if (p[n] <= threshold) {
       return p[n];
   }
   return -1;
}

Good and Ugly - java.util.HashMap


/**
 * Hash table based implementation of the {@code Map} interface.  This
 * implementation provides all of the optional map operations, and permits
 * {@code null} values and the {@code null} key.  (The {@code HashMap}
 * class is roughly equivalent to {@code Hashtable}, except that it is
 * unsynchronized and permits nulls.)  This class makes no guarantees as to
 * the order of the map; in particular, it does not guarantee that the order
 * will remain constant over time.
 *
 * <p>This implementation provides constant-time performance for the basic
 * operations ({@code get} and {@code put}), assuming the hash function
 * disperses the elements properly among the buckets.  Iteration over
 * collection views requires time proportional to the "capacity" of the
 * {@code HashMap} instance (the number of buckets) plus its size (the number
 * of key-value mappings).  Thus, it's very important not to set the initial
 * capacity too high (or the load factor too low) if iteration performance is
 * important.
 *
 * <p>An instance of {@code HashMap} has two parameters that affect its
 * performance: <i>initial capacity</i> and <i>load factor</i>.  The
 * <i>capacity</i> is the number of buckets in the hash table, and the initial
 * capacity is simply the capacity at the time the hash table is created.  The
 * <i>load factor</i> is a measure of how full the hash table is allowed to
 * get before its capacity is automatically increased.  When the number of
 * entries in the hash table exceeds the product of the load factor and the
 * current capacity, the hash table is <i>rehashed</i> (that is, internal data
 * structures are rebuilt) so that the hash table has approximately twice the
 * number of buckets.
 ...
 * <p><strong>Note that this implementation is not synchronized.</strong>
 * If multiple threads access a hash map concurrently, and at least one of
 * the threads modifies the map structurally, it <i>must</i> be
 * synchronized externally.  (A structural modification is any operation
 * that adds or deletes one or more mappings; merely changing the value
 * associated with a key that an instance already contains is not a
 * structural modification.)  This is typically accomplished by
 * synchronizing on some object that naturally encapsulates the map.

/*
 * Implementation notes.
 *
 * This map usually acts as a binned (bucketed) hash table, but
 * when bins get too large, they are transformed into bins of
 * TreeNodes, each structured similarly to those in
 * java.util.TreeMap. Most methods try to use normal bins, but
 * relay to TreeNode methods when applicable (simply by checking
 * instanceof a node).  Bins of TreeNodes may be traversed and
 * used like any others, but additionally support faster lookup
 * when overpopulated. However, since the vast majority of bins in
 * normal use are not overpopulated, checking for existence of
 * tree bins may be delayed in the course of table methods.
 *
 * Tree bins (i.e., bins whose elements are all TreeNodes) are
 * ordered primarily by hashCode, but in the case of ties, if two
 * elements are of the same "class C implements Comparable",
 * type then their compareTo method is used for ordering. (We
 * conservatively check generic types via reflection to validate
 * this -- see method comparableClassFor).  The added complexity
 * of tree bins is worthwhile in providing worst-case O(log n)
 * operations when keys either have distinct hashes or are
 * orderable, Thus, performance degrades gracefully under
 * accidental or malicious usages in which hashCode() methods
 * return values that are poorly distributed, as well as those in
 * which many keys share a hashCode, so long as they are also
 * Comparable. (If neither of these apply, we may waste about a
 * factor of two in time and space compared to taking no
 * precautions. But the only known cases stem from poor user
 * programming practices that are already so slow that this makes
 * little difference.)
 *
 * Because TreeNodes are about twice the size of regular nodes, we
 * use them only when bins contain enough nodes to warrant use
 * (see TREEIFY_THRESHOLD). And when they become too small (due to
 * removal or resizing) they are converted back to plain bins.  In
 * usages with well-distributed user hashCodes, tree bins are
 * rarely used.  Ideally, under random hashCodes, the frequency of
 * nodes in bins follows a Poisson distribution
 * (http://en.wikipedia.org/wiki/Poisson_distribution) with a
 * parameter of about 0.5 on average for the default resizing
 * threshold of 0.75, although with a large variance because of
 * resizing granularity. Ignoring variance, the expected
 * occurrences of list size k are (exp(-0.5) * pow(0.5, k) /
 * factorial(k)). The first values are:
 *
 * 0:    0.60653066
 * 1:    0.30326533
 * 2:    0.07581633
 * 3:    0.01263606
 * 4:    0.00157952
 * 5:    0.00015795
 * 6:    0.00001316
 * 7:    0.00000094
 * 8:    0.00000006
 * more: less than 1 in ten million

Good and Ugly - java.util.HashMap


final Node<K,V> removeNode(int hash, Object key, Object value,
                           boolean matchValue, boolean movable) {
    Node<K,V>[] tab; Node<K,V> p; int n, index;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (p = tab[index = (n - 1) & hash]) != null) {
        Node<K,V> node = null, e; K k; V v;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            node = p;
        else if ((e = p.next) != null) {
            if (p instanceof TreeNode)
                node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
            else {
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key ||
                         (key != null && key.equals(k)))) {
                        node = e;
                        break;
                    }
                    p = e;
                } while ((e = e.next) != null);
            }
        }
        if (node != null && (!matchValue || (v = node.value) == value ||
                             (value != null && value.equals(v)))) {
            if (node instanceof TreeNode)
                ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
            else if (node == p)
                tab[index] = node.next;
            else
                p.next = node.next;
            ++modCount;
            --size;
            afterNodeRemoval(node);
            return node;
        }
    }
    return null;
}

 
        

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
        

Formatting

  • Avoid subtile problems
  • Version control issues
  • Java sadly has not dictated standard
  • Linters can help but also create hard to read code
  • Java, find middle ground
  • Kotlin, Go, JS, TS, Rust - default linter

More Formatting


/**
 * Inserts the specified element at the specified position in this
 * list. Shifts the element currently at that position (if any) and
 * any subsequent elements to the right (adds one to their indices).
 *
 * @param index index at which the specified element is to be inserted
 * @param element element to be inserted
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public void add(int index, E element) {
    rangeCheckForAdd(index);
    modCount++;
    final int s;
    Object[] elementData;
    if ((s = size) == (elementData = this.elementData).length)
        elementData = grow();
    System.arraycopy(elementData, index,
                     elementData, index + 1,
                     s - index);
    elementData[index] = element;
    size = s + 1;
}














 
        

/**
 * Inserts the specified element at the specified position in this
 * list. Shifts the element currently at that position (if any) and
 * any subsequent elements to the right (adds one to their indices).
 *
 * @param index index at which the specified element is to be inserted
 * @param element element to be inserted
 *
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public void add(final int index, final E element) {
    // index must be >= 0 and <= size
    rangeCheckForAdd(index);

    // increase the modification count for fail-fast behavior
    modCount++;

    final int currentSize = this.size;
    Object[] newElementData = this.elementData;

    // check if we still have room, grow if needed
    if (currentSize == newElementData.length) {
        newElementData = grow();
    }

    // move elements behind the new position to the right
    System.arraycopy(
                newElementData, index,
                newElementData, index + 1,
                currentSize - index);

    // finally add element to the desired position
    newElementData[index] = element;

    // our list got bigger by one element
    this.size = currentSize + 1;
}
        
  • Add newlines
  • Form logical blocks
  • Always use braces
  • Visually guide with comments
  • Use this consistently
  • No empty initialization
  • Don't shadow fields
  • Proper naming

Naming

  • Make names speak
  • Include type when not obvious
  • Plural and singular matter
  • Follow common patterns
  • Stick to casing
  • Methods might not need a verb anymore
  • Grammar and spelling - English

/*
 * Compute an 8-byte hash of a byte array of length greater than 64 bytes.
 */
private static long fullFingerprint(byte[] bytes, int offset, int length) {
  // For lengths over 64 bytes we hash the end first, and then as we
  // loop we keep 56 bytes of state: v, w, x, y, and z.
  long x = load64(bytes, offset);
  long y = load64(bytes, offset + length - 16) ^ K1;
  long z = load64(bytes, offset + length - 56) ^ K0;
  long[] v = new long[2];
  long[] w = new long[2];
  weakHashLength32WithSeeds(bytes, offset + length - 64, length, y, v);
  weakHashLength32WithSeeds(bytes, offset + length - 32, length * K1, K0, w);
  z += shiftMix(v[1]) * K1;
  x = rotateRight(z + x, 39) * K1;
  y = rotateRight(y, 33) * K1;
  ...
}

Naming - Examples


// Methods
public void calculatePoints()
String String::indent(int n) // JDK

int String::indexOf(int ch) // JDK
String String::indentTextBy(int n)

// ?
void clear()
void purge();
void removeAll()
void deleteAll()
void reset()

// java.lang.String
boolean isBlank()
boolean isEmpty()

String trim()
String strip()
String stripIndent()
String stripLeading()
String stripTrailing()

String readIntoString()
StringBuilder readIntoStringBuilder()
void readInto(StringBuilder target)
        

// variables and parameters
for (int i = 0; i < 100; i++) {}
int count = 0;
long timeMs;
long duration (not length)

int numNew = a.length;
int numMoved = s - index;

for (Customer customer : customers)

Parameters

  • Naming, make clear what it is, especially when there is more than one of that type
  • Order is important, especially when overloading

// parameters and overloading
public void removeCarsFromStreet(final Street street)
public void removeCarsFromStreet(final Street street, final Car carType)
public void removeCarsFromStreet(final Street street, final Car carType, final int age)
public void removeCarsFromStreetByType(final Street street, final Car carType)

public V getFromMap(final K key); // JDK says get()
public void put(List<V> list); // JDK says putAll or addAll
        

Logic

  • Sort out step by step
  • Keep logical operators at minimum
  • Prefer nesting over long conditions
  • Order matters
  • Don't assign and compare, if possible
  • CPU can look beyond simple conditions

doIt(final String s)
{
    // empty s cannot be processed, sort things out early
    if (s == null || s.length() == 0)
    {
        return false;
    }

    // That is BAD code here
    if (s.length() == 3 &&
            !"foo".equals(s) &&
            !("bar".equals(s) || "rab".equals(s)))
    {
        return false;
    }

    int n1 = 10;
    int n2 = 5;
    int n3 = 2;
    return n1 >= n2 ?
        (n1 >= n3 ? n1 : n3) : (n2 >= n3 ? n2 : n3);

    // ternary ops can be faster but also slower on the CPU,
    // no simple rule for that
}

Negated State

  • People think positive
  • Asking for on or running is more natural
  • Don't say enable this to disable that or set it to true to disable it
  • Feature is on or off, don't say not on
  • Do the same for variables
  • Enums are often a better choice for states

# Uncomment to disable the processing, it is enabled by default
# myapplication.processing.disable = true

public void setState(final boolean disable)
{
    if (!disable)
    {
        // do stuff here
    }

    if (!isEnabled())
    {
    }
    enabled = !disabled;
}
        

public void enable();
public void disable();

Loops

  • Keep loop logic simple
  • Double for-conditions are atypical
  • Invariants outside the loop are uncommon
  • Labels are uncommon
  • Do-while is less common
  • Infinite loops with break are advanced

for (n = lb.remaining(); n > 0 && lb.get(n - 1) == 0; n--)
    ;

do {
    if (e.hash == hash &&
        ((k = e.key) == key || (key != null && key.equals(k))))
        return e;
} while ((e = e.next) != null);

for (int binCount = 0; ; ++binCount) {
    if ((e = p.next) == null) {
        p.next = newNode(hash, key, value, null);
        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
            treeifyBin(tab, hash);
        break;
    }
}

public void clear() {
    while (wordsInUse > 0)
        words[--wordsInUse] = 0;
}

while (true) {
    if (word != 0)
        return (u * BITS_PER_WORD) + Long.numberOfTrailingZeros(word);
    if (++u == wordsInUse)
        return wordsInUse * BITS_PER_WORD;
    word = ~words[u];
}

int i = 0;
found: {
     if (o == null) {
         for (; i < size; i++)
             if (es[i] == null)
                 break found;
     } else {
         for (; i < size; i++)
             if (o.equals(es[i]))
                 break found;
     }
     return false;
}

Final

  • Often misunderstood
  • Indicates that something has only one state ever, exactly one write before any read
  • Protects you from your own stupidity
  • Might or might not be used by the JIT
  • Apply by default, remove when needed

public class LRUBloomHashMap<K, V>
{
    // static should be mostly final
    public static final int LAYERCOUNT = 3;

    public final int capacity;
    public final int slotSize;
}

public static int hashCodeWithLimit(final CharSequence s,
                                    final char limitingChar)
{
    int hash = 0;

    final int length = s.length();
    for (int i = 0; i < length; i++)
    {
        final char c = s.charAt(i);

        if (c != limitingChar)
        {
            final int h1 = hash << 5;
            final int h2 = c - hash;
            hash = h1 + h2;
        }
        else
        {
            break;
        }

    }

    return hash;
}

THIS

  • Enhances context
  • Shows scope (method, class) more clearly
  • Helps to avoid accidental shadowing
  • Sure IDE might help

// bad
public E getLast() {
    int last = size - 1;
    if (last < 0) {
        throw new NoSuchElementException();
    } else {
        return elementData(last);
    }
}
        

// good
public E getLast() {
    final int last = this.size - 1;
    if (last < 0) {
        throw new NoSuchElementException();
    } else {
        return this.elementData(last);
    }
}
        

Shadowing

  • Don't shadow!
  • Variables from parameters
  • Variables from fields
  • One more reason for final

boolean equalsRange(List<?> other, int from, int to) {
    final Object[] elementData = this.elementData;
    if (to > elementData.length) {
        throw new ConcurrentModificationException();
    }
    var oit = other.iterator();
    for (; from < to; from++) {
        if (!oit.hasNext() ||
            !Objects.equals(elementData[from], oit.next())) {
            return false;
        }
    }
    return !oit.hasNext();
}

Declaration

  • Declare when needed, not upfront
  • Declare for the scope
  • Declare separately

// example 1
int n = s1.length(), m = s2.length();

// example 2
int cost;

for (int i = 1; i <= n; i++) {
    for (int j = 1; j <= m; j++) {
        cost = 1;
        if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
            cost = 0;
        }

        d[i][j] = min(
                d[i - 1][j - 1] + cost, // substitution
                d[i][j - 1] + 1, // insertion
                d[i - 1][j] + 1 // deletion
        );

        //transposition check
        if (i > 1 && j > 1
                && s1.charAt(i - 1) == s2.charAt(j - 2)
                && s1.charAt(i - 2) == s2.charAt(j - 1)) {
            d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + cost);
        }
    }
}
// cost is never used outside the inner loop
        

The var war

  • Keyword var, compiler infers type
  • Made for blocks
  • Made for compact and more readable code
  • Very passionate discussion going on
  • Kotlin has a fancy var and val

public String toString()
{
    final var sb = new StringBuilder(32);
    sb.append('[');

    for (int i = 0; i < m_data.length; i += 2)
    {
        final var key = m_data[i];
        final var value = m_data[i + 1];

        if (key != FREE_KEY && key != REMOVED_KEY)
        {
            ...
        }
    }
    sb.append(']');

    return sb.toString();
}
        

History in the Code

  • Document decisions
  • Document especially odd code

/**
 * Constructs a list containing the elements of the specified
 * collection, in the order they are returned by the collection's
 * iterator.
 *
 * @param c the collection whose elements are to be placed into this list
 * @throws NullPointerException if the specified collection is null
 */
public ArrayList(Collection<? extends E> c) {
    elementData = c.toArray();
    if ((size = elementData.length) != 0) {
        // defend against c.toArray (incorrectly) not returning Object[]
        // (see e.g. https://bugs.openjdk.java.net/browse/JDK-6260652)
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    } else {
        // replace with empty array.
        this.elementData = EMPTY_ELEMENTDATA;
    }
}

Why is something not good

  • Document decisions and problems
  • Keep alternatives in the code
  • GIT is helpful but most people don't care
  • This is for the future
  • This is for new programmers
  • This is for the team

private static long getInitValue()
{
    long initValue = System.currentTimeMillis();

    // modify the init value for each user thread
    // when in a load test
    if (Session.getCurrent().isLoadTest())
    {
        final String userId = Thread.currentThread().getName();

        // String.hashCode() is not good enough -> #2890
        // final long hashCode = userId.hashCode();

        // use CRC32 instead, but square the result to extend
        // the range to 64 bits
        long hashCode = Hashing.crc32()
                            .hashUnencodedChars(userId)
                            .padToLong();
        hashCode = hashCode * hashCode;

        initValue += hashCode;
    }

    return initValue;
}

Duplication can be good

  • Duplicated code can be good
  • Duplicated functionality is ok
  • Makes implementations simplier
  • Removes dependencies
  • Can make code more efficient
  • This is not a general rule

/**
 * Simple hash map implementation taken from here
 * https://github.com/mikvor/hashmapTest/blob/master/src/main/java/map/objobj/ObjObjMap.java
 * No concrete license specified at the source. The project is public domain.
 *
 * Not thread-safe!
 *
 * Null support was removed.
 *
 * @since 7.0.0
 */
public class FastHashMap<K, V>
{
    ...
}
    

Source of Knowledge

State where things are coming from

  • State your sources
  • Make things understandable
  • Extend documentation
  • Support legal
  • Give credit
  • Enhance trust
  • Important for security

/**
 * Returns the next pseudorandom, Gaussian ("normally") distributed
 * {@code double} value with mean {@code 0.0} and standard
 * deviation {@code 1.0} from this random number generator's sequence.
 ...
 * This uses the <i>polar method</i> of G. E. P. Box, M. E. Muller, and
 * G. Marsaglia, as described by Donald E. Knuth in <i>The Art of
 * Computer Programming</i>, Volume 2: <i>Seminumerical Algorithms</i>,
 * section 3.4.1, subsection C, algorithm P. Note that it generates two
 * independent values at the cost of only one call to {@code StrictMath.log}
 * and one call to {@code StrictMath.sqrt}.
 *
 * @return the next pseudorandom, Gaussian ("normally") distributed
 *         {@code double} value with mean {@code 0.0} and
 *         standard deviation {@code 1.0} from this random number
 *         generator's sequence
 */
public synchronized double nextGaussian() {
    // See Knuth, ACP, Section 3.4.1 Algorithm C.
    if (haveNextNextGaussian) {
        haveNextNextGaussian = false;
        return nextNextGaussian;
    } else {
        ...
    }
}
        

Error Handling

The debate about when and how much

  • Distinguish between incorrect input and return state
  • Exceptions are great, but checked exceptions are painful
  • Don't reinvent exception that already exist
  • Not everything must be an exception
  • Exceptions are the third state
  • Not every method needs to handle all errors
  • Never catch all, catch only what applies
  • Never catch Errors
    • JDK: An Error is a subclass of Throwable that indicates serious problems that a reasonable application should not try to catch. Most such errors are abnormal conditions.
  • Don't catch common runtime exceptions, they indicate programming errors
  • Exceptions are rare conditions, don't use them for communicating the normal
  • Exceptions cost time

Error State

A quick exercise


/**
 * Sums up all parsable values. Null is not permitted.
 * Any array field that is null will raise a NullPointerException.
 * Unparsable data will raise a NumberFormatException.
 *
 * @param data an array of Data, can be null and empty
 * @return the sum of all valid entries, assumes 0 otherwise
 *
 * @throws IllegalArgumentException ...
 * @throws NullPointerException ...
 * @throws NumberFormatException ...
 */
public static long sum(final Data[] data) {
    if (data == null) {
        throw new IllegalArgumentException("Null is not permitted");
    }

    long sum = 0;

    for (Data d : data) {
        if (d == null) {
            throw new NullPointerException("Null fields are not permitted");
        }

        // implicit exception
        sum += Integer.parse(data.value);
    }

    return sum;
}

/**
 * Sums up all parsable values. Returns 0 for null data and data without
 * fields. In case of parsing problems or null fields, this fields is just
 * skipped. The performance is O(n) and there is no extra memory allocated.
 *
 * @param data an array of Data, can be null or empty
 * @return the sum of all valid entries, assumes 0 otherwise
 */
public static long sum(final Data[] data) {
    if (data == null) {
        return 0;
    }

    long sum = 0;

    for (Data d : data) {
        try {
            // parseInt also handles null
            sum += Integer.parseInt(d.value);
        }
        catch (NumberFormatException e) {
            // we don't care and just skip this entry
        }
    }

    return sum;
}

Testing

  • Code is fully tested
  • Behavior is frozen in time
  • Code might not be correct, but that is captured
  • Code coverage is 100% - not a quality indicator
  • Hard to test? Likely wrong design
  • Black box first, gray box second, white box last
  • No reason to reduce test case when duplicates
  • Tests represent the idea behind the code
  • No test, no idea, no code
  • Most tests lack state coverage
  • Most tests lack edge-case coverage
  • There are untestable things: Concurrency
  • There are hard to test things: Performance, security, efficiency
  • NEVER assume anything!!!!

Example


public class MiniStack<E> {
    public MiniStack();

    public E pop();
    public int push(E element);
    public E peek();

    public int size();
    public void clear();
}
        
  • Missing doc for edge-case behavior
  • Hard to test: Does clear() sets the size to 0 or really emptys out data
  • Test cases to the right are incomplete
  • new, size, peek, push, size, pop, size, push, size, clear, size - happy path
  • pop: 0, 1, 2...; pop, pop; pop after clear; pop to zero
  • peek: 0, 1, 2...; peek, peek; peek after clear
  • push: 0, 1, 2...; push after clear; push after pop
  • pop, push, pop, push...
  • no state change after peek
  • clear: 0, 1, 2, ...
  • clear, clear
  • size after each method
  • push A twice, two A poppable?

Testable

  • Easily testable
  • No hidden dependencies
  • No global state
  • Easily mockable, preferably without tooling
  • Final classes can be a pain
  • Why not open the code for more info?
  • Clock to avoid direct time source access
  • FileSystem to virtualize IO
  • Interface solution: Internal impl. exposes state, external view hides it

Summary

What you might have learned today

  • You produce code for everyone and yourself
  • It is not about that it works, rather about why
  • Don't try to predict too much when programming
  • Don't trust your brain and memory
  • Readability is king
  • Don't erase all history
  • Code is never fully correct (non-trivial)
  • Your consumers should shape your code quality
  • Embrace refactoring
  • Embrace rewriting
  • Don't overthink design, keep it simple
  • Your code teaches!
  • It is ok to hack a prototype or tool without comments

Sadly...

There are so many things we have not talked about.

Fluid code, layered design, modules, API design, versioning, performance, memory, concurrency design, advanced test design, code coverage, reuse, functional programming, object-oriented programming, global state...

One last thing...

One takeaway only

Any fool can write code that a computer can understand. Good programmers write code that humans can understand.

Martin Fowler

Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.

Rick Osborne

Questions and Answers

Source: Agent-X Comics, CC-BY-NC-NA 3.0