Page tree
Skip to end of metadata
Go to start of metadata

Indentation

Tabs should be used in all C files. Tab width should be set at 8 characters in your IDE.

Tabs are preferred to align variable names in structure and enum declarations. This may temporarily misalign them with other variable names in the structure that are using spaces for indentation, or consider fixing up the whole struct in the same patch if there aren't too many of them. In some cases (long variable names, long comments, nested unions) then it isn't practical to use full tabs to align the fields and spaces can be used for partial indentation. Moreover, the returned type and the function name should be on two different lines.

right:
void
func_helper(...)
{
        struct foobar_counter  *foo;
        unsigned long           last;
        int                     rc = 0;


wrong:
void func(...)
{
        int rc;
        struct foobar_counter *foo;
        unsigned long last;


All lines should wrap at 80 characters. This is to avoid the need to have extra-wide terminal windows for the few lines of code that exceed 80 columns, and avoids nesting functions and conditionals too deeply. If it's getting too hard to wrap at 80 characters, you probably need to rearrange conditional order or break it up into more functions.

right:
void
func_helper(...)
{
        struct foobar_counter  *foo;
        unsigned long           last;
        int                     rc = 0;
 
        do_sth2_1;
 
        if (cond3)
                do_sth_which_needs_a_very_long_line(and, lots, of, arguments);
 
        do_sth2_2;
}
 
void
func(...)
{
        long   first;
        int    i;
 
        if (!cond1)
                return;
 
        do_sth1_1;
 
        if (cond 2)
                func_helper(...)
 
        do_sth1_2;
}

wrong:
void
func(...) { int rc; struct foobar_counter *foo; unsigned long last; if (cond1) { do_sth1_1; if (cond2) { do_sth2_1; if (cond3) { do_sth_which_needs_a_very_long(and, lots, of, arguments); } do_sth2_2; } do_sth1_2; } }

Do not include spaces or tabs on blank lines or at the end of lines. Please ensure you remove all instances of these in any patches you submit to Gerrit. You can find them with grep or in vim using the following regexps /\[ \t\]$/ or use git git diff --check. Alternatively, if you use vim, you can put this line in your .vimrc file, which will highlight whitespace at the end of lines and spaces followed by tabs in indentation (only works for C/C++ files) let c_space_errors=1. Or you can use this command, which will make tabs and whitespace at the end of lines visible for all files (but a bit more discretely) set list listchars=tab:>\ ,trail:$. In emacs, you can use (whitespace-mode) or (whitespace-visual-mode)depending on the version. You could also consider using (flyspell-prog-mode).

Functions & Variables

Functions that are more than a few lines long should be declared with a leading comment block that describes what the function does, any assumptions for the caller (locks already held, locks that should not be held, etc), and return values. DAOS uses the DOxygen markup style for formatting the code comments, as shown in the example here. \a argument can be used in the descriptive text to name a function argument. \param argument definition should be used to define each of the function parameters, and can be followed by [in][out], or [in,out] to indicate whether each parameter is used for input, output, or both. \retval should be used to define the function return values.

right:
/**
* Connect to the DAOS pool identified by UUID \a uuid. Upon a successful
* completion, \a poh returns the pool handle, and \a info returns the latest
* pool information.
*
* \param[in] uuid UUID to identify a pool.
* \param[in] grp Process set name of the DAOS servers managing the pool
* \param[in] svc Pool service replica ranks, as reported by
* daos_pool_create().
* \param[in] flags Connect mode represented by the DAOS_PC_ bits.
* \param[out] poh Returned open handle.
* \param[out] info Returned pool info.
* \param[in] ev Completion event, it is optional and can be NULL.
* The function will run in blocking mode if \a ev is NULL.
*
* \return These values will be returned by \a ev::ev_error in
* non-blocking mode:
* 0 Success
* -DER_INVAL Invalid parameter
* -DER_UNREACH Network is unreachable
* -DER_NO_PERM Permission denied
* -DER_NONEXIST Pool is nonexistent
*/
int
daos_pool_connect(const uuid_t uuid, const char *grp,
const d_rank_list_t *svc, unsigned int flags,
daos_handle_t *poh, daos_pool_info_t *info, daos_event_t *ev);
wrong:
/* Connect */
int
daos_pool_connect(const uuid_t uuid, const char *grp,
const d_rank_list_t *svc, unsigned int flags,
daos_handle_t *poh, daos_pool_info_t *info, daos_event_t *ev);

Don't use inline unless you're doing something so performance critical that the function call overhead will make a difference -- in other words - almost never. It makes debugging harder and overuse can actually hurt performance by causing instruction cache thrashing or crashes due to stack overflow.

Do not use typedef for new declarations, as this hides the type of a field for very little benefit. Never typedef pointers, as the * makes C pointer declarations obvious. Hiding it inside a typedef just obfuscates the code.

Do not embed assignments inside Boolean expressions. Although this can make the code one line shorter, it doesn't make it more understandable and you increase the risk of confusing =with == or getting operator precedence wrong if you skimp on brackets. It's even easier to make mistakes when reading the code, so it's much safer simply to avoid it altogether.

right:
        ptr = malloc(size);
        if (ptr != NULL) {
                ...

wrong:
        if ((ptr = malloc(size)) != NULL) {
                ...

Conditional expressions read more clearly if only Boolean expressions are implicit (i.e., non-Boolean and pointer expressions compare explicitly with 0 and NULL respectively.)

right:
        void *pointer = NULL;
        bool  writing = false;
        int   ref_count = 0;
 
        if (!writing &&     /* not writing? */
            pointer != NULL && /* valid pointer? */
            reval == 0)     /* no error is returned? */
                do_this();

wrong:
        if (writing == 0 && /* not writing? */
            pointer &&      /* valid pointer or not? */
            !retval)        /* return value is good or bad? */
               not_clear_if_we_do_this_or_not();

Use parentheses to help readability and reduce the chance of operator precedence errors, but not so heavily that it is difficult to determine which parentheses are a matched pair or accidentally hide bugs (e.g. the aforementioned assignments inside Boolean expressions).

right:
        if (a->a_field == 3 &&
            ((bar & BAR_MASK_1) || (bar & BAR_MASK_2)))
                do this();

wrong:
        if (((a == 1) && (b == 2) && ((c == 3) && (d = 4))))
                maybe_do_this()

wrong:
        if (a->a_field == 3 || b->b_field & BITMASK1 && c->c_field & BITMASK2)
                maybe_do_this()

wrong:
        if ((((a->a_field) == 3) || (((b->b_field) & (BITMASK1)) &&
           ((c->c_field) & (BITMASK2)))))
                maybe_do_this()

Code can be much more readable and efficient if the simple or common actions are taken first in a set of tests. Re-ordering conditions like this also eliminates excessive nesting and helps avoid overflowing the 80-column limit.

Do not place else blocks after terminal statements like returncontinuebreak, or goto in the if block.

right:
        list_for_each_entry(...) {
                if (!condition1) {
                        do_sth1;
                        continue;
                }
 
                do_sth2_1;
                do_sth2_2;
                ...
                do_sth2_N;
 
                if (!condition2)
                        break;
 
                do_sth3_1;
                do_sth3_2;
                if (!condition3)
                        break;
                ...
                do_sth3_N;
        }

wrong:
        list_for_each_entry(...) {
                if (condition1) {
                        do_sth2_1;
                        do_sth2_2;
                        ...
                        do_sth2_N;
                        if (condition2) {
                                do_sth3_1;
                                do_sth3_2;
                                if (condition3) {
                                        ...
                                        do_sth3_N;
                                        continue;
                                }
                        }
                        break;
                } else {
                        do_sth1;
                }
        }

Function bodies that have complex state to clean up in case of an error should do so only once at the end of the function with a single return, and use goto to jump there in case of an error:

right:
static int
bar_init(...) { foo = alloc_foo(); if (foo == NULL) return -DER_NOMEM; rc = setup_foo(foo); if (rc) goto free_foo; rc = init_bar(bar); if (rc) goto cleanup_foo;


        return 0;
 
cleanup_foo:
        cleanup_foo(foo);
free_foo:
        free_foo(foo);
 
        return rc;
}

wrong
static int
bad_func(...) { foo = alloc_foo(); if (foo == NULL) return -DER_NOMEM; rc = setup_foo(foo); if (rc != 0) { free_foo(foo); return rc;
} rc = init_bar(bar); if (rc) { cleanup_foo(foo); free_foo(foo); return rc; } return rc; }

Variable should be declared one per line, type and name, even if there are multiple variables of the same type. There should be one space between the variable type and the variable name and when multiple variables are declared, the names must be aligned with tabs. For maximum readability, longer or more important declarations should be at the top. There should always be an empty line after the variable declarations, before the start of the code. There shouldn't be complex variable initializers (e.g. function calls) in the declaration block, since this may be easily missed by the reader and make it confusing to debug.

right:
        struct vos_container  *cont;
struct d_uuid *pkey;
int count;
int rc;

cont = container_of(ulink, struct vos_container, vc_uhlink); 
 
wrong:
        int rc, count;
struct vos_container *cont;
cont = container_of(ulink, struct vos_container, vc_uhlink); 
struct d_uuid *pkey;
        

Variable declarations should be kept to the most internal scope, if practical and reasonable, to simplify understanding of where these variables are accessed and modified, and to avoid errors of using/reusing variables that only have meaning if certain branches are used:

right:
        int len;
 
        if (len > 0) {
                struct obj  *my_obj;
                int          count;

                my_obj = oget(foo);
 
                count = my_obj->size;
 
                if (count > 32) {
                        int left = count;
 
                        :
                }
        }

wrong:
        int len = path_length(bar), count, ret;
        struct obj *my_obj = oget(foo);
 
        if (len > 0) {
                count = my_obj->size;
                if (count > 42)
                       ret = 0;
        } else if (len == 0) {
                ret = -ENODATA;
        } else {
                D_ERROR("count is bad: %d\n", count);
        }

        return ret;

Even for short conditionals, the operation should be on a separate line:

right:
        if (foo)
                bar();

wrong:
        if (foo) bar();

When you wrap a line containing parenthesis, start the continued line after the parenthesis so that the expression or argument is visually bracketed.

right:
        variable = do_something_complicated(long_argument, longer_argument,
                                            longest_argument(sub_argument,
                                                             foo_argument),
                                            last_argument);
 
        if (some_long_condition(arg1, arg2, arg3) < some_long_value &&
            another_long_condition(very_long_argument_name,
                                   another_long_argument_name) >
            second_long_value) {
                do_something();
                ...
 
wrong:
        variable = do_something_complicated(long_argument, longer_argument,
                longest_argument(sub_argument, foo_argument),
                last_argument);
 
        if (some_long_condition(arg1, arg2, arg3) < some_long_value &&
                another_long_condition(very_long_argument_name,
                another_long_argument_name) >
                second_long_value) {
                do_something();
                ...

If you're wrapping an expression, put the operator at the end of the line. If there are no parentheses to which to align the start of the next line, just indent one more tab stop.

       off = le32_to_cpu(fsd->fsd_client_start) +
               cl_idx * le16_to_cpu(fsd->fsd_client_size);

Binary and ternary (but not unary) operators should be separated from their arguments by one space.

right:
        a++;
        b |= c;
        d = (f > g) ? 0 : 1;

Function calls should be nestled against the parentheses, the parentheses should crowd the arguments, and one space should appear after commas:

right:
       do_foo(bar, baz);

wrong:
       do_foo ( bar,baz );

Put a space between if, for, while, etc. and the following parenthesis. Put a space after each semicolon in a for statement.

right:
        for (a = 0; a < b; a++)
        if (a < b || a == c)
        while (1)

wrong:
        for( a=0; a<b; a++ )
        if( a<b || a==c )
        while( 1 )

Opening braces should be on the same line as the line that introduces the block, except for function calls. Bare closing braces (i.e. not else or while in do/while) get their own line.

int foo(void)
{
        if (bar) {
                this();
                that();
        } else if (baz) {
                stuff();
        } else {
                other_stuff();
        }
     
        do {
                cow();
        } while (condition);
}

If one part of a compound if block has braces, all should.

right:
        if (foo) {
                bar();
                baz();
        } else {
                salmon();
        }

wrong:
        if (foo) {
                bar();
                baz();
        } else
                moose();


External function declarations absolutely should *NEVER* go into .c files. The only exception is forward declarations for static functions in the same file that can't otherwise be moved before the caller. Instead, the declaration should go into the most "local" header available (e.g. subsystem_internal.h for a given subdirectory). Having external function declarations in .c files can cause very difficult to diagnose runtime bugs, because the compiler takes the local declaration as correct, can not compare it to the actual function declared in a different file, and does not have a declaration in a header to compare it against, but the linker does not check that the number and type of function arguments match.

Function declarations in header files should include the variable names for the parameters, so that they are self explanatory in the header without having to look at the code to see what the parameter is:

right:
    void ldlm_lock_addref_internal(struct ldlm_lock *lock, enum ldlm_lock_mode lock_mode, int refcount, int rc);

wrong:
    void ldlm_lock_addref_internal(struct ldlm_lock *, int, int, int)

Structure fields should have a common prefix derived from the structure name, so that they are easily found in the code and tag-jump works properly. This avoids problems with generic field names like page or count that have dozens or hundreds of potential matches in the code.

Structure and constant declarations should not be declared in multiple places. Put the struct into the most "local" header possible.

Structure initialization should be done by field names instead of using positional arguments:

struct foo {
        int     foo_val;
        int     foo_flag;
        int     foo_count;
        void   *foo_ptr;
        void   *foo_bar_ptr;
        void   *foo_baz_ptr;
};

right:
void func(void *data)
{
        struct foo fooz = { .foo_val = 1, .foo_flag = 0x20, .foo_ptr = data, .foo_baz_ptr = param };
}

wrong:
void func(void *data)
{
        /* not sure which field is being initialized, breaks (maybe silently) if structure adds a new field */
        struct foo fooz = { 1, 0x20, data, param };
}

When allocating/freeing a single struct, use D_ALLOC_PTR() for clarity:

right:
        D_ALLOC_PTR(mds_body);
        D_FREE_PTR(mds_body);

wrong:
        D_ALLOC(mds_body, sizeof(*mds_body));
        D_FREE(mds_body, sizeof(*mds_body));

Macros

When you define a macro, protect callers by placing parentheses round every parameter reference in the body.

Line up the backslashes of multi-line macros one tabstop from the end of the line to help readability.

Use a do/while (0) block with no trailing semicolon to ensure multi-statement macros are syntactically equivalent to procedure calls.

right:
#define DO_STUFF(a)                                     \
do {                                                    \
        int b = (a) + MAGIC;                            \
        do_other_stuff(b);                              \
} while (0)

wrong:
#define DO_STUFF(a) \
do { \
        int b = a + MAGIC; \
        do_other_stuff(b); \
} while (0);

If you write conditionally compiled code in a procedure body, make sure you do not create unbalanced braces, quotes, etc. This really confuses editors that navigate expressions or use fonts to highlight language features. It can often be much cleaner to put the conditionally compiled code in its own helper function which, by good choice of name, documents itself too.

right:
int do_stuff(struct object *obj)
{
        if (invalid_object(obj)) {
                ...

wrong:
int do_stuff(struct object *obj
#ifdef HAVE_FOO_FEATURE
             , struct foo_extra *foo)
#else
             )
#endif
{
#ifdef HAVE_FOO_FEATURE
        do_other_stuff_with_foo(obj, foo);
#else
        do_other_stuff(obj, foo);
#endif
                ...

If you nest preprocessor commands, use spaces to visually delineate:

#ifdef __USE_STEAK__
# include <goose>
# define MOOSE steak
#else /* !__USE_STEAK__ \*/
# include <mutton>
# define MOOSE prancing
#endif

For long or nested #ifdef blocks, include the conditional as a comment with each #else and #endif to make it clear which block is being terminated:

#ifdef __KERNEL__
# if HAVE_SOME_KERNEL_API
/* lots
   of
   stuff */
# endif /* HAVE_SOME_KERNEL_API */
#else /* !__KERNEL__ */
# if HAVE_ANOTHER_FEATURE
/* more
 * stuff */
# endif /* HAVE_ANOTHER_FEATURE */
#endif /* __KERNEL__ */

Comments

Single-line comments should have the leading /* and trailing */ on the same line as the comment. Multi-line comments should have the leading /* and trailing */ on their own lines, to match the upstream kernel coding style. Intermediate lines should start with a * aligned with the * on the first line:

/** This is a short comment */
 
/**
 * This is a multi-line comment.  I wish the line would wrap already,
 * as I don't have much to write about.
 */

Assertions

Do not embed operations inside assertions. If assertions are disabled for performance reasons this code will not be executed.

right:
        len = strcat(foo, bar);
        D_ASSERT(len > 0);

wrong:
        D_ASSERT(strcat(foo, bar) > 0);

Assertions should be used carefully only for detecting internal consistent state and never for input data from the user, data from the network or for legitimate error condition (e.g. allocation failure). Don't combine multiple sanity checks in the same ASSERT() since it is difficult to find out which one failed.

right:
        D_ASSERT(len > 0);
D_ASSERT(obj) wrong: D_ASSERT(len > 0 && obj);

Error Messages

Messages on the console (D_ERROR()D_WARN()) should be written so they provide useful information to the *administrator* and/or support person in case of a *significant event* or *failure condition*. They should not print "debugging" information in cases that might be hit under normal usage or user-generated race conditions, since verbose console error messages lose the important messages in a flood of noise. Consider that there may be thousands of servers and tens of thousands of clients hitting some failure at the same time.

Print the pool or container UUID at the start of the message, since there are usually multiple targets running on a single server or multiple mountpoints on a client. The error messages should also include enough information to make some useful diagnosis of the problem (e.g. FID and/or filename, client NID, actual values that caused failures, etc). Otherwise, there is little value in having printed an error, but then needing to reproduce the problem to diagnose it:

right:
        D_ERROR("%s: unsupported incompat filesystem feature(s) %x\n", DP_UUID(po_uuid), incompat);
        D_ERROR("%s: cannot create root entry: rc = %d\n", DP_UUID(co_uuid), rc);
        D_ERROR("%s: error initializing object: rc = %d\n",DP_UUID(co_uuid), rc);
D_ERROR("%s: cannot start coordinator thread: rc = %d\n", mdt_obd_name(mdt), rc);
wrong:
        D_ERROR("Cannot get handle\n");
        D_ERROR("NULL bitmap!\n");
        D_ERROR("invalid event\n");
        D_ERROR("allocation failed\n");

Error messages that print a numeric error value should print it at the end of the line in a consistent format using ": rc = %d\n":

right:
        D_ERROR("%s: error creating pool %s %s %s: rc = %d",
        D_ERROR("%s: cannot destroy container: rc = %d\n", DP_UUID(co_uuid) ,rc);
wrong:
        D_ERROR("err %d on param %d\n", rc, ptr);
        D_ERROR("Can't get index (%d)\n", rc);
  • No labels