Error handling
Unlike C (including C11), C++ provides native support for exceptions, so the C++ API reports errors to its caller by throwing exceptions. Instead of following the usual C convention of returning an integer status (0 for success, a negative value on failure), every API returns its expected output on success (or void if there is no output) and throws an exception on error. Callers therefore handle errors with try-catch blocks rather than by checking return values.
The OpenFAM 2.0 implementation defines a Fam_Exception class, which is derived from the C++ standard exception class. It also defines a list of error numbers, each identifying a specific error condition, to categorize the various types of failure. The Fam_Exception object received by the caller on error contains a specific error number and an appropriate error message string. The application can retrieve this information using the member functions fam_error() and fam_error_msg()/what(), and take any necessary action. The currently defined exception class and error numbers are listed in Table 1 and Table 2.
Fam Exception Class | Description |
---|---|
Fam_Exception | An object of this exception class is thrown for all error conditions. It contains a specific error number. |
Fam Error | Description |
---|---|
FAM_ERR_UNKNOWN | Unexpected or Unknown errors. |
FAM_ERR_NOPERM | Caller does not have access rights for the desired operation. |
FAM_ERR_TIMEOUT | Blocking APIs reached retry/timeout limit. |
FAM_ERR_INVALID | APIs called with invalid options/arguments. |
FAM_ERR_LIBFABRIC | Libfabric API failure. |
FAM_ERR_SHM | Shared memory allocator error. |
FAM_ERR_NOT_CREATED | Data item or region creation in FAM failed. |
FAM_ERR_NOTFOUND | Data item or region not found in FAM. |
FAM_ERR_ALREADYEXIST | Data item or region already exists in FAM. |
FAM_ERR_ALLOCATOR | Allocator-specific error. |
FAM_ERR_RPC | Error from the gRPC layer. |
FAM_ERR_PMI | Runtime error. |
FAM_ERR_OUTOFRANGE | Data access out of range. |
FAM_ERR_NULLPTR | Null pointer access error. |
FAM_ERR_UNIMPL | Calling unimplemented functions/APIs. |
FAM_ERR_RESOURCE | Resource not available. |
FAM_ERR_INVALIDOP | Invalid operation. |
FAM_ERR_RPC_CLIENT_NOTFOUND | RPC service not available. |
FAM_ERR_MEMSERV_LIST_EMPTY | Memory service not initialized. |
FAM_ERR_METADATA | Metadata service error. |
FAM_ERR_MEMORY | Memory service error. |
FAM_ERR_NAME_TOO_LONG | Region or Data item name too long. |
FAM_ERR_ATL_QUEUE_FULL | Atomic large transfer APIs queue full. |
FAM_ERR_ATL_QUEUE_INSERT | Atomic large transfer APIs queue insert error. |
FAM_ERR_ATL_NOT_ENABLED | Atomic large transfer APIs not enabled. |
FAM_ERR_ATL | Atomic large transfer API error. |
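
The error numbers above are meant to be inspected by the caller, so a small sketch of dispatching on them may help. To keep the example compilable without the library, it defines its own stand-in Fam_Error enum and Fam_Exception class; the enumerator values, the constructor, and the helper run_and_classify() are assumptions made for illustration, and a real application would include the OpenFAM headers instead.

```cpp
#include <exception>
#include <functional>
#include <string>

// Stand-in definitions so the sketch compiles on its own. The enumerator
// values are placeholders, not the values used by the library.
enum Fam_Error {
    FAM_ERR_UNKNOWN,
    FAM_ERR_NOPERM,
    FAM_ERR_TIMEOUT,
    FAM_ERR_INVALID,
    FAM_ERR_NOTFOUND
};

// Stand-in exception type exposing the accessors named in the text.
class Fam_Exception : public std::exception {
  public:
    Fam_Exception(Fam_Error err, const char *msg) : err_(err), msg_(msg) {}
    int fam_error() const { return err_; }
    const char *fam_error_msg() const { return msg_.c_str(); }
    const char *what() const noexcept override { return msg_.c_str(); }

  private:
    Fam_Error err_;
    std::string msg_;
};

// Hypothetical helper: run an operation and map any Fam_Exception to a
// short description of how the application might react.
std::string run_and_classify(const std::function<void()> &op) {
    try {
        op();
        return "ok";
    } catch (Fam_Exception &e) {
        switch (e.fam_error()) {
        case FAM_ERR_TIMEOUT:
            return std::string("retry: ") + e.fam_error_msg();
        case FAM_ERR_NOPERM:
        case FAM_ERR_INVALID:
            return std::string("fix caller: ") + e.fam_error_msg();
        case FAM_ERR_NOTFOUND:
            return std::string("create first: ") + e.fam_error_msg();
        default:
            return std::string("abort: ") + e.what();
        }
    }
}
```

Which reaction suits which error number is an application decision; the grouping shown here is only one possibility.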
Note that the library provides both blocking and non-blocking calls for most data path operations. On error, all blocking calls throw exceptions immediately. For example, a call to fam_put_blocking() will either complete successfully or throw a Fam_Exception object containing one of the following error numbers: FAM_ERR_INVALID, FAM_ERR_OUTOFRANGE, FAM_ERR_NOTFOUND or FAM_ERR_TIMEOUT. In general, the application should use a normal try-catch block to handle exceptions:

    try {
        fam_put_blocking();
    } catch (Fam_Exception &e) {
        // Exception handling code
    }

However, non-blocking calls are queued within the library and may not throw exceptions immediately. Depending on the underlying error, some exceptions are thrown immediately with a specific error number, while others may only be thrown during the next fam_quiet() call. Thus the code may look like:

    try {
        fam_put_nonblocking();
    } catch (Fam_Exception &e) {
        // Handle error numbers, for example:
        // FAM_ERR_INVALID, FAM_ERR_NOTFOUND, FAM_ERR_OUTOFRANGE, FAM_ERR_NOPERM
    }
    // ... Continue rest of the code ...
    try {
        fam_quiet();
    } catch (Fam_Exception &e) {
        // This exception may actually result from a previous
        // fam_put_nonblocking() operation, with one of these error numbers:
        // FAM_ERR_TIMEOUT, FAM_ERR_OUTOFRANGE, FAM_ERR_NOPERM
    }

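
To make the timing of deferred errors concrete, the following toy model mimics the behaviour described above: an argument error surfaces at the non-blocking call itself, while a failure of the queued operation surfaces only at the next quiet. Everything in it is an assumption for illustration (the ToyFam class, its two boolean parameters, the stand-in Fam_Exception, and the helper where_reported()); it does not reflect how the library is implemented.

```cpp
#include <exception>
#include <string>
#include <vector>

// Stand-in definitions; enumerator values are placeholders.
enum Fam_Error { FAM_ERR_INVALID, FAM_ERR_TIMEOUT };

class Fam_Exception : public std::exception {
  public:
    Fam_Exception(Fam_Error err, const char *msg) : err_(err), msg_(msg) {}
    int fam_error() const { return err_; }
    const char *what() const noexcept override { return msg_.c_str(); }

  private:
    Fam_Error err_;
    std::string msg_;
};

// Toy model of a library that queues non-blocking operations.
class ToyFam {
  public:
    // Argument errors are detected up front and thrown immediately;
    // a failure of the queued operation is only recorded for later.
    void put_nonblocking(bool valid_args, bool times_out) {
        if (!valid_args)
            throw Fam_Exception(FAM_ERR_INVALID, "invalid arguments");
        if (times_out)
            pending_.push_back(Fam_Exception(FAM_ERR_TIMEOUT, "timed out"));
    }

    // Waits for queued operations; rethrows the first recorded failure.
    void quiet() {
        if (pending_.empty())
            return;
        Fam_Exception first = pending_.front();
        pending_.clear();
        throw first;
    }

  private:
    std::vector<Fam_Exception> pending_;
};

// Hypothetical driver: issue one put, then quiet, and report which of
// the two calls (if any) raised the exception.
std::string where_reported(bool valid_args, bool times_out) {
    ToyFam fam;
    try {
        fam.put_nonblocking(valid_args, times_out);
    } catch (Fam_Exception &e) {
        return std::string("at call: ") + e.what();
    }
    // ... other work could happen here ...
    try {
        fam.quiet();
    } catch (Fam_Exception &e) {
        return std::string("at quiet: ") + e.what();
    }
    return "none";
}
```

In this model, where_reported(false, false) reports the error at the call, and where_reported(true, true) reports it at the quiet, which is why both try-catch blocks are needed.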
Note that uncaught exceptions will result in the application being terminated.
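
One way to avoid abrupt termination is a catch-all guard around the application body, sketched below. The helper run_guarded() and the minimal stand-in Fam_Exception are hypothetical; the sketch relies only on the fact, stated earlier, that Fam_Exception derives from the C++ standard exception class.

```cpp
#include <cstdlib>
#include <exception>
#include <functional>
#include <iostream>
#include <string>

// Minimal stand-in for the library's exception type (assumption: only
// the std::exception base and what() matter for this sketch).
class Fam_Exception : public std::exception {
  public:
    explicit Fam_Exception(const char *msg) : msg_(msg) {}
    const char *what() const noexcept override { return msg_.c_str(); }

  private:
    std::string msg_;
};

// Hypothetical wrapper for the body of main(): report the error and
// return a failure status instead of letting the exception escape.
int run_guarded(const std::function<void()> &app_body) {
    try {
        app_body();
        return EXIT_SUCCESS;
    } catch (Fam_Exception &e) {
        std::cerr << "FAM error: " << e.what() << '\n';
        return EXIT_FAILURE;
    } catch (std::exception &e) {
        std::cerr << "error: " << e.what() << '\n';
        return EXIT_FAILURE;
    }
}
```

A main() function could then simply return run_guarded(...) with the application logic passed in as a lambda.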