others

FLAGS_benchmark

(since 0.12.0)

Enables benchmark mode. If set, scope deletion is made synchronous, extra memory usage logs are emitted, and all CUDA kernels are synchronized after each kernel launch.

Values accepted

Bool. The default value is False.

Example

FLAGS_benchmark=True will enable the synchronizations described above for benchmarking.
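As a minimal sketch of how such a flag might be set from Python, the snippet below stores it as an environment variable. It assumes that FLAGS_* values are read from the environment when the framework starts, so the variable has to be set before the framework is imported. The helper name set_flag is hypothetical and is not part of any library.

```python
import os

def set_flag(name, value, environ=os.environ):
    """Hypothetical helper: store a FLAGS_* setting as an environment string."""
    if not name.startswith("FLAGS_"):
        raise ValueError("flag names are expected to start with 'FLAGS_'")
    environ[name] = str(value)
    return environ[name]

# Assumption: the framework reads FLAGS_* from the environment at start-up,
# so this must run before the framework is imported.
set_flag("FLAGS_benchmark", True)
print(os.environ["FLAGS_benchmark"])  # prints "True"
```

The same effect can be had by exporting the variable in the shell that launches the program.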

FLAGS_inner_op_parallelism

(since 1.3.0)

Most operators run in single-thread mode, but some operators are better suited to multiple threads. For example, an optimization op that optimizes sparse gradients runs much faster with multiple threads. This flag sets the number of threads used inside an operator.

Values accepted

Int32. The default value is 0, which means that operators will not run in multi-thread mode.

Example

FLAGS_inner_op_parallelism=5 will set the number of threads inside an operator to 5.

Note

Currently, only the sparse adam op supports inner_op_parallelism.
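One way to apply the flag to a single run without changing the parent environment is to pass it to a child process. The sketch below is illustrative only: it assumes the flag is picked up from the child's environment, run_with_flags is a hypothetical helper, and the child command simply echoes the variable in place of a real training script.

```python
import os
import subprocess
import sys

def run_with_flags(argv, flags):
    """Hypothetical helper: run argv with extra FLAGS_* environment variables."""
    env = dict(os.environ)  # copy, so the parent environment is left unchanged
    env.update({name: str(value) for name, value in flags.items()})
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=True)

# Stand-in for a training script: the child just prints the flag it received.
child = [sys.executable, "-c",
         "import os; print(os.environ['FLAGS_inner_op_parallelism'])"]
result = run_with_flags(child, {"FLAGS_inner_op_parallelism": 5})
print(result.stdout.strip())  # prints "5"
```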

FLAGS_max_body_size

(since 1.0.0)

It controls the maximum message size in BRPC.

Values accepted

Int32. The default value is 2147483647.

Example

FLAGS_max_body_size=2147483647 will set the maximum BRPC message size to 2147483647.
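The default, 2147483647, is the largest value a signed 32-bit integer can hold (2^31 - 1), so a larger setting cannot be represented by an Int32 flag. The sketch below checks a candidate value against that range before it is used. The helper check_int32 is hypothetical, and the 64 MiB figure is only an illustrative number that assumes the size is counted in bytes.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1  # 2147483647, the documented default

def check_int32(value):
    """Hypothetical helper: return value if it fits in a signed 32-bit integer."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("expected an integer")
    if not INT32_MIN <= value <= INT32_MAX:
        raise ValueError(f"{value} does not fit in Int32")
    return value

# Illustrative value only: 64 MiB, assuming the limit is expressed in bytes.
limit = check_int32(64 * 1024 * 1024)
print(f"FLAGS_max_body_size={limit}")  # prints "FLAGS_max_body_size=67108864"
```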

FLAGS_sync_nccl_allreduce

(since 1.3)

If FLAGS_sync_nccl_allreduce is true, cudaStreamSynchronize(nccl_stream) is called in allreduce_op_handle. This mode can give better performance in some scenarios.

Values accepted

Bool. The default value is True.

Example

FLAGS_sync_nccl_allreduce=True will call cudaStreamSynchronize(nccl_stream) in allreduce_op_handle.

FLAGS_tracer_profile_fname

(since 1.4.0)

FLAGS_tracer_profile_fname specifies the filename of the profile that gperftools generates for the imperative tracer. It is only valid when compiled with WITH_PROFILER=ON. An empty value means the profiler is disabled.

Values accepted

String. The default value is "gperf".

Example

FLAGS_tracer_profile_fname="gperf_profile_file" will set the profiler filename for the imperative tracer to "gperf_profile_file".