others
FLAGS_benchmark
(since 0.12.0)
Used for benchmarking. If set, scope deletion becomes synchronous, extra memory usage logs are emitted, and all CUDA kernels are synchronized after each kernel launch.
Values accepted
Bool. The default value is False.
Example
FLAGS_benchmark=True enables these synchronizations and the extra memory usage logging for benchmarking.
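As a minimal sketch, a flag such as this one could be set from Python through the process environment. This assumes the framework reads FLAGS_* environment variables when it is first imported, so the variable has to be set before that import; the commented-out import line is a placeholder, not something this section specifies.

```python
import os

# Assumption: the framework picks up FLAGS_* environment variables at import
# time, so the flag must be in the environment before the import happens.
os.environ["FLAGS_benchmark"] = "True"

# import paddle  # placeholder: import the framework only after setting flags

# Confirm what the framework would see in its environment.
print(os.environ["FLAGS_benchmark"])
```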
FLAGS_inner_op_parallelism
(since 1.3.0)
Most operators run in single-threaded mode, but some operators are better suited to multiple threads. For example, an optimization op that optimizes sparse gradients runs much faster with multiple threads. This flag sets the number of threads used inside an operator.
Values accepted
Int32. The default value is 0, which means that operators will not run in multi-threaded mode.
Example
FLAGS_inner_op_parallelism=5 will set the thread number inside an operator to 5.
Note
Currently, only the sparse adam op supports inner_op_parallelism.
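One possible way to pick a value is sketched below. The helper inner_op_threads is hypothetical, and capping the value at the machine's CPU count is an illustrative design choice rather than a rule from this documentation. It also assumes the flag is read from the environment when the framework starts.

```python
import os

def inner_op_threads(requested: int) -> int:
    """Hypothetical helper: clamp a requested thread count to [0, CPU count].

    0 keeps the default single-threaded behavior described above.
    """
    cpus = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(0, min(requested, cpus))

threads = inner_op_threads(5)

# Assumption: the flag is read from the environment at framework startup.
os.environ["FLAGS_inner_op_parallelism"] = str(threads)
print(f"FLAGS_inner_op_parallelism={threads}")
```

On a machine with fewer than five cores this sketch lowers the value to the core count; dropping the cap is equally valid if oversubscription is acceptable.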
FLAGS_max_body_size
(since 1.0.0)
It controls the max message size in BRPC.
Values accepted
Int32. The default value is 2147483647.
Example
FLAGS_max_body_size=2147483647 will set the maximum BRPC message size to 2147483647.
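The default, 2147483647, is the largest value a signed 32-bit integer can hold (2**31 - 1), so larger settings do not fit the flag's Int32 type. The sketch below checks a candidate value before formatting the assignment; format_max_body_size is a hypothetical helper, and rejecting non-positive values is an assumption made here for illustration.

```python
INT32_MAX = 2**31 - 1  # 2147483647, the documented default

def format_max_body_size(size: int) -> str:
    """Hypothetical helper: validate the value and format the flag assignment."""
    # Assumption: only positive values that fit in Int32 are meaningful.
    if not 0 < size <= INT32_MAX:
        raise ValueError(f"size must be in 1..{INT32_MAX}, got {size}")
    return f"FLAGS_max_body_size={size}"

print(format_max_body_size(INT32_MAX))
```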
FLAGS_sync_nccl_allreduce
(since 1.3)
If FLAGS_sync_nccl_allreduce is true, cudaStreamSynchronize(nccl_stream) is called in allreduce_op_handle. This mode can give better performance in some scenarios.
Values accepted
Bool. The default value is True.
Example
FLAGS_sync_nccl_allreduce=True will call cudaStreamSynchronize(nccl_stream) in allreduce_op_handle.
FLAGS_tracer_profile_fname
(since 1.4.0)
FLAGS_tracer_profile_fname specifies the filename of the profile for the imperative tracer, which is generated by gperftools. It is only valid when compiled with WITH_PROFILER=ON. Empty if disabled.
Values accepted
String. The default value is "gperf".
Example
FLAGS_tracer_profile_fname="gperf_profile_file" will set the profiler filename for imperative tracer to "gperf_profile_file".
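Flags can also be scoped to a single run by passing them only to a child process. The sketch below uses a hypothetical helper, run_with_flags, and a stand-in child command that just echoes the variable; in practice the command would be the training script. It assumes the child reads FLAGS_* from its environment.

```python
import os
import subprocess
import sys

def run_with_flags(flags, argv):
    """Hypothetical helper: run argv with extra FLAGS_* variables in its environment."""
    env = dict(os.environ)  # copy, so the parent environment stays unchanged
    env.update({name: str(value) for name, value in flags.items()})
    result = subprocess.run(argv, env=env, capture_output=True, text=True, check=True)
    return result.stdout

# Stand-in child process that prints the flag it received.
child = [sys.executable, "-c",
         "import os; print(os.environ['FLAGS_tracer_profile_fname'])"]
output = run_with_flags({"FLAGS_tracer_profile_fname": "gperf_profile_file"}, child)
print(output.strip())
```

Because the parent environment is copied rather than modified, other runs launched from the same process are unaffected.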