Enhances Timer with format options, enable toggle, and GPU sync#4785
Enhances Timer with format options, enable toggle, and GPU sync#4785ooctipus wants to merge 4 commits intoisaac-sim:developfrom
Conversation
032dc3e to
67e8d29
Compare
Greptile SummaryThis PR enhances Key observations:
Confidence Score: 3/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant User
participant Timer
participant warp as wp (Warp)
participant GlobalState as Timer.timing_info / _welford_state
User->>Timer: Timer(msg, name, enable, time_unit)
Note over Timer: Captures Timer.enable at init<br/>Validates time_unit
User->>Timer: start()
alt _enable is False
Timer-->>User: return (no-op)
else _enable is True
Timer->>Timer: _start_time = perf_counter()
end
User->>Timer: stop()
alt _enable is False
Timer-->>User: return (no-op)
else _enable is True
Timer->>warp: wp.synchronize() ⚠️ all devices
warp-->>Timer: GPU work flushed
Timer->>Timer: _elapsed_time = perf_counter() - _start_time
alt name is not None
Timer->>GlobalState: _update_welford(elapsed)
GlobalState-->>Timer: timing_info[name] updated
end
end
opt Context manager __exit__
Timer->>User: print(msg, Last/Mean/Std/N)
end
Last reviewed commit: 804830e |
Additional Comments (1)
|
67e8d29 to
804830e
Compare
|
@greptile |
| raise TimerError("Timer is not running. Use .start() to start it") | ||
|
|
||
| # Synchronize the device to make sure we time the whole operation | ||
| wp.synchronize() |
There was a problem hiding this comment.
wp.synchronize() syncs all devices, not just the current one
The PR description explicitly states that wp.synchronize_device() should be used "before stopping to ensure accurate GPU timing", but the implementation calls wp.synchronize() instead. These are meaningfully different:
wp.synchronize()– blocks until all CUDA devices finish their work.wp.synchronize_device()– blocks until only the current (active) device finishes its work.
In a multi-GPU setup this causes unnecessarily broad synchronisation, stalling work on unrelated devices and inflating the measured time. The more targeted call matches the stated intent and should be used:
| wp.synchronize() | |
| wp.synchronize_device() |
| def stop(self): | ||
| """Stop timing.""" | ||
| if not self._enable: | ||
| return | ||
|
|
||
| if self._start_time is None: | ||
| raise TimerError("Timer is not running. Use .start() to start it") | ||
|
|
||
| # Synchronize the device to make sure we time the whole operation | ||
| wp.synchronize() |
There was a problem hiding this comment.
Unconditional GPU sync adds overhead to CPU-only timers
wp.synchronize() is called on every stop() regardless of whether any GPU kernels were submitted during the timed interval. For timers used around pure CPU work (e.g., data loading, configuration, bookkeeping) this adds a non-trivial, unnecessary stall.
Consider making GPU synchronization opt-in via a parameter (e.g., sync_cuda: bool = True) so that callers who know they are timing CPU-only paths can avoid the extra cost:
def __init__(self, ..., sync_cuda: bool = True):
...
self._sync_cuda = sync_cudadef stop(self):
if not self._enable:
return
if self._start_time is None:
raise TimerError("Timer is not running. Use .start() to start it")
if self._sync_cuda:
wp.synchronize_device()
...| self._start_time = None | ||
| self._stop_time = None | ||
| self._elapsed_time = None | ||
| self._enable = enable if Timer.enable else False |
There was a problem hiding this comment.
Global Timer.enable is only evaluated at instantiation
self._enable is captured once during __init__. If Timer.enable is later set to False, existing timer instances are unaffected and will continue to record timing data. Conversely, setting Timer.enable = False after instance creation has no effect on live timers.
This behaviour is implicitly tested (the test sets Timer.enable = False before creating the instance), but the docstring for enable does not clarify this. Users who expect a global kill-switch that affects already-created timers will be surprised. Consider adding a note to the enable docstring:
enable: ClassVar[bool] = True
"""Whether to enable newly created timers.
Note:
This flag is evaluated at instantiation time. Changing it after a
:class:`Timer` has been created does not affect that instance.
"""
Description
Adds configurable time format (s/ms/us/ns), global enable/enable_display_output toggles, and wp.synchronize_device() before stopping to ensure accurate GPU timing. Output now shows last/mean/std/n per measurement.
Credit Fully goes to : @AntoineRichard I just copied pasted his solution from dev/newton
Type of change
Screenshots
Please attach before and after screenshots of the change if applicable.
Checklist
pre-commitchecks with./isaaclab.sh --formatconfig/extension.tomlfileCONTRIBUTORS.mdor my name already exists there