| author | bors[bot] <26634292+bors[bot]@users.noreply.github.com> | 2021-10-18 12:05:43 +0000 |
|---|---|---|
| committer | GitHub <[email protected]> | 2021-10-18 12:05:43 +0000 |
| commit | 729b17bc25fed42b4348cae0fb3d781590572c3f (patch) | |
| tree | 6ad5c5d3d181aaa879a3261136312eb1827720ba /examples | |
| parent | b22c472af3a7e88c2855e6de216dcfa15ff155d1 (diff) | |
| parent | d32477f5a19496911a2a95b002bb7cddf5b9e605 (diff) | |
Merge #428
428: executor: Use critical sections instead of atomic CAS loops r=lulf a=Dirbaio
Optimize executor wakes.
CAS loops (either `fetch_update`, or manual `load + compare_exchange_weak`) generate surprisingly horrible code: https://godbolt.org/z/zhscnM1cb
This switches to critical sections, which is faster. On thumbv6 (Cortex-M0) the gain should be even larger, because the executor currently uses `atomic-polyfill` there, which already enters many critical sections for each `compare_exchange_weak` anyway.
```
            opt-level=3   opt-level=s
atomics:    105 cycles    101 cycles
CS:          76 cycles     72 cycles
CS+inline:   72 cycles     64 cycles
```
Measured on nrf52 with icache disabled, using this code:
```rust
poll_fn(|cx| {
    let task = unsafe { task_from_waker(cx.waker()) };
    compiler_fence(Ordering::SeqCst);
    let a = cortex_m::peripheral::DWT::get_cycle_count();
    compiler_fence(Ordering::SeqCst);
    unsafe { wake_task(task) }
    compiler_fence(Ordering::SeqCst);
    let b = cortex_m::peripheral::DWT::get_cycle_count();
    compiler_fence(Ordering::SeqCst);
    defmt::info!("cycles: {=u32}", b.wrapping_sub(a));
    Poll::Ready(())
})
.await;
```
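For illustration, the change from a CAS loop to a critical section can be sketched roughly as below. This is a host-side sketch and not the executor's actual code: the state bits `STATE_SPAWNED` and `STATE_RUN_QUEUED`, the functions `mark_queued_cas` and `mark_queued_cs`, and the helper `with_critical_section` are all hypothetical names. The helper uses a global `Mutex` as a stand-in for disabling interrupts; on a single-core target one would presumably use something like `critical_section::with` or `cortex_m::interrupt::free` instead.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Mutex;

// Hypothetical state bits, chosen for illustration only.
const STATE_SPAWNED: u32 = 1 << 0;
const STATE_RUN_QUEUED: u32 = 1 << 1;

// Host-side stand-in for a critical section. On a single-core MCU this would
// disable interrupts; here a global mutex provides the mutual exclusion.
static CS_LOCK: Mutex<()> = Mutex::new(());

fn with_critical_section<R>(f: impl FnOnce() -> R) -> R {
    let _guard = CS_LOCK.lock().unwrap();
    f()
}

/// CAS-loop variant: `fetch_update` retries until the update lands or the
/// closure gives up. Returns true if the task was newly marked as queued.
fn mark_queued_cas(state: &AtomicU32) -> bool {
    state
        .fetch_update(Ordering::AcqRel, Ordering::Acquire, |s| {
            if s & STATE_SPAWNED == 0 || s & STATE_RUN_QUEUED != 0 {
                None // not spawned, or already queued: leave state untouched
            } else {
                Some(s | STATE_RUN_QUEUED)
            }
        })
        .is_ok()
}

/// Critical-section variant: a plain load, check and store with no retry
/// loop. Only sound if every writer of `state` goes through the same
/// critical section.
fn mark_queued_cs(state: &AtomicU32) -> bool {
    with_critical_section(|| {
        let s = state.load(Ordering::Relaxed);
        if s & STATE_SPAWNED == 0 || s & STATE_RUN_QUEUED != 0 {
            return false;
        }
        state.store(s | STATE_RUN_QUEUED, Ordering::Relaxed);
        true
    })
}
```

Both functions behave the same from the caller's side: the first call on a spawned task returns `true`, and a second call returns `false` because the task is already queued. The difference is in the generated code, where the critical-section variant avoids the retry loop entirely.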
Co-authored-by: Dario Nieuwenhuis <[email protected]>
Diffstat (limited to 'examples')
0 files changed, 0 insertions, 0 deletions
