- 03 Nov, 2017 1 commit
-
-
Roman Lebedev authored
As requested, in PR form :)
-
- 02 Nov, 2017 1 commit
-
-
Stefan Sauer authored
Describe how to use the cpupower command to disable CPU frequency scaling. Document this, since there are other ways that don't seem to have the same effect. See #325
-
- 31 Oct, 2017 1 commit
-
-
Leo Koppel authored
* Fix BM_SetInsert example: move the declaration of `std::set<int> data` outside the timing loop, so that the destructor is not timed (see the sketch below).
* Speed up BM_SetInsert test. Since the time taken by ConstructRandomSet() is so large compared to the time to insert one element, but only the latter is used to determine the number of iterations, this benchmark now takes an extremely long time to run in benchmark_test. Speed it up in two ways:
  - Increase the Ranges() parameters
  - Cache the ConstructRandomSet() result (it's not random anyway), and do only an O(N) copy every iteration
* Fix the same issue in the BM_MapLookup test
* Make the BM_SetInsert test consistent with the README:
  - Use the same Ranges everywhere, but increase the 2nd range
  - Change the order of Args() calls in the README to more closely match the result of Ranges
  - Don't cache ConstructRandomSet, since it doesn't make sense in the README
  - Get a smaller optimization inside it, by giving a hint to insert()
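For reference, a minimal sketch of the corrected benchmark shape, with the set declared outside the timed region; the helper and the exact arguments here are illustrative, not the repository's test code:

```cpp
#include <benchmark/benchmark.h>
#include <cstdlib>
#include <set>

// Illustrative stand-in for the test's helper.
static std::set<int> ConstructRandomSet(int64_t size) {
  std::set<int> s;
  for (int64_t i = 0; i < size; ++i) s.insert(static_cast<int>(std::rand()));
  return s;
}

static void BM_SetInsert(benchmark::State& state) {
  std::set<int> data;  // declared outside the loop so its destructor is not timed
  for (auto _ : state) {
    state.PauseTiming();
    data = ConstructRandomSet(state.range(0));
    state.ResumeTiming();
    for (int j = 0; j < state.range(1); ++j) data.insert(std::rand());
  }
}
BENCHMARK(BM_SetInsert)->Ranges({{1 << 10, 8 << 10}, {128, 512}});
BENCHMARK_MAIN();
```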
-
- 20 Oct, 2017 1 commit
-
-
Yangqing Jia authored
* Add option to install benchmark
* Change to BENCHMARK_ENABLE_INSTALL per @dominichamon
-
- 17 Oct, 2017 3 commits
-
-
Eric authored
Recently the library added a new ranged-for variant of the KeepRunning loop that is much faster. For this reason it should be preferred in all new code. Because a library, its documentation, and its tests should all embody the best practices of using the library, this patch changes all but a few usages of KeepRunning() into for (auto _ : state). The remaining usages in the tests and documentation persist only to document and test behavior that is different between the two formulations. Also note that because the range-for loop requires C++11, the KeepRunning variant has not been deprecated at this time.
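For reference, a minimal sketch of the two formulations side by side; the benchmark names here are illustrative:

```cpp
#include <benchmark/benchmark.h>

// Older KeepRunning() form: still supported, and required for pre-C++11 code.
static void BM_OldStyle(benchmark::State& state) {
  while (state.KeepRunning()) {
    // code to benchmark
  }
}
BENCHMARK(BM_OldStyle);

// Range-for form: preferred for new C++11 code, and measurably faster.
static void BM_NewStyle(benchmark::State& state) {
  for (auto _ : state) {
    // code to benchmark
  }
}
BENCHMARK(BM_NewStyle);

BENCHMARK_MAIN();
```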
-
Eric Fiselier authored
-
Eric authored
This patch improves the performance of the KeepRunning loop in two ways: (A) it removes the dependency on the max_iterations variable, preventing it from being loaded every iteration. (B) it loops to zero, instead of to an upper bound. This allows a single decrement instruction to be used instead of an arithmetic op followed by a comparison.
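A simplified sketch of the idea behind (B), assuming nothing about the library's actual internals; the counting-down loop ends on a plain decrement-and-branch rather than an increment followed by a comparison against a bound:

```cpp
#include <cstddef>

// Counting up: each iteration increments and then compares against the bound.
void RunCountingUp(std::size_t max_iterations) {
  for (std::size_t i = 0; i < max_iterations; ++i) {
    // benchmark body
  }
}

// Counting down to zero: the loop test is a decrement plus a branch on zero,
// with no further dependence on the original bound.
void RunCountingDown(std::size_t max_iterations) {
  for (std::size_t left = max_iterations; left != 0; --left) {
    // benchmark body
  }
}
```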
-
- 16 Oct, 2017 1 commit
-
-
Fred Tingaud authored
-
- 13 Oct, 2017 1 commit
-
-
Raúl Marín authored
Triggered by -Werror=double-promotion
-
- 10 Oct, 2017 1 commit
-
-
Eric authored
* Add C++11 Ranged For loop alternative to KeepRunning

As pointed out by @astrelni and @dominichamon, the KeepRunning loop requires a bunch of memory loads and stores every iteration, which affects the measurements. The main reason for these additional loads and stores is that the State object is passed in by reference, making its contents externally visible memory, and the compiler doesn't know it hasn't been changed by non-visible code. It's also possible the large size of the State struct is hindering optimizations.

This patch allows the `State` object to be iterated over using a range-based for loop. Example:

```cpp
void BM_Foo(benchmark::State& state) {
  for (auto _ : state) {
    [...]
  }
}
```

This formulation is much more efficient, because the variable counting the loop index is stored in the iterator produced by `State::begin()`, which itself is stored in function-local memory and therefore not accessible by code outside of the function. Therefore the compiler knows the iterator hasn't been changed every iteration. This initial patch and idea was from Alex Strelnikov.

* Fix null pointer initialization in C++03
-
- 09 Oct, 2017 3 commits
-
-
mwinterb authored
* Always use inline asm DoNotOptimize with clang. clang-cl masquerades as MSVC but not GCC, so it was using the MSVC-compatible definitions of DoNotOptimize and ClobberMemory. Presumably, it's better in general to use the targeted assembly for this functionality (the codegen is different), but the specific issue is that clang-cl deprecates the usage of _ReadWriteBarrier, and this gets rid of that warning. (A rough sketch of the inline-asm form follows below.)
* Triggering another AppVeyor run
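For context, the GCC/Clang-style barriers are roughly of this shape; this is a hedged sketch of the technique, not the library's verbatim definitions:

```cpp
// Keeps `value` live in a register or memory without emitting any code,
// so the optimizer cannot delete the computation that produced it.
template <class Tp>
inline void DoNotOptimize(Tp const& value) {
  asm volatile("" : : "r,m"(value) : "memory");
}

// Forces the compiler to assume all memory may have been read or written.
inline void ClobberMemory() {
  asm volatile("" : : : "memory");
}
```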
-
Anton Lashkov authored
* Add macros to create benchmarks with templated fixtures (see the sketch below)
* Add info about templated fixtures to README.md
* Add tests for templated fixtures
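A sketch of how such a templated fixture might be used, assuming the BENCHMARK_TEMPLATE_F spelling; the fixture and benchmark names here are illustrative:

```cpp
#include <benchmark/benchmark.h>
#include <vector>

template <typename T>
class VectorFixture : public benchmark::Fixture {
 public:
  std::vector<T> data;
};

// Instantiates the templated fixture for a concrete type and defines the body.
BENCHMARK_TEMPLATE_F(VectorFixture, PushBackInt, int)(benchmark::State& st) {
  while (st.KeepRunning()) {
    data.push_back(42);
  }
}

BENCHMARK_MAIN();
```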
-
Dominic Hamon authored
-
- 27 Sep, 2017 5 commits
-
-
Dominic Hamon authored
#448
-
Dominic Hamon authored
#448
-
Dominic Hamon authored
Covered by Google Inc. here, and I'm in CONTRIBUTORS
-
Dominic Hamon authored
Fixes #448
-
Dominic Hamon authored
Part of #448
-
- 14 Sep, 2017 2 commits
-
-
Eric authored
* Fix #444 - Use BENCHMARK_HAS_CXX11 over __cplusplus. MSVC incorrectly defines __cplusplus to report C++03, despite the compiler actually providing C++11 or greater. Therefore we have to detect C++11 differently for MSVC. This patch uses `_MSVC_LANG`, which has been defined since Visual Studio 2015 Update 3; that should be sufficient for detecting C++11. Secondly, this patch changes over most usages of __cplusplus >= 201103L to check BENCHMARK_HAS_CXX11 instead. (A sketch of the detection logic follows below.)
* Remove redundant comment
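A simplified sketch of the detection described above, assuming only the macros named in the commit; not the header's exact text:

```cpp
// Prefer _MSVC_LANG on MSVC, which reports the real language level, and fall
// back to __cplusplus everywhere else.
#if defined(_MSC_VER) && defined(_MSVC_LANG) && _MSVC_LANG >= 201103L
#define BENCHMARK_HAS_CXX11
#elif __cplusplus >= 201103L
#define BENCHMARK_HAS_CXX11
#endif
```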
-
Disconnect3d authored
-
- 13 Sep, 2017 1 commit
-
-
Andre Schroeder authored
-
- 28 Aug, 2017 2 commits
-
-
Roman Lebedev authored
* Tools: compare-bench.py: print change% with two decimal digits

Here is a comparison of before vs. after:

```diff
-Benchmark                  Time    CPU  Time Old  Time New  CPU Old  CPU New
------------------------------------------------------------------------------
-BM_SameTimes              +0.00  +0.00        10        10       10       10
-BM_2xFaster               -0.50  -0.50        50        25       50       25
-BM_2xSlower               +1.00  +1.00        50       100       50      100
-BM_1PercentFaster         -0.01  -0.01       100        99      100       99
-BM_1PercentSlower         +0.01  +0.01       100       101      100      101
-BM_10PercentFaster        -0.10  -0.10       100        90      100       90
-BM_10PercentSlower        +0.10  +0.10       100       110      100      110
-BM_100xSlower            +99.00 +99.00       100     10000      100    10000
-BM_100xFaster             -0.99  -0.99     10000       100    10000      100
-BM_10PercentCPUToTime     +0.10  -0.10       100       110      100       90
+Benchmark                    Time      CPU  Time Old  Time New  CPU Old  CPU New
+---------------------------------------------------------------------------------
+BM_SameTimes              +0.0000  +0.0000        10        10       10       10
+BM_2xFaster               -0.5000  -0.5000        50        25       50       25
+BM_2xSlower               +1.0000  +1.0000        50       100       50      100
+BM_1PercentFaster         -0.0100  -0.0100       100        99      100       99
+BM_1PercentSlower         +0.0100  +0.0100       100       101      100      101
+BM_10PercentFaster        -0.1000  -0.1000       100        90      100       90
+BM_10PercentSlower        +0.1000  +0.1000       100       110      100      110
+BM_100xSlower            +99.0000 +99.0000       100     10000      100    10000
+BM_100xFaster             -0.9900  -0.9900     10000       100    10000      100
+BM_10PercentCPUToTime     +0.1000  -0.1000       100       110      100       90
+BM_ThirdFaster            -0.3333  -0.3333       100        67      100       67
```

So the first ("Time") column is exactly where it was, but with two more decimal digits. The position of the '.' in the second ("CPU") column is shifted right by those two positions, and the rest is unmodified, but simply shifted right by those 4 positions.

As for the reasoning, I guess it is more or less the same as with #426. In some sad times, microbenchmarking is not applicable. In those cases, the more precise the change report is, the better. The current formatting prints not so much the percentages as the fraction, I'd say. It is more useful for huge changes, much more than 100%. That is not always the case, especially if it is not a microbenchmark. Then, even though the change may be good or bad, it is small (<0.5% or so), rounding happens, and it is no longer possible to tell. I do acknowledge that this change does not fix that problem. Of course, confidence intervals and such would be better, and they would probably fix the problem. But I think this is good as-is too, because now you see two fractional percentage digits! The obvious downside is that the output is now even wider.

* Revisit tests, more closely document the current behavior.
-
Roman Lebedev authored
-
- 23 Aug, 2017 1 commit
-
-
Roman Lebedev authored
* Drop Stat1, refactor statistics to be user-providable, add median.

My main goal was to add the median statistic. Since Stat1 calculated the stats incrementally, and did not store the values themselves, that was not possible. Thus, I have replaced Stat1 with a simple std::vector<double> containing all the values. Then, I have refactored the current mean/stdev to be a function that is provided with the values vector and returns the statistic. While there, it seemed to make sense to deduplicate the code by storing all the statistics functions in a map, and then simply iterating over it. And the interface to add new statistics is intentionally exposed, so they may be added easily (see the sketch below).

The notable change is that Iterations are no longer displayed as 0 for stdev. It could be changed, but I'm not sure how to nicely fit that into the API. Similarly, this dance about sometimes (for some fields, for some statistics) dividing by run.iterations, and then multiplying the calculated statistic back, is also dropped; if you do the math, I fail to see why it was needed there in the first place. Since that was the only use of stat.h, it is removed.

* complexity.h: attempt to fix MSVC build
* Update README.md
* Store statistics to compute in a vector, to ensure ordering.
* Add a few more tests for repetitions.
* Partially address review notes.
* Fix gcc build: drop extra ';' (clang, why didn't you warn me?)
* Address review comments.
* double() -> 0.0
* early return
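A short sketch of the user-facing side of this, assuming the ComputeStatistics entry point described in the README; the benchmark body and names are illustrative:

```cpp
#include <benchmark/benchmark.h>
#include <algorithm>
#include <vector>

static void BM_Spin(benchmark::State& state) {
  while (state.KeepRunning()) {
    for (int x = 0; x < 1000; ++x) benchmark::DoNotOptimize(x);
  }
}

// A user-provided statistic, computed over the per-repetition results
// alongside the built-in mean, median, and stddev.
BENCHMARK(BM_Spin)
    ->Repetitions(10)
    ->ComputeStatistics("max", [](const std::vector<double>& v) -> double {
      return *std::max_element(v.begin(), v.end());
    });

BENCHMARK_MAIN();
```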
-
- 21 Aug, 2017 1 commit
-
-
Dominic Hamon authored
When generating a human-readable number for user counters, we don't generally expect 1k to be 1024. This is the default due to the more general-purpose string utility. Fixes #437
-
- 18 Aug, 2017 1 commit
-
-
Roman Lebedev authored
https://github.com/google/benchmark/commit/2373382284918fda13f726aefd6e2f700784797f reworked parsing, and introduced a regression in the handling of the optional options that should be passed to both of the benchmarks. Now, unless the *first* optional argument starts with '-', it would just complain about that argument: Unrecognized positional argument arguments: '['q']' which is wrong. However, if some dummy arg like '-q' was passed first, it would then happily pass them all through... This commit fixes the benchmark_options behavior by restoring the original passthrough behavior for all the optional positional arguments.
-
- 15 Aug, 2017 1 commit
-
-
Victor Costan authored
-
- 01 Aug, 2017 1 commit
-
-
Roman Lebedev authored
May be relevant for flakiness of Windows builds. Noted by @KindDragon
-
- 31 Jul, 2017 1 commit
-
-
Eric Fiselier authored
The benchmark library is compiled as C++11, but certain tests are compiled as C++03. When -flto is enabled, GCC 5.4 and above will diagnose an ODR violation in libstdc++'s <map>. This ODR violation, although real, should likely be benign. For this reason it seems sensible to simply suppress -Wodr when building the C++03 test. This patch fixes #420 and supersedes PR #424.
-
- 25 Jul, 2017 1 commit
-
-
Roman Lebedev authored
While the percentages are displayed for both of the columns, the old/new values are only displayed for the second column, for the CPU time. And the column is not even spelled out. In cases where b->UseRealTime(); is used, this is at the very least highly confusing. So why don't we just display both the old/new for both the columns? Fixes #425
-
- 24 Jul, 2017 1 commit
-
-
Roman Lebedev authored
* Json reporter: passthrough fp, don't cast it to int; adjust tooling

Json output format is generally meant for further processing using some automated tools. Thus, it makes sense not to intentionally limit the precision of the values contained in the report. As can be seen, FormatKV() for doubles used the %.2f format, which was meant to preserve at least some of the precision. However, before that function is ever called, the doubles were already cast to integers via RoundDouble()... This is also the case for the console reporter, where it makes sense because the screen space is limited; the CSV reporter, however, does output some decimal digits.

Thus I can only conclude that the loss of precision was not really considered, so I have decided to adjust the code of the json reporter to output the full fp precision. There can be several reasons why that is the right thing to do: the bigger the time_unit used, the greater the precision loss, so I'd say any sort of further processing (like e.g. tools/compare_bench.py does) is best done on the values with the most precision. Also, that cast skewed the data away from zero, which I think may or may not result in false positives/negatives in the output of tools/compare_bench.py.

* Json reporter: FormatKV(double): address review note
* tools/gbench/report.py: skip benchmarks with different time units

While it may be useful to teach it to operate on measurements with different time units (which is now possible since floats are stored, and not integers), for now at least doing such sanity-checking is better than providing misinformation.
-
- 14 Jul, 2017 1 commit
-
-
Dominic Hamon authored
-
- 13 Jul, 2017 1 commit
-
-
Dominic Hamon authored
-
- 06 Jul, 2017 1 commit
-
-
Tom Madams authored
Change ThreadCPUUsage to call ProcessCPUUsage if __rtems__ is defined. RTEMS real time OS doesn't support CLOCK_THREAD_CPUTIME_ID. See https://github.com/RTEMS/rtems/blob/master/cpukit/posix/src/clockgettime.c#L58-L59 Prior to this change, ThreadCPUUsage would fail when running on RTEMS with: ERROR: clock_gettime(CLOCK_THREAD_CPUTIME_ID, ...) failed
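A simplified sketch of the fallback described above, assuming only what the commit message states; not the library's exact timer code:

```cpp
#include <time.h>

// Process CPU time in seconds.
static double ProcessCPUUsage() {
  struct timespec ts;
  if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) == 0)
    return ts.tv_sec + ts.tv_nsec * 1e-9;
  return 0.0;  // error handling elided
}

// Thread CPU time in seconds, falling back to process time on RTEMS, which
// does not support CLOCK_THREAD_CPUTIME_ID.
static double ThreadCPUUsage() {
#if defined(__rtems__)
  return ProcessCPUUsage();
#else
  struct timespec ts;
  if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) == 0)
    return ts.tv_sec + ts.tv_nsec * 1e-9;
  return 0.0;  // error handling elided
#endif
}
```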
-
- 04 Jul, 2017 1 commit
-
-
Eric authored
* Make Benchmark a single header library (but not header-only)

This patch refactors benchmark into a single header, to allow for slightly easier usage. The initial reason for the header split was to keep C++ library components from being included by benchmark_api.h, making that part of the library STL agnostic. However, this has since changed, and there seems to be little reason to separate the reporters from the rest of the library.

* Fix internal_macros.h
* Remove more references to macros.h
-
- 16 Jun, 2017 1 commit
-
-
Jern-Kuan Leong authored
Add the definition of ${VAR} to makefiles if specified as part of the cmake parameters.
-
- 14 Jun, 2017 3 commits
-
-
Eric authored
* Add ClearRegisteredBenchmarks() function.

Since benchmarks can be registered at runtime using the RegisterBenchmark(...) functions, it makes sense to have a ClearRegisteredBenchmarks() function too, that can be used at runtime to clear the currently registered benchmarks and re-register an entirely new set. This allows users to run a set of registered benchmarks, get the output using a custom reporter, and then clear and re-register new benchmarks based on the previous results. This fixes issue #400, at least partially. (A usage sketch follows below.)

* Remove unused change
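A hedged sketch of the intended usage pattern, assuming the RegisterBenchmark and ClearRegisteredBenchmarks entry points named above; the benchmark names and bodies are illustrative:

```cpp
#include <benchmark/benchmark.h>

int main(int argc, char** argv) {
  benchmark::Initialize(&argc, argv);

  // First round of dynamically registered benchmarks.
  benchmark::RegisterBenchmark("first_pass", [](benchmark::State& st) {
    while (st.KeepRunning()) { /* work */ }
  });
  benchmark::RunSpecifiedBenchmarks();

  // Drop everything and register a fresh set, e.g. based on earlier results.
  benchmark::ClearRegisteredBenchmarks();
  benchmark::RegisterBenchmark("second_pass", [](benchmark::State& st) {
    while (st.KeepRunning()) { /* work */ }
  });
  benchmark::RunSpecifiedBenchmarks();
  return 0;
}
```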
-
Tim authored
This removes warnings when using CMake >= 3.3 if you have symbol visibility set.
-
- 05 Jun, 2017 1 commit
-
-
Yixuan Qiu authored
* remove unnecessary weights
* use sample standard deviation (see the sketch below)
* add contributor information
* remove redundant code
* initialize variable to eliminate compiler warning
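For clarity, the sample standard deviation divides by n - 1 rather than n; a minimal illustration of that distinction, not the library's code:

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Sample standard deviation: divide the squared deviations by (n - 1),
// which corrects the bias of the population formula on small samples.
double SampleStdDev(const std::vector<double>& v) {
  if (v.size() < 2) return 0.0;
  const double mean = std::accumulate(v.begin(), v.end(), 0.0) / v.size();
  double sq_sum = 0.0;
  for (double x : v) sq_sum += (x - mean) * (x - mean);
  return std::sqrt(sq_sum / (v.size() - 1));
}
```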
-