Peter Naur's classic 1985 essay "Programming as Theory Building" argues that a program is not its source code. A program is a shared mental construct (he uses the word theory) that lives in the minds of the people who work on it. If you lose the people, you lose the program. The code is merely a written representation of the program, and it's lossy, so you can't reconstruct
The counters that are the easiest to understand and the best for making ratios that are internally consistent (i.e., always fall in the range 0.0 to 1.0) are the mem_load_retired events, e.g., mem_load_retired.l1_hit and mem_load_retired.l1_miss.
These count at the instruction level, i.e., the universe of retired instructions. For example, you could compute a reasonable hit ratio as mem_load_retired.l1_hit / mem_inst_retired.all_loads, and it will be sane (it will never indicate a hit rate of more than 100%, for example).
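As a rough sketch of that ratio: the snippet below divides one counter by the other. The function name and the sample counts are made up for illustration; real values would come from whatever tool you use to read the two events.

```python
def l1_hit_ratio(l1_hit: int, all_loads: int) -> float:
    """Return mem_load_retired.l1_hit / mem_inst_retired.all_loads."""
    if all_loads <= 0:
        raise ValueError("all_loads must be positive")
    return l1_hit / all_loads


# Hypothetical counter readings, not measurements from a real run.
sample_l1_hit = 9_200_000      # mem_load_retired.l1_hit
sample_all_loads = 10_000_000  # mem_inst_retired.all_loads

ratio = l1_hit_ratio(sample_l1_hit, sample_all_loads)
print(f"L1 hit ratio: {ratio:.1%}")
```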
That one isn't perfect though, in that it may not reflect the true costs of cache misses and the behavior of the program for at least the following reasons:
- It applies only to loads and can't catch misses imposed by stores (AFAICT there is no event that counts store misses).
- It only counts loads that retire - a lot of the load activity in your process may be due to loads on a speculative path that never retire. Loads on a speculative path may bring in data that is never used, causing misses and d
// Sean Parent. Inheritance Is The Base Class of Evil. Going Native 2013
// Video: https://www.youtube.com/watch?v=bIhUE5uUFOA
// Code : https://github.com/sean-parent/sean-parent.github.io/wiki/Papers-and-Presentations
/*
Copyright 2013 Adobe Systems Incorporated
Distributed under the MIT License (see license at
http://stlab.adobe.com/licenses.html)
This file is intended as example code and is not production quality.
In general, check the crt/host_config.h file to find out which versions are supported.
Sometimes it is possible to hack the requirements there to get some newer versions working, too :)
Thrust version can be found in $CUDA_ROOT/include/thrust/version.h.
Download Archives: https://developer.nvidia.com/cuda-toolkit-archive
Release notes for CUDA Toolkit (CTK):
########################### GTEST
# Enable ExternalProject CMake module
INCLUDE(ExternalProject)
# Set default ExternalProject root directory
SET_DIRECTORY_PROPERTIES(PROPERTIES EP_PREFIX ${CMAKE_BINARY_DIR}/third_party)
# Add gtest
# http://stackoverflow.com/questions/9689183/cmake-googletest
ExternalProject_Add(
Latency Comparison Numbers (~2012)
----------------------------------
L1 cache reference                           0.5 ns
Branch mispredict                            5   ns
L2 cache reference                           7   ns                      14x L1 cache
Mutex lock/unlock                           25   ns
Main memory reference                      100   ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy             3,000   ns        3 us
Send 1K bytes over 1 Gbps network       10,000   ns       10 us
Read 4K randomly from SSD*             150,000   ns      150 us          ~1GB/sec SSD