nihui

  • Shanghai
  • X @nihui
@FCLC
FCLC / Understanding how modern processors got fast: SIMD, multiple pipes and Out of Order execution.md
Last active November 13, 2023 06:07
An approachable introduction to how modern CPUs got fast, beyond throwing more GHz at the problem

Context

I was helping a few computer science students and enthusiasts understand “how” modern processors got to be “so fast” outside of clock speed increases.  

Here is the main ;p excerpt  

Acronyms:  

SIMD: Single Instruction, Multiple Data  

Compiling/installing the Mesa virtio-venus driver (below done with a new Linux container)
For: Chrome OS Crostini default Debian container (bookworm)
Best viewed in "raw" format
In chrome browser type or paste
chrome://flags
@kekyo
kekyo / Creator_CI20.md
Last active September 21, 2024 07:37
How to update latest packages on Imagination Creator CI20 Debian 8 (jessie)

Imagination Creator CI20 board: A MIPS32 architecture evaluation board.

@BlueCocoa
BlueCocoa / cpu-test1.txt
Last active November 22, 2020 02:39
ncnn benchmark on Apple Silicon M1
$ ./benchmark/benchncnn
thread_policy_set error 46
loop_count = 4
num_threads = 8
powersave = 0
gpu_device = -1
cooling_down = 1
squeezenet min = 5.64 max = 6.24 avg = 5.88
squeezenet_int8 min = 8.93 max = 8.97 avg = 8.94
mobilenet min = 8.86 max = 8.99 avg = 8.91
| Model | Image Size | Target Size | Block Size | Total Time (sec) | GPU Memory (MB) |
|---|---|---|---|---|---|
| models-cunet | 200x200 | 400x400 | 400/200/100 | 0.93/0.30/0.33 | 615/615/173 |
| models-cunet | 400x400 | 800x800 | 400/200/100 | 0.78/0.71/0.78 | 2408/615/174 |
| models-cunet | 1000x1000 | 2000x2000 | 400/200/100 | 3.16/3.21/3.53 | 2416/618/175 |
| models-cunet | 2000x2000 | 4000x4000 | 400/200/100 | 11.40/11.98/13.86 | 2420/669/193 |
| models-cunet | 4000x4000 | 8000x8000 | 400/200/100 | 44.33/47.15/54.76 | 2452/644/197 |
| models-upconv_7_anime_style_art_rgb | 200x200 | 400x400 | 400/200/100 | 0.16/0.16/0.15 | 459/459/119 |
| models-upconv_7_anime_style_art_rgb | 400x400 | 800x800 | 400/200/100 | 0.43/0.37/0.37 | 1741/460/119 |
| models-upconv_7_anime_style_art_rgb | 1000x1000 | 2000x2000 | 400/200/100 | 1.62/1.59/1.67 | 1764/462/120 |

Detail behind NCNN's factory pattern

NCNN adopts the factory pattern to create the layers of a neural network, the same approach the well-known library Caffe takes. It differs from Caffe in how the registry table is implemented. The Caffe registry is populated at runtime as a side effect of initializing global variables (a popular technique for library initialization). The NCNN registry, by contrast, is determined at compile time: the table is generated by CMake rather than written by hand. NCNN's approach provides several benefits compared to Caffe's.
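
The difference between the two registry styles can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not NCNN's or Caffe's actual code: every name here (`Layer`, `Registrar`, `layer_table`, `create_from_table`, `create_from_runtime`) is hypothetical, and the hand-written `layer_table` stands in for a table that a build step would generate.

```cpp
#include <cstring>
#include <map>
#include <memory>
#include <string>

// Hypothetical layer hierarchy, for illustration only.
struct Layer {
    virtual ~Layer() = default;
    virtual const char* name() const = 0;
};
struct ReLU : Layer {
    const char* name() const override { return "ReLU"; }
};
struct Convolution : Layer {
    const char* name() const override { return "Convolution"; }
};

using Creator = std::unique_ptr<Layer> (*)();

static std::unique_ptr<Layer> make_relu() { return std::unique_ptr<Layer>(new ReLU); }
static std::unique_ptr<Layer> make_convolution() { return std::unique_ptr<Layer>(new Convolution); }

// Style 1: runtime registry, filled as a side effect of global initialization.
static std::map<std::string, Creator>& runtime_registry() {
    static std::map<std::string, Creator> registry;  // built on first use
    return registry;
}
struct Registrar {
    Registrar(const char* type, Creator creator) { runtime_registry()[type] = creator; }
};
// These globals exist only for their constructors' side effects.
static Registrar relu_registrar("ReLU", make_relu);
static Registrar convolution_registrar("Convolution", make_convolution);

std::unique_ptr<Layer> create_from_runtime(const std::string& type) {
    auto it = runtime_registry().find(type);
    if (it == runtime_registry().end()) return nullptr;
    return it->second();
}

// Style 2: a table fixed at compile time (written by hand here; assumed to be
// emitted by the build system in a real project).
struct RegistryEntry {
    const char* type;
    Creator creator;
};
static const RegistryEntry layer_table[] = {
    {"ReLU", make_relu},
    {"Convolution", make_convolution},
};

std::unique_ptr<Layer> create_from_table(const char* type) {
    for (const RegistryEntry& entry : layer_table) {
        if (std::strcmp(entry.type, type) == 0) return entry.creator();
    }
    return nullptr;  // unknown layer type
}
```

In the second style the table refers to each creator function directly, so nothing depends on a global object's constructor having run before the first lookup.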

First, it's suitable for building a static library. When building a static library, the linker strips any unused global variables to minimize the size of the library. This makes sense, but it also strips the global variables that need to be initialized to insert the layer creators into the registry. Tricky linker flags and related instructions are required to resolve this issue. By creating

@wkcn
wkcn / op.h
Last active July 11, 2019 08:35
MXNet CPP Op
/*!
* Copyright (c) 2019 by Contributors
* \file op.h
* \brief definition of all the operators
* \author Chuntao Hong, Xin Li
*/
#ifndef MXNET_CPP_OP_H_
#define MXNET_CPP_OP_H_
@samgooi4189
samgooi4189 / bcm57765or57785fix
Last active August 7, 2024 01:25
Fixing Broadcom Corporation BCM57765/57785 SDXC/MMC Card Reader
Follow the WORKAROUND:
1. Add a command to /etc/rc.local; add the following line above "exit 0":
setpci -s 00:1c.2 0x50.B=0x41
2. Add the same command to /etc/apm/resume.d/21aspm (which does not exist yet):
setpci -s 00:1c.2 0x50.B=0x41
3. Add the following to /etc/modprobe.d/sdhci.conf:
options sdhci debug_quirks2=4
4. Re-generate initrd:
sudo update-initramfs -u -k all
5. Reboot or reload sdhci module:
@evantoli
evantoli / GitConfigHttpProxy.md
Last active September 9, 2024 08:08
Configure Git to use a proxy

In Brief

You may need to configure a proxy server if you're having trouble cloning or fetching from a remote repository, or if you're getting an error like `unable to access '...' Couldn't resolve host '...'`.

Consider something like: