- Rust
If there are too many `Cargo.toml` files in the workspace, set this:
"rust-analyzer.linkedProjects": [
    "/home/tngo/codes/melia/tools/compiler/ml3x_compiler_accel/Cargo.toml"
],
"rust-analyzer.check.extraArgs": [
    "--target-dir=target/rust-analyzer"
],
"rust-analyzer.server.extraEnv": {
    "CARGO_HOME": "/home/tngo/codes/ml3x/insim/build/.cargo"
},
if x > 0 {
    println!("x is positive");
} else if x < 0 {
    println!("x is negative");
} else {
    println!("x is zero");
}
match x {
1 => println!("one"),
2 => println!("two"),
3 => println!("three"),
_ => println!("anything"),
}
// Forever loop
loop {
println!("again!");
}
// labeled loop
'outer: loop {
'inner: loop {
break 'outer;
}
}
// while loop
while number != 0 {
println!("{}!", number);
number -= 1;
}
// for loop
for i in 1..4 {
println!("{}", i);
}
- Signed: `i8`, `i16`, `i32`, `i64`, `i128` (`100_000` is valid syntax for 100000 in Rust)
- Unsigned: `u8`, `u16`, `u32`, `u64`, `u128`
- Floating: `f32`, `f64`
- Boolean: `bool`
- Character: `char`
- Arrays (fixed-size, homogeneous data structures):
let a: [i32; 3] = [1, 2, 3];
- Tuples (fixed-size, heterogeneous data structures):
let tup: (i32, f64, char) = (500, 6.4, 'a');
- Immutable `&T` and mutable `&mut T` references.
- `str` must be used behind a reference: `let s: &str = "Hello, world!";`. It is an unsized string slice, and a string literal points at bytes stored at a fixed memory address in the program binary.
- `to_string` returns a `String` (owned, data on the heap), while `as_str` returns a `&str` (the reference is stored on the stack and points at the `String`'s heap data).
- Stored in heap:
let owned_string: String = String::from("Hello, Rust!");
- Vectors:
let v: Vec<i32> = vec![1, 2, 3];
orlet mut v: Vec<i32> = Vec::new();
- Hash maps:
let mut scores: HashMap<String, i32> = HashMap::new();
- Hash sets:
let mut set: HashSet<i32> = HashSet::new();
- Structs:
struct User { username: String, email: String }
- Enums:
enum IpAddr { V4(u8, u8, u8, u8), V6(String) }
- Traits:
trait Summary { fn summarize(&self) -> String; }
- Closures:
let expensive_closure = |num| { println!("calculating slowly..."); thread::sleep(Duration::from_secs(2)); num };
- Tuple `(a, b, c)`: A tuple is a fixed-size, stack-allocated collection of potentially different types. Once a tuple is created, you cannot add or remove elements from it.
- Array `[a; n]`: An array is a fixed-size, stack-allocated collection of elements of the same type. Once an array is created, its size cannot be changed. All elements in an array must be of the same type.
- Vector `Vec<T>`: A vector is a growable, heap-allocated collection of elements of the same type. You can add or remove elements from a vector dynamically. All elements in a vector must be of the same type.
let my_array = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
// Create a slice with a step of 2
let sliced_array: Vec<_> = my_array.iter().skip(1).step_by(2).take(3).collect();
println!("{:?}", sliced_array); // Output: [2, 4, 6]
`Self` is a keyword that refers to the implementing type (struct or enum) within a trait definition or implementation.
trait Example {
fn generic_method(value: Self) -> Self;
}
enum Color {
Red,
Green,
Blue,
}
// classic struct
struct MyStruct {
value: i32,
}
// tuple struct
struct RGB(Color, Color, Color);
// unit struct (fieldless)
struct UnitStruct;
impl Example for MyStruct {
fn generic_method(value: MyStruct) -> MyStruct {
// Some generic logic
value
}
}
`Option` (optional values, `Some(T)` or `None`): `let x: Option<i32> = Some(42);` or `let x: Option<i32> = None;`
`Result` (error handling, `Ok(T)` or `Err(E)`): `let x: Result<i32, String> = Ok(-3);` or `let x: Result<i32, String> = Err("Some error message".to_string());`
The `&mut self` syntax is used when defining methods that modify the contents of the instance they are called on.
Passing by reference means no copy. Passing by value moves the value, or copies it if the type implements `Copy`.
In Rust, there's no difference in syntax when calling a method on an instance of a type or a reference to an instance. This is because Rust has a feature called automatic referencing and dereferencing.
When you call a method with the `.` operator, Rust automatically adds in any necessary `&`, `&mut`, or `*` so you can call methods on the value no matter how it's referenced.
However, the method itself may require a certain type of receiver (the type of `self`). If a method takes `self`, `&self`, or `&mut self`, it's called on a value, a reference, or a mutable reference, respectively.
If the method takes `self` by value and you only have a reference, auto-dereferencing still finds the method, but the call compiles only when the type is `Copy`; otherwise you need an owned value first (for example via `clone()`). If the method requires a reference and you have a value, Rust will automatically reference the value.
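The receiver rules above can be sketched with a small type. `Counter` and its methods are hypothetical names used only for illustration; the point is which calls the compiler adjusts on its own and where an owned value is required.

```rust
#[derive(Clone, Debug, PartialEq)]
struct Counter {
    count: u32,
}

impl Counter {
    fn get(&self) -> u32 { self.count }       // borrows the instance
    fn bump(&mut self) { self.count += 1; }   // mutably borrows the instance
    fn into_count(self) -> u32 { self.count } // consumes the instance
}

fn auto_ref_demo() -> (u32, u32, u32) {
    let mut c = Counter { count: 0 };
    c.bump();                       // the compiler inserts `&mut c`
    let r = &c;
    let rr = &r;
    let a = rr.get();               // auto-deref through `&&Counter`
    let b = r.clone().into_count(); // by-value method via a reference: clone first
    let d = c.into_count();         // owned value: moved into the method
    (a, b, d)
}
```

Calling `r.into_count()` directly would be rejected, because `Counter` is not `Copy` and the value cannot be moved out from behind the reference.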
"It [the deref algorithm] will deref as many times as possible (&&String -> &String -> String -> str) and then reference at max once (str -> &str)".
Compiler ensures:
- No dangling references (using lifetimes)
- No double free
In Rust, values are moved by default when you assign them to another variable or pass them to a function, unless the type implements `Copy`. After a move, the original variable can no longer be used.
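A minimal sketch of the difference between a move, a copy, and an explicit clone. The function names `consume`, `double`, and `move_vs_copy` are made up for this example.

```rust
/// Takes ownership of the String; the argument is moved in.
fn consume(s: String) -> usize {
    s.len()
}

/// `i32` implements `Copy`, so the caller keeps its own value.
fn double(n: i32) -> i32 {
    n * 2
}

fn move_vs_copy() -> (i32, usize, String) {
    let n = 21;
    let doubled = double(n); // `n` is copied
    let _still_usable = n;   // so it can still be read here

    let s = String::from("hello");
    let len = consume(s.clone()); // only the clone is moved; `s` stays valid
    let t = s;                    // move: `s` cannot be used after this line
    (doubled, len, t)
}
```

Replacing `consume(s.clone())` with `consume(s)` would make the later `let t = s;` a compile error (use of moved value).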
When you assign `x[0]` to `y`, you are moving the `Option<String>` out of the array. This is not allowed because the size of the array is fixed, and moving an element out would leave a hole.
fn pr(x: [Option<String>; 4]) {
let y = x[0]; // ERROR
let y = &x[0]; // OK
let z = y.unwrap(); // ERROR: cannot move out of y because it's borrowed reference
let z = (*y).unwrap(); // ERROR: same as above, automatic dereference here
let z0: String = x[0].clone().unwrap(); // OK
let z1: &String = x[0].as_ref().unwrap(); // OK
}
let x = String::from("Hello, world!"); // x is the owner of the string
let y = x; // x is moved to y, x is no longer valid
println!("{}", x); // compile error (E0382): x was moved
{
    let z = y; // y is moved to z
} // z is dropped here; y is no longer valid either
println!("{}", y); // compile error (E0382): y was moved
/*==== Move is hard to detect ====*/
let v = vec![1, 2, 3];
fn sum_take_ownership(v: Vec<i32>) -> i32 {
    let mut sum = 0;
    for i in v {
        sum += i;
    }
    sum
    // v is owned by the function and dropped when it returns
}
let s = sum_take_ownership(v);
dbg!(s);
dbg!(v); // compile error: v was moved into sum_take_ownership
/*==== alternative: clone before the call to keep the original ====*/
let s = sum_take_ownership(v.clone());
dbg!(s);
dbg!(v); // this works because only the clone was moved
/*==== ownership and Option ====*/
let food = Option::Some(Food::Apple);
let chop = Chopped(food.unwrap());
dbg!(food); // compile error: unwrap() moved food
dbg!(chop);
/*==== function call with borrow to avoid move ====*/
fn sum_ref(v: &Vec<i32>) -> i32 {
let mut sum = 0;
for i in v {
sum += i;
}
sum
}
let s = sum_ref(&v); // sum_ref borrow v
dbg!(v);
#[derive(Debug)]
pub struct ApiKeyNotFound {
target_key: String,
known_keys: String,
}
ApiKeyNotFound { target_key: "api_key".to_string(), known_keys: "keys".to_string() }; // String ownership is moved to the struct
/// This indicates that the `ApiKeyNotFound` struct has a lifetime `'a`, and the `target_key` and `known_keys` references must live at least as long as `'a`.
#[derive(Debug)]
pub struct ApiKeyNotFound<'a> {
target_key: &'a str,
known_keys: &'a str,
}
impl<'a> ApiKeyNotFound<'a> {
pub fn new(target_key: &'a str, known_keys: &'a str) -> Self {
ApiKeyNotFound {
target_key,
known_keys,
}
}
}
/*==== In this case, you cannot refer to the str but must own the String ====*/
fn test<'a>() -> Result<(), ApiKeyNotFound<'a>> {
    let target = String::from("known keys");
    Err(ApiKeyNotFound::new("missing", target.as_str())) // compile error: cannot return a value referencing the local variable `target`
}
`String`: This is an owned string type. If you want the struct to own the data (i.e., you want to store the string data directly in the struct), you should use `String`. This means that the struct will be responsible for deallocating the string data when it is no longer needed. This is generally easier to work with because you don't have to worry about lifetimes, but it can cause more memory allocation/deallocation.
`&str`: This is a borrowed string type. If you want the struct to borrow the data (i.e., you want to store a reference to string data that is owned by something else), you should use `&str`. This means that the struct will not be responsible for deallocating the string data. This can be more efficient because it avoids unnecessary memory allocation/deallocation, but it can be harder to work with because you have to ensure that the string data outlives the struct (i.e., the string data is not deallocated while the struct still exists).
In this case, if `target_key` and `known_keys` are expected to be relatively short and not changed frequently, using `String` would be more convenient and the performance impact would be negligible.
If they are expected to be very large or changed frequently, you might want to consider using `&str` to avoid frequent memory allocation/deallocation. However, you would need to add a lifetime parameter to the struct to use `&str`, as in the `ApiKeyNotFound<'a>` definition above.
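A sketch of the owned variant, assuming a hypothetical `lookup` helper that builds the error from a local `String`. The struct is renamed `OwnedApiKeyNotFound` here only to keep it apart from the borrowed version; because it owns its data, it can be returned from the function without any lifetime parameter.

```rust
#[derive(Debug, PartialEq)]
pub struct OwnedApiKeyNotFound {
    target_key: String,
    known_keys: String,
}

/// Returns the position of `target` in `known`, or an owned error.
fn lookup(target: &str, known: &[&str]) -> Result<usize, OwnedApiKeyNotFound> {
    let joined = known.join(", "); // local String, created inside the function
    known
        .iter()
        .position(|k| *k == target)
        .ok_or_else(|| OwnedApiKeyNotFound {
            target_key: target.to_string(), // copy the borrowed key into an owned String
            known_keys: joined,             // moved into the error, so it outlives the function
        })
}
```

The cost is one allocation per field on the error path, which is usually negligible for error values.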
A lifetime is the scope within which a borrowed reference is valid. The aim of lifetimes is to prevent dangling references, which cause a program to reference data that has been deallocated.
let alice = "Alice";
let bob = "Bob";
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
if x.len() > y.len() {
x
} else {
y
}
}
let result = longest(alice, bob);
println!("The longest string is {}", result);
// this also works
fn longest<'a>(x: &'a str, _y: &str) -> &'a str {
x
}
// Struct with reference need lifetime annotation
struct Person<'a> {
name: &'a str
}
In this case the function returns a borrowed value, but without the annotation the compiler cannot know whether the returned reference should live as long as `x` or `y`.
Mutability constrains your ability to borrow references; only one of these two kinds of borrows can be active at a time:
- one or more references (`&T`) to a resource,
- exactly one mutable reference (`&mut T`).
let mut writer = vec![1, 2, 3];
let reader = &writer;
writer.push(4);
/*==== but this does not work ====*/
let mut writer = vec![1, 2, 3];
let reader = &writer;
writer.push(4);
println!("{:?}", reader); // compile error: push needs a mutable borrow while `reader` is still in use
/*==== implicit borrow as immutable (active reader )====*/
let mut writer = vec![1, 2, 3];
for i in writer.iter() { // borrow as immutable
writer.push(i * 2); // error: borrow as mutable
}
Solutions for MARSAW:
/*==== Reorganize code ====*/
let mut writer = vec![1, 2, 3];
{
let item = writer.last();
}
writer.push(4); // this works because the borrow has ended
/*==== clone ====*/
let item = writer.last().cloned(); // Option<i32>: an owned copy, no borrow is kept
writer.push(4); // this works because `item` does not borrow `writer`
/*==== interior mutability ====*/
use std::cell::RefCell;
let data = RefCell::new(vec![1, 2, 3]);
{
let im_ref = data.borrow(); // borrow immutably
}
let mut mut_ref = data.borrow_mut(); // borrow mutably
for i in mut_ref.iter_mut() {
*i += 1;
}
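With `RefCell` the same reader/writer rule still applies, but it is checked at run time instead of compile time. A minimal sketch using `try_borrow_mut` to observe that check without panicking (the function name `refcell_demo` is made up for this example):

```rust
use std::cell::RefCell;

fn refcell_demo() -> (bool, Vec<i32>) {
    let data = RefCell::new(vec![1, 2, 3]);

    let reader = data.borrow();                   // a shared borrow is active
    let blocked = data.try_borrow_mut().is_err(); // a writer is refused while the reader exists
    drop(reader);                                 // end the shared borrow

    data.borrow_mut().push(4);                    // now the mutable borrow succeeds
    (blocked, data.into_inner())
}
```

Calling `borrow_mut()` instead of `try_borrow_mut()` while `reader` is alive would panic.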
- Address-of/Reference operator: `&`
- Dereference operator: `*`
- Logical: `&&` - and, `||` - or, `!` - not, `(a || b) && !(a && b)` - xor
- Bitwise: `&` - and, `|` - or, `^` - xor, `!` - not, `<<` - left shift, `>>` - right shift
- Field access: `.` - access a field of a struct. Rust has no `->` operator for pointers: `.` dereferences automatically, and `->` only appears in function return types.
- Range: `..` - exclusive, `..=` - inclusive (i.e. `1..=5` is equivalent to `1, 2, 3, 4, 5`)
  - The real type is: `std::ops::Range` and `std::ops::RangeInclusive`
- Array indexing: `x[0]` - access element of an array
  - Slicing: `&x[1..3]` - access a slice of an array. Reference is needed because slicing does not copy the data, it creates a view or reference to the original array.
Iterator:
- The iterator returned by into_iter may yield any of T, &T or &mut T, depending on the context.
- The iterator returned by iter will yield &T, by convention.
- The iterator returned by iter_mut will yield &mut T, by convention.
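The three conventions can be seen side by side on a `Vec`. This is a small sketch; `iterator_kinds` is a hypothetical function name.

```rust
fn iterator_kinds() -> (i32, Vec<i32>, Vec<String>) {
    let mut nums = vec![1, 2, 3];

    // iter(): yields &i32, `nums` stays usable afterwards
    let total: i32 = nums.iter().sum();

    // iter_mut(): yields &mut i32, elements are changed in place
    for n in nums.iter_mut() {
        *n *= 10;
    }

    // into_iter() on an owned Vec: yields String by value and consumes the Vec
    let words = vec![String::from("a"), String::from("b")];
    let upper: Vec<String> = words.into_iter().map(|w| w.to_uppercase()).collect();

    (total, nums, upper)
}
```

After the `into_iter()` call, `words` is moved and cannot be used again.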
String slicing:
let s = "Hello, world!";
let slice = &s[0..5]; // Slice from byte index 0 to 5
println!("Slice: {}", slice); // Prints: Slice: Hello
let s = "Здравствуйте"; // "Hello" in Russian
let slice = &s[0..1]; // This will panic at runtime: byte index 1 is not a char boundary (each of these chars is 2 bytes in UTF-8)
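Two ways to avoid that panic, sketched with hypothetical helper names `first_chars` and `safe_slice`: compute the byte offset from `char_indices`, or use `str::get`, which returns `None` instead of panicking when a range does not fall on char boundaries.

```rust
/// Returns the first `n` characters of `s` without splitting a code point.
fn first_chars(s: &str, n: usize) -> &str {
    match s.char_indices().nth(n) {
        Some((byte_idx, _)) => &s[..byte_idx], // byte_idx is always a char boundary
        None => s,                             // the string has at most n chars
    }
}

/// Non-panicking byte-range slicing.
fn safe_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}
```

For example, `first_chars("Здравствуйте", 1)` yields the first Cyrillic letter, while `safe_slice("Здравствуйте", 0, 1)` yields `None`.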
- Lambda: `|x| x + 1` or `|x, y| { x + y }`
- Error propagation: `?` - propagate the error to the caller (up the call stack)
- Ignored values: `_`
- Macro expansion: `!` - `println!` is a macro, not a function
- Lifetime: `'a` - names the lifetime of a borrow; in `let r: &'a i32 = &x;` the reference `r` is valid for the lifetime `'a`
In Rust, the correct syntax for specifying type parameters is `Type::<TypeParameter>::function()`.
fn identity<T>(value: T) -> T {
value
}
struct Wrapper<T> {
value: T,
}
let result: i32 = identity(42);
let wrapper: Wrapper<i32> = Wrapper { value: 42 };
let vector = Vec::<i32>::new();
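The same `::<...>` form (often called the turbofish) also applies to generic methods and free functions. A small sketch with standard-library calls; `turbofish_demo` is a made-up name.

```rust
fn turbofish_demo() -> (i32, Vec<u8>, usize) {
    // parse() is generic over its return type; the turbofish picks i32.
    let n = "42".parse::<i32>().unwrap();
    // collect() is generic too; `_` lets the compiler infer the element type.
    let bytes = "abc".bytes().collect::<Vec<_>>();
    // A generic free function with an explicit type argument.
    let size = std::mem::size_of::<u32>();
    (n, bytes, size)
}
```

An equivalent alternative is a type annotation on the binding, such as `let n: i32 = "42".parse().unwrap();`.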
`<F>` is needed when declaring the function because every closure has its own anonymous type; any closure or function that satisfies the trait bound (here `Fn(i32, i32) -> i32`) can be passed to the function.
fn apply_operation<F>(x: i32, y: i32, operation: F) -> i32
where
F: Fn(i32, i32) -> i32,
{
operation(x, y)
}
let add_closure = |a, b| a + b;
let result = apply_operation(3, 5, add_closure);
Three types of closures:
`FnOnce(self)`: consumes (moves) the variables it captures from its enclosing scope, known as the closure’s environment, so it can be called only once.
`FnMut(&mut self)`: can change the environment because it mutably borrows values.
`Fn(&self)`: borrows values from the environment immutably.
The `move` keyword signals that the closure should take ownership of the variables it captures (they are moved in, or copied if they are `Copy`), even if they would normally be borrowed.
let x = 42;
// Closure without `move` (borrows x)
let borrow_closure = || {
    println!("{}", x);
};
// Closure with `move` (takes ownership of x; x is `Copy`, so it is copied in)
let move_closure = move || {
    println!("{}", x);
};
`a` would only be borrowed by the async block, which may outlive the closure call that created it, so `async move` is needed to make the block take ownership of `a`.
let closure = |a: u64| async move {
for i in 1..10 {
println!("Hi number {} from the closure!", a);
thread::sleep(Duration::from_millis(1));
}
};
In Rust, a closure that captures variables from its enclosing scope uses the least restrictive capture mode its body needs: an immutable borrow by default, a mutable borrow if it modifies a variable, or a move if it consumes one. To modify a captured variable, both the variable and the closure binding must be declared `mut`.
// Example of FnMut: the closure mutably borrows x, so the binding must be `mut`
let mut x = 7;
let mut add_to_x = |y| x += y;
add_to_x(5);
add_to_x(3);
println!("{}", x); // prints 15 (x is readable again after the last call)
// Example of FnOnce: the closure moves `s` out, so it can be called only once
let s = String::from("hello");
let consume = move || s;
let owned = consume();
println!("{}", owned); // prints hello
// consume(); // compile error: `consume` was moved by the first call
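The three traits matter most as bounds on functions that accept closures. In the sketch below, the helper names `call_twice`, `call_twice_mut`, and `call_once` are hypothetical; each accepts the weakest closure kind it needs.

```rust
fn call_twice<F: Fn() -> usize>(f: F) -> usize {
    f() + f() // Fn: may be called any number of times through a shared reference
}

fn call_twice_mut<F: FnMut()>(mut f: F) {
    f(); // FnMut: calling it needs mutable access to the closure
    f();
}

fn call_once<F: FnOnce() -> String>(f: F) -> String {
    f() // FnOnce: calling it consumes the closure
}

fn closure_kinds() -> (usize, i32, String) {
    let name = String::from("rust");
    let len_twice = call_twice(|| name.len()); // only reads `name`: Fn

    let mut counter = 0;
    call_twice_mut(|| counter += 1);           // mutates `counter`: FnMut

    let owned = call_once(move || name);       // gives `name` away: FnOnce
    (len_twice, counter, owned)
}
```

Every `Fn` closure also implements `FnMut` and `FnOnce`, so a bound of `FnOnce` accepts the widest range of closures.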
`panic!` macro: terminates the program immediately, unrecoverable.
`Result` type: `Ok(T)` or `Err(E)`, recoverable.
let x: Option<i32> = Some(5);
let y: Option<i32> = None;
let sum = x.unwrap() + y.unwrap(); // This will panic! Because y is None
let sum = x.unwrap() + y.unwrap_or(1); // this returns 6
let sum = x.unwrap() + y.unwrap_or_else(|| 2); // this returns 7
let sum = x.unwrap() + y.unwrap_or_default(); // this returns 5
let sum = x.unwrap() + y.expect("y is None"); // this will panic! with message "y is None"
// using map
let sum = x.map(|v| v + 1).unwrap(); // this returns 6; the same call on y would panic because y is None
// using match
let z = match x {
Some(v) => v, // `v` is the value inside `Some`, this automatically unwrap the value
None => 0,
};
// using and_then
let sum = x.and_then(|v: i32| Some(v + 1)); // return Some(6)
let sum1 = x.map(|v: i32| v + 1); // return Some(6)
let sum2 = x.map(|v: i32| Some(v + 1)).flatten(); // return Some(6), without flatten it return Some(Some(6))
// using map results in nested Option, using and_then results in flattened Option
#[derive(Debug)] enum Food {Apple, Carrot, Potato}
#[derive(Debug)] struct Chopped(Food);
#[derive(Debug)] struct Cooked(Food);
let food = Option::Some(Food::Apple);
let cook = food.map(|f| Chopped(f)).map(|Chopped(f)| Cooked(f));
let x: Result<i32, &str> = Ok(5);
let y: Result<i32, &str> = Err("error");
dbg!(x.unwrap()); // this return 5
dbg!(y.unwrap()); // this will panic! with message "error"
fn multiply(x: &str, y: &str) -> Result<i32, std::num::ParseIntError> {
let x: i32 = x.parse()?;
let y: i32 = y.parse()?;
Ok(x * y)
}
fn multiply(x: &str, y: &str) -> Result<i32, std::num::ParseIntError> {
x.parse::<i32>().and_then(|x| y.parse::<i32>().map(|y| x * y))
}
type IoResult<T> = std::result::Result<T, std::io::Error>;
Multiple Error Type:
// Pulling results out of Options, map_or means (if-None, if-Some)
let x: Option<Result<String, &str>> = Some(Ok("hello".to_string()));
let y = x.map_or(Ok(None), |r| r.map(Some)); // NOTE: x is moved to y; y is a Result<Option<String>, &str>
let x: Option<String> = Some("hello".to_string());
let r: Result<String, Box<dyn Error>> = x.ok_or_else(|| DoubleError.into()); // ok_or_else turns an Option into a Result
let x: Result<String, &str> = Ok("hello".to_string());
let r: Result<String, Box<dyn Error>> = x.map_err(|_e| DoubleError.into()); // map_err converts the error type of a Result
- Define a custom Error type
- Boxing Error: `Box<dyn Error>`, use alias `type Result<T> = std::result::Result<T, Box<dyn Error>>`
  - use `DoubleError.into()` to convert `DoubleError` to `Box<dyn Error>`
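A minimal sketch of both bullets together. `DoubleError` is assumed to be a fieldless custom error, the alias is named `BoxResult` here to avoid shadowing the standard `Result`, and `double_first` is a hypothetical function that can fail in two different ways.

```rust
use std::error::Error;
use std::fmt;

type BoxResult<T> = std::result::Result<T, Box<dyn Error>>;

#[derive(Debug)]
struct DoubleError;

impl fmt::Display for DoubleError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid first item to double")
    }
}

// The default method implementations of `Error` are sufficient here.
impl Error for DoubleError {}

fn double_first(items: &[&str]) -> BoxResult<i32> {
    // Option -> Result; `?` then converts DoubleError into Box<dyn Error>.
    let first = items.first().ok_or(DoubleError)?;
    // `?` converts ParseIntError into Box<dyn Error> in the same way.
    let n = first.parse::<i32>()?;
    Ok(2 * n)
}
```

The `?` operator performs the same conversion that an explicit `.into()` would, which is why both error types fit the single boxed return type.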
Must be called in an async context (i.e. `async fn main`).
let handle = tokio::task::spawn_blocking(|| {
// some potentially blocking operation, offload onto a separate thread pool
});
let handle = tokio::spawn(async {
// some async code, schedule on an async executor
});
Can be called in a sync context.
let rt = tokio::runtime::Runtime::new().unwrap();
rt.block_on(async {
// some async code, schedule on an async executor
});
Using Async and Thread side-by-side in a sync function.
fn main_sync() {
println!("Hello, world!");
let rt = tokio::runtime::Runtime::new().unwrap();
rt.block_on(async {
println!("block_on");
join!(test0(), test1());
});
println!("Main thread {:?}", thread::current().id());
let data = Arc::new(Mutex::new(0));
let mut handles = vec![];
for _ in 0..10 {
let data = Arc::clone(&data);
let handle = thread::spawn(move || {
println!("thread {:?}", thread::current().id());
let mut data = data.lock().unwrap();
*data += 1;
});
handles.push(handle);
}
for handle in handles {
handle.join().unwrap();
}
println!("Result: {}", *data.lock().unwrap());
}
Using Async and Thread side-by-side in an async function.
async fn main_async() {
// Start an async task using tokio::spawn
let closure = |a: u64| async move {
for i in 1..a {
println!("Hi number {} from the closure!", i);
thread::sleep(Duration::from_millis(1));
}
};
let async_handle = task::spawn(closure(5));
// Start a new thread using std::thread::spawn
let thread_handle = thread::spawn(|| {
for i in 1..5 {
println!("Hi number {} from the new thread!", i);
thread::sleep(Duration::from_millis(1));
}
});
// Wait for both tasks to complete
async_handle.await.unwrap();
thread_handle.join().unwrap();
}
Project template:
my_project
├── Cargo.toml
├── src
│   ├── main.rs
│   ├── lib.rs
│   ├── mod0.rs
│   └── mod1.rs
└── tests
    └── lib.rs
Define Modules: in lib.rs
mod mod0;
mod mod1;
Module code: in `mod0.rs` and `mod1.rs`
// mod0.rs
pub fn mod0_func() {
println!("mod0_func");
}
// mod1.rs
pub fn mod1_func() {
println!("mod1_func");
}
https://doc.rust-lang.org/cargo/guide/project-layout.html
.
├── Cargo.lock
├── Cargo.toml
├── src/
│   ├── lib.rs
│   ├── main.rs
│   └── bin/
│       ├── named-executable.rs
│       ├── another-executable.rs
│       └── multi-file-executable/
│           ├── main.rs
│           └── some_module.rs
├── benches/
│   ├── large-input.rs
│   └── multi-file-bench/
│       ├── main.rs
│       └── bench_module.rs
├── examples/
│   ├── simple.rs
│   └── multi-file-example/
│       ├── main.rs
│       └── ex_module.rs
└── tests/
    ├── some-integration-tests.rs
    └── multi-file-test/
        ├── main.rs
        └── test_module.rs
- Maturin create new project: `maturin new [name] --name=[name] --bindings=pyo3 --mixed`
- `build-essential` and `gcc` are needed for `cargo test` and to build the `rlib`
root
├── Cargo.toml
├── pyrust-lib
│   ├── Cargo.toml
│   ├── src
│   │   └── lib.rs
│   ├── bin
│   │   └── main.rs
│   ├── tests
│   │   └── test.rs
│   └── examples
│       └── example.rs
├── pyrust-rs
│   ├── Cargo.toml
│   └── src
│       └── main.rs
└── pyrust-py
    ├── Cargo.toml
    ├── pyproject.toml
    ├── .rustfmt.toml
    ├── src/lib.rs
    ├── python/pyrust_py/__init__.py /py.typed /pyrust_py.pyi
    └── tests/test_integration.rs
- Create an empty `py.typed` marker file so type checkers know the package ships type information
- Create a `pyrust_py.pyi` stub file to declare the Python interface
For Rust only project:
pyrust
├── Cargo.toml
├── pyproject.toml
├── .rustfmt.toml
├── src/lib.rs
├── pyrust.pyi
For mixed Python and Rust project:
pyrust
├── Cargo.toml
├── pyproject.toml
├── src/lib.rs
└── python/pyrust_py/__init__.py /py.typed /pyrust.pyi
Example of `.pyi` file:
from typing import final
__all__ = ["sum_as_string"]
@final
def sum_as_string(a: int, b: int) -> str:
"""Sum two integers (add something) and return the result as a string."""
...
[workspace]
members = [ "pyrust-lib", "pyrust-rs", "pyrust-py" ]
resolver = "2"
[package]
name = "pyrust-py"
version = "0.1.0"
edition = "2021"
include = ["src", "python/pyrust_py", "pyproject.toml", "README.md", "!*.so"]
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[lib]
name = "pyrust_py"
# "cdylib" is necessary to produce a shared library for Python to import from.
# Downstream Rust code (including code in `bin/`, `examples/`, and `tests/`) will not be able
# to `use string_sum;` unless the "rlib" or "lib" crate type is also included, e.g.:
crate-type = ["cdylib", "rlib"]
[dependencies]
pyo3 = { version = "0.20.2" }
pyrust-lib = { path = "../pyrust-lib" }
[dev-dependencies]
pyo3 = { version = "0.20.2", features = ["auto-initialize"] }
[build-dependencies]
version_check = "0.9.4"
[features]
extension-module = ["pyo3/extension-module"]
# default = ["extension-module"]
[build-system]
requires = ["maturin>=1.4,<2.0"]
build-backend = "maturin"
[project]
name = "pyrust-py"
requires-python = ">=3.11"
classifiers = [
"Programming Language :: Rust",
"Programming Language :: Python :: Implementation :: CPython",
"Programming Language :: Python :: Implementation :: PyPy",
]
dynamic = ["version"] # maturin fills in the version from Cargo.toml
# pip install -e .[tests]
[project.optional-dependencies]
tests = ["pytest"]
[tool.maturin]
python-source = "python"
# module-name = "pyrust_py._pyrust_py"
bindings = "pyo3"
features = ["pyo3/extension-module"]
NOTE: `module-name` must contain both the Python package name and the Rust lib name, separated by a dot (e.g. `pyrust_py._pyrust_py` indicates `import pyrust_py` for the Python package and `from . import _pyrust_py` for the Rust lib).
If the Python module name is the same as the Rust lib name, then `module-name` can be omitted. Otherwise, it must be defined.
Hierarchical structure:
/projects/foo/bar/.cargo/config.toml
/projects/foo/.cargo/config.toml
/projects/.cargo/config.toml
$CARGO_HOME/config.toml
paths = ["/projects/foo/target", "/projects/target"] # dependency overrides
[alias] # command aliases
t = "test -- --nocapture"
rr = "run --release"
[build]
jobs = 1
rustflags = ["-C", "target-cpu=native"] #compiler flags
[profile.dev] # cargo build
[profile.test] # cargo test
[profile.bench] # cargo bench
[profile.doc] # cargo doc
[profile.release] # cargo build --release
opt-level = 3
lto = "fat" # link-time optimization: "off", "thin" - part, "fat" - whole
codegen-units = 1 # When you set `codegen-units = 1`, it means that the compiler will generate the machine code for your program in a single unit. This can potentially lead to more optimized code because the compiler has full visibility into all parts of your program when performing optimizations.
# Profile overrides
[profile.dev.package."*"]
opt-level = 1
incremental = true # Enable incremental compilation
codegen-units = 4 # Set the number of codegen units for parallel compilation
[doc]
browser = "chromium"
[env]
RUST_BACKTRACE = "full" # full, minimal, 1
RUST_TEST_THREADS = "1"
PYO3_PRINT_DEBUG = "1"
CARGO_LOG = "info" # trace, debug, info, warn, error
# .rustfmt.toml
max_width = 100
tab_spaces = 4
hard_tabs = false
newline_style = "auto" # auto, native, unix, windows
use_small_heuristics = "Default" # Default, Off, Max
edition = "2021"
use std::collections::HashMap;
use pyo3::prelude::*;
use serde_json::Value;
fn value_to_object( val: &Value, py: Python<'_> ) -> PyObject {
match val {
Value::Null => py.None(),
Value::Bool( x ) => x.to_object( py ),
Value::Number( x ) => {
let oi64 = x.as_i64().map( |i| i.to_object( py ) );
let ou64 = x.as_u64().map( |i| i.to_object( py ) );
let of64 = x.as_f64().map( |i| i.to_object( py ) );
oi64.or( ou64 ).or( of64 ).expect( "number too large" )
},
Value::String( x ) => x.to_object( py ),
Value::Array( x ) => {
let inner: Vec<_> = x.iter().map(|x| value_to_object(x, py)).collect();
inner.to_object( py )
},
Value::Object( x ) => {
let inner: HashMap<_, _> =
x.iter()
.map( |( k, v )| ( k, value_to_object( v, py ) ) ).collect();
inner.to_object( py )
},
}
}
#[repr(transparent)]
#[derive( Clone, Debug )]
struct ParsedValue( Value );
impl ToPyObject for ParsedValue {
fn to_object( &self, py: Python<'_> ) -> PyObject {
value_to_object( &self.0, py )
}
}
#[pyfunction]
pub fn parse() -> PyResult<PyObject> {
let mapping: HashMap<i64, HashMap<String, ParsedValue>> = HashMap::from( [
( 1, HashMap::from( [
( "test11".to_string(), ParsedValue( "Foo".into() ) ),
( "test12".to_string(), ParsedValue( 123.into() ) ),
] ) ),
( 2, HashMap::from( [
( "test21".to_string(), ParsedValue( "Bar".into() ) ),
( "test22".to_string(), ParsedValue( 123.45.into() ) ),
] ) ),
] );
Ok( pyo3::Python::with_gil( |py| {
mapping.to_object( py )
} ) )
}
#[pymodule]
fn parser( _py: Python, m: &PyModule ) -> PyResult<()> {
m.add_function( wrap_pyfunction!( parse, m )? )?;
Ok( () )
}
Rust Lulz: Godbolt assembly exploring without crate limitations, in Visual Studio Code https://saveriomiroddi.github.io/Rust-lulz-godbolt-assembly-exploring-without-crate-limitations-in-visual-studio-code/
- LLVM (and GCC) don't know how to auto-vectorize loops whose trip-count can't be calculated up front. This rules out search loops like this.
- Probably your only hope would be to manually loop over 2, 4, or 8-element chunks of the arrays, branchlessly calculating your condition based on all those elements. If you're lucky, LLVM might turn that into operations on one SIMD vector. So using that inner loop inside a larger loop could result in getting the compiler to make vectorized asm, for example using AVX vptest (which sets CF according to bitwise a AND (not b) having any non-zero bits). i.e. manually express the "unrolling" of SIMD elements in your source, for a specific vector width.
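One possible shape of that manual chunking, as a sketch only: `find_chunked` and the `LANES` width of 8 are assumptions made for this example, and whether LLVM actually emits SIMD for the inner loop depends on the target and compiler version, so check the generated assembly.

```rust
/// Returns the index of the first element equal to `needle`, if any.
pub fn find_chunked(haystack: &[i32], needle: i32) -> Option<usize> {
    const LANES: usize = 8; // assumed vector width in elements; tune per target
    let mut chunks = haystack.chunks_exact(LANES);
    let mut base = 0;
    for chunk in &mut chunks {
        // Fixed trip count and no early exit: a candidate for SIMD compare + OR.
        let mut hit = false;
        for &v in chunk {
            hit |= v == needle;
        }
        if hit {
            // Only now run the scalar scan to locate the exact position.
            return chunk.iter().position(|&v| v == needle).map(|i| base + i);
        }
        base += LANES;
    }
    // Scalar tail for the remaining (fewer than LANES) elements.
    chunks
        .remainder()
        .iter()
        .position(|&v| v == needle)
        .map(|i| base + i)
}
```

The outer loop still has an unknown trip count, but each inner loop runs exactly `LANES` times, which is the part the optimizer can treat as one vector operation.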
pub fn demo_slow(x: &[i32], y: &[i32], z: &mut [i32]) {
for i in 0..z.len() {
z[i] = x[i] * y[i];
}
}
pub fn demo_fast(x: &[i32], y: &[i32], z: &mut [i32]) {
let n = z.len();
let (x, y, z) = (&x[..n], &y[..n], &mut z[..n]); // NOTE: reslicing
for i in 0..z.len() {
z[i] = x[i] * y[i];
}
}
`for i in 0..n` loops are generally the slowest in Rust. If you can't use iterators (which eliminate bounds checks), then the trick is to hint LLVM that the slices are large enough: `let x = &x[0..n];`, but that very much depends on whether LLVM will figure out that the range of the slice matches the range used in the for loop.
rustflags = ["-C", "target-cpu=native", "-C", "llvm-args=-ffast-math", "-C", "opt-level=3", "-C", "llvm-args=-force-vector-width=16"]
It uses the shortest length between x, y, and z
pub fn demo_iter(x: &[i32], y: &[i32], z: &mut [i32]) {
let products = std::iter::zip(x, y).map(|(&x, &y)| x * y);
std::iter::zip(z, products).for_each(|(z, p)| *z = p);
}
// This vectorizes great
pub fn dot_int(x: &[i32], y: &[i32]) -> i32 {
std::iter::zip(x, y).map(|(&x, &y)| x * y).sum()
}
// But this only unrolls, without vectorizing: float addition is not associative, so the compiler must keep the summation order
pub fn dot_float(x: &[f32], y: &[f32]) -> f32 {
std::iter::zip(x, y).map(|(&x, &y)| x * y).sum()
}
pub fn dot<const N: usize>(x: &[f32; N], y: &[f32; N]) -> f32 {
let (x, x_tail) = x.as_chunks::<4>();
let (y, y_tail) = y.as_chunks::<4>();
assert!(x_tail.is_empty() && y_tail.is_empty(), "N must be a multiple of 4");
let mut sums = [0.0; 4];
for (x, y) in std::iter::zip(x, y) {
let [x0, x1, x2, x3] = *x;
let [y0, y1, y2, y3] = *y;
let [p0, p1, p2, p3] = [x0 * y0, x1 * y1, x2 * y2, x3 * y3];
sums[0] += p0;
sums[1] += p1;
sums[2] += p2;
sums[3] += p3;
}
(sums[0] + sums[1]) + (sums[2] + sums[3])
}
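A variant of the same multi-accumulator idea for plain slices of any length, sketched with `chunks_exact` so the leftover elements are handled by a scalar tail instead of an assertion. `dot_chunked` is a hypothetical name, the lane count of 4 is carried over from the function above, and vectorization is not guaranteed.

```rust
pub fn dot_chunked(x: &[f32], y: &[f32]) -> f32 {
    // Use the shorter length, as the zip-based versions do.
    let n = x.len().min(y.len());
    let (x, y) = (&x[..n], &y[..n]);

    let mut xc = x.chunks_exact(4);
    let mut yc = y.chunks_exact(4);
    let mut sums = [0.0f32; 4]; // four independent accumulators
    for (a, b) in (&mut xc).zip(&mut yc) {
        for lane in 0..4 {
            sums[lane] += a[lane] * b[lane];
        }
    }

    // Scalar tail: the last n % 4 elements.
    let tail: f32 = xc
        .remainder()
        .iter()
        .zip(yc.remainder())
        .map(|(a, b)| a * b)
        .sum();
    (sums[0] + sums[1]) + (sums[2] + sums[3]) + tail
}
```

Because the accumulators are summed in a different order than a simple left-to-right loop, results can differ in the last bits for inputs that are not exactly representable.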