Friday, April 21, 2023

Data queries and transformations via WebAssembly plugins

Following on the previous research I've done on using WebAssembly as a way to write plugins possibly in several languages and running them in a Rust app via the Wasmer.io runtime, I've started building something more extensive.

The plugins now allow writing data querying and transformation code without having to deal with the low level details on how to connect to the underlying database. A plugin can:

- define the input parameters it needs to run.

- based on the actual provided values for these parameters, generate a SQL query string and bound parameters that actually need to run on the database. So a plugin can have full control on the SQL generation. Some plugins could always use the same SQL query and just use bound parameters coming from their input parameters, or could do any kind of preprocessing to generate the SQL, for things that bound parameters don't allow.

- the runtime will run the query with the bound parameters on the underlying database, and return the result row by row to the plugin. The plugin can then do whatever processing on each row it needs, and return intermediate results or only return results at the end.

Since I using Wasmer, let's define the interface we need in WAI format:

// Get the metadata of the query.
metadata: func() -> query-metadata

// Start the query processing with the given variables.
start: func(variables: list<variable>) -> execution

// Encapsulates the query row processing.
resource execution {
// The actual query to run.
query-string: func() -> string

// The variables to use in the query.
variables: func() -> list<variable>

// Callback on each data row, returning potential intermediate results.
row: func(data: list<variable>) -> option<query-result>

// Callback on query end, returning potential final results.
// Columns are passed in case no data was returned.
end: func(columns: list<string>) -> option<query-result>
}

The simple data types are left out for brevity (they are defined in their own file), but hopefully the intent of this interface should be clear.  The metadata function returns a description of what the plugin does and what parameters it takes (a parameter is strongly typed). The start function take actual values for parameters and return an execution resource. This execution exposes the actual query string and bound parameters for that query (via the variables method). The runtime will then call the row function for each result row and the end function at the end. Each of these can return a possibly partial result. The end function takes the names of columns so that proper metadata is known even if the query returned no row.

Examples of very basic plugins used in tests can be seen here and here. They just collect the data passed to them and return it in the end method.

Each plugin can then be compiled to WASM for example via the cargo wapm --dry-run command provided by Wasmer.

The current runtime I've built is very simple: it takes all plugins from a folder and database connections are defined in a YAML file, and only Sqlite and Postgres are supported. An executable is provided to be able to run plugins from the command line.

Using a WAI interface and not having to deal with low level WASM code is great. cargo expand is your friend to understand what Wasmer generates as structures are generated differently between the import! and export! macros, so some structures own their data while some take references, which can sometimes trip you up.

Of course I would need to test this for performance, to determine how much copying of data is done between the Rust runtime and the WASM plugins.

Let me know if you have use cases where this approach could be interesting! As usual, all code is available on Github.


Tuesday, April 11, 2023

WebAssembly: bidirectional communication between components and host

Still investigating the use of WebAssembly to implement a plugin system using Rust and Wasmer. In the previous post, I could load a WebAssembly plugin that implemented an interface that the host knew about. But since WebAssembly is very limited by design in terms of API, as soon as a plugin wants to interact with the outside world in some shape of form, it needs to be able to call an API the host will provide. Thus the host can ensure safety, performance, etc. without allowing the WebAssembly code direct access to real resources.

So let's say we'd like to customize our greeting message based on the time of day. We're going to need a very simple function to tell us the current hour of the day:

hour: func() -> u32

This goes into host.wai. In our english-rs module we can then import! this file with standard web assembly Rust generation and use it in our implementation of greet:

wai_bindgen_rust::export!("greeter.wai");
wai_bindgen_rust::import!("host.wai");

struct Greeter;

impl crate::greeter::Greeter for Greeter {
/// The language we greet in.
fn language() -> String {
String::from("English")
}

/// Greet the given name.
fn greet(name: String) -> String {
let hour = host::hour();
if hour < 12 {
format!("Good morning, {name}!")
} else if hour < 18 {
format!("Good afternoon, {name}!")
} else {
format!("Good evening, {name}!")
}
}
}

Note how we call the host::hour function and get the hour. Of course here we doing something trivial using standard WebAssembly types like u32, things would become more complicated if the types get more complex.

The host code needs to provide an implementation of the hour function and inject it via imports:

fn hour() -> u32 {
let now = chrono::Local::now();
now.hour()
}
 
fn main() -> Result<()> {
 ...
let imports = imports! {
"host" => {
"hour" => Function::new_typed(&mut store, hour)
},
};
let instance = Instance::new(&mut store, &module, &imports)?;
...

Et voilĂ !

cargo run -- "JP Moresmau"
Language: English
Greeting: Good afternoon, JP Moresmau!

All code can be found as before at https://github.com/JPMoresmau/greet-plugins/tree/wasmer-wai.

Happy WebAssembly and Rust hacking!


Monday, April 10, 2023

Web Assembly Interfaces help integration of WASM libraries

 In the previous post, I showed how to run plugins generated from Rust code with wasm-bindgen, using the wasmtime crate. I then discovered Wasmer, so I rewrote the runtime to use the Wasmer API, which is very similar (see https://github.com/JPMoresmau/greet-plugins/tree/wasmer). But I see Wasmer have a lot more tooling available than just a runtime, so let's see how it can help!

I followed first the tutorial at https://wasmer.io/posts/wasmer-takes-webassembly-libraries-manistream-with-wai.

You can first define the Web Assembly interface you want to expose in a type of IDL file - think protobuf or Corba, depending on your age :-). Easy enough in our case:

language: func() -> string

greet: func(name: string) -> string

We can put that file (greeter.wai) in our english-rs crate folder, and remove all wasm-bindgen dependencies and related code. We can then use the export! macro of the wai-bindgen-rust crate to automatically generate a trait that defines both function, and then provide an implementation:

wai_bindgen_rust::export!("greeter.wai");

struct Greeter;

impl crate::greeter::Greeter for Greeter {
/// The language we greet in.
fn language() -> String {
String::from("English")
}

/// Greet the given name.
fn greet(name: String) -> String {
format!("Hello, {name}!")
}
}

That's it! The only change from the previous code apart from the impl block is that the name parameter is an owned String and not a &str.

Then I can publish this library to the wasmer WebAssembly libraries repositories via the cargo-wapm command. It now lives at https://wapm.io/JPMoresmau/english-rs. You can download the .wasm file from there!

What about the runtime? There's not a lot of documentation yet because of lot of this is still beta (and the specs of WebAssembly Interfaces and related concepts are still in flux), but it's possible to use the import! macro of the wai-bindgen-wasmer crate to generate code to interact with the module, using the same wai file: we use the same file that defines the interface to both generate the trait we need to implement and the client struct. This is what our greeter now looks like:

use std::{env, fs};

use anyhow::{anyhow, Result};
use greeter::{Greeter, GreeterData};
use wasmer::*;
use wasmer_compiler_llvm::LLVM;

wai_bindgen_wasmer::import!("greeter.wai");

/// Greet using all the plugins.
fn main() -> Result<()> {
let args: Vec<String> = env::args().collect();
if args.len() != 2 {
return Err(anyhow!("Usage: i18n-greeter <name>"));
}
let compiler_config = LLVM::default();
let engine = EngineBuilder::new(compiler_config).engine();

let paths = fs::read_dir("./plugins").unwrap();

for path in paths {
let path = path?;
let mut store = Store::new(&engine);

let module = Module::from_file(&store, path.path())?;

let imports = imports! {};
let instance = Instance::new(&mut store, &module, &imports)?;
let env = FunctionEnv::new(&mut store, GreeterData {});
let greeter = Greeter::new(&mut store, &instance, env)?;

let language = greeter.language(&mut store)?;
println!("Language: {language}");
let greeting = greeter.greet(&mut store, &args[1])?;
println!("Greeting: {greeting}");
}
Ok(())
}

No more requirement to understand how to call WASM functions and get the return value, the Wasmer WAI generated code does that for you! The API could still be cleaner (maybe without that store parameter everywhere) but the convenience is already a clear win.

All this code can be found at https://github.com/JPMoresmau/greet-plugins/tree/wasmer-wai.

Happy WebAssembly hacking!

Friday, April 07, 2023

A WebAssembly plugin system

Wow, it's been 4 years since I last blogged something! I usually just lurk on social media and I guess I was too busy to find the time to write about something reasonably interesting...

But hopefully today I did something I can present. I wanted to see how I could use WebAssembly to write plugins I could then load dynamically in Rust code. So far, results are good but only work with the wasm-bindgen conventions for calling and get results from functions. Still!

A lot of WebAssembly samples show you how to pass around the basic numeric types like i32, so I wanted something with Strings for a little extra complexity and interest. So our plugins will expose two methods:

- language takes no argument and return a string indicating which (human, not programming) language the plugin handles

- greet takes one argument, a person name, and return a greeting in the plugin's language

As you can see, not too involved, but a nice little use case.

The first plugin in Rust

I followed the instructions at the Rust and WebAssembly guide to get started, this was really painless. The Rust code for my first plugin is simply (including some generated code I didn't touch):

mod utils;

use wasm_bindgen::prelude::*;

// When the `wee_alloc` feature is enabled, use `wee_alloc` as the global
// allocator.
#[cfg(feature = "wee_alloc")]
#[global_allocator]
static ALLOC: wee_alloc::WeeAlloc = wee_alloc::WeeAlloc::INIT;

/// The language we greet in.
#[wasm_bindgen]
pub fn language() -> String {
String::from("English")
}

/// Greet the given name.
#[wasm_bindgen]
pub fn greet(name: &str) -> String {
format!("Hello, {name}!")
}

Running wasm-pack build gives us a .wasm file that exports language and greet, good!

The plugin runner in Rust

I used the wasmtime crate to get a runtime engine capable of loading and calling WebAssembly modules. I didn't look for any utility functions and implemented the string handling functions necessary to interact with the wasm-bindgen exposed functions myself.

So cargo.toml has very few dependencies:

[dependencies]
wasmtime = "1.0.0"
anyhow = "1.0.70"
byteorder = "1.4.3"

The main function is fairly straightforward: it gets the arguments, creates the WASM engine and asks each plugin it can find in a folder to greet the person:

fn main() -> Result<()> {
let args: Vec<String> = env::args().collect();
if args.len() != 2 {
return Err(anyhow!("Usage: i18n-greeter <name>"));
}
let engine = Engine::default();
let linker = Linker::new(&engine);

let paths = fs::read_dir("./plugins").unwrap();

for path in paths {
let path = path?;
let module = Module::from_file(&engine, path.path())?;
let mut runtime = Runtime::new(&engine, &linker, &module)?;
let language = runtime.language()?;
println!("Language: {language}");
let greeting = runtime.greet(&args[1])?;
println!("Greeting: {greeting}");
}
Ok(())
}

The magic is inside the Runtime struct, that actually handles the nitty-gritty of initializing all the necessary things for wasmtime to do its thing:

struct Runtime {
store: Store<()>,
memory: Memory,
/// Pointer to currently unused memory.
pointer: usize,
language: TypedFunc<i32, ()>,
greet: TypedFunc<(i32, i32, i32), ()>,
}

We have to manage the WebAssembly linear memory ourselves so we do it very simplify by keeping a pointer to where in the memory we can put stuff.

Initializing the runtime:

fn new(engine: &Engine, linker: &Linker<()>, module: &Module) -> Result<Self> {
let mut store = Store::new(engine, ());

let instance = linker.instantiate(&mut store, module)?;

let memory = instance
.get_memory(&mut store, "memory")
.ok_or(anyhow::format_err!("failed to find `memory` export"))?;
let language = instance
.get_func(&mut store, "language")
.ok_or(anyhow::format_err!(
"`language` was not an exported function"
))?
.typed::<i32, (), _>(&store)?;
let greet = instance
.get_func(&mut store, "greet")
.ok_or(anyhow::format_err!("`greet` was not an exported function"))?
.typed::<(i32, i32, i32), (), _>(&store)?;

Ok(Self {
store,
memory,
pointer: 0,
language,
greet,
})
}

With this we can do our own very basic memory management, which means reserving an area of memory, for example to read and write strings as UTF8 byte arrays, using the wasm-bindgen conventions:

/// Get a new pointer to store the given size in memory.
/// Grows memory if needed.
fn new_pointer(&mut self, size: usize) -> Result<i32> {
let current = self.pointer;
self.pointer += size;
while self.pointer > self.memory.data_size(&self.store) {
self.memory.grow(&mut self.store, 1)?;
}
Ok(current as i32)
}

/// Reset pointer, so memory can get overwritten.
fn reset_pointer(&mut self) {
self.pointer = 0;
}

/// Read string from memory.
fn read_string(&self, offset: i32, length: i32) -> Result<String> {
let mut contents = vec![0; length as usize];
self.memory
.read(&self.store, offset as usize, &mut contents)?;
Ok(String::from_utf8(contents)?)
}

/// Read bounds from memory.
fn read_bounds(&self, offset: i32) -> Result<(i32, i32)> {
let mut buffer = [0u8; 8];
self.memory
.read(&self.store, offset as usize, &mut buffer)?;
let start = (&buffer[0..4]).read_i32::<LittleEndian>()?;
let length = (&buffer[4..]).read_i32::<LittleEndian>()?;
Ok((start, length))
}

/// Write string into memory.
fn write_string(&mut self, str: &str) -> Result<(i32, i32)> {
let data = str.as_bytes();
let offset = self.new_pointer(data.len())?;
self.memory.write(&mut self.store, offset as usize, data)?;
Ok((offset, str.len() as i32))
}


Basically we pass two i32 when we need to transfer a string, the offset and length that we use on the linear memory to read or write the bytes.

Using these, wrapping our plugin functions is easy:

/// Call language function.
fn language(&mut self) -> Result<String> {
let offset = self.new_pointer(16)?;
self.language.call(&mut self.store, offset)?;
let (offset, length) = self.read_bounds(offset)?;
let s = self.read_string(offset, length)?;
self.reset_pointer();
Ok(s)
}

/// Call greet function.
fn greet(&mut self, name: &str) -> Result<String> {
let offset = self.new_pointer(16)?;
let (start, length) = self.write_string(name)?;
self.greet.call(&mut self.store, (offset, start, length))?;
let (offset, length) = self.read_bounds(offset)?;
let s = self.read_string(offset, length)?;
self.reset_pointer();
Ok(s)
}

So if we copy the wasm file compiled from our first plugin into the plugins directory and run the program with my name:

cargo run "JP Moresmau"
    Finished dev [unoptimized + debuginfo] target(s) in 0.08s
     Running `target/debug/i18n-greeter 'JP Moresmau'`
Language: English
Greeting: Hello, JP Moresmau!

Further steps

Of course now what would be very cool would be to be able to write plugins in other languages, but for example it looks like Go, even with TinyGo, is still very much tied to a Javascript runtime. Maybe the wasm-bindgen conventions will be ported to other languages than Rust in the future?

Trying it yourself

All the code can be found at https://github.com/JPMoresmau/greet-plugins.