
#Functional Programming with High Performance Actors

##Introduction

Why do we need a new approach to programming? On the one hand, Solid State Disks (SSDs) have shifted the balance between I/O and CPU. When I/O was the bottleneck there was often little benefit to faster code; that is simply no longer the case. Improvements in execution speed can now significantly improve program performance.

On the other hand, computers are getting fatter, not faster. An i9 chip has 12 hardware threads. A Sparc T4 chip has 64. But writing programs that actually run faster as you add more hardware threads is difficult--which is one of the reasons why single-threaded platforms like node.js are so popular. Passing data between threads (messaging) is inherently slow. Programs which work well with multiple threads typically pass data between threads in large chunks--data-flow based software, for example. So moving everything to an asynchronous model like actors is unlikely to work well when a lot of small messages are involved, since each one pays the cost of crossing threads.

As we move to use more threads in our software, managing state becomes more complex. Locks are not a particularly good solution, as lock-based code is prone to errors--especially as it changes over time. Actors provide an attractive solution to managing state, as do concurrent data structures.

But with current technologies we end up developing programs with an architecture specific to the problem at hand, and then re-architecting them as the requirements change. This is a labor-intensive approach that minimizes code reuse, particularly when using actors, where refactoring often means turning method calls into asynchronous messages, or asynchronous messages back into method calls. To maximize reuse, an actor should neither know nor care whether its exchange of messages with another actor is synchronous or asynchronous. And if most message exchanges are synchronous, we can have many small actors working at a fairly low level without slowing the program to a crawl.
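Here is a minimal conceptual sketch in Scala of that idea--the names (Mailbox, SyncMailbox, AsyncMailbox, Counter) are hypothetical illustrations, not AsyncFP's actual API. The sender always calls increment the same way; the target's mailbox decides whether the request runs on the caller's thread or is queued for a separate thread.

```scala
import java.util.concurrent.{Executors, ExecutorService}

trait Mailbox {
  def deliver(work: () => Unit): Unit
}

// Synchronous mailbox: the request runs immediately on the caller's thread.
class SyncMailbox extends Mailbox {
  def deliver(work: () => Unit): Unit = work()
}

// Asynchronous mailbox: the request is queued and runs on a separate thread.
class AsyncMailbox(executor: ExecutorService) extends Mailbox {
  def deliver(work: () => Unit): Unit =
    executor.execute(new Runnable { def run(): Unit = work() })
}

class Counter(mailbox: Mailbox) {
  private var count = 0
  // The caller neither knows nor cares which kind of mailbox is in use.
  def increment(responseHandler: Int => Unit): Unit =
    mailbox.deliver { () =>
      count += 1
      responseHandler(count)
    }
}

object Demo extends App {
  val syncCounter = new Counter(new SyncMailbox)
  syncCounter.increment(n => println("sync count = " + n))   // runs on this thread

  val executor = Executors.newSingleThreadExecutor()
  val asyncCounter = new Counter(new AsyncMailbox(executor))
  asyncCounter.increment(n => println("async count = " + n)) // runs on the executor's thread
  executor.shutdown()
}
```

With the synchronous mailbox the exchange is effectively a method call; swapping in the asynchronous mailbox changes nothing in the caller's code, which is what makes it easy to push blocking I/O or heavy computation onto its own thread.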

Another issue is flow control and its impact on throughput. Messaging is typically one-way, and it takes extra effort to add flow control. But without some flow control, the systems we develop behave badly under load. Yet if you send one message at a time and do not proceed until a response is received, most actor implementations suffer a severe drop in throughput. What is needed is a messaging system that behaves more like method calls while still sustaining a reasonably high throughput rate.
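As one illustration of how flow control can be layered on top of one-way messaging (a general technique, not AsyncFP-specific), the sketch below bounds the number of unacknowledged messages with a semaphore, so the producer keeps the pipeline full without flooding the consumer.

```scala
import java.util.concurrent.{Executors, Semaphore}

object BoundedSender extends App {
  val consumer = Executors.newSingleThreadExecutor()
  val window = new Semaphore(64) // at most 64 unacknowledged messages in flight

  def send(msg: Int): Unit = {
    window.acquire() // blocks the producer once the window is full
    consumer.execute(new Runnable {
      def run(): Unit = {
        // ... process msg here ...
        window.release() // releasing the permit acts as the acknowledgement
      }
    })
  }

  (1 to 10000).foreach(send)
  consumer.shutdown()
}
```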

AsyncFP Actors, by default, are single threaded. But an actor can be initialized with an asynchronous mailbox, and then it will run on a separate thread when it has messages to process. So we can easily break out blocking I/O and heavy computations (> 5 microseconds) to take advantage of the available hardware threads.

Message passing in AsyncFP is very close to a simple method call, and nearly as fast. A request/response runs in about 200 nanoseconds. (The only exception is when the target actor is both idle and asynchronous--and then it takes about 3 microseconds.) Here's an in-depth look at the code that accomplishes this: Speed --New in release 2.0.0

Speed is important, but keeping the code clean and simple is just as important. And addressing that is actually the larger part of AsyncFP: --New in release 2.0.0

  1. Sequence, which supports a functional programming approach.
  2. Chain, a simple state machine to help keep the code linear.
  3. AsyncMailbox, for when you need things to run in parallel.
  4. Transactions, for added thread safety.
  5. ExceptionHandlers, which are helpful as control flows are not always predictable.
  6. Factories and Components, for improved code reuse and flexibility.
  7. Dependency Injection simplifies access to common operations.
  8. File Loader loads files asynchronously.
  9. Properties, parameters for multiple actors.
  10. Factory Registry, needed for factory-based serialization.
  11. Actor Registry, an easy way to track actors.
  12. ...

##Blip Project - Beta

  • Tutorial (Docs somewhat dated as of release 2.0.0.)

##Building on Blip

Incremental Deserialization - Alpha (Docs somewhat dated as of release 2.0.0.)

  • Incremental (lazy) deserialization, for high-speed updates of a persistent store. Only the data which is accessed needs to be deserialized, and only the data which is updated needs to be reserialized (see the sketch after this list).
  • Deserialization makes use of a factory registry. Configuration data is then not serialized, but is supplied by the factory when an actor is reinstantiated.
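The following minimal sketch shows the general idea of lazy deserialization; the names (LazyField and the caller-supplied serialize/deserialize functions) are hypothetical, not the Incremental Deserialization API itself. A field deserializes its bytes only on first access and re-serializes only if it was changed.

```scala
class LazyField[A](initialBytes: Array[Byte],
                   deserialize: Array[Byte] => A,
                   serialize: A => Array[Byte]) {
  private var bytes: Array[Byte] = initialBytes
  private var value: Option[A] = None // deserialized form, if it has been accessed
  private var dirty: Boolean = false  // true when value no longer matches bytes

  // Deserialize on demand; untouched fields never pay the cost.
  def get: A = value.getOrElse {
    val v = deserialize(bytes)
    value = Some(v)
    v
  }

  def set(v: A): Unit = {
    value = Some(v)
    dirty = true
  }

  // Re-serialize only if the field was actually updated.
  def toBytes: Array[Byte] = {
    if (dirty) {
      bytes = serialize(value.get)
      dirty = false
    }
    bytes
  }
}
```

An update to one field of a large structure therefore touches only that field's bytes; everything else passes through untouched.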

Persistence - Alpha (Docs somewhat dated as of release 2.0.0.)

  • Basic support for transactional persistence.
  • Crash-proof persistence.
  • Transaction logs can be used to rebuild the database.
  • Batch updates and opportunistic locking. --New in release 1.3.0
  • Swift only periodically updates its datastore. On restart, the tail of the old log file is used to create a current version of the datastore (see the sketch below). The transaction rate is now limited primarily by the flushes to the log file. --New in release 1.3.2
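The recovery scheme described above can be sketched as follows; the names (Txn, append, recover) are hypothetical and the real Persistence component is considerably more involved. Every transaction is flushed to an append-only log, while the datastore itself is written only periodically; on restart, the tail of the log is replayed over the last snapshot to rebuild the current state.

```scala
import java.io._
import scala.io.Source

object LogAndSnapshot {
  case class Txn(key: String, value: String)

  // Append a transaction to the log; the flush is what limits the transaction rate.
  def append(log: File, txn: Txn): Unit = {
    val out = new FileWriter(log, true)
    try {
      out.write(txn.key + "=" + txn.value + "\n")
      out.flush()
    } finally out.close()
  }

  // Rebuild the current state from the last snapshot plus the log tail.
  def recover(snapshot: Map[String, String], log: File): Map[String, String] =
    if (!log.exists) snapshot
    else Source.fromFile(log).getLines().foldLeft(snapshot) { (state, line) =>
      val Array(k, v) = line.split("=", 2)
      state + (k -> v)
    }
}
```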

##Misc

##Contact Us!

Feel free to join the discussion group at https://groups.google.com/group/agilewikidevelopers. Or contact me directly--I can always use the encouragement!

Bill La Forge
Software Research Engineer
laforge49@gmail.com
Twitter: laforge49
Pearltree
