Beginning Factor – Shufflers & Combinators
In the previous installment of Beginning Factor, we discussed some of the attributes of stack-based languages and the syntax for defining words in Factor. This time around, I’d like to introduce stack shufflers, quotations & combinators, and touch on some more basic data types and their properties.
Up until now, we’ve essentially been using Factor as an overqualified RPN calculator. I just wanted to make sure that you don’t underestimate Factor because of these particular examples; Factor is an extremely capable and modern language that can be used for everything from web applications, to game development, to complex text parsing, and so on. I’m purposefully using over-simplified examples as a means to demonstrate specific points about the language. Stick with me, and I assure you the examples will gradually get more expressive.
Stack Shufflers
Because of Factor’s stack-based nature, you sometimes need the ability to rearrange or copy items in the stack to ensure that they are in the correct position for future use. The way to go about this is a group of words called “stack shufflers”.
Much like the name implies, stack shufflers are merely words that change the order (or number) of the items on the top of the stack. It is said that shuffle words “control the flow of data between words that perform actions.” In fact, you already know one stack shuffler… drop
There are three basic varieties of stack shufflers. Here they are along with the most commonly used word for each type:
-
Removing Shufflers
drop
( x -- )- removes the top item from the stack and discards it
-
Duplicating Shufflers
dup
( x -- x x )- duplicates the top item on the stack
-
Permuting Shufflers
swap
( x y -- y x )- exchanges the positions of the first and second items on the top of the stack
…and here are a couple of word definitions that use these shufflers appropriately:
: sq ( x -- y )
dup * ;
: neg ( x -- -x )
0 swap - ;
(scratchpad) 5 sq .
25
(scratchpad) 1 neg .
-1
Caveats
There are many more shuffle words that support intricate rearrangements and the duplication/removal of multiple items from differing locations within the stack. The problem is, code that is full of stack shufflers can easily become confusing. In general, you should try to minimize the use of stack shufflers to keep things understandable.
Of course there are exceptions to that rule, times when shufflers make the most sense, but the preferred alternative to complicated stack shufflers is the use of a “combinator” that fits your use case. That said, combinators will take a bit of explaining…
Quotations
Before we can get to the idea of combinators, we first have to discuss quotations:
(scratchpad) 5 [ "hello" print ] times
hello
hello
hello
hello
hello
In this example, [ "hello" print ]
is a quotation.
In layman’s terms, a quotation is a way to encapsulate a snippet of code so it doesn’t get called right away, but can be passed around on the stack and called later. The computer science term for this is an “anonymous function”. It’s referred to as “anonymous” because it is fundamentally a word that has no name.
If we didn’t have quotations, that same example would get ugly real fast:
(scratchpad) "hello" dup dup dup dup print print print print print
hello
hello
hello
hello
hello
Imagine if you wanted to print “hello” 1000 times!
To create a quotation, you just surround your code snippet with square brackets, and it will be pushed onto the stack: [ ... ]
. In order to do anything useful with that quotation once it’s on the stack, what you need is a combinator.
What is a Combinator?
A combinator is just a fancy name for a word that takes a quotation as one of its inputs. In the example above, times
is a combinator which takes an integer (n) and a quotation from the stack, and it calls that quotation n times.
Factor uses quotations & combinators extensively for conditionals, sequence traversal, namespaces, closures, and more…but that’s jumping the gun a bit. Before we dive into all of that, I’d like to get back to the idea of minimizing the use of stack shufflers by replacing them with appropriate combinators instead.
Combinators That Express Intent
While our code examples thus far have been easy to follow, when you start tackling more realistic problems, stack shufflers will begin to obfuscate your code. If we can represent our intentions consistently with a combinator instead, then code becomes cleaner and you can consequently focus more on your problem domain and less on mentally organizing your stack.
To illustrate this point, I’d like to introduce a couple of simple combinators:
dip
( x quot -- x )- calls a quotation while temporarily hiding the top item on the stack
keep
( x quot -- x )- calls a quotation with an item on the stack, restoring that item after the quotation returns
While both of these combinators have the same stack effect declaration, their usage is a bit different:
(scratchpad) 1 2 4 [ + ] dip
--- Data stack:
3
4
(scratchpad) clear
(scratchpad) 1 2 4 [ + ] keep
--- Data stack:
1
6
4
These two combinators alone can greatly reduce the number of stack shufflers your code will need. If you’re curious about how these combinators work, they both secretly take advantage of an auxiliary stack (called the “retain stack”) to temporarily store items while the supplied quotation is being executed. There are a few other preserving combinators that are worth exploring as well.
Cleave, Spread, and Apply
The cleave, spread, and apply combinators are your best weapons when trying to reduce the use of stack shufflers while simultaneously expressing intent. The key point being that they should express intent…if you find yourself writing code where these combinators don’t fit logically, then try another option.
-
Cleave Combinators
These are used when you want to apply multiple quotations to the same set of items on the top of the stack.
Let’s say that you want to find the average of a bunch of numbers in an array. The steps are straightforward, you take the sum of all the numbers and divide that sum by how many numbers you have (the length of the array):
(scratchpad) { 1 2 3 } dup sum swap length / . 2
We can eliminate the need for those stack shufflers and better express our intent by using a cleave combinator to achieve the same thing:
bi
( x p q -- )- applies quotation p to x, then applies quotation q to x
(scratchpad) { 1 2 3 } [ sum ] [ length ] bi / . 2
The different cleave combinators change either the number of quotations applied to your items (
bi
vs.tri
), or the number of items used as input for your quotations (bi
vs.2bi
). -
Spread Combinators
These are used when you want to apply a different quotation to different items on the top of the stack. The spread combinators are closely related to
dip
, but provide a bit more flexibility while also expressing intent.Let’s say that you have two coordinate positions in the form of
{ x y }
, and you’d like to extract the x-coordinate from the first position, and the y-coordinate from the second position to form a new position with those values:(scratchpad) { 1 2 } { 3 4 } swap first swap second 2array . { 1 4 }
We can eliminate the need for those stack shufflers and better express our intent by using a spread combinator to achieve the same thing:
bi*
( x y p q -- )- applies quotation p to x, then applies quotation q to y
(scratchpad) { 1 2 } { 3 4 } [ first ] [ second ] bi* 2array . { 1 4 }
When you want to do the same thing with more than two quotations/items, then using spread combinators eliminates the need for nested dips or shufflers and the added clarity becomes much more evident.
The different spread combinators change the number of quotations applied to the corresponding number of items on the stack (
bi*
vs.tri*
). -
Apply Combinators
These are used when you want to apply a single quotation to multiple items on the top of the stack.
Let’s say that you have two strings, each containing a name, and you want to see if those names are the same. In order to ignore case when doing the comparison, you decide to convert both strings to upper case before checking for equality:
(scratchpad) "john" "John" swap >upper swap >upper = . t
We can eliminate the need for those stack shufflers and better express our intent by using an apply combinator to achieve the same thing:
bi@
( x y quot -- )- applies the quotation to x, then to y
(scratchpad) "john" "John" [ >upper ] bi@ = . t
The different apply combinators change the number of items on the stack your quotation is applied to (
bi@
vs.tri@
).
The cleave, spread, and apply combinators are all closely related; if you’re having trouble keeping them apart, try to memorize the naming convention:
- If there is no suffix, it is a “cleave”
- If the suffix is *, it is a “spread”
- If the suffix is @, it is an “apply”
Once you learn these combinators, you should be able to express almost any pattern of complicated stack shufflers. Note that there are also generic forms for all of these combinators that can take additional inputs from the stack. If you find that you resort to using the generic forms more often then not, that’s usually a good indication that you should rethink your approach or put your data into a more appropriate structure.
Data Type Details
Before I turn you loose, I wanted to offer a few extra details about some of Factor’s basic data types…
Sequences
In the cleave and spread examples above, I was sneaky and used sequences without explaining them formally. A sequence is a finite, ordered, collection of elements. Any data type that implements the sequence mixin class (meaning a data type that knows its length and will let you set/get an element at a specific index) gains the ability to use the powerful built-in sequence operators. Read through that documentation to get an idea on how to manipulate sequences and their elements.
Factor has many sequence types that you may already be familiar with, such as arrays (fixed-size mutable sequences) and vectors (resizable mutable sequences), but there are also other data types that you might not expect to be sequences, such as strings. In Factor, a string is merely an array of Unicode 5.0 code points.
Using the sequence operators and combinators together, you can create all sorts of powerful abstractions that I’ll talk more about next time. Here are a couple of examples to whet your appetite:
(scratchpad) { 1 2 3 4 5 } [ even? ] filter .
{ 2 4 }
(scratchpad) { 1 2 3 } [ . ] each
1
2
3
(scratchpad) "Hello" [ alpha? ] all? .
t
(scratchpad) "Hello!!!" [ alpha? ] all? .
f
Keep in mind that in Factor, sequences are zero-based.
Numbers
So far, we have only used integers, but Factor also supports rational numbers (fractions), floats (decimal approximations of a number), and complex numbers (imaginary numbers).
(scratchpad) 100 330 / .
10/33
(scratchpad) 5/4 1/2 + .
1+3/4
(scratchpad) 5/4 0.5 + .
1.75
(scratchpad) 2 5 rect> .
C{ 2 5 }
Iota
One last thing that is going to come in handy, is a word for creating sequences from integers:
iota
( n -- iota )- creates a sequence of length n, whose elements are its non-negative predecessors (i.e. 0 to n-1 inclusive)
This word can be very helpful when performing counted loops or other control flow statements. For example, the following two lines are equivalent:
{ 0 1 2 3 4 5 6 7 8 9 } [ . ] each
10 iota [ . ] each
Next Time
Next time we’ll stick with a theme of “flow” and discuss control flow for your words and after that, the typical work flow when developing code with Factor. Hope you enjoyed this installment!