Emtest
Cucumber
Recently a testing framework called Cucumber came to my attention. I have multiple reactions to it:
They somewhat adopted my approach of table-driven testing.
Hooray! They somewhat adopted my approach of table-driven testing. When I started using table-driven testing and made it available in Emtest, nobody was doing that. Back then, factory methods were the big thing.
I created it because I saw a dilemma. Often one is testing functionality that builds an output from a related input. Before, there were no good options to relate input and output. You could:
- Repeat yourself by writing both the inputs and the outputs that often contained the same values. It's a huge error opportunity, along with all the other vices of repeating yourself in source code.
- Write a test that constructed or deconstructed objects. Such tests are typically almost as complex as the function they test.
- Build the output examples from the input examples by name. But juggling dozens of very similar names this way clutters the namespace and is a huge PITA. This was the actual impetus for me to invent a better way to do it.
But they left important parts unadopted
But they didn't really adopt table testing in its full power. There are a number of things I have found important for table-driven testing that they apparently have not contemplated:
- N/A fields
- These are unprovided fields. A test detects them, usually skipping over rows that lack a relevant field. This is more useful than you might think. Often you are defining example inputs to a function that usually produces output (another field) but sometimes ought to raise an error. For those cases, you need to provide inputs but there is nothing sensible to put in the output field.
- Constructed fields
- Often you want to construct some fields in terms of other fields in the same row. The rationale above leads directly there.
- Constructed fields II
- And often you want to construct examples in terms of examples that are used in other tests. You know those examples are right because they are part of working tests. If they had some subtle stupid mistake in them, it'd have already shown up there. Reuse is nice here.
- Persistent fields
- This idea is not originally mine; it comes from an article on Gamasutra1. I did expand it a lot, though. The author looked for a way to test image generation (scenes), and what he did was, at some point, capture a "good" image from the same image generator. From that point on, he could automatically compare the output to a known good image.
- He knew for sure when it passed.
- When the comparison failed, he could diff the images and see where and how badly; it might be unnoticeable dithering or the generator might have omitted entire objects or shadows.
- He could improve the reference image as his generator got better.
I've found persistent fields indispensable. I use them for basically anything that's easier to inspect than it is to write examples of. For instance, about half of the Klink tests use them.
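To make the three ideas above concrete, here is a minimal sketch in Python rather than in Emtest's own Emacs Lisp. Nothing in it is Emtest's API: `NA`, `ROWS`, `parse_int`, `check_outputs`, `check_errors` and `check_persistent` are all hypothetical names invented for the illustration. It shows a row with an N/A output field, a field constructed from another field in the same row, and a persistent field stored on disk and compared on later runs.

```python
import json
from pathlib import Path

NA = object()  # sentinel: this row does not provide the field


def parse_int(text):
    """Stand-in for the function under test: parse a decimal integer."""
    return int(text)


# One row per example. The "garbage" row has no sensible output, because the
# call is expected to raise, so its output field is N/A.
ROWS = [
    {"name": "plain", "input": "42", "output": 42},
    {"name": "negative", "input": "-7", "output": -7},
    {"name": "garbage", "input": "4x2", "output": NA},
]

# Constructed field: built from another field in the same row, so no value
# is typed twice.
for row in ROWS:
    row["padded"] = "  " + row["input"] + "  "


def check_outputs(rows):
    """Check rows that provide an output; skip rows where it is N/A."""
    checked = []
    for row in rows:
        if row["output"] is NA:
            continue
        assert parse_int(row["input"]) == row["output"], row["name"]
        assert parse_int(row["padded"]) == row["output"], row["name"]
        checked.append(row["name"])
    return checked


def check_errors(rows):
    """Check the complementary rows: no output means an error is expected."""
    checked = []
    for row in rows:
        if row["output"] is not NA:
            continue
        try:
            parse_int(row["input"])
        except ValueError:
            checked.append(row["name"])
        else:
            raise AssertionError(row["name"] + " should have raised")
    return checked


def check_persistent(path, actual):
    """Persistent field: compare `actual` with a stored known-good value.

    Returns "recorded" when no stored value exists yet, otherwise
    "pass" or "fail".
    """
    path = Path(path)
    if not path.exists():
        path.write_text(json.dumps(actual))
        return "recorded"
    return "pass" if json.loads(path.read_text()) == actual else "fail"
```

In this sketch the first run simply records whatever it sees. In practice you would inspect that first value before trusting it, just as the reference image in the Gamasutra example was a known "good" one.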
They didn't even mention me
AFAICT neither Cucumber nor Gherkin credits me at all. Maybe they're honestly unaware of the lineage of the ideas they're using. Still, it gets tiresome not getting credit for stuff that AFAICT I invented and gave freely to everybody in the form of working code.
They don't use TESTRAL or anything like it.
TESTRAL is the format I defined for reporting tests. Without going into great detail, TESTRAL is better than anything else out there. Not just better than the brain-dead ad hoc formats, but better than TestXML.
BDD is nice
Still, I think they have some good ideas, especially regarding Behavior Driven Development. IMO that's much better than Test-Driven Development2.
In TDD, you're expected to test down to the fine-grained units. I've gone that route, and it's a chore. Yes, you get a nice regression suite, but pretty soon you just want to say "just let me write code!"
In contrast, where TDD is bottom-up, BDD is top-down. Your tests come from use-cases (which are structured the way I structure inline docstrings in tests, which is nice, and just how much did you Cucumber guys borrow?) BDD looks like a good paradigm for development.
Not satisfied with Emtest tables, I replaced them
But my "I was first" notwithstanding, I'm not satisfied with the way I made Emtest do tables. At the time, because nobody anywhere had experience with that sort of thing, I adopted the most flexible approach I could see. This was tag-based, an idea I borrowed from Carsten Dominik's org-mode3.
However, over the years the tag-based approach has proved too powerful.
- It takes a lot of clever code behind the scenes to make it work.
- Maintaining that code is a PITA. Really, it's been one of the most time-consuming parts of Emtest, and always had the longest todo list.
- In front of the scenes, there's too much power. That's not as good as it sounds, and led to complex specifications because too many tags needed management.
- Originally I had thought that a global tag approach would work best, because it would make the most stuff available. That was a dud, which I fixed years ago.
So, new tables for Emtest
So this afternoon I coded a better table package for Emtest. It's available on Savannah right now; rather, the new Emtest with it is available. It's much simpler to use:
- emt:tab:make
- Define a table, giving arguments:
- docstring
- A docstring for the entire table.
- headers
- A list of column names. For now they are simply symbols; later they may get default initialization forms and other help.
- rows
- The remaining arguments are rows. Each begins with a namestring.
- emt:tab:for-each-row
- Evaluate body once for each row, with the row bound to var-sym.
- emt:tab
- Given a table row and a field symbol, get the value of the respective field
I haven't added Constructed fields or Persistent fields yet. I will when I have to use them.
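As a rough analogue of the three operations listed above, here is a sketch in Python. It is not the Emacs Lisp package and it does not reproduce its signatures: `make_table`, `for_each_row`, `tab`, `SQUARES` and `check_row` are hypothetical names that only mirror the shape described (a docstring, a list of headers, then rows that each begin with a name).

```python
def make_table(docstring, headers, *rows):
    """Build a table. Each row is (name, value, ...) with one value per header."""
    table_rows = []
    for name, *values in rows:
        if len(values) != len(headers):
            raise ValueError(f"row {name!r}: expected {len(headers)} values")
        table_rows.append({"name": name, "fields": dict(zip(headers, values))})
    return {"doc": docstring, "headers": list(headers), "rows": table_rows}


def for_each_row(table, body):
    """Call body(row) once for each row and collect the results."""
    return [body(row) for row in table["rows"]]


def tab(row, field):
    """Given a table row and a field name, return that field's value."""
    return row["fields"][field]


SQUARES = make_table(
    "Inputs and their expected squares.",
    ["x", "squared"],
    ("zero", 0, 0),
    ("three", 3, 9),
    ("negative", -4, 16),
)


def check_row(row):
    """One test body, run against every row of the table."""
    assert tab(row, "x") ** 2 == tab(row, "squared"), row["name"]
    return row["name"]


names = for_each_row(SQUARES, check_row)
```

The point of the shape is that input and expected output sit side by side in one row, and the test body is written once.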
Also added foreign-tester support
Emtest also now supports foreign testers. That is, it can communicate with an external process running a tester, and then report that tester's results and do all the bells and whistles (persistence, organizing results, expanding and collapsing them, point-and-shoot launching of tests, etc.). So the external tester can be not much more than "find test, run test, build TESTRAL result".
It communicates in Rivest-style canonical s-expressions, which is as simple a structured format as anything ever. It's as expressive as XML, and interconverters exist.
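For readers who haven't met the format: in canonical form every atom is written as its byte length, a colon, and then the raw bytes, and lists are wrapped in parentheses with no whitespace in between. Here is a minimal sketch of an encoder and decoder in Python. The function names are mine, and the sample message in the usage note below is made up for illustration; it is not TESTRAL's actual schema or Emtest's wire protocol.

```python
def encode(node):
    """Encode nested lists of str/bytes atoms as a canonical s-expression."""
    if isinstance(node, str):
        node = node.encode("utf-8")
    if isinstance(node, bytes):
        # Atom: <decimal length>:<raw bytes>
        return str(len(node)).encode("ascii") + b":" + node
    # List: children concatenated between parentheses, no separators
    return b"(" + b"".join(encode(child) for child in node) + b")"


def decode(data):
    """Decode one canonical s-expression; atoms come back as bytes."""
    node, pos = _decode_at(data, 0)
    if pos != len(data):
        raise ValueError("trailing data after expression")
    return node


def _decode_at(data, pos):
    """Decode the expression starting at pos; return (node, next position)."""
    if data[pos:pos + 1] == b"(":
        pos += 1
        items = []
        while data[pos:pos + 1] != b")":
            if pos >= len(data):
                raise ValueError("unterminated list")
            item, pos = _decode_at(data, pos)
            items.append(item)
        return items, pos + 1
    colon = data.index(b":", pos)  # raises ValueError if there is no colon
    length = int(data[pos:colon])
    start = colon + 1
    end = start + length
    if end > len(data):
        raise ValueError("truncated atom")
    return data[start:end], end
```

With this sketch, `encode(["result", ["name", "foo"], "pass"])` gives `b"(6:result(4:name3:foo)4:pass)"`, and `decode` turns that back into the same nesting with bytes atoms. Because lengths are explicit, atoms can hold arbitrary bytes without any escaping.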
I did this with the idea of using it for the Functional Reactive Programming stuff I was talking about before, if in fact I make a test implementation for it (Not sure).
And renamed to tame the chaos
At one time I had written Emtest so that the function and command prefixes were all modular. Originally they were written-out, like emtest/explorer/fileset/launch. That was huge and unwieldy, so I shortened their prefixes to module-unique abbreviations like emtl: But when I looked at it again now, that was chaos! So now
- Everything the user would normally use is prefixed emtest
  - Main entry point: emtest
  - Code-editing entry point: emtest:insert
  - "Panic" reset command: emtest:reset
  - etc
- Everything else is prefixed emt: followed by a 2 or 3 letter abbreviation of its module.
I haven't done this to the define and testhelp modules, though, since the old names are probably still in use somewhere.
Footnotes:
1 See, when I borrow ideas, I credit the people they came from, even if I have improved on them. Can't find the article but I did look; it was somewhat over 5 years ago, one of the first big articles on testing there.
2 Kent Beck's. Again, crediting the originator.
3 Again credit where it's due. He didn't invent tags, of course, and I don't know who was upstream from him wrt that.