The Test Development FAQ is addressed to those who develop tests or organize testing efforts. It should also be useful to those who develop specifications or who run tests.

About this document

The FAQ provides introductory information about the purpose of testing, how to get started, and what the testing process involves. This FAQ primarily documents what is already considered good testing practice or the norm, but it also includes a number of advanced testing goals that have not yet been fully achieved by any Working Group.

This is a living document that is updated periodically, particularly in response to feedback from readers. You can provide feedback by emailing www-qa@w3.org (a publicly archived mailing list).

Test Development FAQ

Last edited $Date: 2010/01/29 13:55:50 $

  1. Are test suites required by the W3C Process?
  2. Why is testing important?
  3. When should test development start?
  4. Who will develop the tests?
  5. Can we re-use tests developed by another Working Group?
  6. How do we decide what tests to develop?
  7. What should we do with test contributions we receive?
  8. What makes a good test?
  9. How many tests are enough?
  10. How should tests report their outcome?
  11. Do I really have to worry about all that legal stuff?
  12. How should I package and publish my tests?
  13. What about documentation?
  14. Should I automate test execution?
  15. Once I publish my tests, I'm done, right?
  16. How should I handle bugs in my test suite?
  17. Should test results be published?
  18. Should we implement a branding or certification program?

1. Are Test Suites required by the W3C Process?

As part of the transition from Candidate Recommendation to Proposed Recommendation, the W3C Process Document requires the Working Group to demonstrate that:

[...] each feature of the technical report has been implemented. Preferably, the Working Group SHOULD be able to demonstrate two interoperable implementations of each feature.

In most cases, the most practical way to demonstrate both that all the features have been implemented and that they are implemented in an interoperable fashion is to show that there are test cases covering most of the features of the specification, and that each of these test cases is passed by at least two implementations.

So, while the Process Document leaves some leeway (which is useful since not all specifications can make use of a test suite), if a Working Group is developing a technology that can be tested in a sensible fashion, the W3C Director is likely to require a test suite before allowing the specification to move to Proposed Recommendation.

2. Why is testing important?

As the About W3C document explains:

In order for the Web to reach its full potential, the most fundamental Web technologies must be compatible with one another and allow any hardware and software used to access the Web to work together. W3C refers to this goal as “Web interoperability.” By publishing open (non-proprietary) standards for Web languages and protocols, W3C seeks to avoid market fragmentation and thus Web fragmentation.

Two implementations of a technology are said to be compatible if they both conform to the same specifications. Conformance to specifications is a necessary condition for interoperability, but it is not sufficient; the specifications must also promote interoperability (by clearly defining behaviors and protocols, for example).

To promote these goals, the W3C Process Document's Proposed Recommendation entrance criteria include the requirement to demonstrate two interoperable implementations of each feature in the specification (see how this relates to testing).

Two types of testing are particularly helpful:

Note that both forms of testing help to detect defects (ambiguities, lack of clarity, omissions, contradictions) in specifications and are therefore useful when conducted while the specification is being developed.

Because testing is the key to interoperability, Working Groups are increasingly interested in this subject.

This FAQ focuses primarily on conformance testing (a key to interoperability) although some of its recommendations are also applicable to other kinds of testing. (See the Software QA and Testing FAQ for much useful information, including a comprehensive classification of different types of testing.)

3. When should test development start?

Test planning should start very early; ideally at the same time as you start working on the specification. Defining a testing approach (what kinds of tests to develop and how they should operate) and thinking about testability are helpful even in the early stages of specification development.

During the planning phase, identify all the specifications to be tested. This may seem obvious, but specifications often refer to or depend on other specifications. It is also important to understand and to limit the scope of what is to be tested: focus on what really needs testing, not on related or dependent technologies that implementations use only indirectly.

Typically, Working Groups develop their test suites when the specifications have reached a reasonable level of stability. However, it is important to start the test development process before the specification is frozen since this helps to identify problems (ambiguities, lack of clarity, omissions, contradictions) while there is still time to correct them. 

Another interesting approach, often referred to as Test Driven Development, is to develop tests specifically to explore issues and problems in the specification. (The OWL Working Group found this approach helpful.) Note that this implies significantly more work, as you will need to keep the specification and the tests synchronized.

4. Who will develop the tests?

Most likely, it will be the members of your Working Group who contribute resources for test development. However, it is also worthwhile to approach third parties and ask if they are interested in developing tests. (For example, organizations that do not participate directly in your activities may want to contribute to your testing efforts if they have an interest in the effective deployment of the tested technology.)

Whichever approach you take, you will need to solicit and to manage contributions from others. This can require a considerable amount of organization and effort, particularly if you want to provide high-quality tests covering the full range of the specification. So, do take the time to create an informative and persuasive appeal for contributions.

Specify the format for developing the tests (including how tests are invoked and how they report their outcome) and any metadata to be supplied with the tests (including a description of the purpose of the test, a pointer to the portion of the specification that is tested, and the test's expected results).

For examples of guidelines, see

See also Test Suite Principles in the HTML4 Test Suite Documentation, which you may find instructive and useful.

Likewise, the Method for Writing Testable Conformance Requirements can be a useful approach to integrate testability within the specification itself.

Providing guidelines like these to your test developers will make it more likely that you will receive quality submissions. Obviously, the clearer your guidelines, the easier it will be for people to develop tests, and the greater the likelihood of tests being developed correctly and effectively.
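As one illustration of how the metadata mentioned above (purpose, pointer to the specification, expected result) might be checked when a test is submitted, here is a minimal sketch. The field names and the `missing_fields` helper are assumptions made for this example and are not part of any W3C format.

```python
# Hypothetical required metadata fields; a Working Group would define its own.
REQUIRED_FIELDS = ("test_id", "purpose", "spec_reference", "expected_result")


def missing_fields(record):
    """Return the required field names that are absent or blank in `record`."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# A hypothetical, complete submission.
complete = {
    "test_id": "test-001",
    "purpose": "Checks that an empty list is accepted.",
    "spec_reference": "section 3.1",
    "expected_result": "pass",
}

# A hypothetical submission with one blank field and one missing field.
incomplete = {
    "test_id": "test-002",
    "purpose": "Checks error handling.",
    "spec_reference": "   ",
}

print(missing_fields(complete))
print(missing_fields(incomplete))
```

Rejecting incomplete submissions automatically, with a message naming the missing fields, is one way to make the guidelines easy for contributors to follow.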

5. Can we re-use tests developed by another Working Group?

As the family of XML languages evolves, there is an increasing tendency to develop modular specifications (specifications that are intended to be re-used in a variety of technologies). For example, XSLT and XForms both use XPath as their expression language. This trend presents the opportunity (and also the challenge) of a more modular approach to test development. If your specification incorporates such a language module, you may be able to incorporate into your own test suite tests that were developed by the Working Group that defined that module.

Also, do consider this trend and plan for it if you are developing a specification that you already know will be re-used. The guidelines and practices outlined in this FAQ are likely to prove even more important when tests being developed are intended for incorporation into more than one test suite.

For a brief discussion of some of the issues involved in test re-use, see this presentation from the W3C's 2005 Technical Plenary.

6. How do we decide what tests to develop?

It is best to focus development efforts where they will be most effective and useful. Namely, where:

Do be proactive and guide test developers to give priority to testing the areas of the specification where coverage is most needed. Note that this implies the creation and maintenance of some kind of coverage map (more on this topic under Question 8). Also, proactive guidance will help to avoid duplication of effort.
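A coverage map can be as simple as a count of tests per section of the specification. The sketch below assumes that each test declares which sections it exercises; the section numbers, test identifiers, and function names are hypothetical.

```python
# Hypothetical section identifiers from the specification under test.
spec_sections = ["3.1", "3.2", "4.1", "4.2"]

# Hypothetical mapping from each test to the sections it exercises.
test_targets = {
    "test-001": ["3.1"],
    "test-002": ["3.1", "4.1"],
    "test-003": ["4.1"],
}


def coverage_map(sections, targets):
    """Count how many tests exercise each section of the specification."""
    counts = {section: 0 for section in sections}
    for tested_sections in targets.values():
        for section in tested_sections:
            if section in counts:  # ignore references to unknown sections
                counts[section] += 1
    return counts


def uncovered(sections, targets):
    """List the sections that no test exercises, in specification order."""
    counts = coverage_map(sections, targets)
    return [section for section in sections if counts[section] == 0]


print(coverage_map(spec_sections, test_targets))
print(uncovered(spec_sections, test_targets))
```

In this sample, sections 3.2 and 4.2 have no tests, so they are natural candidates for the next call for contributions.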

If you do not guide test developers, you may receive tests for the areas of the specification that are most easily tested, but where the value of such tests is minimal (perhaps because implementers are more likely to test these areas themselves and to find and correct any problems).

Test development is, of course, an iterative process. As the CSS Test Suite Principles point out, "[...] experience with existing implementations is a great help. As implementations progress, new areas worthy of being tested will come to light, and the test suite should be updated regularly to track these developments."

7. What should we do with test contributions we receive?

The more successful you are in soliciting contributions, the more important it is to create a process for managing them. All submissions should be reviewed to ensure that they are appropriate, correct, and of satisfactory quality. Keep track of who submitted each test and of the state that each test is in (for example, submitted, accepted, reviewed, returned for revision, or rejected).
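The bookkeeping described above can be sketched as a small tracker. The state names come from the list above; the `ContributionTracker` class and its methods are hypothetical and stand in for whatever issue tracker or database a Working Group actually uses.

```python
from collections import Counter
from enum import Enum


class State(Enum):
    SUBMITTED = "submitted"
    ACCEPTED = "accepted"
    REVIEWED = "reviewed"
    RETURNED = "returned for revision"
    REJECTED = "rejected"


class ContributionTracker:
    """Record who submitted each test and the states it has passed through."""

    def __init__(self):
        self._tests = {}

    def submit(self, test_id, submitter):
        if test_id in self._tests:
            raise ValueError(f"{test_id} has already been submitted")
        self._tests[test_id] = {"submitter": submitter, "history": [State.SUBMITTED]}

    def update(self, test_id, state):
        # State(...) accepts either a State member or its string value.
        self._tests[test_id]["history"].append(State(state))

    def state(self, test_id):
        return self._tests[test_id]["history"][-1]

    def submitter(self, test_id):
        return self._tests[test_id]["submitter"]

    def summary(self):
        """Count the tests currently in each state."""
        return dict(Counter(self.state(t).value for t in self._tests))


tracker = ContributionTracker()
tracker.submit("test-001", "alice")
tracker.submit("test-002", "bob")
tracker.update("test-001", State.REVIEWED)
tracker.update("test-002", "returned for revision")
print(tracker.summary())
```

Keeping the full history, rather than only the current state, makes it possible to see later how long reviews take and which tests were revised before acceptance.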

For examples of test review guidelines see: