# Web Test Expectations and Baselines

The primary function of the web tests is as a regression test suite; this
means that, while we care about whether a page is being rendered correctly, we
care more about whether the page is being rendered the way we expect it to. In
other words, we look more for changes in behavior than we do for correctness.

[TOC]

All web tests have "expected results", or "baselines", which may take one of
several forms. The test may produce one or more of:

* A text file containing JavaScript log messages.
* A text rendering of the Render Tree.
* A screen capture of the rendered page as a PNG file.
* WAV files of the audio output, for WebAudio tests.

For any of these types of tests, baselines are checked into the web_tests
directory. The filename of a baseline is the same as that of the corresponding
test, but the extension is replaced with `-expected.{txt,png,wav}` (depending on
the type of test output). Baselines usually live alongside tests, except when
baselines vary by platform; read
[Web Test Baseline Fallback](web_test_baseline_fallback.md) for more
details.

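The naming rule can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not the code the test runner uses; `baseline_name` and `BASELINE_EXTENSIONS` are hypothetical names introduced for this example.

```python
from pathlib import PurePosixPath

# Output kinds named in the text above (hypothetical constant for this sketch).
BASELINE_EXTENSIONS = ("txt", "png", "wav")


def baseline_name(test_path: str, extension: str) -> str:
    """Return the baseline path that sits next to the given test."""
    if extension not in BASELINE_EXTENSIONS:
        raise ValueError(f"unsupported baseline type: {extension}")
    test = PurePosixPath(test_path)
    # Drop the test's own extension and append the -expected suffix.
    return str(test.with_name(f"{test.stem}-expected.{extension}"))


print(baseline_name("foo/bar/test.html", "txt"))
```

Platform-specific baselines use the same file name under a different directory, as described in the fallback document linked above.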
Lastly, we also support the concept of "reference tests", which check that two
pages are rendered identically (pixel-by-pixel). As long as the two pages'
outputs match, the test passes. For more on reference tests, see
[Writing ref tests](https://trac.webkit.org/wiki/Writing%20Reftests).

## Failing tests

When the output doesn't match, there are two potential reasons for it:

* The port is performing "correctly", but the output simply won't match the
  generic version. This usually happens with things like form controls,
  which are rendered differently on each platform.
* The port is performing "incorrectly" (i.e., the test is failing).

In both cases, the convention is to check in a new baseline (aka rebaseline),
even though that file may be codifying errors. This helps us maintain test
coverage for all the other things the test is testing while we resolve the bug.

*** promo
If a test can be rebaselined, it should always be rebaselined instead of adding
lines to TestExpectations.
***

Bugs at [crbug.com](https://crbug.com) should track fixing incorrect behavior,
not lines in
[TestExpectations](../../third_party/blink/web_tests/TestExpectations). If a
test is never supposed to pass (e.g. it's testing Windows-specific behavior, so
can't ever pass on Linux/Mac), move it to the
[NeverFixTests](../../third_party/blink/web_tests/NeverFixTests) file. That
gets it out of the way of the rest of the project.

There are some cases where you can't rebaseline and, unfortunately, we don't
have a better solution than either:

1. Reverting the patch that caused the failure, or
2. Adding a line to TestExpectations and fixing the bug later.

In this case, **reverting the patch is strongly preferred**.

These are the cases where you can't rebaseline:

* The test is a reference test.
* The test gives different output in release and debug; in this case, generate a
  baseline with the release build, and mark the debug build as expected to fail.
* The test is flaky, crashes, or times out.
* The test is for a feature that hasn't shipped on some platforms yet, but
  will shortly.

## Handling flaky tests

<!-- TODO(crbug.com/40262793): Describe the current flakiness dashboard and
LUCI test history. -->

Once you decide that a test is truly flaky, you can suppress it using the
TestExpectations file, as [described below](#updating-the-expectations-files).
We do not generally expect Chromium sheriffs to spend time trying to address
flakiness, though.

## How to rebaseline

Since baselines themselves are often platform-specific, updating baselines in
general requires fetching new test results after running the test on multiple
platforms.

### Rebaselining using try jobs

The recommended way to rebaseline for a currently-in-progress CL is to use
results from try jobs, using the command-line tool
`third_party/blink/tools/blink_tool.py rebaseline-cl`:

1. First, upload a CL.
2. Trigger try jobs by running `blink_tool.py rebaseline-cl`. This should
   trigger jobs on
   [tryserver.blink](https://ci.chromium.org/p/chromium/g/tryserver.blink/builders).
3. Wait for all try jobs to finish.
4. Run `blink_tool.py rebaseline-cl` again to fetch new baselines.
5. Commit the new baselines and upload a new patch.

This way, the new baselines can be reviewed along with the changes, which helps
the reviewer verify that the new baselines are correct. It also means that there
is no period of time when the web test results are ignored.

#### Handle bot timeouts

When a change will cause many tests to fail, the try jobs may exit early because
the number of failures exceeds the limit, or the try jobs may time out because
more time is needed for the retries. Rebaselining based on such results is not
recommended. The solution is to temporarily increase the number of shards in
[`test_suite_exceptions.pyl`](/testing/buildbot/test_suite_exceptions.pyl) in your CL.
Change the values back to their original values before sending the CL to CQ.

#### Options

The tests which `blink_tool.py rebaseline-cl` tries to download new baselines for
depend on its arguments.

* By default, it tries to download all baselines for tests that failed in the
  try jobs.
* If you pass `--only-changed-tests`, then only tests modified in the CL will be
  considered.
* You can also explicitly pass a list of test names, and then just those tests
  will be rebaselined.
* By default, it finds the try jobs by looking at the latest patchset. If you
  have finished try jobs that are associated with an earlier patchset and you
  want to use them instead of scheduling new try jobs, you can add the flag
  `--patchset=n` to specify the patchset. This is very useful when the CL has
  'trivial' patchsets that are created e.g. by editing the CL description.

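To make the combinations concrete, here is a small Python sketch that assembles the command line for each case. The helper `rebaseline_cl_command` is hypothetical and exists only for illustration; it uses only the flags listed above and assumes the tool is invoked from the root of the checkout.

```python
from typing import List, Optional, Sequence

TOOL = "third_party/blink/tools/blink_tool.py"


def rebaseline_cl_command(
    tests: Sequence[str] = (),
    only_changed_tests: bool = False,
    patchset: Optional[int] = None,
) -> List[str]:
    """Build an argv list for `blink_tool.py rebaseline-cl` (illustrative helper)."""
    argv = [TOOL, "rebaseline-cl"]
    if only_changed_tests:
        # Only consider tests modified in the CL.
        argv.append("--only-changed-tests")
    if patchset is not None:
        # Reuse finished try jobs from an earlier patchset.
        argv.append(f"--patchset={patchset}")
    # Explicit test names restrict rebaselining to just those tests.
    argv.extend(tests)
    return argv


print(" ".join(rebaseline_cl_command(["foo/bar/test.html"], patchset=3)))
```

The resulting list could be handed to `subprocess.run`; it is printed here so the sketch has no side effects.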
### Rebaseline script in results.html

The web test `results.html` linked from the bot job result page provides an
alternative way to rebaseline tests for a particular platform.

* In the bot job result page, find the web test results.html link and click it.
* Choose "Rebaseline script" from the dropdown list after "Test shown ... in format".
* Click "Copy report" (or manually copy part of the script for the tests you want
  to rebaseline).
* In a local console, change directory into `third_party/blink/web_tests/platform/<platform>`.
* Paste.
* Add files into git and commit.

The generated command includes `blink_tool.py optimize-baselines <tests>` which
removes redundant baselines.

### Local manual rebaselining

```bash
third_party/blink/tools/run_web_tests.py --reset-results foo/bar/test.html
```

If there are current baselines for `web_tests/foo/bar/test.html`,
the above command will overwrite them at their original
locations with the actual results. The current baseline means the `-expected.*`
file against which the actual result is compared when the test is run locally,
i.e. the first file found in the [baseline search path](https://cs.chromium.org/search/?q=port/base.py+baseline_search_path).

If there are no current baselines, the above command will create new baselines
in the platform-independent directory, e.g.
`web_tests/foo/bar/test-expected.{txt,png}`.

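The choice of which file gets written can be sketched as follows. This is a simplified model of the behavior described above, not the logic in `port/base.py`; the function name, the directory order, and the `existing_files` set (standing in for a filesystem check) are all assumptions made for the example.

```python
import posixpath
from typing import Iterable, Sequence


def reset_results_target(
    search_path: Sequence[str],
    generic_dir: str,
    baseline: str,
    existing_files: Iterable[str],
) -> str:
    """Return the path a reset would write for `baseline` (illustrative model)."""
    existing = set(existing_files)
    # Overwrite the first baseline found along the search path, ending with
    # the platform-independent directory.
    for directory in [*search_path, generic_dir]:
        candidate = posixpath.join(directory, baseline)
        if candidate in existing:
            return candidate
    # No current baseline: create one in the platform-independent directory.
    return posixpath.join(generic_dir, baseline)


# Hypothetical search order for a Windows run.
SEARCH_PATH = ["web_tests/platform/win11", "web_tests/platform/win"]
EXISTING = {
    "web_tests/platform/win/foo/bar/test-expected.txt",
    "web_tests/foo/bar/test-expected.txt",
}
print(reset_results_target(SEARCH_PATH, "web_tests", "foo/bar/test-expected.txt", EXISTING))
```

With no existing files, the same call returns `web_tests/foo/bar/test-expected.txt`, matching the second case above.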
When you rebaseline a test, make sure your commit description explains why the
test is being rebaselined.

### Rebaselining flag-specific expectations

See [Testing Runtime Flags](./web_tests.md#Testing-Runtime-Flags) for details
about flag-specific expectations.

The [Rebaseline Tool](#How-to-rebaseline) supports all flag-specific suites that
[run in CQ/CI](/third_party/blink/tools/blinkpy/common/config/builders.json).
You may also rebaseline flag-specific results locally with:

```bash
third_party/blink/tools/run_web_tests.py --flag-specific=config --reset-results foo/bar/test.html
```

New baselines will be created in the flag-specific baselines directory, e.g.
`web_tests/flag-specific/config/foo/bar/test-expected.{txt,png}`.

Then you can commit the new baselines and upload the patch for review.

Sometimes it's difficult for reviewers to review a patch containing only new
files. You can follow the steps below for easier review.

1. Copy existing baselines to the flag-specific baselines directory for the
   tests to be rebaselined:
   ```bash
   third_party/blink/tools/run_web_tests.py --flag-specific=config --copy-baselines foo/bar/test.html
   ```
   Then add the newly created baseline files, commit and upload the patch.
   Note that the above command won't copy baselines for passing tests.

2. Rebaseline the test locally:
   ```bash
   third_party/blink/tools/run_web_tests.py --flag-specific=config --reset-results foo/bar/test.html
   ```
   Commit the changes and upload the patch.

3. Request review of the CL and tell the reviewer to compare the patch sets that
   were uploaded in step 1 and step 2 to see what the rebaseline changed.

## Kinds of expectations files

* [TestExpectations](../../third_party/blink/web_tests/TestExpectations): The
  main test failure suppression file. In theory, this should be used for
  temporarily marking tests as flaky.
  See [the `run_wpt_tests.py` doc](run_web_platform_tests.md) for information
  about WPT coverage for Chrome.
* [ASANExpectations](../../third_party/blink/web_tests/ASANExpectations):
  Tests that fail under ASAN.
* [CfTTestExpectations](../../third_party/blink/web_tests/CfTTestExpectations):
  Tests that fail under Chrome for Testing.
* [LeakExpectations](../../third_party/blink/web_tests/LeakExpectations):
  Tests that have memory leaks under the leak checker.
* [MSANExpectations](../../third_party/blink/web_tests/MSANExpectations):
  Tests that fail under MSAN.
* [NeverFixTests](../../third_party/blink/web_tests/NeverFixTests): Tests
  that we never intend to fix (e.g. a test for Windows-specific behavior will
  never be fixed on Linux/Mac). Tests that will never pass on any platform
  should just be deleted, though.
* [SlowTests](../../third_party/blink/web_tests/SlowTests): Tests that take
  longer than the usual timeout to run. Slow tests are given 5x the usual
  timeout.
* [StaleTestExpectations](../../third_party/blink/web_tests/StaleTestExpectations):
  Platform-specific lines that have been in TestExpectations for many months.
  They're moved here to get them out of the way of people doing rebaselines
  since they're clearly not getting fixed anytime soon.
* [W3CImportExpectations](../../third_party/blink/web_tests/W3CImportExpectations):
  A record of which W3C tests should be imported or skipped.

### Flag-specific expectations files

It is possible to handle tests that only fail when run with a particular flag
being passed to `content_shell`. See
[web_tests/FlagExpectations/README.txt](../../third_party/blink/web_tests/FlagExpectations/README.txt)
for more.

## Updating the expectations files

### Ordering

The file is not ordered. If you put new changes somewhere in the middle of the
file, this will reduce the chance of merge conflicts when landing your patch.

### Syntax

*** promo
Please see [The Chromium Test List Format](http://bit.ly/chromium-test-list-format)
for a more complete and up-to-date description of the syntax.
***

The syntax of the file is roughly one expectation per line. An expectation can
apply to either a directory of tests, or a specific test. Lines prefixed with
`# ` are treated as comments, and blank lines are allowed as well.

The syntax of a line is roughly:

```
[ bugs ] [ "[" modifiers "]" ] test_name_or_directory [ "[" expectations "]" ]
```

* Tokens are separated by whitespace.
* **The brackets delimiting the modifiers and expectations from the bugs and the
  test_name_or_directory are not optional**; however, the modifiers component is
  optional. In other words, if you want to specify modifiers or expectations,
  you must enclose them in brackets.
* If test_name_or_directory is a directory, it should end with `/*`, and all
  tests under the directory will have the expectations, unless overridden by
  more specific expectation lines. **The wildcard is intentionally only allowed at the
  end of test_name_or_directory, so that it will be easy to reason about
  which test(s) a test expectation will apply to.**
* Lines are expected to have one or more bug identifiers, and the linter will
  complain about lines missing them. Bug identifiers are of the form
  `crbug.com/12345`, `code.google.com/p/v8/issues/detail?id=12345` or
  `Bug(username)`.
* If no modifiers are specified, the expectation applies to all of the
  configurations applicable to that file.
* If specified, modifiers can be one of `Fuchsia`, `Mac`, `Mac11`,
  `Mac11-arm64`, `Mac12`, `Mac12-arm64`, `Mac13`, `Mac13-arm64`, `Mac14`,
  `Mac14-arm64`, `Mac15`, `Mac15-arm64`, `Linux`, `Win`, `Win10.20h2`,
  `Win11`, `Win11-arm64`, `Android`, `Webview`, `iOS18-Simulator`, and,
  optionally, `Release` or `Debug`.
  Check the `# tags: ...` comments [at the top of each
  file](/third_party/blink/web_tests/TestExpectations#1) to see which modifiers
  that file supports.
* Some modifiers are meta keywords, e.g. `Win` represents `Win10.20h2` and `Win11`.
  See the `CONFIGURATION_SPECIFIER_MACROS` dictionary in
  [third_party/blink/tools/blinkpy/web_tests/port/base.py](../../third_party/blink/tools/blinkpy/web_tests/port/base.py)
  for the meta keywords and which modifiers they represent.
* Expectations can be one or more of `Crash`, `Failure`, `Pass`, `Slow`,
  `Skip`, or `Timeout`.
  Some results don't make sense for some files; check the `# results: ...`
  comment at the top of each file to see what results that file supports.
  If multiple expectations are listed, the test is considered "flaky" and any
  of those results will be considered as expected.

For example:

```
crbug.com/12345 [ Win Debug ] fast/html/keygen.html [ Crash ]
```

which indicates that the "fast/html/keygen.html" test file is expected to crash
when run in the Debug configuration on Windows, and the tracking bug for this
crash is bug \#12345 in the [Chromium issue tracker](https://crbug.com). Note
that the test will still be run, so that we can notice if it doesn't actually
crash.

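To show how a line breaks down into its four components, here is a rough Python sketch of a tokenizer for the simplified grammar given above. It is not the parser Chromium uses (see The Chromium Test List Format for the authoritative rules); `Expectation` and `parse_expectation_line` are hypothetical names, and the sketch does not validate modifier or result names.

```python
from typing import List, NamedTuple, Optional


class Expectation(NamedTuple):
    bugs: List[str]
    modifiers: List[str]
    test: str
    expectations: List[str]


def parse_expectation_line(line: str) -> Optional[Expectation]:
    """Split one line into bugs, modifiers, test name, and expectations."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None  # blank lines and comments carry no expectation
    groups: List[List[str]] = []  # bracketed token groups, in order
    plain = []                    # (groups seen so far, token) outside brackets
    current = None
    for token in stripped.split():
        if token == "[":
            if current is not None:
                raise ValueError("nested '['")
            current = []
        elif token == "]":
            if current is None:
                raise ValueError("unbalanced ']'")
            groups.append(current)
            current = None
        elif current is not None:
            current.append(token)
        else:
            plain.append((len(groups), token))
    if current is not None or not plain:
        raise ValueError("unterminated '[' or missing test name")
    # The test name is the last unbracketed token; everything before it is a bug.
    group_index, test = plain[-1]
    if any(position != 0 for position, _ in plain[:-1]):
        raise ValueError("bug identifiers must come first")
    if group_index > 1 or len(groups) > group_index + 1:
        raise ValueError("too many bracketed groups")
    modifiers = groups[0] if group_index == 1 else []
    expectations = groups[group_index] if len(groups) > group_index else []
    return Expectation([t for _, t in plain[:-1]], modifiers, test, expectations)


print(parse_expectation_line(
    "crbug.com/12345 [ Win Debug ] fast/html/keygen.html [ Crash ]"))
```

A bracketed group that comes before the test name is read as modifiers, and one that comes after it as expectations, which is why the brackets cannot be omitted.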
Assuming you're running a debug build on Mac 10.9, the following lines are
equivalent (in terms of whether the test is performed and its expected outcome):

```
fast/html/keygen.html [ Skip ]
Bug(darin) [ Mac10.9 Debug ] fast/html/keygen.html [ Skip ]
```

### Semantics

`Slow` causes the test runner to give the test 5x the usual time limit to run.
`Slow` lines go in the
[`SlowTests` file](../../third_party/blink/web_tests/SlowTests).
A given line cannot have both `Slow` and `Timeout`.

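As a small illustration of the `Slow` rule, the Python sketch below computes the time limit for a test and rejects the disallowed combination. The function name is hypothetical, and the base timeout passed in the example is an arbitrary value, not the runner's actual default.

```python
from typing import Iterable

SLOW_MULTIPLIER = 5  # Slow tests get 5x the usual time limit.


def effective_timeout_ms(base_timeout_ms: int, expectations: Iterable[str]) -> int:
    """Return the time limit for a test given its expectations (illustrative)."""
    results = set(expectations)
    if {"Slow", "Timeout"} <= results:
        raise ValueError("a line cannot have both Slow and Timeout")
    if "Slow" in results:
        return base_timeout_ms * SLOW_MULTIPLIER
    return base_timeout_ms


# 6000 ms is an example value chosen for this sketch.
print(effective_timeout_ms(6000, ["Slow"]))
```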
Also, when parsing the file, we use two rules to figure out if an expectation
line applies to the current run:

1. If the configuration parameters don't match the configuration of the current
   run, the expectation is ignored.
2. Expectations that match more of a test name are used before expectations that
   match less of a test name.

If a [virtual test] has no explicit expectations (following the rules above),
it inherits its expectations from the base (nonvirtual) test.

[virtual test]: /docs/testing/web_tests.md#Virtual-test-suites

For example, if you had the following lines in your file, and you were running a
debug build on `Mac10.10`:

```
crbug.com/12345 [ Mac10.10 ] fast/html/* [ Failure ]
crbug.com/12345 [ Mac10.10 ] fast/html/keygen.html [ Pass ]
crbug.com/12345 [ Win11 ] fast/forms/submit.html [ Failure ]
crbug.com/12345 fast/html/section-element.html [ Failure Crash ]
```

You would expect:

* `fast/html/article-element.html` to fail with a text diff (since it is in the
  fast/html directory).
* `fast/html/keygen.html` to pass (since the exact match on the test name takes
  precedence over the directory match).
* `fast/forms/submit.html` to pass (since the configuration parameters don't
  match).
* `fast/html/section-element.html` to either crash or produce a text (or image
  and text) failure, but not time out or pass.
* `virtual/foo/fast/html/article-element.html` to fail with a text diff. The
  virtual test inherits its expectation from the first line.

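The two rules and the virtual-test fallback can be modeled in a few lines of Python. This is a simplified sketch, not the real resolution code: all names are hypothetical, meta keywords such as `Mac` are not expanded, "matches more of a test name" is approximated by pattern length, and the directory line is written with the trailing `/*` that the Syntax section calls for.

```python
from typing import FrozenSet, List, NamedTuple, Optional


class Line(NamedTuple):
    modifiers: FrozenSet[str]
    pattern: str
    expectations: FrozenSet[str]


LINES = [
    Line(frozenset({"Mac10.10"}), "fast/html/*", frozenset({"Failure"})),
    Line(frozenset({"Mac10.10"}), "fast/html/keygen.html", frozenset({"Pass"})),
    Line(frozenset({"Win11"}), "fast/forms/submit.html", frozenset({"Failure"})),
    Line(frozenset(), "fast/html/section-element.html",
         frozenset({"Failure", "Crash"})),
]


def _matches(pattern: str, test: str) -> bool:
    if pattern.endswith("/*"):
        return test.startswith(pattern[:-1])  # keep the trailing slash
    return pattern == test


def _lookup(lines: List[Line], config: FrozenSet[str],
            test: str) -> Optional[FrozenSet[str]]:
    best = None
    for line in lines:
        # Rule 1: ignore lines whose modifiers don't match this run.
        if not line.modifiers <= config or not _matches(line.pattern, test):
            continue
        # Rule 2: the line matching more of the test name wins.
        if best is None or len(line.pattern) > len(best.pattern):
            best = line
    return best.expectations if best else None


def expected_results(lines: List[Line], config: FrozenSet[str],
                     test: str) -> FrozenSet[str]:
    found = _lookup(lines, config, test)
    parts = test.split("/", 2)
    if found is None and len(parts) == 3 and parts[0] == "virtual":
        # A virtual test with no explicit line inherits from its base test.
        found = _lookup(lines, config, parts[2])
    return found if found is not None else frozenset({"Pass"})


CONFIG = frozenset({"Mac10.10", "Debug"})
for name in ("fast/html/article-element.html", "fast/html/keygen.html",
             "fast/forms/submit.html", "fast/html/section-element.html",
             "virtual/foo/fast/html/article-element.html"):
    print(name, sorted(expected_results(LINES, CONFIG, name)))
```

Running the sketch reproduces the five outcomes listed above for a `Mac10.10` debug run.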
A test expectation can also apply to all tests under a directory (specified with
a name ending with `/*`). A more specific expectation can override a less
specific expectation. For example:

```
crbug.com/12345 virtual/composite-after-paint/* [ Skip ]
crbug.com/12345 virtual/composite-after-paint/compositing/backface-visibility/* [ Pass ]
crbug.com/12345 virtual/composite-after-paint/compositing/backface-visibility/test.html [ Failure ]
```

*** promo
Duplicate expectations are not allowed within the file and will generate
warnings.
***

You can verify that any changes you've made to an expectations file are correct
by running:

```bash
third_party/blink/tools/lint_test_expectations.py
```

which will cycle through all of the possible combinations of configurations
looking for problems.