What could be wrong?

View test code on arewecompatibleyet.com

While the site arewecompatibleyet.com ships with test results from my web testing, it has so far been somewhat of a mystery how those tests actually work.

Now, even casual visitors who do not want to search through big files of code on GitHub can get a sense of how the tests work. If you scroll down to the bottom of arewecompatibleyet.com’s front page, you will find a section listing all tests that output something which doesn’t match the state of the bug. If a bug is closed as fixed and the test fails, or the bug is open but the test passes, it’s listed.

There’s a tiny new feature: a link labelled “View test code” for each entry in this table. It brings up a small box where you can not only view the test code, but also copy a chunk of JavaScript that can run from the console.

For example, there’s an apparent regression on thestar.com, regarding bug 842184, no mobile content on m.thestar.com for Firefox on Android. Here’s the new feature at work, showing the test code (click to enlarge):

screenshot of AWCY site showing test code for bug 842184, regarding thestar.com site

Clicking in the grey area will select it for copying. The code you copy includes some “scaffolding” to make the test actually run - the code that is written for bug 842184 is the part inside the steps array:

(function(){
    var i = 1, steps = [
        function (){return mobileLinkOrScriptUrl() /*(regression test, expected to pass)*/ && hasViewportMeta() /*(regression test, expected to pass)*/}
    ];

    function doStep(){
        // load the shared helper functions (stdTests.js) if they aren't available yet, then re-run
        if(typeof hasViewportMeta !== "function"){ var s = document.body.appendChild(document.createElement('script')); s.src = 'http://hallvord.com/temp/moz/stdTests.js'; s.onload = doStep; return; }
        // run the next step, log its result, and repeat until the steps array is empty
        if(steps.length){ var result = steps[0](); console.log('test step ' + i + ' says: ' + result); if(result !== 'delay-and-retry') steps.shift(); i++; setTimeout(doStep, 300); }
    }
    doStep();
})()

The instructions say you should switch the User-Agent string (either with a suitable extension or through about:config) to spoof Firefox OS (that’s actually a data bug, since we’re dealing with a bug for Firefox on Android here) and load m.thestar.com. Clicking the site link opens it in a new window.

When the site loads, we open developer tools, go to the console and paste the testing code:

screenshot of console after pasting the test code - we see the output is false

As we see, the code returns false (aka ‘not good’). Now we can play around with the test code - for example by removing the code calling mobileLinkOrScriptUrl() which I highlighted in the screenshot above. And voila:

screenshot of console after pasting the edited test code - we see the output is true

So it looks like the test should no longer look for the ‘mobile’ keyword in file names. Unfortunately, the code we end up with also returns “true” against the desktop version of the site. To fix this test we should dig a bit deeper and look for some difference in the DOM. It’s sometimes hard to find stable features that differ - it can be an interesting challenge. Have fun if you try to help!
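If you do, and the mobile version turns out to have some element the desktop version lacks, a more robust step might look something like this - a hypothetical sketch only, since the selector here is made up and would have to be found by actually comparing the two versions of the page:

function (){
    // hypothetical: in addition to the viewport META, require a DOM feature
    // that only the mobile markup has (the selector is an assumption)
    return hasViewportMeta() && document.querySelector('.mobile-only-nav') !== null;
}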

Testing live websites at scale

2014-06-12 / Mozilla, Site compat, Testing

When I started working at Mozilla, one of the things I wanted to attempt was live testing against real sites - at scale. One year on, I have a sort-of-good-enough framework, over 800 tests, thousands of test results from monthlyish test runs, and a much better infrastructure around the corner thanks to the work of volunteer Seif Lotfy. It’s probably a good time to document how I do my monthlyish test cycle.

I start by loading arewecompatibleyet.com, nicknamed AWCY. This site tracks Tech Evangelism bugs relating to specific websites, and associated test results.

The site has a secret feature I run from the console: a method to list all the Tech Evangelism bugs that it does not have test results for. So my first step is to open the console and type

listMissingTestBugs()

Screenshot of the Firefox console showing the listMissingTestBugs method and its output

I copy the tab-separated output data to a text editor and save it as sites.txt in a brand new folder named missing-2014-07-12.

(Well, actually - that’s not quite true. First I paste them into the first textarea on sitedata-explorer.htm to remove any entries we already have tests for - sometimes a test exists but for some reason failed to output any test results during the last test run. This means AWCY forgets that the test exists, and we risk creating a duplicate test. The output from this script goes into sites.txt. I could make this step superfluous by changing the next script to check for existing tests..)

Now I’m going to run a script that plays through the URLs in a real Firefox instance, using the Mozilla Marionette automation framework. This script will attempt to generate automatic tests, and take screenshots which I can review to verify the generated tests. (In the not-too-distant future this script will retire and be replaced by Compatipede, which extracts even more data points, enabling more varied and hopefully more robust tests.)

First I run the Firefox instance I want to control, making sure I pass the -marionette argument for enabling Marionette and the -no-remote argument so it doesn’t just hook up with my existing Firefox instance:

"c:\Program Files (x86)\Nightly\firefox.exe" -marionette -no-remote -p tester2

Having pointed testsites.py to the right directory by editing the dirname variable, I can simply do

python testsites.py

and it should run through all the data in sites.txt.

Screenshot of the Marionette script running, with Firefox being controlled by the script

It outputs some statistics from the tests while running. The Mozilla Marionette framework (or Gecko itself) tends to hang at certain points during this process - maybe it doesn’t always fire the right events when a site stops loading? Hence the run requires a bit of babysitting - if it hangs, I have to resume from a specific index in the list:

python testsites.py -s 12

However, here there is a small wart in the process. The testsites.py script will generate several files: screenshots, comparison screenshots showing the site with two different user-agent strings, and sitedata-automated.js full of suggested automated tests. This file is a JavaScript file with function definitions and thus not valid JSON, and testsites.py can’t currently read it. To avoid overwriting the existing data when we resume running testsites.py, I have to review those tests or stash them somewhere before resuming.

Reviewing the tests is a crucial step - I verify that the user-agent and test code look sensible. For example, the script suggests this test:

"1003466": {
    "url": "http://www.sismologia.cl", 
    "steps": [
        function(){return hasViewportMeta() && location.hostname === "200.9.100.120" && mobileLinkOrScriptUrl();}
    ], 
    "ua": "FirefoxOS", 
    "title": "sismologia.cl sends desktop site to Firefox OS"
}

This looks strange - why would a mobile browser be taken to an IP address? This might be a temporary solution while the mobile site gets a proper hostname. I might decide to remove the hostname test and verify that the mobile site has a META viewport and the desktop site doesn’t.. Or when the script proposes this test:

"1019204": {
    "url": "http://match.com", 
    "steps": [
        function(){return hasViewportMeta() && location.hostname === "touch.uk.match.com" && hasHandheldFriendlyMeta();}
    ], 
    "ua": "FirefoxOS", 
    "title": "match.com sends simplified site to Firefox OS"
}

it will probably not run very well on the distributed, international infrastructure - not all test machines will be in a location from which match.com redirects to its UK site. This is probably a better version:

"1019204": {
    "url": "http://match.com", 
    "steps": [
        function(){return hasViewportMeta() && location.hostname.indexOf("touch.") === 0 && hasHandheldFriendlyMeta();}
    ], 
    "ua": "FirefoxOS", 
    "title": "match.com sends simplified site to Firefox OS"
}
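The sismologia.cl suggestion above could get similar treatment - dropping the brittle hostname check and keeping just the viewport check (verifying that the desktop site lacks the META viewport would have to happen separately). A hedged sketch of what that might end up as:

"1003466": {
    "url": "http://www.sismologia.cl",
    "steps": [
        function(){return hasViewportMeta();}
    ],
    "ua": "FirefoxOS",
    "title": "sismologia.cl sends desktop site to Firefox OS"
}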

Some bugs are not testable. For example, in this test run AWCY included bug 895485 in the list of bugs that require a test. This is more of a meta bug: it doesn’t really describe a specific issue on a specific site, and hence we don’t want a test for it. I’ve added a file called ignored_bugs.txt which lists the bugs testsites.py will ignore and not attempt to generate tests for. Each bug number has a comment that explains why we want to ignore that bug. Let’s add 895485.
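An entry in ignored_bugs.txt might look something like this - the exact comment syntax is whatever testsites.py expects, so treat this as a sketch:

895485  # meta bug - doesn't describe a specific issue on a specific site, nothing to test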

When this test run is done and all tests in sitedata-automated.js are reviewed and copied to sitedata.js, I will also go through the generated sitedata-missing.js file. The script has prepared empty JSON blocks for all sites it could not generate tests for. For example:

"1019380": {
    "url": "http://www.hotels.com/", 
    "ua": "FirefoxOS", 
    "steps": [
        "function(){return }"
    ], 
    "title": "hotels.com sends Desktop Content to Firefox OS and Firefox Android"
}

(The first thing to do is to remove the quotes around the function - this file is valid JSON, with the functions stored as strings.. apologies for not hooking up a JS parser here. I do this only once a month and the script is scheduled for retirement anyway..)
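If one did want to automate away that manual step, something eval-ish along these lines could turn a quoted step back into a function - a purely hypothetical sketch, not something the current scripts do, and “entry” here just stands for one of the JSON blocks above:

// hypothetical: convert a quoted step like "function(){return }" back into a real function
entry.steps = entry.steps.map(function(step){
    return typeof step === "string" ? new Function("return (" + step + ")")() : step;
});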

Investigating, I see there’s a good reason why the automation could not generate a test: the problem seems fixed! Yay! The last update on the bug is only 10 days old, and we haven’t even contacted them - but it works fine now. So testsites.py obviously did not detect any problems, but we can write a test manually and keep it around to detect regressions:

"1019380": {
    "url": "http://www.hotels.com/", 
    "ua": "FirefoxOS", 
    "steps": [
        function(){return location.pathname.indexOf('/mobile') === 0}
    ], 
    "title": "hotels.com sends Desktop Content to Firefox OS and Firefox Android"
}

So this test will keep running - if they tweak their browser detection again, and no longer detect Firefox OS correctly, the test will flag a failure and announce it on the AWCY site.

Finally, when the new tests are added, I’ll do a test run and add test results to the AWCY site. I use a SlimerJS-based test script. I can run either individual tests:

slimerjs -P SlimerJSTester slimertester.js 1019380 1019204

or just run them all:

slimerjs -P SlimerJSTester slimertester.js

It takes quite a while to run 800-900 tests - but it will eventually output results-2014-07-12.csv, and I’ll add that file to AWCY’s data/testing directory and update index.json to add a reference to the new file. And then the most exciting step: reloading the AWCY site to check the brand new list of fixes or regressions at the bottom of the front page..

The process still has some warts, but it works fairly well. We’re testing the web at scale, and we’re ready to scale even further. If you want to add or improve tests, pull requests for sitedata.js are naturally welcome! I haven’t yet had time to write these missing tests - can you?

The tests and infrastructure are not intended to be limited to Gecko/Firefox issues. With the growing, cross-browser bug list of issues on webcompat.com, now is a good time to start adding tests for other browsers’ issues to our framework. Seif’s “Compatipede” infrastructure will make cross-browser compatibility testing at scale even more powerful - we’ll run both “exploratory” tests for various configurations and reviewed, bug-specific tests. We’re in for a fun ride while scaling up!

Adding more browser sniffing magic to the web platform

On this blog and elsewhere, I’ve spent a lot of words ranting about websites that can’t seem to wean themselves off browser sniffing.

Let’s take a step back for context: browser sniffing is when a website looks at the name of a web browser and guesstimates its abilities. It’s a pretty common way to figure out whether a visiting browser should get desktop or mobile content, and how advanced the styling and JavaScript it receives can be. During several years of testing web browsers and web sites I’ve seen a lot of broken and faulty sniffing - for engineers on the browser vendor side it’s easy to take for granted that all browser sniffing is BAD. The simple reason is that we see all the negative effects:

  • Website code makes wrong assumptions, so sites give browsers the wrong content or no content at all.
  • It’s easy to see why mistakes happen: browser detection is insanely complex because of the thousands of browsers and millions of devices out there.
  • Browser detection is usually proprietary - many sites have different approaches and quirks.
  • Browser detection logic frequently breaks when browsers are updated and improved - this makes it harder for browser vendors to ship updates.
  • Slower updates make web standardisation work more difficult and slow.
  • Browser detection distorts competition by favouring big browsers and platforms - they are detected and get features and special hacks delivered to them. For example, if you want to ship a tablet device and make all of today’s tablet-specific content work, it might be tempting to include the word “iPad” in the User-Agent string. Can you even do that without being sued? I have no idea.
  • Similarly, big sites with finely tuned User-Agent detection have a competitive edge over smaller upstarts - how can the smaller site be confident that its web features work correctly across the diverse device landscape?

So, many browser engineers naturally hate browser sniffing. It’s a big hack that holds back the web.

Over the years, we’ve realised that some detection has use cases - for example, moving from dreaming of “one web” across all devices with the same content everywhere to adding the string “Mobile” to the Firefox OS UA string. The intention is that you should be able to scan the string for the word “Mobile” (or just “Mobi” to handle certain Opera Mobile versions too) and serve your fancy mobile content if you see this string. On tablets with Firefox OS, the word “Tablet” should appear instead.
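A minimal sketch of that recommended check on the client side (a server would do the equivalent with the User-Agent request header):

// Scan the UA string for "Tablet" and "Mobi" as described above
var ua = navigator.userAgent;
if (/Tablet/.test(ua)) {
    // serve/enable tablet-optimised content
} else if (/Mobi/.test(ua)) {
    // serve/enable mobile content
} else {
    // default to desktop content
}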

Also, last year we launched a User-Agent detection use cases survey, trying to figure out when web developers rely on browser sniffing.

Unfortunately, it’s not just web developers and websites.. Apparently, even Mozilla itself has a hard time finding better approaches to certain use cases, as seen in bugs like Tarako user agent says 28.0 not 28.1 and please ask Facebook/Twitter to point the Tarako UA to their “low-fi” version of their webpages.

The context here is the famous $25 phone. For that price you might not expect a lot of memory and a cutting edge CPU - and you would be right. It is a touch-screen smartphone, but a low-end one, and apparently the fancy mobile sites from Facebook and Twitter need more power and memory.

Detecting and adapting to hardware capabilities is an old problem, and evidently we haven’t found a nice solution yet - but it’s more than a little depressing that the solution we do come up with is adding a magic number in the UA string and nodding and winking to some major sites to make them add yet another small piece of browser sniffing magic to the web platform.

Do you have better ideas? Add them to this thread.

What will it take..?

A long time ago, web sites started detecting the name and version of the visitor’s browser to adjust in various ways or throw tantrums and demand upgrades.

By now, developers and spec authors have spent years trying to make it easy to avoid browser detection. For example, when we came up with the HTML5 VIDEO element, we added the canPlayType() method. With it, any browser can tell you whether it supports playing a given video format.
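A minimal sketch of that feature-detection approach, which makes the sniffing quoted below unnecessary (the MIME types are just examples):

// Ask the browser what it can actually play instead of guessing from its name
var video = document.createElement('video');
if (video.canPlayType && video.canPlayType('video/mp4; codecs="avc1.42E01E, mp4a.40.2"')) {
    // use the HTML5 player with an MP4/H.264 source
} else if (video.canPlayType && video.canPlayType('video/webm; codecs="vp8, vorbis"')) {
    // use the HTML5 player with a WebM source
} else {
    // fall back to another player
}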

And web developers still apparently feel comfortable deploying this:


/**
 * Check if html5 player can be used on this platform
 * now only iphone and ipad are supported
 * @return bool 
 */
isPlatformSupported : function () {
    var ua = navigator.userAgent,
    isiPad = /iPad/i.test(ua),
    isiPhone = /iPhone/i.test(ua),
    response = (isiPhone || isiPad);
    
    return response;
}

Courtesy of xstream.dk - let’s call it xstreamely disappointing.

Blog reboot

2014-05-27 / Meta

I haven’t been blogging as much as I’d like to, but here’s hoping it will change. The blog is now built by Nanoc, mostly thanks to instructions found on Sebastian Morr’s site.

Comments are not implemented at the moment - I’m not sure if I want to throw in Disqus, find another solution or just leave it. The few comments that were posted in the old system are manually embedded in the posts - I guess you can always send me a comment by E-mail or on Twitter if it really matters :).