Release notes from DOMBlockClustering

1.0.0 (2015-05-15): Final Slides for the W4A 2015 Conference

<h1>Presentation Slides for: DOM Block Clustering for Enhanced Sampling and Evaluation</h1>

<p><a href="http://dx.doi.org/10.5281/zenodo.17437" rel="nofollow"><img src="https://camo.githubusercontent.com/75a31bec100d050c056fefeb0d1139055be54cd1608d7f84a7913b37251c4a6f/68747470733a2f2f7a656e6f646f2e6f72672f62616467652f343330302f736861727069632f444f4d426c6f636b436c7573746572696e672e737667" alt="DOI" data-canonical-src="https://zenodo.org/badge/4300/sharpic/DOMBlockClustering.svg" style="max-width: 100%;"></a></p>

<ul>
<li>Slides at: <a href="http://sharpic.github.io/DOMBlockClustering/" rel="nofollow">http://sharpic.github.io/DOMBlockClustering/</a></li>
<li>Full Paper at: <a href="http://dx.doi.org/10.1145/2745555.2746649" rel="nofollow">http://dx.doi.org/10.1145/2745555.2746649</a></li>
<li>References at: <a href="http://sharpic.github.io/DOMBlockClustering/DOMBlockClustering.bib" rel="nofollow">http://sharpic.github.io/DOMBlockClustering/DOMBlockClustering.bib</a></li>
</ul>

<h2>Abstract</h2>

<p>Large websites are difficult to evaluate for Web Accessibility compliance due to the sheer number of pages, the inaccuracy of current Web evaluation engines, and the W3C-stated need to include human evaluators within the testing regime. This makes evaluating large websites all but technically infeasible. Therefore, sampling of the pages becomes a critical first step in the evaluation process. Current methods rely on drawing random samples, best-guess samples, or convenience samples. In all cases the evaluation results cannot be trusted because the underlying structure and nature of the site are not known; they are missing 'website demographics'.
By understanding the quantifiable statistics of a given population of pages, we are better able to decide on the coverage we need for a full review, as well as the sample we need to draw in order to enact an evaluation. Our solution is to crawl a website, comparing, and then clustering, the pages discovered based on Document Object Model block-level similarity. This technique can be useful in reducing very large sites to a more manageable size, allowing 80% coverage by evaluating approximately 0.1&#8211;4% of pages; additionally, we discuss how refining our clustering algorithm could reduce this further.</p>

<h2>Authors</h2>

<p>Simon Harper [1], Anwar Ahmad Moon [1], Markel Vigo [1], Giorgio Brajnik [2], and Yeliz Yesilada [3]</p>

<ol>
<li>School of Computer Science, University of Manchester, United Kingdom</li>
<li>Dept. of Mathematics and Computer Science, University of Udine, Italy</li>
<li>Middle East Technical University, Northern Cyprus Campus, Turkey</li>
</ol>

v0.8.1 (2015-05-06): No content.

v0.8.0 (2015-05-06): No content.
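The crawl-compare-cluster approach described in the abstract could be sketched as follows. This is a hypothetical illustration only, not the authors' published algorithm: the block-tag set, the similarity measure (`difflib.SequenceMatcher` over tag sequences), the 0.8 threshold, and the greedy single-pass clustering strategy are all assumptions made for the sake of a runnable example.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser

# Block-level tags used as a page's structural signature (an assumed set).
BLOCK_TAGS = {"div", "p", "ul", "ol", "li", "table", "section", "article",
              "header", "footer", "nav", "h1", "h2", "h3", "h4", "h5", "h6"}

class BlockExtractor(HTMLParser):
    """Collects the sequence of block-level start tags in document order."""
    def __init__(self):
        super().__init__()
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.blocks.append(tag)

def block_signature(html: str) -> list:
    """Reduce a page to its DOM block-level tag sequence."""
    parser = BlockExtractor()
    parser.feed(html)
    return parser.blocks

def similarity(sig_a: list, sig_b: list) -> float:
    """Similarity of two block signatures, in [0, 1]."""
    return SequenceMatcher(None, sig_a, sig_b).ratio()

def cluster_pages(pages: dict, threshold: float = 0.8) -> list:
    """Greedy one-pass clustering: a page joins the first cluster whose
    representative signature is similar enough, else starts a new cluster."""
    clusters = []  # each entry: {"sig": [...], "members": [...]}
    for url, html in pages.items():
        sig = block_signature(html)
        for c in clusters:
            if similarity(sig, c["sig"]) >= threshold:
                c["members"].append(url)
                break
        else:
            clusters.append({"sig": sig, "members": [url]})
    return [c["members"] for c in clusters]
```

Under this sketch, an evaluator would audit one representative page per cluster rather than every page, which is how a small sample can still cover most of a site's structural variety.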