Martin Atkins (mart) wrote in changelog,

[livejournal] r10295: A set of test cases for the correct hand...

Committer: mart
A set of test cases for the correct handling of various type attribute values when parsing Atom feeds.

Based on Phil Ringnalda's title conformance tests.

LiveJournal currently fails all except type="html".
Other "right answers" are possible besides those I've given in these tests, but
what LiveJournal is doing right now isn't any of them.


A   trunk/t/parsefeed-atom-types.t
Added: trunk/t/parsefeed-atom-types.t
===================================================================
--- trunk/t/parsefeed-atom-types.t	2007-02-03 18:09:33 UTC (rev 10294)
+++ trunk/t/parsefeed-atom-types.t	2007-02-03 19:11:58 UTC (rev 10295)
@@ -0,0 +1,90 @@
+# -*-perl-*-
+use strict;
+use Test::More 'no_plan';
+use lib "$ENV{LJHOME}/cgi-bin";
+require "parsefeed.pl";
+require 'ljlib.pl';
+
+## These test cases are based roughly on Phil Ringnalda's eight <title> conformance tests:
+##    <http://weblog.philringnalda.com/2005/12/18/who-knows-a-title-from-a-hole-in-the-ground>
+
+my $testfeed = sub {
+    my $entrybody = shift;
+
+    my $contents = qq{
+        <feed xmlns="http://www.w3.org/2005/Atom">
+          <id>testing:atom:feed</id>
+          <title>test atom feed</title>
+          <author><name>testing</name></author>
+          <link rel="alternate" type="text/html" href="$LJ::SITEROOT" />
+          <link rel="self" type="text/xml" href="$LJ::SITEROOT" />
+          <updated>2007-01-08T23:40:33Z</updated>
+          <entry>
+            <id>testing:atom:feed:entry</id>
+            <updated>2006-09-14T07:39:07Z</updated>
+            $entrybody
+            <link rel=\"alternate\" type=\"text/html\" href=\"$LJ::SITEROOT\" />
+          </entry>
+        </feed>
+    };
+
+    my ($feed, $error) = LJ::ParseFeed::parse_feed($contents);
+    return $feed->{'items'}->[0];
+
+};
+
+my $testtitle = sub {
+    my $titleelem = shift;
+
+    my $contents = qq{
+        $titleelem
+        <content type="html">content content content</content>
+    };
+
+    my $item = $testfeed->($contents);
+    return $item->{'subject'};
+};
+
+my $testcontent = sub {
+    my $contentelem = shift;
+
+    my $contents = qq{
+        <title>kumquats cheese blogosphere</title>
+        $contentelem
+    };
+
+    my $item = $testfeed->($contents);
+    return $item->{'text'};
+};
+
+#$testtitle->("<title>&lt;title&gt;</title>");
+
+# When type="html", the contents should be escaped HTML
+# The correct result is the content with one level of escaping removed
+is($testtitle->(qq{<title type="html"><![CDATA[&lt;title>]]></title>}), "&lt;title>", "Title: HTML + CDATA");
+is($testtitle->(qq{<title type="html">&amp;lt;title></title>}), "&lt;title>", "Title: HTML + Entities");
+is($testtitle->(qq{<title type="html">&#38;lt;title></title>}), "&lt;title>", "Title: HTML + Numeric character references");
+
+# When type="text", the contents are escaped plain text
+# Since LiveJournal expects HTML in the subject field, parsefeed should
+# be returning the text with HTML escaping applied.
+is($testtitle->(qq{<title type="text"><![CDATA[<title>]]></title>}), "&lt;title>", "Title: Text + CDATA");
+is($testtitle->(qq{<title type="text">&lt;title></title>}), "&lt;title>", "Title: Text + Entity");
+is($testtitle->(qq{<title type="text">&#60;title></title>}), "&lt;title>", "Title: Text + Numeric character references");
+
+# When type="xhtml" the content is interpreted as normal XML with no special
+# escaping. Therefore it should be returned basically verbatim, with no
+# extra escaping or de-escaping.
+is($testtitle->(qq{<title type="xhtml"><div xmlns="http://www.w3.org/1999/xhtml">&lt;title></div></title>}), qq{<div xmlns="http://www.w3.org/1999/xhtml">&lt;title></div>}, "Title: XHTML + Entities");
+is($testtitle->(qq{<title type="xhtml"><div xmlns="http://www.w3.org/1999/xhtml">&#60;title></div></title>}), qq{<div xmlns="http://www.w3.org/1999/xhtml">&lt;title></div>}, "Title: XHTML + Numeric character references");
+
+# Now do the same eight tests but on the entry content instead
+is($testcontent->(qq{<content type="html"><![CDATA[&lt;content>]]></content>}), "&lt;content>", "Content: HTML + CDATA");
+is($testcontent->(qq{<content type="html">&amp;lt;content></content>}), "&lt;content>", "Content: HTML + Entities");
+is($testcontent->(qq{<content type="html">&#38;lt;content></content>}), "&lt;content>", "Content: HTML + Numeric character references");
+is($testcontent->(qq{<content type="text"><![CDATA[<content>]]></content>}), "&lt;content>", "Content: Text + CDATA");
+is($testcontent->(qq{<content type="text">&lt;content></content>}), "&lt;content>", "Content: Text + Entity");
+is($testcontent->(qq{<content type="text">&#60;content></content>}), "&lt;content>", "Content: Text + Numeric character references");
+is($testcontent->(qq{<content type="xhtml"><div xmlns="http://www.w3.org/1999/xhtml">&lt;content></div></content>}), qq{<div xmlns="http://www.w3.org/1999/xhtml">&lt;content></div>}, "Content: XHTML + Entities");
+is($testcontent->(qq{<content type="xhtml"><div xmlns="http://www.w3.org/1999/xhtml">&#60;content></div></content>}), qq{<div xmlns="http://www.w3.org/1999/xhtml">&lt;content></div>}, "Content: XHTML + Numeric character references");
+


Property changes on: trunk/t/parsefeed-atom-types.t
___________________________________________________________________
Name: svn:executable
   + *
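
As a rough illustration of the expectations stated in the test comments above, the three type values could be normalized along the following lines. This is only a minimal sketch, not the code in parsefeed.pl: the helper name atom_text_to_html and its arguments are hypothetical, and it assumes the XML parser has already decoded entities and CDATA sections for text and html, and hands over the serialized child markup for xhtml.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration, not part of
# parsefeed.pl): turn the value of an Atom text construct into the HTML
# string that would be stored in the subject or text field.
#   $type  - value of the type attribute; Atom treats a missing one as "text"
#   $value - for text/html: character data after XML parsing;
#            for xhtml: the serialized child markup
sub atom_text_to_html {
    my ($type, $value) = @_;
    $type = 'text' unless defined $type;

    if ($type eq 'text') {
        # Plain text: apply HTML escaping so it displays literally.
        # Escaping & and < is sufficient for element content.
        my %esc = ('&' => '&amp;', '<' => '&lt;');
        $value =~ s/([&<])/$esc{$1}/g;
        return $value;
    }

    # html:  XML parsing already removed one level of escaping, so the
    #        character data is the HTML itself.
    # xhtml: the child markup is passed through with no extra escaping
    #        or de-escaping.
    return $value if $type eq 'html' || $type eq 'xhtml';

    die "unsupported type: $type";
}

print atom_text_to_html('text', 'a < b & c'), "\n";    # a &lt; b &amp; c
print atom_text_to_html('html', '<b>bold</b>'), "\n";  # <b>bold</b>
```

Under these assumptions only the text branch does any work, which would be consistent with a parser that returns the decoded character data untouched passing the html cases and failing the others.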
